1 Questions about GALEN

1.1 What is GALEN?
1.2 What does it do?
1.3 How did it develop?
1.4 Who is it for?
1.5 When will it be finished?
1.6 What will it look like?
1.7 What's wrong with using ICD?
1.8 What makes it different? Is it yet another coding system?
1.9 How is GALEN different from the UMLS?
1.10 Can GALEN and existing coding and classification schemes co-exist?
1.11 How does GALEN relate to existing standards activities?
1.12 What do we mean by Application Independence and Re-Use?
1.13 What about Medical Records?
1.14 "Oh, so it's all about smart cards, is it?"
1.15 What about KBS and Decision Support?
1.16 What about natural language?
1.17 What about speech?
1.18 What happens next?

1.1 What is GALEN?

GALEN is the name given to a technology that is designed to represent clinical information in a new way, and is intended to put the clinical into the clinical workstation. GALEN produces a computer-based multilingual coding system for medicine, using a qualitatively different approach from those used in the past.

The GALEN Programme represents the overall development of the technology, which has included several EU-funded Research Projects (the GALEN & GALEN-IN-USE projects).

OpenGALEN is a not-for-profit organisation that has been formed to enable the widest possible exploitation of some of the results of the GALEN Programme.

1.2 What does it do?

GALEN is trying to make it easier to build useful and usable clinical applications, to support clinicians in their day-to-day work. We have identified one of the factors that makes it hard to build and integrate such applications - the need for representation of the medical concepts that such applications have to store and manipulate.

The GALEN Programme is developing a clinical terminology - the GALEN Common Reference Model - in which we can represent all and only sensible medical concepts. The medical concepts represented using that scheme must be accessible and manipulable by computers, as well as being accessible to clinicians. The representation scheme that we are using to build the GALEN Common Reference Model is known as GRAIL - the GALEN Representation And Integration Language. The GALEN Common Reference Model is delivered and used via a software device called a GALEN Terminology Server; we have found that, to be usable, clinical terminologies must now be software rather than data-sets.

The GALEN Technology - the GALEN Common Reference Model written in GRAIL, and its delivery by GALEN Terminology Servers - is intended to support the requirements of individual applications for multilingual clinical terminology, to mediate amongst existing and new systems, and to facilitate the development of new systems.

1.3 How did it develop?

The impetus for the development of GALEN emerged at the end of the 1980s, during user-interface prototyping for a clinical workstation (PEN&PAD), and exploratory work funded by the EU (the SESAME & OAR projects).

Early phases of the GALEN Programme developed the GRAIL concept modelling language, experimented with different structures for the GALEN Common Reference Model, and, in parallel, tested the usefulness of the approach with a series of clinical demonstrator projects.

Later phases of the GALEN Programme, during the late 1990s, have concentrated on commercially-robust implementations of GRAIL and the Terminology Server, development of the GALEN Common Reference Model in both scope and detail, and development of tools and techniques to enable the further development, scaling-up and maintenance of the model. An important additional focus has been in developing tools and techniques with which we can map the information found in existing coding and classification schemes to the GALEN Common Reference Model.

Now, as the programme moves out of the reach of EU research-funding, members of the consortia involved in the earlier phases of the programme have formed OpenGALEN: a foundation to enable the exploitation and further development of the GALEN Technology.

1.4 Who is it for?

The GALEN Technology provides language, terminology, and coding services for clinical applications. It is intended for use by clinical application builders, both when developing clinical applications, and as a run-time resource when those applications are in service. The Terminology Server provides mechanisms for applications to reference medical concepts from the GALEN Common Reference Model, and, for example, to store those references as part of a patient's electronic medical record.

In addition, the Terminology Server provides access to a number of facilities designed to cope with the real-world situation in which numerous - generally incompatible - traditional coding schemes already exist. We aim to map between such schemes via the GALEN Common Reference Model, and hence provide mechanisms to mediate between heterogeneous systems using such schemes. The GALEN Common Reference Model acts as an underlying structural interlingua for the domain, and can therefore translate between different external representation schemes, be they existing coding schemes or different natural languages.

The tools attached to the Terminology Server, together with special applications, also provide support for those developing or extending coding systems, either within major national centres or when tailoring systems to local needs.

We intend that many different sorts of applications can be enhanced by the use of the GALEN Technology — for example, medical record applications, knowledge-based systems, natural language processing systems, bibliographic systems.

1.5 When will it be finished?

As with all pieces of software, or existing kinds of clinical terminologies, the GALEN Technology will never be finished per se: there will always be additional functionality to add, and refinements or efficiency improvements to be made. The question should be rephrased as "When can GALEN be used?"

The answer to that question is now! There are already commercial products being developed and sold using the GALEN Technology. There are now commercially-robust implementations of GALEN Terminology Servers, available on the Win32 platform. There is an important caveat, however: the GALEN Common Reference Model does not yet cover the entirety of medicine.

The development of the GALEN Common Reference Model has so far been funded mainly by the research projects within the GALEN Programme, and the model has been developed in areas of direct interest to those projects. There is a very well-developed overall structure in place, and it has been populated in depth in specific areas to serve specific purposes. In those areas, the complexity and depth found in the GALEN Common Reference Model is much greater than that found in existing kinds of classification systems; it has been designed to support day-to-day clinical care. Furthermore, we now have mechanisms and tools that enable us to scale up the population of the model as funding permits.

Part of OpenGALEN's interest in further exploitation is to continue to populate the GALEN Common Reference Model, for example in collaboration with new partners in the healthcare industry.

1.6 What will it look like?

Paradoxically, if it goes as planned, GALEN will be essentially invisible to end users - they should see its effects only because they get better clinical information systems. GALEN is about using and extending the limits of computer technology to develop better clinical applications. End users may come to recognise that certain styles of application and user interface - we hope the ones they like - have GALEN Terminology Servers inside, but the servers themselves will remain behind the scenes.

Developers and systems managers, on the other hand, will have a new range of tools to use and a new server, analogous to a database server, on their network, or as part of their clinical applications. Those developing coding systems will see a new language in which to express their ideas and a new range of tools on their desks designed specifically to make it easier to develop, validate, and cross reference coding and classification systems (and other, more advanced terminology systems).

1.7 What's wrong with using ICD?

...or insert your favourite coding scheme instead of ICD.

Nothing, provided that you are using it for the purpose for which it was intended. In fact we hope that the GALEN Technology, and specialised tools such as the Classification Workbench will help those building coding and classification systems such as the ICD to make them better and more consistent.

However, increasingly people want to use codes or terms much more widely and to share them between different applications. This is difficult with traditional systems such as ICD.

One reason is that individual coding schemes were typically developed for specific purposes. It is difficult or impossible to re-organise them to fit another purpose — for example to take a scheme organised according to aetiology and reorganise it according to anatomical or functional descriptions.

Another reason is that none of these schemes is sufficiently detailed to support the level of description needed to describe clinical care for medical records, radiographic reports, discharge summaries, or most other applications concerned with patient care. Attempts to scale up existing systems to make them more detailed have had mixed success at best.

All of the schemes embed a great deal of information in a single code or term so that they are difficult to use or manipulate for other applications, such as decision support systems. If the application needs a slightly different point of view, it is almost impossible to find it.

All of these schemes are expensive to translate and maintain in multi-lingual form because every rubric must be translated individually. The effort is just sustainable at their current scale but threatens to become completely unmanageable as they become larger still.

GALEN attempts to provide a reference model - the GALEN Common Reference Model - as a unifying framework for various coding schemes old and new. Because it contains much more information than a traditional coding system, much more can be done automatically. Using the GALEN Common Reference Model, GALEN Terminology Servers can mediate between the multitude of existing coding schemes, or can extract, rearrange and transform the information those schemes contain into forms that other applications can use. They can also generate plausible, if not perfect, expressions in any of the natural languages supported, based on the structure of the expression in the GALEN Common Reference Model.

1.8 What makes it different? Is it yet another coding system?

GALEN is not producing yet another coding system. We are aiming to provide a qualitative difference in usefulness. We believe that existing coding and classification schemes have reached (and exceeded) the limits of manageability, and have several inherent deficiencies that limit their usefulness and inter-convertibility. We believe a more formal system for representing medical concepts is now essential so that computer technology can be harnessed to manage and represent the inherent complexity of the domain.

GALEN is not alone in this view. Other groups are also now working on different ways of representing a clinical terminology, spurred on by exactly the same issues that have motivated the development of the GALEN Technology. However, we believe that none offers the range of experience, depth of understanding, and practical demonstration, in both software technology (the GALEN Terminology Server) and clinical content (the GALEN Common Reference Model), that is found within the GALEN Technology.

A GRAIL model is more than an enumerated list or tree of concepts - it can represent arbitrarily deep levels of clinical complexity. A GALEN Terminology Server allows the concepts in a model to be re-organised to give many alternative points of view on the same model, e.g. organised by function or organised by anatomy. It can classify new concepts which have never been encountered before (provided the concepts are consistent with the model), and can transform between different points of view on the same or related concepts, e.g. between "viral hepatitis" and "hepatitis virus".

This means that we usually do not provide applications with the model in a simple text file, or even a series of database tables. Rather, we encapsulate the model, and the implementation of the GRAIL formalism, inside a software system (the server) that provides services to applications — it answers questions about concepts ("what sort of thing is this?", "what can I say about it to describe it further?"); it provides references for external applications to store ("please give me a single reference number for ‘severe breathlessness made worse by lying down at night’"), and interpretations of those references ("give me a French phrase and an ICD code for the concept represented by the magic number 6347891328").
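As a rough illustration only, the kinds of service described above might be sketched in Python as follows. The class ToyTerminologyServer, its method names, the tuple encoding of a concept, and the scheme and code "ExampleScheme" / "EX-001" are all hypothetical; none of them is part of GRAIL or of any real GALEN Terminology Server interface.

```python
import itertools


class ToyTerminologyServer:
    """Hypothetical stand-in for a terminology server; NOT the GALEN API."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._ref_by_concept = {}
        self._concept_by_ref = {}
        self._phrases = {}  # (concept, language) -> phrase
        self._codes = {}    # (concept, scheme) -> code

    def reference_for(self, concept):
        """Return one stable reference number per concept, allocating on first use."""
        if concept not in self._ref_by_concept:
            ref = next(self._ids)
            self._ref_by_concept[concept] = ref
            self._concept_by_ref[ref] = concept
        return self._ref_by_concept[concept]

    def register(self, concept, phrases=None, codes=None):
        """Attach language phrases and external codes to a concept."""
        for language, phrase in (phrases or {}).items():
            self._phrases[(concept, language)] = phrase
        for scheme, code in (codes or {}).items():
            self._codes[(concept, scheme)] = code
        return self.reference_for(concept)

    def phrase_for(self, ref, language):
        """Interpret a stored reference as a phrase in the requested language."""
        return self._phrases.get((self._concept_by_ref[ref], language))

    def code_for(self, ref, scheme):
        """Interpret a stored reference as a code in the requested scheme."""
        return self._codes.get((self._concept_by_ref[ref], scheme))


server = ToyTerminologyServer()
severe_breathlessness = ("Breathlessness", ("hasSeverity", "Severe"))
ref = server.register(
    severe_breathlessness,
    phrases={"en": "severe breathlessness", "fr": "dyspnée sévère"},
    codes={"ExampleScheme": "EX-001"},  # made-up scheme and code
)
print(ref, server.phrase_for(ref, "fr"), server.code_for(ref, "ExampleScheme"))
```

In this sketch the application stores only the reference number; the phrase and the code are looked up from the server when they are needed, which is the division of labour the paragraph above describes.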

An analogy we have found useful is that the difference between a traditional classification scheme and GALEN is like the difference between a text file of information and a database management system.

1.9 How is GALEN different from the UMLS?

The United States National Library of Medicine’s Unified Medical Language System project has produced the Metathesaurus — an extensive cross referencing of a large number of existing coding systems. The primary use of the Metathesaurus is to convert between coding systems. Conversion between coding systems is one important function of GALEN, but only one.

For example, although the Metathesaurus cross references the codes in each system, it does not represent their meaning nor does it provide a means to extend the overall system. Only concepts which have been encountered before in one or another coding system can be dealt with.

The UMLS also provides a semantic network which classifies each concept into one of a number of categories. However, whereas the GALEN model provides taxonomies which contain thousands of categories in a complex hierarchy, the UMLS provides a relatively flat network of under 200 categories, roughly equivalent to the top-most layers of the GALEN hierarchies.

UMLS and GALEN can and do co-exist, however, as described in the answer to the next question...

1.10 Can GALEN and existing coding and classification schemes co-exist?

Yes, they can (and they do, in typical use of the GALEN Technology), in two different ways.

The first way is by using GALEN Technology to help encode clinical information into existing coding and classification schemes, for example as part of a clinical workstation. This is often required to support statutory requirements that are in force in many countries. This encoding depends on mapping elements of those schemes to the GALEN Common Reference Model. Typically, information captured using the GALEN Common Reference Model is of greater detail than that found in existing schemes (as the GALEN model is designed to support day-to-day clinical care). The automatic classification functionality within the GALEN Terminology Server can be used to find the nearest suitable match in an existing scheme for any, arbitrarily complex, piece of clinical information in the GALEN Common Reference Model.
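The idea of finding the nearest suitable match can be illustrated with a deliberately simplified Python sketch. It assumes that each code in an existing scheme has been mapped to a base concept plus a set of criteria, and it picks the most specific code whose criteria are all satisfied by the detailed clinical concept. The scheme, the codes "A1" to "A3", and the function nearest_code are invented for illustration; real classification in GALEN also reasons over concept hierarchies, which this sketch ignores.

```python
# Hypothetical mapping of an existing scheme's codes to (base, criteria).
SCHEME = {
    "A1": ("Fracture", frozenset()),
    "A2": ("Fracture", frozenset({("hasLocation", "Femur")})),
    "A3": ("Fracture", frozenset({("hasLocation", "Femur"),
                                  ("hasState", "Open")})),
}


def nearest_code(base, criteria, scheme=SCHEME):
    """Return the most specific code whose definition is satisfied by the
    detailed concept, or None if no code in the scheme applies."""
    candidates = [
        (len(code_criteria), code)
        for code, (code_base, code_criteria) in scheme.items()
        if code_base == base and code_criteria <= criteria
    ]
    # The largest matching set of criteria is the closest (most specific) match.
    return max(candidates)[1] if candidates else None


# A detailed record entry: a fracture of the left femur.
detailed = frozenset({("hasLocation", "Femur"), ("hasLaterality", "Left")})
print(nearest_code("Fracture", detailed))
```

The detail about laterality has no counterpart in the toy scheme, so the entry is reported under the code for a fracture of the femur, while the full detail remains available in the record.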

The second way that the GALEN Common Reference Model co-exists with existing coding and classification schemes is during the authoring and maintenance of both kinds of system: by comparing the results gained from an automatic classification (using GALEN Technology) of terms and properties from a traditionally built scheme, the GALEN Technology can be used to validate, quality-assure and extend the manually-enumerated inheritance in that traditionally-built scheme.

1.11 How does GALEN relate to existing standards activities?

Closely. Members of the GALEN Programme are involved with CEN TC251, and have been involved with work developing out of the OMG’s CORBAMed group.

1.12 What do we mean by Application Independence and Re-Use?

Existing traditional coding systems have each been developed for a particular purpose. For example, the ICD was developed primarily for collecting international statistics on morbidity and mortality, and as a result contains little relating to the symptomatology of live patients. The purpose, or model of use, of a particular scheme shows up in the various choices made about its construction and organisation. Examples of decisions include the major axes along which a coding system is divided — e.g. whether by anatomy, aetiology or function — and how much detail it holds about each concept.

The choices which make a coding system effective for one purpose usually render it awkward for other purposes. For example, the anatomical organisation required by the surgeon may be awkward for the more functionally oriented physician, not to mention for the nurse who is concerned primarily with care needs and level of dependency. Cross references may help, but maintaining them manually is time consuming and error prone even in limited systems. With traditional systems, computers can provide only minimal help in this process because the information about what the code ‘means’ is not available in a form that computers can use; usually it is only available as a free text rubric which must be interpreted manually.

The GALEN Common Reference Model represents concepts compositionally in terms of relatively simple elements — e.g. "fracture of the femur" is represented in the model not by a single code but by a GRAIL expression: Fracture which hasLocation Femur. Because the GRAIL expression contains Fracture, Femur and the link hasLocation explicitly, the computer can reorganise it either according to the function — in a taxonomy of kinds of trauma — or by anatomy — in a taxonomy of conditions of bones. The developers do not have to choose in advance which organisation to use, nor do they have to maintain the alternative organisations themselves. The process is automatic given the model and the expression. There are many other similar transformations which allow developers to avoid having to make choices which limit the coding system to a specific class of applications.
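The reorganisation described above can be sketched in a few lines of Python. This is a toy illustration and not GRAIL: the PARENTS hierarchy, the Composite class and the subsumes function are assumptions made for the example, and the real classifier is considerably richer. The point is only that, once Fracture, Femur and hasLocation are explicit, the same expression can be placed under a functional heading and under an anatomical heading without anyone choosing between them in advance.

```python
from dataclasses import dataclass

# Toy "is-a" hierarchy of elementary concepts (hypothetical, not the GALEN model).
PARENTS = {
    "Fracture": "Trauma",
    "Trauma": "Pathology",
    "Femur": "Bone",
    "Bone": "BodyPart",
}


def ancestors(concept):
    """Return the concept followed by all of its ancestors, nearest first."""
    chain = [concept]
    while chain[-1] in PARENTS:
        chain.append(PARENTS[chain[-1]])
    return chain


@dataclass(frozen=True)
class Composite:
    """A concept of the form '<base> which <link> <value>'."""
    base: str
    link: str
    value: str


def subsumes(general, specific):
    """True if `general` is at least as broad as `specific` on base and value."""
    return (general.link == specific.link
            and general.base in ancestors(specific.base)
            and general.value in ancestors(specific.value))


fracture_of_femur = Composite("Fracture", "hasLocation", "Femur")

# Two alternative headings under which the same concept can be filed.
kinds_of_trauma = Composite("Trauma", "hasLocation", "BodyPart")
conditions_of_bones = Composite("Pathology", "hasLocation", "Bone")

print(subsumes(kinds_of_trauma, fracture_of_femur))
print(subsumes(conditions_of_bones, fracture_of_femur))
```

Both checks succeed for the same expression, so neither organisation has to be maintained by hand; each is derived from the elements of the expression and the hierarchy.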

Of course, application independence is only relative. There are choices made in developing the GALEN Common Reference Model, and there are limits to application independence and re-use. However, we expect the model to be re-usable for a much wider range of applications than existing coding and classification systems.

Paradoxically, for an individual application, it may be more difficult to use a generic solution rather than a specific one, as the generic solution will offer facilities to satisfy the union of all requirements placed on it by different kinds of applications. To make this ‘application-independence’ really usable, an important part of the GALEN Technology allows the configuration and customisation of the GALEN Common Reference Model for the specific needs of an individual application.

1.13 What about Medical Records?

One of the most important uses of GALEN is as part of medical record systems. GALEN provides the concepts; other applications link them together. GALEN aims to make medical records easier to use, more detailed, and more manageable. GALEN is an essential part of an advanced electronic patient record system.

1.14 "Oh, so it's all about smart cards, is it?"

One example of how GALEN relates to other developing technologies is that the GALEN Terminology Server might provide the references — magic numbers — for clinical concepts to encode information on ‘smart-cards’. The GALEN Terminology Server allows a wide range of detailed information to be encapsulated flexibly in a single reference. A terminology server can later, on request, present that information in the local language or provide the relevant codes in the appropriate local coding system. Either the language or the coding system or both may be different from the one used where the information originated.

However, use in smart cards is just one potential application of the GALEN technology.

1.15 What about KBS and Decision Support?

Two of the hardest problems in developing a Knowledge Based System (KBS) are getting the information from medical records and working out the terminology and concept system (the ‘ontology’). A third major problem is that it has proved very difficult to get different KBSs to work together. GALEN aims to help with all three problems by providing much more flexible models of concepts and better ways to transform the information in medical records into forms useful for decision support. As the use of GALEN widens, most KBSs will be able to start from part of the GALEN Common Reference Model and extend it, confident that they will be able to exchange concepts with other KBSs which have also used the GALEN Common Reference Model.

1.16 What about natural language?

The GALEN Terminology Server is multilingual: a crucial and inherent part of it is the ability to generate, for any arbitrarily complex concept, a natural language expression to be viewed by clinical users (for example as it might appear as part of an electronic patient record), or to aid validation during an authoring phase. Internally, there is a sharp and well-maintained distinction between a concept and its language; this makes it relatively easy to produce generic natural-language generation facilities which can be configured to produce output in a number of different natural languages. Current implementations of the GALEN Terminology Server have demonstrated language support in: English, French, Italian, Dutch, German, Finnish and Swedish.

Extending language support does not rely on manually translating every conceivable concept; it would in any case be impossible to do this. Instead, we exploit the compositional nature of the GALEN Technology – complex concepts are only created and classified when they are first required. It is only necessary, therefore, to have previously manually translated the relatively small set of building-block concepts with which complex compositions might be put together. Together with some grammar rules, language can then be produced from any given collection, or composition, of these basic blocks. Tools and techniques have been developed to support this process.
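A minimal Python sketch of this compositional approach follows. The lexicon, the templates and the generate function are hypothetical and far simpler than real language generation (for instance, the French template ignores gender agreement and only works here because both anatomical terms are masculine). It shows how a handful of translated building blocks plus one grammar rule per language can yield a phrase for every composition.

```python
# Hand-translated building blocks (a small, hypothetical lexicon).
LEXICON = {
    "en": {"Fracture": "fracture", "Tumour": "tumour",
           "Femur": "femur", "Tibia": "tibia"},
    "fr": {"Fracture": "fracture", "Tumour": "tumeur",
           "Femur": "fémur", "Tibia": "tibia"},
}

# One toy grammar rule per language for the hasLocation link.
TEMPLATES = {
    "en": {"hasLocation": "{base} of the {value}"},
    "fr": {"hasLocation": "{base} du {value}"},
}


def generate(concept, language):
    """Render a (base, link, value) composition as a phrase in `language`."""
    base, link, value = concept
    words = LEXICON[language]
    return TEMPLATES[language][link].format(base=words[base], value=words[value])


# Four translated words per language cover all four compositions.
for base in ("Fracture", "Tumour"):
    for value in ("Femur", "Tibia"):
        concept = (base, "hasLocation", value)
        print(generate(concept, "en"), "/", generate(concept, "fr"))
```

Adding a further language in this sketch means translating the building blocks and writing the templates once, rather than translating each composed phrase.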

One of the major motivations of GALEN’s approach and one of its expected additional applications is as part of natural language understanding systems. OpenGALEN includes expertise in natural language understanding and several experiments with other research groups are under way. These techniques have so far been used to help us map elements from existing coding and classification schemes into the GALEN Common Reference Model; a natural extension, but one which we have not yet funded, would permit automatic analysis of more free-form text, such as appears in existing free-text discharge summaries, or which might be entered as part of a clinical workstation interface.

1.17 What about speech?

We are interested in applying speech technology within GALEN, but have not yet had the resources to do so. We think there may be a good fit, as an important function provided by GALEN Terminology Servers is that of retrieving what might sensibly be said about any clinical concept. This function is used now to dynamically build graphical data-entry forms, but might also be used to prune the search-space for a speech-recognition algorithm.

1.18 What happens next?

The GALEN Programme continues. It is being integrated with an architecture for clinical information systems and infrastructures (the HISA standard) as part of the EU-funded SynEx Project. GALEN Technology is being used to define an ontology of drugs, their contra-indications, effects and properties, as part of the UK's Prodigy Drug Prescribing Project.

Commercial implementations of the GALEN Terminology Server are available, and the GALEN Common Reference Model can be licensed under open-source terms through OpenGALEN (www.opengalen.org). We are also interested in developing relationships with other healthcare system vendors, for example to further develop the scope of the GALEN Common Reference Model.

In parallel with these direct exploitation and commercialisation activities, further research is underway to develop and explore the underlying technologies.

Making the impossible very difficult, ©OpenGALEN.org, All rights reserved