Introduction to the WHO Drug Dictionary

Introduction to Chapter 1
The main focus in this chapter is:
• A short introduction to the WHO Programme for International Drug Monitoring
• An introduction to the origin of the WHO Drug Dictionaries
• Practical examples of the use of the dictionaries.
Learning objectives:
• have a basic understanding of what the WHO Drug Dictionaries are and how
they can be used.

No drug is inherently safe
Consequently, no treatment involving medicines is free from
the possibility of harm.

A person taking a medicine is exposed not only to the likely
benefits of such treatment, but also the risk of unwanted
effects.
The following pages will explain the origins of the WHO
Programme and how the drug dictionary evolved from the
international drug safety reporting system in the early days of
pharmacovigilance.
Thalidomide
In 1960, a large increase in newborns with rare limb malformations
was observed in West Germany. The affected children suffered from
various degrees of reduction of the long bones of their limbs
(phocomelia), congenital heart disease, ocular, intestinal, and renal
anomalies, and malformations of the external and inner ears.
However, the limb defects were characteristic. Limb reduction
anomalies of this nature are exceedingly rare. At the university in
Hamburg, for example, no cases of phocomelia were reported
between 1940 and 1959. In 1959 there was a single case; in 1960
there were 30 cases; and in 1961 a total of 154 cases.
Thalidomide cont.
The unusual nature of the malformations was a key in unravelling
the epidemic. In 1961 the sedative thalidomide was identified as the
causative agent. Thalidomide had been introduced in 1956 as a
sedative/hypnotic and was used as a sleep aid and for nausea and
vomiting in pregnancy.
It had no apparent toxicity or addictive properties in humans or adult
animals at therapeutic doses and these fetal malformations were
not revealed during the clinical trials. Following the association with
birth defects, thalidomide was withdrawn from the market. In total,
it is estimated that almost 6000 children were born with
malformations. *
* Most of the malformations were reported in Europe since Thalidomide had
not received marketing approval in the USA.
The WHO Programme for International Drug Monitoring
The WHO Programme for International Drug Monitoring started as a
small project with a few countries wishing to pool their resources
and information from spontaneous reporting of adverse drug
reactions (ADRs). The ten countries first participating in the project
had organized national pharmacovigilance systems at that time.
The programme was established in 1968 following the thalidomide
disaster and the intention was to develop international collaboration
to make it easier to detect rare ADRs not revealed during clinical
trials. For 40 years now, the WHO Programme for International
Drug Monitoring has collected spontaneous reports of ADRs in a
standard format from the national centres. The basic idea is that the
scrutiny of such pooled data may detect some signals earlier than
by looking at purely national data.
In the mid seventies a discontinuation of the Programme was
considered mainly due to financial constraints and changes in
priorities within the WHO. The Government of Sweden offered to
meet the operating cost of the Programme and in 1978 the
operational activities were transferred to Uppsala, Sweden, where a
WHO Collaborating Centre for International Drug Monitoring was
established. The centre was later named the Uppsala Monitoring
Centre (UMC).
Member countries
The WHO Programme started with ten countries (USA, Canada, UK,
Ireland, Australia, New Zealand, Netherlands, West Germany,
Denmark and Sweden).
As of May 2013 the Programme includes 112 member countries and 32
associate member countries (not yet actively contributing data).
VigiBase
Within the WHO Programme for International Drug Monitoring
individual case reports of suspected adverse reactions are collected
and stored in a database – VigiBase.
As of September 2010 VigiBase contained approximately 5.4 million
case reports.
All drugs mentioned in these case reports are coded with the WHO
Drug Dictionaries.
Origin of the WHO Drug Dictionary
One of the core functions of the WHO Programme, since the start
and still today, is to collect, store and analyse reports regarding
suspected adverse drug reactions.
Early reports were generated and sent to the UMC from all over the
world and it soon became evident that the product information
required:
• consistency
• structure to allow easy and flexible data retrieval and to allow
analysis
• classification of chemicals and of indications
• hierarchical structure to allow different levels of precision and
to facilitate navigation and aggregation.
Already in the initial phase of the WHO Programme a common
reporting form was developed, together with a terminology for
coding of adverse reactions (WHO Adverse Reaction Terminology)
and a system for coding of information on drugs occurring in
adverse reaction reports.
The latter came to be the WHO Drug Dictionary.
Origin of the WHO Drug Dictionary cont.
The WHO Drug Dictionary was developed to meet the needs
specified on the previous page. Since 1968, the following main data
elements have been recorded for each drug product:
• product name
• name source and source version
• company
• country
• active ingredient(s)
• CAS numbers
The therapeutic indication according to Anatomical Therapeutic
Chemical (ATC) classification was introduced in the WHO Drug
Dictionary in 1982.
In 2002 the drug database was extended and more detailed
information on medicinal products could be included, e.g.
pharmaceutical form and strength.
Why a WHO Drug Dictionary?
The text in a clinical trial report - the verbatim - includes
information about the drugs taken by the patient. This information
can be of various qualities – a trade name, a substance or a very
imprecise name of a group of products.
The WHO Drug Dictionaries are used to find the names mentioned in
the verbatim and translate each of them to a code, which is entered into the
clinical data set. Sometimes other pieces of information are
important in order to identify the correct entry in the dictionary –
country, pharmaceutical form etc.
The WHO Drug Dictionaries are used by pharmaceutical companies,
Contract Research Organisations (CROs) and regulatory agencies in
order to code and analyse medical products that are mentioned in
clinical data, case reports in post marketing surveillance and other
sources.
Practical example 2
From a clinical trial you receive a reported verbatim of a
concomitant medication that says 'Zomig'.
Using the WHO Drug Dictionaries you can find out that:
• Zomig is the trade name for a drug that contains Zolmitriptan
• It is a drug that acts on the nervous system
• It is an analgesic
• It is an antimigraine preparation
• It is a selective serotonin (5HT1) agonist
Instead of entering all data about Zomig into the clinical trial
database a unique code is used. By entering this code it is possible
to retrieve all information about a product from the WHO Drug
Dictionary.
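The four classification statements in the Zomig example read like successive levels of the ATC hierarchy. The following is a minimal sketch of how such a hierarchy can be walked. It assumes the usual ATC layout, in which the levels end at characters 1, 3, 4, 5 and 7, and it assumes that zolmitriptan carries the ATC code N02CC03; neither detail is stated in this course. The function name `atc_levels` is hypothetical.

```python
# Sketch only: split an ATC code into its hierarchy levels.
# Assumption: ATC codes have 7 characters, with levels ending at
# positions 1, 3, 4, 5 and 7.
ATC_LEVEL_LENGTHS = (1, 3, 4, 5, 7)


def atc_levels(code: str) -> list[str]:
    """Return the prefixes of an ATC code, from broadest to most specific."""
    code = code.strip().upper()
    if len(code) != 7:
        raise ValueError(f"expected a 7-character ATC code, got {code!r}")
    return [code[:n] for n in ATC_LEVEL_LENGTHS]


# Labels follow the Zomig example; the code N02CC03 is an assumption
# about how ATC classifies zolmitriptan, not taken from this course.
ZOMIG_LABELS = {
    "N": "nervous system",
    "N02": "analgesics",
    "N02C": "antimigraine preparations",
    "N02CC": "selective serotonin (5HT1) agonists",
    "N02CC03": "zolmitriptan",
}

for prefix in atc_levels("N02CC03"):
    print(prefix, "-", ZOMIG_LABELS[prefix])
```

Storing only the code in the clinical data set is enough, because every broader level can be derived from it as a prefix.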
Practical example 3
From a clinical trial you receive a reported verbatim of a
concomitant medication that says 'Questran'.
Using the WHO Drug Dictionaries you can find out that:
• Questran is the trade name for a drug that contains
Colestyramine
• It is a drug that acts on the bile acid system
• It is a lipid modifying agent
• It is a bile acid sequestrant
• It is used in cardiovascular therapy because of its lipid
lowering characteristics.
Instead of entering all data about Questran into the clinical trial
database a unique code is used. By entering this code it is possible
to retrieve all information about a product from the WHO Drug
Dictionary.
Benefits of using the WHO Drug Dictionaries
The WHO Drug Dictionaries are also optimized to let you analyse the
coded data. The analysis helps companies and regulators identify
substances that may interact with the products under investigation,
and to make the best decisions based on the full understanding of
the drugs and how they are used.
When the code is entered into the clinical data set it makes it
possible to retrieve all information about the drug when assessing
and analysing the data. It also makes it possible to produce reports
– regulatory reports or internal.
An important reason for coding clinical data is to be able to optimize
the analysis of the data – and to make the best decisions. In order
to understand the positive and negative effects of medicinal
products in patients or clinical trial subjects, one must consider all
medicines they are using, not only a particular target product under
study or evaluation.
Why sponsors use lists of Medications of Interest
When analysing concomitant medication data coded with the WHO
Drug Dictionaries, many sponsors use lists of Medications of Interest.
Examples of how lists of Medications of Interest could be used:
• Search for protocol deviations
• Search for protocol compliance
• Signal detection (adverse events - whether adverse
events are caused by the Investigational Product or not)
• Disease specific medications
• Historical and baseline use of specific medications
• Average amount of drug use (within the same family, drug
class or a single ingredient)
• Underlying diseases not otherwise reported
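As an illustration of how such a list might be applied, here is a minimal sketch that flags coded concomitant medications whose Drug Record Number appears on a Medications of Interest list. The record layout, the subject IDs and the function names are hypothetical. The Drug Codes are the Ibuprofen, Paracetamol, Ipren, Panodil and Co-advil examples used in Chapter 3.

```python
# Sketch only: flag coded concomitant medications on a Medications of Interest list.
# Assumption: each record carries a Drug Code string "DDDDDD SS TTT"
# (Drecno, seq 1, seq 2), as in the Chapter 3 examples.
MEDICATIONS_OF_INTEREST = {
    "001092": "Ibuprofen",
    "000200": "Paracetamol",
}


def drecno_of(drug_code: str) -> str:
    """The first six digits of a Drug Code are the Drug Record Number."""
    return drug_code.split()[0]


def find_medications_of_interest(records):
    """Return (subject_id, ingredient) pairs for records whose Drecno is listed."""
    hits = []
    for subject_id, drug_code in records:
        name = MEDICATIONS_OF_INTEREST.get(drecno_of(drug_code))
        if name is not None:
            hits.append((subject_id, name))
    return hits


# Hypothetical subject IDs paired with Drug Codes from the course examples.
records = [
    ("S001", "001092 01 044"),  # Ipren
    ("S002", "000200 01 017"),  # Panodil
    ("S003", "010502 01 001"),  # Co-advil (combination, different Drecno)
]
print(find_medications_of_interest(records))
```

Note that matching on Drecno alone does not catch Co-advil, although it contains Ibuprofen, because combinations have their own Drecno. A list meant to capture every product containing an ingredient would also have to consult the ingredient information in the dictionary.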
Introduction to Chapter 2
2.1 General Information
The focus of this chapter is:
• What medicinal products are found in the dictionaries?
• What are the differences between the WHO Drug Dictionary and
the WHO Drug Dictionary Enhanced extended with the WHO Herbal
Dictionary?
Learning objectives:
• understand what types of medicinal products are included in the
dictionaries
• understand the differences between the dictionaries provided by
the UMC
This guide describes the WHO Drug Dictionaries in a general way,
the features, codes and nomenclature as provided by the Uppsala
Monitoring Centre.
The dictionary data can be loaded into any software system
for coding, reporting or analysis. Some features in the dictionary
may not be included in the system you are using, or other
nomenclature may be used.
The WHO Drug Dictionary
The WHO Drug Dictionaries are used within two main areas –
pharmacovigilance and clinical trials. There are many similarities in
the use in these two areas – but there are also differences.
The similarities are that the dictionary is used to identify and code
concomitant medication both in clinical trials and drug safety
reporting.
The differences are how the collected and coded data is used, but
the mode of coding is also often different; coding in clinical trials is
often done in batch – using auto-encoders, whereas in
pharmacovigilance the process is often manual.
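The batch mode can be pictured with a very small sketch: verbatims that match a dictionary name exactly are coded automatically, and everything else is left for a human coder. This is an illustration under stated assumptions, not a description of any real auto-encoder. The matching rule (case-insensitive, whitespace-normalised) and the function name are hypothetical, and the names and Drug Codes are the ones used as examples in Chapter 3.

```python
# Sketch only: batch "auto-encoding" by exact, case-insensitive name match.
# A real dictionary extract would hold far more entries than this.
NAME_TO_DRUG_CODE = {
    "ibuprofen": "001092 01 001",
    "ipren": "001092 01 044",
    "paracetamol": "000200 01 001",
    "acetaminophen": "000200 01 002",
    "panodil": "000200 01 017",
}


def autoencode(verbatims):
    """Split verbatims into automatically coded terms and terms left for manual coding."""
    coded, manual = {}, []
    for verbatim in verbatims:
        key = " ".join(verbatim.lower().split())  # normalise case and whitespace
        if key in NAME_TO_DRUG_CODE:
            coded[verbatim] = NAME_TO_DRUG_CODE[key]
        else:
            manual.append(verbatim)
    return coded, manual


coded, manual = autoencode(["Ipren", "  PANODIL ", "pain killer"])
print(coded)   # terms matched without human review
print(manual)  # terms a coder still has to resolve
```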
In clinical trials a common use is to identify protocol violations,
whereas the use in pharmacovigilance is to find signals – either
interactions or co-suspects.
In pharmacovigilance the main use of concomitant medication
information is to identify potential interactions or possible
co-suspected drugs.
Historical data
• The WHO Drug Dictionary contains data from 1968 onwards
• A majority of the products were added in the 1990s and the
2000s
• No entries are deleted even if the products are withdrawn, since
old case reports might be coded with these products
Historically the drugs that have been recorded in the dictionaries
are those which have occurred in adverse reaction reports sent to
the UMC, but as all drugs taken by patients are included (whether
they are suspected of having caused the reaction or not), the
database covers a large number of drugs used in countries in the
Programme. The data is taken from official data from drug
regulators, national drug compendia or other trustworthy sources.
The UMC provides three dictionary types:
• WHO Drug Dictionary Enhanced extended with the WHO Herbal Dictionary
• WHO Drug Dictionary Enhanced
• WHO Drug Dictionary
The next pages will describe the three dictionaries.
WHO Drug Dictionary Enhanced extended with the WHO Herbal
Dictionary is the most comprehensive dictionary type. In
the September 1, 2010 release the C format of
this dictionary contained 1,729,540 drug entries and 230,604
unique names.
WHO Drug Dictionary Enhanced
WHO Drug Dictionary Enhanced is by far the most comprehensive
dictionary type. All entries in the original WHO Drug Dictionary are
included together with additional data provided by IMS Health. The
WHO Drug Dictionary Enhanced contains medicinal products
from 113 countries (September 1, 2010), with especially high
coverage of the drugs available on the market in the 65 countries
for which IMS Health provides data.
IMS Health is an organisation that collects global drug-utilisation
data, including product information. The IMS Health data has been
processed and put into the structure of the WHO Drug Dictionaries
and it has been coded with the dictionary's codes and IDs.
The data from IMS Health also keeps the dictionary up-to-date with
new drug launches and modifications to the existing drugs. The fast
and frequent update of the dictionary ensures that drugs are
entered soon after (and sometimes before) their launch. This can be
compared with the original WHO Drug Dictionary which includes
drugs when they have appeared in ADR reports sent to the WHO
ADR database.
It is the additional data from IMS Health that gives this version of
the dictionary the additional name 'Enhanced' compared with the
WHO Drug Dictionary which doesn't contain this data. New
subscribers will automatically receive the WHO Drug Dictionary
Enhanced. Since 2005, the WHO Drug Dictionary has been available only to
customers who started their subscription before 2005 and
have not yet upgraded to WHO Drug Dictionary Enhanced.
WHO Drug Dictionary
The WHO Drug Dictionary was the only dictionary available until
2005 when the WHO Drug Dictionary Enhanced and WHO Herbal
Dictionary were introduced.
The WHO Drug Dictionary is only available for customers who
subscribed to the dictionary before 2005 and have not yet upgraded
to the WHO Drug Dictionary Enhanced. All entries in the WHO Drug
Dictionary are also included in WHO Drug Dictionary Enhanced.
The WHO Drug Dictionary does not contain the data from IMS
Health, but drugs from other sources and drugs that have appeared
in ADR reports from any of the countries in the WHO Programme for
International Drug Monitoring, will be added.
WHO Herbal Dictionary
The WHO Herbal Dictionary contains drug names of herbal remedies
together with their active ingredients and therapeutic use.
The WHO Herbal Dictionary was introduced in 2005 in order to
better represent drugs of natural origin – herbals. The UMC has
entered herbal products into the WHO Drug Dictionary for many
years.
The WHO Herbal Dictionary contains all the herbal entries that have
been entered into WHO Drug Dictionary over the years. From the
introduction of the new dictionary all herbals will be included
exclusively in the WHO Herbal Dictionary.
The products entered in the WHO Herbal Dictionary are those that
only contain herbal ingredients. Products that contain a mix of
herbal ingredients and conventional medicines appear in the WHO
Drug Dictionary Enhanced.
Example
• The product Valerian contains the herbal Valeriana officinalis
and is included in the WHO Herbal Dictionary.
• The product Neurinase contains the herbal Valeriana
officinalis and Barbital (conventional drug) and will therefore
be included in the WHO Drug Dictionary.
Practical issues
• The dictionaries are produced on a quarterly basis. The release
dates are:
    • March 1
    • June 1
    • September 1
    • December 1
• Changes to the dictionaries are made once per year, in the first
version each year (the March release). Read more about
changes in the next chapter.
• The WHO Herbal Dictionary is only available in combination
with the WHO Drug Dictionary Enhanced. Since 2010 the WHO
Herbal Dictionary has also been released quarterly.
• The dictionaries are distributed as ASCII text files and are
available for download from the Customer login file
download area or on CD.
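Because the release dates are fixed, a coding system can work out which dictionary version should be current on any given day. A minimal sketch follows; the function name is hypothetical and the only input from this course is the four release dates listed above.

```python
# Sketch only: find the most recent quarterly release on or before a given date.
from datetime import date

RELEASE_MONTHS = (3, 6, 9, 12)  # March 1, June 1, September 1, December 1


def latest_release(today: date) -> date:
    """Return the latest release date that is not after `today`."""
    candidates = [date(today.year, month, 1) for month in RELEASE_MONTHS]
    past = [d for d in candidates if d <= today]
    if past:
        return max(past)
    # January and February fall before the March release: use last December's.
    return date(today.year - 1, 12, 1)


print(latest_release(date(2010, 9, 15)))  # 2010-09-01
print(latest_release(date(2011, 1, 20)))  # 2010-12-01
```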
Geographic coverage
• Products are entered from 113 countries (September 2010).
• A majority of products are from Europe, USA, Japan and
India.
• For historical reasons some entries remain from countries that
no longer exist, e.g. former East Germany (DDR).
2.2 Different Dictionary Formats
Two Formats
The WHO Drug Dictionaries – the WHO Drug Dictionary, WHO Drug
Dictionary Enhanced and the WHO Drug Dictionary Enhanced
extended with the WHO Herbal Dictionary – are distributed in two
formats. The formats are different ways of representing the data,
using different table structures and different data fields. The
additional data fields in the C format allow individual dictionary
entries for specific form and strength variations of a drug and for
country specific information, so the C format can include many
entries that cannot be included in the B format.
Both formats are available to all customers.
Explaining Formats
In this web course we frequently refer to ‘dictionary types’ and
‘dictionary formats’. It is important that you don’t confuse the two
concepts. The difference between the dictionary types is the drug
entries that are included – the type of drugs or the number of
drugs. The difference between the dictionary formats is the amount
of information available per drug.
Three dictionary types are available:
• WHO Drug Dictionary
• WHO Drug Dictionary Enhanced
• WHO Drug Dictionary Enhanced extended with the WHO Herbal
Dictionary
Two main dictionary formats are available:
• B is a dictionary of drug names – and their corresponding
substances and therapeutic use, etc.
• C is a dictionary of Medicinal products – including form,
strength, country, etc.
C format
C format enables optimized coding and analysis of clinical and
regulatory data by distinguishing between products in different
countries and in different dosage forms and strengths. This may be
of importance for an in-depth analysis of a safety problem. Also, the
more detailed identification is sometimes important to help coders
choose the right entry – thus increasing data quality.
Since each medicinal product in the C format is assigned an ATC
code, in addition to the ATC code(s) assigned at the generic level of
the ingredient, any analysis using the ATC hierarchy is more
precise. It is possible to identify reactions that are caused by
specific pharmaceutical forms or related to the route of
administration, and to compare them with other forms of the same
product or route of administration.
In the B format each drug will be coded with all ATC codes assigned
to the preferred level of the ingredient.
The differences between the dictionary formats will be discussed
further in the course.
2.3 Content
Content
The WHO Drug Dictionaries contain the trade names of hundreds of
thousands of marketed products. The dictionary also contains
entries on the generic level.
The majority of the entries refer to conventional medicinal products
but a number of other types of products are also included. Since the
WHO Drug Dictionaries contain only medicinal products intended for
use in humans, veterinary products are not included.
Product types in the WHO Drug Dictionaries:
• Medicinal product
• Herbal remedy
• Vaccine
• Dietary supplement
• Radio-pharmaceutical
• Blood product
• Diagnostic agent
All product types above can be found in the dictionary – but
Medicinal products are the most common.
The WHO Herbal Dictionary contains products with active
ingredients of natural origin only. Apart from that the same
principles apply as to all dictionaries. The WHO Herbal Dictionary is
only available in combination with the WHO Drug Dictionary
Enhanced.
Only active ingredients
The substances included in the dictionaries are only active
ingredients.
Example:
Drug name: Alvedon
Pharmaceutical form: tablets
Contains: Paracetamol, pregelatinised starch, povidone, stearic
acid
The active substance in the tablets is Paracetamol, the other
ingredients are excipients. The only ingredient that will be added
into the WHO Drug Dictionary is Paracetamol.

Some substances can be seen as both active ingredient and
additive. For example Ascorbic Acid (Vitamin C) can be used as
a preservative but also, in vitamin products, as active
ingredient. When Ascorbic Acid is used in a pharmaceutical
formulation because of its preservative properties it will not be
added in the WHO Drug Dictionary.

Some substances can be seen as active ingredients in some
countries, but as excipients, colouring agents or preservatives
in others. To avoid inconsistencies we try to harmonize these
'grey zone' substances with the focus on non unique product
names in order to avoid unnecessary duplication of product
names in the dictionaries.
The different entries in WHO Drug Dictionaries


a) Trade name entries
b) Preferred name entries
c) Generic entries
d) Umbrella entries

The next two pages will describe each type of entry in more
detail.
a) Trade name entries


The entries in the dictionaries are mainly commercial products
– trade names.

b) Preferred name entries
When a trade name is entered it is linked to a Preferred name
entry. If the product only contains one active ingredient the
Preferred name entry is the generic name of the active
ingredient or plant. If the product contains more than one
active ingredient the Preferred name is the name of the first
entry in the system with this unique combination of
ingredients.

c) Generic entries





Sometimes additional synonym substance names can
be entered as generic drugs into the dictionaries. The
ingredient name is the Established name in English, mostly an
INN name.
Example: Acetaminophen is entered as a synonym generic
name to the Preferred name Paracetamol.
d) Umbrella entries (also known as NOS and ATC
entries)
Sometimes concomitant medication is reported without any
reference to substances or trade names but to a chemical or
therapeutic group of substances. A trade name can also be
reported with unspecific substance information or with a
substance group as ingredient. These entries still contain
important information and need to be coded. The Umbrella
entries have been created in order to allow coding of these
imprecise verbatims. Umbrella entries are added on request by
the users or when found necessary in the coding of ADRs at
the UMC.
A few entries are also included with less specific information,
such as Penicillin NOS (Not Otherwise Specified) and Beta
Blocking Agents. Many of these correspond to ATC terms, but
some additional umbrella entries are also included.
The Umbrella entries that are derived from ATC terms are
entered as a preferred name but they are not coded with any
active ingredients, since they refer to a group of substances.
Umbrella entries with NOS are coded to an ingredient and the
substance name always contains NOS last in the name (e.g.
Penicillin NOS).
Please note that it is not possible to link trade names to
umbrella entries derived from ATC codes the same way that
trade names are linked to conventional preferred names.
Umbrella entries with 'NOS' are coded with an ingredient and
can therefore be linked to trade names.
Generic levels

The dictionaries contain trade names of medicinal products.

They also contain entries with information on the generic level.
These should be used if you only know the active ingredient of
the products you want to code.

A few entries are also included with even higher level of
information, such as Penicillin NOS and Beta Blocking Agents.
Levels
Using the WHO Drug Dictionaries, coding can be done at
different levels. The code system in the WHO Drug Dictionaries
is structured in a hierarchical way which allows coding and
analysis at different levels of precision.
Example
Base level: Ibuprofen
Salt level: Ibuprofen lysinate, Ibuprofen sodium
References used in the dictionaries
IMS Health is the main source of proprietary names, i.e. trade
names of drugs, in WHO Drug Dictionary Enhanced and WHO
Herbal Dictionary. National drug compendia, selected by the
countries participating in the WHO International Drug
Monitoring Programme, or international reference books are
used as references for double/triple-checking of the
information.
The main references to generic drug names are:
• INN (International Non-proprietary Names)
• Martindale
• Index Nominum
• Negwer
• Stoffliste

• USAN
• Merck Index
All data fields – names, manufacturers etc – are entered in the
dictionaries so that they match the texts in the references as
closely as possible.
Name sources


The primary name source for non-proprietary names of
substances is the INN (International Non-proprietary Names).
Scientific plant names are an international language used by
botanists from all over the world for naming and describing
plants. Besides the accepted scientific plant names and their
synonyms, vernacular names for plants are also used. These
are understandable only in the language where they are used.
The accepted scientific plant names have been assigned
through the collaboration between the UMC and the Royal
Botanic Gardens, Kew, United Kingdom. The scientific plant
synonyms and the vernacular names for the accepted scientific
plant names have been assigned in collaboration with the
Department of Systematic Botany at Uppsala University.
Introduction to Chapter 3
The focus of this chapter is:

Why are there two coding systems in the dictionaries and what
is the difference between the two systems?

How are codes used in the dictionaries?

How do I interpret information from the Drug Code?
Learning objectives:

have a basic understanding of the codes used in the
dictionaries

understand the intrinsic information in the Drug Code
Preface


The WHO Drug Dictionaries contain a large number of data
fields with information about the Medicinal Products. This
information is used to identify a product that should be coded
in a database, e.g. concomitant medication in clinical trial data
or drug safety data.

The WHO Drug Dictionaries contain two types of IDs that can
be used as links between the case reports in the Clinical trials
database/Drug Safety database and the WHO Drug
Dictionaries. The Medicinal Product ID identifies an entry in the
C format of the dictionaries. The Drug Code is the unique
identifier in the B format.
Two coding systems
The WHO Drug Dictionary contains two types of identifiers (IDs)
that can be used as links between the case reports in the Clinical
trials database/Drug Safety database and the WHO Drug Dictionary:

Medicinal Product ID – unique identifier in the C format

Drug Code – unique identifier in the B format
In the C format both codes are used. Many users implement Drug
Code in the database, as a way to improve querying performance.
Medicinal Product ID
The Medicinal Product ID is the unique identifier that identifies an
entry in the C format. The ID is just a numeric name of the
medicinal product and, unlike the Drug Code, it has no intrinsic
meaning.
The Medicinal Product ID identifies a unique combination of the
following data:

Medicinal Product name

Name Specifier

Marketing Authorisation Holder

Country

Pharmaceutical form (dosage form)

Strength

Drug Code (Drug Record Number + Seq 1 + Seq 2)
Please note that not all entries in the C format contain
information in all fields.
Example
Med Prod ID 123645 identifies the following entry:
• Medicinal Product name: Bezalip
• Name Specifier: sr 400mg (sr = slow release)
• Marketing Authorisation Holder: Hoffmann-la roche limited
• Country: Canada
• Pharmaceutical form: Coated tablets
• Strength: 400 mg
• Drug Code: 005441 01 003
If Bezalip becomes available in, for example, another pharmaceutical
form, that will lead to a new entry and consequently a new Med
Prod ID.
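One way to picture this is to treat the listed fields as a single record and the Medicinal Product ID as an arbitrary key for it. The sketch below is an illustration only: the class, the `register` helper and the placeholder IDs 999998 and 999999 are hypothetical, and the "Capsules" form is invented for the example. The field values are copied from the Bezalip entry.

```python
# Sketch only: a C-format entry as a record keyed by Medicinal Product ID.
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MedicinalProduct:
    name: str
    name_specifier: str
    marketing_authorisation_holder: str
    country: str
    pharmaceutical_form: str
    strength: str
    drug_code: str  # Drecno + seq 1 + seq 2


bezalip = MedicinalProduct(
    name="Bezalip",
    name_specifier="sr 400mg",
    marketing_authorisation_holder="Hoffmann-la roche limited",
    country="Canada",
    pharmaceutical_form="Coated tablets",
    strength="400 mg",
    drug_code="005441 01 003",
)

registry = {123645: bezalip}  # Medicinal Product ID -> entry


def register(registry, product, new_id):
    """Add a product under new_id unless the same combination already has an ID."""
    for existing_id, existing in registry.items():
        if existing == product:
            return existing_id
    registry[new_id] = product
    return new_id


# Hypothetical: a different pharmaceutical form is a different combination,
# so it needs its own ID (999999 is a made-up placeholder).
other_form = replace(bezalip, pharmaceutical_form="Capsules")
print(register(registry, bezalip, 999998))     # 123645: combination already known
print(register(registry, other_form, 999999))  # 999999: new entry
```

The ID itself carries no information; all meaning sits in the fields it points to, including the Drug Code.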
So what do the numbers representing the Drug Code stand
for?
Drug Code
The term Drug Code refers to the unique numeric key in the B
format of the dictionary. The B format is the older of the dictionary
formats and since it is a dictionary of drug names a Drug Code
identifies a name, either a trade name, a Preferred name or a
generic name. The Drug Code is used also in the C format, where it
is not a unique key but still has the same intrinsic meaning as in the
B format.
The Drug Code is composed of the Drug Record Number (Drecno),
Sequence number 1 and Sequence number 2. The code differs from the
Medicinal Product ID in that it has a meaning; the code is not only a
unique identifier of a name – it also gives information about the
active ingredient(s) and salt/ester form of the substance.
Example
Drug code for Ibuprofen is 001092 01 001
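The three parts can be separated mechanically. The following is a minimal sketch, assuming the 6 + 2 + 3 digit layout seen in the examples of this chapter; the class and function names are hypothetical. The parts are kept as strings so that leading zeros are preserved.

```python
# Sketch only: split a Drug Code into its three parts.
from typing import NamedTuple


class DrugCode(NamedTuple):
    drecno: str  # 6 digits: Drug Record Number
    seq1: str    # 2 digits: Sequence number 1
    seq2: str    # 3 digits: Sequence number 2


def parse_drug_code(text: str) -> DrugCode:
    """Parse '001092 01 001' or '00109201001' into (drecno, seq1, seq2)."""
    digits = "".join(text.split())
    if len(digits) != 11 or not digits.isdigit():
        raise ValueError(f"expected 11 digits, got {text!r}")
    return DrugCode(digits[:6], digits[6:8], digits[8:])


print(parse_drug_code("001092 01 001"))
```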
Drug Record Number
The Drug Record number (drecno) identifies a generic identification
level.
In most cases the generic identification level is the one active
ingredient, but it can also identify a unique combination of active
ingredients.
Example (the Drecno is the first column):
Drecno   seq 1   seq 2   Name
001092   01      001     Ibuprofen
000200   01      001     Paracetamol
010502   01      001     Co-advil (Ibuprofen w/pseudoephedrine HCL)
034074   01      001     Espasmo motrax (Ciclonium bromide/Ibuprofen)
The four examples above have different Drecnos, which means that
they contain different active ingredients – Ibuprofen, Paracetamol,
Ibuprofen w/pseudoephedrine HCL, Ciclonium bromide/Ibuprofen.
Sequence number 1
Sequence number 1 (seq 1) identifies the salt or the ester of the
active ingredient in single ingredient drecnos. The number 01
identifies the base substance (e.g. Ibuprofen) and values above 01
will identify salts and esters (e.g. Ibuprofen lysinate).
Example (seq 1 is the second column):
Drecno   seq 1   seq 2   Name
001092   01      001     Ibuprofen
001092   02      001     Ibuprofen lysinate
001092   05      001     Ibuprofen sodium
The three entries all have the same active ingredient – Ibuprofen,
but in different 'salts', lysinate and sodium.
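The rule that seq 1 = 01 is the base substance and higher values are salts or esters can be applied directly to the three Ibuprofen entries. A minimal sketch follows. The tuple layout and function name are hypothetical, and the sketch assumes that the rows with seq 2 = 001 are the ones naming each salt level, as in the example.

```python
# Sketch only: separate the base substance from its salts/esters within one Drecno.
ENTRIES = [
    ("001092", "01", "001", "Ibuprofen"),
    ("001092", "02", "001", "Ibuprofen lysinate"),
    ("001092", "05", "001", "Ibuprofen sodium"),
]


def split_base_and_salts(entries, drecno):
    """Return (base_name, [salt/ester names]) for a single-ingredient Drecno."""
    base, salts = None, []
    for rec_drecno, seq1, seq2, name in entries:
        if rec_drecno != drecno or seq2 != "001":
            continue  # skip other Drecnos and trade-name/synonym rows
        if seq1 == "01":
            base = name
        else:
            salts.append(name)
    return base, salts


print(split_base_and_salts(ENTRIES, "001092"))
```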
Sequence number 1 for multi ingredient products – products with
more than one active ingredient – is described on the next page.
Sequence number 1 cont.
Drecnos that identify more than one active ingredient (multi
ingredient or combinations) cannot have any salts or esters and
therefore combinations will have only one seq 1 with value 01.
Example (seq 1 is the second column):
Drecno   seq 1   seq 2   Name
010502   01      001     Co-advil (Ibuprofen/Pseudoephedrine HCL)
034074   01      001     Espasmo motrax (Ciclonium bromide/Ibuprofen)
Sequence number 2
Sequence number 2 (seq 2) identifies trade names and in some
cases a synonym to a generic name (e.g. Acetaminophen as a
synonym to Paracetamol). The entry with seq 2 value 001 identifies
the name of the generic Drecno level – the Preferred Name. In
single ingredient Drecnos this will be the substance name, in multi
ingredient Drecnos it will be the trade name of the first product with
the given combination that was entered into the dictionary.
Example showing seq 2 (third column):
Drecno   seq 1   seq 2   Name
001092   01      001     Ibuprofen
000200   01      001     Paracetamol
001092   01      044     Ipren (trade name containing Ibuprofen)
000200   01      017     Panodil (trade name containing Paracetamol)
Ipren and Panodil are trade names and are therefore given seq 2
higher than 001.
Sequence number 2 cont.
Drecno   seq 1   seq 2   Name
000200   01      002     Acetaminophen (synonym to Paracetamol)
Acetaminophen is a synonym to Paracetamol. It is a generic name
but it is not the preferred name since it is not the name used in the
INN standard. The Acetaminophen entry is therefore given a seq 2
higher than 001.
Multi ingredient products need to have a separate coding
principle. It is not always possible to include the names of all active
ingredients in the name field, so the first entry with the unique
combination of ingredients will be the preferred name even though
it is not always a generic name.
Example
The entry 010502 (Co-advil) is a multi-ingredient product. The entry
with seq 1 = 01 and seq 2 = 001 is the preferred name but it is not
generic. The generic entry for the combination of ingredients in
Co-advil is Ibuprofen w/pseudoephedrine hydrochloride, with seq 2
= 016 (the entry is generic but not preferred).
Drecno   seq 1   seq 2   Name
010502   01      001     Co-advil (Ibuprofen/Pseudoephedrine hydrochloride)
010502   01      016     Ibuprofen w/ pseudoephedrine hydrochloride
Interpret information from the Drug Code
Example of one Drug code: 000005 01 001
Using the first six positions will give you the Drecno. With the
Drecno you can identify the Preferred name, in this example the
Drecno is 000005 and that stands for Ampicillin.
Positions 7-8 tell you whether the substance is a base or a salt
(or an ester). If these positions are 01 you have the base; if the
number is higher than 01 you have a salt or an ester.
The Drug Code can be used in data retrieval and analysis. The
identification levels are useful for querying and analysis of
aggregated data and that is (as mentioned before) one of the
reasons that the C format also has the Drug Code system even
though it is not the unique identifier.



• Drecno identifies all products with the same active ingredient(s)
• Drecno + seq 1 identifies the active ingredient base or salt/ester
  forms
• Drecno + seq 1 + seq 2 identifies trade names and generic names
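The three identification levels can be read straight off the code by position. The following is a minimal sketch, not an official tool: the names `DrugCode`, `parse_drug_code`, `is_base` and `is_preferred_name` are made up for illustration, and it assumes the code is written as 11 digits with optional spaces between the parts, as in the examples above.

```python
from typing import NamedTuple


class DrugCode(NamedTuple):
    drecno: str  # positions 1-6: generic identification level
    seq1: str    # positions 7-8: 01 = base, above 01 = salt or ester
    seq2: str    # positions 9-11: 001 = Preferred Name, above 001 = other names


def parse_drug_code(code: str) -> DrugCode:
    """Split a Drug Code such as '000005 01 001' into its three parts."""
    digits = code.replace(" ", "")
    if len(digits) != 11 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected 11 digits, got {code!r}")
    return DrugCode(digits[:6], digits[6:8], digits[8:])


def is_base(code: DrugCode) -> bool:
    # Meaningful for single ingredient Drecnos; combinations always have 01.
    return code.seq1 == "01"


def is_preferred_name(code: DrugCode) -> bool:
    return code.seq2 == "001"


ampicillin = parse_drug_code("000005 01 001")
print(ampicillin.drecno, is_base(ampicillin), is_preferred_name(ampicillin))
```

Grouping coded data by `drecno` alone, by `drecno` plus `seq1`, or by the full code corresponds to the three levels listed above.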
Introduction to Chapter 4
The main focus in this chapter is:

How are the dictionaries structured?

Why are there two dictionary formats and what are the
differences between them?
Learning objectives:

have a basic understanding of the dictionaries’ structure

understand the differences between the C format and the B
format
Preface
The WHO Drug Dictionaries contain main tables and look-up tables
and are built as a relational database.
In a computerized clinical data management system or
pharmacovigilance system, information must be recorded in a
structured way to allow for easy and flexible retrieval and analysis
of the data. The information that goes into a database can be
divided into two main categories: numerical data and alphabetical
data (text or codes).
Numerical and Alphabetical data
Numerical data is typically the result of counts or measurements
and is recorded as the number of what is counted or the amount of
what is measured. The unit of measurement should be added to the
value.
Alphabetical data poses more of a problem, in that it is usually more
complex and difficult to record in a systematic way. Some textual
data falls into natural categories with clear divisions and a limited
number of possible entries. Consistency is however not
automatically achieved, in that there are many ways of expressing
the same thing. Therefore, data entry must be restricted to a
selection from a list containing only predefined, allowed terms
expressed as formatted text or codes.
Terminologies
The simplest form of a terminology is a straightforward enumeration
of terms, commonly listed alphabetically, e.g. a list of countries or
pharmaceutical dosage forms.
When a larger number of terms are involved, it should be
considered whether the list could be organized in a more structured
way. By grouping the terms and assigning them to classes or
categories, a logical classification can be formed. If the classes can
be ranked one above the other, the classification can be structured
in a hierarchical way.
Terminologies cont.
The advantage with a hierarchical classification is that it enables the
use of different levels of precision and detail, both at data entry and
retrieval. A complication occurs when a term belongs to more than
one class. There are two ways of dealing with this:


• allowing polyhierarchy, i.e. assigning or linking terms to more
  than one class
• choosing one preferred class for each term
The former structure can be useful for retrieval purposes (less
chance of 'missing' a term); however, in the presentation of results
of calculations one must be aware of the risk of the same term
being included under several headings and, therefore, counted more
than once. Using the second option, this risk is eliminated. However
this method is more restricting, and there must be clear guidelines
as to what goes where in the system.
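The double-counting risk can be made concrete with a small sketch. Everything in it is hypothetical: the terms, the classes and the reports are placeholders, chosen only to show that per-class counts under polyhierarchy can add up to more than the number of reports.

```python
from collections import Counter

# Hypothetical terminology: term_a belongs to two classes.
POLYHIERARCHY = {
    "term_a": ["class_1", "class_2"],
    "term_b": ["class_1"],
    "term_c": ["class_2"],
}
# Second option: exactly one preferred class per term.
PREFERRED_CLASS = {"term_a": "class_1", "term_b": "class_1", "term_c": "class_2"}

reports = ["term_a", "term_a", "term_b", "term_c"]  # four coded reports


def count_polyhierarchy(terms):
    counts = Counter()
    for term in terms:
        for cls in POLYHIERARCHY[term]:
            counts[cls] += 1  # the same report is counted once per class
    return counts


def count_preferred(terms):
    return Counter(PREFERRED_CLASS[term] for term in terms)


poly = count_polyhierarchy(reports)
pref = count_preferred(reports)
print(sum(poly.values()), sum(pref.values()), len(reports))
```

With polyhierarchy the class totals sum to more than the four reports, because each `term_a` report appears under two headings. With one preferred class per term the totals always sum to the number of reports.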
The next four pages contain some basic database knowledge to read
prior to the chapter describing the structure of the WHO Drug
Dictionaries.
Basic database knowledge 1/4
What is a dictionary?
• A book explaining or translating, usually in alphabetical order,
  words of a language or languages
• The vocabulary or whole list of words used or admitted by
  someone
• An ordered list stored in and used by a computer
• A book of information or reference on any subject in which the
  entries are arranged alphabetically
• A person or a thing regarded as a repository of knowledge,
  convenient for consultation.
WHO Drug Dictionary is a dictionary because:
• It is a repository of information about medicinal products
• It contains lists of medicinal products and information related
  to them
• The information is structured
• The information is stored in a database.
What is a database?
Structured sets of stored data e.g. on a CD or in a computer
(usually associated with software to update and retrieve the
data)
A simple database might be a single file containing many rows
(records). This is sometimes called a flat file database and an
example of this kind of database can be an MS Excel sheet.
Each record contains the same set of columns (fields).
Basic database knowledge 2/4
What is a Relational Database?

A database containing more than one table

A database where the data and relations between them are
organized in tables

The tables are linked together with relations

A relation is a column whose data are the same in two or more
tables; such a column is defined as a key

Certain fields may be designated as keys.

It allows the definition of data structures, storage and retrieval
operations and integrity constraints.
WHO Drug Dictionary is a Relational Database because:

It contains tables having relations between them.

Example: an ATC code has a link (relation) to a medicinal
product. A medicinal product has a link to its active
ingredients.
Basic database knowledge 3/4
‘Parent’ and ‘Child’ relationships:
One parent can have one or more children.
• One to one relationship
- a car has one registration number and one year of production
• One to many relationship
- a car can have several different drivers
Medicinal Products relationships
• One to one relationship (one table)
- a medicinal product has one set of attributes, e.g. name, form and
strength
- these data fields are stored in the Medicinal product table (in the C
format of the WHO Drug Dictionaries)
• One to many relationship (linked tables)
- a medicinal product can have one or more different active
ingredients
- information on ingredients is stored in the Ingredients table, linked
to the Medicinal Product table.
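The one to many relationship can be sketched with two linked tables. This is only an illustration using Python's built-in `sqlite3` module: the table and column names and the ID values are assumptions, not the actual C format layout, and the form and strength columns are left empty because the text gives no values for them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE medicinal_product (          -- one row per product (one to one attributes)
    medicinal_product_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    form     TEXT,
    strength TEXT
);
CREATE TABLE ingredient (                 -- one row per active ingredient (one to many)
    medicinal_product_id INTEGER NOT NULL
        REFERENCES medicinal_product (medicinal_product_id),
    substance TEXT NOT NULL
);
""")

# Hypothetical IDs; names and ingredients are taken from the examples in chapter 2.
conn.executemany(
    "INSERT INTO medicinal_product (medicinal_product_id, name) VALUES (?, ?)",
    [(1, "Ibuprofen"), (2, "Co-advil")],
)
conn.executemany(
    "INSERT INTO ingredient (medicinal_product_id, substance) VALUES (?, ?)",
    [(1, "Ibuprofen"), (2, "Ibuprofen"), (2, "Pseudoephedrine hydrochloride")],
)


def ingredients_of(product_name):
    """Follow the link from the product table to its ingredient rows."""
    rows = conn.execute(
        """
        SELECT i.substance
        FROM medicinal_product AS p
        JOIN ingredient AS i USING (medicinal_product_id)
        WHERE p.name = ?
        ORDER BY i.substance
        """,
        (product_name,),
    )
    return [substance for (substance,) in rows]


print(ingredients_of("Co-advil"))
```

The product's own attributes sit in a single row, while the number of ingredient rows varies per product.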
Basic database knowledge 4/4
What is a Key?





A relation (link) between two tables is defined as a key
A field or a group of fields in a table that uniquely identifies each record, preventing the table from containing duplicate records
Keys are used to sort records in alphabetical or numerical order
Keys form a table index, so that searches for specific values of the key field(s)
will go faster
Keys ensure referential integrity between tables, so that a ‘child’ table cannot
refer to records which are not in the ‘parent’ table
What is a Look-up table?
Look-up tables contain pre-defined, allowed values, expressed as formatted text or
codes. When linked to a field in a database table, the look-up table ensures that a value
entered in that field matches an existing value in the look-up table. Look-up tables also
allow for translation of values into different languages, as well as short and long text
versions for each value stored.
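A look-up table combined with a key gives exactly this kind of check. The sketch below uses Python's built-in `sqlite3` module. The table and column names and the code '001' for Tablet are assumptions made for the example; only '900' for Combination appears later in this course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked to
conn.executescript("""
CREATE TABLE pharmaceutical_form (        -- the look-up table
    code  TEXT PRIMARY KEY,
    label TEXT NOT NULL
);
CREATE TABLE product (                    -- the 'child' table that refers to it
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    form_code  TEXT NOT NULL REFERENCES pharmaceutical_form (code)
);
INSERT INTO pharmaceutical_form VALUES ('001', 'Tablet');       -- hypothetical code
INSERT INTO pharmaceutical_form VALUES ('900', 'Combination');
""")


def try_insert(name, form_code):
    """Return True if the row is accepted, False if the code is not in the look-up table."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO product (name, form_code) VALUES (?, ?)",
                (name, form_code),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def form_label(name):
    """Translate the stored code into its text through the look-up table."""
    row = conn.execute(
        "SELECT f.label FROM product AS p "
        "JOIN pharmaceutical_form AS f ON f.code = p.form_code WHERE p.name = ?",
        (name,),
    ).fetchone()
    return row[0] if row else None


print(try_insert("Product A", "900"), try_insert("Product B", "999"))
```

The second insert is rejected because '999' has no matching row in the look-up table, which is the referential integrity described under keys.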
Look-up tables
In many fields standardised look-up tables are used. The code that
is entered into the tables (in both the C format and the B format)
can be found together with the corresponding texts in the look-up
tables.
Some look-up tables contain values for Not Specified, None and Not
applicable. These values may seem similar, but they have the
following meaning:

Not Specified. This value is used when there should be a value
– but the information isn’t available in the sources.


None. This value is used when a coded value would have been
irrelevant. For example a generic name (like the substance
paracetamol) cannot have a Market Authorisation Holder.
Not applicable (N/A). This value is used in the country field in
the Organization table. The companies coded in the C field
Company are not bound to a specific country – they are the
international owner of the product. The company field is set to
Not Applicable.
Since the C format contains more information than the B format, the C format
requires more look-up tables. In this introduction course the look-up tables will
not be dealt with. We will focus on the main tables.
Dictionary formats
In the 1960's when the WHO Drug Dictionary was created it only
contained basic information about each product. However, in 2005 a
new format was introduced which gives you access to more
information about the products in the dictionary – which will be
useful in the coding process and the analysis of the coded data. The
old (B-format) and the new (C-format) are both available in
parallel.
The Dictionary types are available in both the B and the C-format.
The B-format is a dictionary of drug names. A verbatim name can
be compared to this list of names – and the dictionary will return
information that is useful for coding and analysis: a unique code
(the Drug Code), the active ingredient(s) and the Anatomical
Therapeutic Chemical class(-es) that the drug belongs to.
Unfortunately names that appear in verbatims often refer to more
than one entry in the B-format – it can be available with different
ingredients in different countries, in different pharmaceutical forms
etc. The fact that the same drug name can refer to different
products with different active ingredients is known as non-unique
names.
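Finding non-unique names in a list of name and Drug Code pairs is a simple grouping exercise. The sketch below is illustrative only: the names and codes are invented placeholders, not real dictionary entries, and it treats a name as non-unique as soon as it maps to more than one Drecno.

```python
from collections import defaultdict

# Invented placeholder entries: (drug name, Drug Code)
ENTRIES = [
    ("Examplin", "900001 01 002"),
    ("Examplin", "900002 01 005"),
    ("Examplin", "900003 01 001"),
    ("Samplol", "900004 01 003"),
]


def find_non_unique_names(entries):
    """Return the names that refer to more than one Drecno, with their codes."""
    codes_by_name = defaultdict(set)
    for name, drug_code in entries:
        codes_by_name[name.casefold()].add(drug_code)
    result = {}
    for name, codes in codes_by_name.items():
        drecnos = {code.replace(" ", "")[:6] for code in codes}
        if len(drecnos) > 1:  # different sets of active ingredients
            result[name] = sorted(codes)
    return result


print(find_non_unique_names(ENTRIES))
```

A verbatim that matches such a name cannot be coded from the name alone; additional information is needed to pick one of the listed codes.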
The C-format was introduced to help coders and other users of the
dictionary to understand the differences between the non-unique
names.
The more detailed information in the C-format is useful in the
coding of medical data. Additional information, such as country,
pharmaceutical form and strength, is sometimes important in order
to choose the right entry and get the codes and classifications that
best represent the drug.
The C-format enables optimized coding and analysis of clinical and
regulatory data by distinguishing between products in different
countries and in different dosage forms and strengths. This may be
of importance for an in-depth analysis of a safety problem. It is
possible to identify reactions that are caused by specific
pharmaceutical forms or related to the route of administration, and
to compare them with other forms of the same product or route of
administration.
Two formats
A: The old format from the old database (no longer produced)
B: The new name for the old format, produced from the new database
   (previously called B-2)
C: The name of the format introduced in 2002
Different format - various focus
The B format is a dictionary of Drug Names
The C format is a dictionary of Medicinal Products
The concepts with the different formats will be described on the
next pages.
Concept C format

The C format is a dictionary of Medicinal Products

Each Trade name always has one Preferred name

The same Trade name can be available in several countries

A Trade name in one country may have several pharmaceutical
forms (e.g. cream, oral suspension, tablet)

Each of the pharmaceutical forms (e.g. tablet) in one country
may have several strengths and all of these entries are unique
medicinal products

Each Preferred name has at least one ingredient (except
umbrella entries)

Preferred names can be linked to many Trade names

Preferred Names are linked to all ATC codes assigned to the
ingredient(s)

Trade names are linked to the ATC code assigned for the
specific trade name
Concept B format

The B format is a dictionary of drug names

Each Preferred name has at least one ingredient (except
umbrella entries)

Preferred names can be linked to many Trade names

Trade names always have one Preferred name

Preferred Names are linked to all ATC codes assigned to the
ingredient(s)

Trade names will have all ATC codes linked to Preferred name.

The WHO Drug Dictionary Browser
The WHO Drug Dictionary Browser makes the contents of
the dictionaries, including information from the C format,
available to all users.
The WHO Drug Dictionary Browser allows you to get access to
all features of the dictionaries without having to modify and
re-validate your systems.
Simply enter the verbatim text into the browser and it will
return the entries made in the dictionary.
If you are using the B format you can use the browser as a
reference to understand the difference between non-unique
name entries, information that is often available in the C
format.
The WHO Drug Dictionary Browser is also a good educational
tool. It will illustrate most of the important features that are
covered in this course.





Search page
The first page you reach after login is the Search page. The
following pages will guide you through an authentic search query of
a non unique product name called "Dolmen", mentioned in 8.3.
There are several search options in the WHO Drug Dictionary
Browser:





Product Name
Drug Code
Medicinal Product ID
Substance or combinations of substances
Filter the query result
Condensed search result









If we use the browser to search for the trade name "Dolmen",
we get this search result:
The Condensed search result of the browser resembles the B
format.
The product name Dolmen could contain three different sets of
ingredients/drug codes. The fact that one of them is marked as
Preferred ("Yes") does not mean that it is the preferred coding
choice, it just implies that it is the name of the first entered
product in the dictionary with that unique combination of
substances (for more information read 8.3.5).
To be able to choose between the three ‘Dolmen’ drug
codes you can either view the Result page (which resembles
the C-format) or you can use Compare to view the entries
side-by-side.
Result
The screenshots show the search result for each "Dolmen"
drug code with its C-format information: name specifier, country,
MAH, pharmaceutical form, strength and a unique MP-id for
each entry.
To get information about the preferred base/salt, the ATC
code(s) etc., click through to the Product page (4.3.5).
To get an overview of the information, use the Compare
functionality, which condenses the information for each drug
code and presents it side by side (4.3.7).
Product page
All information about the individual medicinal product is
presented on the Product page.
Click on the preferred base/salt to view its Product
information. The ATC code links to the ATC tab where the code
is presented according to the organ or system the substance
acts on.
Compare
To be able to choose the correct entry, compare all relevant
data for the selected products side by side in one interface.
Compare may help you choose the correct entry faster.
Compare shows a combination of the B- and C-format
information of the drug codes/medicinal products.


ATC
ATC shows how the substances are grouped according to the
organ or system upon which they act and their chemical,
pharmacological and therapeutic properties.
For example, M01AE is the ATC code of the Dolmen entry
(DC: 013638 02 008):
M MUSCULO-SKELETAL SYSTEM
M01 ANTIINFLAMMATORY AND ANTIRHEUMATIC PRODUCTS
M01A ANTIINFLAMMATORY AND ANTIRHEUMATIC PROD., NONSTEROIDS
M01AE Propionic acid derivatives
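Because each ATC level is a prefix of the next, the hierarchy above can be derived from the code itself. The sketch below is a minimal illustration with a made-up function name, `atc_levels`. The level lengths 1, 3, 4 and 5 follow the example above; the 7-character fifth level (the chemical substance) is part of the general ATC structure but is not shown in this course.

```python
ATC_LEVEL_LENGTHS = (1, 3, 4, 5, 7)  # characters used by levels 1 to 5

# Texts copied from the M01AE example above
ATC_TEXT = {
    "M": "MUSCULO-SKELETAL SYSTEM",
    "M01": "ANTIINFLAMMATORY AND ANTIRHEUMATIC PRODUCTS",
    "M01A": "ANTIINFLAMMATORY AND ANTIRHEUMATIC PROD., NONSTEROIDS",
    "M01AE": "Propionic acid derivatives",
}


def atc_levels(code):
    """Return the code of every level from the top down to the given code."""
    code = code.strip().upper()
    if len(code) not in ATC_LEVEL_LENGTHS:
        raise ValueError(f"{code!r} is not a complete ATC level code")
    return [code[:n] for n in ATC_LEVEL_LENGTHS if n <= len(code)]


for level in atc_levels("M01AE"):
    print(level, ATC_TEXT.get(level, "(text not listed here)"))
```

Truncating a code in this way is what allows coded data to be aggregated at a higher level, for example all entries under M01A.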
Substance search

Use the Substance search to find the product names and drug codes
for one or more ingredients.
To view all available codeine salts in the dictionary, such as codeine
phosphate: write the base name codeine and pick the specific salt when
verifying the substance against the substance register.
WHO DD Browser administration tool
To customize the browser settings, use the WHO DD Browser
administration tool. Select which dictionary version you would like
to use. If you are coding concomitant medication for a clinical trial,
set the browser to the same dictionary version that is used for
coding the trial.
The possibility to select which dictionary type to browse is intended
for CROs that have to adapt the coding of concomitant medication
to their sponsor's WHO Drug Dictionary subscription.
Export
Save your browser query results and export the data to an XML file
that can be opened and analysed in e.g. MS Excel and SAS. When
evaluating protocol inclusion and exclusion criteria, or listing
medications of interest, you can make the lists in the browser.
Make one or a series of queries, save the search results of
interest in 'Export' (a 'shopping cart' functionality) and export the
data of interest in a single download.
Request
New drug request: If you can't find a product name in the
dictionary and would like it to be entered, please send us the
information. If we can verify the information, the product will be
entered into the dictionary the following quarter.
Change request: If any of the product information is incorrect, please
report this to the UMC by filling in the form, and specify your source
of information. Changes to the dictionaries are only made once per
year, in the 1st of March release.
Would you like to try the WHO DD Browser?
If you already have the browser but haven't got the features Export,
Compare and Request, contact us if you would like to upgrade.
If you have any questions regarding the WHO DD Browser or if you
would like to apply for a one week test account, please contact
[email protected]
Introduction to Chapter 5
The main focus in this chapter is:



What is a name specifier?
Presenting an overview of the most important tables in the dictionaries.
Presenting ways of distributing/presenting the dictionary data depending on your
own system.
Learning objectives:


understand how to interpret the elements of a verbatim (name, name specifier,
pharmaceutical form etc)
understand where in the dictionary the corresponding information can be found.

Preface
As already described in the previous chapter the WHO Drug
Dictionaries contain main tables and look-up tables and are
built as a relational database. This chapter will explain the
main tables of the C format and the B format. Important fields
in the Medicinal Product table in the C format will be described
in more detail.

This chapter will increase the understanding of how to
interpret the elements of a verbatim (name, name specifier,
pharmaceutical form etc), and understand where in the
dictionary the corresponding information can be found.
The WHO Drug Dictionary Browser (Chapter 4.3) can be used
as a reference to access C format information.
Main Tables in the C format:





Medicinal Product
Pharmaceutical Product
Ingredient
Therapeutic Group
Substance
Each table will be described on the next pages.
The Medicinal Product table will be described in more detail.
Medicinal Product table
This is the main file in which most information about the product is
recorded. A medicinal product entry represents one product as it is
marketed in a specific country. The Medicinal Product file contains
the unique identifier and the Trade Name. It also contains
information about the company responsible for the product
internationally and the company that markets the product in the
given country, the Marketing Authorisation Holder. The type of
product is also recorded. A majority of the entries are regular
medicinal products, but an increasing number of vaccines, for
example, are also recorded.
The table’s unique identifier is the Medicinal Product ID and it is
used to tie together the records in other tables that belong to the
same drug entry.
Pharmaceutical Product table
The database format allows for the use of a two level structure of
the product’s information. Some medicinal products contain more
than one pharmaceutical form, or more than one type of the same
pharmaceutical form. The two level structure makes it possible to
record both the Medicinal Product (product name, manufacturer etc)
and the Pharmaceutical Products with their individual ingredients.
Examples:


a suppository is packaged together with a cream
oral contraceptives often contain three different types of
tablets
Therapeutic Group table
The Therapeutic Group table is a pointer to the ATC table, allowing
one Medicinal Product to have one or more ATC codes, and one ATC
code to be used by several Medicinal Products.
In the Anatomical Therapeutic Chemical (ATC) classification system,
the drugs are divided into different groups according to the organ or
system on which they act and their chemical, pharmacological and
therapeutic properties. ATC is covered in the next chapter.
The table also contains information about the type of ATC
(conventional ATC or herbal ATC) and if the assignment of this
particular Medicinal Product complies with the ATC guides, or if it is
an individual assignment.
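A pointer table of this kind is a list of pairs that can be read in both directions. The sketch below is an assumption-laden illustration: the field names, the product IDs and the assignments are invented, and the ATC codes are used only as opaque strings.

```python
from typing import NamedTuple


class TherapeuticGroup(NamedTuple):
    medicinal_product_id: int  # hypothetical IDs
    atc_code: str
    atc_type: str              # "conventional" or "herbal"
    official: bool             # True if the assignment complies with the ATC guides


THERAPEUTIC_GROUP = [
    TherapeuticGroup(101, "M01AE", "conventional", True),
    TherapeuticGroup(102, "M01AE", "conventional", True),
    TherapeuticGroup(102, "N02BE", "conventional", False),
]


def atc_codes_for(medicinal_product_id):
    """One Medicinal Product can have one or more ATC codes."""
    return sorted(
        row.atc_code
        for row in THERAPEUTIC_GROUP
        if row.medicinal_product_id == medicinal_product_id
    )


def products_for(atc_code):
    """One ATC code can be used by several Medicinal Products."""
    return sorted(
        row.medicinal_product_id
        for row in THERAPEUTIC_GROUP
        if row.atc_code == atc_code
    )


print(atc_codes_for(102), products_for("M01AE"))
```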
Ingredient table
This table links one or more substances to the Pharmaceutical
Product table. It contains the identifier of the substance as well as
information about the amount of the substance. All medicinal
products should have at least one ingredient (exceptions are
umbrella entries like Beta blockers NOS).
One Ingredient entry is created for each active ingredient in the
product.
Substance table
The Substance table is the look-up table for the substance name. It
is linked from the Ingredient table, where the substance_id is
translated into the substance name. Currently only names in English
are included.
Important fields in the Medicinal Product table
The following pages give more facts behind the important fields in
the Medicinal Product table.
The fields that will be described are:







• Drug name
• Name Specifier
• Sequence number 3
• Sequence number 4
• Generic
• Medicinal Product ID
• Drug Code
Drug name
This field can contain trade names or generic names.
The names are entered as written in the source, with the exception
of special characters (such as ä, Å, ü), which need to be
transcribed.
For example: The trade name 'Kåvepenin' will be transcribed to
Kavepenin.
The product names are coded as closely as possible to the original
name.
The general rule is that the first letter of the name is in upper case,
while the rest is lower case. Sometimes you can find exceptions if
the name has another format – like Act-HIB. Some name sources
are only written in upper case which makes it difficult to be
consistent.
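The transcription step can be approximated in a few lines. This is only a sketch under an assumption: it strips diacritical marks using Unicode decomposition, which reproduces the Kåvepenin example, but the dictionary's actual transcription rules are not spelled out here and may differ for other characters (letters such as ø or ß are not changed by this approach).

```python
import unicodedata


def transcribe(name: str) -> str:
    """Remove diacritical marks, e.g. 'å' becomes 'a' and 'ü' becomes 'u'."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(transcribe("Kåvepenin"))
```

The function leaves the letter case untouched, so names with their own format, like Act-HIB, pass through unchanged.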
Name Specifier
Sometimes drugs are given names that consist of a ‘core name’ and
a name extension. This name extension is called a ‘name specifier’
in the WHO Drug Dictionaries. The medicinal product name specifier
is an additional element added to the medicinal product name in
order to distinguish medicinal products with the same medicinal
product name. The full product name consists of the Drug name field
and the Name Specifier together.
Information about the following properties of a Medicinal Product is
entered in the Name Specifier field:




Form
Strength
Route of administration
Manufacturer/MAH
Note that Name Specifiers are used inconsistently in different
countries. The entries are made according to the customs of the
country where the drug is marketed.
Example:
Clavucid is the Belgian name of a product that contains 500 mg
amoxicillin and 125 mg of clavulanic acid. The same product is
known in Switzerland as Clavucid 625.
Name Specifiers related to the product’s pharmaceutical
form and pharmacokinetic properties






enteric coated
slow release
CR
etc
Name Specifiers related to strength
The strength related name specifiers can be either a
precise/numeric description of the strength, or an imprecise
term – such as Forte, Plus, Mite.
Multi ingredient products may also have name specifier
information that describes the amount of the different
ingredients. There is no international standard for how these
are written.
Name Specifiers based on company related properties
The example shows entries where the company name is considered
to be a name specifier. There are some exceptions where the
manufacturer’s name is defined as a part of the trade name in the
reference. Generic products are often marketed with the name of
the substance together with the company name. These are entered
into the dictionary with the combined name intact in the name field.
Sometimes non-generic trade names are also marketed together with
the company name, e.g. parallel imported drugs often have the name
of the company that makes the import as a part of their names. These
products are entered into the dictionaries with the company name in
the name specifier.
Name Specifiers based on company related properties cont.
In the entry Ampicillin-ratiopharm® 250 TS both 250 and TS are
considered name specifiers, but the company name is considered to
be part of the trade name.
Since Methyldopa is a generic name the full name Methyldopa
STADA is considered to belong to the name field.
Since Neupogen is a trade name AMGEN is considered to be a name
specifier.
Important Exception – Name Specifiers related to
composition
For Name Specifiers that are based on the composition, the
combination of active ingredients is treated as a part of the name
field. The reason for this is that the users of the B format wouldn’t
otherwise be able to distinguish between the entries, since the name
specifier isn’t included in the B format.
Notice that some commonly used name specifiers have different
meanings in different products. The name specifier ‘Plus’ can
sometimes refer to 'extra strong' or to 'includes additional
ingredients'.
When the first entry containing ‘Plus’ is entered it is sometimes not
possible to tell if it refers to a strength or a composition. The ‘Plus’
is therefore entered in the name field if its meaning is unknown.
Name specifier - Old Form
Sometimes products change composition without changing trade
name or name specifier.
In order to distinguish between the old and the new the flag /OLD
FORM/ is added to the name specifier field of the old entry.
Example:
Treo Comp /OLD FORM/: Acetylsalicylic acid, Aprobarbital, Caffeine,
Codeine phosphate
Treo Comp: Acetylsalicylic acid, Caffeine, Codeine phosphate
The ‘old form flag’ is only included in country specific entries, since
the drug name sometimes keeps the old composition in one country
while a new composition is introduced in another country.
Information about Old Form was previously only shown in the C
format. Since 2010 this information is also available to B format
users in an additional table (OldForm DrugCode List.txt). The
information has also been refined to make it possible to distinguish
between products that are flagged as /OLD FORM/ in some or in all
listed countries, which is useful for both B and C format users
(see e.g. chapter 8.6).
Sequence number 3
This sequence number shows which pharmaceutical form the
product has – you can find the corresponding text in the
Pharmaceutical form look-up table. This information is also recorded
in the Pharmaceutical Product table. If a Medicinal Product has more
than one pharmaceutical form this field will be set to 900 Combination.
Note that this field is a code field. The same code is always used for
a specific pharmaceutical form.
Sequence number 4
This sequence number shows the strength – amount of active
ingredient. The sequence number 4 corresponds to a text
description of the strength - this is found in the Strength look-up
table.
If the product contains more than one active ingredient the
strengths will be separated by a "/" in the Strength look-up table.
This information is also available in the Ingredient table.
Note that this field is a code field. The same code is always used for
a specific strength.
Generic
This field contains either a Y or an N.
Y – Yes, if the name of this entry is considered generic.
N – No, if the name of this entry is considered not generic.
For single substance entries all Preferred Name entries are set as
generic. Sometimes additional generic entries are added as a
complement to the preferred names.
Acetaminophen is a synonym to Paracetamol. It is a generic name
but it is not the preferred name since it is not the name used in the
INN standard. The Acetaminophen entry is therefore given a seq 2
higher than 001.
Multi ingredient products are explained on the next page
Generic - Multi ingredient products
Multi ingredient products need to have a separate coding principle.
It is not always possible to include the names of all active
ingredients in the name field, so the first entry with the unique
combination of ingredients will be the preferred name even though
it is not always a generic name.
The entries in the example are combination products, all three with
the same ingredients (ampicillin and cloxacillin). The entry with seq 1
= 01 and seq 2 = 001 is the preferred name but it is not generic.
Sometimes generic entries for combination products are entered on
request. This is only possible if the combined name is shorter than
45 characters. Some abbreviations are acceptable, but it must be
possible to understand the text without risk of mistakes. The
substance name should as far as possible remain complete.
Notice that the generic entry for a multi ingredient product may also
be the preferred name entry – but this is an exception rather than a
rule.
Medicinal Product ID
This field is a serial number with no intrinsic meaning, and it is
generated automatically.
The Medicinal Product ID is unique for each entry in the database
and the C format. The ID does not change when information in the
entry changes.
To read more about the Medicinal Product ID please see chapter 2.
Drug Code
A Drug Code identifies a name, either a trade name or a generic
name. The Drug Code is the unique key in the B format and it is
also used in the C format. In the C format the Drug Code is not the
unique key, but it still has the same intrinsic meaning as in the B
format.
The Drug Code is an aggregation of Drug Record Number (Drecno),
Sequence number 1 (seq1) and Sequence number 2 (seq2). The
code differs from the Medicinal Product ID in that it has a meaning.
The code is not only a unique identifier of a name – it also gives
information about the active ingredient(s) and salt/ester form of the
substance.
The Drug Code of a drug is generated automatically based on the
name, the substance(s) and salt/ester of the product.
To read more about the Drug Code please see chapter 2.
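The way the Drug Code is built up can be illustrated with a short sketch. It assumes the field widths seen in the examples used in this course (a six-digit Drecno, a two-digit seq1 and a three-digit seq2, as in 003127 01 001). The class and function names are hypothetical and are not part of any UMC software.

```python
from typing import NamedTuple


class DrugCode(NamedTuple):
    drecno: str  # Drug Record Number: identifies the active ingredient(s)
    seq1: str    # Sequence number 1: identifies the salt/ester form
    seq2: str    # Sequence number 2: identifies the individual name


def parse_drug_code(code: str) -> DrugCode:
    """Split an 11-digit Drug Code, written with or without spaces."""
    digits = code.replace(" ", "")
    if len(digits) != 11 or not digits.isdigit():
        raise ValueError(f"expected 11 digits, got {code!r}")
    return DrugCode(digits[:6], digits[6:8], digits[8:])


def is_preferred_name(code: DrugCode) -> bool:
    # Taken from the course text: the preferred name is the entry
    # with seq1 = 01 and seq2 = 001.
    return code.seq1 == "01" and code.seq2 == "001"


ketoprofen = parse_drug_code("003127 01 001")
orudis = parse_drug_code("003127 01 002")
```

Because the two codes share the same Drecno and seq1, a program can tell that Orudis contains the same substance and salt as the preferred name ketoprofen without looking at the names themselves.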
5.3 Main tables in the B-format
Main tables:
The DD table
The DDA table
The ING table
The BNA table
Each table will be described on the next pages.
The DD table
The DD table is the main table in the B format. It contains the basic
information about the drug - name, company etc. The other tables
are linked to the DD table through the Drug Code which is the
unique identifier in the DD table.
The DDA table
The DDA table is a pointer to the ATC table, allowing one DD entry
to have one or more ATC codes, and one ATC code to be used by
several DD entries.
The table also contains information about the type of ATC - the
conventional ATC or the Herbal ATC, and if the assignment of this
particular drug complies with the ATC guides, or if it is an individual
assignment.
The ING table
The Ingredient (ING) table links the DD entry to one or more
ingredients - found in the BNA table.
The substances included in the dictionaries are only active
ingredients.
References to ‘multi-ingredient products’ in this document mean
products with more than one active ingredient.
The BNA table
The Substance name table (BNA) translates the CAS number in the
ING table to the substance name. Currently only names in English
are included.
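The links between the four tables can be sketched as follows. This is a simplified illustration only: the field names and the CAS placeholder value are assumptions made for the example and do not reflect the real file layout. The names and ATC codes are taken from the ketoprofen example used later in this course.

```python
# DD table: one row per drug name, keyed by the Drug Code.
dd = {
    "00312701001": {"name": "Ketoprofen"},
    "00312701002": {"name": "Orudis"},
}

# DDA table: many-to-many link between DD entries and ATC codes.
dda = [
    ("00312701001", "M01AE"),
    ("00312701001", "M02AA"),
    ("00312701002", "M01AE"),
    ("00312701002", "M02AA"),
]

# ING table: links a DD entry to the CAS number of each active ingredient.
# The CAS value below is a placeholder, not a real CAS number.
ing = [
    ("00312701001", "CAS-PLACEHOLDER-1"),
    ("00312701002", "CAS-PLACEHOLDER-1"),
]

# BNA table: translates a CAS number to the substance name.
bna = {"CAS-PLACEHOLDER-1": "Ketoprofen"}


def ingredients(drug_code: str) -> list[str]:
    """Follow DD -> ING -> BNA to list the active ingredients."""
    return [bna[cas] for code, cas in ing if code == drug_code]


def atc_codes(drug_code: str) -> list[str]:
    """Follow DD -> DDA to list the ATC codes of an entry."""
    return [atc for code, atc in dda if code == drug_code]
```

In this sketch the Drug Code is the only key needed to move from a name in the DD table to its ingredients and ATC codes.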
Introduction to Chapter 6
The main focus in this chapter is:

Did you know that there are many benefits with using the C
format?
Learning objectives:

have more insight about the differences between the C format
and the B format and the benefits with using the C format.
Introduction
The C format makes it possible to distinguish between form and
strength when coding. Using the C format the quality of your coded
information will be higher and that will benefit your analysis and
retrieval of data. This chapter will describe the benefits of the C
format!
First we will explain how the B format relates to the C format.
C format in relation to B format
The C format is very similar to the database in which the UMC
enters and maintains the dictionary data. Since the B format is
produced from a database that is closely related to the C format,
there is a mapping between the two dictionaries as well as inclusion
criteria that determine which database entries are included in
the B format. Some look-up tables are also converted, e.g. the two
formats have different codes for countries. The B format only has
the Drug Codes while the C format has both the Drug Codes and the
Medicinal Product IDs.
The C format contains entries with different levels of information
(see next page). The B formats do not contain information about
pharmaceutical form or strength – so the C format will contain a
large number of entries that can’t be included in any of the B
formats.
The C format contains many entries with different levels of
information. The same name can appear with different
pharmaceutical forms, different strengths, in different countries etc.
Information levels in C format
When an entry is made into the WHO Drug Dictionaries it contains
information about the drug name, active ingredients,
pharmaceutical form, strength, country etc. All this information is
available in the C format of the dictionary.
When coding drug information in clinical data, not all of this
information is always available. Therefore the C format needs to
include several levels of precision. When a drug is recorded in the
database a number of less specific entries are generated automatically.
Based on the recorded data a number of additional entries are
created – one where the information about strength is set to
unknown, one with the pharmaceutical form set to unknown etc.
The least specific entry includes only information about the drug
name and the active ingredients.
Note that the table above illustrates all possible information levels.
The complete data entry contains the data in information level 6.
Based on this entry a number of less specific entries are generated
automatically – levels 1, 4 and 5.
Level 3 is not generated automatically, but some level 3 entries
appear in the dictionaries for other reasons. Level 3 contains country
information but no marketing authorisation holder (MAH).
Level 2 is often not generated since the name specifier is left empty
in the levels where the corresponding information (strength,
pharmaceutical form or company) is no longer included. For
example a name specifier that refers to the pharmaceutical form will
not be used in levels 4 and below.
Not all products are available in all six information levels. Entries
made before 2002 were entered in levels 1-4.
Inherited information
As described above the entries with the data content that
corresponds to information level 6 will be used as a basis for the
less specific entries. These less specific entries may inherit
information from more than one higher level entry, e.g. if a product
is available in two pharmaceutical forms with different ATC codes,
the entry without any information about pharmaceutical form will be
coded with both these codes.
Comparison between the B and the C-formats. Additional
information not shown in these screenshots from the WHO Drug
Dictionary Browser are the respective ATC codes of the entries. The
ways of representing the ATC codes in the different formats are
described in chapter 7.4.
Name, Form, Strength, Country
The C format has a number of advantages over the alternative – the
B format.
The B format is the format in which the WHO Drug Dictionary has
been distributed for over 20 years. It is a dictionary of drug names,
where a name can be looked up and translated to coded information
– mainly active ingredients, drug codes (which represent active
ingredients and salts/esters) and the Anatomical Therapeutic
Chemical classification.
In the B format the name should only appear once, so the first time
a name is found it will be entered: e.g. the first time the drug
Dolmen was found it was in a Spanish case report sent to the UMC.
This entry is included in the B format without country information. It
was later found that the same trade name is also used in Italy – but
with other active ingredients. This new entry has to be added to the
dictionaries. In order to include it in the B format the names of the
entries have to be made unique. This is done by adding the drug
record number and the sequence 1 to the name in the B format.
If it is found that the name is used in a third country – but with one
of the existing compositions, it will be added to C, but not to B.
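The naming rule described above can be sketched in a few lines. The suffix layout (a space, a slash, the six-digit drug record number followed by the two-digit sequence 1, and a closing slash) is inferred from the Dolmen entries shown in this course. The function names are hypothetical.

```python
def b_format_name(name: str, drecno: str, seq1: str) -> str:
    """Append /drecno+seq1/ to a name that would otherwise not be unique."""
    return f"{name} /{drecno}{seq1}/"


def split_b_format_name(entry: str) -> tuple[str, str, str]:
    """Recover (name, drecno, seq1) from a name carrying the suffix."""
    name, _, suffix = entry.rpartition(" /")
    code = suffix.rstrip("/")
    return name, code[:6], code[6:8]
```

For example, `b_format_name("Dolmen", "006499", "01")` gives `Dolmen /00649901/`, and `split_b_format_name` reverses the operation so that the drug record number can be used to look up the active ingredients.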
ATC coding
Both the B formats and the C format contain the ATC classification
(more about the ATC classification in the next chapter). In the B
formats all products are coded with the same ATC codes as their
preferred name - an active ingredient or unique combination of
active ingredients. For example, all products containing Aciclovir will
be coded with the following ATC codes:
In the C format a specific product is coded with the ATC code that
reflects the most common use of the product, e.g. an Aciclovir
product used mainly topically would be coded with only the D06BB
code.
Coding
As described above the C format contains data that makes the
analysis of the clinical data more reliable. It also makes the coding
easier since more information is available when investigations are
necessary.
As described above the name needs to be unique in the B format.
This causes problems since the same name can appear with
different active ingredients for a number of reasons:
The same drug name is used in different countries with different
sets of ingredients.
The same drug name is used in different dosage forms which
contain different sets of active ingredients.
A product has changed its composition without changing its trade
name.
In the B format all this is solved by adding the drug record number
and the sequence 1 to the name. This gives information about the
active ingredient(s) and the salt or ester of the ingredient.
The C format makes this information available to the coder. It
contains information about the countries in which the products are
available as well as the different dosage forms, together with the
corresponding ingredients and drug code. This helps the coder choose
the right entry by allowing them to see all relevant information for a
drug entry.
Introduction to Chapter 7
The main focus in this chapter is:


What is an ATC term?
How and why is the ATC classification integrated in the
dictionaries?
Learning objectives:


get a basic understanding of the Anatomical Therapeutic
Chemical classification
understand how and why drugs are assigned ATC codes
Preface
ATC stands for Anatomical Therapeutic Chemical.
In the ATC classification drugs are divided into different groups
according to the organ or system on which they act and their
chemical, pharmacological and therapeutic properties.
Basic facts
The Anatomical Therapeutic Chemical (ATC) classification is an
integrated part of the WHO Drug Dictionaries. The ATC classification
is maintained by the WHO Collaborating Centre for Drug Statistics
Methodology in Oslo.
Anatomical: the organ or system on which a drug acts
Therapeutic (and Pharmacological): indication for typical use(s); pharmacological form
Chemical: compound structure and properties
Through the hierarchical ATC classification it is possible to analyze
the data – not only by comparing products containing the same
substances but to aggregate statistics on different levels using the
hierarchies built into the dictionaries, and to compile line listings.
The UMC has developed a specific Herbal ATC which is used in the
WHO Herbal Dictionary - it follows the same principles as the
conventional ATC classification. Herbal ATC will be covered in the
last chapter of this course.
Levels
The ATC classification hierarchical levels:
1. 14 anatomical groups designated by the letters A – V
2. Therapeutic main groups
3. Therapeutic/pharmacological subdivision which is designated by letters
4. Therapeutic/pharmacological/chemical subgroup which is designated by letters. In this level the pharmacological properties and the chemical nature of the substance are taken into account
5. Individual substance which is designated by numbers. This level is not used in the WHO Drug Dictionaries
Structure
In the Anatomical Therapeutic Chemical (ATC) classification system,
the drugs are divided into different groups according to the organ or
system on which they act and their chemical, pharmacological and
therapeutic properties.
Drugs are classified in groups at five different levels.
The drugs are divided into fourteen main groups (1st level) with one
pharmacological/therapeutic subgroup (2nd level). The 3rd and 4th
levels are chemical/pharmacological/therapeutic subgroups and the
5th level is the chemical substance. The 2nd, 3rd and 4th levels are
often used to identify pharmacological subgroups when that is
considered more appropriate than therapeutic or chemical
subgroups.
Note that the drugs in the WHO Drug Dictionaries are coded on the
fourth level. The Drug Record Number in the WHO Drug Dictionaries
has a meaning corresponding to the fifth level, but with higher
precision.
ATC classification of the substance Metformin - an example
The substance Metformin is a biguanide antidiabetic. It is given
orally in the treatment of type 2 diabetes mellitus.
The ATC code for Metformin is A10BA 02
Thus, in the ATC system all plain metformin preparations are given
the code A10BA 02. Since the drugs in the WHO Drug Dictionaries
are coded on the fourth level, products with Metformin are recorded
with the ATC code A10BA.
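The relationship between the full ATC code and the fourth-level code used in the dictionaries can be sketched as below. The character counts per level (1, 3, 4, 5 and 7) follow the standard layout of ATC codes. The function names are hypothetical.

```python
def atc_levels(code: str) -> dict[int, str]:
    """Return the prefix of an ATC code at each hierarchical level present."""
    compact = code.replace(" ", "").upper()
    # Number of characters that make up the code at levels 1 to 5.
    lengths = {1: 1, 2: 3, 3: 4, 4: 5, 5: 7}
    return {lvl: compact[:n] for lvl, n in lengths.items() if len(compact) >= n}


def dictionary_atc(code: str) -> str:
    """Truncate to the fourth level (five characters) used in the dictionaries."""
    return code.replace(" ", "").upper()[:5]


metformin_levels = atc_levels("A10BA 02")
metformin_in_dictionary = dictionary_atc("A10BA 02")
```

For Metformin the five levels are A, A10, A10B, A10BA and A10BA02, and the code recorded in the WHO Drug Dictionaries is A10BA.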
Browse the ATC Classification!
Please go to https://dictionaries.who-umc.org/dd_browser_atc
(the web site will open in a new window)
Login
User name: [email protected]
Password: Webtraining_2010
Navigate the ATC classification, understand the hierarchical
structure in more detail, and compare different ATC codes.
ATC coding in the WHO Drug Dictionaries
Products are coded on the fourth level (five characters)
Level five in ATC identifies a substance or combination. It overlaps
with the Drecno system, which is more precise.
Examples:
The ATC coding of combination products is sometimes limited
– e.g. 'Paracetamol in combination', 'Codeine in combination'
The Drecno system identifies the unique combination of substances
– e.g. Paracetamol and Codeine, Codeine and Ibuprofen
Nomenclature
International non-proprietary names (INN) are preferred.
If INN names are not assigned, USAN (United States Adopted
Name) or BAN (British Approved Name) names are usually chosen.
WHO’s list of drug terms is used when naming the different ATC
levels.
Principles of classification
Medicinal products are classified according to the main therapeutic
use of the main active ingredient, on the basic principle of only one
ATC code for each pharmaceutical formulation (i.e. similar
ingredients, strength and pharmaceutical form).
A medicinal product can be given more than one ATC code if it is
available in two or more strengths or formulations with clearly
different therapeutic uses.
A medicinal product may be used for two or more equally important
indications, and the main therapeutic use of a drug may differ from
one country to another. This will often give several classification
alternatives. Such drugs are usually only given one code, the main
indication being decided on the basis of the available literature.
Problems are discussed in the WHO International Working Group for
Drug Statistics Methodology where the final classification is decided.
Cross-references will be given in the guidelines to indicate the
various uses of such drugs.
Example:
The substance Aciclovir can be coded with any of the following ATC
codes:
In the C format a specific product is coded with the ATC code that
reflects the most common use of the product, e.g. an Aciclovir
product mainly used ophthalmologically would be coded with
the S01AD code.
Principles of classification cont.
The ATC system is not strictly a therapeutic classification system. At
all ATC levels, ATC codes can be assigned according to the
pharmacology of the product. Subdivision on the mechanism of
action will, however, often be rather broad, since a too detailed
classification according to mode of action often will result in having
one substance per subgroup which as far as possible is avoided.
Some ATC groups are subdivided in both chemical and
pharmacological groups. If a new substance fits in both a chemical
and pharmacological 4th level, the pharmacological group should
normally be chosen.
Substances classified in the same ATC 4th level cannot be
considered pharmacotherapeutically equivalent since their mode of
action, therapeutic effect, drug interactions and adverse drug
reaction profile may differ.
Multiaxiality
A medicinal product can be given more than one ATC code if it is
available in two or more strengths or formulations with clearly
different therapeutic uses.
Example:
Aciclovir can have three different ATC codes – depending on the
main use of the product
Principles for changes to ATC Classification
As the drugs available and their uses are continually changing and
expanding, regular revisions of the ATC system will always be
necessary.
Changes in the ATC classification should be kept to a minimum.
Before alterations are made, difficulties arising for the users of the
ATC system are considered and related to the benefits achieved by
the alteration.
The ATC classification is revised on a yearly basis – as a part of the
March 1 release. The revision mainly consists of additions, but
sometimes some groups are moved or split.
Alterations in ATC classification are made when the main use of a
drug has clearly changed, and when new groups are required to
accommodate new substances or to achieve better specificity in the
groupings. An example of this is shown in chapter 8.6 'Keeping your
dictionary up to date'.
Preferred name or trade name level assignments of ATC
Codes
The ATC guidelines describe how active substances are assigned
ATC codes. The ATC classification of drugs is made according to the
most common use; the ATC code is not synonymous with the indication.
The classification has been developed to allow different types of
analysis on drug utilization data, clinical and safety data.
ATC assignments can be made on two levels of precision in the drug
dictionaries:
1. Preferred name level - Shows all ATC codes assigned to the
products containing the specific substance(s).
2. Trade name level - In the default settings of the dictionary
formats (B and C) the precision of the ATC assignment differs on
the trade name level. In the C format, products are assigned
only the ATC codes that correspond to the most common use of
the particular product.
Sometimes it is important to know the primary use of a specific
medicinal product – and sometimes it is more important to
know the many potential areas of use (and effects) of the product.
There are differences in how ATC codes are assigned and presented
in the B and the C format. But by using the ATC tool which will be
described later in this chapter, you can code in either format but
use the benefits of the ATC code precision of both formats.
ATC code assignment in the B & C formats
B- format: All trade names sharing the same Drug Record Number
and Sequence 1 (active ingredient and salt) are assigned all ATC
codes of the preferred entry:
C-format: In the C format ATC codes can be assigned with higher
precision:
As you can see in the C format example above the ATC assignment
depends on the level of information of the entry.
Preferred name level: The least specified entry (info level 1) of
the preferred name is the top of the hierarchy and inherits all ATC
codes of all entries sharing the same drecno and sequence 1. In this
example (ketoprofen DC: 003127 01 001) there are about 350
different trade names sharing the same drecno and sequence
1. Each of the 350 trade names could potentially have different ATC
codes. All ATC codes of the trade names are inherited by the info
level 1 entry of ketoprofen (MP ID 23054).
Trade name level: All individual trade names sharing the same
drecno, sequence 1 and having sequence 2 = 002-999. The table
shows one of these trade names - Orudis DC: 003127 01 002.


Unspecified trade name level (info level 1): The
unspecified trade name level inherits all ATC codes of the
specified trade name level entries (info level 2-6). In the
Orudis example there are almost 200 specified Orudis entries,
each having either of the two ATC codes M01AE or M02AA
depending on the area of use.
Specified trade name level (info level 2-6): Comprises the
most specific coding alternatives in the dictionaries and is only
available in the C format. Most entries at this level just have
one ATC code assigned but they can have several.
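The inheritance of ATC codes up the hierarchy can be sketched as a union of sets. The Orudis name and the two ATC codes come from the example above. The two specified entries and their form labels are illustrative placeholders, not dictionary content, and the function name is hypothetical.

```python
# Specified Orudis entries (info levels 2-6). Only two placeholder
# entries are listed here; the dictionary holds almost 200.
specified_orudis = [
    {"name": "Orudis", "form": "oral form (placeholder)", "atc": {"M01AE"}},
    {"name": "Orudis", "form": "topical form (placeholder)", "atc": {"M02AA"}},
]


def inherited_atc(entries):
    """Union of the ATC codes of all more specific entries."""
    codes = set()
    for entry in entries:
        codes |= entry["atc"]
    return codes


# The unspecified trade name entry (info level 1) inherits from the
# specified entries.
orudis_level1 = inherited_atc(specified_orudis)

# The preferred name entry (info level 1 of ketoprofen) inherits from all
# trade names sharing the same drecno and sequence 1. Only Orudis is
# included in this sketch.
ketoprofen_level1 = inherited_atc([{"atc": orudis_level1}])
```

Each specified entry keeps its single code, while the less specific entries above it carry every code found below them.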
Trade name level assignment

Until 2008 only the C format had a higher precision of ATC
code assignment where each drug entry is assigned a
minimum of ATC codes, preferably only one. Trade name level
assignment is relevant when individual cases are analysed and
when certain reports are produced.
From 2008 a similar precision in ATC code assignment is also
available for B format users. An additional table (DDA
Exclusive.txt) is available in the ATC Tool making it possible to
map the Drug Code to the ATC table with higher precision.
Preferred name level assignment
When analyzing large data sets and when identifying protocol
violations it is more important to know all the ATC terms of the
active ingredients of a drug.
Since 2008 C format users can adapt their coded data to
make broader analyses by using the additional extensive ATC
assignment table (ThG Extensive.txt) to map the coded data to the
ATC table for more extensive analysis - similar to the B format ATC
assignment.
ATC Tool
The files that have been described in this chapter can be found in
the 'WHO DD ATC Tool'-folder in your downloaded WHO drug
dictionary:
DDA Exclusive.txt (B format)
ThG Extensive.txt (C format)
Introduction to Chapter 8
The main focus in this chapter is:


What is the general principle of coding drug verbatims found in
CRFs and ADR reports?
How do I deal with difficult verbatims?
Learning objectives:



understand how to code with the highest possible precision (form,
strength if available)
facilitate the investigation of omission lists
have a basic understanding of how to use ATC in the coding
process, to give input to the company's optimal use of coding
resources.
General principle of coding co-medication
1. The verbatim text in a case report form (CRF) contains
information about drugs taken by the patient
2. Identify each drug name
3. Find the corresponding terms in the WHO Drug Dictionary and
code the verbatim term.
The general principle is easy and straightforward. Often,
unfortunately, coding co-medication is much more complicated…
Auto-encoding
The WHO Drug Dictionary is often used for auto-encoding of CRFs.
Auto-encoding means that the clinical data is coded by a program,
which matches original text to predetermined dictionary terms. The
match has to be exact, otherwise an omission is created and the
term must subsequently be coded manually.
Example:
If 'Paracetamol' is reported - Paracetamol will be auto-encoded
If 'Parecetamol' is reported - an omission will be created and
manual coding is needed
Sometimes the data on the CRF is not precise enough to find only
one hit e.g. when a trade name is available in several countries and
pharmaceutical forms. If so an omission will be created and manual
coding is needed.
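The behaviour described above can be sketched as a simple exact-match lookup. This is not how any particular auto-encoder is implemented. The index is a tiny invented subset, the entry identifiers are placeholders, and the function name is hypothetical.

```python
# Name -> list of matching dictionary entries (placeholder identifiers).
INDEX = {
    "Paracetamol": ["entry-P1"],
    # A name that exists with several compositions gives more than one hit.
    "Dolmen": ["entry-D1", "entry-D2", "entry-D3"],
}


def auto_encode(verbatim: str):
    """Return the single matching entry, or None when an omission is created."""
    hits = INDEX.get(verbatim, [])
    if len(hits) == 1:
        return hits[0]
    # No hit (e.g. a misspelling) or several hits: manual coding is needed.
    return None


omission_list = [v for v in ["Paracetamol", "Parecetamol", "Dolmen"]
                 if auto_encode(v) is None]
```

Here 'Paracetamol' is coded automatically, while 'Parecetamol' (no hit) and 'Dolmen' (more than one hit) end up on the omission list.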
Manual coding
When coding manually, good clinical judgment should be used when
an exact match to a verbatim term cannot be found.
The term that is chosen from the WHO Drug Dictionaries should be
as precise as possible.
Example 1:
If the verbatim is 'Paracetamol', the term Paracetamol should be
coded.
Example 2:
If the verbatim is 'Dolprone', the term Dolprone should be coded.
(Dolprone is a trade name containing paracetamol).
Example 3:
If the verbatim is 'Dolprone tablets 500 mg', the entry in the WHO
Drug Dictionaries that contains all the information (trade name,
form, strength) should be chosen, in this case the trade name
Dolprone with strength '500' and form 'tabletten'.
This is possible if using the C-format where form and strength can
be found. If using the B-format the entry with the correct trade
name will be the most precise term (in this example the term
Dolprone).
Manual coding cont.
Preferably the coder should code the verbatim term to the medicinal
product name/trade name, but if this is not possible it is appropriate
to select the Preferred name. When this is the case it is important to
make sure that the chosen Preferred name reflects the substance
(or the combination of substances), salt or ester of the verbatim
term. The UMC is constantly updating the WHO Drug Dictionaries
with new medicinal products but some trade names among the
verbatims in the CRFs can still be missing.
Example 1:
If 'Paracetamol/Codein' is reported, select Paracetamol with codein
Example 2:
If the verbatim is 'Kickan (paracetamol)', the trade name Kickan is
not to be found in the WHO Drug dictionary.
The verbatim implies that the medicinal product with the trade
name Kickan contains paracetamol as an ingredient. The term can
be coded to the generic entry Paracetamol
Non-unique names
The WHO Drug Dictionary B format is a dictionary of drug names.
Each name must be unique. Sometimes an additional code has to be added
to the drug name to make it unique, because the same name can appear
with different active ingredients:
1. The same drug name is used in different countries with different
sets of ingredients.
2. The same drug name is used in different pharmaceutical forms
which contain different sets of active ingredients.
3. A product has changed its composition without changing its trade
name.
In the B format this is solved by adding the drug record
number and the sequence 1 to the name. This gives
information about the active ingredient(s) and the salt or
ester of the ingredient.
Example:
The text 'Dolmen' in a verbatim text does not return a direct hit if
you query the Drug Name field in the DD table in the B format.
The system returns the following entries that contain the text
'Dolmen':
Dolmen /00649901/
Dolmen /00685301/
Dolmen /01363802/
The next pages will show a way to deal with Drug names like
these. An alternative way of dealing with non unique names
is to use the WHO Drug Dictionary Browser (section 4.3).
Non-unique names cont.
1. Find what the codes mean
In order to understand the difference between the Dolmen
entries you can first find out what the codes mean. This step
depends on how your dictionary management system is set up. You
may be able to look into the full drug entry, where you can find the
active ingredients.
Further investigation is necessary. Looking up the active ingredients
corresponding to the code you can see that:
Dolmen /00649901/ contains Acetylsalicylic acid/Ascorbic
acid/Codeine phosphate
Dolmen /00685301/ contains Tenoxicam
Dolmen /01363802/ contains Dexketoprofen trometamol
Non-unique names cont.
2. Are the drugs available in different countries?
Unfortunately the B format does not contain information about the
country where the product is marketed. In the C format it is
possible to find that
Dolmen /00649901/ is available in Spain
Dolmen /00685301/ is available in Italy and the Czech Republic
Dolmen /01363802/ is available in Estonia, Latvia and Lithuania
This information may be useful – if you know which country the
case report comes from.
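Steps 1 and 2 can be combined in a small sketch that narrows the Dolmen candidates by the country of the case report. The ingredients and countries are those listed above. The data structure and function name are hypothetical and do not represent the C format file layout.

```python
from typing import Optional

DOLMEN_ENTRIES = {
    "Dolmen /00649901/": {
        "ingredients": ["Acetylsalicylic acid", "Ascorbic acid",
                        "Codeine phosphate"],
        "countries": {"Spain"},
    },
    "Dolmen /00685301/": {
        "ingredients": ["Tenoxicam"],
        "countries": {"Italy", "Czech Republic"},
    },
    "Dolmen /01363802/": {
        "ingredients": ["Dexketoprofen trometamol"],
        "countries": {"Estonia", "Latvia", "Lithuania"},
    },
}


def candidates(verbatim: str, country: Optional[str] = None) -> list[str]:
    """List entries whose name matches, optionally limited to one country."""
    wanted = verbatim.strip().lower()
    hits = [name for name in DOLMEN_ENTRIES
            if name.split(" /")[0].lower() == wanted]
    if country is not None:
        hits = [name for name in hits
                if country in DOLMEN_ENTRIES[name]["countries"]]
    return hits
```

Without a country all three entries are returned and the coder has to investigate further. With a known country, for example Italy, only one candidate remains.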
Non-unique names cont.
3. Are the drugs available in different pharmaceutical forms?
Sometimes the same drug name is associated with different
ingredients even within a country.
If the difference between the two entries is the pharmaceutical form
this can be found in the Pharmaceutical Form table in the C format.
Multi ingredient products
For single substance products the preferred name (seq 1 = 01 and
seq 2 = 001) is straightforward – it is the same as the substance
name.
The multi-ingredient products also have preferred name entries, but
the names are generally not the substance names. The general
principle is that the preferred name of a multi ingredient product is
the name of the first entered product with that unique combination
of substances. It is not possible to use the same solution as for
single substance products since the multi ingredient products often
have a large number of substances and the name field only allows
names to be up to 45 characters long. The UMC has a more
straightforward solution to this problem in the pipeline.
The medicinal product shown in the picture is not included in any of
the WHO Drug Dictionaries.
Coding to Preferred name
If the verbatim says something like 'paracetamol and codeine' the
dictionary gives no direct hit using autoencoders.
The coder can identify which drug record number corresponds to the
unique combination of the two substances.
If your dictionary management system has been set up to allow
substance querying it may be possible to find the preferred name of
the two substances. Notice that there may exist preferred name
entries that contain the two substances in combination with other
substances, so the query needs to be limited.
The preferred name (sequence number 1 and 2 = 01 001) for this
combination is the trade name Panadeine Co.
The disadvantage of this solution is that it will appear as if the
patient has taken the product Panadeine Co when the coder actually
does not know the trade name of the product.
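The need to limit the substance query can be shown with a small sketch that compares an exact match on the set of ingredients with a "contains" query. Panadeine Co is taken from the text above. The triple combination is invented for the illustration, salts and esters are ignored for simplicity, and the function names are hypothetical.

```python
# Preferred name -> set of active ingredients (simplified).
PREFERRED_NAMES = {
    "Paracetamol": {"paracetamol"},
    "Panadeine Co": {"paracetamol", "codeine"},
    # Invented entry, used only to show why the query must be limited.
    "Hypothetical Triple": {"paracetamol", "codeine", "caffeine"},
}


def exact_combination(substances):
    """Preferred names whose ingredients are exactly the given substances."""
    wanted = {s.lower() for s in substances}
    return [name for name, ingr in PREFERRED_NAMES.items() if ingr == wanted]


def containing(substances):
    """Preferred names that contain the given substances, possibly with others."""
    wanted = {s.lower() for s in substances}
    return [name for name, ingr in PREFERRED_NAMES.items() if wanted <= ingr]


limited = exact_combination(["Paracetamol", "Codeine"])
unlimited = containing(["Paracetamol", "Codeine"])
```

The unlimited query also returns the combination with a third substance, which would be the wrong preferred name for the verbatim 'paracetamol and codeine'.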
Coding to Generic entry
The dictionaries contain a number of multi-ingredient product
entries whose name field contains the names of all active
ingredients.
In the B formats these can be identified by the designation field –
where X stands for generic multi-ingredient product.
In the C format these can be identified by the Generic field – which
is set to Y.
The reason why not all multi ingredient products have a generic
entry is the limitations of the Name field – 45 characters.
Example:
The combination of Captopril and hydrochlorothiazide has the
preferred name entry Capozide – which was the first entry to be
entered into the dictionary with this unique combination of
ingredients. This entry has the Generic flag in the C format set to
No, and the designation in the B formats is M which stands for non-generic multi-ingredient product.
An additional entry has been made with the text Captopril
w/hydrochlorothiazide and the Generic flag set to Yes, and
designation X.
It is possible to code a 'generic verbatim text' to this entry. Please
note that this is an exception rather than a rule. Most drug record
numbers for multi ingredient products do not have a generic entry.
Coding to more than one preferred name
An alternative is to code the case report to each substance
individually. This is possible for the combinations where all
ingredients also appear as individual preferred names.
The disadvantage of this method is that the case report will give the
impression that the patient has taken a large number of drugs. Each
of the individual ingredients will have a number of ATC codes – so
analysis may be difficult. The ATC codes and the therapeutic use of
the individual substances may not be the same as the combined
products.
Subtraction or addition of information
The coder should not add information to, or subtract information from, a verbatim term.
Example 1:
If 'Paracetamol C' is reported the coder should not assume that it is
Paracetamol with Codeine. The coder should request follow-up
information.
Example 2:
If 'Paracetamol and Ibuprofen' is reported the coder should not
select just Paracetamol or just Ibuprofen. Both Paracetamol and
Ibuprofen should be chosen. Depending on the system for coding
terms it might be necessary for the coder to request the verbatim
term to be split.
Manufacturer
Terms in the WHO Drug Dictionary with the manufacturer in the drug name
should be used with caution. They should not be used unless
the verbatim term specifically states the manufacturer name.
The reason for this is that different manufacturers may have
different ingredients in their product.
Example:
If 'Paracetamol STADA' is reported, select Paracetamol STADA.
But if 'Paracetamol' is reported it is not appropriate to select
Paracetamol STADA.
Misspellings
If the reported drug in the verbatim term is misspelled and the
misspelling does not give any reason for confusion or doubt, the
coder should code the verbatim term to the correctly spelled name.
Example:
If 'Parecetamol' is reported, select Paracetamol.
If the misspelling causes confusion or is incomprehensible the coder
should request follow-up information.
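A coder working through an omission list may find it helpful to see near matches for a misspelled verbatim. The sketch below uses the `difflib` module from the Python standard library to suggest candidates. It is only an aid for manual coding under the rule above, not a replacement for the exact match required in auto-encoding. The term list, the cutoff value and the function name are assumptions.

```python
import difflib

# Tiny invented subset of dictionary names.
TERMS = ["Paracetamol", "Ibuprofen", "Codeine"]


def suggest(verbatim: str, cutoff: float = 0.85) -> list[str]:
    """Return dictionary names that closely resemble the verbatim."""
    return difflib.get_close_matches(verbatim, TERMS, n=3, cutoff=cutoff)


suggestions = suggest("Parecetamol")
```

The coder still decides whether a suggestion is free from confusion or doubt. An empty list, or several equally plausible suggestions, is a reason to request follow-up information.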
If a drug name is missing – how can I add it to the
dictionary?
The WHO Drug Dictionaries are kept up to date with newly launched
drugs and with modifications to existing drugs. However, some
types of drugs can appear as co-medication in clinical data before
they appear in the dictionaries. If you find a drug name missing in
the dictionaries – please report it.
The UMC provides a service called New Drug Request which makes it
possible for customers using the WHO Drug Dictionary Enhanced to
propose new entries to the dictionary.
To submit a request please contact
[email protected]
Introduction
Each product entry in the dictionary has one or more ATC codes.
The ATC codes are assigned according to the ATC guidelines, which
state that each product should have a minimum of ATC
codes, preferably only one.
In the WHO Drug Dictionaries many entries are coded with all ATC
codes of their preferred name, or generic group. For most substances
this is not a major concern, but some substances can have very
different indications and thereby ATC codes. In most cases a trade
name is used for only one of these indications. The C format of the
dictionaries is to a large extent coded with only one ATC per
medicinal product entry. In the B formats the entries are coded with
all ATC codes of their active ingredient(s).
Users of the WHO Drug Dictionaries have developed different coding
principles with regard to the ATC codes. Some choose to use the
ATC for coding indication on a patient and case report level. Others
choose not to code the ATC on case level. This section will outline
the pros and cons of the two methods.
ATC coding
For the products that only have one ATC code this method is
straightforward – the ATC code is entered into the case report
database to give the indication for which the patient has taken the
drug. If the drug taken by the patient has more than one ATC code,
there must be a manual assessment of the indication of the drug
and selection of the ATC code that most closely reflects the
indication. This process needs to be manual since auto-encoders
normally do not pick up this information. The plain text attached to
the ATC code cannot be used directly by the auto-encoders since it
often depends on its context in the hierarchic structure. If the
coding is done manually, the coder needs to have an understanding of
the ATC and its structure.
When an ATC code has been chosen for each case report it can be
used for analysis. It is possible to group all case reports that
contain co-medication within a given ATC group, on any of the
levels of the hierarchic ATC classification.
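The grouping described above can be sketched in a few lines of Python. The sketch assumes the standard ATC code layout, where levels 1 to 5 are identified by the first 1, 3, 4, 5 and 7 characters of the code. The case reports and the field names `case_id` and `atc` are invented for illustration and do not come from any dictionary file.

```python
from collections import defaultdict

# Number of leading characters that identify each ATC level (1st to 5th).
ATC_LEVEL_LENGTH = {1: 1, 2: 3, 3: 4, 4: 5, 5: 7}


def atc_group(atc_code, level):
    """Return the ATC group of `atc_code` at the given hierarchy level."""
    length = ATC_LEVEL_LENGTH[level]
    if len(atc_code) < length:
        raise ValueError(f"{atc_code!r} is not coded down to level {level}")
    return atc_code[:length]


def group_cases_by_atc(cases, level):
    """Map each ATC group at `level` to the case IDs coded within it."""
    groups = defaultdict(list)
    for case in cases:
        groups[atc_group(case["atc"], level)].append(case["case_id"])
    return dict(groups)


# Hypothetical case reports, each with the single ATC code chosen by the coder.
cases = [
    {"case_id": "C1", "atc": "N02BE01"},
    {"case_id": "C2", "atc": "N02BA01"},
    {"case_id": "C3", "atc": "D06BB03"},
]
by_third_level = group_cases_by_atc(cases, 3)
```

Changing the `level` argument moves the same case reports up or down the hierarchy without recoding them.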
ATC coding cont.
The main disadvantage of this solution is that important signals may
be missed. If a substance has three ATC codes and the case reports
are evenly distributed between the three, none of the three ATC
code groups may be noticeably large.
Another disadvantage of coding ATC in case reports is that the ATC
classification is revised on a yearly basis. The revisions are mainly
additions and the number of changes is kept to a minimum. The
companies that choose to code ATC at case level must however be
aware that a chosen code might be changed. This might lead to
difficulties if the case is used in analysis after the code has been
discontinued unless the codes are kept up-to-date with version
management of the ATC codes.
The ATC codes do not reflect all possible uses of a drug. The
indication for which the drug has been taken in a case report may
not be included as an ATC code, so the coders may have to make
assumptions, which increases the risk of errors.
No ATC coding
The coding of ATC codes on case report level is time-consuming.
Many companies do not find it cost efficient so they choose only to
enter the Drug Code and/or the Medicinal Product ID into the case
report. From these codes it is possible to derive the ATC codes for
the product.
The disadvantage of not coding the ATC code is that it makes
analysis and aggregation of statistics less specific. If a plain listing
of cases per co-medication is produced, some case reports will
appear under more than one ATC group. E.g. if a product has three
ATC codes the case report will be listed under each of these – even
though the patient only has taken one drug.
This can be a disadvantage if the staff analysing the statistics are
not aware of this, but on the other hand it might be more likely that
a signal is discovered if the cases appear under several ATC codes.
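The effect on a plain listing can be illustrated with a small sketch. The lookup table `ATC_BY_DRUG_CODE`, the Drug Code `99999901001` and the case identifiers are hypothetical. In a real system the ATC codes would be derived from the dictionary files.

```python
from collections import Counter

# Hypothetical lookup: Drug Code -> all ATC codes of the product.
ATC_BY_DRUG_CODE = {
    "99999901001": ["D06BB03", "J05AB01", "S01AD03"],
}


def listing_per_atc(case_drug_codes):
    """Count case reports per derived ATC code.

    A case is counted once under every ATC code of its drug, so the
    counts can add up to more than the number of case reports.
    """
    counts = Counter()
    for drug_code in case_drug_codes.values():
        for atc in ATC_BY_DRUG_CODE.get(drug_code, []):
            counts[atc] += 1
    return counts


# Two hypothetical case reports that both name the same product.
cases = {"C1": "99999901001", "C2": "99999901001"}
counts = listing_per_atc(cases)
total_listed = sum(counts.values())  # larger than len(cases)
```

Staff reading such a listing need to know that the total across ATC groups is not the number of patients.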
When a patient gets an ADR or an interaction with a co-medication
the indication is not of major importance. It is the substance’s
presence in the body that causes the effect – not the indication for
which the patient has taken the drug. The different indications may
indicate a route of administration, a specific dosage regimen or a
special type of patient (with an underlying disease) that makes the
patient more likely to react in a special way. Specific dosage
regimens can be coded in a specific field in the case report, and
route of administration and medical history can be coded in other
fields in the case report database.
Preface
To make sure that the dictionaries contain the most up-to-date
information, drugs are continuously entered into the dictionaries.
The dictionaries are kept up to date by adding new entries, making
modifications to existing entries and deleting redundant entries.
This section describes the different types of changes that may affect
existing data.
Changes during the year
The WHO Drug Dictionary, WHO Drug Dictionary Enhanced and the
WHO Herbal Dictionary are updated on a quarterly basis. Each
quarter a large number of drugs are added to the dictionaries, but
once per year (in the March release) changes are also made to
existing entries.
Together with the dictionary files there are files that describe all
inserts, changes and deletes.
Principles for changes
To avoid constant changes to the dictionary data, changes to entries
will only be made once per year, in the March 1 release. In the
remaining releases only additions will be made.
A modification of any of the following pieces of information is
considered a major change and will lead to a new entry with a new
Medicinal Product ID:
Medicinal Product Name
Name Specifier
Drug Code (drecno + Seq 1 + Seq 2)
Marketing Authorisation Holder
Country
Pharmaceutical form (available as Sequence Number 3 and Pharmaceutical Form in the Pharmaceutical Product table)
Strength (available as Sequence Number 4 and quantity and unit of active ingredient/s in the Ingredient table)
There are a few exceptions to this principle. One example of an
exception is that a misspelled name can be corrected. If the new
correct name doesn't already exist in the dictionary the entry can be
changed without any modifications of the Drug Code or the
Medicinal Product ID. The name change will be listed in the Changes
files. If the new name already exists the old (misspelled) entry will
be deleted and a reference will be made to the correct entry. Please
note that ATC codes can be added to the drug entries during the
year.
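The principle above can be expressed as a simple comparison of two versions of an entry. This is only a sketch: the `Entry` class, its field names and the sample values are hypothetical, and the spelling-correction exception described above is not modelled.

```python
from dataclasses import dataclass, replace

# Fields whose modification counts as a major change (from the list above).
MAJOR_FIELDS = (
    "name", "name_specifier", "drug_code",
    "marketing_authorisation_holder", "country",
    "pharmaceutical_form", "strength",
)


@dataclass(frozen=True)
class Entry:
    name: str
    name_specifier: str
    drug_code: str
    marketing_authorisation_holder: str
    country: str
    pharmaceutical_form: str
    strength: str
    atc_codes: tuple = ()  # not a major field: ATC codes can be added during the year


def major_changes(old, new):
    """Return the names of the major fields that differ between two entries."""
    return [f for f in MAJOR_FIELDS if getattr(old, f) != getattr(new, f)]


def needs_new_medicinal_product_id(old, new):
    """True if the difference between the entries is a major change."""
    return bool(major_changes(old, new))


# Hypothetical entry and two modified copies.
original = Entry("Exampledrug", "", "99999901001", "Example Pharma",
                 "SWE", "Tablet", "500 mg")
new_strength = replace(original, strength="1 g")
new_atc = replace(original, atc_codes=("N02BE01",))
```

In this sketch a changed strength is a major change, while an added ATC code is not.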
Reasons for changes
The data in the dictionaries is changed for the following reasons:
a) It is discovered that a drug contains incorrect information
b) More up-to-date information is available
c) A drug changes its composition
d) The ATC classification has changed
Special files are produced together with the WHO Drug Dictionaries
that describe the changes that have been made to the dictionary.
The files help users identify the changes and make the appropriate
actions.
a) It is discovered that a drug contains incorrect information
The data in the dictionary is reviewed in order to identify potential errors.
Errors may also be identified when new data is entered, from IMS
Health or other sources, and they are also reported by users.
All corrections of errors are done using the same source as the original or in
some cases sources from the same year as the drug was entered. This is
done in order to get information about what the drug contained at the time it
was entered. If it is discovered that the composition of the drug entry is
incorrect it will be changed.
The new composition will lead to a new Drug Code. All changed Drug Codes
are listed in the Changes files. If the composition is correct according to the
original source, but has changed after the year the entry was made, it will
be treated as a change of composition.
The changes are done with the same quality control as the entry of data,
and a reliable source must be used.
b) More up-to-date information is available
Sometimes provisional data is used and replaced when official
information is available. For instance, if a new substance is added to
the system, the official CAS number cannot always be used. If the
substance has not been given an official CAS number yet, the UMC
will generate a provisional, unofficial code. This code will be
replaced with an official code when assigned by CAS.
c) A drug changes its composition
In some coding situations it is useful to know if a product is on the
market or not – for example when a verbatim name appears in the
dictionary in different compositions. In these 'non-unique name'
situations the medicinal product may have changed its composition
– and the dictionary contains both the old and the new formula.
From 2010 a simplified way to identify old form entries will help you
see if a drug entry is flagged as old form – and if there are
differences in different countries. The information is released as a
data file and is also available in the WHO Drug Dictionary Browser
(if you have access to the Comparison screen in the Browser).
The new table is distributed together with the dictionary files and
identifies Drug Codes that are flagged as old form in the data file
OldForm DrugCode List.txt. This is available in the same folder as
the B format in your downloaded data.
Whether a product is on the market is country-specific. If you have
set up your system to display country, form, strength etc. (C
format), you will see which products are flagged as old form and
which are not. If you have set up your system to only see product
names (B format), it is a little more complicated. The Drug Codes in
the OldForm DrugCode List.txt file therefore have two types of flags
added to the country codes:
A = Old Form in all countries
M = Old Form in some countries (these countries are listed in the file)
Notice that the entries that are flagged as old form remain in the
dictionary for a reason. If you have an old clinical trial – or an old
case report – it is important that they point to the products as they
existed at the time.
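The meaning of the two flags can be sketched as a small helper. The layout of the old form file is not described here, so the function works on values that have already been read from it. The function name, its parameters and the country labels are assumptions made for illustration.

```python
def is_old_form(flag, listed_countries, country):
    """Interpret an old-form flag for one country.

    flag: "A" (old form in all countries) or "M" (old form in some).
    listed_countries: the countries listed for an M-flagged Drug Code.
    country: the country of the case report or trial being checked.
    """
    if flag == "A":
        return True
    if flag == "M":
        return country in listed_countries
    raise ValueError(f"unknown old-form flag: {flag!r}")


# Hypothetical values: an M-flagged Drug Code listed for the UK only.
old_in_uk = is_old_form("M", {"UK"}, "UK")
old_in_austria = is_old_form("M", {"UK"}, "Austria")
```

An A-flagged Drug Code returns True for every country, so no country list is needed for it.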
c) A drug changes its composition - Old form example 1
The verbatim Topisolon is available with 2 different compositions in
the dictionary:
Desoximetasone
Desoximetasone/Salicylic acid
In the C format the Topisolon - Desoximetasone/Salicylic acid
products are flagged as 'Old form' in the Name Specifier field.
In the additional oldform_drugcode_list.txt file, Topisolon -
Desoximetasone/Salicylic acid has an A-flag added to the German
country code.
A = Old Form in all countries
c) A drug changes its composition - Old form example 2
The verbatim Bradosol is available with 3 different compositions in
the dictionary:
Benzalkonium chloride
Domiphen bromide
Hexylresorcinol
In the C format the Bradosol - Domiphen bromide products are
flagged as 'Old form' in the Name Specifier field.
In the additional oldform_drugcode_list.txt file, Bradosol -
Domiphen bromide has an M-flag added to the UK country code.
The Bradosol with the same ingredients that is still on the market in
Austria is not listed in the file.
M = Old Form in some countries (these countries are listed in
the file)
d) The ATC classification has changed
The ATC classification is revised on a yearly basis, as part of
the March 1 release. The revision mainly consists of
additions, but sometimes groups are moved or split. For
example, in 2008 three new 4th level ATC codes were assigned
in ATC group L04A:
L04AB Tumor necrosis factor alpha (TNF-α) inhibitors, L04AC
Interleukin inhibitors and L04AD Calcineurin inhibitors.
Substances of these types were previously classified in group
L04AA Selective immunosuppressants. They were moved to the
appropriate new ATC code, and all affected products in the
dictionaries were reclassified accordingly.
If you code ATC as well as Drug Code – has the yearly ATC
revision affected any of the codes you have selected?
Information about the ATC revision is included in the changes
files.
A unique product name has become non-unique
Have any of the product names you have coded become non-unique?
Use the DD Changed DrugName.txt in the B & C-format
changes files. Identify entries where /code/ has been added to the
name. In the March 1st release version of the Changed
DrugName.txt both new non-unique products and products which
have become unique are included.
The example below shows the product name Sevikar, which became
non-unique in the dictionary of September 2009 when a new product
with the same name but with different ingredients was added
(Sevikar 06230801006).
The file 'Changes DrugName.txt' in September 2009 shows that
/code/ has been added to the 'old' Sevikar, meaning that it is
non-unique:
06235401001SEVIKAR
/06235401/SEVIKAR
Find the 'new' Sevikar. Use the DD_ins.txt (both the new and old
entry can be found in the dd.txt file):
062308010065M09 237UNS 02 093SEVIKAR /06230801/
Check if any of the 'Sevikar' drug codes have been used in
your clinical data
Decision: should the code selection be revised?
Non-unique product names
Has any additional alternative been added to an already non-unique
product name? The example below shows the product
name Crampex, which was already non-unique and gained a further
alternative in the dictionary in September 2009 when a new
product with the same name but with different ingredients was
added (Crampex 02595401001).
Use the DD_ins.txt file. Identify inserts with /code/ that do not
have corresponding entries (same name minus /code/) in the DD
Changed DrugName.txt.
Example:
025954010010M05SCH UNS 08 053CRAMPEX /02595401/
Compare with corresponding 'old' entries in the DD.txt file:
018265010019M05 237UNS 03 051CRAMPEX /01826501/
005142010015M77 19UNS 04 044CRAMPEX /00514201/
Check if any of the 'old' entries have been used in your clinical data.
Decision: should the code selection be revised?
Cumulative Changes
The Cumulative changes table makes it possible to trace all Drug
Codes that have been discontinued or reclassified since 2004, and
to identify the replacement Drug Code. The table can help users
upgrade from old dictionary releases and it can also be useful in
versioning of coded data.
The first Drug Record Number, Sequence Number 1 and Sequence
Number 2 give the discontinued Drug Code.
Year and Quarter identify when the Drug Code was discontinued.
The second Drug Record Number, Sequence Number 1 and
Sequence Number 2 give the Drug Code to which a reference was
made.
Example: 0232950100209401343101001
This shows how the mapping of drug code changes is presented in
the Cumulative Changes table. The example refers to the product
Nervan (see browser screenshots below) which is a non-unique
product name. The Drug Code 02329501002 was last used in the
December 1 release 2009 and was replaced by the code
01343101001 in the March 1 release 2010. This replacement Drug Code
may also have been discontinued at a later stage, so please make
sure that you replace the discontinued code with the latest
replacement code.
Notice that the Cumulative changes table only dates back to 2004.
If you identify discontinued Drug Codes that you cannot find in the
Cumulative changes table, please submit them to the UMC and we
will trace them for you using other tools.
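A record such as the example above can be split into its parts, and a chain of replacements can be followed to the latest code, roughly as sketched below. The field widths (11 characters per Drug Code, 2 for the year and 1 for the quarter) are inferred from the single example record and are an assumption, not a file specification. The names `CumulativeChange`, `parse_cumulative_change` and `latest_replacement` are illustrative.

```python
from typing import NamedTuple


class CumulativeChange(NamedTuple):
    old_code: str   # discontinued Drug Code
    year: str       # two-digit year of discontinuation
    quarter: str    # quarter of discontinuation
    new_code: str   # Drug Code to which a reference was made


def parse_cumulative_change(record):
    """Split one 25-digit record into its four parts (assumed widths)."""
    record = record.strip()
    if len(record) != 25 or not record.isdigit():
        raise ValueError(f"unexpected record: {record!r}")
    return CumulativeChange(record[0:11], record[11:13],
                            record[13:14], record[14:25])


def latest_replacement(code, replacements):
    """Follow old -> new references until a code with no replacement is reached."""
    seen = {code}
    while code in replacements:
        code = replacements[code]
        if code in seen:
            raise ValueError(f"circular reference at {code}")
        seen.add(code)
    return code


change = parse_cumulative_change("0232950100209401343101001")
replacements = {change.old_code: change.new_code}
current = latest_replacement("02329501002", replacements)
```

With the full table loaded into `replacements`, the same call returns the most recent code even if a replacement has itself been replaced later.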
Introduction to Chapter 9
The main focus in this chapter is:
What goes into a database decides what is possible to extract out of
it.
Why is it important to be aware of this when coding concomitant
medication?
Learning objectives:
understand how the WHO Drug Dictionaries can be used for analysis
have a basic understanding of how to use the ATC classification when
analyzing clinical data and identifying protocol violations.
Preface
The WHO Drug Dictionaries are used for coding clinical data, but the
data is coded in order to enable analysis and communication. Good
quality coding and a correct set-up of your databases are
prerequisites for useful analysis.
The C and the B format
The C format contains more information than the B format, e.g.
pharmaceutical form and strength which allows for more detailed
analyses.
The UMC has put more focus on populating the pharmaceutical form
information than the strength information for two reasons:

The pharmaceutical form information, if collected by the
primary coders, is relevant to the analysis of clinical data. The
types of reaction may vary depending on the type of
administration, e.g. local versus systemic effects, and there
could be different types of reactions to a sustained release
tablet compared to a regular tablet. Sometimes adverse
reactions have been explained by inadequate pharmaceutical
forms, e.g. an esophageal ulcer caused by capsules that were not
swallowed properly.

Sometimes the same trade name is available in different
pharmaceutical forms, with different ingredients, for example
the suppository could contain additional ingredients, or
different salts of the substance.
The C format also has some benefits over the B format when it
comes to codes and IDs and ATC coding in connection to analysis.
Read more about this on the following pages.
The C and the B format - ATC coding
Both the B format and the C format contain the ATC classification.
In the B format all products are coded with the same ATC codes as
the preferred name. For example, products with the
ingredient Aciclovir will be coded with the following ATC codes:
In the C format a specific product is coded with the ATC code that
reflects the most common use of the product, e.g. an Aciclovir
product used mainly topically would be coded with the D06BB code.
As previously discussed in the ATC coding section, there are pros
and cons to coding ATC terms as indications.
If you choose to analyze the data using all possible ATC codes of a
Drug Record Number (active ingredients) you can always find them
by identifying the corresponding Preferred Name.
Medicinal Product ID and Drug Code
The Medicinal Product ID and the Drug Code are two ways of
representing drug information in the case report data. The structure
of the two systems is described in Chapter 3.
The coding system built on the Drug Code (the only system used in
the B format) describes the active ingredient(s), the salt/ester and
the product name. Its intrinsic information makes the code very
useful for analysis.
The C format contains both the Drug Code system and also the
Medicinal Product ID system. The Medicinal Product ID does not
have any intrinsic information.
The next two pages will describe the possibilities for analysis with
the Drug Code, the Medicinal Product ID and the combined use of
the two systems.
Drug Code - Analysis and querying
Using the Drug Code facilitates querying, since the case report
itself already carries important information about the active
ingredients.
Drug Record Number
In most cases the analysis and querying will be based on the Drug
Record Number, the first six characters in the Drug Code. The
Drug Record Number can be used to identify all products with the
same active ingredient(s).
Sequence number 1
Sequence number 1 identifies the salt or ester. Sometimes further
investigation may be necessary to identify the effect of a certain
salt in the coded data.
This is especially important when analyzing herbal products. In the
WHO Herbal Dictionary, Sequence 1 represents the part of the
plant or the extraction type.
Sequence number 2
Sequence 2 identifies a specific drug name: the name selected
during coding as the one that most closely represents the verbatim
text. This field is mainly used when producing reports or when
making case-by-case analyses.
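The three parts of the Drug Code can be separated with simple string slicing. The sketch assumes an 11-digit code made up of a 6-character Drug Record Number, a 2-character Sequence number 1 and a 3-character Sequence number 2, which matches the Drug Codes shown earlier in this material. The names `DrugCode`, `split_drug_code` and `same_active_ingredients` are illustrative.

```python
from typing import NamedTuple


class DrugCode(NamedTuple):
    drug_record_number: str  # active ingredient(s)
    seq1: str                # salt/ester (plant part or extraction type for herbals)
    seq2: str                # drug name selected during coding


def split_drug_code(code):
    """Split an 11-digit Drug Code into its three parts."""
    if len(code) != 11 or not code.isdigit():
        raise ValueError(f"unexpected Drug Code: {code!r}")
    return DrugCode(code[:6], code[6:8], code[8:11])


def same_active_ingredients(code_a, code_b):
    """True if both codes share the same Drug Record Number."""
    return (split_drug_code(code_a).drug_record_number
            == split_drug_code(code_b).drug_record_number)


# The two Sevikar codes from the non-unique name example.
new_sevikar = split_drug_code("06230801006")
sevikar_match = same_active_ingredients("06230801006", "06235401001")
```

The two Sevikar codes have different Drug Record Numbers, which reflects that the products share a name but not their ingredients.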
Medicinal Product ID - Analysis and querying
When making advanced querying or analysis of large data sets the
use of the Medicinal Product ID may make queries complex. If the
analysis is made on a certain substance the query must first identify
all Medicinal Product IDs of the products that contain the substance
and then query the case report database for each of these Medicinal
Product IDs individually.
If the ID is indexed correctly in the case report database there
should not be any performance problem.
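The two-step query described above can be sketched with an in-memory SQLite database. The table names `product_ingredient` and `case_drug`, their columns and the sample rows are invented for illustration. The second step uses a single IN query instead of one query per ID, which is one possible simplification.

```python
import sqlite3


def cases_for_substance(conn, substance):
    """Return the case IDs whose coded product contains `substance`."""
    # Step 1: find all Medicinal Product IDs that contain the substance.
    mpids = [row[0] for row in conn.execute(
        "SELECT mpid FROM product_ingredient WHERE substance = ?",
        (substance,))]
    if not mpids:
        return []
    # Step 2: find the case reports coded with any of those IDs.
    placeholders = ",".join("?" * len(mpids))
    rows = conn.execute(
        "SELECT DISTINCT case_id FROM case_drug "
        f"WHERE mpid IN ({placeholders}) ORDER BY case_id",
        mpids)
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_ingredient (mpid INTEGER, substance TEXT);
CREATE TABLE case_drug (case_id TEXT, mpid INTEGER);
-- An index on the ID column keeps the second step fast.
CREATE INDEX idx_case_drug_mpid ON case_drug (mpid);
INSERT INTO product_ingredient VALUES
    (1, 'paracetamol'), (2, 'paracetamol'), (3, 'aciclovir');
INSERT INTO case_drug VALUES ('C1', 1), ('C2', 2), ('C3', 3);
""")
paracetamol_cases = cases_for_substance(conn, "paracetamol")
```

With the Drug Code stored in the case report, the same question could be answered by filtering on the Drug Record Number alone.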
Combined use of Medicinal Product ID and Drug Code
The combined use of the two coding systems in case report data
may give the advantages of both. It makes it possible to get direct
access to the complete set of data used at the time of coding, but it
also makes it possible to query and analyze the data in a
straightforward way.


Analysis of individual case reports or production of reports can
be based on the Medicinal Product ID.
Analysis of large sets of data can be based on the Drug Code,
as described above.