A Coding History for Clinical Data
Phyllis Wolf, Covance, Princeton, NJ
ABSTRACT
The need to track the coding of clinical data (adverse events,
medications) inspired the development of a coding history system.
Because the coding element of the data review system does not
maintain an audit trail, it has proven a tedious task for data managers
to keep track of when raw clinical terms are coded and when an
equivalence (synonym) is used.
The system consists of two macros and a calling program.
The first macro maintains a permanent SAS data set that contains,
among other variables, the raw clinical term (the verbatim), the
dictionary term to which it is mapped, and the date that term first
appeared in the raw data. The system also keeps track of whether the
verbatim is in the coding dictionary or if the term had to be added
through an equivalence dictionary. It maintains this history even if the
verbatim is later deleted from the raw data.
The second macro offers several ways to view the coding history
data set. A user can view only those terms coded after a certain date,
or only those terms that were equivalenced, or only those terms still in
the raw data, or any combination of those options.
INTRODUCTION
The data collected on Case Report Forms (CRFs) include, among
other things, concomitant medications, i.e., any other drugs a patient
may be taking while participating in a clinical trial. In an effort to
standardize the reporting of this information, the data is coded. The
verbatim is matched with entries in a dictionary, for example, the WHO
Drug Dictionary, and those terms are then added to the raw data. If a
drug term as it is recorded on the CRF is not in the dictionary, the Data
Manager will form an equivalence by mapping it to a term in the
dictionary. We have found it useful to keep a separate data set of
those equivalenced terms. Often we need to track when certain terms
first appear in the raw data, how they have been mapped, and how the
mapping may have changed over the course of the trial. This paper
will highlight some of the techniques used to report on the coding
status.
DIRECT HIT
The coding history system consists of two macros. One is used to set
up and maintain a permanent SAS data set and one to display it.
In Covance’s data review system, the fields for the coded terms are
added to the raw data when the data is brought over from the
relational database into the SAS environment. While the data is still in
the relational database, the Data Manager runs coding, establishing a
row-by-row mapping of verbatim to dictionary term. If the verbatim isn’t in
the dictionary, the coding system looks for an equivalence.
Coding creates a separate table containing the row identifier from the
raw data and all the corresponding dictionary terms. For example, if
the verbatim on the CRF is ASPIRIN, coding would look for, and not
find, ASPIRIN, in the dictionary. Instead an equivalence would be set
by the Data Manager to map ASPIRIN to ACETYLSALICYLIC ACID.
Then, the descriptors of ACETYLSALICYLIC ACID, i.e., all the codes
and decodes which are a part of the dictionary, are added.
Until the first time coding is run in the data review system, the
dictionary terms are not part of the raw data structure as seen in the
SAS environment. To avoid attempting to process fields that do not
exist, the macro first checks whether coding has been set up by
looking for one of the dictionary terms in the contents of the data
set. PROC SQL offers an extremely useful feature for this: it
automatically creates the macro variable &SQLOBS, which
contains the number of observations returned by the last select.
proc sql noprint;
   /* Look for the coded field PREFTERM in RAW.CONMED */
   select name
      from dictionary.columns
      where libname = "RAW"
        and memname = "CONMED"
        and name = "PREFTERM";
quit;
%if &sqlobs %then %do;
   .
   .
%end;
%else %do;
   /* Writes WARNING: CODING NOT SET UP YET to the log */
   data _null_;
      put "WAR" "NING: CODING NOT SET UP YET";
   run;
%end;
Here, the system looks in DICTIONARY.COLUMNS for the
field PREFTERM in the RAW.CONMED data set. If it finds
the field, &SQLOBS is assigned the value 1 (synonymous
with TRUE); if not, it is 0 (synonymous with FALSE).
KEEPING TRACK
Now that we’ve established that the data have been coded,
we can track when terms first appear in the data, to what
dictionary terms they are mapped, and whether they had to
be equivalenced. To do that we look at the raw data (with
the coded variables attached), and the list of equivalences.
We first pare down the raw data to one occurrence of each
distinct combination of verbatim term, coded term, and any
other distinguishing variables. In our example, we will
include the primary ATC code (a therapeutic code). It has
proven important to include all those terms in the selection,
not only because the mappings have changed but also
because clients may request that a different ATC code be
used.
It has proven useful to include a flag which indicates
whether the verbatim term is still in the raw data, as there
have been times when terms are removed after a site
corrects CRF pages. Being able to flag what is still in the
raw data is helpful when it comes to reporting it later.
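The paring-down step can be sketched with PROC SQL. This is only an
illustration under assumptions: WORK.CURRENT and the flag IN_RAW are
hypothetical names, and CODE is assumed to be the variable on
RAW.CONMED that holds the primary ATC code.

```sas
/* Sketch only: WORK.CURRENT, IN_RAW, and CODE (as the ATC */
/* variable on RAW.CONMED) are assumed names.              */
proc sql;
   create table work.current as
   select distinct
          verbatim,
          prefterm,
          code,
          'Y' as in_raw length=1   /* present in the current raw data */
      from raw.conmed
      where verbatim is not missing
      order by verbatim, prefterm, code;
quit;
```

Every term in this extract is, by construction, still in the raw data. A
term that exists in the history data set but not in the extract is one
that has since been removed.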
Next, the pared down list is merged with the equivalence
terms data set. If the verbatim is in the equivalence list,
then a flag (for example, equiv='*') is created to report the
term as equivalenced.
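A DATA step merge is one way to set the flag. In this sketch,
WORK.CURRENT (the pared down list) and WORK.EQUIVS (the equivalence
terms) are hypothetical data set names, and both are assumed to carry
the variable VERBATIM.

```sas
/* Sketch only: data set names are assumptions, not the author's. */
proc sort data=work.current;
   by verbatim;
run;

/* One row per equivalenced verbatim */
proc sort data=work.equivs(keep=verbatim) out=work.eqlist nodupkey;
   by verbatim;
run;

data work.flagged;
   length equiv $1;
   merge work.current(in=in_cur)
         work.eqlist(in=in_eq);
   by verbatim;
   if in_cur;                   /* keep only terms found in the raw data */
   if in_eq then equiv = '*';   /* verbatim needed an equivalence */
   else equiv = ' ';
run;
```

Assigning EQUIV on every row, rather than only when a match is found,
keeps the flag from carrying over between rows that share a verbatim.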
The first time a verbatim appears in the list we have to
datestamp it. Rather than just assign the current system
date to the data, it was decided to take the date from the
SAS data set itself. Some more useful PROC SQL:
proc sql noprint;
   select modate into :stamp
      from dictionary.tables
      where libname = "RAW"
        and memname = "CONMED";
quit;
This creates a macro variable called &STAMP, which
contains the date the RAW.CONMED data set was last
modified, taken from DICTIONARY.TABLES and formatted
as a datetime. Then, if the variable stamp is missing for a
record, the record must be new, so assign &stamp to it:
if stamp eq . then stamp = input("&stamp", datetime16.);
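To show where that assignment might sit, here is a minimal sketch of
the history update. HIST.CODEHIST (the permanent history data set),
WORK.FLAGGED (the current list of terms), the flag IN_RAW, and the BY
variable CODE are all hypothetical names. The sketch assumes that
HIST.CODEHIST already exists, that both data sets are sorted by the BY
variables, and that &stamp was created by the PROC SQL step shown
earlier.

```sas
/* Sketch only: HIST.CODEHIST, WORK.FLAGGED, IN_RAW, and CODE */
/* are assumed names.                                         */
data hist.codehist;
   merge hist.codehist(in=in_hist)
         work.flagged(in=in_now);
   by verbatim prefterm code;

   /* History rows with no current match were deleted from the raw data */
   if in_now then in_raw = 'Y';
   else in_raw = 'N';

   /* STAMP exists only on the history side, so it is missing for new rows */
   if stamp eq . then stamp = input("&stamp", datetime16.);
   format stamp datetime16.;
run;
```

Because rows that appear only in the history data set are kept and
merely re-flagged, the history survives the deletion of a verbatim from
the raw data.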
WHAT IT LOOKS LIKE
Now it is time to look at the data. The coding history system had to be
easy enough for someone who is not too familiar with SAS to use, and
it had to produce good-looking output easily. PROC REPORT
immediately came to mind. Here is where all the different flags come
in. There are flags to indicate what type of coding it is (drug, adverse
event…), whether the data is still in the raw data, and whether it is an
equivalenced term, and, of course, there is the datestamp. If a user
wants to see only those drug terms that have been equivalenced since
December 21, 1999, the macro call might look like this:
%codeprnt(type=DRUG, only_eq=YES, since=21DEC1999)
Now, WHERE and/or subsetting IF clauses can be used:
%if &type ne %then %do;
   where type = upcase("&type");
%end;
%if %upcase(&only_eq) ne NO %then %do;
   if equiv eq '*';
   title7 'Equivalenced Terms Only';
%end;
%else %do;
   footnote2 '* = Verbatim was equivalenced';
%end;
%if &since ne %then %do;
   if datepart(stamp) gt "&since"d;
%end;
In the above example, only the equivalenced terms are to be displayed,
so the flag indicating that a term was equivalenced is not needed.
However, when all the terms are displayed, it is useful to indicate which
terms are equivalenced. Put some conditionals around the EQUIV variable
in the column and define statements:
columns code prefterm verbatim
%if %upcase(&only_eq)=NO %then %do;
   equiv
%end;
date;
.
.
%if %upcase(&only_eq)=NO %then %do;
   define equiv / '' spacing=1;
%end;
An example of a report typically produced is shown at the end of this
paper.
CONCLUSION
Keeping track of the coding of clinical data is useful,
especially when it is necessary to show progress. Clients
and Medical Monitors want to see that coding is being done
accurately and that changes, when necessary, are
performed in a timely manner.
REFERENCE
SAS is a registered trademark of the SAS Institute, Inc.
CONTACT INFORMATION
Your comments and questions are valued and encouraged.
Contact the author at:
Phyllis K. Wolf
Covance
210 Carnegie Center
Princeton, NJ 08540
Work Phone: 609/ 452-4062
Fax: 609/520-1754
Email: [email protected]
Dummy Drug Co.
9999 9999 (CVD 9999)
09:57 Thursday, June 29, 2000
Coding History Report
Type: DRUG
Equivalenced Terms Only
Terms coded since 21DEC1999

Code   Verbatim                        Preferred Term             Date Coded
----------------------------------------------------------------------------
A01AB  FRAMYCETINSULFAT                NEOMYCIN                   17JAN2000
       GRANEODINE                      NEOMYCIN                   27APR2000

A01AC  DECADRON                        DEXAMETHASONE              28DEC1999
       DEXAMETASONA                    DEXAMETHASONE              17JAN2000
       DEXAMETASONE                    DEXAMETHASONE              27APR2000
       DEXOMETHASONE                   DEXAMETHASONE              01JAN2000
       DEXOMETHESONE                   DEXAMETHASONE              28DEC1999
       FORTECOTZIN                     DEXAMETHASONE              29MAR2000
       HYDROCORTISONE CREAM            HYDROCORTISONE             17MAR2000
       HYDROCORTISONE CREAM 2.5%       HYDROCORTISONE             04APR2000

A01AD  ACETYLSALICYLATE                ACETYLSALICYLIC ACID       17JAN2000
       ARTIFICIAL SALIVA               UNKNOWN MOUTH PREPARATION  29MAR2000
       ASA 1 TAB QD                    ACETYLSALICYLIC ACID       27APR2000
       ASPIRIN                         ACETYLSALICYLIC ACID       28DEC1999
       ASPIRIN (ACETYLSALICYLIC ACID)  ACETYLSALICYLIC ACID       06MAR2000
       BIODENT-WITH CALCIUM            UNKNOWN MOUTH PREPARATION  27APR2000
       COUGH DROPS                     UNKNOWN MOUTH PREPARATION  29MAR2000
       DISPIRIN                        ACETYLSALICYLIC ACID       17JAN2000
       EC ASA                          ACETYLSALICYLIC ACID       10JAN2000
       MUCOSITIS COCKTAIL              UNKNOWN MOUTH PREPARATION  29MAR2000
       ULTIMATE CLEANSE                UNKNOWN MOUTH PREPARATION  29MAR2000

A02AA  MILK OF MAGNESIA                MAGNESIUM HYDROXIDE        28DEC1999

A02AC  CALCIUM-CARBONATE               CALCIUM CARBONATE          28DEC1999
       ORCAL D3                        CALCIUM CARBONATE          17JAN2000
       OROCAL D3                       CALCIUM CARBONATE          17JAN2000
       TUMS ULTRA                      CALCIUM CARBONATE          01JAN2000

A02BA  BELOC MITE                      CIMETIDINE                 29MAR2000
       BELOCMITE                       CIMETIDINE                 29MAR2000

A02BC  ANAGASTRA                       PANTOPRAZOLE               27APR2000
       AUTRA                           OMEPRAZOLE                 15MAY2000
       OMEPRAZOL                       OMEPRAZOLE                 17JAN2000
       OMEPRAZOL (MORPAL)              OMEPRAZOLE                 28DEC1999

D = Deleted from raw data (one row in the A02BA/A02BC block carries the
D flag; this transcript does not preserve which)