A Coding History for Clinical Data
Phyllis Wolf, Covance, Princeton, NJ

ABSTRACT

The need to track the coding of clinical data (adverse events, medications) inspired the development of a coding history system. Because the coding element of the data review system does not maintain an audit trail, it has proven a tedious task for data managers to keep track of when raw clinical terms are coded and when an equivalence (synonym) is used.

The system consists of two macros and a calling program. The first macro maintains a permanent SAS data set that contains, among other variables, the raw clinical term (the verbatim), the dictionary term to which it is mapped, and the date that term first appeared in the raw data. The system also keeps track of whether the verbatim is in the coding dictionary or whether the term had to be added through an equivalence dictionary. It maintains this history even if the verbatim is later deleted from the raw data.

The second macro offers several ways of viewing the coding history data set. A user can view only those terms coded after a certain date, or only those terms that were equivalenced, or only those terms still in the raw data, or any combination of those options.

INTRODUCTION

The data collected on Case Report Forms (CRFs) include, among other things, concomitant medications, i.e., any other drugs a patient may be taking while participating in a clinical trial. In an effort to standardize the reporting of this information, the data is coded. The verbatim is matched with entries in a dictionary, for example, the WHO Drug Dictionary, and the matching dictionary terms are then added to the raw data. If a drug term as it is recorded on the CRF is not in the dictionary, the Data Manager will form an equivalence by mapping it to a term in the dictionary. We have found it useful to keep a separate data set of those equivalenced terms.
Often we need to track when certain terms first appear in the raw data, how they have been mapped, and how the mapping may have changed over the course of the trial. This paper will highlight some of the techniques used to report on the coding status.

DIRECT HIT

The coding history system consists of two macros: one sets up and maintains a permanent SAS data set, and the other displays it.

In Covance's data review system, the fields for the coded terms are added to the raw data when the data is brought over from the relational database into the SAS environment. While the data is still in the relational database, the Data Manager runs coding, establishing a row-by-row mapping of verbatim to dictionary term. If the verbatim isn't in the dictionary, the coding system looks for an equivalence. Coding creates a separate table containing the row identifier from the raw data and all the corresponding dictionary terms. For example, if the verbatim on the CRF is ASPIRIN, coding would look for, and not find, ASPIRIN in the dictionary. Instead, an equivalence would be set by the Data Manager to map ASPIRIN to ACETYLSALICYLIC ACID. Then the descriptors of ACETYLSALICYLIC ACID, i.e., all the codes and decodes which are a part of the dictionary, are added.

Until the first time coding is run in the data review system, the dictionary terms are not part of the raw data structure as seen in the SAS environment. To avoid attempting to process fields that do not exist, a check is needed to see whether coding has been set up: we look for one of the dictionary terms in the contents of the data set. An extremely useful feature of PROC SQL helps here. It automatically creates the macro variable &SQLOBS, which contains the number of observations returned by the last SELECT.

    /* Is PREFTERM a column of RAW.CONMED? */
    proc sql noprint;
      select name
        from dictionary.columns
        where libname = "RAW" and memname = "CONMED" and name = "PREFTERM";
    quit;

    %if &sqlobs %then %do;
      .
      .
    %end;
    %else %do;
      data _null_;
        put "WAR" "NING: CODING NOT SET UP YET";
      run;
    %end;

Here, the system looks in DICTIONARY.COLUMNS for the field PREFTERM in the RAW.CONMED data set. If it finds one observation, &SQLOBS is assigned the value 1 (synonymous with TRUE); if not, it is 0 (synonymous with FALSE).

KEEPING TRACK

Now that we've established that the data have been coded, we can track when terms first appear in the data, to what dictionary terms they are mapped, and whether they had to be equivalenced. To do that we look at the raw data (with the coded variables attached) and the list of equivalences.

We first pare down the raw data to one occurrence of each distinct combination of verbatim term, coded term, and any other distinguishing variables. In our example, we will include the primary ATC code (a therapeutic code). It has proven important to include all those terms in the selection, not only because the mappings have changed, but also because clients may request that a different ATC code be used. It has also proven useful to include a flag which indicates whether the verbatim term is still in the raw data, as there have been times when terms are removed after a site corrects CRF pages. Being able to flag what is still in the raw data is helpful when it comes to reporting it later.

Next, the pared-down list is merged with the equivalence terms data set. If the verbatim is in the equivalence list, then a flag (for example, equiv='*') is created to report the term as equivalenced.

The first time a verbatim appears in the list we have to datestamp it. Rather than just assign the current system date to the data, it was decided to take the date from the SAS data set itself. Some more useful PROC SQL:

    proc sql noprint;
      select modate into :stamp
        from dictionary.tables
        where libname = "RAW" and memname = "CONMED";
    quit;

This creates a macro variable called &STAMP, which contains the datetime-formatted modification date of the RAW.CONMED data set, taken from DICTIONARY.TABLES.
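
Before the datestamp is applied, the paring-down and equivalence-flagging steps described earlier in this section might be sketched as follows. This is only an illustration, not the code used in the actual system: RAW.CONMED and PREFTERM come from the text, while the equivalence data set WORK.EQUIV, the variables VERBATIM and ATC1, and the WORK output data sets are hypothetical names chosen for the example.

```sas
/* Sketch only. RAW.CONMED and PREFTERM come from the paper; WORK.EQUIV, */
/* VERBATIM, ATC1 and the WORK output data sets are assumed names.       */

/* One row per distinct verbatim / preferred term / primary ATC code. */
proc sort data=raw.conmed (keep=verbatim prefterm atc1)
          out=work.terms nodupkey;
  by verbatim prefterm atc1;
run;

/* One row per equivalenced verbatim. */
proc sort data=work.equiv (keep=verbatim)
          out=work.equiv_terms nodupkey;
  by verbatim;
run;

/* Flag the verbatims that were coded through an equivalence. */
data work.flagged;
  merge work.terms       (in=in_raw)
        work.equiv_terms (in=in_equiv);
  by verbatim;
  if in_raw;                 /* keep only terms present in the raw data */
  length equiv $ 1;
  if in_equiv then equiv = '*';
  else equiv = ' ';
run;
```

Sorting both inputs by VERBATIM first lets a single match-merge set the flag for every mapping of the same verbatim. A flag for terms no longer in the raw data could presumably be derived the same way, by merging this list against the permanent history data set.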
If the variable STAMP is missing for a record, the record must be new, so &STAMP is assigned to it:

    if stamp eq . then stamp = input("&stamp", datetime16.);

WHAT IT LOOKS LIKE

Now it is time to look at the data. The coding history system had to be easy enough to use for someone who is not too familiar with SAS, and it had to produce good-looking output with little effort. PROC REPORT immediately came to mind. Here is where all the different flags come in. There are flags to indicate what type of coding it is (drug, adverse event…), whether the term is still in the raw data, and whether it is an equivalenced term, and, of course, there is the datestamp. If a user wants to see only those drug terms that have been equivalenced since December 21, 1999, the macro call might look like this:

    %codeprnt(type=DRUG, only_eq=YES, since=21DEC1999)

Now, WHERE and/or subsetting IF clauses can be used:

    %if &type ne %then %do;
      where type = upcase("&type");
    %end;
    %if %upcase(&only_eq) ne NO %then %do;
      if equiv eq '*';
      title7 'Equivalenced Terms Only';
    %end;
    %else %do;
      footnote2 '* = Verbatim was equivalenced';
    %end;
    %if &since ne %then %do;
      if datepart(stamp) gt "&since"d;
    %end;

In the above example, only the equivalenced terms are to be displayed, so the flag indicating that a term was equivalenced is not needed. However, when all the terms are displayed, it is useful to indicate which terms are equivalenced. Put some conditionals around the EQUIV variable in the COLUMNS and DEFINE statements:

    columns code prefterm verbatim
      %if %upcase(&only_eq)=NO %then %do;
        equiv
      %end;
      date;
    .
    .
    %if %upcase(&only_eq)=NO %then %do;
      define equiv / '' spacing=1;
    %end;

An example of a report typically produced is shown below.

Dummy Drug Co. 9999 9999 (CVD 9999)              09:57 Thursday, June 29, 2000

                            Coding History Report
                                 Type: DRUG
                           Equivalenced Terms Only
                         Terms coded since 21DEC1999

Code   Verbatim                         Preferred Term              Date Coded
--------------------------------------------------------------------------------
A01AB  FRAMYCETINSULFAT                 NEOMYCIN                    17JAN2000
       GRANEODINE                       NEOMYCIN                    27APR2000

A01AC  DECADRON                         DEXAMETHASONE               28DEC1999
       DEXAMETASONA                     DEXAMETHASONE               17JAN2000
       DEXAMETASONE                     DEXAMETHASONE               27APR2000
       DEXOMETHASONE                    DEXAMETHASONE               01JAN2000
       DEXOMETHESONE                    DEXAMETHASONE               28DEC1999
       FORTECOTZIN                      DEXAMETHASONE               29MAR2000
       HYDROCORTISONE CREAM             HYDROCORTISONE              17MAR2000
       HYDROCORTISONE CREAM 2.5%        HYDROCORTISONE              04APR2000

A01AD  ACETYLSALICYLATE                 ACETYLSALICYLIC ACID        17JAN2000
       ARTIFICIAL SALIVA                UNKNOWN MOUTH PREPARATION   29MAR2000
       ASA 1 TAB QD                     ACETYLSALICYLIC ACID        27APR2000
       ASPIRIN                          ACETYLSALICYLIC ACID        28DEC1999
       ASPIRIN (ACETYLSALICYLIC ACID)   ACETYLSALICYLIC ACID        06MAR2000
       BIODENT-WITH CALCIUM             UNKNOWN MOUTH PREPARATION   27APR2000
       COUGH DROPS                      UNKNOWN MOUTH PREPARATION   29MAR2000
       DISPIRIN                         ACETYLSALICYLIC ACID        17JAN2000
       EC ASA                           ACETYLSALICYLIC ACID        10JAN2000
       MUCOSITIS COCKTAIL               UNKNOWN MOUTH PREPARATION   29MAR2000
       ULTIMATE CLEANSE                 UNKNOWN MOUTH PREPARATION   29MAR2000

A02AA  MILK OF MAGNESIA                 MAGNESIUM HYDROXIDE         28DEC1999

A02AC  CALCIUM-CARBONATE                CALCIUM CARBONATE           28DEC1999
       ORCAL D3                         CALCIUM CARBONATE           17JAN2000
       OROCAL D3                        CALCIUM CARBONATE           17JAN2000
       TUMS ULTRA                       CALCIUM CARBONATE           01JAN2000

A02BA  BELOC MITE                       CIMETIDINE                  29MAR2000
       BELOCMITE                        CIMETIDINE                  29MAR2000 D

A02BC  ANAGASTRA                        PANTOPRAZOLE                27APR2000
       AUTRA                            OMEPRAZOLE                  15MAY2000
       OMEPRAZOL                        OMEPRAZOLE                  17JAN2000
       OMEPRAZOL (MORPAL)               OMEPRAZOLE                  28DEC1999

D = Deleted from raw data

CONCLUSION

Keeping track of the coding of clinical data is useful, especially when it is necessary to show progress. Clients and Medical Monitors want to see that coding is being done accurately and that changes, when necessary, are performed in a timely manner.

REFERENCE

SAS is a registered trademark of the SAS Institute, Inc.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Phyllis K. Wolf
Covance
210 Carnegie Center
Princeton, NJ 08540
Work Phone: 609/452-4062
Fax: 609/520-1754
Email: [email protected]