NESUG 2008 Pharmaceuticals, Health Care, and Life Sciences

Just What the Doctor Ordered: Using SAS® to Calculate Patient Drug Dose with Electronic Medical Record (EMR) Prescription Data

Kathy Hardis Fraeman, United BioSource Corporation, Bethesda, MD

ABSTRACT

Electronic medical records (EMRs) are paperless digital versions of physicians’ paper charts. EMRs can contain a myriad of health care information – including details of physician drug prescriptions – and research efforts with EMR data often include determining patient drug treatment patterns and dose modifications. Unfortunately, electronic versions of physician prescriptions are often as varied and undecipherable as their paper counterparts. Elements in EMR drug prescription data include the often unstandardized text name of the drug, and inconsistently used numeric variables indicating the number of pills to be taken at one time, the number of times per day the pills should be taken, number of days the pills should be taken, and dosage amount per pill. Further, dosing instructions are often given as unstandardized Latin abbreviations, such as “BID,” “QD,” and “PRN.” Although there’s no “right way” to analyze such varied EMR drug prescription data, programming tips and techniques with Base SAS are presented that show ways to derive patient drug dose and treatment patterns with EMR data. Examples from a general practice EMR and an outpatient oncology EMR are used to illustrate the tips and techniques.

INTRODUCTION

Electronic medical records (EMRs) can be described as paperless, digital versions of the patient chart folders found on rows of shelves in physicians’ offices. EMRs can contain a wide variety of health care information, including medical histories, details of diagnoses and treatments, clinical laboratory test results and treatment responses, and visit scheduling and billing information.
Numerous studies have touted the advantages for physicians using EMR systems, as EMRs have the potential to improve the quality of patient care while substantially reducing medical costs. Although EMRs are not yet widely used in a majority of physician offices, their use is increasing. A survey conducted by the CDC’s National Center for Health Statistics found that almost one in four physicians reported using either full or partial EMR systems in their office-based practices in 2005, representing a 32% increase since 2001.

While physicians use EMRs to improve the quality of medical care and efficiency of their practices, scientists and medical researchers see the enormous research potential in de-identified EMR data. Not only do EMR data provide a wealth of information about real-world medical practices, they also include laboratory results and patient treatment response assessments not usually found in insurance claims databases. EMR data further provide important patient demographics not available in insurance claims data, such as patient height and weight, and smoking status and alcohol use.

As the saying goes, “if you’ve seen one EMR, you’ve seen one EMR.” EMRs are not standardized in terms of either their database structure or content, and there’s no single or “right” way to analyze EMR data for their wide range of research possibilities. Drug prescription data – one of the many types of data found in EMRs – can be very useful for a variety of research efforts, but these electronic drug prescription data can also be as undecipherable as their paper prescription counterparts.

SPECIFIC EMRS

The two EMRs discussed in this paper are:
• a general practice EMR, and
• an outpatient oncology EMR.

All of the patient data in these EMRs are de-identified, as required by the Health Insurance Portability and Accountability Act of 1996 (HIPAA) regulations.

GENERAL PRACTICE EMR

The general practice EMR is a database containing medical data from a group of over 2,600 general practitioners and over 6 million patients. More than half of the states in the US are represented, and the average length of follow-up for patients is approximately 3 years. This database includes detailed information regarding patient demographics, such as patient height and weight, smoking status, and alcohol use. The general practice EMR also includes the results of several hundred different types of laboratory tests, and provides complete information on all drugs for which a prescription is written for patients receiving care from that practice. The general practice EMR contains 17 separate data sets, four of which contain prescription drug information.

OUTPATIENT ONCOLOGY EMR

The outpatient oncology EMR database includes medical and treatment information for more than 185,000 cancer patients from 18 outpatient oncology provider organizations comprising 71 clinic locations in 15 different states across the United States. The EMR contains descriptions of prescribed chemotherapy treatment regimens, including the number and length of each chemotherapy cycle within the regimen. The EMR also contains physician prescription information for each specific chemotherapy agent included in the regimen, along with prescriptions for supportive care drugs for the side effects of chemotherapy and palliative care for cancer patients. If the chemotherapy agents or supportive care drugs are administered medically at the cancer clinic, usually by injection or infusion, then the EMR records information about the drug administrations at the clinic. The outpatient oncology EMR contains 49 separate data sets, six of which contain drug prescription and drug administration information, although our drug analyses with these EMR data only use three data sets.

CONVERTING EMR DATA TO SAS

None of the data in either of these two EMRs were originally collected in SAS format at the physician offices or oncology clinics, and these data are not delivered to research clients in SAS format. The original EMR data were collected using other relational database packages, and these relational databases were received for analysis as multiple separate text files. These text files had to be converted into SAS format data sets, with the appropriate variable characteristics, labels, and formats. The EMR data also needed varying degrees of internal standardization, especially for the variables relevant to deriving patient drug dosing and treatment patterns.

EMR Programming Tip – Converting non-SAS EMR data to SAS format

Conversion of non-SAS EMR data into a SAS data set format can be time-consuming, especially given the large number of data sets, variables, and attached formats found in most EMRs. If non-SAS EMR data are delivered with an electronic data dictionary, it’s possible to use that data dictionary as part of the data conversion process. The electronic data dictionary can be converted to SAS meta-data, which can then be used to dynamically generate SAS DATA-step programming, such as format, label, and length statements, to be used as part of the data conversion process. For an example of this programming technique, see Fraeman, NESUG 1999.

DRUG PRESCRIPTION DATA VS. DRUG DISPENSING DATA

Electronic medical records contain drug prescription data, rather than the actual drug dispensing information found in medical insurance claims data. EMR prescription data describe how a drug was prescribed for the patient by the physician, but EMR data can’t verify whether that prescription was actually filled. If the prescription indicates an allowable number of refills, the EMR data won’t be able to directly indicate whether the patient ever refilled the prescription.
One advantage of EMR drug prescription data over insurance claims drug dispensing data is that EMRs are able to monitor physician prescriptions for over-the-counter medications.

EMR Programming Tip – Analytically interpreting EMR prescription data

There are no standard rules about how to interpret patient drug use with “drug prescription data” as opposed to “drug dispensing data”, other than to be aware of this distinction. It is important to know about the drugs being studied, the medical conditions being treated by these drugs, and the types of treatment patterns that might be expected for these medical conditions. Most importantly, look at the actual EMR drug prescription data, paying special attention to the relative timing and dates of each patient’s drug prescriptions. Understanding the drugs, diseases, and the actual EMR drug prescription data will help develop study-specific rules on how to interpret the EMR drug prescription data.

DRUG PRESCRIPTION DATA IN EMRS

The data sets, or files, in an EMR that provide information about each patient’s drug prescriptions, and the variables contained within each drug prescription data file, can vary across different EMRs.

GENERAL PRACTICE EMR

The drug prescription data in the general practice EMR have already undergone some degree of standardization prior to receipt of the data.
The prescription drug files and selected variables found in these files from the general practice EMR are described below:

General Practice EMR
EMR File Description, followed by Selected Variables

Drug Reference File (one observation for each drug)
• Medication Key – Unique drug identifier
• Names of drug – product and generic
• Numeric drug strength and units of measure
• Drug Generic Product Index (GPI) code
• National Drug Code (NDC)
• Medication Categories

Patient Medication File (one observation for each patient medication)
• Patient Key – Unique patient identifier
• Medication Key – Unique drug identifier
• Sig Key – Unique key for medication instructions
• Drug start and stop dates
• Drug active flag
• Drug stop reason

Patient Prescription File (one observation for each patient prescription)
• Patient Key – Unique patient identifier
• Medication Key – Unique drug identifier
• Prescription Key – Unique prescription identifier
• Prescription Date
• Number of refills

Sig File (one observation for each set of medication instructions)
• Sig Key – Unique key for medication instructions
• Medication frequency, dose, route

EMR Programming Tip – Efficiently using the Drug Reference File

A Drug Reference File in an EMR will have a unique drug key variable associated with each drug, and will have more detailed drug information, such as complete drug name, drug strength, formulation, and route of administration. Each entry in the EMR Patient Medication File will be keyed on patient ID, and will also include the unique drug key variable, instead of including all of the detailed information about the drug in each observation of the Patient Medication File. One method to get all of the detailed information about the drug into each patient drug record in the Patient Medication File would be to SORT/MERGE both files using the drug key variable.
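
For reference, the SORT/MERGE approach just described might look like the minimal sketch below. The library reference IN, the data set names, and the variables medication_key and drug_name are assumptions chosen to match the naming used elsewhere in this paper; they are not taken from an actual EMR layout.

```sas
/*--------------------------------------------------------------*/
/* Sketch only: data set and variable names are assumptions.    */
/* Both files must be sorted by the drug key before the merge.  */
/*--------------------------------------------------------------*/
proc sort data=in.DRUG_REFERENCE_FILE (keep = medication_key drug_name)
          out=drug_ref;
   by medication_key;
run;

proc sort data=in.PATIENT_MEDICATION_FILE
          out=pat_med;
   by medication_key;
run;

data pat_med_named;
   merge pat_med  (in = in_pat)
         drug_ref;
   by medication_key;
   /* Keep every patient medication record, matched or not, and */
   /* drop reference drugs that no patient was prescribed       */
   if in_pat;
run;
```

The second PROC SORT is the expensive step, because it reorders the entire Patient Medication File by drug key and usually has to be followed by another sort back to patient order.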
This method of adding detailed drug information to each patient drug record will be slow and inefficient, especially if the Drug Reference File has hundreds of thousands of observations, and the Patient Medication File has hundreds of millions of observations.

Eason (SUGI 2005) describes how to use PROC FORMAT as a “speedy alternative” to the SORT/MERGE. The SAS code that will create a variable for drug name, based on a format generated from the unique medication key, would look something like this:

   /****************************************************************/
   /* Create a format for drug name from the Drug Reference File   */
   /****************************************************************/
   data drgfmt (keep = start label fmtname);
      set in.DRUG_REFERENCE_FILE;
      length label $ 30;
      /* START holds the value to be formatted (the medication key) */
      start = medication_key;
      /*---------------------------*/
      /* Format for name of drug   */
      /*---------------------------*/
      label = drug_name;
      fmtname = "drgfmt";
   run;

   /* Build the format from the control data set */
   proc format cntlin=drgfmt library=library;
   run;

   /**********************************************************************/
   /* Create a new variable in Patient Medication file using the format  */
   /**********************************************************************/
   data new_drug_file;
      set in.PATIENT_MEDICATION_FILE;
      length drug_name $ 30;
      label drug_name = "Drug Name";
      drug_name = put(medication_key, drgfmt.);
   run;

Other formats can be made that translate the unique medication key into drug strength, formulation, or any other type of information about the drug found in the Drug Reference File.

OUTPATIENT ONCOLOGY EMR

The drug prescription data in the outpatient oncology EMR are complicated, both in terms of the structure of the EMR data and in the actual drug treatment patterns.
These EMR data describe complex chemotherapy regimens used to treat cancer patients, where each chemotherapy regimen might consist of multiple chemotherapy agents administered with different timings, supportive care for the adverse side effects of chemotherapy, and palliative care for the symptoms of cancer. These complex chemotherapy dosing schedules can be disrupted for a variety of reasons, such as patient health, adverse side effects of treatment, and changes in treatment strategy.

The prescription drug data in the outpatient oncology EMR have undergone little to no standardization prior to receipt of the data. This lack of standardization is especially notable for the text variables supplying the names of the chemotherapy regimens, and the names of the individual chemotherapy agents. The regimen and chemotherapy agent names are found in the EMR as trade names, brand names, and abbreviations, all of which can have “alternate” spellings and contain extraneous text.

The outpatient oncology EMR only contains information about drugs prescribed by the outpatient oncology clinic for the treatment and care of cancer patients. The EMR does not contain prescription drug information for other types of drugs used to treat other common non-cancer conditions, such as diabetes or hyperlipidemia.

The important drug data files and selected variables found in these files from the outpatient oncology EMR are described below.

Outpatient Oncology EMR
EMR File Description, followed by Selected Variables

Treatment Regimen File (one observation for each prescribed regimen per patient, although this file is not used by all facilities entering data into the EMR)
• Patient Key – Unique patient identifier
• Regimen Key – Unique treatment regimen identifier
• Name of the treatment regimen, unstandardized
• Number of cycles within each treatment plan
• Length of each treatment cycle
• Regimen start date

Patient Prescription File (one observation for each individual agent prescribed for the patient)
• Patient Key – Unique patient identifier
• Regimen Key – Unique treatment regimen identifier if prescription is part of a defined regimen
• Prescription Key – Unique prescription identifier
• Name of prescribed agent
• Date of prescription
• Over 20 separate variables used to describe the drug’s prescribed dose, strength, route, frequency, duration, quantity dispensed, number of refills

Patient Drug Administration File (one observation for each prescribed drug administered medically at the outpatient oncology clinic)
• Patient Key – Unique patient identifier
• Prescription Key – Unique prescription identifier
• Name of administered agent (unstandardized text)
• Date of administration
• Variables for amount of administered dose, route of administration, and number of doses administered

EMR Programming Tip – Multiple inconsistently used variables for drug dose, strength, and frequency

The best way to interpret variables that quantify drug dose, especially when the variables are not used consistently, is to carefully look at frequencies and cross-frequencies of all drug dosing variables prior to analysis. The variables’ actual values cannot automatically be assumed to be consistent with the variables’ definitions.
For example, sometimes three separate variables will be used to indicate 2 pills (dose) of 50 mg each (strength) taken 4 times a day (frequency), giving a total daily dose of 2 × 50 × 4 = 400 mg. Other times, the value 400 will be entered directly in the drug dose field, and the values for strength and frequency will be missing.

EMR Programming Tip – Standardizing Drug Names

We maintain an Excel spreadsheet for almost 250 anti-neoplastic agents relevant to EMR oncology analyses. This spreadsheet gives a standardized name and classification for each anti-neoplastic agent, along with a variety of brand names, generic names, and alternate spellings that might be found as text data in the EMR. This spreadsheet is updated as needed, and converted into a SAS data set. One SAS program uses this SAS data set of standardized agent names as input to dynamically generate another SAS “text standardization” program. None of the actual agent names are ever “hard-coded” in this dynamically generated standardization program. The text standardization program also incorporates Base SAS string functions, “fuzzy” text comparison functions, and Perl regular expression (PRX) functions as part of the text standardization process.

PRESCRIPTION TERMINOLOGY

Doctors write prescriptions using abbreviations called “sig” codes to describe how the prescribed drug should be administered to the patient. These sig codes often contain abbreviations of Latin words that indicate when and how these drugs should be administered. Sig prescription codes can appear in an EMR as text data, which will need to be parsed into relevant numeric variables for quantitative dose calculations.
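
As a hedged illustration of that parsing step, the sketch below uses the PRX functions to pull an “every N hours” frequency out of free-text sig data and convert it to doses per day. The variable names (sig_text, every_n_hours, doses_per_day), the sample records, and the single pattern shown are assumptions made for illustration only; real EMR text would need many more patterns, chosen after reviewing frequencies of the actual values.

```sas
/*--------------------------------------------------------------*/
/* Sketch only: variable names and sample data are hypothetical */
/*--------------------------------------------------------------*/
data sig_parsed;
   length sig_text $ 40;
   input sig_text $char40.;

   /* Compile the pattern once: "q", an integer, then "h",      */
   /* ignoring case and optional blanks (q4h, Q8H, q 12 h)      */
   retain rx_qh;
   if _n_ = 1 then rx_qh = prxparse('/\bq\s*(\d+)\s*h\b/i');

   doses_per_day = .;
   if prxmatch(rx_qh, sig_text) then do;
      /* Capture buffer 1 holds the number of hours             */
      every_n_hours = input(prxposn(rx_qh, 1, sig_text), 8.);
      if every_n_hours > 0 then doses_per_day = 24 / every_n_hours;
   end;

   drop rx_qh;
   datalines;
T1 PO q4h
T2 PO Q8H prn
T1 PO bid
;
run;
```

Under these assumptions the first record yields 6 doses per day and the second yields 3, while the third is left missing because “bid” does not fit the every-N-hours pattern and would need its own rule.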
A partial list of sig codes that might be found in EMR data, and their definitions, is given in the table below:

TYPE OF RX SIG CODE            CODE   DEFINITION
Amount to take                 T1     Take one
                               T2     Take two
Route of drug administration   PO     By mouth (orally)
                               SL     Sublingual (under tongue)
                               s.c.   Subcutaneously (inject under skin)
                               IV     Intravenously (inject or infuse into veins)
                               top    Topically
                               PR     Rectally
When to take drug              qd     Every day
                               bid    Twice a day
                               tid    Three times a day
                               qid    Four times a day
                               qhs    Once a day before bedtime
                               q4h    Every 4 hours
                               q8h    Every 8 hours
                               qw     Every week
                               tiw    Three times a week
                               qam    Every morning
                               prn    As needed

The following gives an example of SAS code used to translate an EMR text variable that contains either a sig code, or the text definition of the code, into a numeric variable equal to the number of doses administered per day. The variables in the code include:
• ADMN_DOSE_FRQ_DESC – A text variable that contains either a sig code or the text of the definition of a sig code
• ADMN_FRQ_X – A numeric variable that quantifies “every” when ADMN_DOSE_FRQ_DESC = “q”
• ADMN_FRQ_UNIT – A coded numeric variable that gives the units associated with ADMN_FRQ_X, where 4 = “Days”, 3 = “Hours”
• ADMN_DOSE_FRQ_NUM – A derived numeric variable that gives the number of doses taken per day

   data DRUG_daily_freq;
      set DRUG_orders;
      label admn_dose_frq_num = "Numeric daily dose frequency";

      /*-----------------------------------------------------------------------*/
      /* Translate character Dose Frequency (ADMN_DOSE_FRQ_DESC) into a number */
      /* Need to also use variables ADMN_FRQ_X and ADMN_FRQ_UNIT when          */
      /* Dose Frequency = "q" (medical term for "every")                       */
      /* Need to run cross frequencies of variables ADMN_DOSE_FRQ_DESC,        */
      /* ADMN_FRQ_X, and ADMN_FRQ_UNIT prior to writing program to make sure   */
      /* all values of unstandardized text strings found in data will be       */
      /* included in the program.                                              */
      /*-----------------------------------------------------------------------*/
      if      admn_dose_frq_desc = "at bedtime"       then admn_dose_frq_num = 1;
      else if admn_dose_frq_desc = "b.i.d."           then admn_dose_frq_num = 2;
      else if admn_dose_frq_desc = "four times a day" then admn_dose_frq_num = 4;
      else if admn_dose_frq_desc = "t.i.d."           then admn_dose_frq_num = 3;
      else if admn_dose_frq_desc = "daily"            then admn_dose_frq_num = 1;
      else if admn_dose_frq_desc = "ac am"            then admn_dose_frq_num = 1;
      else if admn_dose_frq_desc = "pc (bid)"         then admn_dose_frq_num = 2;
      else if admn_dose_frq_desc = "6x/d"             then admn_dose_frq_num = 6;
      else if admn_dose_frq_desc = "q" then do;
         /*----------------------*/
         /* Units = "every day"  */
         /*----------------------*/
         if      admn_frq_x = 1  and admn_frq_unit = 4 then admn_dose_frq_num = 1;
         else if admn_frq_x = 3  and admn_frq_unit = 4 then admn_dose_frq_num = 1/3;
         /*-----------------------*/
         /* Units = "every hour"  */
         /*-----------------------*/
         else if admn_frq_x = 12 and admn_frq_unit = 3 then admn_dose_frq_num = 2;
      end;
   run;

TYPES OF ANALYSES WITH EMR DRUG PRESCRIPTION DATA

Types of analyses that can be conducted with EMR drug prescription data include determining drug treatment patterns, identifying drug dose modifications, and quantifying total drug dose or average daily drug dose. EMR drug prescription data can also be used to quantify each patient’s time to an event associated with drug treatment, such as the time to a drug discontinuation, drug switch, or modification of drug dose.

DRUG TREATMENT PATTERNS

EMR drug prescription data can be analyzed to determine if a patient continued drug therapy over a period of time, discontinued drug therapy, switched to use of a different drug, or added an additional drug to a treatment regimen. Each analysis of EMR drug prescription data will need to define the specific details of the rules used to characterize drug treatment patterns.
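
One rule of this kind is an allowable gap between consecutive prescriptions for the same drug. The sketch below shows one possible way to flag gaps that exceed such a limit, with the limit held in a macro variable. The data set RX, its variables (patient_key, rx_start, rx_stop), the sample dates, and the 30-day limit are all hypothetical, and the sketch assumes the data have already been restricted to a single drug.

```sas
/*--------------------------------------------------------------*/
/* Sketch only: names, dates, and the 30-day limit are assumed  */
/*--------------------------------------------------------------*/
%let gap_days = 30;   /* change here to rerun as a sensitivity analysis */

data rx;
   input patient_key rx_start :date9. rx_stop :date9.;
   format rx_start rx_stop date9.;
   datalines;
1 01JAN2008 30JAN2008
1 10FEB2008 10MAR2008
1 01JUN2008 30JUN2008
;
run;

proc sort data=rx;
   by patient_key rx_start;
run;

data rx_gaps;
   set rx;
   by patient_key;

   /* LAG must execute on every record, so call it unconditionally */
   prev_stop = lag(rx_stop);
   if first.patient_key then prev_stop = .;

   /* Days with no prescription coverage before this prescription */
   if prev_stop ne . then gap = rx_start - prev_stop - 1;

   /* 1 if the gap before this prescription exceeds the limit */
   gap_exceeded = (gap > &gap_days);

   format prev_stop date9.;
run;
```

With these sample records, the second prescription follows a 10-day gap and is treated as continued use, while the third follows a gap of well over 30 days and is flagged. Patients whose follow-up extends beyond the limit after their last prescription would need a separate check against the end of follow-up.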

Continuation of drug therapy can be determined by looking at the number of days between the end of the previous prescription and the start of the next prescription for the same drug. If an allowable number of “gap days” between prescriptions is defined, then patients can be considered as having continued drug use, even though the actual prescription start and stop dates might not be continuous. When the number of days between drug prescriptions exceeds the allowable number of gap days, or if a patient has been followed up for more than the specified number of gap days without a new prescription, then that patient can be assumed to have discontinued use of the drug.

EMR Programming Tip – Determining drug treatment patterns with gaps between prescriptions

Set up the allowable number of “gap days” between prescriptions as a macro variable, instead of hard-coding the number of allowable gap days. Sensitivity analyses, based on changing the number of allowable “gap days”, can be done more easily if the number of allowable gap days is stored in a macro variable.

Switching a drug can be defined as discontinuing treatment with one drug and starting treatment with a different drug. Adding another drug can be defined as starting treatment with another drug while continuing treatment with the original drug.

RELATIVE CHANGE IN DRUG DOSE

Patients might need to change the prescribed dose of a drug in response to an adverse side effect, or in response to a treatment effect. Changes in drug dose can be identified by looking at the individual components of a drug prescription – amount taken, frequency of administration, and strength of a single dose – to determine if there was a change in prescription drug dose.

QUANTIFYING DRUG DOSE

Total patient drug dose can be calculated several ways with EMR data, depending on the specific available prescription drug data.
Daily patient drug dose can also be derived with simple computations, either by knowing the patient’s daily treatment regimen, or by knowing the total amount of drug included in the prescription and the prescription’s days’ supply.

The calculation of total patient dose using real-world EMR data can require more than simple computations because a patient might have several prescriptions for a single drug. The available prescription drug data needed to derive a patient’s total drug dose might be different on the different prescriptions, and any single prescription might not have sufficient data to calculate total dose for that prescription.

EMR Programming Tip – Calculating total drug dose from multiple prescriptions, with missing data

As the number of drug prescriptions for a patient in an EMR increases, the odds are that at least one of these prescriptions will be missing some component of the relevant data needed to calculate total prescription dose. Two options for analyzing prescription drug data with missing dose variables include excluding patients with any missing drug dose data, or imputing the missing data. Imputing missing dose data can be based on data found in the patient’s other prescriptions. If patients with any missing prescription data are dropped from an analysis of total dose, then the study will be biased towards patients with fewer prescriptions.

CONCLUSION:

As the use of EMR systems in medical practices continues to increase, the number of patients, different medical specialties represented, and total length of follow-up available on each patient will rise. As the medical and scientific communities realize the value of EMR data for health-related research, EMR systems will hopefully be designed to allow for data that are more easily accessible for analysis, especially for analysis with SAS.

REFERENCES:

Burt CW, Hing E, Woodwell D. “Electronic Medical Record Use by Office-Based Physicians: United States, 2005.” Center for Disease Control, National Center for Health Statistics, Health E-Stat, July 21, 2006. www.cdc.gov/hchs

Eason, Jenine. “Proc Format, a Speedy Alternative to Sort/Merge.” Proceedings of the 30th Annual SAS Users Group International (SUGI) Conference, 2005.

Fraeman, Kathy Hardis. “Dynamic Generation of SAS Data Step Programming From An Excel Data Dictionary.” Proceedings of the 12th Annual Northeast SAS Users Group (NESUG) Regional Conference, 1999.

Nordstrom BL, Fraeman KH, Luo W, Collins JW, O’Malley CD, Nordyke RJ. “Red Blood Cell Transfusions Among Cancer Patients on Chemotherapy: A Descriptive Epidemiologic Study.” Journal of Clinical Oncology, 2008 ASCO Annual Meeting Proceedings Part I. Vol 26, (May 20 Supplement) 2008: 20623.

Nordstrom BL, Langer C, Hussain A, Barghout V, Modi C, Lacerna L, Gralow JR. “Renal Function Among Cancer Patients with Bone Metastases Treated with Zolendronic Acid in a Real World Setting.” Journal of Clinical Oncology, 2007 ASCO Annual Meeting Proceedings Part I. Vol 25, No 18S (June 20 Supplement) 2007: 19540.

Luo W, Nordstrom B, Ranganathan G, Linz H, Stokes M, Ross SD, Knopf K. “Adherence to Guidelines for use of Erythropoiesis Stimulating Agents in Patients with Chemotherapy-Induced Anemia: Trends from Electronic Medical Records.” International Society for Pharmacoeconomics and Outcomes Research (ISPOR) 12th Annual International Meeting, May 22 2007, Arlington, VA.

ACKNOWLEDGMENTS

The author would like to acknowledge Beth L. Nordstrom, PhD, MPH, Senior Epidemiologist at the United BioSource Corporation and principal investigator for all EMR research studies in which the author participated. The author would also like to thank Dr. Nordstrom for her insightful comments while editing this manuscript.

SAS is a Registered Trademark of the SAS Institute, Inc. of Cary, North Carolina.

CONTACT INFORMATION

Please contact the author with any comments or questions:

Kathy H. Fraeman
United BioSource Corporation
7101 Wisconsin Avenue, Suite 600
Bethesda, MD 20832
(240) 235-2525 voice
(301) 654-9864 fax
[email protected]