NESUG 2008
Pharmaceuticals, Health Care, and Life Sciences
Just What the Doctor Ordered: Using SAS® to Calculate Patient Drug Dose
with Electronic Medical Record (EMR) Prescription Data
Kathy Hardis Fraeman, United BioSource Corporation, Bethesda, MD
ABSTRACT
Electronic medical records (EMRs) are paperless digital versions of physicians’ paper charts. EMRs can contain
a myriad of health care information – including details of physician drug prescriptions – and research efforts with
EMR data often include determining patient drug treatment patterns and dose modifications. Unfortunately,
electronic versions of physician prescriptions are often as varied and undecipherable as their paper counterparts.
Elements in EMR drug prescription data include the often unstandardized text name of the drug, and
inconsistently used numeric variables indicating the number of pills to be taken at one time, the number of times
per day the pills should be taken, number of days the pills should be taken, and dosage amount per pill. Further,
dosing instructions are often given as unstandardized Latin abbreviations, such as “BID,” “QD,” and “PRN.”
Although there’s no “right way” to analyze such varied EMR drug prescription data, programming tips and
techniques with Base SAS are presented that show ways to derive patient drug dose and treatment patterns with
EMR data. Examples from a general practice EMR and an outpatient oncology EMR are used to illustrate the tips
and techniques.
INTRODUCTION
Electronic medical records (EMRs) can be described as paperless, digital versions of the patient chart folders
found on rows of shelves in physicians’ offices. EMRs can contain a wide variety of health care information,
including medical histories, details of diagnoses and treatments, clinical laboratory test results and treatment
responses, and visit scheduling and billing information. Numerous studies have touted the advantages for
physicians using EMR systems, as EMRs have the potential to improve the quality of patient care while
substantially reducing medical costs.
Although EMRs are not yet widely used in a majority of physician offices, their use is increasing. A survey
conducted by the CDC’s National Center for Health Statistics found that almost one in four physicians reported
using either full or partial EMR systems in their office-based practices in 2005, representing a 32% increase since
2001.
While physicians use EMRs to improve the quality of medical care and efficiency of their practices, scientists and
medical researchers see the enormous research potential in de-identified EMR data. Not only do EMR data
provide a wealth of information about real-world medical practices, they also include laboratory results and patient
treatment response assessments not usually found in insurance claims databases. EMR data further provide
important patient demographics not available in insurance claims data, such as patient height and weight, and
smoking status and alcohol use.
As the saying goes, “if you’ve seen one EMR, you’ve seen one EMR.” EMRs are not standardized in terms of
either their database structure or content, and there’s no single or “right” way to analyze EMR data for their wide
range of research possibilities. Drug prescription data – one of the many types of data found in EMRs – can be
very useful for a variety of research efforts, but these electronic drug prescription data can also be as
undecipherable as their paper prescription counterparts.
SPECIFIC EMRS
The two EMRs discussed in this paper are:
• a general practice EMR, and
• an outpatient oncology EMR.
All of the patient data in these EMRs are de-identified, as required by the Health Insurance Portability and
Accountability Act of 1996 (HIPAA) regulations.
GENERAL PRACTICE EMR
The general practice EMR is a database containing medical data from a group of over 2,600 general practitioners
and over 6 million patients. More than half of the states in the US are represented, and the average length of
follow-up for patients is approximately 3 years. This database includes detailed information regarding patient
demographics, such as patient height and weight, smoking status, and alcohol use. The general practice EMR
also includes the results of several hundred different types of laboratory tests, and provides complete information
on all drugs for which a prescription is written for patients receiving care from that practice.
The general practice EMR contains 17 separate data sets, four of which contain prescription drug information.
OUTPATIENT ONCOLOGY EMR
The outpatient oncology EMR database includes medical and treatment information for more than 185,000 cancer
patients from 18 outpatient oncology provider organizations comprising 71 clinic locations in 15 different states
across the United States. The EMR contains descriptions of prescribed chemotherapy treatment regimens,
including the number and length of each chemotherapy cycle within the regimen. The EMR also contains
physician prescription information for each specific chemotherapy agent included in the regimen, along with
prescriptions for supportive care drugs for the side effects of chemotherapy and palliative care for cancer patients.
If the chemotherapy agents or supportive care drugs are administered medically at the cancer clinic, usually by
injection or infusion, then the EMR records information about the drug administrations at the clinic.
The outpatient oncology EMR contains 49 separate data sets, six of which contain drug prescription and drug
administration information, although our drug analyses with these EMR data only use three data sets.
CONVERTING EMR DATA TO SAS
None of the data in either of these two EMRs were originally collected in SAS format at the physician offices or
oncology clinics, and these data are not delivered to research clients in SAS format. The original EMR data were
collected using other relational database packages, and these relational databases were received for analysis as
multiple separate text files. These text files had to be converted into SAS format data sets, with the appropriate
variable characteristics, labels, and formats. The EMR data also needed varying degrees of internal
standardization, especially for the variables relevant to deriving patient drug dosing and treatment patterns.
EMR Programming Tip – Converting non-SAS EMR data to SAS format
Conversion of non-SAS EMR data into a SAS data set format can be time-consuming, especially given the large
number of data sets, variables, and attached formats found in most EMRs. If non-SAS EMR data are delivered
with an electronic data dictionary, it’s possible to use that data dictionary as part of the data conversion process.
The electronic data dictionary can be converted to SAS meta-data, which can then be used to dynamically
generate SAS DATA-step programming, such as format, label, and length statements, to be used as part of the
data conversion process. For an example of this programming technique, see Fraeman, NESUG 1999.
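The idea of driving the conversion from a data dictionary can be sketched as follows. This is only an illustration, not the program described in the cited paper: the data set names DICT and RAW, and the variables VARNAME and VARLABEL, are hypothetical stand-ins for an electronic data dictionary and a converted EMR file.

```sas
/* Hypothetical data dictionary: one row per EMR variable */
data dict;
   length varname $ 32 varlabel $ 40;
   infile datalines dsd;
   input varname $ varlabel $;
   datalines;
patient_key,Unique patient identifier
rx_date,Prescription date
;
run;

/* Build the text of a LABEL statement from the dictionary rows */
proc sql noprint;
   select catx(' ', varname, '=', quote(strip(varlabel)))
      into :label_list separated by ' '
      from dict;
quit;

/* Hypothetical converted EMR file that still lacks labels */
data raw;
   patient_key = 1;
   rx_date     = '01JAN2008'd;
run;

/* Apply the generated labels without re-reading the data */
proc datasets library=work nolist;
   modify raw;
   label &label_list;
quit;
```

The same pattern could generate FORMAT or LENGTH statements, since the dictionary rows, rather than the program, hold the variable-specific text.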
DRUG PRESCRIPTION DATA VS. DRUG DISPENSING DATA
Electronic medical records contain drug prescription data, rather than the actual drug dispensing information
found in medical insurance claims data. EMR prescription data describe how a drug was prescribed for the
patient by the physician, but EMR data can’t verify if that prescription was actually filled. If the prescription
indicates an allowable number of refills, the EMR data won’t be able to directly indicate if the patient actually ever
refilled the prescription. One advantage of EMR drug prescription data over insurance claims drug dispensing
data is that EMRs are able to monitor physician prescriptions for over-the-counter medications.
EMR Programming Tip – Analytically interpreting EMR prescription data
There are no standard rules about how to interpret patient drug use with “drug prescription data” as opposed to
“drug dispensing data”, other than to be aware of this distinction. It is important to know about the drugs being
studied, the medical conditions being treated by these drugs, and the types of treatment patterns that might be
expected for these medical conditions. Most importantly, look at the actual EMR drug
prescription data, paying special attention to the relative timing and dates of each patient’s drug
prescriptions. Understanding the drugs, diseases, and the actual EMR drug prescription data will help develop
study-specific rules on how to interpret the EMR drug prescription data.
DRUG PRESCRIPTION DATA IN EMRS
The data sets, or files, in an EMR that provide information about each patient’s drug prescriptions -- and variables
contained within each drug prescription data file -- can vary across different EMRs.
GENERAL PRACTICE EMR
The drug prescription data in the general practice EMR have already undergone some degree of standardization
prior to receipt of the data. The prescription drug files and selected variables found in these files from the general
practice EMR are described below:
General Practice EMR: prescription drug files and selected variables

Drug Reference File (one observation for each drug)
   • Medication Key – Unique drug identifier
   • Names of drug – product and generic
   • Numeric drug strength and units of measure
   • Drug Generic Product Identifier (GPI) code
   • National Drug Code (NDC)
   • Medication Categories

Patient Medication File (one observation for each patient medication)
   • Patient Key – Unique patient identifier
   • Medication Key – Unique drug identifier
   • Sig Key – Unique key for medication instructions
   • Drug start and stop dates
   • Drug active flag
   • Drug stop reason

Patient Prescription File (one observation for each patient prescription)
   • Patient Key – Unique patient identifier
   • Medication Key – Unique drug identifier
   • Prescription Key – Unique prescription identifier
   • Prescription Date
   • Number of refills

Sig File (one observation for each set of medication instructions)
   • Sig Key – Unique key for medication instructions
   • Medication frequency, dose, route
EMR Programming Tip – Efficiently using the Drug Reference File
A Drug Reference File in an EMR will have a unique drug key variable associated with each drug, and will have
more detailed drug information, such as complete drug name, drug strength, formulation, and route of
administration. Each entry in the EMR Patient Medication File will be keyed on patient ID, and will also include
the unique drug key variable, instead of including all of the detailed information about the drug in each
observation of the Patient Medication File.
One method to get all of the detailed information about the drug into each patient drug record in the Patient
Medication File would be to SORT/MERGE both files using the drug key variable. This method of adding
detailed drug information to each patient drug record will be slow and inefficient, especially if the Drug Reference
File has hundreds of thousands of observations, and the Patient Medication File has hundreds of millions of
observations.
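For comparison, the SORT/MERGE approach might look something like the sketch below. The two small data sets are toy stand-ins for the Drug Reference File and the Patient Medication File; their names and values are hypothetical.

```sas
/* Toy stand-in for the Drug Reference File (hypothetical values) */
data drug_ref;
   length drug_name $ 30;
   input medication_key drug_name $;
   datalines;
101 metformin
102 atorvastatin
;
run;

/* Toy stand-in for the Patient Medication File */
data patient_meds;
   input patient_key medication_key;
   datalines;
1 102
1 101
2 101
;
run;

/* Both files must be sorted by the drug key before the merge */
proc sort data=drug_ref;     by medication_key; run;
proc sort data=patient_meds; by medication_key; run;

data meds_named;
   merge patient_meds (in=in_med) drug_ref;
   by medication_key;
   if in_med;   /* keep only drug keys that occur in the patient file */
run;
```

The cost of this approach is the sort of the very large patient file, which the merge also leaves ordered by drug key rather than by patient.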
Eason (2005) describes how to use PROC FORMAT as a “speedy alternative” to the SORT/MERGE. The SAS code
that creates a variable for drug name, based on a format generated from the unique medication key, would
look something like this:
/****************************************************************/
/* Create a format for drug name from the Drug Reference File   */
/****************************************************************/
data drgfmt (keep = start label fmtname type);
   set in.DRUG_REFERENCE_FILE;
   length label $ 30;
   fmtname = "drgfmt";     /* name of the format to create                   */
   type    = "N";          /* numeric format: assumes a numeric key variable */
   start   = medication_key;
   /*---------------------------*/
   /* Format for name of drug   */
   /*---------------------------*/
   label = drug_name;
run;

/* The LIBRARY libref must already be assigned with a LIBNAME statement */
proc format cntlin=drgfmt library=library;
run;

/**********************************************************************/
/* Create a new variable in Patient Medication file using the format  */
/**********************************************************************/
data new_drug_file;
   set in.PATIENT_MEDICATION_FILE;
   length drug_name $ 30;
   label drug_name = "Drug Name";
   drug_name = put(medication_key, drgfmt.);
run;
Other formats can be made that translate the unique medication key into drug strength, formulation, or any other
type of information about the drug found in the Drug Reference File.
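As a sketch of one such additional format, the code below maps the medication key to a numeric drug strength. The data set and variable names (DRUG_REF, STRENGTH_MG, STRFMT) are hypothetical, and the catch-all row for unmatched keys is an added assumption rather than part of the method described above.

```sas
/* Toy stand-in for the Drug Reference File (hypothetical values) */
data drug_ref;
   input medication_key strength_mg;
   datalines;
101 500
102 20
;
run;

/* Control data set: one row per drug plus an OTHER row */
data strfmt (keep = start label fmtname type hlo);
   set drug_ref end = last;
   length label $ 12 hlo $ 1;
   fmtname = "strfmt";
   type    = "N";
   start   = medication_key;
   label   = strip(put(strength_mg, best12.));
   output;
   if last then do;
      hlo   = "O";     /* OTHER: keys not found in the reference file */
      start = .;
      label = ".";     /* reads back as a missing numeric value */
      output;
   end;
run;

proc format cntlin = strfmt;
run;

data patient_meds;
   input patient_key medication_key;
   datalines;
1 101
2 102
3 999
;
run;

/* PUT returns text, so INPUT converts the label back to a number */
data meds_strength;
   set patient_meds;
   strength_mg = input(put(medication_key, strfmt.), 12.);
run;
```

Without the OTHER row, an unmatched key would be returned as the key value itself, which could be mistaken for a drug strength.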
OUTPATIENT ONCOLOGY EMR
The drug prescription data in the outpatient oncology EMR are complicated, both in terms of the structure of the
EMR data and in the actual drug treatment patterns. These EMR data describe complex chemotherapy regimens
used to treat cancer patients, where each chemotherapy regimen might consist of multiple chemotherapy agents
administered with different timings, supportive care for the adverse side effects of chemotherapy, and palliative
care for the symptoms of cancer. These complex chemotherapy dosing schedules can be disrupted for a variety
of reasons, such as patient health, adverse side effects of treatment and changes in treatment strategy.
The prescription drug data in the outpatient oncology EMR have undergone little to no standardization prior to
receipt of the data. This lack of standardization is especially notable for the text variables supplying the names of
the chemotherapy regimens, and the names of the individual chemotherapy agents. The regimen and
chemotherapy agent names are found in the EMR as trade names, brand names, and abbreviations, all of which
can have “alternate” spellings and contain extraneous text.
The outpatient oncology EMR only contains information about drugs prescribed by the outpatient oncology clinic
for the treatment and care of cancer patients. The EMR does not contain prescription drug information for other
types of drugs used to treat other common non-cancer conditions, such as diabetes or hyperlipidemia.
The important drug data files and selected variables found in these files from the outpatient oncology EMR are
described below.
Outpatient Oncology EMR: drug data files and selected variables

Treatment Regimen File (one observation for each prescribed regimen per patient, although this file is not
used by all facilities entering data into the EMR)
   • Patient Key – Unique patient identifier
   • Regimen Key – Unique treatment regimen identifier
   • Name of the treatment regimen, unstandardized
   • Number of cycles within each treatment plan
   • Length of each treatment cycle
   • Regimen start date

Patient Prescription File (one observation for each individual agent prescribed for the patient)
   • Patient Key – Unique patient identifier
   • Regimen Key – Unique treatment regimen identifier, if prescription is part of a defined regimen
   • Prescription Key – Unique prescription identifier
   • Name of prescribed agent
   • Date of prescription
   • Over 20 separate variables used to describe the drug’s prescribed dose, strength, route, frequency,
     duration, quantity dispensed, number of refills

Patient Drug Administration File (one observation for each prescribed drug administered medically at the
outpatient oncology clinic)
   • Patient Key – Unique patient identifier
   • Prescription Key – Unique prescription identifier
   • Name of administered agent (unstandardized text)
   • Date of administration
   • Variables for amount of administered dose, route of administration, and number of doses administered
EMR Programming Tip – Multiple inconsistently used variables for drug dose, strength, and frequency
The best way to interpret variables that quantify drug dose, especially when the variables are not used
consistently, is to carefully look at frequencies and cross-frequencies of all drug dosing variables prior to
analysis. The variables’ actual values cannot automatically be assumed to be consistent with the variables’
definitions. For example, sometimes three separate variables will be used to indicate 2 pills (dose) of 50 mg
each (strength) taken 4 times a day (frequency), giving a total daily dose of 2 × 50 × 4 = 400 mg. Other times,
the value 400 will be entered directly in the drug dose field, and the values for strength and frequency will be missing.
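A minimal sketch of handling both entry patterns from the example above follows. The variable names DOSE, STRENGTH, and FREQUENCY are hypothetical, and treating a lone dose value as the daily total is an assumption that would need to be confirmed with cross-frequencies of the actual data.

```sas
/* Two hypothetical records showing the two entry patterns */
data rx;
   input rx_key dose strength frequency;
   datalines;
1 2 50 4
2 400 . .
;
run;

data rx_daily;
   set rx;
   /* Pattern 1: all three components are present */
   if nmiss(dose, strength, frequency) = 0 then
      daily_dose_mg = dose * strength * frequency;
   /* Pattern 2 (assumption): dose field already holds the daily total */
   else if not missing(dose) and missing(strength) and missing(frequency) then
      daily_dose_mg = dose;
   /* Anything else is left missing for manual review */
   else daily_dose_mg = .;
run;
```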
EMR Programming Tip – Standardizing Drug Names
We maintain an Excel spreadsheet for almost 250 anti-neoplastic agents relevant to EMR oncology analyses.
This spreadsheet gives a standardized name and classification for each anti-neoplastic agent, along with a
variety of brand names, generic names, and alternate spellings that might be found as text data in the EMR.
This spreadsheet is updated as needed, and converted into a SAS data set. One SAS program uses this SAS
data set of standardized agent names as input to dynamically generate another SAS “text standardization”
program. None of the actual agent names are ever “hard-coded” in this dynamically generated standardization
program. The text standardization program also incorporates base SAS string functions, “fuzzy” text comparison
functions, and PERL regular expression (PRX) functions as part of the text standardization process.
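The kinds of functions mentioned above can be illustrated with a small sketch. Unlike the dynamically generated program described in this tip, the agent names here are hard-coded purely for illustration, and the data set names, variable names, and distance threshold of 200 are hypothetical choices.

```sas
/* Hypothetical unstandardized agent text as it might appear in an EMR */
data raw_names;
   length agent_text $ 40;
   infile datalines truncover;
   input agent_text $40.;
   datalines;
Taxol 175 mg/m2 IV
PACLITAXEL
paclitaxil
carboplatin (AUC 5)
;
run;

data std_names;
   set raw_names;
   length agent_std $ 20 first_word $ 40;
   /* First word of the text, upper-cased, for the fuzzy comparison */
   first_word = upcase(scan(agent_text, 1, ' ('));
   /* PRX match on known spellings; the i modifier ignores case */
   if prxmatch('/\b(taxol|paclitaxel)\b/i', agent_text) then
      agent_std = 'PACLITAXEL';
   else if prxmatch('/\bcarboplatin\b/i', agent_text) then
      agent_std = 'CARBOPLATIN';
   /* Fuzzy fallback: small generalized edit distance to a standard name */
   else if compged(strip(first_word), 'PACLITAXEL') <= 200 then
      agent_std = 'PACLITAXEL';
run;
```

Records that remain without a standardized name after all three steps would be candidates for manual review and for new entries in the spreadsheet.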
PRESCRIPTION TERMINOLOGY
Doctors write prescriptions using abbreviations called “sig” codes to describe how the prescribed drug should be
administered to the patient. These sig codes often contain abbreviations of Latin words that indicate when and
how these drugs should be administered.
Sig codes can appear in an EMR as text data, which must be parsed into relevant numeric variables for
quantitative dose calculations. A partial list of sig codes that might be found in EMR data, with their
definitions, is given in the table below:
TYPE OF RX SIG CODE            CODE    DEFINITION
Amount to take                 T1      Take one
                               T2      Take two
Route of drug administration   PO      By mouth (orally)
                               SL      Sublingual (under tongue)
                               s.c.    Subcutaneously (inject under skin)
                               IV      Intravenously (inject or infuse into veins)
                               top     Topically
                               PR      Rectally
When to take drug              qd      Every day
                               bid     Twice a day
                               tid     Three times a day
                               qid     Four times a day
                               qhs     Once a day before bedtime
                               q4h     Every 4 hours
                               q8h     Every 8 hours
                               qw      Every week
                               tiw     Three times a week
                               qam     Every morning
                               prn     As needed
The following gives an example of SAS code used to translate an EMR text variable that contains either a sig
code, or the text definition of the code, into a numeric variable equal to the number of doses administered per
day. The variables in the code include:
• ADMN_DOSE_FRQ_DESC – A text variable that contains either a sig code or the text of the definition of
  a sig code
• ADMN_FRQ_X – A numeric variable that quantifies “every” when ADMN_DOSE_FRQ_DESC = “q”
• ADMN_FRQ_UNIT – A coded numeric variable that gives the units associated with ADMN_FRQ_X,
  where 4 = “Days”, 3 = “Hours”
• ADMN_DOSE_FRQ_NUM – A derived numeric variable that gives the number of doses taken per day
data DRUG_daily_freq;
   set DRUG_orders;
   label admn_dose_frq_num = "Numeric daily dose frequency";

   /*-----------------------------------------------------------------------*/
   /* Translate character Dose Frequency (ADMN_DOSE_FRQ_DESC) into a number */
   /* Need to also use variables ADMN_FRQ_X and ADMN_FRQ_UNIT when          */
   /* Dose Frequency = "q" (medical term for "every")                       */
   /* Need to run cross frequencies of variables ADMN_DOSE_FRQ_DESC,        */
   /*   ADMN_FRQ_X, and ADMN_FRQ_UNIT prior to writing program to make sure */
   /*   all values of unstandardized text strings found in data will be     */
   /*   included in the program.                                            */
   /*-----------------------------------------------------------------------*/
   if      admn_dose_frq_desc = "at bedtime"       then admn_dose_frq_num = 1;
   else if admn_dose_frq_desc = "b.i.d."           then admn_dose_frq_num = 2;
   else if admn_dose_frq_desc = "four times a day" then admn_dose_frq_num = 4;
   else if admn_dose_frq_desc = "t.i.d."           then admn_dose_frq_num = 3;
   else if admn_dose_frq_desc = "daily"            then admn_dose_frq_num = 1;
   else if admn_dose_frq_desc = "ac am"            then admn_dose_frq_num = 1;
   else if admn_dose_frq_desc = "pc (bid)"         then admn_dose_frq_num = 2;
   else if admn_dose_frq_desc = "6x/d"             then admn_dose_frq_num = 6;
   else if admn_dose_frq_desc = "q" then do;
      /*----------------------*/
      /* Units = "every day"  */
      /*----------------------*/
      if      admn_frq_x = 1  and admn_frq_unit = 4 then admn_dose_frq_num = 1;
      else if admn_frq_x = 3  and admn_frq_unit = 4 then admn_dose_frq_num = 1/3;
      /*-----------------------*/
      /* Units = "every hour"  */
      /*-----------------------*/
      else if admn_frq_x = 12 and admn_frq_unit = 3 then admn_dose_frq_num = 2;
   end;
run;
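When the list of text values grows long, one possible alternative to a chain of IF/ELSE statements is a user-defined informat that acts as a lookup table. The sketch below is an assumption about how that could be set up, not the approach used in the program above; the informat name SIGFRQ and the data set SIG_DEMO are hypothetical, and the values follow the sig code definitions in the table.

```sas
proc format;
   /* UPCASE and JUST make the lookup insensitive to case and leading blanks */
   invalue sigfrq (upcase just)
      'QD', 'DAILY', 'QAM', 'QHS', 'AT BEDTIME' = 1
      'BID', 'B.I.D.'                           = 2
      'TID', 'T.I.D.', 'Q8H'                    = 3
      'QID', 'FOUR TIMES A DAY'                 = 4
      'Q4H'                                     = 6
      other                                     = .;
run;

/* Hypothetical sig text values */
data sig_demo;
   length sig_text $ 20;
   infile datalines truncover;
   input sig_text $20.;
   /* Values such as prn have no fixed daily frequency and stay missing */
   doses_per_day = input(sig_text, sigfrq.);
   datalines;
bid
T.I.D.
q4h
prn
;
run;
```

A frequency listing of the records where the result is missing then shows which text values still need to be added to the informat.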
TYPES OF ANALYSES WITH EMR DRUG PRESCRIPTION DATA
Types of analyses that can be conducted with EMR drug prescription data include determining drug treatment
patterns, identifying drug dose modifications, and quantifying total drug dose or average daily drug dose. EMR
drug prescription data can also be used to quantify each patient’s time to an event associated with drug
treatment, such as the time to a drug discontinuation, drug switch, or modification of drug dose.
DRUG TREATMENT PATTERNS
EMR drug prescription data can be analyzed to determine if a patient continued drug therapy over a period of
time, discontinued drug therapy, switched to use of a different drug, or added an additional drug to a treatment
regimen. Each analysis of EMR drug prescription data will need to define the specific details of the rules used to
characterize drug treatment patterns.
Continuation of drug therapy can be determined by looking at the number of days between the end of the
previous prescription and the start of the next prescription for the same drug. If an allowable number of “gap
days” between prescriptions is defined, then patients can be considered as having continued drug use, even
though the actual prescription start and stop dates might not be continuous. When the number of days between
drug prescriptions exceeds the allowable number of gap days, or if a patient has been followed up for more than
the specified number of gap days without a new prescription, then that patient can be assumed to have
discontinued use of the drug.
EMR Programming Tip – Determining drug treatment patterns with gaps between prescriptions
Set up the allowable number of “gap days” between prescriptions as a macro variable, instead of hard-coding
the number of allowable gap days. Sensitivity analyses based on changing the number of allowable gap days
can be done more easily if that number is stored in a macro variable.
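A minimal sketch of this tip, for a single drug, is shown below. The macro variable name GAP_DAYS, its value of 30, and the data set and variable names are all hypothetical.

```sas
%let gap_days = 30;   /* hypothetical allowance; change for sensitivity analyses */

/* Hypothetical prescriptions for one drug */
data rx;
   input patient_key rx_start :date9. rx_stop :date9.;
   format rx_start rx_stop date9.;
   datalines;
1 01JAN2008 30JAN2008
1 10FEB2008 10MAR2008
1 01JUN2008 30JUN2008
;
run;

proc sort data=rx;
   by patient_key rx_start;
run;

data rx_gaps;
   set rx;
   by patient_key;
   /* LAG is called on every record, then reset at each new patient */
   prev_stop = lag(rx_stop);
   if first.patient_key then prev_stop = .;
   /* Days between the end of the prior prescription and this start */
   if not missing(prev_stop) then gap = rx_start - prev_stop;
   /* 1 = the gap before this prescription exceeds the allowance */
   gap_exceeded = (gap > &gap_days);
   format prev_stop date9.;
run;
```

Rerunning the program with a different %LET value changes the definition of discontinuation everywhere the macro variable is referenced.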
Switching a drug can be defined as discontinuing treatment with one drug and starting treatment with a different
drug. Adding another drug can be defined as starting treatment with another drug while continuing treatment with
the original drug.
RELATIVE CHANGE IN DRUG DOSE
Patients might need to change the prescribed dose of a drug in response to an adverse side effect, or in response
to a treatment effect. Changes in drug dose can be identified by looking at the individual components of a drug
prescription – amount taken, frequency of administration, and strength of a single dose – to determine if
there was a change in prescription drug dose.
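One way to flag such changes is to compare each prescription's daily dose with that of the patient's previous prescription, as in the sketch below. The variable names and values are hypothetical, and the data are assumed to hold a single drug per patient.

```sas
/* Hypothetical prescriptions for one drug */
data rx;
   input patient_key rx_date :date9. amount strength_mg freq_per_day;
   format rx_date date9.;
   datalines;
1 01JAN2008 1 50 2
1 01FEB2008 1 50 2
1 01MAR2008 2 50 2
;
run;

proc sort data=rx;
   by patient_key rx_date;
run;

data rx_change;
   set rx;
   by patient_key;
   /* Daily dose from the three prescription components */
   daily_dose = amount * strength_mg * freq_per_day;
   prev_dose  = lag(daily_dose);
   if first.patient_key then prev_dose = .;
   /* Relative change from the previous prescription, in percent */
   if not missing(prev_dose) and prev_dose > 0 then
      pct_change = 100 * (daily_dose - prev_dose) / prev_dose;
run;
```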
QUANTIFYING DRUG DOSE
Total patient drug dose can be calculated several ways with EMR data, depending on the specific available
prescription drug data. Daily patient drug dose can also be derived with simple computations, either by knowing
the patient’s daily treatment regimen, or by knowing the total amount of drug included in the prescription and the
prescription’s days’ supply.
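The second computation, based on the total amount prescribed and the days' supply, can be sketched as follows. The variable names QUANTITY, STRENGTH_MG, and DAYS_SUPPLY are hypothetical.

```sas
/* Hypothetical prescriptions: number of pills, mg per pill, days' supply */
data rx;
   input rx_key quantity strength_mg days_supply;
   datalines;
1 60 50 30
2 90 10 .
;
run;

data rx_avg;
   set rx;
   /* Total amount of drug included in the prescription */
   total_dose_mg = quantity * strength_mg;
   /* Average daily dose is left missing when days' supply is unknown */
   if days_supply > 0 then
      avg_daily_dose_mg = total_dose_mg / days_supply;
run;
```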
The calculation of total patient dose using real-world EMR data can require more than simple computations
because a patient might have several prescriptions for a single drug. The available prescription drug data needed
to derive a patient’s total drug dose might be different on the different prescriptions, and any single prescription
might not have sufficient data to calculate total dose for that prescription.
EMR Programming Tip – Calculating total drug dose from multiple prescriptions, with missing data
As the number of drug prescriptions for a patient in an EMR increases, the odds are that at least one of these
prescriptions will be missing some component of the relevant data needed to calculate total prescription dose.
Two options for analyzing prescription drug data with missing dose variables include excluding patients with any
missing drug dose data, or imputing the missing data. Imputing missing dose data can be based on data found
in the patient’s other prescriptions. If patients with any missing prescription data are dropped from an analysis of
total dose, then the study will be biased towards patients with fewer prescriptions.
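One simple form of imputation from a patient's other prescriptions is to carry forward the dose from the most recent earlier prescription, as sketched below. This is only one of several possible rules, and the variable names are hypothetical. The flag variable keeps imputed values identifiable for sensitivity analyses.

```sas
/* Hypothetical prescriptions with some missing daily doses */
data rx;
   input patient_key rx_date :date9. daily_dose_mg;
   format rx_date date9.;
   datalines;
1 01JAN2008 100
1 01FEB2008 .
1 01MAR2008 200
2 01JAN2008 .
;
run;

proc sort data=rx;
   by patient_key rx_date;
run;

data rx_imputed;
   set rx;
   by patient_key;
   retain last_dose;
   if first.patient_key then last_dose = .;
   imputed_flag = 0;
   /* Fill a missing dose with the patient's most recent known dose */
   if missing(daily_dose_mg) and not missing(last_dose) then do;
      daily_dose_mg = last_dose;
      imputed_flag  = 1;
   end;
   if not missing(daily_dose_mg) then last_dose = daily_dose_mg;
   drop last_dose;
run;
```

Patients with no earlier prescription to borrow from, such as the second patient here, still have a missing dose and need a separate rule.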
CONCLUSION:
As the use of EMR systems in medical practices continues to increase, the number of patients, different medical
specialties represented, and total length of follow-up available on each patient will rise. As the medical and
scientific communities realize the value of EMR data for health-related research, EMR systems will hopefully be
designed to allow for data that are more easily accessible for analysis, especially for analysis with SAS.
REFERENCES:
Burt CW, Hing E, Woodwell D. “Electronic Medical Record Use by Office-Based Physicians: United States, 2005.”
Centers for Disease Control and Prevention, National Center for Health Statistics, Health E-Stat, July 21, 2006. www.cdc.gov/hchs
Eason, Jenine. “Proc Format, a Speedy Alternative to Sort/Merge.” Proceedings of the 30th Annual SAS Users
Group International (SUGI) Conference, 2005.
Fraeman, Kathy Hardis. “Dynamic Generation of SAS Data Step Programming From An Excel Data Dictionary.”
Proceedings of the 12th Annual Northeast SAS Users Group (NESUG) Regional Conference, 1999.
Nordstrom BL, Fraeman KH, Luo W, Collins JW, O’Malley CD, Nordyke RJ. “Red Blood Cell Transfusions Among
Cancer Patients on Chemotherapy: A Descriptive Epidemiologic Study.” Journal of Clinical Oncology, 2008 ASCO
Annual Meeting Proceedings Part I. Vol 26, (May 20 Supplement) 2008: 20623.
Nordstrom BL, Langer C, Hussain A, Barghout V, Modi C, Lacerna L, Gralow, JR. “Renal Function Among Cancer
Patients with Bone Metastases Treated with Zolendronic Acid in a Real World Setting.” Journal of Clinical
Oncology, 2007 ASCO Annual Meeting Proceedings Part I. Vol 25, No 18S (June 20 Supplement) 2007: 19540.
Luo W, Nordstrom B, Ranganathan G, Linz H, Stokes M, Ross SD, Knopf K. “Adherence to Guidelines for use of
Erythropoiesis Stimulating Agents in Patients with Chemotherapy-Induced Anemia: Trends from Electronic
Medical Records.” International Society for Pharmacoeconomics and Outcomes Research (ISPOR) 12th Annual
International Meeting, May 22 2007, Arlington, VA.
ACKNOWLEDGMENTS
The author would like to acknowledge Beth L. Nordstrom, PhD, MPH, Senior Epidemiologist at the United
BioSource Corporation and principal investigator for all EMR research studies in which the author participated.
The author would also like to thank Dr. Nordstrom for her insightful comments while editing this manuscript.
SAS is a Registered Trademark of the SAS Institute, Inc. of Cary, North Carolina.
CONTACT INFORMATION
Please contact the author with any comments or questions:
Kathy H. Fraeman
United BioSource Corporation
7101 Wisconsin Avenue, Suite 600
Bethesda, MD 20832
(240) 235-2525 voice
(301) 654-9864 fax
[email protected]