2. CONCEPTUALIZING AND MEASURING QUALITY OF CARE

How can we tell whether health care performance is poor, average or superior? To answer this question, standards of performance must be defined and measured. This chapter provides a brief review of how quality is conceptualized by health services researchers and discusses potential data sources for quality measurement. Measuring the quality of care with claims data is emphasized.

WHAT IS QUALITY AND HOW IS IT MEASURED?

Quality of care is a multidimensional concept that includes technical care, the application of medical science and technology to a problem, and interpersonal care, the personal interaction between patient and provider. Quality is often assessed with measures of structure, process, and outcome (Donabedian 1980). Measures of structure are concerned with descriptive characteristics of the health care market, provider organizations, professional personnel, and the individuals needing health care services. Processes reflect what providers do for patients. Outcomes pertain to the effects of care on the patient’s physical, social, and mental functioning.

The Institute of Medicine (IOM) defines quality as “the degree to which health care services for individuals and populations increase the likelihood of desired health outcomes and are consistent with current professional knowledge” (Lohr 1990). Health outcomes include the presence or absence of illness, impairments, or handicaps; a patient’s physical functioning, emotional health, cognitive functioning, pain and other symptoms; days lost from work, school or usual activities; and a patient’s satisfaction, knowledge, or compliance. Although improved health outcomes are the ultimate goal of delivering health care, measures of structure and process are of interest because good structure increases the likelihood of good process, and good process increases the likelihood of a good outcome.
Structural measures evaluate the human, physical and financial resources that are needed to provide medical care. For health care providers, structural variables include demographics and professional characteristics such as specialty and board certification. For institutions, structural variables include the number, size and geographic distribution of health care providers and hospitals as well as their access to health care equipment and technologies. The way in which health care is financed and how providers are reimbursed are also structural components of health care. Accreditation of health care organizations by groups such as the Joint Commission on Accreditation of Healthcare Organizations (JCAHO) and the National Committee for Quality Assurance (NCQA) has traditionally relied heavily upon structural measures. The Leapfrog Group, a consortium of health care purchasers, uses structural measures to encourage quality improvement (Meyer and Massagli 2001). Their measures include the presence of computerized physician order entry systems for medications in hospitals, the rate at which patients are referred to high volume providers for specific surgical procedures, and the immediate availability of board-certified critical care specialists in intensive care units. Each of these measures has been associated with improved processes and outcomes (Begg, Cramer et al. 1998; Cebul, Snow et al. 1998; Wennberg, Lucas et al. 1998; Pronovost, Jenckes et al. 1999; Teich, Merchia et al. 2000).

Process measures evaluate the preventive, diagnostic and therapeutic interventions received by a patient. A process measure is a valid indicator of quality when the indicated care has been shown to have a direct link to improved outcomes. The rate at which patients receive aspirin promptly upon presentation with a heart attack, for example, is a valid quality measure because early initiation of aspirin improves survival (ISIS-2 1988).
Another valid measure of quality is the rate at which diabetics have regular testing of their blood sugar levels because control of HbA1c levels reduces the risk for many of the complications associated with diabetes (DCCT 1993; Eastman, Javitt et al. 1997).

The type of measure selected (structure, process, or outcome) depends on the purpose of assessment. Structural characteristics, for example, can be used to infer that the context in which care is delivered is conducive to good care, but are generally too blunt to determine whether care is good or bad. For example, there can be wide variation in the quality of care delivered at the same hospital depending on diagnosis, procedure, or attending physician (Rosenthal, Shah et al. 1998). Structural measures are unable to reflect these differences. The principal value of process measures is that the link between health care and outcomes highlights what can be changed in the delivery of care to improve health outcomes. Nevertheless, our ability to use process measures is limited by the strengths and weaknesses of clinical science. Outcome measures are appealing because they appear to be the most direct assessment of quality, but good quality may not prevent a bad outcome. Other factors outside the control of the health care system, such as patient behavior and environment, also affect outcomes. For a health outcome to be a valid quality measure, it must be possible to differentiate the influence of the health care system from the effects of other factors. Outcome measurement is also problematic because the time between the delivery of health care services and the outcome of interest can be quite long. As a rule, quality measurement activities with components of structure, process, and outcome allow the strengths of each approach to compensate for the weaknesses of the others.

WHAT IS A QUALITY INDICATOR?

There is no standardized vocabulary to describe the measurement of quality in health care.
A variety of terms such as performance reports, report cards, provider profiles, recommended care, criteria, standards, measures, and indicators are used in the discussions of quality measurement, often with subtle, but not standard, distinctions between them. I will use a single term, indicator, to refer to explicit criteria by which the quality of care can be evaluated. Examples of indicators are:

• All patients aged 65 and over should have been offered influenza vaccine annually or have documentation that they received it elsewhere.
• Recurrent moderate or severe tension headaches should be treated with a trial of tricyclic antidepressant agents, if there are no medical contraindications to use.
• A urine culture should be obtained for patients who have dysuria and have had "several" (three or more) infections in the past year.

When an indicator is used to measure a process for many people, it can be summarized as a performance rate or score. The performance rate is the proportion of people who received the indicated care among those who should have received it. I refer to the method by which one would calculate a performance rate as constructing the indicator. The patients who comprise the numerator and denominator of a performance rate must be identified to construct an indicator. The denominator is the number of patients who are eligible for the indicator, that is, people who should receive the indicated care. The numerator is the number of eligible patients who received the indicated care, or passed the indicator. The pieces of information required to determine who satisfies the eligibility and scoring criteria are referred to as data elements.

Consider the following indicator: Men under age 75 with preexisting heart disease who are not on pharmacological therapy for hyperlipidemia should have total cholesterol, HDL, and LDL levels documented at least every five years.
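Constructing an indicator, as defined above, amounts to filtering a population by the eligibility criteria and then counting how many eligible patients satisfy the scoring criteria. The following is a minimal sketch of that calculation for the cholesterol indicator just stated. The `Patient` class and all of its field names are hypothetical stand-ins for the data elements; they are not drawn from any actual claims or medical record layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Patient:
    # Hypothetical data elements needed to construct the indicator.
    sex: str                                   # "M" or "F"
    age: int                                   # age in years
    has_heart_disease: bool                    # preexisting heart disease
    on_lipid_therapy: bool                     # taking a lipid lowering medication
    years_since_lipid_panel: Optional[float]   # None if no panel is documented


def is_eligible(p: Patient) -> bool:
    """Eligibility criteria: determines membership in the denominator."""
    return (p.sex == "M" and p.age < 75
            and p.has_heart_disease and not p.on_lipid_therapy)


def passes(p: Patient) -> bool:
    """Scoring criteria: lipid levels documented within the last five years."""
    return (p.years_since_lipid_panel is not None
            and p.years_since_lipid_panel <= 5)


def performance_rate(patients: List[Patient]) -> Optional[float]:
    """Proportion of eligible patients who passed the indicator.

    Returns None when nobody is eligible, since the rate is undefined.
    """
    eligible = [p for p in patients if is_eligible(p)]
    if not eligible:
        return None
    passed = sum(1 for p in eligible if passes(p))
    return passed / len(eligible)
```

Keeping eligibility and scoring as separate functions mirrors the distinction between the denominator and the numerator: a patient who is not eligible never enters the rate, regardless of the care received.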
For this indicator, eligible patients are men who are less than 75 years old, have preexisting heart disease, and are not taking a lipid lowering medication. Among those who satisfy the eligibility criteria, the indicator is passed if the patient had his total cholesterol, HDL, and LDL levels documented at least once in the last five years. If there were 500 men who met the eligibility criteria, for example, and 300 of them had their total cholesterol, HDL, and LDL levels documented in the past five years, then the performance rate, or score, for this indicator would be 60%.

DATA SOURCES FOR QUALITY MEASUREMENT

Data are required to measure the quality of health care. Data can come from directly observing the practice of care providers, by asking providers and patients about their actions and experiences, or by studying the documentation and other records that are produced as health care is delivered. The best choice of a data source depends on the content, accuracy, ease of use, and cost of the available sources, and on the purpose of the quality measurement activity. If cost were not an issue, a comprehensive quality measurement system would draw on multiple data sources so that the strengths of each source could be brought to bear. Logistical difficulties and limited resources generally make such a comprehensive approach infeasible.

In this thesis, two data sources for the widespread measurement of technical quality are compared: medical records and claims data. Although patient surveys are a common data source for quality measurement, they are most often used to study special issues or populations not routinely captured in medical records or claims data (McGlynn, Damberg et al. 1998).

Medical records.
Health care professionals and institutions generally maintain medical records for individuals in a paper format.3 Medical records are rich in clinical information such as patient medical history, primary complaints, presenting symptoms, results of physical examinations, clinical assessments and diagnoses, test and lab results, performed procedures, prescribed treatments, and patient response to treatments. In the absence of direct observation, medical records are frequently considered the gold standard data source to measure technical quality (Fowles, Fowler et al. 1997; Steinwachs, Stuart et al. 1998). Unfortunately, medical records are not amenable to large-scale quality assessment because they are an expensive source of data. Medical record review is expensive because it requires trained personnel to abstract data from medical records in a standard format for analysis. Further, each patient can have multiple medical records as a result of receiving care from different doctors and hospitals; to understand fully the care patients have received from the different providers across multiple medical conditions, all of their medical records must be obtained and abstracted.

___________
3 The concept of an electronic medical record (EMR) has been around for more than thirty years, but adoption has been slow, and paper records continue to be the dominant format. Among the barriers to the implementation of EMR systems are software problems of encoding the complex clinical information found in the written medical records, security issues, a dearth of integrated delivery systems, reluctant providers, and high costs (Retchin and Wenzel 1999).
The burden associated with data collection from medical records is even higher because it is increasingly difficult to obtain medical records for quality studies due to concerns over protecting the privacy of patients’ health information and the enactment of the Health Insurance Portability and Accountability Act (HIPAA).4

Claims data. In contrast to medical records, claims data are generally contained in electronic files. Claims data are generated for billing purposes as a result of a patient’s encounter with the health care system, including outpatient care, hospital care, and filled prescriptions. Enrollment data are also maintained electronically by health plans to identify the people who are eligible for coverage. Together, claims and enrollment files may contain information on demographics, diagnoses, delivered services, and prescriptions (McGlynn, Damberg et al. 1998).

___________
4 Congress enacted the Health Insurance Portability and Accountability Act (HIPAA) in 1996 and the Standards for Privacy of Individually Identifiable Health Information (the Privacy Rule) was finalized on August 14, 2002. The Privacy Rule protects any health information that can be used to identify an individual. Protected information includes an individual’s medical records and other personal health information, and it applies to information in any form of communication, electronic, oral, or written (Friedrich 2001; Gostin 2001). The rule states that health information may be disclosed for research without the person’s permission, provided that the study obtains a waiver from an IRB or privacy board (Code of Federal Regulations, 2002). However, the financial penalties and incarceration that can follow a failure to comply with HIPAA may lead institutions and IRBs to be too cautious and to act defensively. Further, providers, such as community physicians and hospitals, may decide not to give researchers access to medical records (Kulynych and Korn 2002).
Although claims data have less clinical information than medical records, they are widely available and relatively inexpensive to analyze (Lohr 1990; Dresser, Feingold et al. 1997; Steinwachs, Stuart et al. 1998). In contrast to medical records, claims can be easily deidentified, which minimizes concerns about privacy and HIPAA compliance. Since claims data are routinely collected and computerized, quality of care indicators can be calculated repeatedly to identify trends and progress in quality (Asch, Sloss et al. 2000). The large numbers of cases generally contained in claims files also permit multiple comparisons, the testing of hypotheses about population subgroups, and comparisons across statistical models (Lohr 1990).

Since claims data are easy to use and less costly than other data sources, they have the potential to contribute to the knowledge base about the quality of care. However, any data source used in quality measurement should be evaluated with regard to two criteria: availability and accuracy. Evaluating the availability of a data source means understanding who and what activities are included in the data source and exactly what type of information the data contain. Accuracy addresses whether the data source can generate reliable answers to the quality question at hand. The remainder of this chapter reviews what is known about the availability and accuracy of claims data and provides examples of how claims data have been used to measure quality.

AVAILABILITY OF CLAIMS DATA

Whose information is included in claims data?

Claims data are a by-product of reimbursing for health care services. Therefore, claims data generally include people who have health insurance, receive health care, and file a claim.5 Further, individual payers, including private health plans, as well as the State and Federal governments, possess claims data for only those people for whom they are responsible for payment. To compare quality across payers, claims data need to be pooled. Although there is no coordinated strategy to capture all claims data in the US, some organizations have combined data from multiple plans for comparative purposes. For example, private information management companies such as MEDSTAT, Health Benchmarks, and Ingenix have pooled the claims data they receive from their different clients. Similarly, claims data are used to construct many of the measures in NCQA’s Quality Compass®, a database with quality of care information from hundreds of HMOs.

___________
5 Other data sources, such as medical records and direct observation, also depend on patients having an encounter with the health care system. When people who do not use health care services are omitted from quality measurement, the extent of problems of access and under-use of recommended care is likely to be underestimated. Population based surveys about individuals’ health are an alternate data source that does not depend on patient encounters with the health care system.

What information is included in claims data?

Although the precise contents of claims files vary by health plan or insurer, most claims forms capture patient characteristics, provider identifiers, treatment and diagnostic information, and payment information. Health plans tend to use claims forms similar to those used by the Centers for Medicare and Medicaid Services (CMS) to process claims for government beneficiaries (Weiner, Parente et al. 1995; McGlynn, Damberg et al. 1998). In particular, the Uniform Bill (UB-92) is the CMS form used to bill for inpatient hospital services and the CMS-1500 is used to bill for outpatient services. Tables 2.1 and 2.2 list the standard data elements for hospital and outpatient claims forms respectively.
Table 2.1
List of Standard Data Elements: Hospital Claims Forms

Patient characteristics
    Patient identifiers
    Name (last, first, middle initial)
    Address (street, city, state, zip code)
    Date of birth
    Gender
    Marital status
Provider identifiers
    Hospital/facility identifier
    Physician identifier
Diagnostic and treatment information
    Admission date
    Discharge date
    Type of admission
    Admitting diagnosis code
    Codes and description of service
    Service date
    Service units
    Principal and other diagnoses (up to 9)
    Principal and other procedures (up to 6)
    Date of procedure(s)
    Discharge status
Insurance/payment information
    Payer identifier (e.g., Medicare or Medicaid)
    Group name
    Insured’s name
    Insured’s identification number
    Insured’s group number
    Insured’s address and telephone number
    Relationship to insured
    Employer name
    Employer location
    Treatment authorization codes
    Total charges
    Amount paid

SOURCE: (CMS 1998)

Table 2.2
List of Standard Data Elements: Outpatient Claims Forms

Patient characteristics
    Patient identifier
    Name (last, first, middle initial)
    Address (street, city, state, zip code)
    Telephone number
    Date of birth
    Marital status
    Employment/student status
    Other health insurance coverage
Provider identifiers
    Physician identifier
    Physician’s employer identification number
Diagnostic and treatment information
    Date of current illness, injury or pregnancy
    Condition related to employment or accident
    Admission and discharge dates related to service
    Principal and other diagnoses (up to 4)
    Place of service
    Date of service
    Code for procedures, services or supplies
    Name and identifier of referring/ordering physician
    Date of disability
    Date patient able to return to work
Insurance/payment information
    Payer identifier (e.g., Medicare or Medicaid)
    Group name
    Insured’s name
    Insured’s identification number
    Insured’s group number
    Insured’s address and telephone number
    Relationship to insured
    Employer name
    Employer location
    Treatment authorization codes
    Accept assignment of Medicare benefits
    Total charges
    Amount paid

SOURCE: (CMS 1998)

Since the Medicare program does not include pharmacy benefits, there is not a US government prototype for the contents of a prescription claim form. Therefore, I reviewed the prescription claims forms used by three private payers6 to gauge the typical contents of prescription claims forms (see Table 2.3). I did not find any significant differences in the contents among the reviewed claims forms.

Table 2.3
List of Standard Data Elements: Prescription Claims Forms

Patient characteristics
    Patient identifier
    Name (last, first, middle initial)
    Address (street, city, state, zip code)
    Date of birth
    Gender
Provider identifiers
    Physician identifier
    Pharmacy address
    Pharmacy identifier
Treatment information
    Prescription number
    Code for dispensed medication
    Date prescription filled
    New versus refill prescription
    Drug name and strength
    Quantity
    Days supply
Insurance/payment information
    Group name
    Insured’s name
    Insured’s identification number
    Insured’s group number
    Insured’s address and telephone number
    Relationship to insured
    Secondary insurance
    Total cost of prescription

___________
6 I reviewed the contents of the prescription claims forms for two health plans (Health Net and PCS HealthSystems®) and one pharmacy benefit management company (AdvanceRx). The forms were available for download via the Internet at the following web sites:
http://www.healthnet.com/members/forms/pdf/8670.pdf (accessed October 21, 2001)
http://statenc.advparadigm.com/pdf/APCS_CLAIM_FORM.pdf (accessed October 21, 2001)
http://www.pcshs.com/benefits/forms/standard.pdf (accessed October 21, 2001)

The clinical data contained in hospital, outpatient, and prescription claims are limited to codes for diagnoses, services, and medications. These codes are frequently from, or can be linked to, standardized coding systems. The standardized coding systems for diagnoses, services, and pharmaceuticals are described below.

Diagnostic codes.
Information in claims data about patients’ clinical conditions is most often in the form of diagnostic codes specified by the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM). Use of ICD-9-CM codes is widespread since providers are generally required to report their diagnostic assessments via ICD-9-CM codes to be reimbursed for a patient encounter, especially hospitalizations. ICD-9-CM contains more than 10,300 codes. The National Center for Health Statistics (NCHS) and the CMS are the U.S. governmental agencies responsible for overseeing changes and modifications to the ICD-9-CM (Iezzoni 1990; Centers for Disease Control and Prevention 2001).

The diagnostic codes of ICD-9-CM are organized within broad categories. Some of these categories represent various types of conditions (e.g., infectious and parasitic diseases, neoplasms), while others reflect anatomic locations (e.g., circulatory, digestive, respiratory systems) and one category is reserved for “symptoms, signs, and ill-defined conditions.” Three-, four-, and five-digit codes are listed, representing increasing levels of specificity. For example, the three-digit code 250 indicates diabetes mellitus, while the fourth digit specifies the manifestation (e.g., 250.5, diabetes with ophthalmic manifestations) and the fifth digit reflects the type (e.g., 250.52, diabetes with ophthalmic manifestations, adult-onset type). For some disease classifications, only four digits are specified.

Service codes. Information about performed procedures or delivered services may be in the form of ICD-9-CM procedure codes, codes specified by the Current Procedural Terminology, Fourth Edition (CPT-4), codes from the Healthcare Common Procedure Coding System (HCPCS), or Uniform Billing (UB-92) Revenue Codes. HIPAA final rules require the use of different coding systems for procedures depending on where the procedure was performed (Code of Federal Regulations 2000).
Inpatient hospital procedures must be reported using the ICD-9 Procedure Coding System (ICD-9-PCS), while CPT-4 or HCPCS codes are required for physician services and other health care services.

ICD-9-PCS. Volume III of ICD-9-CM includes procedure codes that are maintained by the CMS. There are almost 4,300 ICD-9 procedure codes; the codes contain up to four digits.

CPT-4. The CPT is a systematic listing and coding of health care procedures and services. Each procedure or service is identified with a five-digit code. The American Medical Association (AMA) developed CPT in 1966. The codes are organized into six sections: evaluation and management, anesthesiology, surgery, radiology (including nuclear medicine and diagnostic ultrasound), pathology and laboratory, and medicine (except anesthesiology). Within each section there are subsections with anatomic, procedural, condition, or descriptor subheadings. The CPT system is revised annually to reflect significant updates in medical technology and practice. The most recent version of CPT, CPT 2001, contains 7,928 codes and descriptors.

HCPCS. The HCPCS reports supplies, professional services, and procedures for payment. HCPCS is a three-level coding system, where level I is equivalent to CPT. Level II, or National, HCPCS codes are five-digit alpha-numeric codes used to identify those coding categories not included in CPT such as ambulance services and durable medical equipment, prosthetics, orthotics and supplies (DMEPOS). Level II codes are the result of the combined work of CMS, the Health Insurance Association of America (HIAA), the American Dental Association (ADA), and the Blue Cross/Blue Shield Association (BCBSA). The level III, or local, HCPCS codes are maintained and assigned by Medicaid State agencies, Medicare contractors, and private insurers for use in their specific programs or local areas of jurisdiction.
Level III codes are established for items or services not having the frequency of use, wide geographic use, or general applicability needed to establish a new level I or level II code.

Revenue codes. UB-92 revenue codes are frequently entered on claims for payment for the cost centers within a hospital that have separate charges. The codes help identify some of the services that are delivered to patients, but CPT, ICD-9, and HCPCS codes are much more specific. The revenue codes have 3 digits and are listed in two sections: accommodations and ancillary. The accommodations revenue codes specify the type of room the patient had (e.g., private or semiprivate room and the number of beds), the type of unit where the care was received (e.g., medical/surgical, OB, or psychiatric), and whether the provided care was more intensive than is typically rendered in the general medical or surgical units. The ancillary revenue codes include charges for items such as nursing services, durable medical equipment, laboratory or radiology services, and other therapeutic services. The National Uniform Billing Committee (NUBC) maintains the list of UB-92 Revenue Codes.

Pharmaceutical codes. Claims files that track dispensed medications generally use the National Drug Code (NDC) System. Each human drug is assigned a unique 10-digit, 3-segment NDC. The FDA assigns the first segment, the labeler code. A labeler is any firm that manufactures, repacks or distributes a drug product. The second segment, the product code, identifies a drug’s strength, dose, and formulation for a particular firm. The third segment, the package code, identifies package sizes. The labeler assigns both the product and package codes.

Summary: Availability of Claims Data for Quality Measurement

Administrative data are generated when a person is covered by private or public insurance and when claims are filed so that patients or providers can be reimbursed for health services.
Widespread availability of claims data and the use of standardized coding systems facilitate quality measurement. However, the clinical data found in claims files are usually limited to codes for diagnoses, services, procedures, and pharmaceuticals.

ACCURACY OF CLAIMS DATA

When selecting a data source for quality measurement, it is important to assess accuracy. Below, findings from previous studies about agreement between medical records and claims data are reviewed and sources of error in claims data are described.

Agreement between medical record and claims data

The accuracy of claims data has been evaluated in several studies (Romano and Mark 1994; Dresser, Feingold et al. 1997; Steinwachs, Stuart et al. 1998; Lawthers, McCarthy et al. 2000). Generally, agreement with medical records is used to gauge the accuracy of claims data. However, medical records review is also subject to measurement error. Errors in data abstracted from medical records have several sources including incorrect or incomplete documentation, illegibility of provider notes, missing laboratory or other reports, and varying levels of abstractor skills (Quam, Ellis et al. 1993; Luck, Peabody et al. 2000). Nevertheless, medical records are rich in clinical data and are frequently used as the standard against which to judge the accuracy of other data sources (Fowles, Fowler et al. 1997; Bergmann, Byers et al. 1998).

Studies assessing the accuracy of coding in claims data generally refer to the overall rate of agreement between claims data and medical records, the sensitivity of the claims data, and the specificity of the claims data. Overall agreement is the rate at which the claims and medical records data agree about whether a patient has a given medical condition or received a specific service. Sensitivity is the likelihood of identifying true positives. Therefore, the sensitivity of claims data is the rate at which they are able to identify patients who, according to medical records, have the condition or received the health care intervention of interest. Specificity is the likelihood of identifying true negatives, the rate at which claims data correctly indicate that the condition or procedure of interest did not exist or occur, assuming the medical records are correct.

For most conditions and procedures, claims data have better specificity than sensitivity (Fisher, Whaley et al. 1992; Jollis, Ancukiewicz et al. 1993; Romano and Mark 1994; Dresser, Feingold et al. 1997; Fowles, Fowler et al. 1998). In regard to quality measurement, good specificity suggests that among those patients identified by the medical records data as not meeting the eligibility and scoring criteria, it is likely that the claims data assessments will agree. However, weak sensitivity implies that there will be measurement error when claims data are used to construct an indicator because eligibility and scoring will be underestimated relative to medical record assessments.

Most studies that have evaluated agreement between claims data and medical records have used data from hospitalizations. The National Diagnosis Related Group (DRG) Validation Study used data from 1984-1985 and found that the overall agreement rate between diagnoses coded in the claims data and documented in the medical record was 78.2%, but the level of agreement ranged from 52.7 to 91.4% across conditions (Fisher, Whaley et al. 1992). A study of 1988 data in a hospital discharge database in California found that the sensitivity of coding for eight conditions ranged from 65 to 100%, while the specificity ranged from 98.8 to 100%. Hypertension was the most under-reported condition; sensitivity for the remaining conditions7 was 88% or more.
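The three statistics defined above (overall agreement, sensitivity, and specificity) can all be derived from the same cross-tabulation of claims data against medical records. The sketch below is illustrative only and treats the medical record as the reference standard, as the studies cited here do. The function name `agreement_stats` and the paired-flag input format are assumptions made for the example, not a method taken from those studies.

```python
from typing import Dict, Iterable, Optional, Tuple


def agreement_stats(
    pairs: Iterable[Tuple[bool, bool]]
) -> Dict[str, Optional[float]]:
    """Compare claims data with medical records, patient by patient.

    Each pair is (claims_flag, record_flag): whether the claims data and
    the medical record, respectively, indicate that the patient has the
    condition or received the service. The medical record is taken as
    the reference standard.
    """
    tp = fp = fn = tn = 0
    for claims_flag, record_flag in pairs:
        if record_flag and claims_flag:
            tp += 1      # both sources say yes
        elif record_flag and not claims_flag:
            fn += 1      # record says yes, claims missed it
        elif claims_flag:
            fp += 1      # claims say yes, record says no
        else:
            tn += 1      # both sources say no

    total = tp + fp + fn + tn
    return {
        # Rate at which the two sources give the same answer.
        "overall_agreement": (tp + tn) / total if total else None,
        # Among record-positive patients, share that claims identify.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        # Among record-negative patients, share that claims also call negative.
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }
```

Each statistic is returned as None when its denominator is zero, for example when no patient in the sample has the condition according to the medical record.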
In the same study the ranges for the sensitivity and specificity of coding for 16 procedures were 21 to 94% and 99.5 to 100% respectively (Romano and Mark 1994). Non-invasive procedures tended to be under-reported, while the sensitivity of coding exceeded 90% for bronchoscopy, hemodialysis, endoscopy, arteriography, mechanical ventilation, and chemotherapy.8 Some evidence suggests that the quality of claims data has improved over time, apparently because accurate discharge information is now a requirement for reimbursement (Fisher, Whaley et al. 1992; Jollis, Ancukiewicz et al. 1993).

___________
7 In addition to hypertension, the study by Romano and Mark (1994) analyzed the coding of cancer, chronic liver disease, chronic renal disease, chronic cardiovascular disease, chronic lung disease, cerebrovascular degeneration, and diabetes.
8 The remaining procedures evaluated in the study were: lumbar puncture, barium radiograph, computed tomography scan, electroencephalogram, cardiac stress test, electrocardiographic (ECG) monitoring, pulmonary capillary wedge pressure (PCWP) monitoring, ultrasound, radionuclide scan, and packed red blood cells transfusion.

A few studies have analyzed the ability of ambulatory claims data to identify patients with specific conditions and whether particular services were provided. When using a combination of encounter and pharmacy claims to identify health plan members with hypertension, there was a 96% agreement rate with medical records about who had hypertension (Quam, Ellis et al. 1993). Other studies have found high rates of agreement, ranging from 95 to 98%, about whether Pap smears, cholesterol screening, and mammograms were performed. The administration of immunizations to children and early initiation of prenatal care had agreement rates of 70 and 67% respectively (Dresser, Feingold et al. 1997; Fowles, Fowler et al. 1997).
The lower rates of agreement for immunization and prenatal care were attributed to reimbursement policies under which these services did not need to be billed separately (i.e., global billing) and thus were not captured in claims data.

Sources of Inaccuracy in Claims Data

The accuracy of claims data is affected by inappropriate or incomplete coding, whether the health plan is responsible for payment, reimbursement policies, and the clinical content and coding guidelines in the standardized coding systems. These sources of inaccuracy are described below.

Inappropriate and incomplete coding. Codes may be either incorrect or absent from a claims file for a variety of reasons. To begin, medical record technicians may make transcription errors (e.g., transpose numbers in the codes), apply the incorrect code to what the physician has written in the chart (e.g., code a new myocardial infarction (MI) as an old MI), or fail to code all of the diagnoses documented in the medical record (Jollis, Ancukiewicz et al. 1993; Dans 1998). Claims records may also be inaccurate because of provider coding practices. The provider, for example, may try to protect patient confidentiality and insurability by recording an alternate diagnosis for sensitive conditions such as mental illness or HIV (Dans 1998), or may miscode conditions (by exaggerating condition severity or changing billing diagnoses) to ensure that the health insurance company pays for the care and that the patient can avoid an appeals process for care that the provider perceives to be necessary (Wynia, Cummins et al. 2000; Werner, Alexander et al. 2002). Further, as a way to reduce administrative burden, providers may not use all applicable codes; they may use “super bills” that list the most common diagnoses and procedures, so that the closest code is checked off instead of a more accurate code being written in (Quam, Ellis et al. 1993; Garnick, Hendricks et al. 1994).

No record of rendered service.
A health plan’s claims data are generally limited to claims for the services for which the plan must pay. Several factors affect whether a health plan is responsible for payment. To begin, if claims are not submitted to the insurance company by either the patient or provider, then no record of the service rendered will exist. Health plans also do not need to pay claims when a patient receives services outside the health plan (e.g., at a public health clinic) or obtains care not included in his or her benefits package. Similarly, a health plan does not need to pay pharmacy claims if a patient pays for the prescription medication out-of-pocket or purchases an over-the-counter substitute. Further, if the patient or provider submits a claim before the deductible is exceeded, the claim may not be retained in the files because many insurers save information only on paid claims. Finally, health insurance plans are less likely to have information about services related to long-term care, workers’ compensation, and injuries resulting from accidents because other organizations are responsible for payment.

Reimbursement policies. Health plan reimbursement policies such as capitation, bundling, and carve-outs can also limit one’s ability to identify delivered care through claims data (Dresser, Feingold et al. 1997; Dans 1998). For example, prenatal care may be capitated and billed once toward the end of the pregnancy. This makes it difficult to identify when the initial prenatal care visit actually occurred and what services were delivered. Childhood immunizations are an example of bundled services because they may be billed as a component of “well-child” visits and thus not identifiable as separate services in claims data. Some types of services, such as mental health or pharmacy benefits, may be “carved-out” so that a separate organization is responsible for payment.
In these cases, the primary insurer may not have access to claims data for the carved-out service, and these data may be required for quality measurement.

Coding content and guidelines. For quality measurement, we often need to group people by diagnosis to determine who is eligible for any given quality of care indicator. Unfortunately, using ICD-9-CM codes to identify similar people is frequently less precise than would be ideal. To begin, there generally are no ICD-9-CM codes to specify the severity of a condition (e.g., ICD-9-CM codes do not differentiate between mild asthma and moderate to severe asthma). However, additional information such as pharmacy claims or the types of encounters (e.g., emergency room visits or hospitalizations rather than office visits) can be used to infer disease severity.

ICD-9-CM codes also fail to distinguish between suspected and confirmed diagnoses and between new and pre-existing diagnoses. For example, a single diagnostic code for diabetes could mean (a) the patient is being evaluated for diabetes, but the diagnosis has not been confirmed (i.e., the diagnostic code is used to describe a rule-out diagnosis of diabetes), (b) the patient has been newly diagnosed with diabetes, or (c) the patient has a history (i.e., prevalent diagnosis) of diabetes. Examining claims data over a period of time, rather than for a single encounter, makes it easier to discern whether a patient has a specific disease and to distinguish between rule-out, new, and prevalent diagnoses. For instance, if there is only one encounter coded for diabetes over a two-year period that includes several encounters and it is early in the period, then it is more likely that the diagnostic code was used as a rule-out code and the patient does not have diabetes. A new diagnosis of diabetes would be best indicated when the first coding of diabetes is identified, there are subsequent codes for diabetes, and earlier encounters are not coded for a diagnosis of diabetes.
If the patient had multiple prior visits with a diagnostic code for diabetes, then one could more confidently infer that the patient had a prevalent diagnosis of diabetes. Increasing the number of diabetes-coded visits required to infer the diagnosis highlights the inherent trade-off between sensitivity and specificity. Specifically, if multiple encounters are required to establish that a patient has diabetes, it is more likely that diabetics will be missed, but also less likely that people will be erroneously identified as being diabetic.

If no visit is coded for diabetes, it suggests that the patient is not diabetic. However, the absence of a code for diabetes could also occur because the patient did not seek care for the condition, the provider failed to code the diagnosis during the time for which the claims data are available, or the visit occurred before the deductible was met. As the amount of time for which data are available increases, the likelihood of errors in detecting the presence of a condition, and in determining whether the diagnosis is new or prevalent, decreases.

Identifying patients with a particular diagnosis for quality measurement is further complicated by vague ICD-9-CM coding guidelines (Iezzoni 1990; McCarthy, Iezzoni et al. 2000). The rules that govern ICD-9-CM coding frequently lack specific clinical definitions, and this increases the likelihood that patients with identical presentations and diagnoses will be given different codes. A patient admitted to the hospital for chest pain, for example, could be assigned a code corresponding to either precordial pain or angina. Therefore, when using claims data to identify people with similar diagnoses, it may be appropriate to use multiple codes. However, as additional codes are introduced, the likelihood of including people who do not fit the criteria of the quality indicator increases.

In sum, using diagnostic codes to group people by diagnosis is subject to error.
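The time-window reasoning described above for separating rule-out, new, and prevalent diagnoses can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a validated case-finding algorithm: the function name, the category labels, the input layout, and the default threshold of two coded visits are all hypothetical choices.

```python
from datetime import date
from typing import List, Tuple

# Hypothetical input layout: (service date, whether the encounter carries a diabetes code)
Encounter = Tuple[date, bool]


def classify_diabetes(encounters: List[Encounter], min_coded_visits: int = 2) -> str:
    """Classify a patient's diabetes status from encounters in an observation window.

    Assumed rules, loosely following the text:
      - no coded encounter             -> "no_evidence"
      - fewer coded encounters than
        min_coded_visits               -> "possible_rule_out"
      - enough coded encounters, with
        uncoded encounters beforehand  -> "new"
      - enough coded encounters,
        starting at the first one seen -> "prevalent"
    """
    ordered = sorted(encounters, key=lambda enc: enc[0])
    coded_positions = [i for i, (_, coded) in enumerate(ordered) if coded]

    if not coded_positions:
        return "no_evidence"
    if len(coded_positions) < min_coded_visits:
        return "possible_rule_out"
    if coded_positions[0] > 0:
        return "new"
    return "prevalent"
```

Raising `min_coded_visits` makes the rule stricter: fewer patients are labeled diabetic in error, but more true diabetics with few coded visits are missed, which is the sensitivity and specificity trade-off noted in the text. A "no_evidence" result remains subject to the caveats above, such as care not sought or visits made before the deductible was met.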
Errors are more likely to occur as the level of clinical detail increases, such as needing to distinguish disease severity or identify new diagnoses, and when multiple ICD-9-CM codes could be used to communicate identical conditions.

Summary: Accuracy of Claims Data

The accuracy of claims data is frequently evaluated by assessing agreement with information from medical records. Agreement varies by the diagnoses and procedures being compared and the associated billing practices. However, the specificity of claims data is typically better than their sensitivity. Errors in claims data are typically due to inaccurate or incomplete coding by providers, claims not being submitted for reimbursement, insufficiently detailed diagnosis codes, and vague guidelines for using ICD-9-CM codes.

APPLICATIONS OF CLAIMS DATA TO MEASURE QUALITY

Claims data have been used to study several aspects of health care performance. Beginning in the 1970s, claims data were used to quantify dramatic variations in medical practice across geographic areas (Wennberg and Gittelsohn 1973; Wilt, Cowper et al. 1999). Claims data have also been used to assess patient access to and utilization of health services (Lozano, Connell et al. 1995; Lo Sasso and Freund 2000; Fortney, Borowsky et al. 2002). Clinical outcomes are measured with claims data through efforts such as the Complications Screening Program (CSP) and the Healthcare Cost and Utilization Project Quality Indicators (HCUP QIs) (Johantgen, Elixhauser et al. 1998).

Several efforts successfully measure processes with claims data. The Health Plan Employer Data and Information Set (HEDIS) draws on claims data to construct 26 indicators of technical quality. HEDIS is primarily used to compare the performance of HMOs, but other types of health plans such as PPOs are also beginning to use and report their performance on HEDIS measures.
Quality of care indicators have also been developed by the CMS and its contractors, Quality Improvement Organizations (QIOs), to monitor the quality of care provided to Medicare beneficiaries for six conditions. However, monitoring of only two of these conditions (breast cancer and diabetes) relies solely on claims data (Jencks, Cuerdon et al. 2000).

Although these measurement activities provide information about quality, they represent just a fraction of the processes that are known to improve outcomes. The objective of this dissertation is to provide a foundation that could lead to broader use of claims data for quality measurement. Essential to this objective is the answer to the following: Are there dimensions of technical quality that are not measured with claims data, but could be?

To address this question, my analysis begins with an extensive list of health care processes that would ideally be measured to assess the quality of technical care. Then, I identify what is feasible to measure with claims data from this comprehensive list of indicators. I examine hundreds of quality of care indicators to characterize the dimensions of clinical quality that could be measured with claims data. The types of data elements that would increase the capacity for quality measurement with claims data are also identified. To gauge the validity of quality of care measurement with claims data, I then analyze the factors associated with better and worse agreement between claims and medical records data.