The Community-Acquired Pneumonia Symptom Questionnaire*
A New, Patient-Based Outcome Measure To Evaluate Symptoms in Patients With Community-Acquired Pneumonia

Donna L. Lamping, PhD; Sara Schroter, PhD; Patrick Marquis, MD; Alexia Marrel; Isabelle Duprat-Lomon, MD; and Pierre-Philippe Sagnier, MD

Study objectives: To develop and validate a patient-based outcome measure to evaluate symptoms in patients with community-acquired pneumonia (CAP).

Design: A psychometric study within an international, prospective, randomized, double-blind study. The CAP-symptom questionnaire (CAP-Sym) is a new, 18-item, patient-reported outcome measure that evaluates the bothersomeness of CAP-related symptoms during the past 24 h using a 6-point Likert scale. We used “gold standard” psychometric methods to comprehensively evaluate the acceptability, reliability, validity, and responsiveness of the CAP-Sym.

Setting: Sixty-four centers in 13 countries (France, Germany, Hungary, Israel, Italy, Norway, Poland, Portugal, South Africa, Spain, Sweden, Switzerland, United Kingdom).

Patients: Five hundred fifty-six patients with CAP, recruited from outpatient clinics, general practice, and hospital centers.

Interventions: Randomization 1:1 to oral moxifloxacin (400 mg once daily) or standard oral treatment (amoxicillin, 1 g tid, or clarithromycin, 500 mg bid, alone or in combination) for up to 14 days.

Results: Standard psychometric tests confirmed the acceptability (item nonresponse, item-endorsement frequencies, item/scale floor and ceiling effects), reliability (internal consistency, item-total and inter-item correlations, test-retest reliability), validity (content, construct, convergent, discriminant, known groups), and responsiveness of the CAP-Sym.

Conclusions: The CAP-Sym is a practical and scientifically sound patient-based outcome measure of CAP-related symptoms that has been developed using “gold standard” methods.
As the only fully validated measure of symptoms in patients with CAP, which is quick and easy to administer and is more responsive than the generic Medical Outcomes Study 36-Item Short-Form Health Survey, the CAP-Sym provides a practical and rigorous method for improving the evaluation of outcomes in clinical trials and audit. (CHEST 2002; 122:920–929)

Key words: community-acquired pneumonia; outcomes; patient-based assessment; questionnaire; symptoms

Abbreviations: CAP = community-acquired pneumonia; CAP-Sym = community-acquired pneumonia symptom questionnaire; CAP-Sym 12 = 12-item community-acquired pneumonia symptom questionnaire; CAP-Sym 18 = 18-item community-acquired pneumonia symptom questionnaire; ICC = intraclass correlation coefficient; MCS = SF-36 Mental Component Summary score; PCS = SF-36 Physical Component Summary; PSI = pneumonia severity index; SF-36 = Medical Outcomes Study 36-Item Short-Form Health Survey

*From the Health Services Research Unit (Drs. Lamping and Schroter), Department of Public Health and Policy, London School of Hygiene and Tropical Medicine, London, UK; MAPI Values (Dr. Marquis and Ms. Marrel), Lyon, France; and Bayer plc (Drs. Duprat-Lomon and Sagnier), Stoke Court, UK.
This work was funded by Bayer plc. Drs. Lamping and Schroter have received research funding and support for attending conferences from Bayer. MAPI Values (Dr. Marquis and Ms. Marrel) has received funding from Bayer for consulting, questionnaire development, and linguistic validation.
Correspondence to: Donna L. Lamping, PhD, Health Services Research Unit, Department of Public Health and Policy, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, United Kingdom; e-mail: [email protected]
Downloaded From: http://publications.chestnet.org/pdfaccess.ashx?url=/data/journals/chest/21982/ on 05/13/2017 Clinical Investigations

Community-acquired pneumonia (CAP), defined as pneumonia not acquired in a hospital or long-term care institution, is a leading cause of morbidity and mortality worldwide. A recent US study reported an annual CAP incidence of 5.6 million cases, with approximately 20% requiring hospitalization.1 In the United Kingdom, approximately 50,000 people are admitted to hospital annually with CAP.2 Among hospitalized patients, mortality ranges from 2 to 21% and rises to > 50% among patients with severe disease,3 making CAP the most common cause of death due to infectious disease.4

Recently introduced guidelines for the management of CAP5–9 provide algorithms to guide clinical decision making about the choice of antimicrobials. As an alternative to aminopenicillins and/or macrolides, current guidelines also recommend a new “respiratory” fluoroquinolone as a potential first-line option. In addition to their known efficacy, fluoroquinolones offer the potential advantage of quicker symptom resolution and improved quality of life due to their rapid bactericidal activity. It is therefore important that new antimicrobial treatments in CAP be evaluated on the basis of rigorous assessment of patient-based outcomes such as symptoms and quality of life in addition to clinical outcomes.
Most studies evaluate treatment efficacy on the basis of clinical outcomes such as mortality,10–12 bacteriologic response,10,13–15 temperature,16 respiratory and heart rate,17 clinical cure/response,13,15 nature and severity of adverse events/safety,10,13,15,17 and hospitalization.10–12,18 More recent studies have also evaluated outcomes on the basis of the health-care costs associated with outpatient visits,19 inpatient hospitalizations,17,18 the use of antibiotics/antimicrobials,10,11 and time to return to work or usual activities.10 Despite clear recognition that patient-based outcomes are a key component in evaluating health outcomes,20–22 only three studies have evaluated treatment efficacy in CAP using rigorous, validated measures of quality of life,10,17,19 and none have assessed symptoms using scientifically robust measures. Two studies10,13 that have evaluated symptoms used unvalidated, clinician-reported ratings of patients’ symptoms, and two other studies16,19 assessed patient-reported symptoms using scales that have not been fully evaluated for reliability, validity, and responsiveness.

Our extensive review of the literature and expert opinion from pneumologists, clinical researchers, and outcome researchers pointed to the need for a practical and scientifically rigorous patient-based outcome measure to evaluate symptoms in CAP in clinical trials and audit. We describe the development and validation of the CAP symptom questionnaire (CAP-Sym), a new, patient-based measure of symptoms in CAP.

We used rigorous psychometric methods23 to guide the development and evaluation of the CAP-Sym. These “gold standard” scientific methods, borrowed from the social sciences for application in health care,24,25 allow regulatory bodies, clinicians, researchers, and patient advocacy groups to determine whether an instrument is a “good” measure that provides scientifically credible information.
Psychometrics provide well-established scientific methods for measuring subjective judgements using numeric scales and evaluating the quality of measurement scales (ie, reliability, validity, responsiveness). Rigorous criteria are now available for evaluating the scientific robustness of health-outcome measures.26,27 We28–30 and others31–33 have used these methods extensively to develop and validate outcome measures in several areas of clinical medicine.

We undertook a comprehensive evaluation of the acceptability, reliability, validity, and responsiveness of the CAP-Sym questionnaire as part of an international, prospective, randomized, double-blind study to compare the effectiveness of treatment with moxifloxacin oral tablets to standard oral regimes in patients with CAP.

Materials and Methods

Questionnaire Development

We interviewed 33 patients with CAP in the United States and France to identify content domains and questions for the CAP-Sym. The interview sample included patients at different stages of the condition: from onset up to 7 days after onset (5 US patients, 2 French patients), 8 to 21 days after onset (10 US patients, 7 French patients), and at the end of oral antimicrobial treatment (at least 28 days after onset; 5 US patients, 4 French patients). The patients’ mean age was 52 years, and 58% (n = 19) were men. All patients were treated with oral antibiotics at the onset of CAP, and eight patients received additional IV treatment after the end of oral treatment. Trained interviewers conducted telephone or face-to-face interviews using an interview guide. Interviews included open and closed questions asking patients about their daily life with CAP, their symptoms, the circumstances in which they were most bothered/limited because of CAP, and the consequences of CAP and its treatment.
Patients’ verbatim reports were used to develop questions for the CAP-Sym based on a predefined format to evaluate patients’ views about the bothersomeness of their symptoms.

The questionnaire was developed in English (for use in the United Kingdom and South Africa) and then translated into 12 other languages: French, German (for use in Germany and Switzerland), Spanish, Italian, Portuguese, Swedish, Norwegian, Polish, Hungarian, Hebrew (for use in Israel), Russian (for use in Israel), and Afrikaans (for use in South Africa). Linguistic validation was performed according to the standard forward/backward methodology34: (1) a single forward translation from English to the target language; (2) review of the translation by linguistic experts; (3) backtranslation into English; and (4) amendments to the forward translation based on the backtranslation. Each language version of the questionnaire was pretested by interviewing two pulmonary specialists in each country to check for completeness, relevance, and the appropriateness of the wording used by patients to describe their condition. Modifications to the questionnaire were then made and final translations agreed.

Table 1. Psychometric Tests and Criteria

1. Item analysis/reduction
   Definition/test: Identify items for possible elimination due to weak psychometric performance;* assessed on the basis of:
     Unrotated principal component factor analysis (to determine whether all 18 items are measuring a single factor).
     Item analyses for all 18 items.
   Criteria for acceptability:
     Principal component factor analysis: All items should load on the first unrotated factor > 0.30.
     Applied to all 18 items:
       Missing data < 5%.
       No item redundancy (inter-item correlations < 0.75).
       Item-total correlations ≥ 0.25.
       Evidence of item responsiveness as assessed by significant improvement between baseline and test of cure assessments.
       Maximum endorsement frequencies < 80% (ie, the proportion of respondents who endorse each response category), including floor/ceiling effects < 80% (ie, response categories with high endorsement rates at the bottom/top ends of the scale, respectively).
       Aggregate adjacent endorsement frequencies > 10%.

2. Acceptability
   Definition/test: The quality of data; assessed by completeness of data and score distributions.
   Criteria for acceptability:
     Applied to items: Missing data < 5%. Maximum endorsement frequencies < 80% (see above), including floor/ceiling effects < 80% (see above).
     Applied to summary scores: Missing data < 5%. Floor/ceiling effects < 80%. Skewness values between +1 and −1.

3. Reliability

3.1 Internal consistency
   Definition/test: The extent to which items comprising a scale measure the same construct (eg, homogeneity of the scale); assessed by Cronbach α coefficients44 and item-total correlations.
   Criteria for acceptability: Cronbach α coefficients for summary scores > 0.70.44 Item-total correlations ≥ 0.25.25

3.2 Test-retest reliability
   Definition/test: The stability of a measuring instrument; assessed by administering the instrument to respondents on two different occasions and examining the correlation between test and retest scores.†
   Criteria for acceptability: ICCs for summary scores > 0.80.25

4. Validity

4.1 Content validity
   Definition/test: The extent to which the content of a scale is representative of the conceptual domain it is intended to cover;‡ assessed qualitatively during the questionnaire development stage through pretesting with patients, expert opinion, and literature review.
   Criteria for acceptability: Qualitative evidence from pretesting with patients, expert opinion, and literature review that items in the scale are representative of CAP symptoms.

4.2 Construct validity

4.2.1 Within-scale analyses
   Definition/test: Evidence that a single entity (construct) is being measured and that items can be combined to form a summary score; assessed on the basis of evidence of good internal consistency, moderately high item-total correlations, and results from principal component factor analysis.
   Criteria for acceptability: Internal consistency (Cronbach α coefficient) > 0.70. Item-total correlations ≥ 0.25. Evidence from factor analysis that a single construct is being measured.

4.2.2 Analyses against external criteria

4.2.2.1 Known group differences/hypothesis testing
   Definition/test: The ability of a scale to differentiate known groups; assessed by comparing CAP-Sym scores of patients defined as clinically cured, according to the clinical variable “clinical evaluation of cure” between baseline and the days 7 to 10 (test of cure) assessments, with those of patients defined as clinical failures. Note: the comparative validity of the disease-specific CAP-Sym against the generic SF-36 was also evaluated by assessing the ability of the SF-36 Vitality scale to differentiate patients defined as clinical cure/failure.
   Criteria for acceptability: CAP-Sym scores should be significantly higher (ie, higher symptom bothersomeness) in patients in the clinical failure group than in patients in the clinically cured group. SF-36 Vitality scores should be significantly lower (ie, lower energy) in patients defined as clinical failures than in patients defined as clinically cured.

4.2.2.2 Convergent validity
   Definition/test: Evidence that the scale is correlated with other measures of the same or similar constructs; assessed on the basis of correlations between CAP-Sym scores and other patient-based (SF-36) and clinical (temperature, PSI) outcome measures.
   Criteria for acceptability: Criteria depend on the degree of conceptual similarity between the CAP-Sym scale and the other validation measures. Specific hypotheses follow:
     For patient-based outcome measures:
       Moderate correlations between the CAP-Sym and SF-36 (because the two instruments are measuring constructs that are related but distinct: symptoms vs quality of life).
       Higher correlations between the CAP-Sym and SF-36 PCS/Vitality scores than between the CAP-Sym and SF-36 MCS scores, because the CAP-Sym is more closely related to physical than mental health.
     For clinical measures:
       Low correlations between CAP-Sym and temperature and the PSI.40–42

4.2.2.3 Discriminant validity
   Definition/test: Evidence that the scale is not correlated with other measures of different constructs; assessed on the basis of correlations with age and sex.
   Criteria for acceptability: Low correlations between CAP-Sym scores and age and sex.

5. Responsiveness
   Definition/test: The ability of a scale to detect clinically significant change following a treatment of known efficacy,45,46 assessed by comparing mean scores for change in CAP-Sym scores at three assessment points (ie, between baseline and days 3–5, days 7–10, and days 28–35) using two standard methods:
     Effect size, calculated for responsiveness at the three assessment points as the mean difference (change score) in symptom scores from baseline to follow-up divided by the SD of the baseline score; effect sizes and standardized response means of 0.20 are considered small, 0.50 moderate, and ≥ 0.80 large.46
     Standardized response mean, calculated for responsiveness at the three assessment points as the mean difference (change score) in symptom scores from baseline to follow-up divided by the SD of the change score.
     Note: the comparative responsiveness of the disease-specific CAP-Sym against the generic SF-36 was assessed by comparing effect sizes.
   Criteria for acceptability: Effect sizes and standardized response means should increase in magnitude across time, ie, CAP-Sym and SF-36 scores should improve over time. Larger effect sizes indicate better responsiveness.

*A standard item-reduction strategy was used to identify and eliminate items from the questionnaire that showed weak psychometric properties. To test the robustness of the item-reduction strategy, cross-validation analyses using the same tests and criteria were performed separately on two random split-half subsamples from the pooled dataset. Results of the item-reduction analyses performed on the two randomly selected subsamples were then compared to results obtained in the pooled sample.
†The length of the test-retest interval must be short enough to ensure that clinical change in the symptom being measured is unlikely to occur, but sufficiently long to ensure that respondents do not recall their responses from the first assessment. In conditions such as CAP, where rapid changes in symptoms are expected to occur over a very brief time (ie, within a few hours), a very short test-retest interval of 1 to 2 h is necessary. This ensures that stability per se is being evaluated, rather than clinical change in symptoms during the test-retest interval, which will underestimate reliability.
‡A scale to measure CAP-related symptoms should include questions based on the wide range of symptoms that characterize the condition. If a CAP symptom questionnaire did not include an item about cough, content validity might be considered doubtful, as an important dimension of the condition would have been excluded.

CHEST / 122 / 3 / SEPTEMBER, 2002

CAP-Sym Questionnaire

The CAP-Sym (Appendix A) measures 18 CAP-related symptoms: coughing, chest pains, shortness of breath, coughing up phlegm/sputum (secretion from the chest), coughing up blood, sweating, chills, headache, nausea, vomiting, diarrhea, stomach pain, muscle pain, lack of appetite, trouble concentrating, trouble thinking, trouble sleeping, and fatigue. It is a patient-reported questionnaire that is administered by interview.
Patients are asked to rate each symptom for bothersomeness during the past 24 h using a 6-point Likert scale. In this study, the researcher asked the patient “In the past 24 h, how much have you been bothered by . . . ”, then read each symptom aloud and recorded the patient’s rating of the bothersomeness of the symptom on the 6-point response scale (0 = patient did not have the symptom/problem; or patient had the symptom/problem and reported that it bothered him/her: 1 = not at all, 2 = a little, 3 = moderately, 4 = quite a bit, 5 = extremely). All 18 items are summed to produce a CAP-Sym score (range, 0 to 90). High values indicate poorer outcomes (ie, higher symptom bothersomeness).

CAP 2000 Study

The CAP 2000 study is a multicenter, prospective, randomized, double-blind trial to compare the effectiveness of moxifloxacin and standard recommended therapy in CAP. Full details of the study are described elsewhere.35 Briefly, patients from 13 countries, recruited from outpatient clinics, general practice, and hospital centers, were randomized to one of two treatment arms: oral moxifloxacin (400 mg once daily) or standard treatment (amoxicillin, 1 g tid, or clarithromycin, 500 mg bid, or both amoxicillin and clarithromycin) for at least 5 days and up to 14 days of treatment. The choice of standard treatment was made by study clinicians prior to randomization and represented the most appropriate local first-line therapy based on clinical presentation, potential pathogens, and local susceptibility data. All randomized patients were followed up to study termination on an intention-to-treat basis.

Outcomes were assessed at baseline (study entry), days 3 to 5 during treatment, days 7 to 10 (test of cure), and days 28 to 35 after treatment. Clinical outcomes and patient-reported symptoms and energy were assessed at all four assessment points, and quality of life was assessed at baseline and days 28 to 35. Clinical outcome measures included temperature and the pneumonia severity index (PSI).36 We used the CAP-Sym to assess symptoms, the acute version of the Medical Outcomes Study 36-Item Short-Form Health Survey (SF-36)37 to assess quality of life, and the vitality subscale of the SF-36 to assess energy. Use of health-care resources (ie, medications, including alternative antibacterial therapy, diagnostic and therapeutic procedures, hospitalization, and other health-care visits) was recorded throughout the study period. Ethics approval was obtained from the relevant committees in each country, and written informed consent was obtained from all patients prior to study entry.

Psychometric Evaluation of the CAP-Sym

We used “gold standard” psychometric tests and criteria for item reduction and to evaluate the acceptability, reliability, validity, and responsiveness of the CAP-Sym (Table 1). The first step in the psychometric field testing of the CAP-Sym was to perform standard item-reduction analyses to determine whether the initial 18-item CAP-Sym (CAP-Sym 18) could be reduced to a smaller number of items. This was done by selecting items that performed best on psychometric tests. The second step was to perform extensive psychometric analyses to evaluate the acceptability, reliability, validity, and responsiveness of the CAP-Sym.

All psychometric analyses were performed on pooled data from all 13 countries, as sample sizes for individual countries/language versions were generally not sufficiently large to guarantee the robustness of the item-reduction analyses. This approach to performing analyses on pooled data has been used in previous psychometric validations of other commonly used “gold standard” international outcome measures.38 All psychometric analyses were performed on baseline CAP-Sym data except responsiveness analyses, which used data from all four assessment points.23,25

Results

Respondent Characteristics

As shown in Table 2, a total of 556 patients from 13 countries completed the baseline assessment. The sample included 321 men (58%) and 235 women (42%), who ranged in age from 17 to 97 years (mean, 50.41 years).

Table 2. Respondent Characteristics (n = 556)*

Characteristics        Data
Sex
  Men                  321 (58)
  Women                235 (42)
Age, yr
  Range                17–97
  Mean (SD)            50.41 (18.65)
Country
  France               44
  Germany              56
  Hungary              71
  Israel               56
  Italy                17
  Norway               11
  Poland               61
  Portugal             6
  South Africa         81
  Spain                52
  Sweden               41
  Switzerland          8
  United Kingdom       52

*Data are presented as No. (%) or No. unless otherwise indicated.

Item Reduction

The unrotated principal component factor analysis confirmed that items were measuring a single construct. One item (coughing up blood) that did not load on the first unrotated factor > 0.30 was sufficiently near to this criterion (0.27) to support this assumption. There was a low proportion of missing data for all 18 items (0 to 0.2%), suggesting that none of the items should be eliminated because of a high nonresponse rate. Examination of item-endorsement frequencies showed that responses were generally well distributed across all response categories. However, other psychometric tests/criteria led to the elimination of six items: trouble thinking (item redundancy; inter-item correlation > 0.75), coughing up blood (low item-total correlation < 0.25), diarrhea (poor item responsiveness), and vomiting and stomach pain (aggregate adjacent endorsement frequencies < 10%). Item-reduction analyses produced a shorter 12-item CAP-Sym (CAP-Sym 12; see Appendix for items in CAP-Sym 12).

Results from cross-validation analyses of the item-reduction strategy performed separately on the two
random split-half subsamples confirmed the robustness of the item-reduction strategy. Item-reduction analyses in the two subsamples eliminated very similar items to those eliminated in the main pooled data set. There were only minor differences in the actual items that were eliminated across all three samples.

We evaluated the psychometric properties of both the CAP-Sym 18 and the CAP-Sym 12. This was done to enable a direct comparison of the measurement properties of the full-length and item-reduced versions of the CAP-Sym to determine whether item reduction resulted in a more robust measure.

Acceptability

The full-length and item-reduced versions of the CAP-Sym show good acceptability. Table 3 shows a low proportion of missing data, low floor/ceiling effects, and skewness values within the recommended range for the CAP-Sym 18 and the CAP-Sym 12.

Reliability

Internal Consistency: As shown in Table 3, Cronbach α coefficients for the CAP-Sym 18 and the CAP-Sym 12 indicate high internal consistency. Values exceed the standard criterion of 0.70. Item-total correlations ranged from 0.22 to 0.56 (mean, 0.40) for the CAP-Sym 18, and from 0.28 to 0.60 (mean, 0.42) for the CAP-Sym 12. One item in the CAP-Sym 18 (coughing up blood; 0.22) failed the criterion of ≥ 0.25, whereas all items in the CAP-Sym 12 passed this criterion. Mean inter-item correlations were 0.20 and 0.23 for the CAP-Sym 18 and CAP-Sym 12 scales, respectively.

Test-Retest Reliability: As shown in Table 3, both the CAP-Sym 18 and the CAP-Sym 12 show good test-retest reliability. Intraclass correlation coefficients (ICCs) were > 0.95.

Table 3. Acceptability and Reliability of the CAP-Sym

Test                                  CAP-Sym 18   CAP-Sym 12
Missing data, %                       0.2          0.2
Floor/ceiling effects, %              0            0
Internal consistency, Cronbach α      0.82         0.78
Test-retest reliability, ICC          0.96         0.96

Validity

Content Validity: The content validity of the CAP-Sym was evaluated during the development of the questionnaire. Evidence from qualitative interviews with patients, opinion from experts in CAP, and a review of literature supports the content validity of the CAP-Sym.

Construct Validity (Within-Scale Analyses): Evidence of high internal consistency (Table 3) and findings from the principal component factor analysis support the construct validity of the CAP-Sym 18 and the CAP-Sym 12. Moderately high item-total correlations, high α coefficients, and the results of the factor analysis indicate that a single construct is being measured, and that the items can be combined to form summary scores.

Construct Validity (Known Group Differences/Hypothesis Testing): Table 4 presents mean CAP-Sym scores for patients defined as clinically cured or clinical failures. As hypothesized, CAP-Sym 18 and CAP-Sym 12 scores are significantly lower (indicating lower symptom bothersomeness) in patients defined as clinically cured than in the small number of patients (n = 7) who did not demonstrate clinical improvement (p = 0.03 and 0.02, respectively). Similarly, SF-36 Vitality scores are significantly higher (indicating higher energy) in patients defined as clinically cured than in those who did not demonstrate clinical improvement (p = 0.03).

Construct Validity (Convergent Validity): Table 5 shows correlations between the CAP-Sym and the SF-36. All correlations support hypotheses. The CAP-Sym is moderately correlated with SF-36 Physical Component Summary (PCS) and SF-36 Mental Component Summary (MCS) scores and with SF-36 Vitality scores. Both the CAP-Sym 18 and the CAP-Sym 12 are correlated more highly with SF-36 PCS than with SF-36 MCS scores. As expected, the CAP-Sym and SF-36 are uncorrelated with temperature or the PSI.
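The internal-consistency statistics reported above (Cronbach α and item-total correlations, Table 3) can be illustrated with a minimal sketch. It assumes complete data, sample variances (n − 1 denominator), and "corrected" item-total correlations in which each item is correlated with the sum of the remaining items; the paper does not state which variant was used, so these choices, the function names, and the example data are all hypothetical.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach alpha for rows of item scores (one list per respondent)."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(rows):
    """Correlate each item with the sum of the other items (assumed variant)."""
    k = len(rows[0])
    result = []
    for i in range(k):
        item = [r[i] for r in rows]
        rest = [sum(r) - r[i] for r in rows]
        result.append(pearson(item, rest))
    return result


# Hypothetical responses on the 0-5 scale: 5 respondents x 4 items (not study data).
responses = [
    [0, 1, 0, 2],
    [2, 2, 1, 3],
    [3, 2, 3, 3],
    [4, 5, 3, 4],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(responses)
flagged = [i for i, r in enumerate(corrected_item_total(responses)) if r < 0.25]
print(round(alpha, 2), flagged)
```

Items whose index appears in `flagged` would fail the item-total criterion of ≥ 0.25 from Table 1, and `alpha` would be compared with the 0.70 criterion.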
Construct Validity (Discriminant Validity): Low correlations between the CAP-Sym and age and sex (Table 5; all correlations < 0.17) support the discriminant validity of the CAP-Sym 18 and the CAP-Sym 12. These results suggest that responses to the CAP-Sym are not biased in terms of age or sex.

Table 4. Known Group Differences Validity of the CAP-Sym and SF-36

Group               CAP-Sym 18   CAP-Sym 12   SF-36 Vitality
Clinical cure       12.42        9.51         63.35
Clinical failure    21.14        16.86        42.86
p Value             0.034        0.017        0.030

Table 5. Convergent and Discriminant Validity of the CAP-Sym

Measures                  CAP-Sym 18   CAP-Sym 12
SF-36 PCS                 −0.35        −0.39
SF-36 MCS                 −0.25        −0.27
SF-36 Vitality            −0.33        −0.37
Age                       −0.16        −0.16
Sex                       0.12         0.14
Temperature (baseline)    0.14         0.17
PSI                       −0.09        −0.10

Responsiveness

Table 6 shows effect sizes for change in CAP-Sym scores between baseline and days 3 to 5, days 7 to 10, and days 28 to 35. As hypothesized, CAP-Sym 18 and CAP-Sym 12 scores show improvement from baseline to follow-up at all three assessment points, and effect sizes increase in magnitude across time, indicating good responsiveness. Effect sizes are large at all three assessments. Similar results for the responsiveness of the CAP-Sym were found using standardized response means (results not shown).

As shown in Table 6, the SF-36 also shows good evidence of responsiveness. Improvement as measured by SF-36 PCS and SF-36 MCS scores demonstrated large effect sizes between baseline and days 28 to 35 after therapy. Improvement as measured by SF-36 Vitality scores was small between baseline and days 3 to 5, large between baseline and days 7 to 10, and even larger between baseline and days 28 to 35. Compared with the CAP-Sym, effect sizes for the SF-36 are lower, suggesting that it may be less responsive to treatment.
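The two responsiveness indices defined in Table 1 (effect size and standardized response mean) can be sketched as follows. This is an illustration under stated assumptions: change is taken as baseline minus follow-up so that symptom improvement is positive, sample SDs (n − 1) are used, and the function names, the "below small" label, and the example scores are hypothetical rather than taken from the study.

```python
from statistics import mean, stdev


def change_scores(baseline, follow_up):
    """Per-patient improvement: baseline minus follow-up (assumed sign convention)."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must be paired")
    return [b - f for b, f in zip(baseline, follow_up)]


def effect_size(baseline, follow_up):
    """Mean change divided by the SD of the baseline scores."""
    return mean(change_scores(baseline, follow_up)) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the SD of the change scores."""
    changes = change_scores(baseline, follow_up)
    return mean(changes) / stdev(changes)


def magnitude(value):
    """Label a value using the 0.20 / 0.50 / 0.80 benchmarks cited in Table 1."""
    size = abs(value)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "moderate"
    if size >= 0.20:
        return "small"
    return "below small"  # label chosen for this sketch, not from the paper


# Hypothetical CAP-Sym scores for three patients (not study data).
baseline = [30, 40, 50]
day_7_10 = [20, 25, 45]
es = effect_size(baseline, day_7_10)
srm = standardized_response_mean(baseline, day_7_10)
print(round(es, 2), round(srm, 2), magnitude(es))
```

Because the two indices share a numerator and differ only in the denominator, they can diverge when change is very uniform or very variable across patients, which is one reason for reporting both.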
Similar results for the responsiveness of the SF-36 were found using standardized response means (results not shown).

Table 6. Responsiveness (Effect Sizes) of the CAP-Sym and SF-36

Assessment point          CAP-Sym 18   CAP-Sym 12   SF-36 PCS   SF-36 MCS   SF-36 Vitality
Baseline to days 3–5      0.98         1.09         -           -           0.22
Baseline to days 7–10     1.54         1.67         -           -           1.20
Baseline to days 28–35    1.75         1.86         1.54        0.91        1.55

Discussion

The CAP-Sym is a practical and scientifically sound patient-based outcome measure of CAP-related symptoms that has been developed using “gold standard” methods. As the only fully validated measure of symptoms in CAP, which is quick and easy to administer and is more responsive than the generic SF-36, the CAP-Sym provides a rigorous method for improving the evaluation of treatment outcomes in CAP. A short questionnaire that takes < 2 min to complete, it can be easily incorporated into clinical trials and routine audit. This new outcome measure offers a valuable tool for evaluating the efficacy of new treatments for CAP, as it provides evidence that will be scientifically credible to regulatory bodies, clinicians, researchers, and patient advocacy groups.

Both the CAP-Sym 18 and the CAP-Sym 12 meet standard criteria for acceptability, reliability, validity, and responsiveness. As the two versions show very similar psychometric properties, it is difficult to make a strong case for a psychometric advantage of either the CAP-Sym 18 or the CAP-Sym 12. The CAP-Sym 12 shows a slight psychometric advantage (ie, all item-total correlations meet the criterion, whereas one item fails the criterion in the CAP-Sym 18), as well as the practical advantage of being shorter, and therefore has a slightly lower respondent burden. However, the CAP-Sym 18 includes all symptoms and could therefore be argued to have optimal content and face validity. We recommend both measures for use in clinical trials and audit.

The CAP-Sym is designed to be administered by interview.
This method of questionnaire administration typically results in little missing data. If the CAP-Sym is administered by self-completion or postal survey rather than by interview, it is likely that there will be more missing data. In this case, an alternate scoring method would be more appropriate. We suggest that if the CAP-Sym is to be administered by self-completion, missing data should be imputed using the same algorithm recommended for scoring the SF-36 [39]. Using this method, a person-specific estimate is imputed for missing questions in cases in which the patient has answered at least 50% of the items on the CAP-Sym.

It is important to consider possible methodologic limitations in the development of the CAP-Sym. First, were the patients who participated in the CAP 2000 study and the field testing of the CAP-Sym a representative sample? Comparison with previous studies shows that patients in our study were similar in age and gender to patients with CAP in another large international trial [15].

Second, does the lack of correlation between the CAP-Sym and clinical outcome measures limit its use or interpretation as an outcome measure in clinical trials? Discrepancies between measures of outcome assessed from the patient’s point of view and clinical assessments have been demonstrated in many areas of health care [40–42]. This does not mean that patient-based outcome measures of symptoms or quality of life are not valid. Rather, this common finding confirms that clinical and patient-based measures are evaluating different and not necessarily related aspects of outcome, both of which must be included in any comprehensive evaluation of outcomes.
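The missing-data rule suggested earlier in the Discussion for self-completed questionnaires can be sketched as follows. The text states only that a person-specific estimate is imputed when at least 50% of items have been answered. This sketch assumes, in line with the SF-36 scoring approach it refers to, that the estimate is the mean of that respondent's answered items. The function name and the example responses are hypothetical, and the sketch returns the completed item responses rather than a total score, because the CAP-Sym scoring rule is not given in this section.

```python
from typing import List, Optional, Sequence


def impute_person_mean(
    responses: Sequence[Optional[float]],
    min_answered: float = 0.5,
) -> Optional[List[float]]:
    """Fill missing items (None) with the respondent's own mean item score.

    Returns None when fewer than `min_answered` of the items were answered,
    meaning the questionnaire is treated as unscorable.
    """
    answered = [r for r in responses if r is not None]
    if not responses or len(answered) / len(responses) < min_answered:
        return None
    # Assumption: the person-specific estimate is the mean of answered items.
    person_mean = sum(answered) / len(answered)
    return [person_mean if r is None else float(r) for r in responses]


# Hypothetical 6-item excerpts; None marks a skipped question.
print(impute_person_mean([2, None, 4, 3, None, 3]))            # 4 of 6 answered
print(impute_person_mean([None, None, None, 1, None, None]))   # 1 of 6 answered
```

A respondent who answers exactly half of the items still qualifies for imputation under this reading of "at least 50%".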
Moreover, we have shown that none of the clinician-rated signs and symptoms scales used in the CAP 2000 study were adequately reliable to be used to evaluate treatment outcome or to validate the CAP-Sym [43].

Third, as expected in a study of this design and the population studied, the number of clinical failures was very small. It could be argued, therefore, that the test of validity based on differences in mean CAP-Sym scores between patients defined as clinically cured and those who did not demonstrate clinical improvement is less robust than if applied to a sample with a higher number of clinical failures. However, as other pieces of evidence from different psychometric tests provide further confirmation of the validity of the CAP-Sym, we do not believe that this arguably weaker piece of evidence compromises the overall validity of the CAP-Sym. It would, nevertheless, be important to further evaluate the ability of the CAP-Sym to discriminate groups in trials that produce more mixed results in terms of treatment efficacy.

Our conclusions about the acceptability, reliability, validity, and responsiveness of the CAP-Sym are based on the results of field testing of 556 patients with CAP in 13 countries. Subsequent studies are needed to evaluate the psychometric properties of the 13 individual language versions of the CAP-Sym. As data begin to accumulate from the use of the CAP-Sym, we will be able to establish normative population values for CAP-Sym scores in different countries. It is also possible that the CAP-Sym may prove to be useful as a measure of outcomes in patients with hospital-acquired pneumonia or those with CAP requiring hospitalization, although the measure should be revalidated for use in more severe conditions.

Findings from this study suggest a possible advantage of using the disease-specific CAP-Sym in clinical trials as it may be more responsive in detecting treatment effects than the generic SF-36.
Further evaluation of the comparative responsiveness of the CAP-Sym and SF-36 should be carried out in order to confirm possible differences in sensitivity to treatment effects.

Evaluating the scientific adequacy of patient-based health-outcome measures is a key task in developing recommendations about which measures should be used in clinical trials and to routinely monitor health care. Use of the CAP-Sym as an outcome measure in clinical trials ensures the availability of scientifically credible information about the effect of new treatments on patient-reported symptoms. For example, the CAP-Sym could be used to evaluate new pharmaceutical treatments or to compare the efficacy of oral vs IV administration. Routine use of the CAP-Sym in clinical practice will enable clinicians and managers to assess outcomes in CAP on a continuing basis. Such information can be used to identify strengths and deficiencies in quality of care, and thus can be used to improve medical practice.

ACKNOWLEDGMENT: We thank the study investigators in each of the clinical sites.

References

1 Niederman MS, McCombs JS, Unger AN, et al. The cost of treating community acquired pneumonia. Clin Ther 1998; 20:820–837
2 Guidelines for the management of community acquired pneumonia in adults admitted to hospital: The British Thoracic Society. Br J Hosp Med 1993; 49:346–350
3 Marrie TJ. Community acquired pneumonia: epidemiology, etiology, treatment. Infect Dis Clin North Am 1998; 12:723–740
4 Bartlett JG, Mundy LM. Community acquired pneumonia. N Engl J Med 1995; 333:1618–1624
5 Bartlett JG, Dowell SF, Mandell LA, et al. Practice guidelines for the management of community-acquired pneumonia in adults. Clin Infect Dis 2000; 31:347–382
6 Mandell LA, Marrie TJ, Grossman RF, et al. Canadian guidelines for the initial management of community acquired pneumonia: an evidence-based update by the Canadian Infectious Diseases Society and the Canadian Thoracic Society. Clin Infect Dis 2000; 31:383–421
7 ERS Task Force Report: guidelines for management of adult community-acquired lower respiratory tract infections; European Respiratory Society. Eur Respir J 1998; 11:986–991
8 Niederman MS, Mandell LA, Anzueto A, et al. Guidelines for the management of adults with community-acquired pneumonia: diagnosis, assessment of severity, antimicrobial therapy, and prevention. Am J Respir Crit Care Med 2001; 163:1730–1754
9 Heffelfinger JD, Dowell SF, Jorgensen JH, et al. Management of community-acquired pneumonia in the era of pneumococcal resistance. Arch Intern Med 2000; 160:1399–1408
10 Gleason PP, Kapoor WN, Stone RA, et al. Medical outcomes and antimicrobial costs with the use of the American Thoracic Society guidelines for outpatients with community-acquired pneumonia. JAMA 1997; 278:32–39
11 Suchyta MR, Dean NC, Narus S, et al. Effects of a practice guideline for community-acquired pneumonia in an outpatient setting. Am J Med 2001; 110:306–309
12 Dean NC, Silver MP, Bateman KA, et al. Decreased mortality after implementation of a treatment guideline for community-acquired pneumonia. Am J Med 2001; 110:451–457
13 Roa CC, Dantes RB. Clinical effectiveness of a combination of bromhexine and amoxicillin in lower respiratory tract infection. Drug Res 1995; 45:267–272
14 Lim WS, Macfarlane JT, Boswell TCJ, et al. Study of community acquired pneumonia aetiology (SCAPA) in adults admitted to hospital: implications for management guidelines. Thorax 2001; 56:296–301
15 Petitpretz P, Arvis P, Marel M, et al. Oral moxifloxacin vs high-dosage amoxicillin in the treatment of mild-to-moderate, community-acquired pneumococcal pneumonia in adults. Chest 2001; 119:185–195
16 Metlay JP, Schulz R, Li Y-H, et al. Influence of age on symptoms at presentation in patients with community-acquired pneumonia. Arch Intern Med 1997; 157:1453–1459
17 Marrie TJ, Lau CY, Wheeler SL, et al. A controlled trial of a critical pathway for treatment of community-acquired pneumonia. JAMA 2000; 283:749–755
18 Fine MJ, Pratt HM, Obrosky DS, et al. Relation between length of hospital stay and costs of care for patients with community-acquired pneumonia. Am J Med 2000; 109:378–385
19 Metlay JP, Fine MJ, Schulz R, et al. Measuring symptomatic and functional recovery in patients with community-acquired pneumonia. J Gen Intern Med 1997; 12:423–430
20 Fitzpatrick R, Fletcher A, Gore S, et al. Quality of life measures in health care. I: Applications and uses in assessment. BMJ 1992; 305:1074–1077
21 Stewart AL, Ware JE Jr, eds. Measuring functioning and well-being: the Medical Outcomes Study approach. Durham, NC: Duke University Press, 1992
22 Quality of life and clinical trials [editorial]. Lancet 1995; 346:1–2
23 Nunnally JC, Bernstein IH. Psychometric theory. 3rd ed. New York, NY: McGraw Hill, 1994
24 Jenkinson C, ed. Measuring health and medical outcomes. London, UK: UCL Press, 1994
25 Streiner DL, Norman GR. Health measurement scales: a practical guide to their development and use. 2nd ed. Oxford, UK: Oxford University Press, 1995
26 Lohr KN, Aaronson NK, Alonso J, et al. Evaluating quality-of-life and health status instruments: development of scientific criteria. Clin Ther 1996; 18:979–992
27 McDowell I, Jenkinson C. Development standards for health measures. J Health Serv Res Policy 1996; 1:238–246
28 Lamping DL, Rowe P, Black N, et al. Development and validation of an audit instrument: the Prostate Outcomes Questionnaire. Br J Urol 1998; 82:49–62
29 Lamping DL, Rowe P, Clarke A, et al. Development and validation of the Menorrhagia Outcomes Questionnaire. Br J Obstet Gynaecol 1998; 105:766–779
30 Foss AJE, Lamping DL, Schroter S, et al. Development and validation of a patient-based measure of outcome in ocular melanoma. Br J Ophthalmol 2000; 84:347–351
31 McDowell I, Newell C. Measuring health: a guide to rating scales and questionnaires. 2nd ed. Oxford, UK: Oxford University Press, 1996
32 Bowling A. Measuring disease: a review of disease-specific quality of life measurement scales. 2nd ed. Buckingham, UK: Open University Press, 2001
33 Bowling A. Measuring health: a review of quality of life measurement scales. 2nd ed. Milton Keynes, UK: Open University Press, 1997
34 Acquadro C, Jambon B, Ellis D, et al. Language and translation issues. In: Spilker B, ed. Quality of life and pharmacoeconomics in clinical trials. Philadelphia, PA: Lippincott-Raven, 1996; 575–585
35 Höffken G, Corris PA, Muir JF, et al. Assessment of the management of community-acquired pneumonia (CAP): a comparison of moxifloxacin (MXF) to standard first-line oral monotherapy or a combination regimen. Presented at: 41st International Conference on Antimicrobial Agents and Chemotherapy, Chicago, IL, September 2001
36 Fine MJ, Auble TE, Yealy DM, et al. A prediction rule to identify low-risk patients with community-acquired pneumonia. N Engl J Med 1997; 336:243–250
37 Ware JE, Kosinski M, Keller SD. SF-36 physical and mental health summary scales: a user’s manual. Boston, MA: The Health Institute, 1994
38 Rosen RC, Riley A, Wagner G, et al. The international index of erectile function (IIEF): a multidimensional scale for assessment of erectile dysfunction. Urology 1997; 49:822–830
39 Ware JE Jr, Snow KK, Kosinski M, et al. SF-36 Health Survey: manual and interpretation guide. Boston, MA: The Health Institute, 1993
40 Guyatt GH, Thompson PJ, Berman LB, et al. How should we measure function in patients with chronic heart and lung disease? J Chronic Dis 1985; 38:517–524
41 Juniper EF, Guyatt GH, Ferrie PJ, et al. Measuring quality of life in asthma. Am Rev Respir Dis 1993; 147:832–838
42 Cleary PD. Subjective and objective measures of health: which is better? J Health Serv Res Policy 1997; 2:3–4
43 Lamping DL, Schroter S, Sagnier PP, et al. Psychometric evaluation of the CAP-Sym questionnaire: a new, patient-based measure of symptoms in community-acquired pneumonia. Value Health 2001; 4:65
44 Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika 1951; 16:297–334
45 Cohen J. Statistical power analysis for the behavioral sciences. New York, NY: Academic Press, 1977
46 Kazis LE, Anderson JJ, Meenan RF. Effect sizes for interpreting changes in health status. Med Care 1989; 27(3 suppl):S178–S189

Appendix