A STATISTICAL LOOK AT RISK FACTORS FOR CORONARY HEART DISEASE

David J. Shannon, Pfizer Central Research

SUMMARY

The overall aim of this presentation is to show how an applied statistician can explore a problem using the techniques available in SAS and SAS/GRAPH. An individual's risk of getting Coronary Heart Disease (CHD), as defined by a derived equation, was followed through from theoretical considerations to an application to clinical trials. This application showed that one drug was likely to be more favourable than another from the viewpoint of reducing the risk of getting CHD.

1. Introduction

A long-term epidemiological study (1) was set up in Framingham, Mass., U.S.A., over 30 years ago. A cohort of the population was given regular medical examinations. The information from these examinations now provides a large database which is extremely valuable to research workers who wish to look at, among other things, disease states which change relatively slowly over a specified time period.

From the data collected the Framingham team have been able to study the incidence of Coronary Heart Disease (CHD) and relate this to the various medical measurements which have been carried out. Hence they have identified a set of variables, or Risk Factors, which have been shown to be significant predictors of the risk of getting CHD.

The aim of this presentation is:

(1) To describe briefly the statistical methodology used to link the incidence of CHD with the main Risk Factors.
(2) To look at graphical displays of the inter-relationship between the Risk Factors for generated data.
(3) To apply the derived CHD Risk equation to the results of a set of clinical trials where the intervention measure was treatment to reduce blood pressure (hypertension).

2. Statistical Methodology and Risk Factors

2.1 Statistical Methodology

The statistical technique used to link the incidence of CHD with potential risk factors is Multiple Logistic Regression, which can be described as follows.

The logistic function of A takes the form 1 / (1 + exp(-A)), where

\[ A = a + \sum_{i=1}^{n} b_i X_i = a + b_1 X_1 + b_2 X_2 + \cdots + b_n X_n \]

The X's are the variables being used to identify Risk Factors, a is the intercept and the b's are the regression coefficients, which are estimated from the sample data. The form of the Multiple Logistic Regression function is

\[ P = \frac{1}{1 + \exp\left[-\left(a + \sum_{i=1}^{n} b_i X_i\right)\right]} \qquad (2.1) \]

During the course of the Framingham study the variables, X_i, were measured at regular intervals and the incidence of CHD was recorded in the intervals between these measurements. The presence or absence of a Coronary Event (1 for Present, 0 for Absent) was linked to the measured X variables. From the subsequent statistical analysis a set of significant variables, or Risk Factors, for CHD emerged.

2.2 Risk Factors

Research work following on from the Framingham study publications identified the following variables to be significant predictors of CHD Risk (Risk Factors):

1. Systolic Blood Pressure (mmHg)
2. Total Cholesterol (mg/dl)
3. HDL Cholesterol (mg/dl)
4. Cigarette Smoking (Yes or No)
5. Glucose Intolerance (Yes or No)
6. ECG - Left Ventricular Hypertrophy (Yes or No)
7. Age (years)
8. Sex (Male, Female)

Some of these Risk Factors are measured on the continuous scale (Systolic Blood Pressure, Total and HDL Cholesterol and Age) and some are dichotomous variables (Cigarette Smoking, Glucose Intolerance, ECG-LVH and Sex).

This further research work led to an equation being derived to link CHD Risk with the above listed Risk Factors, similar in form to equation 2.1.
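As an illustration of equation 2.1, the following Python sketch evaluates the multiple logistic function for a given intercept and set of coefficients. The function name `logistic_risk` and the coefficient values are hypothetical and chosen only for demonstration; they are not the Framingham estimates, which are not given in this paper.

```python
import math


def logistic_risk(intercept, coefficients, values):
    """Equation 2.1: P = 1 / (1 + exp(-(a + sum_i b_i * X_i)))."""
    if len(coefficients) != len(values):
        raise ValueError("need exactly one coefficient per risk factor")
    linear = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical intercept and coefficients (NOT the Framingham estimates).
# Two illustrative risk factors: systolic BP (mmHg) and smoker (0 = No, 1 = Yes).
A = -6.0
B = [0.02, 0.5]

low = logistic_risk(A, B, [120, 0])   # lower blood pressure, non-smoker
high = logistic_risk(A, B, [180, 1])  # higher blood pressure, smoker
print(f"low-risk profile: {100 * low:.1f}%, high-risk profile: {100 * high:.1f}%")
```

Dichotomous risk factors enter the sum as 0/1 indicators, so a positive coefficient shifts the linear predictor upwards whenever the factor is present.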
Certain restrictions have been placed on the upper and lower limits for the variables measured on the continuous scale, as follows:

                                    Male        Female
Age (years)                         35-65       45-65
Total Cholesterol (mg/dl)           185-335     185-335
HDL Cholesterol (mg/dl)             30-65       40-70
Systolic Blood Pressure (mmHg)      105-195     105-195

3. Graphical Displays of CHD Risk for Generated Data

These displays have been produced using SAS/GRAPH, PROC G3D, and the data have been generated using the basic programming facilities of SAS. Since we are looking at 8 variables here, 4 of which are continuous and 4 dichotomous, it is necessary to select a subset of all possible displays which could be obtained. It is hoped that this subset will indicate the changing risk profile over the range of values of all the risk factors.

For the 4 dichotomous variables both values have been used, i.e. Smoker or not, Male or Female, ECG-LVH or not, Glucose Intolerance or not. For the 4 continuous variables (Age, Systolic Blood Pressure, Total and HDL Cholesterol) the selection of observations to display has been made as follows:

1. Total and HDL Cholesterol have been chosen over the range of values specified by the equation.
2. Systolic Blood Pressure has been chosen at values of 120, 140, 160 and 180 mmHg.
3. Age has been chosen to be 50 years for all the above combinations. Ages of 40 and 60 years have been chosen for one particular value of each of the 3 dichotomous variables (Smoker, ECG-LVH, and Glucose Tolerant) for Males and Females.

Each of the plots shows the Risk (Probability %) of getting CHD within 6 years, displayed on the Z-axis, and Total and HDL Cholesterol, displayed on the X- and Y-axes. The graphs included for illustration are Figures 1 (1.1 & 1.2) to 3 (3.1 & 3.2). Several other graphs were produced, not displayed here, and all illustrated similar features.

The main conclusions from these displays of theoretical outcome are:

1. CHD Risk increases with increasing Blood Pressure, increasing Total Cholesterol and decreasing HDL Cholesterol. This can be seen by looking at figures 1.1 and 1.2, where the planes increase from left to right and the plane is much steeper in figure 1.2 (Sys.B.P. = 180) than in figure 1.1 (Sys.B.P. = 120).
2. CHD Risk increases for Cigarette Smokers. See figures 2.1 and 2.2.
3. CHD Risk is greater for Males than Females. See figures 3.1 and 3.2.
4. CHD Risk increases when Glucose Intolerance is present.
5. CHD Risk increases in the presence of Left Ventricular Hypertrophy.
6. CHD Risk increases with Age.

Personal observation of these theoretical displays would lead one to consider that persuading an individual to change their state (where possible), by treatment or change of life style, is likely to decrease the risk of getting CHD. The next section deals more specifically with particular clinical trials and the possible inter-relationship between Blood Pressure and Blood Lipids (Total and HDL Cholesterol), and their effect on CHD Risk.

4. The Application of the Derived CHD Risk Equation to Clinical Trials Data

4.1 General Considerations

A set of clinical trials which had been set up to assess the effect of antihypertensive therapies was considered to be suitable for the application of our derived CHD Risk Equation. The reasons for doing this were as follows:

4.1.1 The trials had generated sufficient data to enable the equation to be used successfully.

4.1.2 From a consideration of the mode of action of the 2 antihypertensive therapies used in the trials it had been hypothesised that, while both of them showed a significant antihypertensive effect (lowering blood pressure), one of them tended to have an adverse effect on blood lipids (Total and HDL Cholesterol) while the other had an advantageous effect. Hence it was considered to be of importance to establish what effect the use of these 2 drugs would have on the Risk of getting CHD.
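Looking back at the generated data of Section 3, the following Python sketch shows one way such a grid could be built: Total and HDL Cholesterol are varied over the limits listed in Section 2.2 while Systolic Blood Pressure and Age are held fixed. The original work used SAS; the helper names, the grid size and the `toy_risk` coefficients are all assumptions made for illustration and are not the derived CHD Risk equation.

```python
import math

# Limits for the continuous variables, taken from the table in Section 2.2.
LIMITS = {
    "male": {"age": (35, 65), "total_chol": (185, 335),
             "hdl_chol": (30, 65), "systolic_bp": (105, 195)},
    "female": {"age": (45, 65), "total_chol": (185, 335),
               "hdl_chol": (40, 70), "systolic_bp": (105, 195)},
}


def in_limits(sex, **measurements):
    """True if every supplied continuous variable lies within its limits."""
    bounds = LIMITS[sex]
    return all(bounds[name][0] <= value <= bounds[name][1]
               for name, value in measurements.items())


def grid(low, high, steps):
    """Evenly spaced values from low to high inclusive."""
    return [low + (high - low) * i / (steps - 1) for i in range(steps)]


def toy_risk(total_chol, hdl_chol, systolic_bp, age):
    """Placeholder logistic risk with made-up coefficients (an assumption)."""
    linear = (-9.0 + 0.01 * total_chol - 0.03 * hdl_chol
              + 0.02 * systolic_bp + 0.05 * age)
    return 1.0 / (1.0 + math.exp(-linear))


def risk_surface(sex, systolic_bp, age, risk_fn, steps=5):
    """(total_chol, hdl_chol, risk) triples over the permitted cholesterol ranges."""
    tc_lo, tc_hi = LIMITS[sex]["total_chol"]
    hdl_lo, hdl_hi = LIMITS[sex]["hdl_chol"]
    return [(tc, hdl, risk_fn(tc, hdl, systolic_bp, age))
            for tc in grid(tc_lo, tc_hi, steps)
            for hdl in grid(hdl_lo, hdl_hi, steps)]


surface = risk_surface("male", systolic_bp=120, age=50, risk_fn=toy_risk)
```

The same `in_limits` check corresponds to the rule used later for the clinical trial data, where a value outside the acceptable limits excludes a patient from the analysis.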
4.2 Information on the Clinical Trials

The trials comprised a 4-week wash-out period to eliminate the effect of previous antihypertensive therapy, followed by a 4-week single-blind placebo period and then 20 weeks of active therapy. The patients were randomised (double blind) to either Drug A or Drug B.

Variables measured which are used in the CHD Risk Equation are:

1. Systolic Blood Pressure (Supine and Standing).
2. Total Cholesterol.
3. HDL Cholesterol.
4. ECG - to assess the presence or absence of LVH.
5. Age.
6. Sex.

Information on cigarette smoking and glucose intolerance was not collected in these trials. The data were analysed on the assumption of the more severe condition (i.e. smoker and glucose intolerance) for each.

The assessment of the data prior to analysis was as follows: CHD Risk was calculated at Baseline (i.e. prior to going onto active therapy). It was calculated again at the end of 20 weeks of active therapy (Final), and the change in risk between Baseline and Final was analysed to assess treatment effect.

Restrictions on patients entering the analysis were as follows: if a patient had data missing or falling outside the acceptable limits, as given in section 2.2, at Baseline or Final for any of the 6 variables listed above, that patient was excluded from the analysis.

4.3 Analysis of the Data

Four hundred and twenty five patients were included in these trials (219 on Drug A and 206 on Drug B). Of these patients 248 (133 on Drug A and 115 on Drug B) were excluded at Baseline and 70 (39 on Drug A and 31 on Drug B) were excluded at Final evaluation, leaving 107 (47 on Drug A and 60 on Drug B) available for analysis. It needs to be emphasised again here that the trials were not designed specifically for an application of a multivariate CHD Risk equation.
This explains why only a limited number of patients are available for analysis, since an out-of-range or missing value on any one of the 6 measured variables at either Baseline or Final was sufficient to exclude a patient from analysis.

The above assessment of the data to exclude patients with data missing or outside limits was done using SAS programming facilities and the PROC PRINT and PROC TABULATE facilities. At each stage a listing and tabulation of the relevant data were obtained, and the programming facilities simplified the task of reducing the dataset to include analysable patients only.

The break-down of the patient numbers by sex and treatment is as follows:

              Drug A (%)     Drug B (%)
Male          34 (72)        41 (68)
Female        13 (28)        19 (32)
Total         47 (100)       60 (100)

This shows that the groups are comparable by sex even though there is some imbalance in overall numbers between the treatment groups.

It is of interest to look at the results obtained for the main variables being used in the CHD Risk Equation to see how these compare with the expectations observed for reduced CHD Risk in Section 3 above. Looking at Table 1 it can be seen that the 2 treatment groups are comparable at Baseline for Age, Systolic Blood Pressure, and Total and HDL Cholesterol. Considering the changes between Baseline and Final, also in Table 1 (Final - Baseline), it can be seen that:

1. The change in Systolic Blood Pressure is comparable for the 2 treatment groups.
2. Total Cholesterol is decreased for Drug A and increased for Drug B.
3. HDL Cholesterol is increased for Drug A and decreased for Drug B.

The analysis of change from Baseline was carried out using PROC TTEST and the results are shown in Table 3 (analyses 2-4). The difference between the treatment groups in change from Baseline is significant only for HDL Cholesterol (analysis 4). But, remembering the main conclusions from Section 3, this change in Cholesterol profile would tend to indicate a potential decrease in CHD Risk for the Drug A treatment group.
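The two-sample t statistics reported for these comparisons can be recomputed from the summary statistics alone. The Python sketch below does this for the HDL Cholesterol change (analysis 4), using the means, standard deviations and group sizes printed in Table 3. It assumes that the UNEQUAL row uses Satterthwaite's approximation for the degrees of freedom and that the EQUAL row uses the pooled variance; the function name `t_from_summary` is hypothetical.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistics from group means, standard deviations and sizes."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2   # squared standard errors
    diff = mean1 - mean2
    # Unequal variances: separate standard errors, Satterthwaite degrees of freedom.
    t_unequal = diff / math.sqrt(v1 + v2)
    df_unequal = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Equal variances: pooled variance, n1 + n2 - 2 degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    t_equal = diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return {"t_unequal": t_unequal, "df_unequal": df_unequal,
            "t_equal": t_equal, "df_equal": n1 + n2 - 2}


# HDL Cholesterol change from Baseline, Drug A versus Drug B (values from Table 3).
hdl = t_from_summary(0.40425532, 6.90198325, 47, -4.58333333, 5.36874718, 60)
print({name: round(value, 4) for name, value in hdl.items()})
```

With these inputs the sketch gives t statistics of about 4.08 (unequal variances, about 85 DF) and 4.21 (equal variances, 105 DF), in line with the values printed for analysis 4.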
Applying the CHD Risk equation to the data gave the following results.

Figures 4.1 and 4.2 show the Risk (%) of getting CHD pre and post treatment for Drug A and Drug B respectively. The diagonal lines on the graphs are the lines of "no change" pre and post treatment. Points above the line indicate increased Risk and points below the line decreased Risk. It is clear that a larger proportion of points lie below the line for Drug A whereas the opposite is true for Drug B. This is confirmed by a closer look at the data, which shows that 32/47 (68%) had decreased Risk in the Drug A group compared with 22/60 (37%) in the Drug B group. A further point of interest is that 22/47 (47%) on Drug A showed a decreased Risk of greater than 20% compared with 6/60 (10%) on Drug B.

The above is an exploratory data analysis indicating what is happening in the 2 treatment groups over time. Now the formal statistical analysis, using PROC TTEST, is considered.

The analysis of change in Risk from Baseline to Final was carried out on the Log Odds Ratio,

\[ \ln\left[ \frac{P_B / (1 - P_B)}{P_F / (1 - P_F)} \right] \]

where P_B and P_F are the Risk of getting CHD at Baseline and Final respectively.

The summary statistics are given in Table 2, showing further that Drug A has a more favourable Risk profile than Drug B. The analysis of the Log Odds Ratio showed that the Drug A group had significantly reduced (p = 0.0003) CHD Risk compared with Drug B. This is shown in analysis 1 of Table 3. The distribution of change in Risk for the 2 treatment groups is shown in figures 5.1 and 5.2 for Drugs A and B respectively.

This formal analysis confirms what has been observed in the exploratory data analysis.

Acknowledgement

I would like to thank my colleagues, especially Dr. P. Berry and Mr. G. Downing, for advice and help given in preparing this presentation.

David J. Shannon, Computational Sciences Dept., Pfizer Central Research, Sandwich, Kent, U.K.

References

1. Gordon T. & Kannel W.B., Multiple risk functions for predicting coronary heart disease: The concept, accuracy and application, American Heart Journal, 1982, Vol. 103, No. 6, Pages 1031-1039.
2. SAS User's Guide: Basics, Version 5, SAS Institute Inc., 1985.
3. SAS User's Guide: Statistics, Version 5, SAS Institute Inc., 1985.
4. SAS/GRAPH User's Guide, Version 5, SAS Institute Inc., 1985.

Notes

1. There is an extensive literature available for the Framingham Study. Many of the relevant papers are listed in the references given in Reference 1 above.
2. All SAS and SAS/GRAPH jobs run to complete this presentation used SAS Version 5.03 (VMS) on the VAX 8800 installed in the Computational Sciences Dept., Pfizer Central Research, U.K.
3. A comment (or plea) from a SAS End User (statistician) to SAS Institute. It is of value to note how relatively easy it is to display summary statistics clearly using PROC TABULATE (Tables 1 & 2). This contrasts strongly with the inability to alter the format of PROC TTEST output (Table 3), which has very limited use as a table which can be put directly into a report. The major weakness of SAS statistical procedures, for applied statisticians at least, is, in general, their inability to allow the user easy access to user-defined formatting for reporting purposes.

TABLE 1
SUMMARY STATISTICS FOR C.H.D. RISK FACTORS
FOR: SMOKER, GLUCOSE INTOLERANCE, STANDING SYSTOLIC BLOOD PRESSURE
MEANS FOR AGE, STANDING SYSTOLIC BLOOD PRESSURE, TOTAL & HDL CHOLESTEROL

                     |           Drug A            |           Drug B            |
                     |  Mean   | Std.Err. | Number |  Mean   | Std.Err. | Number |
---------------------+---------+----------+--------+---------+----------+--------|
BASELINE             |         |          |        |         |          |        |
AGE                  |   49.4  |    1.3   |   47   |   50.3  |    1.0   |   60   |
SYSTOLIC B.P.        |  152.8  |    2.5   |   47   |  150.7  |    1.9   |   60   |
TOTAL CHOLESTEROL    |  248.5  |    5.0   |   47   |  245.0  |    4.5   |   60   |
HDL CHOLESTEROL      |   48.7  |    1.5   |   47   |   51.0  |    1.1   |   60   |
---------------------+---------+----------+--------+---------+----------+--------|
FINAL                |         |          |        |         |          |        |
SYSTOLIC B.P.        |  141.5  |    2.7   |   47   |  139.4  |    2.2   |   60   |
TOTAL CHOLESTEROL    |  243.1  |    4.0   |   47   |  246.9  |    4.3   |   60   |
HDL CHOLESTEROL      |   49.1  |    1.5   |   47   |   46.4  |    1.1   |   60   |
---------------------+---------+----------+--------+---------+----------+--------|
FINAL - BASELINE     |         |          |        |         |          |        |
SYSTOLIC B.P.        |  -11.3  |    2.6   |   47   |  -11.3  |    1.6   |   60   |
TOTAL CHOLESTEROL    |   -5.5  |    4.0   |   47   |    1.8  |    3.5   |   60   |
HDL CHOLESTEROL      |    0.4  |    1.0   |   47   |   -4.6  |    0.7   |   60   |

TABLE 2
MEANS FOR THE RISK (%) OF GETTING CORONARY HEART DISEASE
FOR: SMOKER, GLUCOSE INTOLERANCE, STANDING SYSTOLIC BLOOD PRESSURE
MEANS FOR BASELINE, FINAL AND % CHANGE FROM BASELINE TO FINAL

                         |           Drug A            |           Drug B            |
                         |  Mean   | Std.Err. | Number |  Mean   | Std.Err. | Number |
-------------------------+---------+----------+--------+---------+----------+--------|
BASELINE                 |   8.85  |   1.09   |   47   |   7.84  |   0.86   |   60   |
FINAL                    |   8.17  |   1.14   |   47   |   8.31  |   0.86   |   60   |
% CHANGE FROM BASELINE   |  -8.36  |   4.97   |   47   |  15.17  |   4.50   |   60   |

TABLE 3
ANALYSIS OF THE RISK (%) OF GETTING CORONARY HEART DISEASE
FOR: SMOKER, GLUCOSE INTOLERANCE, STANDING SYSTOLIC BLOOD PRESSURE
CHANGE FROM BASELINE TO FINAL
TTEST PROCEDURE

ANALYSIS 1
VARIABLE: LODDSRAT    Log Odds Ratio

TREAT     N           MEAN       STD DEV     STD ERROR         MINIMUM        MAXIMUM
Drug A   44     0.10369894    0.29476230    0.04443709     -0.45400667     1.08760299
Drug B   59    -0.05312537    0.18147023    0.02362541     -0.75970251     0.35597496

VARIANCES         T       DF    PROB > |T|
UNEQUAL      3.1161     66.8        0.0027
EQUAL        3.3299    101.0        0.0012

FOR H0: VARIANCES ARE EQUAL, F' = 2.64 WITH 43 AND 58 DF    PROB > F' = 0.0006

ANALYSIS 2
VARIABLE: STDSYSD    SYSTOLIC B.P.

TREAT     N           MEAN       STD DEV     STD ERROR         MINIMUM        MAXIMUM
Drug A   47   -11.27659574   17.92945370    2.61527961    -73.00000000    37.00000000
Drug B   60   -11.31666667   12.55292750    1.62057597    -54.00000000    17.00000000

VARIANCES         T       DF    PROB > |T|
UNEQUAL      0.0130     79.0        0.9896
EQUAL        0.0136    105.0        0.9892

FOR H0: VARIANCES ARE EQUAL, F' = 2.04 WITH 46 AND 59 DF    PROB > F' = 0.0101

ANALYSIS 3
VARIABLE: T CHOLD    TOTAL CHOLESTEROL

TREAT     N           MEAN       STD DEV     STD ERROR         MINIMUM        MAXIMUM
Drug A   47    -5.46808511   27.45973421    4.00541390    -62.00000000    65.00000000
Drug B   60     1.83333333   27.17935824    3.50884006   -113.00000000    78.00000000

VARIANCES         T       DF    PROB > |T|
UNEQUAL     -1.3712     98.5        0.1734
EQUAL       -1.3729    105.0        0.1727

FOR H0: VARIANCES ARE EQUAL, F' = 1.02 WITH 46 AND 59 DF    PROB > F' = 0.9323

ANALYSIS 4
VARIABLE: HDL CH D    HDL CHOLESTEROL

TREAT     N           MEAN       STD DEV     STD ERROR         MINIMUM        MAXIMUM
Drug A   47     0.40425532    6.90198325    1.00675773    -13.00000000    17.00000000
Drug B   60    -4.58333333    5.36874718    0.69310228    -18.00000000     7.00000000

VARIANCES         T       DF    PROB > |T|
UNEQUAL      4.0806     85.0        0.0001
EQUAL        4.2057    105.0        0.0001

FOR H0: VARIANCES ARE EQUAL, F' = 1.65 WITH 46 AND 59 DF    PROB > F' = 0.0690

RISK OF GETTING C.H.D.
[Figures 1.1 to 3.2: surface plots of Risk (%) against Tot. Chol. (185.0 to 330.0) and HDL Chol. (36.00 to 65.00)]

FIGURE 1.1: SYSTOLIC B.P. = 120
MALE * SMOKER * ECG-LVH * GLUCOSE INTOLERANT * AGE = 50

FIGURE 1.2: SYSTOLIC B.P. = 160
MALE * SMOKER * ECG-LVH * GLUCOSE INTOLERANT * AGE = 50

FIGURE 2.1: SMOKER
MALE * ECG-LVH * GLUCOSE INTOLERANT * AGE = 50 * Sys.B.P. = 180

FIGURE 2.2: NON-SMOKER
MALE * ECG-LVH * GLUCOSE INTOLERANT * AGE = 50 * Sys.B.P. = 180

FIGURE 3.1: MALE
SMOKER * ECG-LVH * GLUCOSE INTOLERANT * AGE = 50 * SYS.B.P. = 160

FIGURE 3.2: FEMALE
SMOKER * ECG-LVH * GLUCOSE INTOLERANT * AGE = 50 * SYS.B.P. = 160

Risk (%) of Getting CHD Pre and Post Treatment
[Figures 4.1 and 4.2: scatter plots of Post-Treatment against Pre-Treatment Risk (%), both axes 0 to 30]

Figure 4.1: Drug A

Figure 4.2: Drug B

[Figures 5.1 and 5.2: histograms of FREQUENCY against % Change from Baseline]

FIGURE 5.1: CHD RISK % CHANGE FROM BASELINE FOR DRUG A

FIGURE 5.2: CHD RISK % CHANGE FROM BASELINE FOR DRUG B
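The Log Odds Ratio used for the formal analysis in Section 4.3 can be sketched in Python as follows. The orientation (Baseline odds divided by Final odds, so that a positive value means a reduction in Risk) is an assumption inferred from the signs of the means in Tables 2 and 3; the function names and the example risks are hypothetical and are not patient data from the trials.

```python
import math


def log_odds(p):
    """Natural log of the odds p / (1 - p) for a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("risk must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def log_odds_ratio(p_baseline, p_final):
    """ln[(P_B / (1 - P_B)) / (P_F / (1 - P_F))]; positive when Risk has fallen."""
    return log_odds(p_baseline) - log_odds(p_final)


# Hypothetical patient: Risk falls from 10% at Baseline to 8% at Final.
improved = log_odds_ratio(0.10, 0.08)
# Hypothetical patient: Risk rises from 8% at Baseline to 10% at Final.
worsened = log_odds_ratio(0.08, 0.10)
print(f"improved: {improved:.4f}, worsened: {worsened:.4f}")
```

Working on the log odds scale makes a fall and an equal rise symmetric about zero, which suits a two-sample t-test on the per-patient values.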