A STATISTICAL LOOK AT RISK FACTORS FOR CORONARY HEART DISEASE
David J. Shannon, Pfizer Central Research
SUMMARY
The overall aim of this presentation is to show how an applied
statistician can explore a problem using the techniques available
in SAS and SAS/GRAPH.
The problem of considering an individual's risk of getting Coronary
Heart Disease (CHD) as defined by a derived equation was
followed through from theoretical considerations to an application
to clinical trials.
This application showed that one drug was likely to be more favourable
than another from the viewpoint of reducing the risk of getting CHD.
1. Introduction
A long-term epidemiological study (1) was set up in Framingham,
Mass., U.S.A., over 30 years ago. A cohort of the population was
given regular medical examinations. The information from these
regular examinations now provides a large database which is
extremely valuable to research workers who wish to look at, among
other things, disease states which change relatively slowly over
a specified time period.
From the data collected the Framingham team have been able to
study the incidence of Coronary Heart Disease (CHD) and relate
this to the various medical measurements which have been carried
out.
Hence they have identified a set of variables, or Risk Factors,
which have been shown to be significant predictors for
calculating the risk of getting CHD.
The aim of this presentation is
(1) To describe briefly the statistical methodology used to
link the incidence of CHD with the main Risk Factors.
(2) To look at graphical displays of the inter-relationship
between the Risk Factors for generated data.
(3) To apply the derived CHD Risk equation to the results
of a set of clinical trials where the intervention measure
was treatment to reduce blood pressure (hypertension).
2. Statistical Methodology and Risk Factors
2.1 Statistical Methodology
The statistical technique used to link the incidence of CHD
with potential risk factors is Multiple Logistic Regression,
which can be described as follows:
The logistic function takes the form
$$ \frac{1}{1 + \exp(-A)} , $$
where A takes the form
$$ A = a + \sum_{i=1}^{n} b_i X_i = a + b_1 X_1 + b_2 X_2 + \cdots $$
The $X_i$ ($i = 1, \ldots, n$) are the variables being used to identify
Risk Factors, a is the intercept and the $b_i$ are the regression
coefficients which are estimated from the sample data.
The form of the Multiple Logistic Regression function is
$$ P = \frac{1}{1 + \exp\left[ -\left( a + \sum_{i=1}^{n} b_i X_i \right) \right]} \qquad (2.1) $$
During the course of the Framingham study the variables,
$X_i$, were measured at regular intervals and the incidence
of CHD was recorded in the intervals between these
measurements. The presence or absence of a Coronary Event
( 1 for Present, 0 for Absent) was linked to the measured
X variables. From the subsequent statistical analysis a
set of significant variables or Risk Factors for CHD
emerged.
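As a minimal sketch of equation 2.1, written in Python rather than the SAS used for this presentation, the risk could be evaluated as below. The function name, the intercept and the coefficients are hypothetical placeholders chosen only for illustration; they are not the Framingham estimates.

```python
import math

def logistic_risk(intercept, coefficients, values):
    """Evaluate P = 1 / (1 + exp(-(a + sum(b_i * x_i)))), as in equation 2.1."""
    if len(coefficients) != len(values):
        raise ValueError("one coefficient is needed per variable")
    linear = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical intercept and coefficients, for illustration only.
a = -5.0
b = [0.02, 0.5]   # e.g. one continuous and one dichotomous variable

# Risk for a hypothetical subject with X1 = 150 and X2 = 1.
p = logistic_risk(a, b, [150.0, 1.0])
print(round(p, 4))
```

With positive coefficients the computed risk rises as either variable increases, which is the behaviour a fitted Risk Factor would be expected to show.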
2.2 Risk Factors
Research work following on from the Framingham study publications
identified the following variables to be significant predictors
of CHD Risk (Risk Factors):
1. Systolic Blood Pressure (mmHg)
2. Total Cholesterol (mg/dl)
3. HDL Cholesterol (mg/dl)
4. Cigarette Smoking (Yes or No)
5. Glucose Intolerance (Yes or No)
6. ECG - Left Ventricular Hypertrophy (Yes or No)
7. Age (years)
8. Sex (Male, Female)
Some of these Risk Factors are measured on the continuous
scale (Systolic Blood Pressure, Total and HDL Cholesterol
and Age) and some are dichotomous variables (Cigarette
Smoking, Glucose Intolerance, ECG-LVH and Sex).
This further research work led to an equation being derived
to link CHD Risk with the above listed Risk Factors, similar
in form to equation 2.1.
Certain restrictions have been placed on the upper and
lower limits for the variables measured on the continuous
scale as follows:
                                  Male       Female
Age (years)                       35-65      45-65
Total Cholesterol (mg/dl)         185-335    185-335
HDL Cholesterol (mg/dl)           30-65      40-70
Systolic Blood Pressure (mmHg)    105-195    105-195
3. Graphical Displays of CHD Risk for Generated Data
These displays have been done using SAS/GRAPH, PROC G3D, and
the data have been generated using the basic programming
facilities of SAS.
Since we are looking at 8 variables here, 4 of which are
continuous and 4 dichotomous, it is necessary to select a
subset of all possible displays which could be obtained.
It is hoped that this subset will be able to indicate the
changing risk profile over the range of values of all the
risk factors.
For the 4 dichotomous variables both values have been used,
i.e. Smoker or not, Male or Female, ECG-LVH or not, Glucose
Intolerance or not. For the 4 continuous variables (Age,
Systolic Blood Pressure, Total and HDL Cholesterol) the
selection of observations to display has been made as follows:
1. Total and HDL Cholesterol have been chosen over the range
of values specified by the equation.
2. Systolic Blood Pressure has been chosen at values of 120,
140, 160 and 180 mmHg.
3. Age has been chosen to be 50 years for all the above
combinations. Ages of 40 and 60 years have been chosen
for one particular value of each of the 3 dichotomous
variables (Smoker, ECG-LVH, and Glucose Tolerant)
for Males and Females.
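The selection above amounts to generating a grid of input values. The following Python sketch shows one way such a grid might be built; the original data were generated with SAS programming statements that are not reproduced in this paper. The helper `frange`, the dictionary keys and the number of grid steps are assumptions made for illustration, and the cholesterol ranges are the male limits from section 2.2.

```python
from itertools import product

def frange(start, stop, steps):
    """Return `steps` evenly spaced values from start to stop inclusive."""
    width = (stop - start) / (steps - 1)
    return [start + i * width for i in range(steps)]

# Ranges taken from the male limits in section 2.2; step counts are assumed.
TOTAL_CHOL = frange(185.0, 335.0, 7)   # mg/dl, steps of 25
HDL_CHOL = frange(30.0, 65.0, 8)       # mg/dl, steps of 5
SYS_BP = [120, 140, 160, 180]          # mmHg, as listed in the text
AGE = 50                               # years

# One record per combination of blood pressure and the two cholesterol values.
grid = [
    {"age": AGE, "sbp": sbp, "total_chol": tc, "hdl_chol": hdl}
    for sbp, tc, hdl in product(SYS_BP, TOTAL_CHOL, HDL_CHOL)
]
print(len(grid))  # 4 * 7 * 8 = 224
```

Each record would then be passed through the risk equation to give the height of the surface at that point.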
Each of the plots shows the Risk (Probability %) of getting
CHD within 6 years, displayed on the Z-axis, and Total &
HDL Cholesterol, displayed on the X- and Y-axes.
The graphs included for illustration are Figures 1.1 and 1.2,
2.1 and 2.2, and 3.1 and 3.2. Several other graphs were produced
but are not displayed here; all illustrated similar features.
The main conclusions from these displays of theoretical outcome
are:
1. CHD Risk increases with increasing Blood Pressure, increasing
Total Cholesterol and decreasing HDL Cholesterol.
This can be seen by looking at figures 1.1 and 1.2, where
the planes rise from left to right and the plane is much
steeper in figure 1.2 (Sys.B.P. = 180) than in figure 1.1
(Sys.B.P. = 120).
2. CHD Risk increases for Cigarette Smokers.
See figures 2.1 and 2.2.
3. CHD Risk is greater for Males than Females.
See figures 3.1 and 3.2.
4. CHD Risk increases when Glucose Intolerance is present.
5. CHD Risk increases in the presence of Left Ventricular
Hypertrophy.
6. CHD Risk increases with Age.
Personal observations of these theoretical displays would lead
one to consider that persuading an individual to change their
state ( where possible ) by treatment or change of life style
is likely to decrease the risk of getting CHD.
The next section deals more specifically with particular clinical
trials and the possible inter-relationship between Blood Pressure
and Blood Lipids (Total and HDL Cholesterol), and, their effect
on CHD Risk.
4. The Application of the Derived CHD Risk Equation to Clinical Trials Data
4.1 General Considerations
A set of clinical trials which had been set up to assess the effect
of antihypertensive therapies was considered to be suitable for
the application of our derived CHD Risk Equation. The reasons
for doing this were as follows :
4.1.1 The trials had generated sufficient data to enable the
equation to be used successfully.
4.1.2 From a consideration of the mode of action of the 2
antihypertensive therapies used in the trials it had been
hypothesised that, while both of them showed significant
antihypertensive effect (lowering blood pressure), one of
them tended to have an adverse effect on blood lipids
(Total and HDL Cholesterol) while the other had an
advantageous effect. Hence it was considered to be of
importance to establish what effect the use of these 2
drugs would have on the Risk of getting CHD.
4.2 Information on the Clinical Trials
Each trial began with a 4 week wash-out period to eliminate the
effect of previous antihypertensive therapy, followed by a 4 week
single blind placebo period and then 20 weeks of active therapy.
The patients were randomised (double blind) to either Drug A or
Drug B.
Variables measured which are used in the CHD Risk Equation are:
1. Systolic Blood Pressure (Supine and Standing).
2. Total Cholesterol.
3. HDL Cholesterol.
4. ECG - to assess the presence or absence of LVH.
5. Age.
6. Sex.
Information on cigarette smoking and glucose intolerance was
not collected in these trials. The data were therefore analysed
on the assumption of the more severe condition for each (i.e.
smoker and glucose intolerant).
The assessment of the data prior to analysis was as follows :
CHD Risk was calculated at Baseline (i.e. prior to going onto
active therapy). It was calculated again at the end of 20 weeks
active therapy (Final), and, the change in risk between Baseline
and Final was analysed to assess treatment effect.
Restrictions to patients entering the analysis were as follows :
If a patient had data missing or falling outside the acceptable
limits, as given in section 2.2, at Baseline or Final for any of
the 6 variables listed above that patient was excluded from the
analysis.
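The exclusion rule can be sketched in Python as follows. This is an illustration only, not the SAS code used for the trials: the field names are hypothetical, the limits are those of section 2.2, and treating the limits as inclusive is an assumption. Only sex and the four continuous variables are checked here; a missing ECG assessment would need the same treatment.

```python
# Acceptable limits from section 2.2 as (low, high), assumed inclusive, by sex.
LIMITS = {
    "M": {"age": (35, 65), "total_chol": (185, 335),
          "hdl_chol": (30, 65), "sbp": (105, 195)},
    "F": {"age": (45, 65), "total_chol": (185, 335),
          "hdl_chol": (40, 70), "sbp": (105, 195)},
}

def in_range(record):
    """True if sex is known and every continuous variable is present and within limits."""
    limits = LIMITS.get(record.get("sex"))
    if limits is None:
        return False
    for name, (low, high) in limits.items():
        value = record.get(name)
        if value is None or not (low <= value <= high):
            return False
    return True

def analysable(baseline, final):
    """A patient is kept only if both the Baseline and Final records pass."""
    return in_range(baseline) and in_range(final)
```

A patient record with a single missing or out-of-range value at either visit is rejected, which is why the rule removes so many patients.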
4.3 Analysis of the Data
Four hundred and twenty five patients were included in these trials
(219 on Drug A and 206 on Drug B). Of these patients 248 (133 on
Drug A and 115 on Drug B) were excluded at Baseline and 70 (39 on
Drug A and 31 on Drug B) were excluded at Final evaluation leaving
107 (47 on Drug A and 60 on Drug B) available for analysis.
It needs to be emphasised again here that the trials were not
designed specifically for an application of a multivariate CHD
Risk equation. This explains why only a limited number of patients
are available for analysis since an out of range or missing value
on any one of the 6 measured variables at either Baseline or Final
was sufficient to exclude a patient from analysis.
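The patient numbers quoted above can be checked with a few lines of Python; the dictionary names are illustrative only.

```python
# Patient counts as quoted in the text, by treatment group.
enrolled = {"A": 219, "B": 206}
excluded_baseline = {"A": 133, "B": 115}
excluded_final = {"A": 39, "B": 31}

# Patients remaining after both rounds of exclusion.
available = {
    drug: enrolled[drug] - excluded_baseline[drug] - excluded_final[drug]
    for drug in enrolled
}
print(available, sum(available.values()))  # {'A': 47, 'B': 60} 107
```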
The above assessment of the data to exclude patients with data
missing or outside limits was done using SAS programming
facilities and the PROC PRINT and PROC TABULATE facilities. At
each stage a listing and tabulation of the relevant data were
obtained and the programming facilities simplified the task of
reducing the dataset to include analysable patients only.
An assessment of the break-down of the patient numbers by sex and
treatment is as follows:
            Drug A (%)     Drug B (%)
Male        34  (72)       41  (68)
Female      13  (28)       19  (32)
Total       47 (100)       60 (100)
This shows that the groups are comparable by sex even though there
is some imbalance in overall numbers between the treatment groups.
It is of interest to look at the results obtained for the main
variables being used in the CHD Risk Equation to see how these
compare with the expectations observed for reduced CHD Risk in
Section 3 above. Looking at Table 1 it can be seen that the 2
treatment groups are comparable at Baseline for Age, Systolic
Blood pressure, and, Total and HDL Cholesterol.
Considering the changes between Baseline and Final, also in
Table 1 (Final-Baseline), it can be seen that:
1. The change in Systolic Blood Pressure is comparable for the 2
treatment groups.
2. Total Cholesterol is decreased for Drug A and increased for
Drug B.
3. HDL Cholesterol is increased for Drug A and decreased for Drug
B.
The changes from Baseline were compared between the 2 treatment
groups using PROC TTEST. The results are shown in Table 3
(analyses 2-4); the only significant difference between the
groups is for HDL Cholesterol (analysis 4).
But, remembering the main conclusions from Section 3, this change
in Cholesterol profile would tend to indicate a potential decrease
in CHD Risk for the Drug A treatment group.
Applying the CHD Risk equation to the data gave the following
results
Figures 4.1 and 4.2 show the Risk (%) of getting CHD pre and post
treatment for Drug A and Drug B respectively. The diagonal lines
on the graph are the lines of "no change" pre and post treatment.
Points above the line indicate increased Risk and points below
the line decreased Risk. It is clear that a larger proportion of
points lie below the line for Drug A whereas the opposite is
true for Drug B.
This is confirmed by a closer look at the data which shows that
32/47 (68%) had decreased Risk in the Drug A group compared
with 22/60 (37%) in the Drug B group. A further point of interest
is that 22/47 (47%) on Drug A showed a decreased Risk of greater
than 20% compared with 6/60 (10%) on Drug B.
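Counts of this kind can be obtained by comparing each patient's pre and post treatment risk. The Python sketch below assumes that "a decreased Risk of greater than 20%" means a relative fall of more than 20% of the Baseline value; the function name and the four example patients are hypothetical and are not trial data.

```python
def summarise_change(pre, post, threshold=20.0):
    """Count patients whose risk fell at all, and whose risk fell by more
    than `threshold` percent of the pre-treatment value."""
    decreased = 0
    large_decrease = 0
    for p_pre, p_post in zip(pre, post):
        if p_post < p_pre:              # point lies below the "no change" line
            decreased += 1
        pct_change = 100.0 * (p_post - p_pre) / p_pre
        if pct_change < -threshold:
            large_decrease += 1
    return decreased, large_decrease

# Hypothetical risks (%) for four patients, for illustration only.
pre = [10.0, 8.0, 5.0, 12.0]
post = [7.0, 7.5, 6.0, 12.0]
print(summarise_change(pre, post))  # (2, 1)
```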
The above is an exploratory data analysis indicating what is
happening in the 2 treatment groups over time. Now, the formal
statistical analysis, using PROC TTEST, is considered.
The analysis of change in Risk from Baseline to Final was carried
out on the Log Odds Ratio,
$$ \ln\left( \frac{P_1 / (1 - P_1)}{P_2 / (1 - P_2)} \right) , $$
where $P_1$ and $P_2$ are the Risk of getting CHD at Baseline and
Final respectively.
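A minimal Python sketch of this quantity is given below. It assumes the ratio is taken as Baseline odds over Final odds, so that a positive value indicates reduced Risk; that reading is consistent with the positive mean for Drug A in Table 3. The function name is illustrative, and the risks are expressed as fractions rather than percentages.

```python
import math

def log_odds_ratio(p_baseline, p_final):
    """Natural log of (Baseline odds / Final odds); inputs are fractions in (0, 1)."""
    odds_baseline = p_baseline / (1.0 - p_baseline)
    odds_final = p_final / (1.0 - p_final)
    return math.log(odds_baseline / odds_final)

# A hypothetical patient whose risk falls from 10% to 8%.
print(log_odds_ratio(0.10, 0.08) > 0)  # True
```

Working on the log odds scale removes the bounds at 0 and 1 that apply to the Risk itself, which makes the t-test assumptions easier to justify.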
The summary statistics are given in Table 2 showing further that
Drug A has a more favourable Risk profile than Drug B. The
analysis of the Log Odds Ratio showed that the Drug A group had
significantly reduced (p = 0.0003) CHD Risk compared with Drug B.
This is shown in analysis 1 of Table 3. The distribution of
change in Risk for the 2 treatment groups is shown in figures 5.1
and 5.2 for Drugs A and B respectively.
This formal analysis confirms what has been observed in the
exploratory data analysis.
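The T and DF values reported for the Log Odds Ratio can be checked from the summary statistics in Table 3 (analysis 1). The Python sketch below uses the standard pooled-variance formula and the Satterthwaite approximation for unequal variances, which are assumed here to be the calculations behind the PROC TTEST output; the function name is illustrative.

```python
import math

def two_sample_t(n1, mean1, sd1, n2, mean2, sd2):
    """Return (t_equal, df_equal, t_unequal, df_unequal) from summary statistics."""
    v1, v2 = sd1 ** 2, sd2 ** 2
    diff = mean1 - mean2
    # Pooled-variance (equal variances) form.
    df_equal = n1 + n2 - 2
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df_equal
    t_equal = diff / math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    # Unequal-variances form with the Satterthwaite approximation for DF.
    a, b = v1 / n1, v2 / n2
    t_unequal = diff / math.sqrt(a + b)
    df_unequal = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t_equal, df_equal, t_unequal, df_unequal

# Log Odds Ratio summary statistics for Drug A and Drug B from Table 3.
result = two_sample_t(44, 0.10369894, 0.29476230, 59, -0.05312537, 0.18147023)
print([round(x, 2) for x in result])
```

The computed values agree with the T statistics (3.3299 and 3.1161) and degrees of freedom (101.0 and 66.8) shown in Table 3 to the precision printed there.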
Acknowledgement
I would like to thank my colleagues, especially Dr. P. Berry
and Mr. G. Downing, for advice and help given in preparing this
presentation.
David J.Shannon,
Computational Sciences Dept.,
Pfizer Central Research,
Sandwich, Kent, U.K.
References
1. Gordon T. & Kannel W.B., Multiple risk functions for predicting
coronary heart disease: The concept, accuracy and application,
American Heart Journal, 1982, Vol. 103, No. 6, Pages 1031-1039.
2. SAS User's Guide: Basics, Version 5, SAS Institute Inc., 1985.
3. SAS User's Guide: Statistics, Version 5, SAS Institute Inc., 1985.
4. SAS/GRAPH User's Guide, Version 5, SAS Institute Inc., 1985.
Notes
1. There is an extensive literature set available for the
Framingham Study. Many of the relevant papers are listed
in the references given in Reference 1 above.
2. All SAS and SAS/GRAPH jobs run to complete this presentation
used SAS Version 5.03 (VMS) on the VAX 8800 installed in
the Computational Sciences Dept., Pfizer Central Research, U.K.
3. A comment (or plea) from a SAS End User (statistician) to
SAS Institute.
It is of value to note how relatively easy it is to display
clearly summary statistics using PROC TABULATE (Tables 1 & 2).
This contrasts strongly with the inability to alter the format
of PROC TTEST (Table 3) which has very limited use as a table
which can be put directly into a report.
The major weakness of SAS statistical procedures, for applied
statisticians at least, is, in general, their inability to
allow the user easy access to user defined formatting for
reporting purposes.
TABLE 1 : SUMMARY STATISTICS FOR C.H.D. RISK FACTORS
FOR: SMOKER, GLUCOSE INTOLERANCE, STANDING SYSTOLIC BLOOD PRESSURE
MEANS FOR AGE, STANDING SYSTOLIC BLOOD PRESSURE, TOTAL & HDL CHOLESTEROL

                    |                      TREATMENT                      |
                    |          Drug A          |          Drug B          |
                    |  Mean  |Std.Err.| Number |  Mean  |Std.Err.| Number |
--------------------+--------+--------+--------+--------+--------+--------|
BASELINE            |        |        |        |        |        |        |
AGE                 |   49.4 |    1.3 |     47 |   50.3 |    1.0 |     60 |
SYSTOLIC B.P.       |  152.8 |    2.5 |     47 |  150.7 |    1.9 |     60 |
TOTAL CHOLESTEROL   |  248.5 |    5.0 |     47 |  245.0 |    4.5 |     60 |
HDL CHOLESTEROL     |   48.7 |    1.5 |     47 |   51.0 |    1.1 |     60 |
--------------------+--------+--------+--------+--------+--------+--------|
FINAL               |        |        |        |        |        |        |
SYSTOLIC B.P.       |  141.5 |    2.7 |     47 |  139.4 |    2.2 |     60 |
TOTAL CHOLESTEROL   |  243.1 |    4.0 |     47 |  246.9 |    4.3 |     60 |
HDL CHOLESTEROL     |   49.1 |    1.5 |     47 |   46.4 |    1.1 |     60 |
--------------------+--------+--------+--------+--------+--------+--------|
FINAL - BASELINE    |        |        |        |        |        |        |
SYSTOLIC B.P.       |  -11.3 |    2.6 |     47 |  -11.3 |    1.6 |     60 |
TOTAL CHOLESTEROL   |   -5.5 |    4.0 |     47 |    1.8 |    3.5 |     60 |
HDL CHOLESTEROL     |    0.4 |    1.0 |     47 |   -4.6 |    0.7 |     60 |
--------------------+--------+--------+--------+--------+--------+--------|
TABLE 2 : MEANS FOR THE RISK (%) OF GETTING CORONARY HEART DISEASE
FOR: SMOKER, GLUCOSE INTOLERANCE, STANDING SYSTOLIC BLOOD PRESSURE
MEANS FOR BASELINE, FINAL AND % CHANGE FROM BASELINE TO FINAL

                        |                      TREATMENT                      |
                        |          Drug A          |          Drug B          |
                        |  Mean  |Std.Err.| Number |  Mean  |Std.Err.| Number |
------------------------+--------+--------+--------+--------+--------+--------|
BASELINE                |   8.85 |   1.09 |     47 |   7.84 |   0.86 |     60 |
FINAL                   |   8.17 |   1.14 |     47 |   8.31 |   0.86 |     60 |
% CHANGE FROM BASELINE  |  -8.36 |   4.97 |     47 |  15.17 |   4.50 |     60 |
------------------------+--------+--------+--------+--------+--------+--------|
TABLE 3 : ANALYSIS OF THE RISK (%) OF GETTING CORONARY HEART DISEASE
FOR: SMOKER, GLUCOSE INTOLERANCE, STANDING SYSTOLIC BLOOD PRESSURE
CHANGE FROM BASELINE TO FINAL
TTEST PROCEDURE

ANALYSIS 1
VARIABLE: LODDSRAT    Log Odds Ratio
TREAT      N           MEAN       STD DEV     STD ERROR         MINIMUM        MAXIMUM
Drug A    44     0.10369894    0.29476230    0.04443709     -0.45400667     1.08760299
Drug B    59    -0.05312537    0.18147023    0.02362541     -0.75970251     0.35597496
VARIANCES         T       DF    PROB > |T|
UNEQUAL      3.1161     66.8        0.0027
EQUAL        3.3299    101.0        0.0012
FOR H0: VARIANCES ARE EQUAL, F' = 2.64 WITH 43 AND 58 DF    PROB > F' = 0.0006

ANALYSIS 2
VARIABLE: STDSYSD    SYSTOLIC B.P.
TREAT      N           MEAN       STD DEV     STD ERROR         MINIMUM        MAXIMUM
Drug A    47   -11.27659574   17.92945370    2.61527961    -73.00000000    37.00000000
Drug B    60   -11.31666667   12.55292750    1.62057597    -54.00000000    17.00000000
VARIANCES         T       DF    PROB > |T|
UNEQUAL      0.0130     79.0        0.9896
EQUAL        0.0136    105.0        0.9892
FOR H0: VARIANCES ARE EQUAL, F' = 2.04 WITH 46 AND 59 DF    PROB > F' = 0.0101

ANALYSIS 3
VARIABLE: T_CHOLD    TOTAL CHOLESTEROL
TREAT      N           MEAN       STD DEV     STD ERROR         MINIMUM        MAXIMUM
Drug A    47    -5.46808511   27.45973421    4.00541390    -62.00000000    65.00000000
Drug B    60     1.83333333   27.17935824    3.50884006   -113.00000000    78.00000000
VARIANCES         T       DF    PROB > |T|
UNEQUAL     -1.3712     98.5        0.1734
EQUAL       -1.3729    105.0        0.1727
FOR H0: VARIANCES ARE EQUAL, F' = 1.02 WITH 46 AND 59 DF    PROB > F' = 0.9323

ANALYSIS 4
VARIABLE: HDL_CH_D    HDL CHOLESTEROL
TREAT      N           MEAN       STD DEV     STD ERROR         MINIMUM        MAXIMUM
Drug A    47     0.40425532    6.90198325    1.00675773    -13.00000000    17.00000000
Drug B    60    -4.58333333    5.36874718    0.69310228    -18.00000000     7.00000000
VARIANCES         T       DF    PROB > |T|
UNEQUAL      4.0806     85.0        0.0001
EQUAL        4.2057    105.0        0.0001
FOR H0: VARIANCES ARE EQUAL, F' = 1.65 WITH 46 AND 59 DF    PROB > F' = 0.0690
RISK OF GETTING C.H.D.
FIGURE 1.1 : SYSTOLIC B.P.=120
MALE * SMOKER * ECG-LVH * GLUCOSE INTOLERANT * AGE=50
FIGURE 1.2 : SYSTOLIC B.P.=160
MALE * SMOKER * ECG-LVH * GLUCOSE INTOLERANT * AGE=50
(Surface plots. Axes: Risk (%), Tot.Chol., HDL Chol.)
RISK OF GETTING C.H.D.
FIGURE 2.1 : SMOKER
MALE * ECG-LVH * GLUCOSE INTOLERANT * AGE=50 * Sys.B.P.=180
FIGURE 2.2 : NON-SMOKER
MALE * ECG-LVH * GLUCOSE INTOLERANT * AGE=50 * Sys.B.P.=180
(Surface plots. Axes: Risk (%), Tot.Chol., HDL Chol.)
RISK OF GETTING C.H.D.
FIGURE 3.1 : MALE
SMOKER * ECG-LVH * GLUCOSE INTOLERANT * AGE=50 * SYS.B.P.=160
FIGURE 3.2 : FEMALE
SMOKER * ECG-LVH * GLUCOSE INTOLERANT * AGE=50 * SYS.B.P.=160
(Surface plots. Axes: Risk (%), Tot.Chol., HDL Chol.)
Risk (%) of Getting CHD Pre and Post Treatment
Figure 4.1 : Drug A
Figure 4.2 : Drug B
(Scatter plots with the "no change" diagonal. Axes: Pre-Treatment
risk (horizontal), Post-Treatment risk (vertical), each from 0 to 30.)
FIGURE 5.1 : CHD RISK % CHANGE FROM BASELINE FOR DRUG A
FIGURE 5.2 : CHD RISK % CHANGE FROM BASELINE FOR DRUG B
(Frequency histograms. Axes: FREQUENCY (vertical), % Change from
Baseline (horizontal).)