Download Statistical analysis of biological assays What is a bioassay?

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Statistical analysis of
biological assays
Yossi Levy
What is a bioassay?
Bioassays are for estimating the
potency of a drug by utilizing the
reaction caused by its application to
live experimental subjects.
Bioassay always compares a test
substance to a standard substance
Assumptions
Comparable organisms
Same active compound
Only concentration can vary
2
© Yossi Levy 2011
Bioassay
3
Role of statistics in bioassay
Advise on the general statistical principles underlying the
assay method
Devise a good experimental design that gives the most
useful and reliable results
Analyze the data making use of all the evidence on potency
4
© Yossi Levy 2011
Types of bioassay
Direct assay: response is directly measured
Indirect assay
Quantitative
Binary
Examples
Direct assay: Measure insulin level in blood
Indirect assay
Quantitative: change in weight of a certain organ
Binary: dead or alive
5
Potency
Potency is the dose of a compound required to cause a
particular response
6
© Yossi Levy 2011
Relative potency
Relative potency is the ratio of the tested compound
potency to the reference standard potency
7
Statistical models for quantitative
assays
Parallel line model
Logistic model (4 or 5 parameters)
Slope Ratio model
8
© Yossi Levy 2011
Basic requirements for applying a
quantitative bioassay model
Randomization
Responses are Normally distributed
Homogenous variances
A logarithmic transformation of the response
measure is recommended to improve
compliance with second and third requirements
when necessary.
9
Requirements for PLA model
The relationship between the logarithm of the dose and the
response can be represented by a straight line.
For any unknown (tested) substance the straight line is
parallel to that of the standard.
10
© Yossi Levy 2011
1 picture = 1000 words
11
PLA in practice
Design restrictions imposed by the ICH guidelines
Experimental design
Analysis of covariance
Tests of validity
Potency estimation and confidence limits
Handling missing values
Troubleshooting
12
© Yossi Levy 2011
Design restrictions
Each substance must be tested with the same number of
dilutions
The ratio of adjacent doses must be constant for all
treatments
There must be an equal number of experimental units to
each treatment
13
Design restrictions
9 experimental
units per
treatment
3 dilutions
per
substance
adjacent log-doses are equally
spaced
14
© Yossi Levy 2011
Experimental design
Completely randomized design – if experimental units are
reasonably homogeneous.
ICH guideline also discusses:
Randomized block design
Latin-square designs
Cross-over designs
15
Tests of validity
The bioassay PLA model is valid if
Assay must show response
Response must be linear
Response lines must be parallel
16
© Yossi Levy 2011
How to assess linearity?
Just look at R2 - naive
Add a quadratic term to the model and verify that it is nonsignificant
Model dose/dilution as a class variable, and compare the
results to the “correct” model, using log-likelihood test
Linear contrasts – compare slope between each two
adjacent doses to the next slope
17
How to asses parallelism?
Let βI be the regression coefficient of the interaction
term, logdose*substance
Significance test approach: H0: βI=0
If the corresponding p-value is less than α, we
conclude non-parallelism
If the corresponding p-value is greater than α, then
what?
18
© Yossi Levy 2011
How to asses parallelism?
Equivalence test approach: H0: βI≠0
Pre-determine acceptance limits for βI: [-A, A]
Calculate a 1- α confidence interval for βI: (βL, βU)
Reject H0 if –A< βL and βU<A
But, how one would determine A?
19
Which approach is better?
It depends
For more details see:
Evaluations of Parallelism Test Methods Using ROC Analysis
Harry Yang and Lanju Zhang, MedImmune
2009 Non-clinical Biostatistics Conference, Boston, MA
http://www.hsph.harvard.edu/ncb2009/files/ncb2009-c06yang.pdf
Key conclusion:
An optimal cut off value, in terms of test statistic, p- value
or equivalence bound can be chosen to make best tradeoff between sensitivity and specificity
20
© Yossi Levy 2011
Example
data in;
input substance $ conc y1 y2 y3;
lconc=log(conc);
cards;
RS 100 4050.538 4019.029 3260.831
RS 50
1432.281 1823.191 1422.876
RS 25
558.284 587.956 848.65
RS 12.5 302.114 336.969 414.975
RS 6.25 191.442 244.982 213.579
RS 3.125 158.749 128.868 118.364
TB 100 1366.585 2134.742 2075.934
TB 50
660.938 669.61
872.149
TB 25
453.385 412.586 424.543
TB 12.5 269.963 193.644 222.505
TB 6.25 145.862 145.862 156.593
TB 3.125 143.725 83.434
61.609
;
run;
21
Take a look
4500
4000
3500
3000
Y
2500
2000
1500
1000
500
0
0
1
2
3
4
5
log(Concetration)
22
© Yossi Levy 2011
Apply log transformation to Y
10
9
8
7
log(Y)
6
5
4
3
2
1
0
0
1
2
3
4
5
log(Concetration)
23
Add regression lines
10
9
8
7
log(Y)
6
5
4
3
2
1
0
0
1
2
3
4
5
log(Concetration)
24
© Yossi Levy 2011
Testing linearity
Run a quadratic model for each substance:
E[log(Y)]=β0+ β1·log(concentration) + β2·log(concentration)2
Reject linearity if β2 is significantly different from zero
In order to assess linearity, non-linearity must be rejected
for each of the substances
What is the problem in this approach?
25
Test linearity - RS
26
© Yossi Levy 2011
Test linearity - TB
27
Select sub-range
Linearity of the whole concentration range was not
assessed
It is possible that the response is linear in a sub-range of at
least 4 concentrations:
3.125-50 (5 concentrations)
6.25-100 (5 concentrations)
3.125-25 (4 concentrations)
Etc..
The guideline allows selecting the “best range”
28
© Yossi Levy 2011
How to select the “best range”?
It must demonstrate linearity response and parallelism
If there is more that one such sub-range, the best one
should be chosen
Most commercial software select the range with highest
R2
Better approach: select range with highest signal to
noise ratio:
S/N =
Ymax − Ymin
MSE
*The example will continue with the upper range: 12.5-100
29
Linearity testing – Upper range
30
© Yossi Levy 2011
Testing response
Linearity has been assessed. The quadratic term can be
removed from the model
Run a linear model for each substance:
E[log(Y)]=β0+ β1·log(concentration)
Reject null hypothesis of no response if β1 is significantly
different from zero
In order to assess response, null hypothesis must be
rejected for each of the substances
31
Results of response test
32
© Yossi Levy 2011
Parallelism test
Response has been assessed. Add substance and its
interaction with concentration to model
Run a linear model for whole data over the chosen range:
E[log(Y)]=β0 + β1·log(concentration) + β2· Substance +
β3·log(concentration)*Substance
Substance is modeled as a 0-1 variable
Either test whether β3 is significantly different from
zero,
Or better: construct a confidence interval for β3
33
Parallelism test results
34
© Yossi Levy 2011
Relative potency estimation
The final bioassay model over the chosen range is:
E[log(Y)]=β0 + β1·log(concentration) + β2· Substance
Substance is modeled as a 0-1 variable
This model implies that for the RS (substance=0)
E[log(Y)]=β0 + β1·log(concentration)
And for the TB (substance=1)
E[log(Y)]=(β0 + β2 )+ β1·log(concentration)
35
Relative Potency estimation
log ρˆ =
βˆ2
βˆ1
36
© Yossi Levy 2011
The final bioassay model
37
Relative Potency estimation
− 0.5817
= −0.5455
1.0662
ρˆ = exp(−0.5455) = 0.5795 ≈ 58%
log ρˆ =
38
© Yossi Levy 2011
How to calculate a confidence
interval?
You can’t.
Estimators of β are linear combinations of the Ys, that
are normally distributed
Therefore, estimators of β are normally distributed
So the distribution of
log ρˆ =
βˆ2
βˆ1
is Cauchy
Instead of a confidence interval, we calculate fiducial
limits
39
Fieller’s theorem
Let µ and ν be two unknown parameters, and let ρ= µ/ν.
Let a and b be unbiased estimators for µ and ν,
respectively, that are linear in observations that are
normally distributed.
Let the variances and covariance estimates of a and b be
v11s2, v22s2 and V12s2, respectively, where s2 is an error
mean square having m degrees of freedom.
Let t be the α/2 critical value from a t distribution with m-1
degrees of freedom, and let g=t2s2v22/b2.
Let R=a/b and estimate for ρ. Then upper and lower
confidence limits for ρ are:
40
© Yossi Levy 2011
Fieller’s theorem
1

2
2



gv12 ts 
v


2
12
 
± ⋅ v11 − 2 Rv22 + R v22 − g  v11 −
R −
v22 b 
v22  



RL , RU =
1− g
41
Proof of Fieller’s theorem
Let U=a-ρb. Then EU=0 and its estimated variance is
s2(v11-2ρv12+ ρ2v12) with m degrees of freedom.
Therefore:
[
)]
(
P U 2 ≤ t 2s2 v11 − 2ρv12 + ρ 2v22 = 1−α
The results follows for solving the quadratic
equation in ρ.
42
© Yossi Levy 2011
Fiducial limits calculation
s
v22
v11
43
Troubleshooting
Problem: Exceptionally high residual error (MSE)
Solution: this is an indication of technical problem – check
the bioassay process
Problem: Exceptionally low residual error may cause F
values to exceed critical values.
Solution: replace residual error by estimate from historical
data.
44
© Yossi Levy 2011
Non linear response
45
4-parameter model
EY = δ +
α −δ
1 + exp{− β ( x − γ )}
α - upper asymptote
δ - lower asymptote
β – slope factor
γ - horizontal location
Validity of model: α, δ and β – are same for RS and TB
Log(relative potency) = γRS - γTB
46
© Yossi Levy 2011
Quantal Bioassay
Response is discrete
Often a binary response: e.g. Dead/Alive
Dose response function is sometimes called “Tolerance
Distribution”
A logistic distribution is a natural model for such data
47
Example: Bacterial tolerance
Bacterial Dose
Dead
Alive
1.2·103
0
5
1.2·104
0
5
1.2·105
2
3
1.2·106
4
2
1.2·107
5
1
1.2·108
5
0
48
© Yossi Levy 2011
Modeling
Probability of death at level xi of drug (or bacterial
concentration) is
P (Yi ≤ xi ) = pi =
exp{α + βxi }
1 + exp{α + βxi }
Where Yi is the tolerance for subject i.
Then
 p 
log i  = α + βxi
 1 − pi 
49
LD50/ED50
The dose at which 50% of subjects produce a response
is called LD50 or ED50 (LD=lethal dose, ED=effective
dose)
Let x50=log(LD50) and p50=0.5 (probability of response at
the median of the tolerance distribution). Then
 p 
log 50  = 0 = αˆ + βˆx50
 1 − p50 
xˆ50 = − αˆ βˆ
{
LD50 = exp − αˆ βˆ
}
50
© Yossi Levy 2011
Confidence interval for LD50
Using Taylor series expansion (the delta method):
 V (αˆ ) 2 cov(αˆ , βˆ ) V ( βˆ ) 
2
⋅  2 −
+ 2 
V ( xˆ50 ) = xˆ50
αˆβˆ
βˆ 
 αˆ
Then a 100(1-α)% CI for log(LD50) is
xˆ50 ± z1−α 2 V ( xˆ50 )
51
Data and model
52
© Yossi Levy 2011
proc logistic output
LD50 = exp{9.268 0.7071} = 488942 ≈ 4.8 ⋅105
CI for log( LD50) is
13.1 ± 1.96 0.6005 = (11.6,14.6)
CI for LD50 is
exp(11.6,14.6) ≈ (10.9 ⋅10 4 ,2.1 ⋅10 6 )
53
Comparing two drugs
1
2
3
4
5
6
7
8
9
10
11
12
13
14
Drug
N
N
N
N
S
N
S
N
S
N
S
S
N
S
Dose
Dead
0.01
0
0.03
1
0.10
1
0.30
1
0.30
0
1.00
4
1.00
0
3.00
1
3.00
0
10.00
5
10.00
4
30.00
5
30.00
7
100.00
8
Alive
30
29
9
9
10
6
10
9
10
5
6
5
3
2
Total
30
30
10
10
10
10
10
10
10
10
10
10
10
10
The dilution
assumption:
Z S = ρZ N
54
© Yossi Levy 2011
Comparing two drugs
55
Comparing two drugs
Dilution assumption: zS=ρ·zN for doses of S and N with
the same probability of response
If x represents log of dose, then xS= logρ+xN
Logistic model for drug S is:
pS ( xSi ) =
1
1 + exp{− (α S + β xSi )}
56
© Yossi Levy 2011
Comparing two drugs
Therefore, for drug N, remembering that xS= logρ+xN, a
logistic model for drug N is:
p N ( x Ni ) = pS (log ρ + x Ni ) =
=
1
=
1 + exp{− (α S + β (log ρ + x Ni ))}
=
1
=
1 + exp{− ((α S + β log ρ ) + βx Ni )}
1
1 + exp{− (α N + β x Ni )}
=
=
57
Comparing two drugs
Therefore, the dilution assumption implies that:
 p (x ) 
log S Si  = α S + βxSi
1 − pS ( xSi ) 
 p (x ) 
log N Ni  = α N + βx Ni = α S + β log ρ + βxNi
1 − p N ( xNi ) 
The assumption can be tested by fitting a model
with separate intercepts and slopes and then testing
for common slope
58
© Yossi Levy 2011
Testing for common slope – SAS
output
Interaction term is not statistically significant
=> common slope
=> the dilution assumption holds
59
Run model without interaction
The model :
log{p /(1 − p)} = −2.0429 + 0.5504 ⋅ I (Drug = N ) + 0.7363 ⋅ log dose
For drug S :
log{p /(1 − p)} = −2.0429 + 0.7363 ⋅ log dose
For drug N :
log{p /(1 − p)} = −1.4925 + 0.7363 ⋅ log dose
60
© Yossi Levy 2011
The model
61
Parameter estimates
log LD50 N = −
− 1.4925
= 2.027
0.7363
LD50 N = 7.60
log LD50 S = −
− 2.0429
= 2.775
0.7363
LD50 S = 16.04
log ρ =
αˆ N − αˆ S 0.5504
=
= 0.7475
0.7363
βˆ
ρˆ = 2.11
62
© Yossi Levy 2011