Binomial Distribution
and Applications
Binomial Probability Distribution
A binomial random variable X is defined to be the number
of “successes” in n independent trials where the
P(“success”) = p is constant.
Notation: X ~ BIN(n,p)
In the definition above notice the following conditions
need to be satisfied for a binomial experiment:
1. There is a fixed number of n trials carried out.
2. The outcome of a given trial is either a “success”
or “failure”.
3. The probability of success (p) remains constant
from trial to trial.
4. The trials are independent, the outcome of a trial is
not affected by the outcome of any other trial.
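The four conditions can be mimicked with a short simulation. This is only an illustrative sketch: the function name `simulate_binomial`, the seed, and the values n = 10 and p = 0.5 are assumptions chosen for the example and do not come from the slides.

```python
import random

def simulate_binomial(n, p, rng):
    """Count the successes in n independent trials, each with success probability p."""
    successes = 0
    for _ in range(n):            # condition 1: a fixed number of trials
        if rng.random() < p:      # conditions 2-4: success or failure, constant p, independent draws
            successes += 1
    return successes

rng = random.Random(1)            # arbitrary seed, used only for reproducibility
x = simulate_binomial(10, 0.5, rng)
print(x)                          # one realization of X ~ BIN(10, 0.5), an integer from 0 to 10
```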
Binomial Distribution

If X ~ BIN(n, p), then
$$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} = \frac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x}, \qquad x = 0, 1, \ldots, n,$$
where
$n! = n \times (n-1) \times (n-2) \times \cdots \times 1$, also $0! = 1$ and $1! = 1$,
$\binom{n}{x}$ = “n choose x” = the number of ways to obtain x “successes” in n trials,
P(“success”) = p.
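The binomial probability formula translates almost directly into code. The following is a minimal sketch using only the Python standard library; the function name `binomial_pmf` and the values n = 10, p = 0.3 are assumptions made for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ BIN(n, p): (n choose x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Sanity check: the probabilities over x = 0, 1, ..., n should add up to 1.
total = sum(binomial_pmf(x, 10, 0.3) for x in range(11))
print(round(total, 10))
```

Here `math.comb(n, x)` plays the role of "n choose x", so the factorials never need to be computed by hand.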
Binomial Distribution


E.g. when n = 3 and p = .50 there are 8 possible, equally
likely outcomes (e.g. flipping a coin three times)
SSS SSF SFS FSS SFF FSF FFS FFF
X=3 X=2 X=2 X=2 X=1 X=1 X=1 X=0
P(X=3)=1/8, P(X=2)=3/8, P(X=1)=3/8, P(X=0)=1/8
Now let’s use the binomial probability formula instead…
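The counting argument for n = 3 can be checked by listing all outcomes explicitly. This sketch assumes equally likely outcomes (p = .50), as in the coin example; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = list(product("SF", repeat=3))             # SSS, SSF, ..., FFF (8 outcomes)
counts = Counter(o.count("S") for o in outcomes)     # how many outcomes give each value of X
probs = {x: Fraction(c, len(outcomes)) for x, c in counts.items()}

for x in sorted(probs, reverse=True):
    print(f"P(X={x}) = {probs[x]}")
```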
Binomial Distribution


E.g. when n = 3, p = .50, find P(X = 2)
$$\binom{3}{2} = \frac{3!}{2!\,(3-2)!} = \frac{3!}{2!\,1!} = \frac{3 \times 2 \times 1}{(2 \times 1) \times 1} = 3 \text{ ways: SSF, SFS, FSS}$$
$$P(X = 2) = \binom{3}{2} (.5)^2 (.5)^{3-2} = 3(.5^2)(.5^1) = .375 \text{ or } \tfrac{3}{8}$$
Example: Treatment of Kidney Cancer


Suppose we have n = 40 patients who will be
receiving an experimental therapy which is
believed to be better than current treatments
which historically have had a 5-year survival rate
of 20%, i.e. the probability of 5-year survival is
p = .20.
Thus the number of patients out of 40 in our
study surviving at least 5 years has a binomial
distribution, i.e. X ~ BIN(40,.20).
Results and “The Question”


Suppose that using the new treatment we find
that 16 out of the 40 patients survive at least 5
years past diagnosis.
Q: Does this result suggest that the new therapy
has a better 5-year survival rate than the current,
i.e. is the probability that a patient survives at
least 5 years greater than .20 or a 20% chance
when treated using the new therapy?
What do we consider in answering
the question of interest?
We essentially ask ourselves the following:
If we assume that the new therapy is no better than
the current one, what is the probability we would see
these results by chance variation alone?

More specifically what is the probability of
seeing 16 or more successes out of 40 if the
success rate of the new therapy is .20 or 20% as
well?
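One way to get a feel for "chance variation alone" is to simulate many hypothetical trials of 40 patients in which each patient survives with probability .20, and see how often 16 or more survive. This is a rough Monte Carlo sketch, not an exact calculation; the function name `simulate_tail`, the number of simulated trials, and the seed are assumptions made for illustration.

```python
import random

def simulate_tail(n, p, observed, n_sims, seed=0):
    """Estimate P(X >= observed) for X ~ BIN(n, p) by repeated simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # One simulated trial: count the survivors among n independent patients.
        survivors = sum(rng.random() < p for _ in range(n))
        if survivors >= observed:
            hits += 1
    return hits / n_sims

estimate = simulate_tail(n=40, p=0.20, observed=16, n_sims=20000)
print(estimate)  # a small proportion; the exact value depends on the seed and n_sims
```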
Connection to Binomial


This is a binomial experiment situation…
There are n = 40 patients and we are counting
the number of patients that survive 5 or more
years. The individual patient outcomes are
independent and IF WE ASSUME the new
method is NOT better then the probability of
success is p = .20 or 20% for all patients.
So X = # of “successes” in the clinical trial is
binomial with n = 40 and p = .20,
i.e. X ~ BIN(40,.20)
Example: Treatment of Kidney Cancer

X ~ BIN(40,.20), find the probability that exactly 16
patients survive at least 5 years.
$$P(X = 16) = \binom{40}{16} (.20)^{16} (.80)^{24} = .001945$$


This requires some calculator gymnastics and some
scratchwork!
Also, keep in mind we need to find the probability of
having 16 or more patients surviving at least 5 yrs.
Example: Treatment of Kidney Cancer

So we actually need to find:
P(X ≥ 16) = P(X = 16) + P(X = 17) + … + P(X = 40)
$$P(X = 16) = \binom{40}{16} (.20)^{16} (.80)^{24} = .001945$$
$$P(X = 17) = \binom{40}{17} (.20)^{17} (.80)^{23} = .000686$$
$$\vdots$$
$$P(X = 40) = \binom{40}{40} (.20)^{40} (.80)^{0} \approx 0$$
Adding all 25 terms gives P(X ≥ 16) = .002936
YIPES!
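The term-by-term sum is tedious by hand but short in code. The sketch below uses only the Python standard library; the helper names `binomial_pmf` and `upper_tail` are assumptions made for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ BIN(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def upper_tail(k, n, p):
    """P(X >= k) for X ~ BIN(n, p), adding the terms for x = k, k + 1, ..., n."""
    return sum(binomial_pmf(x, n, p) for x in range(k, n + 1))

print(round(binomial_pmf(16, 40, 0.20), 6))   # the first term of the sum
print(round(upper_tail(16, 40, 0.20), 6))     # the full sum for 16 or more survivors
```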
Example: Treatment of Kidney Cancer



X ~ BIN(40,.20), find the probability that 16 or more
patients survive at least 5 years.
USE COMPUTER!
Binomial Probability calculator in JMP
Enter
n = sample size
x = observed # of “successes”
p = probability of “success”
Probabilities are computed automatically for greater than
or equal to and less than or equal to x.
Example: Treatment of Kidney Cancer

Using the Binomial Probability calculator in JMP with n = 40, x = 16, p = .20:
P(X ≥ 16) = .0029362
The chance that we would see 16 or more patients out of 40
surviving at least 5 years if the new method has the same chance
of success as the current methods (20%) is VERY SMALL, .0029!
Conclusion

Because it is highly unlikely (p = .0029) that we
would see this many successes in a group of 40
patients if the new method had the same
probability of success as the current method, we
have to make a choice, either …
A) we have obtained a very rare result by dumb
luck,
OR
B) our assumption about the success rate of the
new method is wrong and in actuality the new
method has a better than 20% 5-year survival
rate, making the observed result more plausible.
Sign Test




The sign test can be used in place of the paired t-test when we have evidence that the paired
differences are NOT normally distributed.
It can be used when the response is ordinal.
Best used when the response is difficult to
quantify and only improvement can be measured,
i.e. subject got better, got worse, or no change.
Magnitude of the paired difference is lost when
using this test.
Example: Sign Test



A study evaluated hepatic arterial infusion of
floxuridine and cisplatin for the treatment of
liver metastases of colorectal cancer.
Performance scores for 29 patients were
recorded before and after infusion.
Is there evidence that patients had a better
performance score after infusion?
Example: Sign Test
Patient   Before (B) Infusion   After (A) Infusion   Difference (A – B)
   1               2                     1                  -1
   2               0                     0                   0
   3               0                     0                   0
   4               1                     0                  -1
   5               3                     3                   0
   6               1                     0                  -1
   7               1                     3                   2
   8               0                     0                   0
   9               0                     0                   0
  10               0                     0                   0
  11               1                     0                  -1
  12               1                     1                   0
  13               2                     1                  -1
  14               3                     1                  -2
  15               0                     0                   0
  16               0                     0                   0
  17               0                     3                   3
  18               2                     3                   1
  19               2                     3                   1
  20               3                     2                  -1
  21               0                     4                   4
  22               0                     3                   3
  23               1                     2                   1
  24               0                     3                   3
  25               0                     2                   2
  26               1                     1                   0
  27               3                     3                   0
  28               1                     2                   1
  29               0                     2                   2
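The paired differences and their signs can be tallied directly from the table. In this sketch the (before, after) pairs are transcribed from the table above for patients 1 through 29; the variable names are assumptions made for illustration.

```python
# (before, after) performance scores for patients 1..29, transcribed from the table
scores = [
    (2, 1), (0, 0), (0, 0), (1, 0), (3, 3), (1, 0), (1, 3), (0, 0), (0, 0), (0, 0),
    (1, 0), (1, 1), (2, 1), (3, 1), (0, 0), (0, 0), (0, 3), (2, 3), (2, 3), (3, 2),
    (0, 4), (0, 3), (1, 2), (0, 3), (0, 2), (1, 1), (3, 3), (1, 2), (0, 2),
]

differences = [after - before for before, after in scores]   # A - B for each patient
n_plus = sum(d > 0 for d in differences)     # score went up after infusion
n_minus = sum(d < 0 for d in differences)    # score went down after infusion
n_zero = sum(d == 0 for d in differences)    # no change; ignored by the sign test

print(n_plus, n_minus, n_zero)  # 11 7 11
```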
Sign Test



The sign test looks at the number of (+) and (-)
differences amongst the nonzero paired
differences.
A preponderance of +’s or –’s can indicate that
some type of change has occurred.
If in reality there is no change as a result of
infusion we expect +’s and –’s to be equally
likely to occur, i.e. P(+) = P(-) = .50 and the
number of each observed follows a binomial
distribution.
Example: Sign Test

Given these results, do we have evidence
that performance scores of patients
generally improve following infusion?

Need to look at how likely the observed
results are to be produced by chance
variation alone.
Example: Sign Test
18 nonzero differences: 11 +’s, 7 –’s
Patient   Before (B) Infusion   After (A) Infusion   Difference (A – B)   Sign
   1               2                     1                  -1             -
   2               0                     0                   0
   3               0                     0                   0
   4               1                     0                  -1             -
   5               3                     3                   0
   6               1                     0                  -1             -
   7               1                     3                   2             +
   8               0                     0                   0
   9               0                     0                   0
  10               0                     0                   0
  11               1                     0                  -1             -
  12               1                     1                   0
  13               2                     1                  -1             -
  14               3                     1                  -2             -
  15               0                     0                   0
  16               0                     0                   0
  17               0                     3                   3             +
  18               2                     3                   1             +
  19               2                     3                   1             +
  20               3                     2                  -1             -
  21               0                     4                   4             +
  22               0                     3                   3             +
  23               1                     2                   1             +
  24               0                     3                   3             +
  25               0                     2                   2             +
  26               1                     1                   0
  27               3                     3                   0
  28               1                     2                   1             +
  29               0                     2                   2             +
Example: Sign Test




If there is truly no change in performance as a result of
infusion the number of +’s has a binomial distribution
with n = 18 and
p = P(+) = .50.
We have observed 11 +’s amongst the 18 non-zero
performance differences.
How likely are we to see 11 or more +’s out of 18?
P(X ≥ 11) = .2403 for a binomial with n = 18, p = .50
There is a 24.03% chance we would see this many
improvements by dumb luck alone, therefore we are not
convinced that infusion leads to improvement.
(Remember, less than .05 or a 5% chance is what we are
looking for to claim “statistical significance”.)
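The sign-test probability is just a binomial upper tail with p = .50. A minimal sketch, assuming the one-sided question "this many +’s or more"; the function name `sign_test_upper_p` is an assumption made for illustration.

```python
from math import comb

def sign_test_upper_p(n_plus, n_minus):
    """P(X >= n_plus) for X ~ BIN(n_plus + n_minus, 0.5); zero differences are excluded beforehand."""
    n = n_plus + n_minus
    # With p = .50 every sequence of signs has probability 1 / 2^n, so count the favorable ones.
    return sum(comb(n, k) for k in range(n_plus, n + 1)) / 2**n

p_value = sign_test_upper_p(11, 7)
print(round(p_value, 4))  # 0.2403
```

Because the result is well above .05, the code agrees with the conclusion that 11 +’s out of 18 is not convincing evidence of improvement.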
Example 2: Sign Test
Resting Energy Expenditure (REE) for Patients with
Cystic Fibrosis
A researcher believes that patients with cystic fibrosis
(CF) expend greater energy during resting than those
without CF. To obtain a fair comparison she matches
13 patients with CF to 13 patients without CF on the
basis of age, sex, height, and weight. She then
measures the REE for each pair of subjects and
compares the results.
Example 2: Sign Test
There are 11 +’s and 2 –’s out of n = 13 paired differences.
Example 2: Sign Test
The probability of seeing this many +’s is small. We
conclude that when comparing individuals with cystic
fibrosis to healthy individuals of the same gender and size
that in general those with CF have larger resting energy
expenditure (REE) (p = .0112).
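The reported p-value can be reproduced exactly, assuming it is the one-sided probability of seeing 11 or more +’s out of 13 when P(+) = .50 (the slides do not state this explicitly). The variable names below are illustrative.

```python
from fractions import Fraction
from math import comb

n, n_plus = 13, 11
favorable = sum(comb(n, k) for k in range(n_plus, n + 1))   # 78 + 13 + 1 = 92 sign patterns
p_value = Fraction(favorable, 2**n)                          # 92 out of 2^13 = 8192

print(p_value, round(float(p_value), 4))  # 23/2048 0.0112
```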