One sample statistical tests, continued…

Recall statistics for: Single population mean (known σ)

Hypothesis test:
Z = (observed mean - null mean) / (σ/√n)

Confidence Interval:
confidence interval = observed mean ± Z_{α/2} * (σ/√n)
Examples of Sample Statistics:
Single population mean (known σ)
Single population mean (unknown σ)
Single population proportion
Difference in means (t-test)
Difference in proportions (Z-test)
Odds ratio/risk ratio
Correlation coefficient
Regression coefficient
…
Sigma is unknown
NOTE: if we are actually doing an experiment, we are unlikely to know the standard deviation of the population (σ) ahead of time (unlike with dice, there is no theoretical variance, only a population variance that we can never know exactly without measuring the entire population).
To estimate σ:
σ̂ = s = √[ Σ_{i=1..n} (x_i - x̄)² / (n - 1) ]
Estimated standard error of the mean:
s/√n = √[ Σ_{i=1..n} (x_i - x̄)² / (n - 1) ] / √n
(basically dividing by n twice…)
This is the standard error of the mean to use when the true sigma is unknown.
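As a quick illustration, here is a minimal SAS sketch that computes the sample SD and the estimated standard error s/√n; the dataset name, variable name, and data values are made up for the example:

```sas
/* Compute the sample SD and the estimated standard error s/sqrt(n).
   "mysample" and "x" are hypothetical names; the data values are arbitrary. */
data mysample;
  input x @@;
  datalines;
4.1 5.3 3.8 6.0 5.2 4.7 5.9 4.4
;
run;

proc means data=mysample n mean std stderr;
  var x;    /* STD prints s; STDERR prints s / sqrt(n) */
run;
```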
When σ is unknown, use t rather than Z!
A t-distribution is like a Z distribution, except it has slightly fatter tails to reflect the uncertainty added by estimating σ.
The bigger the sample size (i.e., the bigger the sample size used to estimate σ), the closer t becomes to Z.
If n>100, t approaches Z.
Computer simulation of the sampling distribution of the sample mean when the standard deviation is unknown:
1. Pick any probability distribution and specify a mean and standard deviation.
2. Tell the computer to randomly generate 1000 samples of size n from that probability distribution (e.g., the computer is more likely to spit out values with high probabilities).
3. Calculate the standard deviation of each sample and calculate 1000 T-statistics:
   S_x = √[ Σ_{i=1..n} (x_i - x̄)² / (n - 1) ]  and  T = (X̄_n - μ) / (S_x/√n)
4. Plot the T-statistics in histograms.
5. Repeat for different sample sizes (n's). (A SAS sketch of this procedure follows the list.)
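A minimal SAS sketch of this simulation; the sample size, number of replicates, population mean/SD, and seed are arbitrary illustration values, not numbers from the slides:

```sas
/* Simulate the sampling distribution of the T-statistic when sigma is estimated. */
%let n     = 5;      /* observations per sample     */
%let nsim  = 1000;   /* number of simulated samples */
%let mu    = 0;      /* true mean of the population */
%let sigma = 1;      /* true SD of the population   */

data tsim;
  call streaminit(12345);
  do sim = 1 to &nsim;
    sumx = 0; sumx2 = 0;
    do i = 1 to &n;                          /* draw one sample of size n */
      x = rand('NORMAL', &mu, &sigma);
      sumx  = sumx  + x;
      sumx2 = sumx2 + x*x;
    end;
    xbar = sumx / &n;
    s    = sqrt((sumx2 - &n*xbar**2) / (&n - 1));   /* sample SD       */
    t    = (xbar - &mu) / (s / sqrt(&n));           /* one T-statistic */
    output;
  end;
  keep sim xbar s t;
run;

proc sgplot data=tsim;
  histogram t;      /* step 4: histogram of the simulated T-statistics */
run;
```

Re-running the sketch with different values of &n reproduces the series of histograms summarized on the next slides.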
Simulation results when the underlying distribution is normal (each slide plots the 1000 simulated T-statistics, T = (X̄_n - μ)/(S_x/√n), in a histogram; histograms not shown):
n=2: T-distribution with only 1 degree of freedom.
n=5: T-distribution with 4 degrees of freedom.
n=10: T-distribution with 9 degrees of freedom.
n=30: T-distribution with 29 degrees of freedom.
n=100: T-distribution with 99 degrees of freedom. Looks a lot like Z!!
Conclusions
These simulations show that if we use the sample estimate of variance, we don't quite get a normal distribution; instead we get a "Student's" t-distribution.
There is more area in the tails to reflect the added uncertainty from estimating the standard deviation.
If N is large enough, t and Z are similar.
Student's t Distribution
Note: t → Z as n increases.
(Figure not shown: the standard normal curve, i.e., t with df = ∞, overlaid with t (df = 13) and t (df = 5); t-distributions are bell-shaped and symmetric, but have 'fatter' tails than the normal.)
from "Statistics for Managers Using Microsoft® Excel", 4th Edition, Prentice-Hall 2004
Student's t Table
The body of the table contains t values, not probabilities.

        Upper Tail Area
df      .25      .10      .05
1       1.000    3.078    6.314
2       0.817    1.886    2.920
3       0.765    1.638    2.353

Example: Let n = 3, so df = n - 1 = 2, and α = .10, so α/2 = .05. The table entry for 2 df and upper tail area .05 is t = 2.920.
from "Statistics for Managers Using Microsoft® Excel", 4th Edition, Prentice-Hall 2004
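The same table entries can be looked up in SAS instead of a chart; a minimal sketch (the .95 and 2-df arguments correspond to the example above):

```sas
/* quantile('T', 1 - alpha, df) returns the t value with upper tail area alpha. */
data tcrit;
  t_05_2df = quantile('T', 0.95, 2);   /* upper tail area .05, 2 df: 2.920 */
  t_10_2df = quantile('T', 0.90, 2);   /* upper tail area .10, 2 df: 1.886 */
  put t_05_2df= t_10_2df=;
run;
```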
t distribution values, with comparison to the Z value

Confidence Level   t (10 d.f.)   t (20 d.f.)   t (30 d.f.)   Z
.80                1.372         1.325         1.310         1.28
.90                1.812         1.725         1.697         1.64
.95                2.228         2.086         2.042         1.96
.99                3.169         2.845         2.750         2.58

Note: t → Z as n increases.
from "Statistics for Managers Using Microsoft® Excel", 4th Edition, Prentice-Hall 2004
The T probability density function
What does t look like mathematically? (You may at least recognize some resemblance to the normal distribution function…)
f(t) = [ Γ((v+1)/2) / ( √(vπ) * Γ(v/2) ) ] * (1 + t²/v)^(-(v+1)/2)
Where:
v is the degrees of freedom
Γ (gamma) is the Gamma function
π is the constant Pi (3.14...)
The t-distribution in SAS
Yikes! The t-distribution looks like a mess! Don't want to integrate!
Luckily, there are charts and SAS! MUST SPECIFY DEGREES OF FREEDOM!
The t-function in SAS is:
probt(t-statistic, df)
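A quick sketch of probt in use; note that it returns the lower tail area, so subtract from 1 for an upper tail (the 2.920 and 2-df values are just the table example from earlier):

```sas
/* probt(t, df) = P(T <= t) for a t-distribution with df degrees of freedom. */
data tprob;
  lower = probt(2.920, 2);        /* lower tail area, roughly .95 */
  upper = 1 - probt(2.920, 2);    /* upper tail area, roughly .05 */
  put lower= upper=;
run;
```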
T-tests have a normality assumption…
If the underlying data are not normally distributed, it takes longer for the CLT to kick in, and the sample means do not immediately follow a t-distribution…
This is the source of the "normality assumption" of the t-test…
Simulation results when the underlying distribution is exponential (mean=1, SD=1), again plotting the simulated T-statistics, T = (X̄_n - μ)/(S_x/√n), in histograms (not shown):
n=2: This doesn't yet follow a t-distribution!
n=5: This doesn't yet follow a t-distribution!
n=10: This doesn't yet follow a t-distribution!
n=30: Still not quite a t-distribution! Note the left skew.
n=100: Now, pretty close to a T-distribution!
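These exponential runs can be reproduced with the same simulation sketch shown earlier; only the random draw and the true mean change. A minimal version, assuming the standard exponential (mean = 1, SD = 1) and arbitrary n, replicate count, and seed:

```sas
/* Same T-statistic simulation, but sampling from a standard exponential. */
%let n = 5;
%let nsim = 1000;

data tsim_exp;
  call streaminit(6789);
  do sim = 1 to &nsim;
    sumx = 0; sumx2 = 0;
    do i = 1 to &n;
      x = rand('EXPONENTIAL');     /* standard exponential: mean = 1, SD = 1 */
      sumx = sumx + x;
      sumx2 = sumx2 + x*x;
    end;
    xbar = sumx / &n;
    s    = sqrt((sumx2 - &n*xbar**2) / (&n - 1));
    t    = (xbar - 1) / (s / sqrt(&n));    /* true mean is 1 */
    output;
  end;
  keep sim t;
run;

proc sgplot data=tsim_exp;
  histogram t;    /* compare the shape against a t with n-1 df */
run;
```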
Conclusions
If the underlying data are not normally distributed AND n is small**, the means do not follow a t-distribution (so using a t-test will result in erroneous inferences).
Non-parametric tests should be used instead.
**How small is too small? There is no hard and fast rule; it depends on the true shape of the underlying distribution. Here N>30 (closer to 100) is needed.
Practice Problem:
A manufacturer of light bulbs claims that its light bulbs have a mean life of 1520 hours with an unknown standard deviation. A random sample of 40 such bulbs is selected for testing. If the sample produces a mean value of 1505 hours and a sample standard deviation of 86 hours, is there sufficient evidence to claim that the mean life is significantly less than the manufacturer claimed?
Assume that light bulb lifetimes are roughly normally distributed.
Answer
1. What is your null hypothesis?
Null hypothesis: mean life = 1520 hours
Alternative hypothesis: mean life < 1520 hours
2. What is your null distribution?
Since we have to estimate the standard deviation, we need to make inferences from a T-curve with 39 degrees of freedom:
X̄_40 ~ t_39(1520, s_x̄ = 86/√40 ≈ 13.6)
3. Empirical evidence: 1 random sample of 40 bulbs has a mean of 1505 hours.
4. Calculate the test statistic and p-value:
t_39 = (1505 - 1520)/13.6 ≈ -1.10
p-value = P(t_39 ≤ -1.10) ≈ .14
5. Probably not sufficient evidence to reject the null. We cannot sue the light bulb manufacturer for false advertising! Notice that using the t-distribution to calculate the p-value didn't change much! With n>30, you might as well use the Z table.
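The same calculation can be done in a short SAS data step using probt (a sketch; the numbers are the ones given in the problem):

```sas
/* One-sample t-test for the light bulb problem: H0: mu = 1520 vs Ha: mu < 1520. */
data bulbs;
  n    = 40;
  xbar = 1505;
  s    = 86;
  mu0  = 1520;
  se   = s / sqrt(n);            /* estimated standard error            */
  t    = (xbar - mu0) / se;      /* t-statistic with n - 1 = 39 df      */
  p    = probt(t, n - 1);        /* lower-tail p-value for Ha: mu < mu0 */
  put se= t= p=;
run;
```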
Practice problem
You want to estimate the average ages of kids that ride a particular kid's ride at Disneyland. You take a random sample of 8 kids exiting the ride, and find that their ages are: 2, 3, 4, 5, 6, 6, 7, 7. Assume that ages are roughly normally distributed.
a. Calculate the sample mean.
b. Calculate the sample standard deviation.
c. Calculate the standard error of the mean.
d. Calculate the 99% confidence interval.
Answer (a, b)
a. Calculate the sample mean.
X̄_8 = (Σ_{i=1..8} X_i)/8 = (2 + 3 + 4 + 5 + 6 + 6 + 7 + 7)/8 = 40/8 = 5.0
b. Calculate the sample standard deviation.
s²_X = Σ_{i=1..8} (X_i - 5)² / (8 - 1) = (3² + 2² + 1² + 0 + 2(1²) + 2(2²))/7 = 24/7 ≈ 3.4
s_X = √3.4 ≈ 1.9
Answer (c)
c. Calculate the standard error of the mean.
s_x̄ = s_X/√n = 1.9/√8 ≈ .67
Answer (d)
d. Calculate the 99% confidence interval.
mean ± s_x̄ * (t_{df, α/2}), where t_{7, .005} = 3.50
5.0 ± .67*(3.50) = (2.65, 7.35)
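For comparison, PROC TTEST will produce the mean, SD, standard error, and the 99% confidence interval in one step (a sketch using the eight ages from the problem):

```sas
/* alpha=0.01 requests 99% confidence limits for the mean age. */
data ages;
  input age @@;
  datalines;
2 3 4 5 6 6 7 7
;
run;

proc ttest data=ages alpha=0.01;
  var age;    /* output includes mean, SD, std err, and the 99% CI */
run;
```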
Example problem, class data: a two-tailed hypothesis test
A researcher claims that Stanford affiliates eat fewer than the recommended intake of 5 fruits and vegetables per day.
We have data to address this claim: 22 people in the class provided data on their daily fruit and vegetable intake.
Do we have evidence to dispute her claim?
Histogram of fruit and veggie intake (not shown):
Mean = 4.0 servings
Median = 4.5 servings
Mode = 5.0 servings
Std Dev = 1.8 servings
Answer
1. Define your hypotheses (null, alternative)
H0: average servings = 5.0
Ha: average servings ≠ 5.0 (two-sided)
2. Specify your null distribution
We do not know the true standard deviation of fruit and vegetable intake, so we must use a T-distribution to make inferences, rather than a Z-distribution.
X̄_22 ~ T_21(5.0, 1.8/√22 = 0.38)
Answer, continued
3. Do an experiment
Observed mean in our experiment = 4.0 servings
4. Calculate the p-value of what you observed
T_21 = (4 - 5)/0.38 ≈ -2.6
The T_21 critical value for p<.05, two-tailed, is 2.08, so the p-value < .05.
5. Reject or fail to reject (~accept) the null hypothesis
Reject! Stanford affiliates eat significantly fewer than the recommended servings of fruits and veggies.
95% Confidence Interval
X̄_22 ± T_{21, .025} * (standard error) = 4.0 ± 2.08*(0.38) = 3.2 to 4.8
H0: average servings = 5.0
The 95% CI excludes 5, so the p-value < .05.
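In practice the whole test and confidence interval come straight out of PROC TTEST; a sketch, where "classdata" and "servings" are hypothetical names standing in for the 22 class observations:

```sas
/* One-sample t-test of H0: mean servings = 5 against a two-sided alternative. */
proc ttest data=classdata h0=5;
  var servings;   /* prints t with 21 df, the two-sided p-value, and the 95% CI */
run;
```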
Paired data (repeated measures)

Patient   BP Before (diastolic)   BP After
1         100                      92
2          89                      84
3          83                      80
4          98                      93
5         108                      98
6          95                      90

What about these data? How do you analyze these?
Example problem: paired t-test

Patient   Diastolic BP Before   BP After   Change
1         100                    92         -8
2          89                    84         -5
3          83                    80         -3
4          98                    93         -5
5         108                    98        -10
6          95                    90         -5

Null Hypothesis: Average Change = 0
Example problem: paired t-test
The changes are: -8, -5, -3, -5, -10, -5.
X̄ = (-8 - 5 - 3 - 5 - 10 - 5)/6 = -36/6 = -6
s_x = √[ ((-8+6)² + (-5+6)² + (-3+6)² + (-5+6)² + (-10+6)² + (-5+6)²) / 5 ] = √[ (4 + 1 + 9 + 1 + 16 + 1)/5 ] = √(32/5) ≈ 2.5
s_x̄ = 2.5/√6 ≈ 1.0
T_5 = (-6 - 0)/1.0 = -6
Null Hypothesis: Average Change = 0
With 5 df, |T| > 2.571 corresponds to p < .05 (two-sided test), so we reject the null.
Example problem: paired t-test
95% CI: -6 ± 2.571*(1.0) = (-8.57, -3.43)
Note: the interval does not include 0.
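A SAS sketch of the paired analysis, using the six blood-pressure pairs from the table above:

```sas
/* Paired t-test: the PAIRED statement analyzes the within-pair differences. */
data bp;
  input before after @@;
  datalines;
100 92  89 84  83 80  98 93  108 98  95 90
;
run;

proc ttest data=bp;
  paired after*before;   /* difference = after - before, matching the "Change" column */
run;
```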
Summary: Single population mean (unknown σ)

Hypothesis test:
t_{n-1} = (observed mean - null mean) / (s_x/√n)

Confidence Interval:
confidence interval = observed mean ± t_{n-1, α/2} * (s_x/√n)
Summary: paired t-test

Hypothesis test:
t_{n-1} = (observed mean d - 0) / (s_d/√n)
where d = change over time or difference within a pair.

Confidence Interval:
confidence interval = observed mean d ± t_{n-1, α/2} * (s_d/√n)
Examples of Sample Statistics:
Single population mean (known σ)
Single population mean (unknown σ)
Single population proportion
Difference in means (t-test)
Difference in proportions (Z-test)
Odds ratio/risk ratio
Correlation coefficient
Regression coefficient
…
Recall: normal approximation to the binomial…
Statistics for proportions are based on a normal distribution, because the binomial can be approximated as normal if np>5.
Recall: stats for proportions
For the binomial (count x):
μ_x = np
σ²_x = np(1 - p)
σ_x = √(np(1 - p))
For the proportion (p̂, "p-hat," stands for "sample proportion"):
μ_p̂ = p
σ²_p̂ = np(1 - p)/n² = p(1 - p)/n
σ_p̂ = √( p(1 - p)/n )
Note: the mean and standard deviation of p̂ each differ from the binomial versions by a factor of n.
Sampling distribution of a sample proportion
μ_p̂ = p   (p = true population proportion)
σ_p̂ = √( p(1 - p)/n )
BUT… if you knew p you wouldn't be doing the experiment! So estimate the standard error with:
s_p̂ = √( p̂(1 - p̂)/n )
p̂ ~ Normal( p, √( p̂(1 - p̂)/n ) ), always a normal distribution!
Practice Problem
A fellow researcher claims that at least 15% of smokers
fail to eat any fruits and vegetables at least 3 days a week.
You find this hard to believe and decide to check the
validity of this statistic by taking a random (representative)
sample of smokers. Do you have sufficient evidence to
reject your colleague’s claim if you discover that 17 of the
200 smokers in your sample eat no fruits and vegetables at
least 3 days a week?
Answer
1. What is your null hypothesis?
Null hypothesis: p = proportion of smokers who skip fruits and veggies frequently = .15
Alternative hypothesis: p < .15
2. What is your null distribution?
Var(p̂) = .15 * .85/200 = .00064; SD(p̂) = .025
p̂ ~ N(.15, .025)
3. Empirical evidence: 1 random sample: p̂ = 17/200 = .085
4. Z = (.085 - .15)/.025 = -2.6
p-value = P(Z < -2.6) = .0047
5. Sufficient evidence to reject the claim.
OR, use computer simulation…
1. Have SAS randomly pick 200 observations from a binomial distribution with p=.15 (the null).
2. Divide the resulting count by 200 to get the observed sample proportion.
3. Repeat this 1000 times (or some arbitrarily large number of times).
4. Plot the resulting distribution of sample proportions in a histogram. (A SAS sketch of this simulation follows below.)
How often did we get observed values of 0.085 or lower when the true p=.15?
Only 4/1000 times!
Empirical p-value = .004
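A minimal SAS sketch of this simulation (the number of replicates and the seed are arbitrary choices):

```sas
/* Simulate 1000 sample proportions under the null p = .15 with n = 200,
   then see how often the simulated proportion is .085 or lower. */
data propsim;
  call streaminit(2024);
  do sim = 1 to 1000;
    count   = rand('BINOMIAL', 0.15, 200);   /* number of "skippers" out of 200      */
    phat    = count / 200;                   /* observed sample proportion           */
    extreme = (phat <= 0.085);               /* 1 if at least as extreme as observed */
    output;
  end;
run;

proc means data=propsim mean;
  var extreme;     /* the mean of this 0/1 flag is the empirical p-value */
run;

proc sgplot data=propsim;
  histogram phat;
run;
```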
Practice Problem
In Saturday’s newspaper, in a story about poll results from Ohio, the
article said that 625 people in Ohio were sampled and claimed that the
margin of error in the results was 4%. Can you explain where that 4%
margin of error came from?
Answer
s_p̂ = √(.5 * .5 / 625) = .5/25 = 1/50 = 2%
4% is 2 standard errors.
Since we're on a normal distribution, 2 standard errors on either side of the mean should represent 95% confidence...
Paired data proportions test…
Analogous to the paired t-test…
Also takes on a slightly different form known as McNemar's test (we'll see lots more on this next term…)
Paired data proportions test…
1000 subjects were treated with antidepressants for 6 months and with placebo for 6 months (order of tx was randomly assigned).
Question: do suicide attempts (yes/no) differ depending on whether a subject is on antidepressants or on placebo?
Paired data proportions test…
Data:
15 subjects attempted suicide in both conditions (non-informative)
10 subjects attempted suicide in the antidepressant condition but not the placebo condition
5 subjects attempted suicide in the placebo condition but not the antidepressant condition
970 did not attempt suicide in either condition (non-informative)
The data boil down to 15 observations…
In 10/15 cases (66.6%), antidepressant > placebo.
Paired proportions test…
Single proportions test:
Under the null hypothesis, antidepressants and placebo work equally well. So,
Ho: among discordant cases, p(antidepressant > placebo) = 0.5
Observed p̂ = .666
Z = (p̂ - p_0) / √( p_0(1 - p_0)/n ) = (.666 - .5) / √( (.5)(.5)/15 ) = 1.29;  p > .05
Not enough evidence to reject the null!
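The same normal-approximation calculation in a SAS data step (a sketch; probnorm gives the standard normal lower tail area):

```sas
/* Z-test of H0: p = 0.5 among the 15 discordant pairs (10 of 15 observed). */
data discordant;
  n    = 15;
  phat = 10 / 15;
  p0   = 0.5;
  z    = (phat - p0) / sqrt(p0*(1 - p0)/n);   /* roughly 1.29            */
  p_one_sided = 1 - probnorm(z);              /* upper-tail p-value > .05 */
  put z= p_one_sided=;
run;
```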
Key one-sample Hypothesis Tests…
Test for Ho: μ = μ_0 (σ² unknown):   t_{n-1} = (x̄ - μ_0) / (s_x/√n)
Test for Ho: p = p_0:                Z = (p̂ - p_0) / √( p_0(1 - p_0)/n )
Corresponding confidence intervals…
For a mean (σ² unknown):   x̄ ± t_{n-1, α/2} * (s_x/√n)
For a proportion:          p̂ ± Z_{α/2} * √( p̂(1 - p̂)/n )
Symbol overload!
n: Sample size
Z: Z-statistic (standard normal)
t_df: T-statistic (t-distribution with df degrees of freedom)
p̂ ("p-hat"): sample proportion
X̄ ("X-bar"): sample mean
s: Sample standard deviation
p_0: Null hypothesis proportion
μ_0: Null hypothesis mean