Download 10-06 lecture

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Degrees of freedom (statistics) wikipedia , lookup

Sufficient statistic wikipedia , lookup

History of statistics wikipedia , lookup

Foundations of statistics wikipedia , lookup

Bootstrapping (statistics) wikipedia , lookup

Taylor's law wikipedia , lookup

Misuse of statistics wikipedia , lookup

Resampling (statistics) wikipedia , lookup

Student's t-test wikipedia , lookup

Transcript
Basic t-test
10/6
Hypothesis Test for Population Mean
• Goal: Infer m from M
• Null hypothesis: m = m0, usually 0
• Change scores
– Memory improvement, weight loss, etc.
• Sub-population within known, larger population
– IQ of CU undergrads
• Approach:
– Determine likelihood function for M, using CLT
– Use actual sample mean to get p-value
Likelihood Function for M
• Probability distribution for M, according to H0
• Central Limit Theorem:
– Mean equals population mean: mM = m0
– Standard deviation:  M  n
– Shape: Normal
– .zM 

M m M

M

M  m0

Probability
• z-score
M
p
n
– p-value: p(Z > z)

m0
M
Example: IQ
• Are CU undergrads smarter than population?
– Sample size 100, sample mean 103
• Likelihood function
– If no difference, what are probabilities of sample means?
– Null hypothesis: m0 = 100
– CLT: mM = 0
M = /√n = 15/10 = 1.5
> 1-pnorm(2)
– z-score: (103-100)/1.5 = 2
[1] .0228
p
M
zM
Problem: Unknown Variance
zM 
M  m0

n
?
• Test statistic depends on population parameter
– Can only depend on data or values assumed by H0
 • Could include  in null hypothesis
– H0: m = m0 &  = 0
– Usually no theoretical basis
– Cannot tell which assumption fails
• Change test statistic
– Replace population SD with sample SD
– Depends only on data and m0
– .
M  m0
t
s
n


t Statistic
• Invented by “Student” at Guinness brewery
• .t 
M  m0
s
n
• Deviation of sample mean divided by estimated standard error
• Depends only on data and m0
• Sampling distribution depends only on n
M  m0 
t
n
tn1 
Normal(0,1)  


n 1

2
 n1
n1
s
2
n1
Normal(0,1)

t Distribution
• Sampling distribution of t statistic
• Derived from ratio of Normal and (modified) 2
• Depends only on sample size
– Degrees of freedom: df = n – 1
– Invariant with respect to m,
• Shaped like Normal, but with fatter tails
– Reflects uncertainty in sample variance
– Closer to Normal as n increases
• Critical value decreases
as n increases
a = .05
df
tcrit
df
tcrit
1
6.31
5
2.02
2
2.92
10
1.81
3
2.35
30
1.70
4
2.13
∞
1.64
Steps of t-test
1.
2.
State clearly the two hypotheses
Determine null and alternative hypotheses
•
•
3.
H0: m = m0
H1: m ≠ m0
Compute the test statistic t from the data
M  m0
• t .
s
n
4.
•
 5.
Determine likelihood function for test statistic according to H0
t distribution with n-1 degrees of freedom
Get p-value
•
R: 1-pt(t,n-1)
6. Choose alpha level
7a. p > α: Retain null hypothesis, μ = μ0
7b. p < α: Reject null hypothesis, μ ≠ μ0
Example: Rat Mazes
• Measure maze time on and off drug
– Difference score: Timedrug – Timeno drug
– Data (seconds): 5, 6, -8, -3, 7, -1, 1, 2
•
•
•
•
•
Sample mean: M = 1.0
Sample standard deviation: s = 5.06
Standard error: SE = s/√n = 5.06/2.83 = 1.79
t = (M – m0)/SE = (1.0 – 0)/1.79 = .56
p = p(tn – 1 ≥ t) = .30
Two-tailed Tests
•
•
•
•
Interested in m < m0 or m > m0
Leads to t < 0 or t > 0
Need critical region to cover both tails
p-value counts both tails
– p(|tn-1| > |t|)
– 2p(tn-1 > t) if t > 0
– 2p(tn-1 < t) if t < 0