Significance Tests - University of Florida

Significance Tests
• Hypothesis - Statement Regarding a
Characteristic of a Variable or set of variables.
Corresponds to population(s)
– Majority of registered voters favor health care reform
– Average salary progressions differ between male
executives whose spouses work and those whose
spouses “stay at home”
• Significance Test - Means of using sample
statistics (and their sampling distributions) to
compare their observed values with hypothesized
value of corresponding parameter(s)
Elements of Significance Test (I)
• Assumptions
– Data Type: Quantitative vs. Qualitative
– Population Distribution: Some methods assume normal
– Sampling Plan: Simple Random Sampling
– Sample Size: Some methods have sample size
requirements for validity
• Hypotheses
– Null Hypothesis (H0): A statement that parameter(s)
take on specific value(s) (Often: “No effect”)
– Alternative Hypothesis (Ha): A statement contradicting
the parameter value(s) in the null hypothesis
Elements of Significance Test (II)
• Test Statistic: Quantity computed from the sample
data to test the null hypothesis. Typically based on a
sample statistic, the parameter value under H0, and
the standard error.
• P-value (P): The probability that we would obtain
a test statistic at least as contradictory to the null
hypothesis as our computed test statistic, if the
null hypothesis is true.
– Small P-values mean the sample data are not
consistent with the parameter value(s) under H0
Elements of Significance Test (III)
• Conclusion (Optional)
– If the P-value is sufficiently small, we reject H0 in
favor of Ha. The most widely used cutoff is 0.05;
a result with P ≤ 0.05 is said to be significant at
the .05 level.
– If the P-value is not sufficiently small, we fail to
reject (but not necessarily accept) the null hypothesis.
– Process is analogous to American judicial system
• H0: Defendant is innocent
• Ha: Defendant is guilty
Significance Test for Mean
(Large-Sample)
• Assumptions: Random sample with n ≥ 30,
quantitative variable
• Null Hypothesis: H0: μ = μ0 (typically no
effect or change from standard)
• Alternative Hypothesis: Ha: μ ≠ μ0 (2-sided
alternative includes both > and <)
• Test Statistic: $z_{obs} = \dfrac{\bar{Y} - \mu_0}{\hat{\sigma}_{\bar{Y}}} = \dfrac{\bar{Y} - \mu_0}{s/\sqrt{n}}$
• P-value: P = 2P(Z ≥ |zobs|)
Example - Mercury Levels
• Population: Patients visiting private internal
medicine clinic in S.F. (High-end fish consumers)
• Variable: Mercury levels (microg/L)
• Sample: 66 Females
• Recommended maximum level: 5.0 microg/L
• Null hypothesis: H0: μ = 5.0 (Mean level = RML)
• Alternative hypothesis: Ha: μ ≠ 5.0 (Mean ≠ RML)
• Sample Data: $\bar{Y} = 15$, $s = 15$, $n = 66$, $\hat{\sigma}_{\bar{Y}} = \dfrac{15}{\sqrt{66}} = \dfrac{15}{8.12} = 1.85$
Example - Mercury Levels
• Test Statistic: $z_{obs} = \dfrac{\bar{Y} - \mu_0}{\hat{\sigma}_{\bar{Y}}} = \dfrac{15 - 5}{1.85} = \dfrac{10}{1.85} = 5.41$
• P-Value:
P = 2P(Z ≥ 5.41) < 2P(Z ≥ 5.00) = 2(.000000287) = .000000574 ≈ 0
• Conclusion: Very strong evidence that the
population mean mercury level is above RML
Source: Hightower and Moore (2003), “Mercury Levels in High-End Consumers of Fish,”
Environ Health Perspect, 111(4):A233
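The mercury calculation above can be checked with a few lines of Python. This is a minimal sketch using only the summary statistics from the slide; the function name `z_test_mean` is made up for illustration. Because the code does not round the standard error to 1.85 first, its test statistic differs slightly from 5.41 in the second decimal place.

```python
import math

def z_test_mean(ybar, s, n, mu0):
    """Large-sample two-sided z test for a mean; returns (z_obs, P-value)."""
    se = s / math.sqrt(n)                 # estimated standard error of the sample mean
    z_obs = (ybar - mu0) / se
    # 2 * P(Z >= |z_obs|), using P(Z >= z) = 0.5 * erfc(z / sqrt(2))
    p_value = math.erfc(abs(z_obs) / math.sqrt(2))
    return z_obs, p_value

# Summary statistics from the mercury example: mean 15, s = 15, n = 66, H0: mu = 5.0
z_obs, p_value = z_test_mean(ybar=15.0, s=15.0, n=66, mu0=5.0)
print(f"z_obs = {z_obs:.2f}, P = {p_value:.2e}")
```

The complementary error function is used instead of 1 minus the normal CDF because it keeps precision for tail areas this small.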
Miscellaneous Comments
• Effect of sample size on P-values: For a given
observed sample mean and standard deviation,
the larger the sample size, the larger the test
statistic (in absolute value) and the smaller the
P-value (as long as the sample mean does not equal μ0)
• Equivalence between 2-tailed tests and
confidence intervals: If a (1-α)100% CI for μ
contains μ0, the P-value will be larger than α
• 1-sided tests: Sometimes researchers have a
specific direction in mind for alternative
hypothesis prior to collecting data.
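The first two comments can be illustrated numerically. The sketch below holds a sample mean and standard deviation fixed, increases n, and checks whether the 95% CI contains the null value. All numbers are hypothetical and chosen only for illustration; they do not come from the slides.

```python
import math
from statistics import NormalDist

def two_sided_p(ybar, s, n, mu0):
    """Two-sided large-sample P-value for H0: mu = mu0."""
    z = (ybar - mu0) / (s / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))

def z_interval(ybar, s, n, alpha=0.05):
    """Large-sample (1 - alpha)100% confidence interval for the mean."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z_crit * s / math.sqrt(n)
    return ybar - half_width, ybar + half_width

# Hypothetical summary statistics (not from the slides)
ybar, s, mu0, alpha = 5.5, 2.0, 5.0, 0.05

results = []
for n in (30, 60, 120, 240):
    p = two_sided_p(ybar, s, n, mu0)
    lower, upper = z_interval(ybar, s, n, alpha)
    contains_mu0 = lower <= mu0 <= upper
    results.append((n, p, contains_mu0))
    print(f"n={n:4d}  P={p:.4f}  CI=({lower:.3f}, {upper:.3f})  contains mu0: {contains_mu0}")
```

In each row the interval contains μ0 exactly when the P-value exceeds α, and the P-value shrinks as n grows.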
Example - Crime Rates (1960-80)
• Sample: n=74 Chicago Neighborhoods
• Goal: Show the average delinquency rate in
the population of all such neighborhoods
has increased from 1960-1980
• Variable: Y = DR1980 − DR1960
• H0: μ = 0 (No change from 1960-1980)
• Ha: μ > 0 (Higher in 1980, see Y above)
• Sample Data: $\bar{Y} = 41.26$, $s = 30.73$, $n = 74$, $\hat{\sigma}_{\bar{Y}} = \dfrac{30.73}{\sqrt{74}} = 3.57$
Example - Crime Rates (1960-80)
• Test Statistic: $z_{obs} = \dfrac{\bar{Y} - \mu_0}{\hat{\sigma}_{\bar{Y}}} = \dfrac{41.26 - 0}{3.57} = 11.6$
• P-value: (Only interested in larger positive values since 1-sided)
P = P(Z ≥ zobs) = P(Z ≥ 11.6) ≈ 0
• Conclusion: Strong evidence that the true mean delinquency
rate among all neighborhoods that this sample was taken from
has increased from 1960 to 1980.
Source: Bursik and Grasmick (1993), “Economic Deprivation and Neighborhood Crime Rates, 1960-1980”, Law & Society Review, Vol. 27, pp 263-284
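For the 1-sided alternative in the crime-rate example, only the upper tail is used. A minimal sketch with the summary statistics from the slide (variable names are illustrative):

```python
import math

# Summary statistics from the crime-rate example; H0: mu = 0, Ha: mu > 0
ybar, s, n, mu0 = 41.26, 30.73, 74, 0.0

se = s / math.sqrt(n)                     # estimated standard error of the mean
z_obs = (ybar - mu0) / se
# Upper-tail probability P(Z >= z_obs); no doubling for a 1-sided test
p_value = 0.5 * math.erfc(z_obs / math.sqrt(2))

print(f"se = {se:.2f}, z_obs = {z_obs:.1f}, P = {p_value:.1e}")
```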
Significance Test for a Proportion
(Large-Sample)
• Assumptions:
– Qualitative Variable
– Random sample
– Large sample: n ≥ 10/min(π0, 1 − π0)
• Hypotheses:
– Null hypothesis: H0: π = π0
– Alternative hypothesis: Ha: π ≠ π0 (2-sided)
– Ha+: π > π0
– Ha−: π < π0 (1-sided, prior to data)
Significance Test for a Proportion
(Large-Sample)
• Test statistic: $z_{obs} = \dfrac{\hat{\pi} - \pi_0}{\sigma_{\hat{\pi}}} = \dfrac{\hat{\pi} - \pi_0}{\sqrt{\pi_0 (1 - \pi_0)/n}}$
• P-value:
– Ha: π ≠ π0: P = 2P(Z ≥ |zobs|)
– Ha+: π > π0: P = P(Z ≥ zobs)
– Ha−: π < π0: P = P(Z ≤ zobs)
• Conclusion: Similar to test for a mean
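The slides give no numerical example for a proportion, so the sketch below uses hypothetical counts (560 of 1,000 sampled voters in favor) to test whether a majority favors a proposal, that is H0: π = 0.5 against Ha+: π > 0.5. The function name and its `alternative` argument are illustrative, not part of any library.

```python
import math

def normal_upper_tail(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def z_test_proportion(successes, n, pi0, alternative="two-sided"):
    """Large-sample z test for a proportion; returns (z_obs, P-value)."""
    if n < 10 / min(pi0, 1 - pi0):
        raise ValueError("sample too small for the large-sample test")
    pi_hat = successes / n
    se0 = math.sqrt(pi0 * (1 - pi0) / n)  # standard error computed under H0
    z_obs = (pi_hat - pi0) / se0
    if alternative == "greater":
        p_value = normal_upper_tail(z_obs)
    elif alternative == "less":
        p_value = normal_upper_tail(-z_obs)
    else:
        p_value = 2 * normal_upper_tail(abs(z_obs))
    return z_obs, p_value

# Hypothetical data: 560 of 1000 sampled voters in favor
z_obs, p_value = z_test_proportion(560, 1000, pi0=0.5, alternative="greater")
print(f"z_obs = {z_obs:.2f}, P = {p_value:.1e}")
```

Note that the standard error uses π0 rather than the sample proportion, matching the test statistic on the slide.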
Decisions in Tests
• α-level (aka significance level): Pre-specified
“hurdle” for which one rejects H0 if the P-value
falls below it. (Typically .05 or .01)

P-Value    H0 Conclusion    Ha Conclusion
≤ .05      Reject           Accept
> .05      Do not Reject    Do not Accept

• Rejection Region: Values of the test statistic for
which we reject the null hypothesis
• For 2-sided tests with α = .05, we reject H0 if |zobs| ≥ 1.96
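The 1.96 cutoff is the standard normal value leaving area .025 in the upper tail. A short sketch of the 2-sided decision rule, using Python's `statistics.NormalDist` (the function name `reject_two_sided` is made up for illustration):

```python
from statistics import NormalDist

def reject_two_sided(z_obs, alpha=0.05):
    """Return (reject H0?, critical value) for a 2-sided z test at level alpha."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # upper-tail area alpha/2
    return abs(z_obs) >= z_crit, z_crit

decision, z_crit = reject_two_sided(5.41)          # mercury example test statistic
print(f"critical value = {z_crit:.2f}, reject H0: {decision}")
```

Rejecting when |zobs| falls in the rejection region gives the same decision as rejecting when the 2-sided P-value is at most the chosen level.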
Error Types
• Type I Error: Reject H0 when it is true
• Type II Error: Do not reject H0 when it is false
                     True State
Test Result          H0 True          H0 False
Reject H0            Type I Error     Correct
Don’t Reject H0      Correct          Type II Error
Error Types
• Probability of a Type I Error: α-level
(significance level)
• Probability of a Type II Error: β - depends
on the true level of the parameter (in the
range of values under Ha).
• For a given sample size, and variability in
data, the Type I and Type II error rates are
inversely related: lowering one raises the other
• Conclusions with respect to H0 are the same whether a
hypothesis test or CI is conducted (fixed α)
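The trade-off between the two error rates can be made concrete for a 2-sided z test. The sketch below treats the standard deviation as known (a simplifying assumption) and uses hypothetical values for the true mean, null value, standard deviation, and sample size; none of these numbers come from the slides.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def type_ii_error(mu_true, mu0, sigma, n, alpha=0.05):
    """P(do not reject H0: mu = mu0) when the true mean is mu_true (2-sided z test)."""
    se = sigma / math.sqrt(n)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = (mu_true - mu0) / se          # mean of the test statistic under mu_true
    # Probability that the test statistic lands inside (-z_crit, z_crit)
    return STD_NORMAL.cdf(z_crit - shift) - STD_NORMAL.cdf(-z_crit - shift)

# Hypothetical setting: true mean 5.5, null value 5.0, sigma 2.0, n = 60
betas = {alpha: type_ii_error(5.5, 5.0, 2.0, 60, alpha) for alpha in (0.01, 0.05, 0.10)}
for alpha, beta in betas.items():
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}")
```

With n and the variability held fixed, a smaller Type I error rate gives a larger Type II error rate.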
Miscellaneous Issues
• Statistical vs Practical Significance: With
very large sample sizes, we can often obtain
very small P-values even when the sample
quantity is very close to the parameter value
under H0. Always consider the estimate as
well as P-value.
• While hypothesis tests and confidence
intervals give similar conclusions with respect to H0,
the CI gives a credible set of parameter
values, which can be more informative than the test alone
Small-sample Inference for μ
• t Distribution:
– Population distribution for a variable is normal
– Mean μ, Standard Deviation σ
– The t statistic has a sampling distribution that is called
the t distribution with (n-1) degrees of freedom:
$t = \dfrac{\bar{Y} - \mu}{\hat{\sigma}_{\bar{Y}}} = \dfrac{\bar{Y} - \mu}{s/\sqrt{n}}$
• Symmetric, bell-shaped around 0 (like standard normal, z distribution)
• Indexed by “degrees of freedom”; as they increase, the distribution approaches z
• Has heavier tails (more probability beyond the same values) than z
• Table B gives tA where P(t > tA) = A for degrees of freedom 1-29 and various A
Small-Sample 95% CI for μ
• Random sample from a normal population
distribution:
$\bar{Y} \pm t_{.025,n-1}\,\hat{\sigma}_{\bar{Y}} = \bar{Y} \pm t_{.025,n-1}\left(\dfrac{s}{\sqrt{n}}\right)$
• t.025,n-1 is the critical value leaving an upper tail area of .025 in
the t distribution with n-1 degrees of freedom
• For n ≥ 30, use z.025 = 1.96 as an approximation for t.025,n-1
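When a t table is not at hand, the critical value can be approximated numerically. The sketch below is one possible approach using only the standard library: it integrates the t density with Simpson's rule and finds the critical value by bisection. The helper names and the ten data values are hypothetical, chosen only for illustration; a statistics package would normally supply the quantile directly.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(x, df, steps=2000):
    """P(t > x) for x >= 0, via Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(area, df):
    """Value t_A with P(t > t_A) = area, by bisection (assumes t_A < 100)."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical small sample, assumed to come from a normal population
data = [4.1, 5.3, 6.0, 4.8, 5.5, 5.9, 4.4, 5.1, 6.2, 4.7]
n = len(data)
ybar, s = mean(data), stdev(data)         # stdev uses the n-1 divisor
t_crit = t_critical(0.025, n - 1)
half_width = t_crit * s / math.sqrt(n)
lower, upper = ybar - half_width, ybar + half_width
print(f"t_crit = {t_crit:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```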
t test for a mean
• Assumptions: Random sample for a quantitative
variable with a normal probability distribution
• Hypotheses:
– H0: μ = μ0
– Ha: μ ≠ μ0 (2-sided)
• Test Statistic: $t_{obs} = \dfrac{\bar{Y} - \mu_0}{\hat{\sigma}_{\bar{Y}}} = \dfrac{\bar{Y} - \mu_0}{s/\sqrt{n}}$
• P-Value: 2P(t > |tobs|)
• Conclusions as before; 1-sided tests use the corresponding single-tail probability
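A small worked t test, as a sketch under stated assumptions: the ten data values and the null value 5.0 are hypothetical, and the tail probability is obtained by numerically integrating the t density (Simpson's rule) rather than from Table B.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(x, df, steps=2000):
    """P(t > x) for x >= 0, via Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Hypothetical sample, assumed normal; H0: mu = 5.0 vs Ha: mu != 5.0
data = [4.1, 5.3, 6.0, 4.8, 5.5, 5.9, 4.4, 5.1, 6.2, 4.7]
mu0 = 5.0
n = len(data)
se = stdev(data) / math.sqrt(n)           # estimated standard error of the mean
t_obs = (mean(data) - mu0) / se
p_value = 2 * t_upper_tail(abs(t_obs), n - 1)   # 2P(t > |t_obs|)
print(f"t_obs = {t_obs:.3f}, df = {n - 1}, P = {p_value:.3f}")
```

With these made-up data the P-value is well above .05, so H0 would not be rejected.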