HYPOTHESIS TESTING
Hypothesis Tests: Purpose
• Sampling => sampling error exists
• We want to know if sampling error is a
likely explanation for our observed results.
Null Hypothesis:
 Assumed to be true for
purpose of hypothesis
test.
 Rejection implies
"acceptance" of
conclusion we wish to
verify
Hypothesis testing and truth

                          True State of Nature: Null Hypothesis is
Research Conclusion       TRUE                        FALSE
DO NOT REJECT H0          Correct decision            Error: Type II
                          Confidence level            Probability = β
                          Probability = 1 - α
REJECT H0                 Error: Type I               Correct decision
                          Significance level          Power of test
                          Probability = α             Probability = 1 - β
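The error probabilities in the decision table can be checked by simulation. Below is a minimal Monte Carlo sketch (made-up example, not from the slides): when H0 is actually true, a test run at significance level α = 0.05 should commit a Type I error, a false rejection, in about 5% of repeated samples.

```python
import math
import random

random.seed(1)
alpha = 0.05
z_crit = 1.96          # two-sided critical value for alpha = 0.05
n, trials = 30, 5000
rejections = 0
for _ in range(trials):
    # Sample from N(0, 1), so the null H0: mu = 0 is actually true.
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = (sum(sample) / n) * math.sqrt(n)   # z test with known sigma = 1
    if abs(z) > z_crit:
        rejections += 1                    # a Type I error
type_i_rate = rejections / trials
print(type_i_rate)   # close to 0.05
```

The same simulation with samples drawn from a shifted distribution (so H0 is false) would estimate the power, 1 - β, instead.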
Hypothesis testing Procedure:
1. Specify the null and alternative hypotheses
   (the null hypothesis is what we want to reject)
2. Choose the statistical test and the distribution for the chosen test statistic
   (see notes on choosing appropriate tests)
3. Specify the confidence level
   (i.e. the required precision: what α, what β?)
4. Compute the sample statistics.
5. Compute the test statistic.
6. Determine the probability of the test statistic under the null hypothesis, using the sampling distribution:
    if we took repeated samples, and the null were true, how likely is the result we obtained?
Hypothesis testing: final step
Compare the obtained probability with the specified significance level, then reject or do not reject.

If the test asks "is the parameter different from 0?", then big numbers are more likely to be different from 0, and so the probability of observing such a big number, given that the null is true, is very small.

Therefore, REJECT hypotheses where the p value (sig. value) is LESS than the level of significance.
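The whole procedure can be sketched in a few lines. This is a hypothetical example (all numbers made up): a two-sided z test of H0: µ = 50 with σ assumed known, using the error function to get the normal CDF.

```python
import math

def normal_cdf(z):
    """Standard normal CDF computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Step 1: H0: mu = 50 vs H1: mu != 50.  Steps 2-3: z test, alpha = .05.
mu0, sigma, alpha = 50.0, 10.0, 0.05
n, xbar = 100, 52.3                            # step 4: sample statistics
z = (xbar - mu0) / (sigma / math.sqrt(n))      # step 5: test statistic
p_value = 2.0 * (1.0 - normal_cdf(abs(z)))     # step 6: P(result | H0 true)
reject = p_value < alpha                       # final step: compare with alpha
print(round(z, 2), round(p_value, 4), reject)
```

Here the p value falls below .05, so H0 is rejected; with the same data and α = .01 it would not be.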
Notes on "p-values" and Hypothesis Testing
(from Sawyer and Peter (1983), "The Significance of Statistical Significance," Journal of Marketing Research, v. 20 (May), pp. 122-133)
DEFN: the p value is P(evidence | hypothesis)
e.g. H0: µ1 = µ2:
t test -- test statistic: t = [(x1 - x2) - (µ1 - µ2)] / S(x1 - x2)
and p = .05
=> if the null is true (i.e. no difference), then the probability of getting a mean difference this large or larger is 1/20.
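The t statistic above can be computed directly. A minimal sketch with made-up data, using a pooled standard error for the mean difference and (µ1 - µ2) = 0 under H0:

```python
import math

x1 = [5.1, 4.9, 5.6, 5.2, 5.0]   # hypothetical group 1
x2 = [4.2, 4.4, 4.0, 4.5, 4.3]   # hypothetical group 2

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

n1, n2 = len(x1), len(x2)
# Pooled variance, then the standard error of the difference in means.
pooled_var = ((n1 - 1) * sample_var(x1) + (n2 - 1) * sample_var(x2)) / (n1 + n2 - 2)
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))       # S(x1 - x2)
t = ((mean(x1) - mean(x2)) - 0.0) / se_diff               # (mu1 - mu2) = 0 under H0
print(round(t, 2))   # compare with the t distribution, df = n1 + n2 - 2
```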
Interpreting “p” values
1. p = .05 does NOT imply a probability of .05 that the results are due to chance, or a 95% probability that the observed value is "true".
• (i.e. the p value is not P(hypothesis | evidence))
• the p value is calculated assuming that the difference is due to chance with probability 1
• so we are actually accepting or rejecting the assumption, made with probability 1, that chance caused the difference
Interpreting “p” values
2. The p value is not a summary of the data
 the p value is not a measure of how strong or dependable the result is
 results are not "more" or "less" significant; i.e. p = .001 is not "highly significant," or more significant than p = .05
Interpreting “p” values
3. Statistical significance does not equal practical significance
Results are "statistically significantly different from the null," NOT "significant" or "highly significant."

Sample size and Probability of Hypothesis
• Sampling error is explicitly included in significance tests; therefore, if a given relationship is found to be statistically significant at a given confidence level, then more confidence should be placed in the result if the study had a smaller rather than a larger sample size.
• A larger sample may be more representative for non-sampling reasons, but significance with a small sample is a more conservative test.
• This is not to recommend overly small samples, but anything will be significant in extremely large samples.
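The "anything is significant in large samples" point is simple arithmetic. A sketch with an assumed, practically trivial effect of 0.05 standard deviations: the z statistic grows with √n, so the same effect is non-significant at n = 100 and "statistically significant" at n = 10,000.

```python
import math

effect_sd = 0.05     # assumed effect size in standard-deviation units
z_crit = 1.96        # two-sided critical value at alpha = 0.05
for n in (100, 10_000):
    # z = (xbar - mu0) / (sigma / sqrt(n)) = effect_sd * sqrt(n)
    z = effect_sd * math.sqrt(n)
    print(n, round(z, 2), abs(z) > z_crit)
```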
Limitations of significance tests
• The researcher influences the objectivity of the test via:
   greater sample size to increase power
   greater reliability of measures to increase power
   post hoc change of the level of significance (e.g. .05 to .10), or switching between one-tailed and two-tailed tests
   control over non-manipulated variables
• Note: null results are usually attributed to measurement error or sample size.
Description of Results
Focus on effect sizes and confidence intervals, not just whether or not a result is significant. If you want to say how confident you are about your results, then report the confidence interval!
Description of Results
• Aggregate vs. individual results: remember, even with reasonably large effect sizes, a large proportion of individuals in the two groups will be similar or even in reverse order.
(e.g. a difference of .8 standard deviations between means still leaves 52.6% of the two populations overlapping)
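The 52.6% figure can be reproduced with Cohen's U1 non-overlap measure (an assumption about which overlap definition is meant): for two normal populations whose means differ by d standard deviations, the non-overlapping proportion is U1 = (2Φ(d/2) - 1) / Φ(d/2), and the overlap is 1 - U1.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

d = 0.8                       # difference between means, in SD units
phi = normal_cdf(d / 2)       # Phi(0.4)
u1 = (2 * phi - 1) / phi      # Cohen's U1: proportion NOT overlapping
overlap = 1 - u1
print(round(100 * overlap, 1))   # overlapping percentage, about 52.6
```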
Data Analysis Process
1. What's the problem to be
answered?
2. What do we want to do
about it?
– Compare groups
– Compare variables
– Relationships among
variables
Data Analysis Process
3. What does the data look like?
 sample size (n)
 dependent variable(s) / independent variables
 nominal, ordinal, interval, ratio
 description, distribution
 missing values, outliers, errors (data "cleaning")
 research design
   – repeated measures, independent measures, time related
   – number of groups / treatments compared
   – number of relationships to be investigated
Data Analysis Process
4. What is the appropriate test / test statistic?
 assumptions underlying the test statistic
 violations?
 need to change the test to accommodate violations?
• finalize: test, variables, options
Data Analysis Process
5. Run analysis
6. Output: did it do what we wanted it to?
   Check "n", variables, options
Data Analysis Process
7. Interpretation
 Results: effect size, confidence interval, statistical significance of tests, researchers' influence on those results
 Limitations
   - research design
   - statistical
   - alternative explanations
LEVEL OF                One Sample             Two or More Samples—   Two or More Samples—
MEASUREMENT                                    Independent            Related
Nominal                 Chi-Square*            Chi-Square*            McNemar*; Cochran's Q
Ordinal                 Kolmogorov-Smirnov*    Mann-Whitney*;         Wilcoxon*
                                               Kruskal-Wallis (more
                                               than 2 groups)*
Interval (small         t-test*                t-test*                Paired sample t-test*
sample)
Interval (large         z test*                z test*; ANOVA*        Repeated Measures
sample)                                        (GLM in SPSS)          ANOVA*
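For nominal data, the chi-square test in the table can be sketched by hand. A minimal example with a made-up 2x2 table of counts: compare each observed cell count with the count expected if the row and column variables were independent.

```python
observed = [[10, 20],   # hypothetical cross-tabulated counts
            [20, 10]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        # Expected count under independence of rows and columns.
        expected = row_totals[i] * col_totals[j] / n
        chi_sq += (o - expected) ** 2 / expected
print(round(chi_sq, 2))   # compare with the chi-square critical value, df = 1
```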
Basic Considerations in Choosing a Multivariate Statistical Test

Number of criterion variables?
• One → Dependence Analysis. Level of measurement of the criterion?
   – Nominal (N) → level of measurement of the predictors?
       N: Contingency Coefficient; Index of Predictive Association
       O: STOP
       I: Discriminant Analysis
   – Ordinal (O) → level of measurement of the predictors?
       N: STOP
       O: Spearman's Rank Correlation
       I: STOP
   – Interval or Ratio (I) → level of measurement of the predictors?
       N: Regression Analysis with Dummy Variables
       O: STOP
       I: Regression Analysis
• None → Interdependence Analysis. Level of measurement?
   – Nominal: Factor Analysis with Dummy Variables; Cluster Analysis
   – Ordinal: Kendall's Coefficient of Concordance