Chapter 9
Hypothesis Testing II: two samples
• Test of significance for sample means (large samples)
• The difference between “statistical significance” and “importance”
Basic Logic of the two sample case
• We begin with a difference between sample statistics (means or proportions).
• The question we test:
  • “Is the difference between statistics large enough to conclude that the populations represented by the samples are different?”
Basic Logic
• The H0 is that the populations are the same.
  • There is no difference between the parameters of the two populations.
• If the difference between the sample statistics is large enough (that is, if a difference of this size would be unlikely assuming the H0 is true), we reject the H0 and conclude there is a difference between the populations.
Basic Logic
• The H0 is a statement of “no difference”.
• The 0.05 level will continue to be our indicator of a significant difference.
• We change the sample statistics to a Z score, place the Z score on the sampling distribution, and use Appendix A to determine the probability of getting a difference that large if the H0 is true.
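The table lookup described above can also be sketched in code. This is a minimal illustration, assuming Appendix A is a standard normal curve table; the function name `two_tailed_p` is made up for this example.

```python
import math

def two_tailed_p(z: float) -> float:
    """Probability of a Z score at least this far from 0 (in either tail) if H0 is true."""
    # Area in one tail of the standard normal curve beyond |z|,
    # computed with the complementary error function.
    one_tail = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return 2 * one_tail

print(round(two_tailed_p(1.96), 3))  # 0.05
```

A Z score of 1.96 leaves about 0.05 of the area in the two tails combined, which is why ±1.96 marks the critical region at the 0.05 level.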
The Five Step Model
1. Make assumptions and meet test requirements.
2. State the H0.
3. Select the Sampling Distribution and Determine the Critical Region.
4. Calculate the test statistic.
5. Make a Decision and Interpret Results.
Example: Hypothesis Testing in the Two Sample Case
• Problem 9.5b (p. 243 in Healey)
• Middle class families average 8.7 email messages and working class families average 5.7 messages.
• The middle class families seem to use email more, but is the difference significant?
Step 1 Make Assumptions and Meet Test Requirements
• Model:
  • Independent Random Samples
    • The samples must be independent of each other.
  • Level of measurement (LOM) is Interval Ratio
    • Number of email messages has a true 0 and equal intervals, so the mean is an appropriate statistic.
  • Sampling Distribution is normal in shape
    • N = 144 cases, so the Central Limit Theorem applies and we can assume a normal shape.
Step 2 State the Hypotheses
• H0: μ1 = μ2
  • The null hypothesis asserts there is no significant difference between the populations.
• H1: μ1 ≠ μ2
  • The alternative, research hypothesis contradicts the H0 and asserts there is a significant difference between the populations.
Step 3 Select the Sampling Distribution and Establish the Critical Region
• Sampling Distribution = Z distribution
• Alpha (α) = 0.05
• Z (critical) = ±1.96
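As a rough check on where ±1.96 comes from, the critical value can be derived from alpha. This is a sketch using Python's standard library; the helper name `z_critical` is an assumption of this example, not something from the slides.

```python
from statistics import NormalDist

def z_critical(alpha: float, two_tailed: bool = True) -> float:
    """Positive Z score that cuts off the critical region for a given alpha."""
    # A two-tailed test splits alpha evenly between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

print(round(z_critical(0.05), 2))  # 1.96
```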
Step 4 Compute the Test Statistic
• Use Formula 9.4 to compute the pooled estimate of the standard error.
• Use Formula 9.2 to compute the obtained Z score.
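Formulas 9.2 and 9.4 are not reproduced in this transcript. The sketch below assumes they take the usual large-sample form: a standard error of sqrt(s1²/(N1 − 1) + s2²/(N2 − 1)) and Z = (mean1 − mean2) / standard error. The sample means come from the example; the standard deviations and the split of the 144 cases are placeholders, since the transcript does not give them.

```python
import math

def standard_error(s1: float, n1: int, s2: float, n2: int) -> float:
    """Assumed form of Formula 9.4: standard error of the difference in sample means."""
    return math.sqrt(s1**2 / (n1 - 1) + s2**2 / (n2 - 1))

def z_obtained(mean1: float, mean2: float,
               s1: float, n1: int, s2: float, n2: int) -> float:
    """Assumed form of Formula 9.2: difference in sample means over its standard error."""
    return (mean1 - mean2) / standard_error(s1, n1, s2, n2)

# Means 8.7 and 5.7 are from the example.
# s1, s2, n1, n2 are HYPOTHETICAL placeholders, not the textbook's values.
z = z_obtained(8.7, 5.7, s1=1.0, n1=72, s2=1.0, n2=72)
print(round(z, 2))
```

With the textbook's actual standard deviations and group sizes, the same two functions should yield the obtained Z used in Step 5.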
Step 5 Make a Decision
• The obtained test statistic (Z = 20.00) falls in the Critical Region, so reject the null hypothesis.
• The difference between the sample means is so large that we can conclude (at α = 0.05) that a difference exists between the populations represented by the samples.
• The difference between the email usage of middle class and working class families is significant.
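The decision rule in Step 5 amounts to a single comparison. A minimal sketch, using the obtained Z and critical value from the example; the function name `decide` is an assumption of this illustration.

```python
def decide(z_obtained: float, z_critical: float = 1.96) -> str:
    """Two-tailed decision: reject H0 when the obtained Z falls in the critical region."""
    if abs(z_obtained) > z_critical:
        return "reject H0"
    return "fail to reject H0"

print(decide(20.00))  # reject H0
```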
Factors in Making a Decision
• The size of the difference (e.g., means of 8.7 and 5.7 for problem 9.5b)
• The value of alpha (the higher the alpha, the more likely we are to reject the H0)
• The use of one- vs. two-tailed tests (we are more likely to reject with a one-tailed test)
• The size of the sample (N). The larger the sample, the more likely we are to reject the H0.
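To see how alpha and the number of tails change the outcome, the sketch below holds one obtained Z score fixed and varies the other two factors. The Z score of 1.80 is a made-up value chosen for illustration, and the one-tailed cases assume the observed difference lies in the predicted direction.

```python
from statistics import NormalDist

def rejects_h0(z: float, alpha: float, tails: int) -> bool:
    """True if the obtained Z falls in the critical region for this alpha and tail count."""
    # tails=2 splits alpha across both tails; tails=1 puts all of alpha in one tail.
    critical = NormalDist().inv_cdf(1 - alpha / tails)
    return abs(z) > critical

z = 1.80  # hypothetical obtained Z score
for alpha in (0.01, 0.05, 0.10):
    for tails in (2, 1):
        print(f"alpha={alpha}, tails={tails}: reject H0? {rejects_h0(z, alpha, tails)}")
```

The same Z score fails the two-tailed test at 0.05 but passes the one-tailed test at 0.05 and the two-tailed test at 0.10.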
Significance Vs. Importance
• As long as we work with random samples, we must conduct a test of significance.
• Significance is not the same thing as importance.
• Differences that are otherwise trivial or uninteresting may be significant.
Significance Vs. Importance
• When working with large samples, even small differences may be significant.
• The value of the test statistic (step 4) increases with N, because the standard error in its denominator shrinks as N grows.
• The larger the N, the greater the value of the test statistic, and the more likely it will fall in the Critical Region and be declared significant.
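The effect of sample size can be illustrated by holding a small difference fixed and letting N grow. This sketch assumes two equal-sized groups with the same standard deviation and the usual large-sample standard error; all numbers are hypothetical.

```python
import math

def z_for_equal_groups(diff: float, s: float, n_per_group: int) -> float:
    """Obtained Z for a fixed mean difference, equal group sizes, and a common s."""
    se = math.sqrt(2 * s**2 / (n_per_group - 1))
    return diff / se

# A difference of 0.1 with s = 1.0 is small; only N changes from row to row.
for n in (50, 500, 5000, 50000):
    print(n, round(z_for_equal_groups(diff=0.1, s=1.0, n_per_group=n), 2))
```

The difference never changes, yet once the groups are large enough the Z score passes 1.96 and the result is declared significant.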
Significance Vs. Importance
• Significance and importance are different things.
• In general, when working with random samples, significance is a necessary but not sufficient condition for importance.
• A sample outcome could be:
  • significant and important
  • significant but unimportant
  • not significant but important
  • not significant and unimportant