STAT 1450 COURSE NOTES – CHAPTER 15
TESTS OF SIGNIFICANCE: THE BASICS
Connecting Chapter 15 to our Current Knowledge of Statistics
Chapters 10 & 12 used information about a population to answer questions about a sample
(e.g., 20% of people are smokers, what is the probability that a random sample of 2 people smoke).
Now with inference, we have statistics problems where we
use the information about a sample to answer questions concerning the population.
If we want to ___________________________________,
then we should use statistics to create a
_____________________.
If, on the other hand, we want to
_________________ provided by data ____________________ concerning a
population parameter, we need to conduct a ______________________.
15.1 The Reasoning of Tests of Significance
We are now inquiring about the behavior of an event if a phenomenon were repeated numerous
times. We will begin by working with simple random samples of data from Normal populations
with known standard deviations.
Situation: People drink coffee for a variety of professional, and now, social reasons.
Coffee used to merely be a beverage option on the menu.
Now, it is the main attraction for a growing number of restaurants and shoppes.
The standard “cup of coffee” is 8 oz. However, even a Tall at Starbucks is 12 oz.
Please answer the following questions:
a) How many ounces of coffee do you think people typically drink each day? _________
b) How many ounces of coffee do you drink daily? _________
Note: We will only consider the population of regular coffee drinkers.
15.2 Stating Hypotheses
We have two possible hypotheses about this situation:
1. The mean amount of coffee consumed daily is not different from the value listed in a).
2. The mean amount of coffee consumed daily is different from the value listed in a).
Chapter 15, page 1
These hypotheses have names:
1. The _______________________ is the claim tested about the population parameter.
The test is designed to assess the strength of the _______________________ the null
hypothesis. Usually the null hypothesis is a statement of “no effect” or “no difference.”
It commonly assumes the ___________________________________.
2. The _________________________________ is the claim about the population parameter
that we are trying to find ______________________.
An alternative hypothesis is ________________ if it states that a parameter
is _____________________ or _______________________ the null hypothesis value.
It is _________________ if it states that the parameter is __________________ the null value
(it could be either smaller or larger).
Question: Is the alternative hypothesis in our situation one-sided or two-sided?
Example: Let’s use one of the values from a) to compose the null and alternative hypotheses.
Data: Suppose it is known that the standard deviation for daily coffee consumption is 9.2 oz.
The average amount of coffee consumed daily for a random sample of 48 people is 26.31 oz.
True or False:
We can conclude that the average amount of coffee consumed daily is different from our
hypothesized value.
Important note: Base your alternative hypothesis on your question of interest—do not base it on
the data.
15.3 P-value & Statistical Significance
A ___________________________ calculated from the sample data
measures how far the data departs from what we would expect if the null hypothesis were true.
The further this statistic is from 0, the more the data contradicts the null hypothesis.
Note: A test statistic tells us how many standard deviations our value is away from the
hypothesized mean. A positive test statistic is above the mean. A negative one is below the
mean. We then use this information to figure out how likely it is to see results like ours if the null
hypothesis were true.
The probability, computed assuming that the null hypothesis is true, that the test statistic would
take a value as extreme or more extreme than that actually observed is called the
_________________________ of the test.

If the _____________ is ___________ enough, the data we observed would be ________
(very unlikely to have happened) if the null hypothesis were true.

If the _________ is _______________ enough, the data we observed are ____________
at all (could plausibly have happened due to sampling variability) if the null hypothesis
were true.
Tests of Significance & the Justice System
Tests of Significance | Justice System
Null Hypothesis | The defendant gets the “benefit of the doubt” (i.e., they are not guilty).
Alternative Hypothesis | They are “guilty.”
Test Statistic | Totality of evidence collected.
P-value | The probability of observing data as extreme as what was collected, computed under the assumption that the defendant is, indeed, “not guilty.”
When the evidence collected seems “likely” (based upon the null hypothesis):
Decision: The jury rules that the defendant is “not guilty.”
When the evidence collected seems “extremely unlikely” (based upon the null hypothesis):
Decision: Either we have “bad” data (mistrial, tampering, etc.), or the jury rules that the defendant is “guilty.”
Note: Our jury system assumes innocent until proven guilty.
The actual truth of whether the person indeed committed the crime may never be known.
Question: What is the cut-off between “likely,” “unlikely,” and “extremely unlikely?”
If the P-value is as small or smaller than α, we say that the data are statistically significant at
level α. The quantity α is called the significance level or the level of significance.
P-value vs. α | Decision about H0
P-value > α | ____________ H0
P-value ≤ α | ____________ H0
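The comparison of a P-value with α can be sketched in a few lines of Python. This is only an illustration of the rule stated above; the function name decide and the default α = 0.05 are assumptions made for the example, not part of the course materials.

```python
def decide(p_value, alpha=0.05):
    """Return the decision about H0 for a given P-value and significance level.

    Hypothetical helper for illustration; alpha=0.05 is an assumed default.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a P-value must lie between 0 and 1")
    # Statistically significant at level alpha when the P-value is
    # as small or smaller than alpha.
    if p_value <= alpha:
        return "reject H0"
    return "fail to reject H0"


if __name__ == "__main__":
    print(decide(0.03))        # compared with alpha = 0.05
    print(decide(0.03, 0.01))  # same P-value, stricter alpha
```

The same P-value can lead to different decisions under different choices of α, which is one reason the level is fixed before the data are examined.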
Question: Why should a significance level be set before the test has been done?
The test statistic for hypothesis testing is based upon our work from sampling distributions
and confidence intervals.
15.4 Tests for a Population Mean
Tests of significance allow researchers to assess the evidence for or against certain hypotheses
based upon P-values. There are various parameters that we can test (proportions, standard deviations,
etc.). We will begin with the most common parameter to be tested, the mean; much like how
we began our confidence interval discussion by estimating the true mean, μ.
Draw an SRS of size n from a large population that has the Normal distribution with mean μ and
standard deviation σ. The one-sample z statistic
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
has the standard Normal (z) distribution.
To test the hypothesis H0: μ = μ0, compute the ______________________
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
Key Words | Alternative Hypothesis | P-Value | Rejection Region
“more than,” “increased” | Ha: μ > μ0 | P (Z ≥ z) | (sketch)
“less than,” “reduced” | Ha: μ < μ0 | P (Z ≤ z) | (sketch)
“different,” “is not” | Ha: μ ≠ μ0 | 2*P (Z ≥ |z|) | (sketch)
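The one-sample z statistic and the three P-value rules in the table can be sketched in Python as follows. This is a minimal illustration, assuming σ is known and the population is Normal; the function name one_sample_z_test, the labels "greater", "less", and "two-sided", and the numbers in the usage lines are hypothetical and chosen only for the example.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(xbar, mu0, sigma, n, alternative="two-sided"):
    """Return (z, P-value) for a one-sample z test with known sigma.

    alternative: "greater" for Ha: mu > mu0,
                 "less" for Ha: mu < mu0,
                 "two-sided" for Ha: mu != mu0.
    """
    if sigma <= 0 or n <= 0:
        raise ValueError("sigma and n must be positive")
    # Number of standard errors between the sample mean and mu0
    z = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        p_value = 1.0 - std_normal.cdf(z)              # P(Z >= z)
    elif alternative == "less":
        p_value = std_normal.cdf(z)                    # P(Z <= z)
    elif alternative == "two-sided":
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))  # 2*P(Z >= |z|)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


if __name__ == "__main__":
    # Made-up numbers, for illustration only
    z, p = one_sample_z_test(xbar=52.0, mu0=50.0, sigma=10.0, n=25)
    print(f"z = {z:.2f}, two-sided P-value = {p:.4f}")
```

The two-sided branch uses |z| so that the doubled tail area is the same whether the sample mean falls above or below μ0.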
Back to our example…
Example: The standard deviation of daily coffee consumption is 9.2 oz. A random sample of 48
people consumed an average of 26.31 oz. of coffee daily. Is this evidence that the average
amount of coffee consumed daily is different from our original estimate?
Poll: Using your intuition, do you believe we have enough evidence against our original claim?
(a) Yes
(b) No
Let’s conduct the test of significance.
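The arithmetic of this test can also be sketched in Python. The values σ = 9.2, n = 48, and x̄ = 26.31 come from the example; the hypothesized means tried in the loop (20, 24, and 26 oz) are purely hypothetical stand-ins for whatever value was written down in question a), and the function name coffee_test is an assumption made for the illustration.

```python
from math import sqrt
from statistics import NormalDist

# Given in the example
SIGMA = 9.2     # known population standard deviation (oz)
N = 48          # sample size
XBAR = 26.31    # sample mean (oz)


def coffee_test(mu0):
    """Two-sided z test of H0: mu = mu0 against Ha: mu != mu0."""
    standard_error = SIGMA / sqrt(N)
    z = (XBAR - mu0) / standard_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical answers to question a); replace with your own value
    for mu0 in (20.0, 24.0, 26.0):
        z, p_value = coffee_test(mu0)
        print(f"mu0 = {mu0:4.1f} oz   z = {z:6.2f}   P-value = {p_value:.4f}")
```

Running the loop shows how strongly the conclusion depends on the hypothesized value, which is why the hypotheses are stated before looking at the data.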
Technology Tips – Conducting Tests of Significance (σ known)
TI-83/84
STAT → TESTS → Z-Test → Enter.
Select Stats. Enter μ0, σ, x̄, and n. Select Calculate.
(Note: Select Data when x̄ and n are not provided. Then enter the list where the data are stored.)
JMP
Enter the data. Analyze → Distribution. “Click-and-Drag” (the appropriate variable)
into the ‘Y, Columns’ box. Click on OK.
Click on the red upside-down triangle next to the title of the variable from the
‘Y, Columns’ box. Proceed to ‘Test Mean.’ Enter μ0 and σ, and click on OK.
The 4-Step Process As Applied to Tests of Significance
1. ________: What is the practical question that requires a statistical test?
2. ________:
a) Identify the parameter.
b) List all given information from the data collected.
n: ________________
c) State the null (H0) and alternative (HA) hypotheses.
H0: _________________
HA: _________________
d) Specify the level of significance, α = __________
e) Determine the type of test: Left-tailed, Right-tailed, or Two-tailed.
f) Sketch the region(s) of “extremely unlikely” test statistics.
3. _______:
a) Check the conditions for the test you plan to use.
Random sample?
Population : Sample Ratio?
Large enough sample for Normality?
b) Calculate the test statistic.
c) Determine (or estimate) the P-value.
4. _________:
a) Make a decision about the null hypothesis (Reject H0 or Fail to reject H0).
b) Interpret the decision in the context of the original claim.
(i.e., “There is enough (or not enough) evidence at the α level of significance that …”)
Example: Recall that IQ scores from Chapter 14 followed a Normal distribution with σ = 15.
You suspect that persons from affluent communities have IQ scores above 100. A random
sample of 35 residents of an affluent community had an average IQ score of 112. Is there
significant evidence to support your claim at the α = .05 level?
The 4-Step Process As Applied to Tests of Significance
1. State: What is the practical question that requires a statistical test?
2. Plan:
a) Identify the parameter.
b) List all given information from the data collected.
n: ____________________
c) State the null (H0) and alternative (HA) hypotheses.
H0: ____________________
HA: ____________________
d) Specify the level of significance, α = _____________________
e) Determine the type of test: Left-tailed, Right-tailed, or Two-tailed.
f) Sketch the region(s) of “extremely unlikely” test statistics.
3. Solve:
a) Check the conditions for the test you plan to use.
Random sample?
Population : Sample Ratio?
Large enough for Normality?
b) Calculate the test statistic.
c) Determine (or estimate) the P-value.
4. Conclude:
a) Make a decision about the null hypothesis (Reject H0 or Fail to reject H0).
b) Interpret the decision in the context of the original claim.
(i.e., “There is enough (or not enough) evidence at the α level of significance that …”)
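The Solve step for the IQ example can be checked with a short Python sketch. The values n = 35, x̄ = 112, σ = 15, and α = .05 come from the example; taking μ0 = 100 with a right-tailed alternative is an assumption read off from the claim “IQ scores above 100,” and the variable names are chosen only for this illustration.

```python
from math import sqrt
from statistics import NormalDist

# Given in the example
n = 35          # sample size
xbar = 112.0    # sample mean IQ
sigma = 15.0    # known population standard deviation
alpha = 0.05    # level of significance

# Assumed from the claim "above 100": H0: mu = 100 vs. HA: mu > 100
mu0 = 100.0

# Test statistic: standard errors between the sample mean and mu0
z = (xbar - mu0) / (sigma / sqrt(n))

# Right-tailed P-value: P(Z >= z) under H0
p_value = 1.0 - NormalDist().cdf(z)

decision = "Reject H0" if p_value <= alpha else "Fail to reject H0"
print(f"z = {z:.2f}, P-value = {p_value:.2e}, decision: {decision}")
```

The printed decision still has to be interpreted in the context of the original claim, as step 4 b) requires.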
Note:
Some homework exercises will provide you with raw data.
You are to use the data to compute the sample mean and/or standard deviation.
Then proceed with computing the confidence interval or performing a test of significance.
15.5 Significance from a Table
The graphing calculator and JMP provide the most accurate P-value calculations. Tables can also
be used to estimate P-values.
There are two methods of determining the P-value for a z-statistic.
Table C:
1. Compare z with the critical values z* at the bottom of Table C.
2. If z falls between two values of z*, then the P-value falls between the two corresponding
values of P in the “One-sided P” or the “Two-sided P” row of Table C.
Table A:
1. Compute the P-value, which is:
a) P (Z > z) for a Right-tailed test.
b) P (Z < z) for a Left-tailed test.
c) 2*P (Z > |z|) for a Two-tailed test.
2. Compare the P-value with α.
Using technology to compute P-values is preferred.
For our purposes, using Table C is a suitable alternative to technology. This may not produce the
same accuracy as the other options, but it will strengthen estimation skills.
Example: The z-statistic for a left-tailed test is z = -1.45. How significant is this result?
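Both approaches to this example can be sketched in Python. The exact tail area plays the role of Table A or technology, and the bracketing function imitates reading a critical-value table. The list critical_values holds a few standard Normal critical values with their one-sided tail probabilities; it is assumed to mirror part of the bottom rows of Table C, and the function name bracket_one_sided_p is hypothetical.

```python
from statistics import NormalDist

z = -1.45  # z statistic for the left-tailed test in the example

# Exact left-tail area: P(Z <= z)
p_value = NormalDist().cdf(z)

# (z*, one-sided P) pairs for the standard Normal distribution.
# Assumed to match a subset of Table C; check against your own table.
critical_values = [
    (1.282, 0.10),
    (1.645, 0.05),
    (1.960, 0.025),
    (2.326, 0.01),
    (2.576, 0.005),
]


def bracket_one_sided_p(z, table):
    """Bound the one-sided P-value using only tabled critical values.

    Assumes z lies in the direction of the alternative hypothesis.
    Returns (lower, upper) with lower <= P-value <= upper.
    """
    lower, upper = 0.0, 0.5
    for z_star, tail in table:  # table sorted by increasing z*
        if abs(z) >= z_star:
            upper = tail        # at least this extreme
        else:
            lower = tail        # not yet this extreme
            break
    return lower, upper


if __name__ == "__main__":
    low, high = bracket_one_sided_p(z, critical_values)
    print(f"Exact P-value: {p_value:.4f}")
    print(f"Table bracket: {low} < P-value < {high}")
```

Here |z| = 1.45 falls between the critical values 1.282 and 1.645, so the table places the one-sided P-value between .05 and .10; the result is significant at the .10 level but not at the .05 level.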
Five-Minute Summary:
List at least 3 concepts that had the most impact on your knowledge of tests of significance.
_______________
______________
____________