Chapter 15
Tests of Significance: The Basics
Statistical inference provides methods for drawing conclusions about a
population from sample data.
Two of the most common types of statistical inference:
1) Confidence intervals
Goal is to estimate a population parameter.
2) Tests of Significance
Goal is to assess the evidence provided by the data about some claim
concerning the population.
Basic Idea of Tests of Significance
The reasoning of statistical tests, like that of confidence intervals, is based on
asking what would happen if we repeated the sample or experiment many times.
Example
Each day Tom and Mary decide who pays for lunch based on a toss of Tom’s
favorite quarter.
Heads - Tom pays
Tails - Mary pays
Tom says that the quarter has an even chance of landing heads or tails.
Mary thinks she pays more often.
Mary steals the quarter, tosses it 10 times, gets 7 tails (70% tails).
She is furious and claims that tossing this coin gives unbalanced results.
There are two possibilities:
1. Tom is telling the truth: the chance of tails is 50%, and the observed 7 tails out of
10 tosses was only due to sampling variability.
2. Tom is lying: the chance of tails is greater than 50%.
Suppose they call you to decide between 1 and 2 (maybe because they realized
that they need a statistician to solve this problem!!).
To be fair to both of them, you toss the quarter 25 times. Suppose you get 21
tails.
If the coin is fair, the probability of getting 21 or more tails in 25 tosses
is _______.
What would you conclude? Why?
Moral of the story: an outcome that would rarely happen if a claim were true is
good evidence that the claim is not true.
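The tail probabilities in the coin example can be checked numerically. The following is a minimal sketch, assuming the tosses are independent and the coin is fair (chance of tails 0.5); the function name `binomial_upper_tail` is a hypothetical helper, not something from the notes.

```python
from math import comb


def binomial_upper_tail(n: int, k_min: int, p: float = 0.5) -> float:
    """Probability of at least k_min successes in n independent trials,
    each with success probability p (binomial model, assumed here)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Mary's experiment: 7 or more tails in 10 tosses of a fair coin.
p_mary = binomial_upper_tail(10, 7)

# Your experiment: 21 or more tails in 25 tosses of a fair coin.
p_yours = binomial_upper_tail(25, 21)

print(f"P(at least 7 tails in 10 tosses)  = {p_mary:.4f}")
print(f"P(at least 21 tails in 25 tosses) = {p_yours:.6f}")
```

Under these assumptions, Mary's result is not very unusual for a fair coin (roughly a 17% chance), while 21 or more tails in 25 tosses would happen less than once in a thousand repetitions.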
Tests of Significance
A significance test is a formal procedure for comparing observed data with a
hypothesis whose truth we want to assess.
The results of a test are expressed in terms of a probability that measures how
well the data and hypothesis agree.
(1) Stating hypotheses
A hypothesis is a statement about the parameters in a population, e.g.
State your research question as two hypotheses: the null hypothesis and the
alternative hypothesis. Remember that these are written in terms of the population
parameters!
The null hypothesis (H0) is the statement being tested. This is assumed “true”
and compared to the data to see if there is evidence against it. Typically, H0 is a
statement of “no difference” or “no effect”.
Suppose we want to test the null hypothesis that μ is some specified value, say
μ0. Then
H0: μ = μ0
(Note: We always express H0 using an equality sign.)
The alternative hypothesis (Ha) is the statement about the population parameter
that we hope or suspect is true. We want to see whether the data support this
hypothesis.
Ha can be one-sided (e.g. Ha: μ < μ0 or Ha: μ > μ0) or two-sided (e.g. Ha: μ ≠ μ0).
Ex: Gellogg’s Strawberry bars
Gellogg’s says that its Strawberry bars weigh, on average, 16 oz. A consumer
union is suspicious that the bars weigh less than what is claimed. In order to
check their suspicion, they weigh the contents of 20 randomly chosen bars. These
20 bars have an average weight of 15.6 oz. Assume that the weights follow a
normal distribution with standard deviation 0.7 oz. Is there evidence that the
consumer union’s suspicion is correct?
Let μ be the true average weight of the strawberry bars. The hypotheses are:
(2) Calculate P-value
We ask: Does the sample give evidence against the null hypothesis?
In the “Gellogg’s example”, this means
To answer this, we find
Test statistic: A test statistic calculated from the sample data measures how far the
data diverge from what we would expect if the null hypothesis H0 were true.
P-value: The probability that the sample mean would take a value as extreme or
more extreme than the one we actually observed if H0 is true.
A small P-value is strong evidence ____________ H0. Such a P-value says that if
H0 is true, then the observed data is unlikely to occur just by chance.
The smaller the P-value,
In the “Gellogg’s example”, the P-value answers the question “what is the probability
of getting a sample of 20 bars whose mean weight is less than or equal to 15.6 oz,
if the true mean is 16 oz?”
Note that we could divide this step in two parts:
(i) Calculate the test statistic (Z-score)
(ii) Calculate P-value in terms of the test statistic
P-values in terms of the test statistic:
Ha          P-value          Area under curve
μ < μ0      Pr(Z ≤ z)        left tail
μ > μ0      Pr(Z ≥ z)        right tail
μ ≠ μ0      2Pr(Z ≥ |z|)     both tails
where z is the observed value of the test statistic and the probabilities are found
using the standard normal distribution given in Table A.
A P-value is exact if the population distribution is normal; otherwise, it
approximates the true probability for large n. Because of which theorem?
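The three cases in the table can be sketched in code as follows. This is an illustration only: it uses Python's `statistics.NormalDist` in place of Table A, and the function name `z_test_p_value` and the labels "less", "greater" and "two-sided" are hypothetical choices, not notation from the notes.

```python
from statistics import NormalDist


def z_test_p_value(z: float, alternative: str) -> float:
    """P-value for an observed z statistic under the standard normal distribution.

    alternative: "less"      for Ha: mu < mu0   -> Pr(Z <= z)
                 "greater"   for Ha: mu > mu0   -> Pr(Z >= z)
                 "two-sided" for Ha: mu != mu0  -> 2 Pr(Z >= |z|)
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "greater":
        # By symmetry of the normal curve, Pr(Z >= z) = Pr(Z <= -z).
        return std_normal.cdf(-z)
    if alternative == "two-sided":
        return 2 * std_normal.cdf(-abs(z))
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")


# Illustrative value only (not taken from any example in the notes).
z_observed = -1.5
for alt in ("less", "greater", "two-sided"):
    print(f"{alt:>9}: P-value = {z_test_p_value(z_observed, alt):.4f}")
```

Using the symmetry of the normal curve for the upper tail avoids computing 1 minus a number very close to 1, which matters when z is far from zero.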
In the “Gellogg’s example”, the test statistic and the corresponding P-value are
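One way to check the numbers for this example is sketched below. It assumes the hypotheses are H0: μ = 16 against Ha: μ < 16 (our reading of the consumer union's suspicion that the bars weigh less) and uses the usual one-sample z statistic, z = (sample mean − μ0) / (σ / √n); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values stated in the example.
mu0 = 16.0    # claimed mean weight under H0 (oz)
x_bar = 15.6  # observed sample mean (oz)
sigma = 0.7   # population standard deviation, assumed known (oz)
n = 20        # sample size

# Test statistic: how many standard errors the sample mean lies from mu0.
z = (x_bar - mu0) / (sigma / sqrt(n))

# One-sided alternative Ha: mu < 16 (assumed), so the P-value is Pr(Z <= z).
p_value = NormalDist().cdf(z)

print(f"z = {z:.2f}")
print(f"P-value = {p_value:.4f}")
```

With these inputs the statistic is about −2.56 and the P-value is about 0.005.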
(3) Statistical Significance
Prior to testing, it is determined how small the P-value must be to be
considered decisive evidence against H0.
The significance level (usually denoted α) is the probability threshold at or below
which we consider a result statistically significant. Typical α levels are 0.1, 0.05
and 0.01.
If P-value ≤ α,
If P-value > α,
If the P-value ≤ α we say the data are statistically significant at level α.
Note that when we do not reject H0 we are not claiming H0 is true. We are just
concluding there is not sufficient evidence to reject it.
The final step is to decide whether the evidence is strong enough to reject H0 in
favor of Ha. This is accomplished using the P-value.
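The decision rule can be written as a one-line comparison. The sketch below is illustrative: the function name `is_significant` is hypothetical, and the P-value of 0.03 is a made-up number chosen to show how the same data can be significant at one level but not at another.

```python
def is_significant(p_value: float, alpha: float) -> bool:
    """True when the data are statistically significant at level alpha,
    i.e. when the P-value is at or below alpha."""
    return p_value <= alpha


# Hypothetical P-value, for illustration only.
example_p_value = 0.03

for alpha in (0.10, 0.05, 0.01):
    if is_significant(example_p_value, alpha):
        decision = "reject H0"
    else:
        # Not rejecting H0 is not the same as showing that H0 is true.
        decision = "do not reject H0"
    print(f"alpha = {alpha:.2f}: {decision}")
```

Here a P-value of 0.03 leads to rejecting H0 at the 0.10 and 0.05 levels but not at the 0.01 level, which is why α is fixed before looking at the data.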
In the “Gellogg’s example”, we got P-value =
What this tells us: If H0 is true (i.e., true mean weight is 16 oz), then the chance
of getting a sample whose mean weight is 15.6 oz or less is
If the significance level is α = 0.05, does it give evidence against H0?
Conclusion:
Tests for a Population Mean
Tests of significance: The four-step process
(1) STATE: What is the practical question that requires a statistical test?
(2) PLAN: Identify the parameter of interest, state null and alternative
hypotheses, fix the significance level α, and choose the type of test that fits your
situation.
(3) SOLVE: Carry out the work in three phases:
(i) Check the conditions for the test you plan to use.
(ii) Calculate the test statistic.
(iii) Find the P-value and state the conclusion.
(4) CONCLUDE: Return to the practical question to describe your results in this
setting.
Ex: Suppose last year Ameritech’s repair service took an average of 3.2 days to
fix customer complaints. One of the managers is assigned to check whether this
year’s data show a different average time to fix the problems. He collects a
random sample of 30 customer complaints and finds that the average time taken
to fix them is 2.1 days. Assume that the standard deviation of the time taken to
fix the complaints is 2.5 days. Is this good evidence at the 10% level that the
average time taken to fix the complaints differs from 3.2 days?
STATE:
PLAN:
SOLVE:
Check the conditions:
Calculate the test statistic:
Calculate P-value & state the conclusion:
CONCLUDE:
Is this conclusion valid even if the original population of repair times is somewhat
non-normal?
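The SOLVE step for this example can be checked with a short calculation. This sketch reads the manager's question ("a different average time") as a two-sided alternative, H0: μ = 3.2 against Ha: μ ≠ 3.2, and treats the standard deviation of 2.5 days as known; both readings are assumptions, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values stated in the example.
mu0 = 3.2     # last year's mean repair time (days)
x_bar = 2.1   # this year's sample mean (days)
sigma = 2.5   # standard deviation, assumed known (days)
n = 30        # sample size
alpha = 0.10  # significance level

# Test statistic.
z = (x_bar - mu0) / (sigma / sqrt(n))

# Two-sided P-value (assumed alternative): 2 * Pr(Z >= |z|).
p_value = 2 * NormalDist().cdf(-abs(z))

print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```

Under these assumptions z is about −2.41 and the two-sided P-value is about 0.016, which is below α = 0.10.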
Ex: Home Depot sells concrete blocks. The store manager wants to estimate the
average weight of all blocks in stock. A simple random sample of 64 blocks has a
mean weight of 65.5 lbs. Assume that the weights of blocks are normally
distributed with standard deviation 4.6 lbs.
The store manager wants to know whether the mean weight of all blocks is 68 lbs
or not (at the 5% level). State the appropriate hypotheses. Follow the four-step
process.
STATE:
PLAN:
SOLVE:
CONCLUDE:
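For the numerical part of this exercise, a small helper can bundle the test statistic and the two-sided P-value. This is a sketch under the assumption that "is 68 lbs or not" means H0: μ = 68 against Ha: μ ≠ 68; the function name `two_sided_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(x_bar: float, mu0: float, sigma: float, n: int) -> tuple[float, float]:
    """Return (z, P-value) for H0: mu = mu0 against Ha: mu != mu0,
    with the population standard deviation sigma treated as known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


# Values stated in the example: 64 blocks, mean 65.5 lbs, sigma 4.6 lbs.
z, p_value = two_sided_z_test(x_bar=65.5, mu0=68.0, sigma=4.6, n=64)
alpha = 0.05

print(f"z = {z:.2f}, P-value = {p_value:.2e}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```

With these inputs z is about −4.35, so the P-value is far below 0.05 (on the order of one in a hundred thousand).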