Download Type I error - Lincoln High School

Daniel S. Yates
The Practice of Statistics
Third Edition
Chapter 10:
Estimating with Confidence
Copyright © 2008 by W. H. Freeman & Company
Ex.
Suppose a sample of 50 men had a mean score of 109 on an intelligence test.
• We can estimate that the population mean μ is approximately 109.
• x̄ is approximately normally distributed.
• The mean of the sampling distribution of x̄ is equal to μ, the unknown population
mean.
• The standard deviation of x̄ for an SRS of 50, given the population
standard deviation σ = 15, is 15/√50 ≈ 2.1.
• The 68-95-99.7 rule states that about 95% of all possible sample means x̄
will be within 2 standard deviations of the population mean μ.
• In 95% of all possible samples, the unknown μ lies within x̄ ± 4.2.
• We are 95% confident that μ lies within 109 ± 4.2; that is, in (104.8, 113.2).
• There are only two possibilities:
1. The interval between 104.8 and 113.2 contains the true population mean μ.
2. Our SRS was one of the few samples for which x̄ is not within 4.2
points of the true μ. Only 5% of all samples give such inaccurate results.
• The method we used gives the correct result 95% of the time.
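The arithmetic in this example can be checked with a short Python sketch. The variable names are illustrative, and the multiplier 2 is the one from the 68-95-99.7 rule used in the example, not the more precise z* = 1.960.

```python
import math

# Values from the example: 50 men, sample mean 109, population SD 15.
n = 50
x_bar = 109.0
sigma = 15.0

# Standard deviation of the sample mean for an SRS of size n.
sd_x_bar = sigma / math.sqrt(n)

# The 68-95-99.7 rule uses 2 standard deviations for about 95%.
margin = 2 * sd_x_bar
lower, upper = x_bar - margin, x_bar + margin

print(f"SD of x-bar: {sd_x_bar:.2f}")
print(f"margin of error: {margin:.2f}")
print(f"approximate 95% interval: ({lower:.1f}, {upper:.1f})")
```

Without intermediate rounding the margin is about 4.24 instead of 4.2, which gives the same interval to one decimal place.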
Applet showing confidence intervals:
http://onlinestatbook.com/stat_sim/conf_interval/index.html
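In the same spirit as the applet, the claim that the method is correct about 95% of the time can be explored with a small simulation. This is only a sketch: the population (normal, mean 100, standard deviation 15), the number of trials, and the function name `coverage` are assumptions made for illustration.

```python
import math
import random

def coverage(mu, sigma, n, z_star, trials, seed=0):
    """Fraction of simulated intervals x_bar +/- z_star*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    margin = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

# Hypothetical population: mean 100, SD 15 (assumed values).
rate = coverage(mu=100.0, sigma=15.0, n=50, z_star=1.960, trials=2000)
print(f"share of intervals containing mu: {rate:.3f}")
```

The printed share should be close to 0.95. Any single interval either contains μ or does not; the 95% describes the method over many samples.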
Suppose you want to construct an 80% confidence interval for μ.
• The confidence level is usually chosen to be 0.90 or higher.
Confidence level    Tail area    z*
80%                 0.1          1.282
90%                 0.05         1.645
95%                 0.025        1.960
99%                 0.005        2.576
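The critical values in the table can be reproduced from the standard normal distribution. A minimal sketch using Python's standard library follows; the helper name `z_star` is chosen here for illustration.

```python
from statistics import NormalDist

def z_star(confidence):
    """Critical value with area (1 - confidence)/2 in each tail of N(0, 1)."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%}  tail area {(1 - level) / 2:.3f}  z* = {z_star(level):.3f}")
```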
confidence interval = estimate ± margin of error
Ex.
A questionnaire of 160 hotel managers asked how long they had been with their current
company. The average time was reported as 11.78 years. Give a 99% confidence
interval for the mean number of years that the entire population of managers has been
with their current company. Assume the standard deviation of the population is σ =
3.2 years.
11.78 ± 2.576(3.2/√160) = 11.78 ± 0.652 = (11.128, 12.432)
We are 99% confident that the true population mean lies between 11.128 and 12.432 years.
The method we used will give the correct result 99% of the time.
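The same calculation can be written as a small Python function. This is a sketch that assumes the population standard deviation is known, as the example does; the function name `z_interval` is illustrative.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a mean when the population SD sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hotel manager example: n = 160, x-bar = 11.78, sigma = 3.2, 99% confidence.
low, high = z_interval(x_bar=11.78, sigma=3.2, n=160, confidence=0.99)
print(f"99% interval: ({low:.3f}, {high:.3f})")
```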
• Ideally, we would like:
1) high confidence: the method almost always gives the right result,
and
2) a small margin of error: the population parameter is estimated very precisely.
The margin of error decreases when:
1) z* gets smaller, but this makes the confidence level smaller;
2) σ is small, that is, the sample is drawn from a less spread-out population;
3) the sample size n is large. Quadrupling the sample size cuts the margin of error in half.
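Each of these three effects can be seen by evaluating the margin of error z* σ/√n directly. The numbers below (σ = 15, n = 50, and the alternatives) are assumed values picked only to illustrate the comparison.

```python
import math

def margin_of_error(z_star, sigma, n):
    """Margin of error z* * sigma / sqrt(n) for a z interval."""
    return z_star * sigma / math.sqrt(n)

# Illustrative baseline (assumed values): 95% confidence, sigma = 15, n = 50.
base = margin_of_error(1.960, 15.0, 50)

lower_conf = margin_of_error(1.645, 15.0, 50)     # smaller z* (90% confidence)
smaller_sigma = margin_of_error(1.960, 10.0, 50)  # less spread-out population
quadrupled = margin_of_error(1.960, 15.0, 200)    # four times the sample size

print(f"baseline:       {base:.3f}")
print(f"smaller z*:     {lower_conf:.3f}")
print(f"smaller sigma:  {smaller_sigma:.3f}")
print(f"n quadrupled:   {quadrupled:.3f}")
```

Because n enters through a square root, quadrupling n divides the margin by √4 = 2.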
How to choose a sample size for a desired margin of error.
Ex.
How many observations must be made to produce results accurate to within ± 0.005
with 95% confidence? Assume σ = 0.0068.
z* σ/√n ≤ 0.005 => (1.960 × 0.0068/0.005)² ≤ n => 7.1 ≤ n; choose n greater than
or equal to 8.
You must round up to the next integer.
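The sample-size rule, including the rounding up, can be sketched in a few lines of Python. The function name `required_sample_size` is an illustrative choice.

```python
import math

def required_sample_size(z_star, sigma, max_margin):
    """Smallest integer n with z_star * sigma / sqrt(n) <= max_margin."""
    return math.ceil((z_star * sigma / max_margin) ** 2)

# Example values: 95% confidence, sigma = 0.0068, margin of error 0.005.
n_needed = required_sample_size(z_star=1.960, sigma=0.0068, max_margin=0.005)
print(n_needed)
```

Rounding down to 7 would leave the margin of error slightly above 0.005, which is why the result is always rounded up.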
It is incorrect to say that the probability is 95% that
the true mean lies within a particular calculated interval.
We can say that we are 95% confident that the mean
lies within the interval; that is, the method we used
to calculate the interval gives a correct result in
95% of all possible samples of a particular size.
Tests of Significance
• Significance tests assess the evidence provided by the data in favor of some claim
about the population.
• Significance tests begin by stating a hypothesis about a population parameter.
• The null hypothesis Ho is always stated as an equality.
Ho: μ = μo
• The alternative hypothesis Ha can be stated in one of three ways.
Ha: μ ≠ μo
Ha: μ < μo
Ha: μ > μo
Ex.
A car manufacturer claims that one of its car models gets 33 mpg. A random
sample of 30 cars is selected and the mean gas mileage of this sample, x̄, is
calculated to be 31 mpg. Can we refute the claim of the automaker? Assume
σ = 3.5 mpg.
Ho: μ = 33 mpg
Ha: μ < 33 mpg
x̄ = 31 mpg; standard deviation of x̄ = 3.5/√30 = 0.639
z = (31 - 33)/0.639 = -3.13
P(z < -3.13) = 0.00087
[Figure: normal curve for x̄ centered at 33 with standard deviation 0.639; the area to the left of x̄ = 31 is 0.00087.]
• x̄ = 31 is far out in the lower tail of the normal curve, so far out that a result this small
would almost never occur by chance if the true μ = 33 mpg.
• This is good evidence that the automaker's claim should be rejected in favor of the
alternative hypothesis, μ < 33 mpg.
• Generally, P-values < 0.05 are considered small enough to reject Ho. Such a result is
called statistically significant.
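The test statistic and P-value for the gas mileage example can be computed as follows. This is a sketch for the one-sided (lower-tail) case with known population standard deviation; the function name `lower_tail_z_test` is illustrative.

```python
import math
from statistics import NormalDist

def lower_tail_z_test(x_bar, mu_0, sigma, n):
    """z statistic and P-value for Ho: mu = mu_0 against Ha: mu < mu_0."""
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    p_value = NormalDist().cdf(z)  # area to the left of z under N(0, 1)
    return z, p_value

# Gas mileage example: x-bar = 31, claimed mean 33, sigma = 3.5, n = 30.
z, p_value = lower_tail_z_test(x_bar=31.0, mu_0=33.0, sigma=3.5, n=30)
print(f"z = {z:.2f}, P-value = {p_value:.5f}")
```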
Significance level
• We compare the P-value with a fixed value that we regard as decisive.
• The decisive value of P is called the significance level, denoted by the symbol α.
• Choosing α = 0.05 requires that the data give evidence against Ho so extreme that it
would happen in no more than 5% of the possible samples if Ho is true. Choosing α = 0.01 requires
that the data give evidence against Ho so extreme that it would happen in no more than
1% of the possible samples if Ho is true.
If the P-value is low, reject Ho.
• If the P-value is as small as or smaller than α, we say that the data are
statistically significant at level α = _____. The null hypothesis should be
rejected in favor of the alternative hypothesis.
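The decision rule is a single comparison, sketched below with the P-value from the gas mileage example. The function name `is_significant` and the extra level 0.0001 are illustrative assumptions, included only to show a case where the same data would not be significant.

```python
def is_significant(p_value, alpha):
    """True when the P-value is as small as or smaller than alpha."""
    return p_value <= alpha

p_value = 0.00087  # P-value from the gas mileage example
for alpha in (0.10, 0.05, 0.01, 0.0001):
    verdict = "reject Ho" if is_significant(p_value, alpha) else "fail to reject Ho"
    print(f"alpha = {alpha}: {verdict}")
```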
[Figure: one-sided test and two-sided test]
Choosing an α level in significance tests
• If Ho represents an assumption that the people you must convince have
believed for a long period of time, strong evidence (small α) is needed to
persuade them.
• If the consequences of rejecting Ho are drastic (i.e., expensive or final), you
may want strong evidence (small α).
• It may be more useful to report the P-value so that each individual may decide
for themselves.
• Even though significance levels of 0.10, 0.05, and 0.01 have been used
traditionally, the border between significant and not significant is not black
and white. There is not much difference between P-values of 0.049 and 0.051.
• No significance level is sacred.
Inference as decision
Type I and Type II errors
• If we reject Ho (accept Ha) when Ho is really true, this is a
Type I error.
• If we reject Ha (accept Ho) when Ha is really true, this is a
Type II error.
             Ho True             Ha True
Reject Ho    Type I error        Correct decision
Reject Ha    Correct decision    Type II error
Significance and Type I error
• The significance level α of any fixed-level significance test is equal
to the probability of making a Type I error.
• The value of α is the probability that the test will reject the null
hypothesis Ho when Ho is really true.
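This link between the significance level and the Type I error rate can be illustrated by simulation: draw many samples from a population in which Ho is true and count how often a test at that level rejects. The sketch below reuses the gas mileage numbers (mean 33, standard deviation 3.5, n = 30); the normal population, the number of trials, and the function name are assumptions.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(mu_0, sigma, n, alpha, trials, seed=0):
    """Share of samples drawn with Ho true for which a lower-tail z test rejects Ho."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(alpha)  # reject when z <= z_crit
    se = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        x_bar = sum(rng.gauss(mu_0, sigma) for _ in range(n)) / n
        if (x_bar - mu_0) / se <= z_crit:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate(mu_0=33.0, sigma=3.5, n=30, alpha=0.05, trials=4000)
print(f"observed Type I error rate: {rate:.3f}")
```

The observed rate should land near 0.05, the chosen significance level.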
Power of the test
• The probability that a fixed-level α significance test will reject Ho
when Ha is true is called the power of the test.
• Increasing the sample size n increases the power of the test.
• Increasing the significance level α increases the power of the test.
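Both effects on power can be checked numerically for a one-sided z test. Power depends on which alternative is actually true, so the sketch below assumes a specific alternative mean of 32 mpg for the gas mileage setting; that value and the function name `power_lower_tail` are hypothetical choices for illustration.

```python
import math
from statistics import NormalDist

def power_lower_tail(mu_0, mu_a, sigma, n, alpha):
    """Power of the level-alpha z test of Ho: mu = mu_0 vs Ha: mu < mu_0 when mu = mu_a."""
    std_normal = NormalDist()
    se = sigma / math.sqrt(n)
    # Reject Ho when x-bar falls at or below this cutoff.
    cutoff = mu_0 + std_normal.inv_cdf(alpha) * se
    # Probability of that event when the true mean is mu_a.
    return std_normal.cdf((cutoff - mu_a) / se)

base = power_lower_tail(33.0, 32.0, 3.5, 30, 0.05)
larger_n = power_lower_tail(33.0, 32.0, 3.5, 60, 0.05)
larger_alpha = power_lower_tail(33.0, 32.0, 3.5, 30, 0.10)

print(f"n = 30, alpha = 0.05: {base:.3f}")
print(f"n = 60, alpha = 0.05: {larger_n:.3f}")
print(f"n = 30, alpha = 0.10: {larger_alpha:.3f}")
```

When the alternative mean equals the null value, the same function returns α itself, which matches the definition of the Type I error probability.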