Math 123- Statistics
Chapter 7 Notes
Name_______________________________
7.1- Introduction to Hypothesis Testing
Def- A hypothesis test is a process that uses sample statistics to test a claim about the value of a
population parameter.
Def- A statement about a population parameter is called a statistical hypothesis.
Def- A null hypothesis, H₀, is a statistical hypothesis that contains a statement of equality such as
≤, =, or ≥.
Def- The alternative hypothesis, Hₐ, is the complement of the null hypothesis. It is a statement that
must be true if H₀ is false, and it contains a statement of strict inequality such as >, ≠, or <.
Ex- State the null and alternative hypotheses for each situation.
a) The standard deviation for the height of corn is .7 feet.
b) A study claims that the mean amount of Tylenol that one needs to take to make a headache go
away is less than 130 mg.
c) The probability that people eat fish three times per week is at most .23.
Note: When doing hypothesis tests, you have two possible choices for your decision:
1. Reject H₀.
2. Fail to reject H₀.
Because of this, there are two types of errors that you can make:
Type I Error: H₀ is true, but you reject H₀.
Type II Error: H₀ is false, but you fail to reject H₀.
Ex- A tire manufacturer claims that their tires have an average life of at least 30,000 miles.
a) State the null and alternative hypotheses.
b) Determine when a Type I error occurs and when a Type II error occurs.
Note: When doing hypothesis tests, you are using your knowledge about a sample to predict what
will happen to a population. Since there is variation in samples, you can sometimes reject H₀
when it is true. To account for this variation, we allow for a margin of error in the data.
Def- In a hypothesis test, the level of significance, α, is your maximum allowable probability of
making a Type I error, i.e., P(Type I error) = α.
Def- If H₀ is true, a P-value of a hypothesis test is the probability of obtaining a sample statistic with
a value as extreme as or more extreme than the one determined from the sample data.
P-values and types of tests:
Left-tailed test:
Right-tailed test:
Two-tailed test:
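Note: The three types of tests differ only in which tail area counts as "as extreme as or more extreme than" the sample result. A minimal sketch in Python, assuming the standardized test statistic is a z-score. The function name p_value_from_z, the tail labels, and the value z = -1.5 are made up for illustration and are not part of the notes.

```python
from statistics import NormalDist


def p_value_from_z(z, tail):
    """Return the P-value for a standardized test statistic z.

    tail is "left", "right", or "two" (hypothetical labels for this sketch).
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z)                  # area to the left of z
    if tail == "right":
        return 1 - std_normal.cdf(z)              # area to the right of z
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))   # both tails beyond |z|
    raise ValueError("tail must be 'left', 'right', or 'two'")


z = -1.5  # illustrative value only
left_p = p_value_from_z(z, "left")
right_p = p_value_from_z(z, "right")
two_p = p_value_from_z(z, "two")
print(round(left_p, 4), round(right_p, 4), round(two_p, 4))
```

For a negative z, the left-tailed P-value is small and the right-tailed P-value is large. The two-tailed P-value is twice the smaller tail area.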
Def- The test statistic is the statistic that is compared with the parameter in the null hypothesis.
Population Parameter | Test Statistic | Standardized Test Statistic
Decision Rule Based on a P-value:
1. If P ≤ α, then we reject H₀. (The data are statistically significant at level α.)
2. If P > α, then we fail to reject H₀. (When we fail to reject H₀, this does not mean that we think
H₀ is true; it means there is not enough evidence to reject H₀.)
Ex- The U.S. Department of Labor claims that the proportion of hourly workers earning over $10 per
hour is greater than 42%. Interpret a decision if a) you reject H₀ and b) you fail to reject H₀.
Decisions for Hypothesis Tests
Decision           | Claim is H₀                                       | Claim is Hₐ
Reject H₀          | There is enough evidence to reject the claim.     | There is enough evidence to support the claim.
Fail to reject H₀  | There is not enough evidence to reject the claim. | There is not enough evidence to support the claim.
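Note: The decision rule and the wording of the conclusion can be sketched in Python as follows. The function names decide and interpret, the labels "H0" and "Ha", the level α = .05, and the two P-values are all assumptions made for illustration. The example treats the claim as Hₐ, as in the Department of Labor exercise ("greater than 42%").

```python
def decide(p_value, alpha):
    """Apply the P-value decision rule: reject H0 exactly when P <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


def interpret(decision, claim_is):
    """Word the conclusion; claim_is is "H0" or "Ha" (labels assumed here)."""
    if claim_is not in ("H0", "Ha"):
        raise ValueError("claim_is must be 'H0' or 'Ha'")
    amount = "enough" if decision == "reject H0" else "not enough"
    verb = "reject" if claim_is == "H0" else "support"
    return f"There is {amount} evidence to {verb} the claim."


alpha = 0.05                   # assumed level of significance
for p_value in (0.03, 0.20):   # made-up P-values, for illustration only
    decision = decide(p_value, alpha)
    print(p_value, decision, interpret(decision, "Ha"))
```

Note that the decision is always about H₀. Only the wording of the conclusion depends on whether the claim is H₀ or Hₐ.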
                               The Truth
Verdict                      | Innocent (H₀)    | Guilty (Hₐ)
Innocent (fail to reject H₀) | Correct decision | Type II error
Guilty (reject H₀)           | Type I error     | Correct decision
A Type I error is the worst type of error in this case because a person was innocent and they were
shown to be guilty.
A Type II error is slightly less offensive in this case because a guilty person who is shown to be not
guilty at this time always has the chance of being caught again in the future.
P(Type I error) = α. If an innocent person goes to trial 100 independent times, then on average they will be
found guilty about 5 times if α = .05. (In each trial, there is a 95% chance that they will be found innocent.)
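Note: The count of 5 is an average, not a guarantee. A short sketch in Python, assuming the 100 trials are independent so that the number of guilty verdicts follows a binomial distribution. The variable and function names are made up for illustration.

```python
from math import comb

alpha = 0.05      # probability of a Type I error in a single trial
n_trials = 100    # number of independent trials


def prob_exactly(k):
    """Binomial probability of exactly k guilty verdicts in n_trials trials."""
    return comb(n_trials, k) * alpha**k * (1 - alpha) ** (n_trials - k)


expected_guilty = n_trials * alpha   # long-run average number of guilty verdicts
p_exactly_5 = prob_exactly(5)        # chance of exactly 5 guilty verdicts
p_at_least_1 = 1 - prob_exactly(0)   # chance of at least one guilty verdict

print(expected_guilty, round(p_exactly_5, 3), round(p_at_least_1, 3))
```

Under these assumptions the chance of exactly 5 guilty verdicts is well below 1, even though 5 is the expected count.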
7.2- Hypothesis Testing for the Mean (σ known)
There are two methods for doing hypothesis tests:
1. P-values
2. Rejection Regions
Outline of a Hypothesis Test Using P-values
1. State H₀, Hₐ, and α. H₀ and Hₐ must be written in symbols and in words.
2. Determine what standardized test statistic to use for the problem and verify normality. (z, t, χ²)
3. Find the standardized test statistic (z, t, χ²) and the corresponding P-value.
4. Make a decision to reject or fail to reject H₀.
5. Interpret the decision in the context of the original claim. Note: See p. 356 for nice chart or chart
in 7.1 notes.
z-test for a population mean μ
The z-test can be used when σ is known, the sample is random, and the population is normally
distributed. (If n ≥ 30, the sampling distribution of x̄ is approximately normal, so the test can also be used.)
The test statistic is x̄.
The standardized test statistic is
\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]
Ex- A study finds that the average width of 32 samples of horse hair is .057 mm with a population
standard deviation of .018 mm. A manufacturer of wigs using horse hair claims that the mean width
of horse hair is less than .05 mm. Use α = .1 to test the claim about the mean width of horse hair.
Use the P-value method.
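Note: One way to check the arithmetic for the horse hair exercise is the sketch below, written in Python. It assumes H₀: μ ≥ .05 and Hₐ: μ < .05 (the claim), which makes the test left-tailed. The variable names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the horse hair exercise
n = 32          # sample size
x_bar = 0.057   # sample mean width (mm)
sigma = 0.018   # population standard deviation (mm)
mu_0 = 0.05     # value of the mean in H0
alpha = 0.1     # level of significance

# Standardized test statistic: z = (x_bar - mu) / (sigma / sqrt(n))
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Left-tailed test (Ha: mu < .05), so the P-value is the area to the left of z
p_value = NormalDist().cdf(z)

reject_h0 = p_value <= alpha
print(round(z, 2), round(p_value, 4), reject_h0)
```

Because the sample mean is above .05, z is positive and the left-tail area is large, so H₀ is not rejected. In words: there is not enough evidence to support the manufacturer's claim.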
Ex- A tourist agency claims that the average daily cost of meals and lodging for a family of 4 on
vacation is at most $268. In a SRS of 45 families of 4, the average daily cost of meals and lodging
was $277 with a population standard deviation of $40. If α = .1, do you have enough evidence to reject the
agency’s claim? Use the P-value method.
Outline of a Hypothesis Test Using Rejection Regions
1. State H₀, Hₐ, and α. H₀ and Hₐ must be written in symbols and in words.
2. Determine what standardized test statistic to use for the problem and verify normality. (z, t, χ²)
3. Find the standardized test statistic (z, t, χ²) and the critical value(s) of the test (z_c, t_c, χ²_c).
4. Determine if the test statistic lies within the rejection region and state whether you reject or fail to
reject H₀.
5. Interpret the decision in the context of the original claim. Note: See p. 356 for nice chart or chart
in 7.1 notes.
Decision Based on Rejection Regions
Left-tailed:
Right-tailed:
Two-tailed:
Ex- A random sample of 85 8th grade students finds that the average score on an exam is 265 with a
population standard deviation of 55. The school principal claims that the average score on the
exam is more than 260. Is there enough evidence to support the claim at the 5% level of
significance? Use rejection regions.
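Note: The rejection-region method for a z-test can be sketched in Python as follows, using the numbers from the exam score exercise. It assumes H₀: μ ≤ 260 and Hₐ: μ > 260 (the claim), which makes the test right-tailed. The function name z_rejection_test and the tail labels are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_rejection_test(z, alpha, tail):
    """Return (critical values, True if z lies in the rejection region)."""
    std_normal = NormalDist()
    if tail == "left":
        z_c = std_normal.inv_cdf(alpha)           # area alpha lies to the left of z_c
        return (z_c,), z < z_c
    if tail == "right":
        z_c = std_normal.inv_cdf(1 - alpha)       # area alpha lies to the right of z_c
        return (z_c,), z > z_c
    if tail == "two":
        z_c = std_normal.inv_cdf(1 - alpha / 2)   # area alpha/2 lies in each tail
        return (-z_c, z_c), abs(z) > z_c
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Values taken from the exam score exercise
n, x_bar, sigma, mu_0, alpha = 85, 265, 55, 260, 0.05
z = (x_bar - mu_0) / (sigma / sqrt(n))

critical, reject_h0 = z_rejection_test(z, alpha, "right")
print(round(z, 3), [round(c, 3) for c in critical], reject_h0)
```

Here z is smaller than the right-tailed critical value, so it falls outside the rejection region and H₀ is not rejected. In words: there is not enough evidence to support the principal's claim.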
Ex- Use the 100 yard freestyle swimming data to complete the problem. The NCAA claims that the
average time to complete the 100 yard freestyle is 42.7 seconds. Is there enough evidence to reject
the claim at a 10% level of significance? Use rejection regions. Assume that the standard deviation
is .613 seconds.
7.3- Hypothesis Testing for the Mean (σ unknown)
t-test for a population mean μ
The t-test can be used when σ is unknown, the sample is random, and the population is normally
distributed. (If n ≥ 30, the sampling distribution of x̄ is approximately normal, so the test can also be used.)
The test statistic is x̄.
The standardized test statistic is
\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]
where d.f. = n – 1.
Note: The outline for a hypothesis test for a t-test is exactly the same as for a z-test. If the d.f. you need
is not in the table, use the closest d.f. in the table that is smaller than the value you need.
Ex- Century 21 Realty claims that the average price of a house in Alabama is more than $240,000.
A SRS of 22 homes in Alabama shows that the average price of a home is $270,000 with a sample
standard deviation of $24,000. Use α = .05 and P-values to test the claim about the mean
assuming that the data is approximately normally distributed.
Note: For looking up probabilities on the t-chart, the chart always tells you the area to the right of t.
This is the opposite of the z-chart, which tells you the area to the left of z.
Ex- Find the P-value for each of the following.
a) t = 1.383, d.f. = 9, left-tailed
b) t = -1.383, d.f. = 9, right-tailed
c) t= 1.383
d) t= 2.15
left-tailed
d.f.= 9
two-tailed
d.f.=18
Ex- A sample of 49 people in Chicago finds that the average amount of time that a person waits in
traffic is 47 minutes a day with a standard deviation of 20 minutes. The Chicago Transit Authority
claims that people wait an average of at least 40 minutes in traffic. Use α = .1 to test the claim
using rejection regions.
Ex- The following data shows a SRS of paddle times (in minutes) for the members of a kayak team
over a three mile course. The team members paddle with Werner paddles. Conduct a hypothesis
test at a 5% level of significance to determine if paddling with Werner paddles gives an average time
that is faster than 27 minutes. Assume that the data is approximately normal.
Person                      1   2   3   4   5   6   7   8   9  10  11  12  13
Werner paddle time (min)   35  24  33  23  27  29  22  21  25  25  30  32  26
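Note: The sample mean, sample standard deviation, and t statistic for the paddle data can be checked with the Python sketch below. It assumes H₀: μ ≥ 27 and Hₐ: μ < 27 ("faster than 27 minutes"), which makes the test left-tailed. The critical value -1.782 is read from a t-table for α = .05 and d.f. = 12. The variable names are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Paddle times (minutes) for persons 1 through 13, from the table
times = [35, 24, 33, 23, 27, 29, 22, 21, 25, 25, 30, 32, 26]
mu_0 = 27                 # value of the mean in H0

n = len(times)
x_bar = mean(times)       # sample mean
s = stdev(times)          # sample standard deviation (divides by n - 1)
df = n - 1                # degrees of freedom

# Standardized test statistic: t = (x_bar - mu) / (s / sqrt(n))
t = (x_bar - mu_0) / (s / sqrt(n))

t_c = -1.782              # left-tailed critical value from a t-table (alpha = .05, d.f. = 12)
reject_h0 = t < t_c       # rejection region is t < t_c

print(n, round(x_bar, 3), round(s, 3), round(t, 3), reject_h0)
```

The sample mean comes out slightly above 27, so t is slightly positive and cannot fall in a left-tailed rejection region.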
7.4- Hypothesis Testing for Proportions
z-test for a population proportion p
The z-test can be used when a binomial distribution is given such that np ≥ 5 and nq ≥ 5. (Normality)
The test statistic is p̂.
The standardized test statistic is
\[ z = \frac{\hat{p} - p}{\sqrt{pq / n}} \]
Ex- A random sample of 25 families shows that 12 own at least one dog. Furry’s Pet Store claims
that 55% of families own at least one dog. Test the claim at the 5% level using P-values.
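Note: The dog ownership exercise can be checked with the Python sketch below. It assumes H₀: p = .55 (the claim) and Hₐ: p ≠ .55, which makes the test two-tailed. The variable names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the dog ownership exercise
n = 25         # sample size
x = 12         # families in the sample that own at least one dog
p = 0.55       # claimed population proportion (H0)
q = 1 - p
alpha = 0.05   # level of significance

# Normality check: np >= 5 and nq >= 5
normal_ok = n * p >= 5 and n * q >= 5

p_hat = x / n  # sample proportion

# Standardized test statistic: z = (p_hat - p) / sqrt(pq / n)
z = (p_hat - p) / sqrt(p * q / n)

# Two-tailed test: double the tail area beyond |z|
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_h0 = p_value <= alpha
print(normal_ok, round(z, 3), round(p_value, 3), reject_h0)
```

The P-value is far larger than α, so H₀ is not rejected. In words: there is not enough evidence to reject the pet store's claim.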
Ex- A research analyst claims that no more than 10% of the U.S. population owns an iPhone. A
study was done where 250 people were randomly selected from across the U.S. and found that 13%
own an iPhone. Test the claim made by the research analyst using rejection regions and a level of
significance of .01.
7.5- Hypothesis Testing for Variance and Standard Deviation
χ²-test for a population variance σ² or a population standard deviation σ
The χ²-test can be used when the population is normal.
The test statistic is s².
The standardized test statistic is
\[ \chi^2 = \frac{(n - 1)s^2}{\sigma^2} \]
where d.f. = n – 1.
Properties of a χ² Distribution (Chi-Square Distribution)
1. χ² ≥ 0.
2. The Chi-Square distribution is a family of curves each determined by the degrees of freedom.
3. The area under each χ² curve is 1.
4. Chi-Square distributions are positively skewed (skewed right).
Note: The probability table of a χ² distribution always tells you the area to the right of χ².
(Graph is from Wikipedia.)
Ex- Find each critical value for the χ² distribution.
a) n = 15, α = .05, left-tailed test
b) n = 15, α = .05, right-tailed test
c) n = 15, α = .05, two-tailed test
Note: χ² tests can be done using P-values, but this book focuses only on rejection regions.
Ex- A doctor believes that the variance in the number of people per year that die from car accidents
is no more than 500 per year. A random sample of the number of deaths for 16 years has a
variance of 463. Using α = .01, do you have enough evidence to reject the doctor’s claim assuming
that the population is normally distributed?
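Note: The standardized test statistic for the doctor's claim can be checked with the Python sketch below. It assumes H₀: σ² ≤ 500 (the claim) and Hₐ: σ² > 500, which makes the test right-tailed. The critical value 30.578 is read from a χ² table for α = .01 and d.f. = 15. The variable names are made up for illustration.

```python
# Values taken from the car accident exercise
n = 16            # number of years in the sample
s_sq = 463        # sample variance
sigma_sq_0 = 500  # value of the variance in H0
df = n - 1        # degrees of freedom

# Standardized test statistic: chi-square = (n - 1) * s^2 / sigma^2
chi_sq = (n - 1) * s_sq / sigma_sq_0

chi_sq_c = 30.578               # right-tailed critical value from a chi-square table
reject_h0 = chi_sq > chi_sq_c   # rejection region is chi_sq > chi_sq_c

print(df, chi_sq, reject_h0)
```

The test statistic is well below the critical value, so H₀ is not rejected. In words: there is not enough evidence to reject the doctor's claim.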
Ex- The amount of income taxes that people owe each year varies greatly from person to person
because income levels vary greatly. In a SRS of 5000 people, the standard deviation of the yearly income
taxes was about $37,209. The IRS claims that the standard deviation of yearly income taxes is only $25,989.
Test the claim given by the IRS assuming that the population is normally distributed and α = .1.