Ch 6 Hypothesis Testing
Lesson Objective
Be able to use Cumulative Probability tables
Understand the basic principles behind Hypothesis testing
1) X ~ B(18, 0.6). Use tables to find the following
probabilities:
(i) P(X ≤ 6)
(ii) P(X ≥ 8)
(iii) P(3 < X < 10)
2) X ~ B(14, 0.4). Use tables to find the following
probabilities:
(i) P(X > 5)
(ii) P(X < 8)
(iii) P(6 ≤ X ≤ 11)
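Cumulative tables only list P(X ≤ x), so each question first has to be rewritten in that form. As a minimal sketch (Python is an assumption here; the lesson itself uses printed tables), the hypothetical helper `binom_cdf` below stands in for the table and is applied to question 1:

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p): the value a cumulative table lists."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p = 18, 0.6

# (i) P(X <= 6) is read straight from the table.
p_i = binom_cdf(6, n, p)

# (ii) P(X >= 8) = 1 - P(X <= 7)
p_ii = 1 - binom_cdf(7, n, p)

# (iii) P(3 < X < 10) = P(4 <= X <= 9) = P(X <= 9) - P(X <= 3)
p_iii = binom_cdf(9, n, p) - binom_cdf(3, n, p)

print(f"(i) {p_i:.4f}  (ii) {p_ii:.4f}  (iii) {p_iii:.4f}")
```

Question 2 works the same way: P(X > 5) = 1 – P(X ≤ 5), P(X < 8) = P(X ≤ 7) and P(6 ≤ X ≤ 11) = P(X ≤ 11) – P(X ≤ 5).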
It is alleged that I have three fair coins. One is red, one is blue and
the other is green.
I toss the red coin 20 times and get 8 heads. Does this suggest that
the coin is fair?
I toss the blue coin 20 times and get 4 heads. Does this suggest that
the coin is fair?
I toss the green coin 20 times and get 1 head. Does this suggest that
the coin is fair?
A hypothesis test is a way of testing whether a population
parameter (in this case the probability of success in a Binomial, p)
has a particular value.
The test works by looking to see if the experimental results
obtained are likely if the population parameter has the claimed value.
It does this by calculating how extreme the experimental results
are. A result that is in the 5% of most extreme results is generally
considered to be so unlikely that it is more likely to have been
caused by the population parameter being different to that claimed.
We call this percentage (5% here) the significance level of the test.
Eg Consider a coin that is thrown 20 times. If the coin is fair,
then p (the probability of getting a head) would be equal to 0.5.
Let X = Number of Heads
Then X ~ B(20, 0.5) and the probability distribution of X looks
like this:
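The distribution can be tabulated directly from the binomial formula. The sketch below is only an illustration (the text bars are a rough stand-in for a bar chart, and Python is an assumption, not part of the lesson):

```python
from math import comb

n, p = 20, 0.5

# P(X = k) for k = 0..20 when X ~ B(20, 0.5)
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    bar = "#" * round(prob * 200)   # crude text bar: one '#' per 0.5%
    print(f"{k:2d}  {prob:.4f}  {bar}")
```

The bars are tallest around 10 heads and die away quickly towards 0 and 20, which is why results near either end count as extreme.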
The ideal hypothesis test involves:
1) Establishing the Null and Alternative hypotheses (these state the value of p that you want to test).
2) Deciding on the significance level of the test (this is the percentage of extreme cases you are willing to treat as so extreme that they justify rejecting the claimed value of p).
3) Collecting suitable data using a random sampling procedure that ensures the items are independent.
4) Conducting a test by doing the necessary calculations.
5) Interpreting the results in terms of the original claim.
There are lots of situations where we cannot carry out a test
as rigorously as this.
In the exam, for instance, the data will already have been
collected for us!
Eg. It is alleged that I have a fair coin. I toss the coin 20 times and
get 4 heads. Test at the 5% level to see if there is evidence
that the coin is biased against heads.
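A sketch of the calculation for this test, with Ho: p = 0.5 and H1: p < 0.5. The helper `binom_cdf` is a hypothetical stand-in for the cumulative table, and applying the same one-tailed test to the red and green coins from earlier is an assumption made only for comparison:

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p0, level = 20, 0.5, 0.05   # H0: p = 0.5, H1: p < 0.5, 5% level

for heads in (8, 4, 1):        # red, blue and green coins
    tail = binom_cdf(heads, n, p0)     # P(X <= heads) if H0 is true
    verdict = "reject H0" if tail < level else "accept H0"
    print(f"{heads} heads: P(X <= {heads}) = {tail:.4f} -> {verdict}")
```

For 4 heads the tail probability is about 0.006, well below 5%, so there is evidence that the coin is biased against heads.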
Eg. A student takes a multiple-choice test. There are 10 questions with 4
possible answers. Unfortunately the student has attended very few lessons
so has to guess. The student gets 5 questions right.
Is the student’s method of missing lessons and guessing a good strategy?
The student claims to be an inspired guesser. Carry out a test to determine if
this is true.
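Here the extreme results are in the upper tail, so the table value has to be subtracted from 1. A minimal sketch, assuming a 5% significance level (the question does not state one) and using the hypothetical helper `binom_cdf` in place of the table:

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p0, level = 10, 0.25, 0.05   # H0: p = 0.25 (pure guessing), H1: p > 0.25
correct = 5

# Upper tail: P(X >= 5) = 1 - P(X <= 4)
tail = 1 - binom_cdf(correct - 1, n, p0)
print(f"P(X >= {correct}) = {tail:.4f}")
print("reject H0" if tail < level else "accept H0")
```

The tail probability is about 0.078, which is more than 5%, so at this level 5 correct answers is not enough evidence of inspired guessing.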
Lesson Objective
Be able to carry out a Hypothesis test using a Binomial distribution
Begin to understand the impact that the level of significance has on
the test and begin to recognise the difference between one-tailed and
two-tailed tests.
I want to test at the 5% level to see if a 4 sided die labelled
1,2,3,4 is ‘biased’.
When I rolled the die 30 times I obtained 12 number ones.
Is the die biased?
How many ‘ones’ would I have needed to throw to reject the null
hypothesis at the 5% level?
For a two-tailed test at the 5% level, each tail is 2.5%.
Upper tail: need P(X ≥ a) < 0.025, i.e. 1 – P(X ≤ a – 1) < 0.025
Lower tail: need P(X ≤ b) < 0.025
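The search for these values can be sketched as follows for the 4 sided die, where X ~ B(30, 0.25) if Ho is true. Here a is taken to be the smallest count in the upper tail and b the largest count in the lower tail with P(X ≤ b) < 0.025; `binom_cdf` is a hypothetical helper standing in for the cumulative table:

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p0, half = 30, 0.25, 0.025   # two-tailed 5% test: 2.5% in each tail

# Upper tail: smallest a with P(X >= a) = 1 - P(X <= a - 1) < 0.025
a = next(x for x in range(n + 1) if 1 - binom_cdf(x - 1, n, p0) < half)

# Lower tail: largest b with P(X <= b) < 0.025
b = max(x for x in range(n + 1) if binom_cdf(x, n, p0) < half)

print(f"Reject H0 if X <= {b} or X >= {a}")
print(f"P(X >= 12) = {1 - binom_cdf(11, n, p0):.4f}")
```

With 12 ones the upper tail probability is a little over 5%, so it is not below 2.5% and the null hypothesis is not rejected.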
1) I want to test at the 5% level to see if a 6 sided die labelled
1, 2, 3, 4, 5, 6 is ‘biased’.
When I rolled the die 30 times I obtained 2 sixes.
Is the die biased?
2) In the past year 50% of patients undergoing a particular treatment reported ‘side
effects’. A Doctor claims that a new drug will be more effective at treating
patients but he is unsure how the percentage of patients reporting ‘side effects’
will change. He tests 30 patients with the new treatment and 23 patients claim to
have side effects.
Is this evidence at the 5% level to suggest that the drug has altered the percentage
of patients experiencing ‘side effects’?
3) It is claimed that the proportion of people in a town who
speak a second language is 1/5 . A test is to be conducted to see
if this proportion has changed. 30 people were asked if they
spoke a second language and 13 said that they did. Test at the
5% level to see if the proportion of people speaking a second
language has changed.
4) Repeat these tests at the 1% level of significance. Do any results change?
Solutions
Each test is two-tailed, so at the 5% level the tail probability is compared with 2.5%. In each case X ~ B(30, p).
1) Ho: p = 1/6   H1: p ≠ 1/6   P(X ≤ 2) = 0.1028 > 0.025, so accept Ho
2) Ho: p = 1/2   H1: p ≠ 1/2   P(X ≥ 23) = 1 – P(X ≤ 22) = 0.0026 < 0.025, so reject Ho
3) Ho: p = 1/5   H1: p ≠ 1/5   P(X ≥ 13) = 1 – P(X ≤ 12) = 0.0031 < 0.025, so reject Ho
4) At the 1% level each tail is 0.5%. 0.1028 > 0.005, 0.0026 < 0.005 and 0.0031 < 0.005, so none of the results change.
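The three tail probabilities can be checked without tables. This is a sketch only, with `binom_cdf` as a hypothetical stand-in for the cumulative table. Note that P(X ≥ 23) has to be found as 1 – P(X ≤ 22) and P(X ≥ 13) as 1 – P(X ≤ 12); reading off P(X ≤ 23) or P(X ≤ 13), or stepping down one too many, is an easy slip.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n = 30
# (label, tail probability of the observed result if H0 is true)
tests = [
    ("1) 2 sixes",         binom_cdf(2, n, 1/6)),       # P(X <= 2)
    ("2) 23 side effects", 1 - binom_cdf(22, n, 1/2)),  # P(X >= 23)
    ("3) 13 speakers",     1 - binom_cdf(12, n, 1/5)),  # P(X >= 13)
]

for level in (0.05, 0.01):
    print(f"{level:.0%} level, two-tailed: compare with {level / 2}")
    for label, tail in tests:
        verdict = "reject H0" if tail < level / 2 else "accept H0"
        print(f"  {label}: tail = {tail:.4f} -> {verdict}")
```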
What is the difference between the questions on the left and those
on the right?
Left 1) I want to test at the 5% level to see if a 6 sided die labelled 1, 2, 3, 4, 5, 6 is ‘biased’. When I rolled the die 30 times I obtained 2 sixes. Is the die biased?
Right 1) I want to test at the 5% level to see if a 6 sided die labelled 1, 2, 3, 4, 5, 6 is ‘biased’ against sixes. When I rolled the die 30 times I obtained 2 sixes. Is the die biased?
Left 2) A Doctor claims that a new drug will alter the amount of time that people stay asleep. 30 patients are given the pill and 23 sleep for less time. Is this evidence at the 5% level to support the Doctor’s claim?
Right 2) A Doctor claims that a new drug will make people sleep for less time. 30 patients are given the pill and 23 sleep for less time. Is this evidence at the 5% level to support the Doctor’s claim?
Left 3) It is claimed that the proportion of people in a town who speak a second language is 1/5. A test is to be conducted to see if this proportion has changed. 30 people were asked if they spoke a second language and 13 said that they did. Test at the 5% level to see if the proportion of people speaking a second language has changed.
Right 3) It is claimed that the proportion of people in a town who speak a second language is 1/5. A test is to be conducted to see if this proportion has increased. 30 people were asked if they spoke a second language and 13 said that they did. Test at the 5% level.
If when you conduct a test:
H1 is p ≠ something ---- Two tailed test (Sig % split between two ends)
H1 is p < something ---- One tailed test (Sig % all at one end!)
H1 is p > something ---- One tailed test (Sig % all at one end!)
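The rule above can be sketched as a small function. The name `binomial_test` and the labels "<", ">" and "!=" for the three forms of H1 are hypothetical, chosen only for this illustration; for a two-tailed test the function uses whichever tail the result lies in and compares it with half the significance level.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def binomial_test(x, n, p0, h1, level=0.05):
    """Return (tail probability, threshold, reject?) for H0: p = p0.

    h1 is "<", ">" or "!=" (hypothetical labels for the three forms of H1).
    """
    lower = binom_cdf(x, n, p0)            # P(X <= x)
    upper = 1 - binom_cdf(x - 1, n, p0)    # P(X >= x)
    if h1 == "<":
        tail, threshold = lower, level
    elif h1 == ">":
        tail, threshold = upper, level
    elif h1 == "!=":
        # Two tails: use the tail the result lies in, with half the level.
        tail, threshold = min(lower, upper), level / 2
    else:
        raise ValueError("h1 must be '<', '>' or '!='")
    return tail, threshold, tail < threshold

# 2 sixes in 30 rolls: 'biased' (two-tailed) vs 'biased against sixes'
print(binomial_test(2, 30, 1/6, "!="))
print(binomial_test(2, 30, 1/6, "<"))
```

The tail probability is the same in both calls; only the threshold it is compared with changes, from 2.5% to 5%.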
1 or 2 Tailed test?
A company claims that its buses arrive early or on time at least
80% of the time. Rachel suspects that the buses are late more
frequently and decides to conduct a hypothesis test to test
this claim. Over a period of 25 days the buses arrive late 7 times.
1 or 2 Tailed test?
A hospital believes that a new drug can alter the chances of a
woman conceiving a male baby. A test is to be carried out to
see if this is true. In a test of 30 new mothers taking the drug 20
had male babies.
1 or 2 Tailed test?
A psychic claims to be able to read people’s minds. He asks 40
people to pick a random number between 1 and 10, he correctly
guesses the number chosen by 8 people.
1 or 2 Tailed test?
A die is rolled 40 times and 3 sixes were obtained. Is this
evidence to suggest that the probability of getting a six is less
than it should be ?
1 or 2 Tailed test?
The light bulb division of the consumer standards agency tests
light bulbs to see if they last the minimum time suggested by the
manufacturer. The company claims that 90% of light bulbs last
longer than 1000 hours. The LBDCSA check a sample of 40
light bulbs and find that 34 last longer than 1000 hours.
1 or 2 Tailed test?
In a test 60% of pupils failed to answer a question on
Histograms. A teacher reckons that a particular revision website
can improve the chances of success in answering the question.
After using the website 25 out of 30 students get a question on
Histograms correct.
Virgin rail claim that their trains on a particular
line are late only ¼ of the time.
A commuter tests this claim at the 5% level.
He samples 50 journeys and the trains were late
5 times. Does this suggest the proportion of late
trains is ¼ ?
Virgin rail claim that their trains on a particular
line are late less often than they used to be,
which was ¼ of the time.
A commuter tests this claim at the 5% level by
sampling 50 journeys, in which he found the
trains were late 5 times. Is Virgin correct?
1) A company claims that its buses arrive early or on
time at least 80% of the time. Rachel suspects that the
buses are late more frequently and decides to conduct
a hypothesis test to test this claim. Over a period
of 25 days the buses arrive late 7 times.
Test at 5% level
2) A hospital believes that a new drug can alter the
chances of a woman conceiving a male baby. A
test is to be carried out to see if this is true. In
a test of 30 new mothers taking the drug 20 had
male babies.
Test at 10% level
3) The light bulb division of the consumer standards
agency tests light bulbs to see if they last the minimum
time suggested by the manufacturer. The company claims
that 90% of light bulbs last longer than 1000 hours. The
LBDCSA check a sample of 40 light bulbs and find that 34
last longer than 1000 hours.
Test at 5% level.
Lesson Objective
Hypothesis testing: Critical Values and Critical Regions
A scratch card manufacturer claims that the probability that a
contestant wins a prize with their scratch card is 1/5.
From experience I believe the actual proportion of winning scratch
cards to be lower, so I test 30 scratch cards and win only twice.
a) Is this evidence at the 5% level that the manufacturer is lying?
b) What is the largest number of wins I could obtain from 30
scratch cards and still reject the manufacturer’s claim?
When conducting a hypothesis test in real life it is usual to decide on
the outcomes which would be rejected before gathering data to test
the hypothesis.
The set of outcomes that would cause you to reject Ho is called the
critical region OR rejection region.
For a test where H1 is “p < …”, the critical value is the largest outcome
in the critical region.
For a test where H1 is “p > …”, the critical value is the smallest
outcome in the critical region.
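For the scratch card example, H1 is p < 1/5, so the critical value is the largest number of wins whose lower tail probability is still within 5%. A minimal sketch, using the hypothetical helper `binom_cdf` in place of the cumulative table:

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p0, level = 30, 0.2, 0.05   # H0: p = 1/5, H1: p < 1/5

# Critical value: the largest c with P(X <= c) <= 5%
critical_value = max(x for x in range(n + 1) if binom_cdf(x, n, p0) <= level)

print(f"P(X <= 2) = {binom_cdf(2, n, p0):.4f}")
print(f"P(X <= 3) = {binom_cdf(3, n, p0):.4f}")
print(f"Critical region: X <= {critical_value}")
```

Two wins falls inside this critical region, so the result is evidence at the 5% level against the manufacturer's claim.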
Using recent data provided by the low-cost airline ‘Loseitallair’, it
is estimated that the probability that a passenger loses his suitcase
on a flight is 0.1. From newspaper reports I think the figure is
higher. On 20 different occasions I take a flight with ‘Loseitallair’.
My luggage does not arrive on 7 occasions.
a) Use a hypothesis test, with a significance level of 5%, to check if
this data confirms my suspicions.
b) Decide the critical value for the test and state the rejection region.
c) What significance level would the test need to be conducted at to ensure
that my result was the critical value for the test?
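The three parts can be sketched as follows, with Ho: p = 0.1, H1: p > 0.1 and X ~ B(20, 0.1). The helpers `binom_cdf` and `upper_tail` are hypothetical names standing in for table look-ups:

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def upper_tail(x, n, p):
    """P(X >= x) = 1 - P(X <= x - 1)."""
    return 1 - binom_cdf(x - 1, n, p)

n, p0, lost = 20, 0.1, 7   # H0: p = 0.1, H1: p > 0.1

# a) tail probability of the observed result
print(f"P(X >= 7) = {upper_tail(lost, n, p0):.4f}")

# b) critical value at 5%: smallest a with P(X >= a) <= 0.05
a = next(x for x in range(n + 1) if upper_tail(x, n, p0) <= 0.05)
print(f"Critical value {a}, rejection region X >= {a}")

# c) 7 is the critical value when P(X >= 7) <= level < P(X >= 6)
print(f"{upper_tail(7, n, p0):.4f} <= level < {upper_tail(6, n, p0):.4f}")
```

For part c) any significance level in the printed range works; a 1% level is one convenient choice.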
Using recent data provided by the low-cost airline Brianair, the probability of a flight
arriving on time is estimated to be 0.8.
After some rescheduling, Brianair state that their performance has improved. In a
recent survey 19 out of 20 flights arrived on time.
Construct a critical region, using a significance level of 5%. Is Brianair’s conclusion
correct?
Let p be the probability a flight is on time.
H0: p = 0.8 (probability a flight is on time is 0.8)
H1: p > 0.8 (probability a flight is on time has increased)
Significance level set as 5%
Let X be the number of times a flight is on time.
If H0 is true then we are using p = 0.8 and X ~ B(20, 0.8).
Find the lowest value of a where P(X ≥ a) ≤ 0.05
1 – P(X ≤ a – 1) ≤ 0.05
P(X ≤ a – 1) ≥ 0.95
P(X ≤ 18) = 0.9308
P(X ≤ 19) = 0.9885
The lowest possible value of a – 1 is 19
So the lowest possible value of a = 20
So the critical value is X = 20
The critical region is X ≥ 20, which with n = 20 is just X = 20
The observed value, 19, is not in the critical region, so H0 is not rejected: there is not enough evidence at the 5% level that Brianair’s performance has improved.
A student takes a multiple-choice test. There are 20 questions with 4 possible
answers. Unfortunately the student has attended very few lessons so has to
guess. The student gets 10 questions correct. Is the pupil better at guessing than
the average person?
What is the critical region and the critical value for testing this claim at the 5%
level.
What about at the 1% level?
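Solution
Let p be the probability that the student guesses a question correctly.
H0: p = 0.25   H1: p > 0.25
Let X be the number of correct guesses.
If H0 is true then we are using p = 0.25 and X ~ B(20, 0.25)
Find the critical region
Find the lowest value of a where P(X ≥ a) ≤ 0.05
1 – P(X ≤ a – 1) ≤ 0.05
P(X ≤ a – 1) ≥ 0.95
P(X ≤ 7) = 0.8982
P(X ≤ 8) = 0.9591
The lowest possible value of a – 1 is 8
So the lowest possible value of a = 9
So the critical value is X = 9
The critical region is X ≥ 9
At 1% level
P(X ≤ a – 1) ≥ 0.99
P(X ≤ 9) = 0.9861
P(X ≤ 10) = 0.9961
The lowest possible value of a – 1 is 10, so a = 11
So X = 11 is the critical value and the critical region is X ≥ 11
The student’s score of 10 is in the 5% critical region but not in the 1% critical region.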
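The table search used in these worked solutions can be sketched as a loop. The function name `upper_critical_value` is hypothetical. At the 1% level the condition is P(X ≤ a – 1) ≥ 0.99; since P(X ≤ 9) = 0.9861 falls short of that, a – 1 has to be 10.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def upper_critical_value(n, p0, level):
    """Smallest a with P(X >= a) <= level, or None if no outcome qualifies."""
    for a in range(n + 1):
        if 1 - binom_cdf(a - 1, n, p0) <= level:
            return a
    return None

for level in (0.05, 0.01):
    a = upper_critical_value(20, 0.25, level)
    print(f"Guessing test, {level:.0%} level: critical region X >= {a}")

print("Brianair, 5% level: critical region X >=",
      upper_critical_value(20, 0.8, 0.05))
```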
Lesson Objective
Understand when to use two tailed tests, and how to do a two tailed
test.
Sometimes we might want to test to see if the population parameter p is simply
different to what is being claimed.
Eg. I buy a new die from my local dice shop. I want to test to see if the die is
biased by checking the number of times I get a 6 when I roll it.
Then to set up my hypothesis test, I define p to be the probability of getting a
six when I roll a die.
Ho: p = 1/6
H1: p ≠ 1/6
If I now carry out the hypothesis test with a 5% level of significance, I need to
share the 5% of extreme cases for which I am going to reject Ho
between the two tails of the distribution. This effectively means I will reject Ho
if the outcome that I get is in the top 2.5% or in the bottom 2.5% of
results.
A test like this is called two-tailed. Whenever we test for p ≠ something we do a two-tailed test.
Page 173 - book