Welcome to
Week 08
College Statistics
Now, for something even more
profound…
TRUTH
Truth
Question on a true/false test:
1) US presidents are named
“Barack”
T ___
F ___
Truth
Question on a true/false test:
1) US presidents are named
“George”
T ___
F ___
Truth
Question on a true/false test:
1) US presidents are male
T ___
F ___
Truth
Question on a true/false test:
1) US presidents are at least
35 years old
T ___
F ___
Truth
So, it is easier to be “false”
than to be “true”
To be “true” a statement must
be true in all cases
If not, it is “false”
The Bad News…
We live in a world where the
“truth” is not always known
Guilt
In a court of law, you never
REALLY know if someone is
guilty or not
Guilt
Even with an eyewitness, the
witness could be:
- Mistaken
- Lying
Guilt
There is always a level of
UNCERTAINTY
Guilt
Two standards of proof of guilt
in US courts of law:
1) Criminal cases – “beyond a
reasonable doubt”
2) Civil cases – “a
preponderance of the
evidence”
Guilt
In probability terms, legal
authorities estimate:
“beyond a reasonable doubt” is
98-99% likelihood of guilt
based on the evidence
(Thanks to Ronald B. Standler)
Guilt
In probability terms, legal
authorities estimate:
“a preponderance
of the evidence” is
just a hair over
50% likelihood of
guilt
(Thanks to Ronald B. Standler)
Guilt
There is always the possibility
of being WRONG
Guilt
In the US, a suspect is
considered “innocent until
proven guilty in a court of law”
Guilt
BUT… if a suspect is not found
guilty in court, are they called:
- innocent ?
- not guilty ?
Guilt
The suspect is called “not
guilty” because the defense
hasn’t proved their innocence…
it is just that the prosecution
was unable to prove their guilt!
Questions?
Hypothesis Tests
In science, an “educated guess”
is called a:
hypothesis
Hypothesis Tests
In science, using experimental
evidence to see if it supports a
hypothesis is called a:
hypothesis test
Hypothesis Tests
Vikings in Newfoundland?
Hypothesis Tests
Our hypothesis was that we
wouldn’t find anything…
Hypothesis Tests
We rejected that hypothesis!
http://www.cbc.ca/news/canada/newfoundland-labrador/vikings-newfoundland-1.3515747
Hypothesis Tests
In practice, we often do
hypothesis tests “undercover”
as
CONFIDENCE INTERVALS
Hypothesis Tests
Suppose we had a 95% confidence
interval:
5 ≤ µ ≤ 10
Suppose our hypothesis was that
µ = 7
Is 7 a likely value for µ given
our confidence interval?
Hypothesis Tests
Because µ = 7 is in our
confidence interval:
5 ≤ µ ≤ 10
It is a possible value given our
data…
but so are µ = 6, µ = 8, µ = 9.3,
µ = 5.1, µ = 6.79431…
Hypothesis Tests
What if the hypothesized value
for µ was 11?
5 ≤ µ ≤ 10
We are 95% confident that µ
cannot be 11 given our evidence
Hypothesis Tests
We reject the hypothesis that
µ = 11 if 5 ≤ µ ≤ 10 with
95% confidence
We will be wrong to do this 5%
of the time (100% - 95%)
The proportion of the time we
are willing to be wrong is called
our “α-level”
Hypothesis Tests
The confidence interval can be
used to test hypothesized
values of µ using the mean,
standard deviation and sample
size of our sample data
Hypothesis Tests
Whether we can reject a
hypothesis or not depends on
how variable our data are!
[Figure panels: "Not too different…" and "Very different!"]
Hypothesis Tests
(see why variability is
important?)
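To see why variability matters, here is a minimal Python sketch. All of the numbers (a sample mean of 9, n = 25, a hypothesized µ of 11, and the two standard deviations) are made up for illustration, and the critical value of 2 is only the rough two-sided 95% value.

```python
import math

def interval_for_mu(xbar, s, n, crit=2.0):
    """Approximate 95% confidence interval: xbar +/- crit * s / sqrt(n)."""
    se = s / math.sqrt(n)              # standard error of the mean
    return xbar - crit * se, xbar + crit * se

mu0 = 11                               # hypothesized value (illustrative)
for s in (2, 10):                      # low vs. high variability (illustrative)
    lower, upper = interval_for_mu(xbar=9, s=s, n=25)
    decision = "cannot reject" if lower <= mu0 <= upper else "reject"
    print(f"s = {s}: {lower:.1f} to {upper:.1f} -> {decision} mu = {mu0}")
```

With the same sample mean and sample size, the less variable data give a narrow interval that excludes 11, while the more variable data give a wide interval that still contains it.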
Questions?
Hypothesis Tests
Rejecting a hypothesis is a
strong statement
We have evidence to show
µ ≠ 11
Hypothesis Tests
If the value is included in the
confidence interval, you cannot
make a strong statement
We haven’t proved µ = 7
(because it could be a wide
range of numbers within the
interval)
Hypothesis Tests
So, we merely “fail to reject”
the hypothesis
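The interval-based decision rule from the last few slides can be sketched in a few lines of Python. The function name is made up for this example; the interval 5 to 10 and the hypothesized values 7 and 11 come from the slides.

```python
def decide_from_interval(lower, upper, mu0):
    """Reject the hypothesized value mu0 only if it falls outside the interval."""
    if lower <= mu0 <= upper:
        return "fail to reject"
    return "reject"

print(decide_from_interval(5, 10, 7))    # fail to reject
print(decide_from_interval(5, 10, 11))   # reject
```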
Hypothesis Tests
Our exercise on human
temperature last week was a
test of the hypothesis that
normal human temperature is
98.6°
Hypothesis Tests
98.6°
Hypothesis Tests
PROJECT QUESTION
You have a hypothesis that
normal human body temperature
is 98.6°
You have experimentally found
that measured using an IR
thermometer, the inside mouth
temperature is between 89.9°
and 93.6° with 95% confidence
Hypothesis Tests
PROJECT QUESTION
89.9° < temp < 93.6°
What do you decide about your
hypothesis that human body
temperature is 98.6°?
Hypothesis Tests
PROJECT QUESTION
89.9° < temp < 93.6°
Reject your hypothesis that
human body temperature is
98.6°
What is the probability that
you are wrong to reject this
hypothesis?
5%
Questions?
Hypothesis Tests
You need to answer an "Is
there a difference" question
Is there any difference
between these two populations?
Does some new process improve
results?
Hypothesis Tests
There is a TRUE (population)
answer to your question
Hypothesis Tests
You will NEVER find the true
answer to most questions
because of variability:
in your measurements
in the data itself
in the measuring tool
in the samples you get from
your population
Hypothesis Tests
Are the statistics demons mad
at you today?
Hypothesis Tests
Reality of life: things aren't clear,
certain and constant
They are fuzzy, uncertain and
variable
Hypothesis Tests
This is the basis of statistics:
getting a measurement of the
fuzziness - "variability"
Hypothesis Tests
A hypothesis is a statement
about the properties of the
population
Hypothesis Tests
It may be obtained from theory,
hearsay, historical studies, etc.
Hypothesis Tests
A null hypothesis states
"there is no difference between
populations"
or
"a process has no effect"
Hypothesis Tests
It is symbolized: H0
Hypothesis Tests
Because it is easier to prove
something false than to prove it
true…
H0 is the hypothesis we want to
reject
Hypothesis Tests
We want to show the
populations are different or the
process has an effect - called
the alternative hypothesis or Ha
Hypothesis Tests
Usually we set Ha before H0,
since it is the one we are
interested in
Hypothesis Tests
Null hypotheses about
population means are typically
like:
μ = some value
Hypothesis Tests
Alternative hypotheses about
means can be:
μ ≠ some value
(called a two-tailed test)
μ < some value
μ > some value
(called one-tailed tests)
Hypothesis Tests
A two-tailed test will reject H0
either if the experimental
values we get are too high or
too low
Hypothesis Tests
α is split between the upper
and lower tails
Hypothesis Tests
A one-tailed test will reject H0
only on the side we think is
likely to
be true
Hypothesis Tests
You will be able to reject H0
more often for a one-tailed
test – if you pick the right tail!
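A short Python sketch shows why a one-tailed test rejects more easily: its critical value is smaller. It assumes the sample mean is approximately normally distributed (a large-sample assumption not stated on the slide) and uses α = 0.05.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()                          # mean 0, standard deviation 1

two_tailed = std_normal.inv_cdf(1 - alpha / 2)     # alpha split between both tails
one_tailed = std_normal.inv_cdf(1 - alpha)         # all of alpha in a single tail

print(round(two_tailed, 2))   # 1.96
print(round(one_tailed, 2))   # 1.64
```

The sample result has to be about 1.96 standard errors from the hypothesized value to reject in a two-tailed test, but only about 1.64 standard errors away (in the chosen direction) in a one-tailed test.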
Hypothesis Tests
PROJECT QUESTION
Your owner's manual says you
should be getting 30 mpg
highway
After owning the car for six
months, you are only getting 27
mpg highway
Hypothesis Tests
PROJECT QUESTION
Is that different enough to
reject the company's claim?
What is your α-level?
5% or 0.05
What is H0?
μ = 30 mpg
What is Ha?
μ < 30 mpg
Hypothesis Tests
PROJECT QUESTION
We could also write it as:
H0: μ ≥ 30 mpg
Ha: μ < 30 mpg
Hypothesis Tests
PROJECT QUESTION
Is this a one-tailed or a two-tailed test?
one-tailed
Is it right-tailed or left-tailed?
Hypothesis Tests
PROJECT QUESTION
Is it right-tailed or left-tailed?
left-tailed
Questions?
Hypothesis Tests
The experiment is designed to
gather valid information to test
the likelihood of that null
hypothesis being true
Hypothesis Tests
So, since we want to show the
null hypothesis is NOT true, we
want to show that
getting the results we got (if
the null hypothesis IS true)
is very unlikely
Hypothesis Tests
If we get those “unlikely” data
Hypothesis Tests
Then we reject the null
hypothesis and have
statistically proved our
alternative hypothesis and
Hypothesis Tests
CELEBRATE!
Hypothesis Tests
But any experiment runs the
risk of weird results
The objective of hypothesis
testing is to estimate the
likelihood of weird results
Hypothesis Tests
One type of error consists of
rejecting a true null hypothesis
We call this a “Type 1 error”
Hypothesis Tests
If this happens, people will
accuse us of rigging our data to
prove Ha
So, we want
this to happen
very rarely
Hypothesis Tests
The probability of a Type 1
error is called an α-level
Hypothesis Tests
Typically we use α = 0.05 (5%)
or 0.01 (1%)
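One way to see what α means is to simulate many experiments in which H0 really is true and count how often we reject it anyway. This is only a sketch: the values μ = 30 and se = 4, the number of trials, and the random seed are arbitrary choices for illustration, and the sample mean is assumed to be normally distributed.

```python
import random
from statistics import NormalDist

random.seed(8)                        # arbitrary seed so the run is repeatable
alpha = 0.05
mu0, se = 30.0, 4.0                   # illustrative values; H0 is TRUE here
crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value

trials = 20_000
rejections = 0
for _ in range(trials):
    xbar = random.gauss(mu0, se)      # one simulated sample mean under H0
    if abs(xbar - mu0) > crit * se:   # landed in either tail: a Type 1 error
        rejections += 1

rate = rejections / trials
print(f"Type 1 error rate: {rate:.3f}")
```

The printed rate should land close to 0.05, because the α-level is exactly the share of true null hypotheses we agree to reject by mistake.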
Hypothesis Tests
It is crucial to set your α-level
before you do the experiment
or gather any data
Otherwise people will
accuse you of setting
the level to ensure
rejecting H0
Hypothesis Tests
You can make the opposite
mistake:
fail to reject H0 when it is
false
Called a Type 2 error
The probability of this kind of
error is denoted by β (beta)
Hypothesis Tests
We HATE Type 2 errors
because they mean we FAILED
to prove what we wanted to
prove!
(Remember,
we want
to reject H0)
Hypothesis Tests
Usually β is computed after the
experiment (not determined in
advance by the experimenter)
Hypothesis Tests
Generally, the larger α value
that you permit, the smaller β
value you will end up with
Conversely, if you demand a
smaller α, you will usually get a
larger β
Hypothesis Tests
Other factors affecting β:
- sample size
- the size of the difference
(it’s harder to detect a
difference if it’s really,
really tiny)
Hypothesis Tests
Likelihood of making the right
decision and rejecting the
(false) null hypothesis is:
1 - β
called the “power of the test”
Hypothesis Tests
For a given α value, we would
like the test to be as
"powerful" as possible, giving us
the best chance of rejecting a
false null hypothesis
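The trade-off between α and β, and the power 1 - β, can be computed directly for a simple case. The sketch below assumes a left-tailed test with a normally distributed sample mean; the hypothesized mean of 30, the "real" mean of 27 and the standard error of 4 are illustrative assumptions, and the function name is made up.

```python
from statistics import NormalDist

def power_left_tailed(mu0, true_mu, se, alpha):
    """Chance of rejecting H0: mu >= mu0 when the real mean is true_mu."""
    z_crit = NormalDist().inv_cdf(1 - alpha)     # one-tailed critical value
    cutoff = mu0 - z_crit * se                   # reject when sample mean < cutoff
    return NormalDist(true_mu, se).cdf(cutoff)   # P(sample mean < cutoff)

for alpha in (0.10, 0.05, 0.01):
    power = power_left_tailed(mu0=30, true_mu=27, se=4, alpha=alpha)
    beta = 1 - power                             # probability of a Type 2 error
    print(f"alpha = {alpha:.2f}: beta = {beta:.3f}, power = {power:.3f}")
```

As α shrinks, the printed β grows and the power falls, which is the trade-off described above.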
Hypothesis Tests
PROJECT QUESTION
Which is more powerful, a one-tailed or a two-tailed test?
one-tailed (if you guess the
right side)
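This answer can be checked numerically. The sketch below compares the power of a left-tailed, a two-tailed and a right-tailed test when the real mean is below the hypothesized one. The numbers (hypothesized mean 30, real mean 27, se 4) are illustrative assumptions, and a normally distributed sample mean is assumed.

```python
from statistics import NormalDist

std = NormalDist()

def power(mu0, true_mu, se, alpha, tail):
    """Probability of rejecting H0: mu = mu0 when the real mean is true_mu."""
    sampling = NormalDist(true_mu, se)            # distribution of the sample mean
    if tail == "left":
        return sampling.cdf(mu0 - std.inv_cdf(1 - alpha) * se)
    if tail == "right":
        return 1 - sampling.cdf(mu0 + std.inv_cdf(1 - alpha) * se)
    z = std.inv_cdf(1 - alpha / 2)                # two-tailed: alpha split in half
    return sampling.cdf(mu0 - z * se) + 1 - sampling.cdf(mu0 + z * se)

for tail in ("left", "two", "right"):
    print(tail, round(power(30, 27, 4, 0.05, tail), 3))
```

The left-tailed test (the correct guess here) has the most power, the two-tailed test has less, and the right-tailed test (the wrong guess) has almost none.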
Hypothesis Tests
This setup allows us only to
disprove a null hypothesis,
never prove it
Hypothesis Tests
We either disprove it, or we
fail to disprove it
Hypothesis Tests
We NEVER accept the null
hypothesis
Hypothesis Tests
"Fail to reject" the null
hypothesis is the default decision
Hypothesis Testing
This results not from evidence
in favor of the null hypothesis
but from the absence of
evidence against it
Hypothesis Tests
Rejecting the null hypothesis is
a strong conclusion, stating
that (with no more than an α
chance of error) the null
hypothesis is wrong
Hypothesis Tests
The confidence interval for the
hypothesis test will be kinda the
opposite of what we did before
Now we will create a confidence
interval for 𝒙 based on our
hypothesized value for μ and see
if our 𝒙 falls in it
Hypothesis Tests
How to do it!
1) Set your α-level
(how often you are
willing to be wrong)
2) Define your Ha and H0
3) Get your data (for a
confidence interval, you
need the hypothesized
μ, s and n (or se))
4) Find your critical value (for
two-sided α=5%
it is ≈2)
5) Calculate the confidence
interval for 𝒙, centered on
the hypothesized μ rather
than on 𝒙
6) The test will be: Is 𝒙 in it?
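The steps above can be sketched as one small Python function. This is an illustration rather than a standard routine: the function name, its parameters and the example numbers (hypothesized mean 100, sample mean 111, se 5) are all made up, and the critical values come from the normal distribution.

```python
from statistics import NormalDist

def mean_test(mu0, xbar, se, alpha=0.05, tail="two"):
    """Build the interval around the hypothesized mean mu0 and check xbar.

    Returns (lower, upper, reject). H0 is rejected when xbar is outside.
    """
    std = NormalDist()
    lower, upper = float("-inf"), float("inf")
    if tail == "two":
        crit = std.inv_cdf(1 - alpha / 2)        # alpha split across both tails
        lower, upper = mu0 - crit * se, mu0 + crit * se
    elif tail == "left":
        lower = mu0 - std.inv_cdf(1 - alpha) * se
    elif tail == "right":
        upper = mu0 + std.inv_cdf(1 - alpha) * se
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")
    reject = not (lower <= xbar <= upper)
    return lower, upper, reject

lower, upper, reject = mean_test(mu0=100, xbar=111, se=5)
print(f"interval {lower:.1f} to {upper:.1f}; reject H0: {reject}")
```

The α-level and the tail are arguments you choose before calling the function, matching the order of the steps: decide how to test first, then look at the data.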
Hypothesis Tests
PROJECT QUESTION
Back to our mpg!
H0: μ ≥ 30 mpg
Ha: μ < 30 mpg
x = 27
And suppose we know that:
se = 4 mpg
Hypothesis Tests
PROJECT QUESTION
H0: μ ≥ 30 mpg
Ha: μ < 30 mpg
x = 27
se = 4 mpg
Are we going to reject H0 for
values of x greater than 30 or
less than 30?
Hypothesis Tests
PROJECT QUESTION
H0: μ ≥ 30 mpg
Ha: μ < 30 mpg
x = 27
se = 4 mpg
If the critical value for a one-sided confidence interval test
at the 5% level is 1.64, create
a test of our hypothesis
Hypothesis Tests
PROJECT QUESTION
H0: μ ≥ 30 mpg
Ha: μ < 30 mpg
x = 27
se = 4 mpg
Reject H0 if x < 30 - (1.64)(4) = 23.44
What is our conclusion?
fail to reject H0
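Written out step by step, the arithmetic behind this conclusion looks like this. Here \bar{x} is the sample mean that the slides write as x, \mu_0 is the hypothesized mean of 30, and 1.64 is the one-sided critical value given above.

```latex
\begin{align*}
\text{cutoff} &= \mu_0 - (\text{critical value})(se) \\
              &= 30 - (1.64)(4) = 30 - 6.56 = 23.44 \\
\bar{x} = 27 &\ge 23.44 \quad\Rightarrow\quad \text{fail to reject } H_0
\end{align*}
```

A sample mean of 27 is less than one standard error below 30, so it is well within what we would expect to see if the company's claim were true.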
Questions?
Hypothesis Tests
If you reject H0 with an α-level
of 0.05, we also say our
x value is
“significant at the .05 level”
or we say we found a
“significant difference”
Hypothesis Tests
We can make our x more likely
to be significant by (as usual):
TAKING A LARGER
SAMPLE SIZE
Hypothesis Tests
Because we can “cheat the
system” by taking a huge
sample size that will find any
teeny, tiny difference to be
significant, we have a backup
plan
Hypothesis Tests
We also set levels of “practical
significance” - how large a
numerical difference must be
before we treat it as important
Hypothesis Tests
These levels of practical
significance come from our
knowledge of the variables we
are measuring
Hypothesis Tests
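If we had taken a sample of
10,000,000 to calculate our mpg
average and se, we could easily
have had an se of 0.1 mpg
With an se that small, an average
only a few tenths of an mpg
below 30 would be statistically
significant
Probably we wouldn’t really think
that was an important difference
in mileage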
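The effect of sample size can be sketched in Python using the relationship se = s / √n. The standard deviation of 4 mpg and the sample mean of 29.8 mpg are made-up values for illustration, and the test is the left-tailed one from the mpg question with α = 0.05.

```python
import math
from statistics import NormalDist

Z_ONE_TAILED = NormalDist().inv_cdf(0.95)     # about 1.64 for alpha = 0.05

def is_significant(xbar, n, mu0=30.0, s=4.0):
    """Left-tailed test of H0: mu >= mu0, with se = s / sqrt(n)."""
    se = s / math.sqrt(n)                     # se shrinks as n grows
    return xbar < mu0 - Z_ONE_TAILED * se

for n in (1, 100, 10_000, 1_000_000):
    print(n, is_significant(xbar=29.8, n=n))
```

The same 0.2 mpg shortfall is not significant with small samples but becomes statistically significant once n is large enough, even though nobody would notice it at the pump.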
Hypothesis Tests
A practically significant
difference would be the amount
in mpg that you would think is
different enough from 30 mpg to
be important
Hypothesis Tests
We set a level of practical
significance at the same time
we set the α-level
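A minimal sketch of checking both kinds of significance together is shown below. The function name and the practical margin of 2 mpg are hypothetical choices for illustration; the point is only that both thresholds are fixed before the data are examined.

```python
from statistics import NormalDist

def assess(mu0, xbar, se, alpha, practical_margin):
    """Left-tailed check: is the shortfall statistically and practically significant?"""
    cutoff = mu0 - NormalDist().inv_cdf(1 - alpha) * se
    statistically = xbar < cutoff                     # passes the hypothesis test
    practically = (mu0 - xbar) >= practical_margin    # big enough to matter
    return statistically, practically

# Both thresholds are fixed before looking at any data.
ALPHA = 0.05
PRACTICAL_MARGIN = 2.0    # mpg; a made-up threshold for illustration

print(assess(30, 29.8, 0.04, ALPHA, PRACTICAL_MARGIN))   # (True, False)
```

Here a tiny standard error makes a 0.2 mpg shortfall statistically significant, but it falls far short of the 2 mpg margin chosen as practically important.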
Hypothesis Tests
PROJECT QUESTION
What would be your level of
practically significant
difference for mpg?
Questions?