Statistical Significance - Palisades School District

The Role of Probability in Statistics:
Statistical Significance
Introduction to Probability and Statistics
Ms. Young
Objective
Understand the concept of statistical
significance and the essential role that
probability plays in defining it.
Statistical Significance

A set of measurements or observations is considered to be
statistically significant if it probably DID NOT occur by
chance


Ex. ~ Tossing a coin 100 times and getting 80 heads and 20 tails
would be statistically significant because it probably did not occur
by chance
Example 1:

Determine whether each scenario is statistically significant or not

A detective in Detroit finds that 25 of the 62 guns used in crimes
during the past week were sold by the same gun shop.

This finding is statistically significant. Because there are many gun shops in
the Detroit area, having 25 out of 62 guns come from the same shop seems
unlikely to have occurred by chance.
Example 1 Cont’d…

In terms of the global average temperature, five of the years between
1990 and 1999 were the five hottest years in the 20th century.
Having the five hottest years in 1990–1999 is statistically significant
By chance alone, any particular year in a century would have a 5 in 100, or 1
in 20, chance of being one of the five hottest years. Having five of those
years come in the same decade is very unlikely to have occurred by chance
alone
This statistical significance suggests that the world may be warming up


The team with the worst win-loss record in basketball wins one game
against the defending league champions.

This one win is not statistically significant because although we expect a
team with a poor win-loss record to lose most of its games, we also expect it
to win occasionally, even against the defending league champions
Example 2

A researcher conducts a double-blind experiment that tests whether a
new herbal formula is effective in preventing colds. During a three-month period, the 100 randomly selected people in a treatment group
take the herbal formula while the 100 randomly selected people in a
control group take a placebo. The results show that 30 people in
the treatment group get colds, compared to 32 people in the
control group. Can we conclude that the new herbal formula is
effective in preventing colds?
Whether a person gets a cold during any three-month period depends on
many unpredictable factors. Therefore, we should not expect the number of
people with colds in any two groups of 100 people to be exactly the same.
In this case, the difference between 30 people getting colds in the
treatment group and 32 people getting colds in the control group seems
small enough to be explainable by chance.
So the difference is not statistically significant, and we should not conclude
that the treatment is effective.
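
One way to check the claim that a gap of 2 colds is "explainable by chance" is a small simulation. The sketch below is not part of the original example: it assumes the formula does nothing and that everyone in both groups has the same chance of catching a cold, taken here as 0.31 (62 colds among 200 people). The function name, the number of trials, and the fixed seed are illustrative choices.

```python
import random

def simulate_gap(n_per_group=100, cold_rate=0.31, observed_gap=2,
                 trials=5000, seed=0):
    """Fraction of simulated experiments in which the control group has at
    least `observed_gap` more colds than the treatment group, when both
    groups share the same cold rate (that is, the formula does nothing)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        treatment = sum(rng.random() < cold_rate for _ in range(n_per_group))
        control = sum(rng.random() < cold_rate for _ in range(n_per_group))
        if control - treatment >= observed_gap:
            hits += 1
    return hits / trials

chance_fraction = simulate_gap()
print(f"A gap of 2 or more arises by chance in about {chance_fraction:.0%} of trials")
```

Under these assumptions the fraction comes out far above 0.05, which agrees with the conclusion that the difference is not statistically significant.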

Quantifying Statistical Significance


Determining if something is statistically significant can be obvious in
some cases (e.g., 80 heads vs. 20 tails), but how do you decide if
something is statistically significant if the numbers are closer (e.g., 55
heads vs. 45 tails)?
Probability is used to quantify statistical significance by determining
the likelihood that a result may have occurred by chance

.05 level of significance: if the probability of getting a result at least
this extreme by chance alone is less than or equal to .05, or 5%, then the
result is statistically significant at the .05 level


.01 level of significance: if the probability of getting a result at least
this extreme by chance alone is less than or equal to .01, or 1%, then the
result is statistically significant at the .01 level


In other words, if a result like the one observed would arise by chance only
rarely (5% of the time or less at the .05 level, 1% of the time or less at the
.01 level), then chance alone is an unlikely explanation, which means the
result is statistically significant because it probably did not occur by chance
Something that is significant at the .01 level is also significant at the .05
level (since 1% is less than 5%), but something significant at the .05 level is
not necessarily significant at the .01 level (since something could be
significant at the .05 level if it’s under 5%, but doesn’t have to be as low as
1%)
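
The two coin examples can be quantified directly. The sketch below is an illustration rather than part of the original slides: it assumes a fair coin and uses the exact binomial formula to find the probability of getting at least 80 heads, and at least 55 heads, in 100 tosses. The function name is hypothetical.

```python
from math import comb

def prob_at_least(k, n=100, p=0.5):
    """Exact binomial probability of k or more heads in n tosses."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p_80 = prob_at_least(80)  # 80 or more heads with a fair coin
p_55 = prob_at_least(55)  # 55 or more heads with a fair coin

for heads, prob in ((80, p_80), (55, p_55)):
    if prob <= 0.01:
        verdict = "significant at the .01 level"
    elif prob <= 0.05:
        verdict = "significant at the .05 level"
    else:
        verdict = "not statistically significant"
    print(f"{heads}+ heads: probability {prob:.3g} -> {verdict}")
```

With a fair coin, 80 or more heads is far rarer than 1%, while 55 or more heads happens more often than 5% of the time, so only the first result is statistically significant.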
Example 3

In the test of the Salk polio vaccine, 33 of the 200,000 children in the
treatment group got paralytic polio, while 115 of the 200,000 in the
control group got paralytic polio. Calculations show that the probability
of this difference between the groups occurring by chance is less than
0.01. Describe the implications of this result.

The results are significant at the .01 level. This means that a difference
this large would occur by chance less than 1% of the time if the vaccine had
no effect, so the results probably did not occur by chance, which means that
there is good reason to believe that the treatment works.
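
The example does not show the calculation behind "less than 0.01". One common way to get such a probability is a pooled two-proportion z-test, sketched below. The choice of this test and the function name are assumptions, not something stated in the original example.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """z-score for the difference between two sample proportions,
    using the pooled proportion (assumes no real difference between groups)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error

# Treatment group: 33 of 200,000; control group: 115 of 200,000
z = two_proportion_z(33, 200_000, 115, 200_000)

# Left tail: chance of the treatment rate being this far below the control rate
p_value = NormalDist().cdf(z)

print(f"z = {z:.2f}, P-value = {p_value:.2g}")
print("significant at the .01 level" if p_value <= 0.01 else "not significant at the .01 level")
```

Under this assumed test the probability is far below 0.01, consistent with the statement in the example.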
Fundamentals of Hypothesis Testing
Objective

After this section you will understand the
goal of hypothesis testing and the basic
structure of a hypothesis test, including how
to set up the null and alternative hypotheses,
how to determine the possible outcomes of a
hypothesis test, and how to decide between
these possible outcomes.
Statistical Claims




“Of our 350 million users, more than 50% log on to
Facebook every day”
“Using Gender Choice could increase a woman’s chance
of giving birth to a baby girl up to 80%”
“According to the U.S. Census Bureau, Current
Population Surveys, March 1998, 1999, and 2000, the
average salary of someone with a high school diploma
is $30,400 while the average salary of someone with
a Bachelor's Degree is $52,200.”
How could we determine whether these claims are
true or not?

Hypothesis Testing
Formulating the Hypothesis

A hypothesis is a claim about a population
parameter
Could either be a claim about a population mean, μ,
or a population proportion, p
 All of the claims on the previous slide would be
considered hypotheses


A hypothesis test is a standard procedure for
testing a claim about a population parameter

There are always at least two hypotheses in any
hypothesis test; the null & alternative hypotheses
Null Hypothesis

The null hypothesis, represented as $H_0$ (read
as “H-naught”), is the starting assumption for
a hypothesis test

The null hypothesis always claims a specific value
for a population parameter and therefore takes the
form of an equality

Take the claim, “using Gender Choice could increase a
woman’s chance of giving birth to a baby girl up to 80%”
for example. If the product did not work, it would be
expected that there would be an approximately equally
likely chance of having either a boy or a girl. Therefore,
the null hypothesis (the claim not working) would be:
null hypothesis - $H_0: p = 0.5$
Alternative Hypothesis

The alternative hypothesis, represented as $H_a$, is a claim that
the population parameter has a value that differs from the
value claimed in the null hypothesis, or in other words, the claim
does hold true

The alternative hypothesis can take one of the following forms:

left tailed $H_a$: population parameter < claimed value
Ex. ~ A manufacturing company claims that their new hybrid model
gets 62 mpg. A consumer group claims that the mean fuel
consumption of this vehicle is less than 62 mpg.
This alternative hypothesis would be considered left-tailed since
the values it claims are smaller than (to the left of) the null value
null hypothesis - $H_0: \mu = 62$ mpg
alternative hypothesis - $H_a: \mu < 62$ mpg

right tailed $H_a$: population parameter > claimed value
Ex. ~ The claim that Gender Choice increases a woman’s chance of
having a baby girl up to 80% would be testing values above the null
value of .5, and would therefore be right-tailed
null hypothesis - $H_0: p = 0.5$
alternative hypothesis - $H_a: p > 0.5$
Alternative Hypothesis Cont’d…

two tailed $H_a$: population parameter ≠ claimed value

Ex. ~ A wildlife biologist working in the African
savanna claims that the actual proportion of
female zebras in the region is different from the
accepted proportion of 50%.

Since the claim does not specify whether the alternative
hypothesis is above 50% or below 50%, it would be
considered two-tailed in which case the values above and
below would be tested
null hypothesis - $H_0: p = 0.5$
alternative hypothesis - $H_a: p \neq 0.5$
Possible Outcomes of a Hypothesis Test

There are two possible outcomes to a
hypothesis test:
Reject the null hypothesis in which case we have
evidence in support of the alternative hypothesis
 Do Not reject the null hypothesis in which case we
do not have enough evidence to support the
alternative hypothesis


NOTE – Accepting the null hypothesis is not a possible
outcome since it is the starting assumption.


The test may provide evidence to NOT REJECT the null
hypothesis, but that does not mean that the null hypothesis
is true
Be sure to formulate the null and alternative
hypotheses prior to choosing a sample to avoid
bias
Example 1

For the following case, describe the possible
outcomes of a hypothesis test and how we
would interpret these outcomes

The manufacturer of a new model of hybrid car
advertises that the mean fuel consumption is equal
to 62 mpg on the highway (μ = 62 mpg). A consumer
group claims that the mean is less than 62 mpg
(μ < 62 mpg).

Possible outcomes:


Reject the null hypothesis of μ = 62 mpg in which case we
have evidence in support of the consumer group’s claim that
the mean mpg of the new hybrid is less than 62
Do not reject the null hypothesis, in which case we lack
evidence to support the consumer group’s claim
 Note – this does not necessarily imply that the
manufacturer’s claim is true though
Drawing a Conclusion from a Hypothesis Test

Using the claim that Gender Choice could increase a
woman’s chance of giving birth to a baby girl up to
80%, suppose that a sample produces a sample
proportion of $\hat{p} = 0.52$.

Although this supports the alternative hypothesis of $p > 0.5$,
is it enough evidence to reject the null hypothesis?


This is where statistical significance comes into play (introduced
earlier)
Recall that something is considered to be statistically
significant if it most likely DID NOT occur by chance

There are two commonly used levels of statistical significance



The 0.05 level ~ which means that if the probability of a
particular result occurring by chance is less than 0.05, or 5%,
then it is considered to be statistically significant at the 0.05
level
The 0.01 level ~ which means that if the probability of a
particular result occurring by chance is less than 0.01, or 1%,
then it is considered to be statistically significant at the 0.01
level
The 0.01 level would represent a stronger significance than
the 0.05 level
Hypothesis Test Decisions
Based on Levels of Statistical Significance

We decide the outcome of a hypothesis test by
comparing the actual sample result (mean or
proportion) to the result expected if the null
hypothesis is true (using z-scores). We must choose a
significance level for the decision.



If the probability of getting a sample result at least this extreme by
chance (assuming the null hypothesis is true) is less than 0.01, then the
test is statistically significant at the 0.01 level and offers STRONG
evidence for rejecting the null hypothesis.
If that probability is less than 0.05, then the test is statistically
significant at the 0.05 level and offers MODERATE evidence for
rejecting the null hypothesis.
If that probability is greater than the chosen level of significance
(0.01 or 0.05), then we DO NOT reject the null hypothesis.
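
The decision rule above can be written as a short function. This is a minimal sketch: the function name and the wording of the returned messages are illustrative, and treating a probability exactly equal to the chosen level as significant is an assumption based on the "less than or equal to" wording used earlier.

```python
def decide(p_value, level=0.05):
    """Apply the decision rule for a chosen significance level (0.05 or 0.01).

    A probability exactly equal to the level is treated as significant;
    that tie-breaking choice is an assumption.
    """
    if p_value > level:
        return "do not reject H0"
    strength = "strong" if p_value <= 0.01 else "moderate"
    return f"reject H0 ({strength} evidence)"

print(decide(0.0228))        # reject H0 (moderate evidence)
print(decide(0.0228, 0.01))  # do not reject H0
print(decide(0.0006, 0.01))  # reject H0 (strong evidence)
```

Note that the same probability (0.0228) leads to different decisions depending on the level chosen in advance.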
P-Values

A P-Value, or probability value, is the probability of
selecting a sample at least as extreme as the observed
sample, assuming that the null hypothesis is true





In other words, it is the value that allows us to determine if
something is statistically significant or not
NOTE ~ notice that the P-Value is represented using a capital
P, whereas the population proportion is represented using a
lowercase p.
We will learn how to actually calculate the P-Value in the
following sections
A small P-value indicates that the observed result is
unlikely (therefore statistically significant) and
provides evidence to reject the null hypothesis
A large P-value indicates that the sample result is not
unusual, therefore not statistically significant - or
that it could easily occur by chance, which tells us to
NOT reject the null hypothesis
Example 2

You suspect that a coin may have a bias toward landing tails more
often than heads, and decide to test this suspicion by tossing the
coin 100 times. The result is that you get 40 heads (and 60 tails).
A calculation (not shown here) indicates that the probability of
getting 40 or fewer heads in 100 tosses with a fair coin is
0.0228. Find the P-value and level of statistical significance for
your result. Should you conclude that the coin is biased against
heads?
The P-Value is 0.0228
 This value is smaller than 5% (.05), but not smaller than 1% (.01), so
it is statistically significant at the 0.05 level which gives us moderate
reason to reject the null hypothesis and conclude that the coin is
biased against heads
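
The example says the calculation is "not shown here". The value 0.0228 matches what a normal approximation to the coin tosses gives, so the sketch below assumes that method (an inference, not something the example states) and also computes the exact binomial probability for comparison.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, heads = 100, 0.5, 40

# Normal approximation for the number of heads with a fair coin
mean = n * p                # expected number of heads: 50
sd = sqrt(n * p * (1 - p))  # standard deviation: 5
z = (heads - mean) / sd     # z-score for 40 heads
p_normal = NormalDist().cdf(z)  # area to the left of z

# Exact binomial probability of 40 or fewer heads, for comparison
p_exact = sum(comb(n, k) for k in range(heads + 1)) / 2**n

print(f"z = {z:.2f}")
print(f"normal approximation: {p_normal:.4f}")
print(f"exact binomial:       {p_exact:.4f}")
```

The exact probability is somewhat larger than the approximation, but both fall between 0.01 and 0.05, so the conclusion (significant at the 0.05 level, not at the 0.01 level) is the same either way.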

Putting It All Together
Step 1. Formulate the null and alternative hypotheses, each of which must
make a claim about a population parameter, such as a population
mean (μ) or a population proportion (p); be sure this is done
before drawing a sample or collecting data. Based on the form of
the alternative hypothesis, decide whether you will need a left-,
right-, or two-tailed hypothesis test.
Step 2. Draw a sample from the population and measure the sample
statistics, including the sample size (n) and the relevant sample
statistic, such as the sample mean ($\bar{x}$) or sample proportion ($\hat{p}$).
Step 3. Determine the likelihood of observing a sample statistic (mean or
proportion) at least as extreme as the one you found under the
assumption that the null hypothesis is true. The precise
probability of such an observation is the P-value (probability
value) for your sample result.
Step 4. Decide whether to reject or not reject the null hypothesis, based
on your chosen level of significance (usually 0.05 or 0.01, but
other significance levels are sometimes used).
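
The four steps can be sketched in code for a claim about a population mean. This is an illustration under stated assumptions: it uses the normal distribution (appropriate for large samples), the function name and the `tail` labels are made up for this sketch, and the sample numbers are hypothetical, not taken from any example in these slides.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(sample_mean, null_mean, sample_sd, n, tail):
    """Return (z, P-value) for a hypothesis test about a population mean.

    tail is "left", "right", or "two", matching the alternative hypothesis.
    """
    z = (sample_mean - null_mean) / (sample_sd / sqrt(n))
    left_area = NormalDist().cdf(z)  # area to the left of z
    if tail == "left":
        p_value = left_area
    elif tail == "right":
        p_value = 1 - left_area
    elif tail == "two":
        p_value = 2 * min(left_area, 1 - left_area)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p_value

# Step 1 (hypothetical): H0: mu = 50, Ha: mu > 50, so a right-tailed test
# Step 2 (hypothetical sample): n = 64, sample mean 51.2, sample sd 4.0
z, p_value = z_test_mean(sample_mean=51.2, null_mean=50.0,
                         sample_sd=4.0, n=64, tail="right")

# Steps 3 and 4: compare the P-value to the chosen significance level
level = 0.05
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print("reject H0" if p_value <= level else "do not reject H0")
```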
Hypothesis Tests for Population Means
Objective

After this section you will understand and
interpret one- and two-tailed hypothesis
tests for claims made about population
means.
Background Info

Recall that there are two possible outcomes of a
hypothesis test; to either reject or not reject the null
hypothesis


To determine whether to reject or not, a P-value needs to be
calculated and then compared to the desired level of
significance (usually .05 or .01).
To calculate a P-value, you must first understand the
concepts of a normal distribution (introduced in ch.5):



Recall that if a distribution is normal, you can use z-scores
along with a z-score table to find probabilities of certain
values occurring
Also recall that the distribution of sample means begins to take the
shape of a normal distribution when the sample size is at least 30 and
becomes more and more normal as the sample size increases
(Central Limit Theorem)
In essence, a P-value (probability value) is the probability that
is found using z-scores and the z-score table

Be sure that you are using the standard deviation of the sample means,
$s / \sqrt{n}$ (where $s$ is the sample standard deviation and $n$ is the
sample size), when calculating the z-score since you are comparing a
group mean to the mean of the entire population
One-Tailed Hypothesis Tests

As mentioned earlier, hypothesis tests can either be one-tailed (left or
right) or two-tailed


The process for conducting a left-tailed test is the same as the process for
conducting a right-tailed test, but a two-tailed test varies slightly
Example 1 ~ Left-Tailed Hypothesis Test:

Columbia College advertises that the mean starting salary of its graduates is
$39,000. The Committee for Truth in Advertising suspects that this claim is
exaggerated and that the mean starting salary for graduates is actually
lower. They decide to conduct a hypothesis test to seek evidence to support
this suspicion. Suppose that the committee gathered a sample of 100
graduates and found that the sample mean is $\bar{x} = \$37{,}000$ and the standard
deviation for that sample is $s = \$6{,}150$
Step 1: State the null and alternative hypotheses
$H_0: \mu = \$39{,}000$

$H_a: \mu < \$39{,}000$
Step 2: Draw a sample and come up with a sample statistic and the
standard deviation of that sample:
$n = 100$, $\bar{x} = \$37{,}000$, $s = \$6{,}150$

Example 1 Cont’d…

Step 3: Calculate the P-value (using the normal distribution and
z-scores) and determine the level of significance

In order to calculate the P-value, we need to find the z-score using the Central
Limit Theorem since we are dealing with the mean of a group. Since we do not
know the population standard deviation, we will use the standard deviation
found for the sample as an estimate.

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \approx \frac{37{,}000 - 39{,}000}{6150 / \sqrt{100}} \approx -3.25$$

Using the z-score table we find that a z-score of -3.25 corresponds to a
probability of .0006, or .06%. This is the P-value.
 Since this value is less than .05 it is significant at the .05 level, but even
better, this value is less than .01 which means that it is significant at the
.01 level

Step 4: Decide if you should reject or not reject the null
hypothesis


Since the P-value is significant at both levels (.05 and .01), we should reject
the null hypothesis of $39,000
What this means is that we have strong evidence to believe that Columbia
College exaggerated about the mean starting salary of their graduates being
$39,000 and that it is most likely lower.
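
The z-score and table lookup for this left-tailed example can be checked with a few lines of code. In this sketch, `NormalDist().cdf` from Python's standard library stands in for the z-score table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 100
sample_mean = 37_000  # sample mean starting salary
sample_sd = 6_150     # sample standard deviation, used in place of sigma
null_mean = 39_000    # value claimed by the null hypothesis

standard_error = sample_sd / sqrt(n)            # 6150 / 10
z = (sample_mean - null_mean) / standard_error  # about -3.25

# Left-tailed test: the P-value is the area to the left of z
p_value = NormalDist().cdf(z)

print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

The result agrees with the table value of .0006, which is below both .05 and .01.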
One-Tailed Hypothesis Tests
Example 2 ~ Right-Tailed Hypothesis Test

In the United States, the average car is driven about 12,000 miles each year. The owner
of a large rental car company suspects that for his fleet, the mean distance is greater
than 12,000 miles each year. He selects a random sample of n = 225 cars from his fleet
and finds that the mean annual mileage for this sample is $\bar{x} = 12{,}375$ miles. Suppose that
the standard deviation for that sample is 2,415 miles. Interpret this claim by conducting
a hypothesis test.

Step 1: State the null and alternative hypotheses
$H_0: \mu = 12{,}000$ miles

$H_a: \mu > 12{,}000$ miles
Step 2: Draw a sample and come up with a sample statistic and the standard
deviation of that sample
This information was already given:
The sample mean is $\bar{x} = 12{,}375$ miles
The standard deviation for that sample is 2,415 miles
Step 3: Calculate the P-value and determine the level of significance:
The z-score is:
$$z = \frac{12{,}375 - 12{,}000}{2415 / \sqrt{225}} \approx 2.33$$
One-Tailed Hypothesis Tests
Example 2 Cont’d…

Step 3 cont’d…



The z-score was found to be 2.33 which corresponds to a probability of .9901
on the z-score table, but that represents the area below 12,375 and we are
interested in knowing the probability of a sample mean greater than that value
so we subtract .9901 from 1 (1 - .9901) and get a probability of .0099
The P-value is .0099 which is less than .01, meaning that it is significant at the
.01 level
Step 4: Decide if you should reject or not reject the null hypothesis


Since the P-value is significant at both levels (.05 and .01), we should reject
the null hypothesis of 12,000 miles
What this means is that we have strong evidence to believe that the mean
distance traveled for the rental car fleet is greater than 12,000 miles
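
For a right-tailed test the table gives the area below the z-score, so the P-value is the complement. The sketch below repeats the rental car calculation with `NormalDist().cdf` standing in for the z-score table; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 225
sample_mean = 12_375  # mean annual mileage in the sample
sample_sd = 2_415     # sample standard deviation
null_mean = 12_000    # value claimed by the null hypothesis

z = (sample_mean - null_mean) / (sample_sd / sqrt(n))  # about 2.33

area_below = NormalDist().cdf(z)  # what the z-score table reports
p_value = 1 - area_below          # right tail: area above z

print(f"z = {z:.2f}, area below = {area_below:.4f}, P-value = {p_value:.4f}")
```

Because the code uses the unrounded z-score, the area below comes out slightly different from the table value for z = 2.33, but the P-value still rounds to .0099.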
Critical Values for Statistical Significance


Since we can decide to reject the null hypothesis if the P-value is .05 or
lower (or .01 or lower), we can use critical values as a quick guideline to
decide if we should reject the null hypothesis or not
Critical values for .05 significance level:
For a left-tailed test, the z-score that corresponds to a probability of .05 is
-1.645, so any z-score that is less than or equal to -1.645 will be statistically
significant at the .05 level
 For a right-tailed test, the z-score that corresponds to a probability of .05
(which we would look for .95 on the chart) is 1.645, so any z-score greater
than or equal to 1.645 will be statistically significant at the .05 level


Critical values for the .01 significance level:
For a left-tailed test, the z-score that corresponds to a probability of .01 is
-2.33, so any z-score that is less than or equal to -2.33 will be statistically
significant at the .01 level
 For a right-tailed test, the z-score that corresponds to a probability of .01
(which we would look for .99 on the chart) is 2.33, so any z-score greater than
or equal to 2.33 will be statistically significant at the .01 level

Critical Values for Statistical Significance
Two-Tailed Hypothesis Tests


The process for conducting a two-tailed hypothesis test is very similar to
the one-tailed tests, except the critical values are slightly different
Since a two tailed test tests both above and below the claimed value, a
.05 significance level would have to be split between the two extremes
thus looking for a z-score that corresponds to a probability of .025

The z-scores that correspond to a probability of .025 are -1.96 and
1.96, so for a two-tailed test, it is significant at the .05 level if the
z-score is less than or equal to -1.96 or greater than or equal to 1.96
Two-Tailed Hypothesis Tests

For a two-tailed test, a .01 significance level would mean that the
z-score needs to correspond to a probability of .005 (.01 split in
half)


The z-scores that correspond to a probability of .005 are -2.575 and
2.575, so if the z-score is less than or equal to -2.575 or greater
than or equal to 2.575, then it is statistically significant at the .01
level
Summary of critical values for two-tailed tests:

.05 significance level:
$z \le -1.96$ or $z \ge 1.96$

.01 significance level:
$z \le -2.575$ or $z \ge 2.575$
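
The critical values for one-tailed and two-tailed tests can be reproduced by reading the normal distribution "backwards", from a probability to a z-score. The sketch below uses `NormalDist().inv_cdf` from Python's standard library in place of the z-score table; the function name is illustrative.

```python
from statistics import NormalDist

def critical_z(level, two_tailed=False):
    """Positive critical z-score for a significance level.

    For a two-tailed test the level is split evenly between both tails.
    For a left-tailed test, use the negative of the returned value.
    """
    tail_area = level / 2 if two_tailed else level
    return NormalDist().inv_cdf(1 - tail_area)

for level in (0.05, 0.01):
    one = critical_z(level)
    two = critical_z(level, two_tailed=True)
    print(f"level {level}: one-tailed {one:.3f}, two-tailed +/-{two:.3f}")
```

The computed values agree with the table-based values used above (1.645, 2.33, 1.96, and 2.575) to within rounding.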
Two-Tailed Hypothesis Tests
Example 3 ~ Two-Tailed Hypothesis Test:

Consider the study in which University of Maryland researchers
measured body temperatures in a sample of n = 106 healthy adults,
finding a sample mean body temperature of $\bar{x} = 98.20$°F with a
sample standard deviation of 0.62°F. We will assume that the
population standard deviation is the same as the standard deviation
found from the sample. Determine whether this sample provides
evidence for rejecting the common belief that the mean human body
temperature is $\mu = 98.60$°F

Step 1: State the null and alternative hypotheses
$H_0: \mu = 98.6$°F

$H_a: \mu \neq 98.6$°F
Step 2: Draw a sample and come up with a sample statistic and the
standard deviation of that sample
This information was already given:
The sample mean is $\bar{x} = 98.20$°F
The standard deviation for that sample is 0.62°F
Two-Tailed Hypothesis Tests
Example 3 Cont’d…

Step 3: Calculate the P-value and determine the level of significance


To calculate the P-value for a two-tailed test, you must find the z-score
like you would with a one-tailed test, but the probability that corresponds
to it must then be multiplied by 2
The z-score is:
$$z = \frac{98.2 - 98.6}{0.62 / \sqrt{106}} \approx -6.64$$
The P-value is less than .0002 (.0001 × 2), and since the z-score of -6.64 is
far below both -1.96 and -2.575, this would be statistically significant
at both levels
Step 4: Decide if you should reject or not reject the null hypothesis

The null hypothesis should be rejected which provides strong evidence that
the mean human body temperature is not 98.6°F. It may be either higher or
lower.
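
The two-tailed calculation for the body temperature example can be checked as follows. This is a sketch in which `NormalDist().cdf` stands in for the z-score table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 106
sample_mean = 98.20  # sample mean body temperature, degrees F
sample_sd = 0.62     # sample standard deviation, degrees F
null_mean = 98.60    # value claimed by the null hypothesis

z = (sample_mean - null_mean) / (sample_sd / sqrt(n))  # about -6.64

# Two-tailed test: take the area in one tail and double it
one_tail = NormalDist().cdf(-abs(z))
p_value = 2 * one_tail

print(f"z = {z:.2f}, P-value = {p_value:.2g}")
print("significant at the .01 level" if p_value <= 0.01 else "not significant at the .01 level")
```

The computed P-value is much smaller than the bound of .0002 obtained from the table, so the conclusion is unchanged.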