7.2 THE LANGUAGE OF STATISTICAL DECISION MAKING
DEFINITIONS:
The population is the entire group of objects or individuals under
study, about which information is desired.
A sample is a part of the population that is actually used to get
information.
Statistical inference is the process of drawing conclusions about the
population based on information from a sample of that population.
DEFINITIONS:
The null hypothesis, denoted by H0, is the conventional belief: the status quo, or
prevailing viewpoint, about a population.
The alternative hypothesis, denoted by H1, is an alternative to the null hypothesis:
the statement that there is an effect, a difference, or a change in the population.
Tip: Having trouble determining the alternative hypothesis? Ask
yourself “Why is the research being conducted?” The answer is
generally the alternative hypothesis, which is why the alternative
hypothesis is often referred to as the research hypothesis.
Aspirin Cuts Cancer Risk
Problem 1
According to the American Cancer Society, the lifetime risk of developing colon cancer
is 1 in 16. A study suggests that taking an aspirin every other day for 20 years can cut
your risk of colon cancer nearly in half. However, the benefits may not kick in until at
least a decade of use.
(a) Write the null and the alternative hypotheses for this setting.
(b) Is this a one-sided to the right, one-sided to the left, or two-sided alternative
hypothesis? Hint: look at the alternative hypothesis.
Solution 1
(a) H0: Taking an aspirin every other day for 20 years will not change the risk of
getting colon cancer, which is 1 in 16.
H1: Taking an aspirin every other day for 20 years will reduce the risk of getting
colon cancer, so the risk will be less than 1 in 16.
(b) The alternative hypothesis is one-sided to the left.
Average Life Span
Problem 2
Suppose you work for a company that produces cooking pots with an average life span of
seven years. To gain a competitive advantage, you suggest using a new material that
claims to extend the life span of the pots. You want to test the hypothesis that the
average life span of the cooking pots made with this new material increases.
(a) Write the null and the alternative hypotheses for this setting.
(b) Is this a one-sided to the right, one-sided to the left, or two-sided alternative
hypothesis? Hint: look at the alternative hypothesis.
Solution 2
(a) H0: The average life span of the new cooking pots is 7 years.
H1: The average life span of the new cooking pots is greater than 7 years.
(b) The alternative hypothesis is one-sided to the right.
Poll Results
Problem 3
Based on a previous poll, the percentage of people who said they plan to vote for the
Democratic candidate was 50%. The presidential candidates will have daily televised
commercials and a final political debate during the week before the election. You want
to test the hypothesis that the population proportion of people who say they plan to
vote for the Democratic candidate has changed.
(a) Write the null and the alternative hypotheses for this setting.
(b) Is this a one-sided to the right, one-sided to the left, or two-sided alternative
hypothesis? Hint: look at the alternative hypothesis.
Solution 3
(a) H0: The percentage of the Democratic votes in the upcoming election will be 50%.
H1: The percentage of the Democratic votes in the upcoming election will be different
from 50%.
(b) The alternative hypothesis is two-sided.
Let's Do It! Fair Die?
In a famous die experiment, out of 315,672 rolls, a total of 106,656 resulted in
either a ‘5’ or a ‘6’. If the die is “fair” (that is, each of the six outcomes has the
same chance of occurring), then the true proportion of 5’s or 6’s should be 1/3.
However, a close examination of a real die reveals that the “pips” are made by small
indentations into the faces of the die. Sides 5 and 6 have more indentations than the
other faces, and so these sides should be slightly lighter than the other faces. This
suggests that the true proportion of 5’s or 6’s may be a bit higher than the “fair” value,
1/3.
State the appropriate null and alternative hypotheses for assessing if the data provide
compelling evidence for the competing theory.
H0: The die is fair; that is, the indentations have no effect, and the proportion of
5’s or 6’s is __________________.
H1: The die is not fair; that is, the indentations do have an effect, and the proportion
of 5’s or 6’s is _________________.
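As a quick arithmetic check on this setting, the short Python sketch below computes the observed proportion of 5’s or 6’s from the counts reported for the experiment and compares it with the fair-die value of 1/3. The variable names are illustrative, and the sketch only describes the sample; it does not carry out the test.

```python
rolls = 315_672            # total number of rolls reported for the experiment
fives_or_sixes = 106_656   # rolls that showed a 5 or a 6

observed = fives_or_sixes / rolls   # sample proportion of 5's or 6's
fair = 1 / 3                        # proportion expected if the die is fair

print(f"observed proportion: {observed:.5f}")
print(f"fair-die proportion: {fair:.5f}")
print(f"difference:          {observed - fair:.5f}")
```

Whether a difference of this size is compelling evidence against the fair-die value is exactly the question the hypotheses are meant to address.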
Let's Do It! Stress Can Cause Sneezes
FROM THE NEW YORK TIMES
Studies suggest that stress doubles a person’s risk of getting a cold. Acute stress,
lasting maybe only a few minutes, can lead to colds. One mystery that is still prevalent
in cold research is that while many individuals are infected with the cold virus, very
few actually get the cold. On average, up to 90% of people exposed to a cold virus become
infected, meaning the virus multiplies in the body, but only 40 percent actually become
sick. One researcher thinks that the accumulation of stress predisposes an infected
person to illness.
Excerpts from the article “Stress can cause sneezes” (The New York Times, January 21,
1997):
Winter can give you a cold because it forces you indoors with coughers, sneezers and
wheezers. Toddlers can give you a cold because they are the original Germs “R” Us. But
can going postal with the boss or fretting about marriage give a person a post-nasal
drip?
Yes, say a growing number of researchers. A psychology professor at Carnegie Mellon
University in Pittsburgh, Dr. Sheldon Cohen, said his most recent studies suggest that
stress doubles a person’s risk of getting a cold.
The percentage of people exposed to a cold virus who actually get a cold is 40%. The
researcher would like to assess if stress increases this percentage. So, the population
of interest is people who are under (acute) stress. State the appropriate hypotheses for
assessing the researcher’s theory.
H0: ________________________________________________
H1: ________________________________________________
Tip: Having trouble determining the alternative hypothesis? Ask yourself “Why is the
research being conducted?” The answer is generally the alternative hypothesis, which is
why the alternative hypothesis is often referred to as the research hypothesis.
Example
Is the New Drug Better?
Suppose that you have developed a new and very expensive drug intended to
cure some disease. You wish to assess how well your new drug performs
compared to the standard drug by testing the following hypotheses:
H0: The new drug is as effective as the standard drug.
H1: The new drug is more effective than the standard drug.
A study is conducted in which the investigator administers the new drug to some
number of patients suffering from the disease and the standard drug to another
group of patients suffering from the disease. The proportion of cures for both
drugs is recorded. Based on this information, we have to decide which hypothesis
to support.
(a) If the proportion of subjects cured with the new drug was exactly equal to the
proportion of subjects cured with the standard drug, which hypothesis would you support?
(b) If 75% of the subjects were cured with the new drug while 55% of the subjects were
cured with the standard drug, for a difference in cure rates of 20%, which hypothesis
would you support? (Define the difference in cure rates as the percent of subjects cured
with the new drug less the percent cured with the standard drug.)
(c) If the difference in cure rates was equal to 2%, which hypothesis would you support?
(d) How large a difference in the cure rates is needed for you to feel confident in
rejecting the null hypothesis?
What we’ve learned: In real situations, the decision maker chooses among alternative
courses of action. Depending on how they weigh the consequences of the decision, one
researcher might decide to reject the null hypothesis while another might decide not to
reject it.
Let’s Do It! Null and Alternative Hypotheses
State the null and alternative hypotheses that would be used to test the following
statements. These statements are the researcher’s claims, to be stated as the
alternative hypotheses. All hypotheses should be expressed in terms of the population
mean of interest.
(a) The mean age of patients at a hospital is more than 60 years.
H0: ________________ versus H1: ________________
(b) The mean caffeine content in a cup of regular coffee is less than 110 mg.
H0: ________________ versus H1: ________________
(c) The average number of emergency room admissions per day differs from 20.
H0: ________________ versus H1: ________________
What Errors Could We Make?
DEFINITION:
Rejecting the null hypothesis H0 when in fact it is true is called a Type I error.
Failing to reject the null hypothesis H0 when in fact it is false is called a Type II
error.
Remember that ...
A Type I error can only be made if the null hypothesis is true.
A Type II error can only be made if the alternative hypothesis is true.
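To make the idea of a Type I error concrete, the sketch below simulates many samples from a population in which the null hypothesis is true and counts how often a right-tailed t-test rejects it anyway. Everything in it is an illustrative assumption: the normal population with mean 42,000 and standard deviation 5,000, the sample size of 30, and the cut-off of 1.699, which is the t-table value for a 5% right tail with 29 degrees of freedom.

```python
import math
import random
import statistics

random.seed(1)            # fixed seed so the run is repeatable

MU_0 = 42_000             # hypothesized mean; H0 is true in this simulation
SIGMA = 5_000             # assumed population standard deviation
N = 30                    # sample size, so there are 29 degrees of freedom
CRITICAL_VALUE = 1.699    # t-table cut-off for a 5% right tail, df = 29
TRIALS = 5_000

rejections = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU_0, SIGMA) for _ in range(N)]
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)                  # sample standard deviation
    t_stat = (x_bar - MU_0) / (s / math.sqrt(N))
    if t_stat > CRITICAL_VALUE:                   # rejecting a true H0: Type I error
        rejections += 1

type_1_rate = rejections / TRIALS
print(f"Share of samples that wrongly reject H0: {type_1_rate:.3f}")
```

Because H0 is true in every simulated sample, each rejection is a Type I error, and the share of rejections should land near 0.05, the tail area used to choose the cut-off.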
Let’s Do It! Rain, Rain, Go Away!
You plan to walk to a party this evening. Are you going to carry an umbrella with you?
You don’t want to get wet if it should rain. So you wish to test the following hypotheses:
H0: Tonight it is going to rain.
H1: Tonight it is not going to rain.
(a) Describe the two types of error that you could make when deciding between these two
hypotheses.
(b) What are the consequences of making each type of error?
DEFINITION:
The data collected are said to be statistically significant if they are very unlikely
to be observed under the assumption that H0 is true. If the data are statistically
significant, then our decision would be to reject H0.
DEFINITION:
The direction of extreme corresponds to the position of the values that are more likely
under the alternative hypothesis H1 than under the null hypothesis H0. If the larger
values are more likely under H1 than under H0, then the direction of extreme is said to
be to the right.
DEFINITION:
The value that is least likely under the null hypothesis H0, but at the same time is
very likely under the alternative hypothesis H1, is called the most extreme value.
DEFINITION:
A critical region is the set of values for which you would reject the null hypothesis
H0. Such values are contradictory to the null hypothesis and favor the alternative
hypothesis H1.
A non-critical region is the set of values for which you would not reject the null
hypothesis H0.
The cut-off value or critical value is the value which marks the starting point of the
set of values that comprise the rejection region.
Testing Hypotheses about the Mean
The Central Limit Theorem: The Distribution of the Sample Mean $\bar{X}$
If a simple random sample of size n is taken from a normally distributed population
having population mean μ and population standard deviation σ, or if the sample size n is
large enough (n ≥ 30), then the distribution of all possible sample averages $\bar{X}$
is symmetric, with mean μ and standard deviation $\sigma/\sqrt{n}$.
When σ is unknown and is replaced by the sample standard deviation s, the standardized
sample mean $T = (\bar{X} - \mu)/(s/\sqrt{n})$ follows a symmetric distribution called
the Student's t-distribution.
The Student’s t-Distribution with (n-1) degrees of freedom
Properties of the Student’s t-distribution
• The t-distribution has a symmetric bell-shaped density centered at 0.
• The t-distribution is “flatter” and has “heavier tails” than the N(0,1) distribution.
• As the sample size increases, the t-distribution approaches the N(0,1) distribution.
Refer to Table F in your textbook for the t-distribution.
Definition
The critical value separates the critical region from the noncritical region. The
symbol for critical value is C.V.
The critical or rejection region is the range of values of the test value that
indicates that there is a significant difference and that the null hypothesis
should be rejected.
The noncritical or non-rejection region is the range of values of the test value
that indicates that the difference was probably due to chance and that the null
hypothesis should not be rejected.
A one-tailed test indicates that the null hypothesis should be rejected when the
test value is in the critical region on one side of the mean. A one-tailed test is
either a right-tailed test or left-tailed test, depending on the direction of the
inequality of the alternative hypothesis.
Example
Finding the Critical Value for α = 0.01 (Right-Tailed Test) using the t-distribution
with d.f. = 120.
Solution
The critical value for α = 0.01 is 2.358.
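The table lookup can be cross-checked numerically. The sketch below is one way to do it in plain Python, under the assumption that integrating the t density with Simpson's rule and solving for the cut-off by bisection is accurate enough for table-level precision. The helper names t_pdf, t_cdf, and t_critical are illustrative and not from the textbook. If SciPy is available, scipy.stats.t.ppf(0.99, 120) performs the same lookup.

```python
import math

def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x): 0.5 plus the Simpson's-rule area between 0 and x."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(area_to_left, df):
    """Value with the given area to its left, found by bisection."""
    lo, hi = -20.0, 20.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area_to_left:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

alpha = 0.01
cv = t_critical(1 - alpha, 120)   # right-tailed test: area alpha lies to the right
print(f"critical value: {cv:.3f}")
```

For a left-tailed test the same helper would be called with the area alpha itself, and for a two-tailed test with 1 - alpha/2, giving the positive cut-off of a symmetric pair.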
Let’s Do It! Critical Values
a. Find the critical value for α = 0.05 (Left-Tailed Test) using the t-distribution
with d.f. = 22.
b. Find the critical values for α = 0.05 (Two-Tailed Test) using the t-distribution
with d.f. = 27.
Example
A researcher thinks that the average salary of assistant professors is
more than $42,000. A sample of 30 assistant professors has a mean
salary of $43,260. At α = 0.05, test the claim that assistant professors
earn more than $42,000 a year. The standard deviation of the sample
is $5230.
Step 1: State the hypotheses and identify the claim.
H0: μ = $42,000 (or H0: μ ≤ $42,000)
H1: μ > $42,000 (this is the researcher’s claim)
Step 2: Find the critical value (the cut-off value for an α of 0.05).
Since α = 0.05 and the test is a right-tailed test, the critical value is
t(29, 0.95) = +1.699. (This means that any sample mean more than 1.699 standard errors
above the hypothesized mean will be considered extreme under the null hypothesis.)
Step 3: Compute the standardized t-test statistic and the p-value.
$$T = \frac{\$43{,}260 - \$42{,}000}{\$5230/\sqrt{30}} = 1.32$$
(So our sample average of $43,260 is 1.32 standard errors above $42,000; a sample
average of $43,260 is fairly likely to happen under the null hypothesis H0.)
p-value = (area to the right of 1.32)
= tcdf(1.32, 10^99, 29)
= 0.0986
Step 4: Make the decision.
Since the test statistic, +1.32, is less than the critical value, +1.699, and is not in
the critical region, the decision is not to reject the null hypothesis.
Note that the p-value 0.0986 > 0.05, so we do not reject H0.
Step 5: Summarize the results.
There is not enough evidence to support the claim that assistant professors earn more
on average than $42,000 a year.
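The arithmetic behind the test statistic, the p-value, and the decision can be retraced in code. This is a sketch rather than the textbook's method: it reuses the example's summary numbers and approximates t-distribution areas with Simpson's rule in plain Python, and the helper names are illustrative. With SciPy installed, scipy.stats.t.sf(t_stat, 29) would give the same right-tail area.

```python
import math

def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x): 0.5 plus the Simpson's-rule area between 0 and x."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

# Summary numbers from the salary example.
x_bar, mu_0, s, n = 43_260, 42_000, 5_230, 30
alpha = 0.05
critical_value = 1.699     # from the t table: right tail, alpha = 0.05, df = 29

t_stat = (x_bar - mu_0) / (s / math.sqrt(n))   # standardized sample mean
p_value = 1 - t_cdf(t_stat, n - 1)             # area to the right of t_stat

reject_by_critical_value = t_stat > critical_value
reject_by_p_value = p_value < alpha

print(f"t = {t_stat:.2f}, p-value = {p_value:.4f}")
print("reject H0" if reject_by_p_value else "do not reject H0")
```

The critical-value comparison and the p-value comparison are two views of the same decision, so the two boolean flags always agree for a given significance level.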
Hypothesis Testing
Many hypotheses are tested using a statistical test based on the following general
formula:
Test value = (observed value − expected value) / (standard error)
In particular, when testing the population mean, our test value is the standardized
t-test value
$$T = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
Hypothesis-Testing Problems (Traditional Method)
Step 1: State the null and the alternative hypotheses and identify the claim.
Step 2: Find the critical value(s) from the appropriate table.
Step 3: Compute the test value and the p-value.
Step 4: Make the decision to reject or not reject the null hypothesis.
Step 5: Summarize the results. Note: Never accept the null hypothesis. If we do not
reject it, we reserve judgment about whether it is true.
Let’s Do it! Waste Data
Waste data for a supplier plant for 19 weeks are provided below to assess if, on
average, this plant performs worse than the computer at controlling waste. If we
let μ be the mean percentage of waste for this plant, then we wish to know if the
mean μ is greater than 0%. The sample standard deviation of 4.40 is reported.
a) State the appropriate hypotheses.
b) From the alternative hypothesis, what is the direction of extreme in this case?
c) Standardize the observed sample mean under H0 to obtain the observed t-test
statistic, and compute the p-value.
d) Using a significance level of α = 0.01, what would be your decision?
e) Below is the SPSS output for a one-sample t-test. Verify your work above.
One-Sample Test (Test Value = 0)

         t       df   Sig. (2-tailed)   Mean Difference   95% Confidence Interval
                                                          Lower      Upper
WASTE    4.783   18   .00015            4.832             2.709      6.954
From the output we have an observed t-test statistic of t = 4.783 and a two-sided
p-value of 0.00015. Our one-sided p-value to the right of 4.783 will be half of the
two-sided value: p-value = 0.00015/2 = 0.000075.
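The SPSS figures can be checked from the summary statistics alone. The sketch below assumes the reported mean difference of 4.832, the sample standard deviation of 4.40, and n = 19. Because the standard deviation is rounded, the recomputed statistic agrees with the SPSS value only to about two decimal places. The helper functions are illustrative and approximate the t-distribution area with Simpson's rule.

```python
import math

def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x): 0.5 plus the Simpson's-rule area between 0 and x."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

n = 19
x_bar = 4.832    # sample mean (the SPSS "Mean Difference" from the test value 0)
s = 4.40         # sample standard deviation as reported (rounded)
mu_0 = 0.0

t_recomputed = (x_bar - mu_0) / (s / math.sqrt(n))

t_spss = 4.783
p_one_sided = 1 - t_cdf(t_spss, n - 1)   # area to the right of 4.783
p_two_sided = 2 * p_one_sided            # both tails, as SPSS reports it

print(f"recomputed t = {t_recomputed:.3f} (SPSS: {t_spss})")
print(f"one-sided p  = {p_one_sided:.6f}")
print(f"two-sided p  = {p_two_sided:.6f}")
```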
Let's Do It! pH Levels
A soil scientist is interested in studying the pH level in the soil for a certain field. In
particular, she is going to examine a random sample of soil samples and measure their
pH levels to assess if the mean field pH level is neutral (that is, equal to 7) versus the
alternative hypothesis that the mean pH level is acidic (that is, less than 7). She
observed the following pH levels in the sample: 5.8, 6.3, 6.9, 6.2, 5.5.
(a) State the corresponding null and alternative hypotheses to be tested.
(b) Find the sample mean pH level and the sample standard deviation.
(c) Compute the observed standardized test statistic and find the corresponding
p-value.
(d) At a level of significance α = 5%, is the result statistically significant? Explain.
(e) State your conclusion using a well-written sentence.
Let's Do It! Market Value for Homes
A real-estate appraiser wants to verify the market value for homes on the east side of
the city that are very similar in size and style. The appraiser wants to test the popular
belief that the average sales price is $37.80 per square foot for such homes versus that
the average differs from $37.80. He wants to use a significance level of 0.01.
(a) State the appropriate null and alternative hypotheses about μ, the population
mean sales price for such homes.
(b) Suppose that a random sample of six sales was selected. The sampled sales prices
per square foot are $35.00, $38.10, $30.30, $37.20, $29.80, and $35.40. Does it appear
that the popular perception of the market value is valid for this neighborhood? Give the
value of the standardized t-test statistic and the p-value.
(c) Based on your p-value in (b), are the results statistically significant at a 1%
significance level? Explain.
Summary of the One-Sample t-Test for a Population Mean:
We were interested in testing hypotheses about the population mean μ. The null
hypothesis is H0: μ = μ0, where μ0 is the hypothesized value for μ. The alternative
hypothesis provides the direction for the test. These hypotheses are statements about
the population mean, not the sample mean. The significance level α is selected.

We base our decision about μ on the standardized sample mean, which is
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}.$$
This is the t-test statistic, and the distribution of the variable T under H0 is a
t-distribution with n − 1 degrees of freedom.
Note: The test statistic is the same no matter how the alternative hypothesis is
expressed.

We calculate the p-value for the test, which depends on how the alternative hypothesis
is expressed (see below).
One-sided, to the right: If H1: μ > μ0, then the p-value is the area to the right of
the observed test statistic under the H0 model.
One-sided, to the left: If H1: μ < μ0, then the p-value is the area to the left of the
observed test statistic under the H0 model.
Two-sided: If H1: μ ≠ μ0, then the p-value is the area in the two tails, outside of the
observed test statistic under the H0 model.
[Figures: three t(n − 1) density curves, with the p-value shaded to the right of t, to
the left of t, and split as p-value/2 in each tail beyond −t and +t.]
Decision: A p-value less than the significance level leads to rejection of H0.
Note: The TI graphing calculator has a STAT TESTS menu that allows you to perform hypothesis tests
for a population mean. See the TI Quick Steps appendix at the end of this chapter for details.
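The three p-value rules in the summary can be collected into one small function. This is a minimal sketch in plain Python, not the TI calculator's procedure: the function name one_sample_t_test, its alternative argument, and the Simpson's-rule approximation of the t-distribution area are all illustrative assumptions. The usage lines apply it to the summary statistics of the assistant professor salary example.

```python
import math

def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x): 0.5 plus the Simpson's-rule area between 0 and x."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def one_sample_t_test(x_bar, s, n, mu_0, alternative):
    """Return (t, p_value) for H0: mu = mu_0 against the chosen alternative.

    alternative is "greater" (H1: mu > mu_0), "less" (H1: mu < mu_0),
    or "two-sided" (H1: mu != mu_0).
    """
    t_stat = (x_bar - mu_0) / (s / math.sqrt(n))    # same statistic for every H1
    df = n - 1
    if alternative == "greater":
        p_value = 1 - t_cdf(t_stat, df)              # area to the right
    elif alternative == "less":
        p_value = t_cdf(t_stat, df)                  # area to the left
    elif alternative == "two-sided":
        p_value = 2 * (1 - t_cdf(abs(t_stat), df))   # area in both tails
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return t_stat, p_value

# Summary statistics from the assistant professor salary example.
alpha = 0.05
t_stat, p_right = one_sample_t_test(43_260, 5_230, 30, 42_000, "greater")
_, p_two = one_sample_t_test(43_260, 5_230, 30, 42_000, "two-sided")

print(f"t = {t_stat:.2f}")
print(f"right-tailed p-value = {p_right:.4f}, reject H0: {p_right < alpha}")
print(f"two-sided p-value    = {p_two:.4f}, reject H0: {p_two < alpha}")
```

Keeping the test statistic separate from the choice of tail mirrors the note in the summary: only the p-value changes with the alternative hypothesis.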
Homework, page 258: 1-4 (all), 33-35, 51, 52, 93, 94.