Statistical Inference: Hypothesis Testing for Single Populations
Sourabh
What is a Hypothesis?

A hypothesis is a claim (assumption) about a population parameter:
– population mean
  Example: The mean monthly cell phone bill of this city is μ = Rs 429
– population proportion
  Example: The proportion of adults in this city with cell phones is p = 0.68

What is a Hypothesis? (continued)

A hypothesis is an assumption about the population parameter.
– A parameter is a characteristic of the population, like its mean, proportion or variance.
– The parameter must be identified before analysis.
Example: "I assume the mean GPA of this class is 3.5!"

Types of Hypotheses

Research Hypothesis
– a statement of what the researcher believes will be the outcome of an experiment or a study.
Statistical Hypotheses
– a more formal structure derived from the research hypothesis.

Example Research Hypotheses

– Older workers are more loyal to a company.
– Companies with more than $1 billion of assets spend a higher percentage of their annual budget on advertising than do companies with less than $1 billion of assets.
– The price of scrap metal is a good indicator of the industrial production index six months later.

Statistical Hypotheses

Two Parts
– a null hypothesis
– an alternative hypothesis
Null Hypothesis – nothing new is happening
Alternative Hypothesis – something new is happening
Notation
– null: H0
– alternative: H1 or Ha

Null and Alternative Hypotheses

– The Null and Alternative Hypotheses are mutually exclusive. Only one of them can be true.
– The Null and Alternative Hypotheses are collectively exhaustive. They are stated to include all possibilities. (An abbreviated form of the null hypothesis is often used.)
– The Null Hypothesis is assumed to be true.
– The burden of proof falls on the Alternative Hypothesis.

The Null Hypothesis, H0

– States the claim or assertion to be tested
  Example: The average number of TV sets in U.S. homes is equal to three (H0: μ = 3)
– Is always about a population parameter, not about a sample statistic:
  H0: μ = 3 is a valid null hypothesis; H0: X̄ = 3 is not.

Null and Alternative Hypotheses: Example

– A manufacturer is filling 40 oz. packages with flour.
– The company wants the package contents to average 40 ounces.
H0: μ = 40 oz
Ha: μ ≠ 40 oz

Hypothesis Testing Process

– Claim: the population mean age is 50. (Null Hypothesis: H0: μ = 50)
– Now select a random sample from the population.
– Suppose the sample mean age is 20: X̄ = 20.
– Is X̄ = 20 likely if μ = 50? If not likely, REJECT the Null Hypothesis.

Reason for Rejecting H0

[Figure: sampling distribution of X̄ if H0 is true, centered at μ = 50, with the sample mean X̄ = 20 far out in the lower tail]
If it is unlikely that we would get a sample mean of this value (X̄ = 20) if in fact this (μ = 50) were the population mean, then we reject the null hypothesis that μ = 50.

Level of Significance, α

– Defines the unlikely values of the sample statistic if the null hypothesis is true
  – Defines rejection region of the sampling distribution
– Is designated by α (level of significance)
  – Typical values are 0.01, 0.05, or 0.10
– Is selected by the researcher at the beginning
– Provides the critical value(s) of the test

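Level of Significance and the Rejection Region

Level of significance = α

Test             Hypotheses               Rejection region
Two-tail test    H0: μ = 3, H1: μ ≠ 3     α/2 in each tail
Lower-tail test  H0: μ ≥ 3, H1: μ < 3     α in the lower tail
Upper-tail test  H0: μ ≤ 3, H1: μ > 3     α in the upper tail

[Figure: the rejection region is shaded in each sampling distribution; the boundary of each shaded region marks a critical value.]
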
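As a rough sketch of how α fixes the rejection region, the critical Z values for the three kinds of test can be computed from the standard normal distribution. The function name `critical_values` and the tail labels are illustrative choices, not something defined in the slides; the code relies only on `statistics.NormalDist` from Python's standard library and assumes a Z test.

```python
from statistics import NormalDist


def critical_values(alpha, tail="two"):
    """Return the critical Z value(s) for a significance level alpha.

    tail is "two", "lower", or "upper" (labels chosen for this sketch).
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
        return (-upper, upper)
    if tail == "lower":
        return (std_normal.inv_cdf(alpha),)        # all of alpha in the lower tail
    if tail == "upper":
        return (std_normal.inv_cdf(1 - alpha),)    # all of alpha in the upper tail
    raise ValueError("tail must be 'two', 'lower', or 'upper'")


for alpha in (0.01, 0.05, 0.10):
    two = critical_values(alpha, "two")
    upper = critical_values(alpha, "upper")
    print(f"alpha = {alpha:.2f}: two-tail ±{two[1]:.3f}, upper-tail {upper[0]:.3f}")
```

A smaller α pushes the critical values further from 0, so the rejection region shrinks.
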
Errors in Making Decisions

Type I Error
– Reject a true null hypothesis
– Considered a serious type of error
The probability of Type I Error is α
– Called level of significance of the test
– Set by the researcher in advance

Errors in Making Decisions (continued)

Type II Error
– Fail to reject a false null hypothesis
The probability of Type II Error is β

Outcomes and Probabilities

Possible Hypothesis Test Outcomes (Key: Outcome (Probability))

                     Actual Situation
Decision             H0 True               H0 False
Do Not Reject H0     No error (1 - α)      Type II Error (β)
Reject H0            Type I Error (α)      No Error (1 - β)

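The claim that a true null hypothesis is rejected with probability α can be checked with a small simulation. This is only a sketch: it reuses the hypothesized mean of 50 from the age example, while σ = 10, n = 25, the number of trials, and the random seed are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, fmean


def type_i_error_rate(mu0=50.0, sigma=10.0, n=25, alpha=0.05, trials=4000, seed=1):
    """Estimate how often a two-tail Z test rejects H0 when H0 is actually true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw the sample from a population whose mean really is mu0 (H0 true)
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # a Type I error: rejecting a true H0
    return rejections / trials


rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f} (nominal alpha = 0.05)")
```

The estimate should land close to 0.05, with some run-to-run noise that depends on the number of trials.
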
Type I & II Error Relationship

– Type I and Type II errors cannot happen at the same time
  – Type I error can only occur if H0 is true
  – Type II error can only occur if H0 is false
– If Type I error probability (α) increases, then Type II error probability (β) decreases, all else being equal

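The trade-off between α and β can be made concrete for a two-tail Z test. The numbers below are illustrative assumptions (null mean 3, a hypothetical true mean of 2.8, σ = 0.8, n = 100), and `type_ii_error` is a name invented for this sketch. β depends on which alternative value is actually true, so a specific one has to be assumed.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """P(fail to reject H0: mu = mu0) in a two-tail Z test when the mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # When the true mean is mu_true, the Z statistic is Normal(shift, 1)
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    # beta is the probability that Z lands between the two critical values
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


for alpha in (0.01, 0.05, 0.10):
    beta = type_ii_error(alpha, mu0=3.0, mu_true=2.8, sigma=0.8, n=100)
    print(f"alpha = {alpha:.2f}  ->  beta = {beta:.3f}")
```

With the sample size held fixed, the printed β values fall as α rises, which is the relationship stated above.
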
Hypothesis Tests for the Mean

Hypothesis Tests for μ
– σ Known (Z test)
– σ Unknown (t test)

Z Test of Hypothesis for the Mean (σ Known)

Convert sample statistic (X̄) to a Z test statistic.
Hypothesis Tests for μ
– σ Known (Z test)
– σ Unknown (t test)
The test statistic is:
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

Critical Value Approach to Testing

For a two-tail test for the mean, σ known:
– Convert sample statistic (X̄) to test statistic (Z statistic)
– Determine the critical Z values for a specified level of significance α from a table or computer
– Decision Rule: If the test statistic falls in the rejection region, reject H0; otherwise do not reject H0

Two-Tail Tests

There are two cutoff values (critical values), defining the regions of rejection.
H0: μ = 3
H1: μ ≠ 3
[Figure: sampling distribution centered at X̄ = 3 (Z = 0). Reject H0 below the lower critical value -Z (area α/2) or above the upper critical value +Z (area α/2); do not reject H0 in between.]

6 Steps in Hypothesis Testing

1. State the null hypothesis, H0, and the alternative hypothesis, H1
2. Choose the level of significance, α, and the sample size, n
3. Determine the appropriate test statistic and sampling distribution
4. Determine the critical values that divide the rejection and nonrejection regions
5. Collect data and compute the value of the test statistic
6. Make the statistical decision and state the managerial conclusion. If the test statistic falls into the nonrejection region, do not reject the null hypothesis H0. If the test statistic falls into the rejection region, reject the null hypothesis. Express the managerial conclusion in the context of the problem

Hypothesis Testing Example

Test the claim that the true mean # of TV sets in US homes is equal to 3. (Assume σ = 0.8)
1. State the appropriate null and alternative hypotheses
   – H0: μ = 3, H1: μ ≠ 3 (This is a two-tail test)
2. Specify the desired level of significance and the sample size
   – Suppose that α = 0.05 and n = 100 are chosen for this test

Hypothesis Testing Example (continued)

3. Determine the appropriate technique
   – σ is known so this is a Z test.
4. Determine the critical values
   – For α = 0.05 the critical Z values are ±1.96
5. Collect the data and compute the test statistic
   – Suppose the sample results are n = 100, X̄ = 2.84 (σ = 0.8 is assumed known)
   So the test statistic is:
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{2.84 - 3}{0.8 / \sqrt{100}} = \frac{-0.16}{0.08} = -2.0$$

Hypothesis Testing Example (continued)

6. Is the test statistic in the rejection region?
   Reject H0 if Z < -1.96 or Z > 1.96; otherwise do not reject H0.
   [Figure: standard normal curve with rejection regions of area α/2 = 0.05/2 below -Z = -1.96 and above +Z = +1.96; do not reject H0 in between.]
   Here, Z = -2.0 < -1.96, so the test statistic is in the rejection region.

Hypothesis Testing Example (continued)

6 (continued). Reach a decision and interpret the result
   [Figure: the test statistic Z = -2.0 lies in the lower rejection region, below -Z = -1.96; each tail has area α/2 = 0.05/2.]
   Since Z = -2.0 < -1.96, we reject the null hypothesis and conclude that there is sufficient evidence that the mean number of TVs in US homes is not equal to 3.

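The TV-set example can be replayed in a few lines of Python, following the same order as the six steps. This is a minimal sketch: the function name `z_test_two_tail` is made up for illustration, and the critical value is computed with `statistics.NormalDist` rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_tail(x_bar, mu0, sigma, n, alpha):
    """Two-tail Z test for the mean with sigma known.

    Returns the test statistic, the positive critical value, and the decision.
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))          # step 5: test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # step 4: critical value
    reject = abs(z) > z_crit                       # step 6: decision rule
    return z, z_crit, reject


# Steps 1-2: H0: mu = 3 vs H1: mu != 3, alpha = 0.05, n = 100 (values from the example)
z, z_crit, reject = z_test_two_tail(x_bar=2.84, mu0=3, sigma=0.8, n=100, alpha=0.05)
print(f"Z = {z:.2f}, critical values = ±{z_crit:.2f}")
print("Reject H0" if reject else "Do not reject H0")
```
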
Connection to Confidence Intervals

For X̄ = 2.84, σ = 0.8 and n = 100, the 95% confidence interval is:
$$2.84 - (1.96)\frac{0.8}{\sqrt{100}} \quad \text{to} \quad 2.84 + (1.96)\frac{0.8}{\sqrt{100}}$$
2.6832 ≤ μ ≤ 2.9968
Since this interval does not contain the hypothesized mean (3.0), we reject the null hypothesis at α = 0.05

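The same interval can be computed directly, which also shows the link to the two-tail test: the hypothesized mean is rejected at level α exactly when it falls outside the (1 - α) confidence interval. The helper name `z_confidence_interval` is an assumption of this sketch.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


low, high = z_confidence_interval(x_bar=2.84, sigma=0.8, n=100)
hypothesized_mean = 3.0
inside = low <= hypothesized_mean <= high
print(f"95% CI: {low:.4f} to {high:.4f}")
print("Do not reject H0" if inside else "Reject H0")
```
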
One-Tail Tests

In many cases, the alternative hypothesis focuses on a particular direction.
H0: μ ≥ 3
H1: μ < 3
This is a lower-tail test since the alternative hypothesis is focused on the lower tail below the mean of 3.
H0: μ ≤ 3
H1: μ > 3
This is an upper-tail test since the alternative hypothesis is focused on the upper tail above the mean of 3.

Lower-Tail Tests

H0: μ ≥ 3
H1: μ < 3
There is only one critical value, since the rejection area is in only one tail.
[Figure: reject H0 in the lower tail (area α), below the critical value -Zα; do not reject H0 otherwise. The Z axis is centered at 0 and the X̄ axis at μ.]

Upper-Tail Tests

H0: μ ≤ 3
H1: μ > 3
There is only one critical value, since the rejection area is in only one tail.
[Figure: reject H0 in the upper tail (area α), above the critical value Zα; do not reject H0 otherwise. The Z axis is centered at 0 and the X̄ axis at μ.]

Example: Upper-Tail Z Test for Mean (σ Known)

A phone industry manager thinks that customer monthly cell phone bills have increased, and now average over $52 per month. The company wishes to test this claim. (Assume σ = 10 is known)
Form hypothesis test:
H0: μ ≤ 52   the average is not over $52 per month
H1: μ > 52   the average is greater than $52 per month (i.e., sufficient evidence exists to support the manager's claim)

Example: Find Rejection Region (continued)

Suppose that α = 0.10 is chosen for this test.
Find the rejection region:
[Figure: upper-tail rejection region of area α = 0.10 above the critical value 1.28; do not reject H0 below it.]
Reject H0 if Z > 1.28

Review: One-Tail Critical Value

What is Z given α = 0.10?

Standardized Normal Distribution Table (Portion)
Z      .07     .08     .09
1.1    .8790   .8810   .8830
1.2    .8980   .8997   .9015
1.3    .9147   .9162   .9177

[Figure: standard normal curve with area 0.90 below z = 1.28 and α = 0.10 above it.]
Critical Value = 1.28

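The table lookup can be cross-checked in code. The sketch below regenerates the same portion of the cumulative standard normal table and then inverts it to find the value with 0.90 of the area below it. `table_entry` is a hypothetical helper name, and the only library used is Python's `statistics` module.

```python
from statistics import NormalDist

std_normal = NormalDist()


def table_entry(row, col):
    """Cumulative area to the left of z = row + col, as in a standard normal table."""
    return std_normal.cdf(row + col)


print("Z      .07     .08     .09")
for row in (1.1, 1.2, 1.3):
    entries = "  ".join(f"{table_entry(row, col):.4f}" for col in (0.07, 0.08, 0.09))
    print(f"{row:.1f}  {entries}")

# Upper-tail critical value for alpha = 0.10: the z with 0.90 of the area below it
z_exact = std_normal.inv_cdf(0.90)
print(f"Critical value: {z_exact:.4f}")
```

The computed value is about 1.2816; the table gives 1.28 because .8997 is the entry closest to 0.90.
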
Example: Test Statistic (continued)

Obtain sample and compute the test statistic.
Suppose a sample is taken with the following results: n = 64, X̄ = 53.1 (σ = 10 was assumed known)
– Then the test statistic is:
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{53.1 - 52}{10 / \sqrt{64}} = 0.88$$

Example: Decision (continued)

Reach a decision and interpret the result:
[Figure: Z = 0.88 falls in the do-not-reject region, below the critical value 1.28; the upper-tail rejection region has area α = 0.10.]
Do not reject H0 since Z = 0.88 ≤ 1.28
i.e.: there is not sufficient evidence that the mean bill is over $52

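For a one-tail test only the direction of the comparison changes. The following sketch reruns the cell phone bill example as an upper-tail Z test; `z_test_upper_tail` is an illustrative name, and all of α sits in the upper tail.

```python
from math import sqrt
from statistics import NormalDist


def z_test_upper_tail(x_bar, mu0, sigma, n, alpha):
    """Upper-tail Z test (H0: mu <= mu0 vs H1: mu > mu0) with sigma known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha)  # the whole of alpha is in the upper tail
    return z, z_crit, z > z_crit


z, z_crit, reject = z_test_upper_tail(x_bar=53.1, mu0=52, sigma=10, n=64, alpha=0.10)
print(f"Z = {z:.2f}, critical value = {z_crit:.2f}")
print("Reject H0" if reject else "Do not reject H0")
```
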
Problem:
A survey of CPAs across the United States found that the
average net income for sole proprietor CPAs is $74,914.
Because this survey is now more than seven years old, an
accounting researcher wants to test the figure by taking a
random sample of 112 sole proprietor accountants in the
United States to determine whether the net income figure has
changed. The researcher could use the steps of hypothesis
testing to do so. The sample has an average net income of
$78,695. Assume the population standard deviation of net
incomes for sole proprietor CPAs is $14,530.

Solution
HYPOTHESIS:
Because the researcher is testing to determine whether
the figure has changed, the alternative hypothesis is that
the average net income is not $74,914. The null
hypothesis is that the mean still equals $74,914. These
hypotheses follow:
H0 : µ = $74,914
H1 : µ ≠ $74,914

Test:
The next step is to determine the appropriate statistical test and sampling distribution.
Because the sample size is large (n = 112) and the researcher is using the
sample mean as the statistic, the z formula is the appropriate test statistic, i.e.,
$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

Critical Region:
Specify the Type I error rate, or alpha (α), which is 0.05 in this
problem.
Because the test is two-tailed and α = 0.05, there is α/2 or 0.025
area in each of the tails of the distribution. Thus, the rejection
region is in the two ends of the distribution with 2.5% of the area in
each. There is a 0.475 area between the mean and each of the
critical values that separate the tails of the distribution (the
rejection region) from the nonrejection region. By using this 0.475
area and the standard normal table, the critical value of z can be
obtained.
Zα/2 = ± 1.96

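The 112 CPAs who respond produce a sample mean of $78,695.
The value of the test statistic is calculated by using X̄ = $78,695, n = 112,
σ = $14,530, and a hypothesized µ = $74,914:
$$z = \frac{78{,}695 - 74{,}914}{14{,}530 / \sqrt{112}} = 2.75$$
[Figure: standard normal distribution with rejection regions of area .025 below z = -1.96 and above z = 1.96.]
Because z = 2.75 is greater than the critical value 1.96, the null hypothesis is rejected.
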
BUSINESS IMPLICATION:
What does this result mean? Statistically, the researcher has enough evidence to
reject the figure of $74,914 as the true national average net income for sole
proprietor CPAs. Although the researcher conducted a two tailed test, the
evidence gathered indicates that the national average may have increased. The
sample mean of $78,695 is $3,781 higher than the national mean being tested. The
researcher can conclude that the national average is more than before, but
because the $78,695 is only a sample mean, it offers no guarantee that the
national average for all sole proprietor CPAs is $3,781 more. If a confidence
interval were constructed with the sample data, $78,695 would be the point
estimate. Other samples might produce different sample means. Managerially this
statistical finding may mean that CPAs will be more expensive to hire either as
full-time employees or as consultants. It may mean that consulting services have
gone up in price. For new accountants, it may mean the potential for greater
earning power. If the sample mean of $78,695 is the actual new population average
for the year 2002, it would represent an increase of $3,781 over a seven-year
period. This increase may or may not be substantive depending on one’s point of
view.

Problem (B)
In the CPA net income example, suppose only 600 sole proprietor CPAs
practice in the United States. A sample of 112 CPAs taken from a population of
only 600 CPAs is 18.67% of the population and therefore is much more likely
to be representative of the population than a sample of 112 CPAs taken from a
population of 20,000 CPAs (.56% of the population). The finite correction factor
takes this difference into consideration and allows for an increase in the
observed value of z. The observed z value would change to
$$z = \frac{\bar{X} - \mu}{\dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}} = \frac{78{,}695 - 74{,}914}{\dfrac{14{,}530}{\sqrt{112}} \sqrt{\dfrac{600 - 112}{600 - 1}}} = \frac{3{,}781}{1{,}239.2} = 3.05$$
Use of the finite correction factor increased the observed z value from 2.75 to
3.05. The decision to reject the null hypothesis does not change with this new
information. However, on occasion, the finite correction factor can make the
difference between rejecting and failing to reject the null hypothesis.

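The effect of the finite correction factor on the CPA example can be checked numerically. The sketch below computes the z statistic with and without the correction; `z_statistic` and its optional `population_size` argument are names chosen here for illustration.

```python
from math import sqrt


def z_statistic(x_bar, mu0, sigma, n, population_size=None):
    """Z statistic for the mean; applies the finite correction factor
    sqrt((N - n) / (N - 1)) when a population size N is supplied."""
    standard_error = sigma / sqrt(n)
    if population_size is not None:
        standard_error *= sqrt((population_size - n) / (population_size - 1))
    return (x_bar - mu0) / standard_error


z_plain = z_statistic(78_695, 74_914, 14_530, 112)
z_finite = z_statistic(78_695, 74_914, 14_530, 112, population_size=600)
print(f"z without correction: {z_plain:.2f}")
print(f"z with correction (N = 600): {z_finite:.2f}")
```

Because the correction factor is less than 1 whenever n > 1, it shrinks the standard error and so increases the magnitude of z.
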
Question:
Generally Electric has developed a new bulb whose design
specifications call for a light output of 960 lumens compared to
an earlier model that produced only 750 lumens. The company’s
data indicate that the standard deviation of light output for this
type of bulb is 18.4 lumens. From a sample of 20 bulbs, the
testing committee found an average light output of 954 lumens
per bulb. At a 0.05 significance level, can Generally Electric
conclude that its new bulb is producing the specified 960-lumen
output?

Question:
Maxwell’s Hot Chocolate is concerned about the effect of the recent
yearlong coffee advertising campaign on hot chocolate sales. The
average weekly hot chocolate sales two years ago was 984.7 pounds
and the standard deviation was 72.6 pounds. Maxwell’s has randomly
selected 30 weeks from the past year and found average sales of
912.1 pounds.
(a) State appropriate hypotheses for testing whether hot chocolate
sales have decreased.
(b) At a 2 percent significance level, test these hypotheses.

Question:
Atlas Sporting Goods has implemented a special trade
promotion for its propane stove and feels that the promotion
should result in a price change for the consumer. Atlas knows
that before the promotion began, the average retail price of the
stove was $44.95 and the standard deviation was $5.75. Atlas
samples 25 of its retailers after the promotion begins and finds
the average price for the stove is now $42.95. At a 0.02
significance level, does Atlas have reason to believe that the
average retail price to the consumer has decreased?
t Test of Hypothesis for the Mean (σ Unknown)

Convert sample statistic (X̄) to a t test statistic.
Hypothesis Tests for μ
– σ Known (Z test)
– σ Unknown (t test)
The test statistic is:
$$t_{n-1} = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$

Example: Two-Tail Test (σ Unknown)

The average cost of a hotel room in New York is said to be $168 per night. A random sample of 25 hotels resulted in X̄ = $172.50 and S = $15.40. Test at the α = 0.05 level.
(Assume the population distribution is normal)
H0: μ = 168
H1: μ ≠ 168

Example Solution: Two-Tail Test

H0: μ = 168
H1: μ ≠ 168
– α = 0.05
– n = 25
– σ is unknown, so use a t statistic
– Critical Value: t24 = ± 2.0639
$$t_{n-1} = \frac{\bar{X} - \mu}{S / \sqrt{n}} = \frac{172.50 - 168}{15.40 / \sqrt{25}} = 1.46$$
[Figure: t distribution with rejection regions of area α/2 = .025 below -t(n-1, α/2) = -2.0639 and above t(n-1, α/2) = 2.0639; the test statistic 1.46 falls in the do-not-reject region.]
Do not reject H0: not sufficient evidence that true mean cost is different from $168

Connection to Confidence Intervals

For X̄ = 172.5, S = 15.40 and n = 25, the 95% confidence interval is:
$$172.5 - (2.0639)\frac{15.4}{\sqrt{25}} \quad \text{to} \quad 172.5 + (2.0639)\frac{15.4}{\sqrt{25}}$$
166.14 ≤ μ ≤ 178.86
Since this interval contains the hypothesized mean (168), we do not reject the null hypothesis at α = 0.05

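The hotel-room test and its confidence interval can be reproduced together. Python's standard library has no t distribution, so this sketch takes the critical value 2.0639 (24 degrees of freedom, α = 0.05, two-tail) from the slides as an input instead of computing it; `one_sample_t` is an illustrative name.

```python
from math import sqrt

# Two-tail critical value for alpha = 0.05 with n - 1 = 24 degrees of freedom,
# copied from the example above rather than computed.
T_CRIT_24_DF = 2.0639


def one_sample_t(x_bar, mu0, s, n, t_crit):
    """t statistic, two-tail decision, and matching confidence interval."""
    standard_error = s / sqrt(n)
    t = (x_bar - mu0) / standard_error
    interval = (x_bar - t_crit * standard_error, x_bar + t_crit * standard_error)
    return t, abs(t) > t_crit, interval


t_stat, reject, (low, high) = one_sample_t(172.50, 168, 15.40, 25, T_CRIT_24_DF)
print(f"t = {t_stat:.2f}; reject H0: {reject}")
print(f"95% CI: {low:.2f} to {high:.2f}")
```

For a different sample size or significance level, the critical value would have to be looked up again in a t table or obtained from a statistics package.
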
Question:
Given a sample mean of 94.3, a sample standard
deviation of 8.4, and a sample size of 6, test the
hypothesis that the value of the population mean
is 100 against the alternative hypothesis that it is
less than 100. Use the 0.05 significance level.

Question:
The data-processing department at a large life insurance company has
installed new color video display terminals to replace the monochrome
units it previously used. The 95 operators trained to use the new
machines averaged 7.2 hours before achieving a satisfactory level of
performance. Their sample variance was 16.2 squared hours. Long
experience with operators on the old monochrome terminals showed
that they averaged 8.1 hours on the machines before their
performances were satisfactory. At the 0.01 significance level, should
the supervisor of the department conclude that the new terminals are
easier to learn to operate?
Question:
As the bottom fell out of the oil market in early 1986, educators in Texas
worried about how the resulting loss of state revenues (estimated to be about
$100 million for each $1 decrease in the price of a barrel of oil) would affect
their budgets. The state board of education felt the situation would not be
critical as long as they could be reasonably certain that the price would stay
above $18 per barrel. They surveyed 13 randomly chosen oil economists and
asked them to predict how low the price would go before it bottomed out. The
13 predictions averaged $21.60, and the sample standard deviation was
$4.65. At α = 0.01, is the average prediction significantly higher than $18.00?
Should the board conclude that a budget crisis is unlikely? Explain.
Hypothesis Tests for Proportions

– Involves categorical variables
– Two possible outcomes
  – “Success” (possesses a certain characteristic)
  – “Failure” (does not possess that characteristic)
– Fraction or proportion of the population in the “success” category is denoted by p

z Test of Population Proportion

$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{p\,q}{n}}}$$
where: p̂ = sample proportion
       p = population proportion
       q = 1 - p
       n·p ≥ 5, and n·q ≥ 5

Proportions (continued)

– Sample proportion in the success category is denoted by p̂
$$\hat{p} = \frac{X}{n} = \frac{\text{number of successes in sample}}{\text{sample size}}$$
– When both np and n(1-p) are at least 5, p̂ can be approximated by a normal distribution with mean and standard deviation
$$\mu_{\hat{p}} = p, \qquad \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$$

Example: Z Test for Proportion

A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses. Test at the α = 0.05 significance level.
Check:
np = (500)(.08) = 40
n(1-p) = (500)(.92) = 460

Z Test for Proportion: Solution

H0: p = 0.08
H1: p ≠ 0.08
α = 0.05
n = 500, p̂ = 25/500 = 0.05
Critical Values: ± 1.96
Test Statistic:
$$Z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}} = \frac{.05 - .08}{\sqrt{\dfrac{.08(1 - .08)}{500}}} = -2.47$$
[Figure: standard normal curve with rejection regions of area .025 below -1.96 and above 1.96; the test statistic -2.47 falls in the lower rejection region.]
Decision: Reject H0 at α = 0.05
Conclusion: There is sufficient evidence to reject the company’s claim of 8% response rate.

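The mailing-response example can be sketched in code as well, including the sample-size check that justifies the normal approximation. The function name `proportion_z_test` and the choice to raise an error when the check fails are assumptions of this sketch.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0, alpha):
    """Two-tail z test of H0: p = p0 using the normal approximation."""
    # The approximation is only used when n*p0 and n*(1 - p0) are both at least 5
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("sample too small for the normal approximation")
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return p_hat, z, abs(z) > z_crit


p_hat, z, reject = proportion_z_test(successes=25, n=500, p0=0.08, alpha=0.05)
print(f"p-hat = {p_hat:.2f}, Z = {z:.2f}")
print("Reject H0" if reject else "Do not reject H0")
```

Note that the standard error uses the hypothesized proportion p0, not the sample proportion, because the test statistic is computed under the assumption that H0 is true.
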
Problem:
Feronetics specializes in the use of gene-splicing techniques to
produce new pharmaceutical compounds. It has recently developed a
nasal spray containing interferon, which it believes will limit the
transmission of the common cold within families. In the general
population 15.1 percent of all individuals will catch a
rhinovirus-caused cold once another family member contracts such a cold. The
interferon spray was tested on 180 people, one of whose family
members subsequently contracted a rhinovirus-caused cold. Only 17
of the test subjects developed similar colds.
1. At a significance level of 0.05, should Feronetics conclude that the
new spray effectively reduces transmission of colds?
2. What should it conclude at α = 0.02?

Problem:
Rick Douglas, the new manager of Food Barn, is
interested in the percentage of customers who are
totally satisfied with the store. The previous manager
had 86 percent of the customers totally satisfied, and
Rick claims the same is true today. Rick sampled 187
customers and found 157 were totally satisfied. At
the 1 percent significance level, is there evidence
that Rick’s claim is valid?

Problem:
From a total of 10,200 loans made by a state
employees’ credit union in the most recent 5-year
period, 350 were sampled to determine what
proportion was made to women. This sample showed
that 39 percent of the loans were made to women
employees. A complete census of loans 5 years ago
showed that 41 percent of the borrowers then were
women. At a significance level of 0.02, can you
conclude that the proportion of loans made to
women has changed significantly in the past 5
years?
Problem:
A television documentary on overeating claimed that
Americans are about 10 pounds overweight on
average. To test this claim, eighteen randomly
selected individuals were examined; their average
excess weight was found to be 12.4 pounds, and the
sample standard deviation was 2.7 pounds. At a
significance level of 0.01, is there any reason to
doubt the validity of the claimed 10-pound value?