Download ECON1003: Analysis of Economic Data - Ka

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts

History of statistics wikipedia, lookup

Student's t-test wikipedia, lookup

Taylor's law wikipedia, lookup

Bootstrapping (statistics) wikipedia, lookup

Time series wikipedia, lookup

Resampling (statistics) wikipedia, lookup

Misuse of statistics wikipedia, lookup

Psychometrics wikipedia, lookup

Foundations of statistics wikipedia, lookup

Statistical hypothesis testing wikipedia, lookup

Transcript
Lesson 8:
One-Sample Tests of Hypothesis
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-1
Outline
Review of CLT and CI
What is a hypothesis?
Type I and type II errors?
One-Tailed Tests
Two-Tailed Tests
p-Value in Hypothesis Testing
Tests Concerning Proportion
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-2
Central Limit Theorem #1
5 balls in the bag:
0
1
2
3
4
Draw 50 ball 1000 times with replacement. Compute the sample
mean. Plot a relative frequency histogram (empirical probability
histogram) of the 1000 sample means.
The Central Limit Theorem says
1. The empirical histogram looks like a normal density.
2. Expected value (mean of the normal distribution) = 2.
3. Variance of the sample means = 2/50=0.04.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-3
Confidence Interval #1
Five numbered balls in the bag:
?
?
?
?
?
Draw one sample of 50 balls with replacement. Compute the sample
mean and sample standard deviation. Suppose the sample mean is
10 and the sample standard deviation is 0.04. Can you tell us the
range of possible values the population mean may take, at 95%
confidence level?
m
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-4
Hypothesis testing #1
Five numbered balls in the bag:
?
?
?
?
?
Draw one sample of 50 balls with replacement. Compute the sample
mean and sample standard deviation. Suppose the sample mean is
10 and the sample standard deviation is 0.04. Do you think the balls
in this bag has a mean of 2?
2
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-5
What is a Hypothesis?
 A Hypothesis is a statement about the value of a
population parameter developed for the purpose of testing.
 Examples of hypotheses made about a population
parameter are:
 The mean monthly income for systems analysts is
$3,625.
 Twenty percent of all customers at Bovine’s Chop House
return for another meal within a month.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-6
What is Hypothesis Testing?
 Hypothesis testing is a procedure, based on sample
evidence and probability theory, used to determine
whether the hypothesis is a reasonable statement and
should not be rejected, or is unreasonable and should be
rejected.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Hypothesis Testing
Step 1: state null and alternative hypothesis
Step 2: select a level of significance
Step 3: identify the test statistic
Step 4: formulate a decision rule
Step 5: Take a sample, arrive at a decision
Do not reject null
Ka-fu Wong © 2004
Reject null and accept alternative
ECON1003: Analysis of Economic Data
Lesson8-8
Null and Alternative Hypothesis
 Null Hypothesis H0: A statement about the value of a
population parameter.
 Alternative Hypothesis H1: A statement that is accepted if
the sample data provide evidence that the null hypothesis
is false.
 Level of Significance: The probability of rejecting the null
hypothesis when it is actually true.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-9
Objectivity in formulating a hypothesis
 In court, the defendant is presumed innocent until proven beyond
reasonable doubt to be guilty of stated charges.
 The “null hypothesis”, i.e. the denial of our theory, is presumed
true until we prove beyond reasonable doubt that it is false.
 “Beyond reasonable doubt” means that the probability of
claiming that our theory is true when it is not (null hypothesis
true) is less than an a priori set significance level (usually 5%
or 1%).
 Is the defendant guilty?
 Null: the defendant is not guilty.
 Alternative: the defendant is guilty.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-10
Type I and type II errors
 Type I Error: conclude the defendant guilty when the defendant
did not commit the crime.
 Level of significance is also the maximum probability of
committing a type I error. We want to limit this Type I Error
to some small number.
 Type II Error: Conclude the defendant not guilty when the
defendant actually committed the crime.
Committed Crime
Court Decision
(Guilty or not)
Ka-fu Wong © 2004
Yes
No
Guilty
Correct
decision
Type I error
Not Guilty
Type II error
Correct
decision
ECON1003: Analysis of Economic Data
Lesson8-11
Type I and type II errors
 Type I Error: Rejecting the null hypothesis when it is actually true.
 Level of significance is also the maximum probability of
committing a type I error. We want to limit this Type I Error to
some small number.
 Type II Error: Accepting the null hypothesis when it is actually
false.
State of nature
Decision based Don’t reject
on the sample null
statistic
Reject null
Ka-fu Wong © 2004
Null true
Null false
Correct
decision
Type II error
Type I error
Correct
decision
ECON1003: Analysis of Economic Data
Lesson8-12
Critical values
 Test statistic: A value, determined from sample
information, used to determine whether or not to reject
the null hypothesis.
 Critical value: The dividing point between the region where
the null hypothesis is rejected and the region where it is
not rejected.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-13
One-Tailed Tests
 A test is one-tailed when the alternate hypothesis, H1 ,
states a direction, such as:
 H1: The mean yearly commissions earned by full-time
realtors is more than $35,000. (µ>$35,000)
 H1: The mean speed of trucks traveling on I-95 in
Georgia is less than 60 miles per hour. (µ<60)
 H1: Less than 20 percent of the customers pay cash for
their gasoline purchase. ( < .20)
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-14
Sampling Distribution for the Statistic Z for a
One Tailed Test, .05 Level of Significance
.95 probability
Critical
Value
z=1.65
.05 probability
0
Ka-fu Wong © 2004
1
2
3
4
Rejection region
Reject the null if the test
statistic falls into this region.
ECON1003: Analysis of Economic Data
Two-Tailed Tests
 A test is two-tailed when no direction is specified in the
alternate hypothesis H1 , such as:
 H1: The mean amount spent by customers at the WalMart in Georgetown is not equal to $25. (µ  $25).
 H1: The mean price for a gallon of gasoline is not equal
to $1.54. (µ  $1.54).
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-16
Sampling Distribution for the Statistic Z for a
Two Tailed Test, .05 Level of Significance
.95 probability
Critical
Value
z=-1.96
Critical
Value
z=1.96
.025 probability
.025 probability
-4 -3 -2 -1
Rejection region #1
Ka-fu Wong © 2004Reject
0
1
2
3
4
Rejection region #2
the null if the test statistic falls into these two regions.
ECON1003: Analysis
of Economic
Data
Copyright©
2002 by The
McGraw-Hill Companies, Inc. All rights reserved
Testing for the Population Mean: Large Sample,
Population Standard Deviation Known
 When testing for the population mean from a large
sample and the population standard deviation is
known, the test statistic is given by:
X m
z
/ n
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-18
EXAMPLE 1
 The processors of Fries’ Catsup indicate on the label that
the bottle contains 16 ounces of catsup. The standard
deviation of the process is 0.5 ounces. A sample of 36
bottles from last hour’s production revealed a mean weight
of 16.12 ounces per bottle. At the .05 significance level is
the process out of control?
 That is, can we conclude that the mean amount per bottle
is different from 16 ounces?
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-19
EXAMPLE 1
continued
 Step 1: State the null and the alternative
hypotheses:
H0: m = 16;
H1: m  16
 Step 2: Select the level of significance. In this case we
selected the .05 significance level.
 Step 3: Identify the test statistic. Because we know the
population standard deviation, the test statistic is z.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-20
EXAMPLE 1
continued
 Step 4: State the decision rule:
Reject H0 if z > 1.96 or z < -1.96
 Step 5: Compute the value of the test statistic and arrive
at a decision.
X m
16.12  16.00
z

 1.44
 n
0.5 36
 Do not reject the null hypothesis. We cannot conclude
the mean is different from 16 ounces.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-21
p-Value in Hypothesis Testing
 A p-Value is the probability, assuming that the null
hypothesis is true, of finding a value of the test statistic at
least as extreme as the computed value for the test.
 The “critical probability” for our decision to reject the
null.
 If the p-Value is smaller than the significance level, H0 is
rejected.
 If the p-Value is larger than the significance level, H0 is not
rejected.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-22
Computation of the p-Value
 One-Tailed Test: p-Value = P{z ≥absolute value of the
computed test statistic value}
 Two-Tailed Test: p-Value = 2P{z ≥ absolute value of the
computed test statistic value}
 From EXAMPLE 1, z = 1.44, and because it was a two-tailed
test,
the p-Value = 2P{z ≥ 1.44} = 2(.5-.4251) = .1498.
Because .1498 > .05, do not reject H0.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-23
Testing for the Population Mean: Large Sample,
Population Standard Deviation Unknown
 Here  is unknown, so we estimate it with the
sample standard deviation s.
 As long as the sample size n  30, z can be
approximated with:
X m
z
s/ n
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-24
EXAMPLE 2
 Roder’s Discount Store chain issues its own credit card.
Lisa, the credit manager, wants to find out if the mean
monthly unpaid balance is more than $400. The level of
significance is set at .05. A random check of 172 unpaid
balances revealed the sample mean to be $407 and the
sample standard deviation to be $38. Should Lisa conclude
that the population mean is greater than $400, or is it
reasonable to assume that the difference of $7 ($407-$400)
is due to chance?
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-25
EXAMPLE 2
continued
 Step 1: H0: m  $400, H1: m > $400
 Step 2: The significance level is .05
 Step 3: Because the sample is large we can use the z
distribution as the test statistic.
 Step 4: H0 is rejected if z>1.65
 Step 5: Perform the calculations and make a decision.
X  m $407  $400
z

 2.42
s n
$38 172
H0 is rejected. Lisa can conclude that the mean
unpaid balance is greater than $400.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-26
Testing for a Population Mean: Small Sample,
Population Standard Deviation Unknown
 The test statistic is the t distribution.
 The test statistic for the one sample case is given by:
X m
t 
s/ n
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-27
Example 3
 The current rate for producing 5 amp fuses at Neary Electric Co. is
250 per hour. A new machine has been purchased and installed
that, according to the supplier, will increase the production rate.
A sample of 10 randomly selected hours from last month revealed
the mean hourly production on the new machine was 256 units,
with a sample standard deviation of 6 per hour. At the .05
significance level can Neary conclude that the new machine is
faster?
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-28
Example 3
continued
 Step 1: State the null and the alternate hypothesis.
H0: m  250; H1: m > 250
 Step 2: Select the level of significance. It is .05.
 Step 3: Find a test statistic. It is the t distribution because the
population standard deviation is not known and the sample size is
less than 30.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-29
Example 3
continued
 Step 4: State the decision rule. There are 10 – 1 = 9 degrees of
freedom. The null hypothesis is rejected if t > 1.833.
Step 5: Make a decision and interpret the results.
X  m 256  250
t

 3.162
s n
6 10
The null hypothesis is rejected. The mean number
produced is more than 250 per hour.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-30
Tests Concerning Proportion
 A Proportion is the fraction or percentage that indicates
the part of the population or sample having a particular
trait of interest.
 The sample proportion is denoted by p and is found by:
Number of successes in the sample
p
Number sampled
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-31
Test Statistic for Testing a Single
Population Proportion
z
p 
 (1   )
n
The sample proportion is p and  is the population
proportion.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-32
EXAMPLE 4
 In the past, 15% of the mail order solicitations for a
certain charity resulted in a financial contribution. A new
solicitation letter that has been drafted is sent to a sample
of 200 people and 45 responded with a contribution. At
the .05 significance level can it be concluded that the new
letter is more effective?
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-33
Example 4
continued
 Step 1: State the null and the alternate hypothesis.
H0:   .15 H1:  > .15
 Step 2: Select the level of significance. It is .05.
 Step 3: Find a test statistic. The z distribution is the test
statistic.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-34
Example 4
continued
 Step 4: State the decision rule. The null hypothesis is
rejected if z is greater than 1.65.
 Step 5: Make a decision and interpret the results.
z
p 

 (1   )
n
45
 .15
200
 2.97
.15(1  .15)
200
The null hypothesis is rejected. More than 15 percent are
responding with a pledge. The new letter is more effective.
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-35
Lesson 8:
One-Sample Tests of Hypothesis
- END -
Ka-fu Wong © 2004
ECON1003: Analysis of Economic Data
Lesson8-36