Course: A0392 – Economic Statistics
Year: 2006
Session 09
Hypothesis Testing for Proportions and Categorical Data
Outline:
• Hypothesis tests about a proportion
• Hypothesis tests about the difference between two proportions
• Test of independence for categorical data
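Summary of Test Statistics to be Used in a Hypothesis Test about a Population Mean
• n > 30, $\sigma$ known: $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
• n > 30, $\sigma$ unknown: use s to estimate $\sigma$, then $z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
• n ≤ 30, population approximately normal, $\sigma$ known: $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
• n ≤ 30, population approximately normal, $\sigma$ unknown: use s to estimate $\sigma$, then $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
• n ≤ 30, population not approximately normal: increase n to > 30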
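The decision flow above can be sketched in a few lines of Python. This is only an illustration: the function names `choose_test_statistic` and `standardized_statistic` are hypothetical, not part of the course material.

```python
from math import sqrt


def choose_test_statistic(n, sigma_known, approx_normal):
    """Return 'z', 't', or 'increase n' following the summary above."""
    if n > 30:
        # Large sample: z is used whether sigma is known or estimated by s.
        return "z"
    if not approx_normal:
        # Small sample from a non-normal population: collect more data.
        return "increase n"
    # Small sample from an approximately normal population.
    return "z" if sigma_known else "t"


def standardized_statistic(x_bar, mu, spread, n):
    """Compute (x_bar - mu) / (spread / sqrt(n)); spread is sigma or s."""
    return (x_bar - mu) / (spread / sqrt(n))


print(choose_test_statistic(n=40, sigma_known=False, approx_normal=False))
print(choose_test_statistic(n=15, sigma_known=False, approx_normal=True))
print(choose_test_statistic(n=15, sigma_known=True, approx_normal=False))
```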
A Summary of Forms for Null and Alternative Hypotheses about a Population Proportion
• The equality part of the hypotheses always appears in the null hypothesis.
• In general, a hypothesis test about the value of a population proportion p must take one of the following three forms (where $p_0$ is the hypothesized value of the population proportion):
$H_0: p \ge p_0$, $H_a: p < p_0$
$H_0: p \le p_0$, $H_a: p > p_0$
$H_0: p = p_0$, $H_a: p \ne p_0$
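Tests about a Population Proportion: Large-Sample Case ($np \ge 5$ and $n(1 - p) \ge 5$)
• Test Statistic
$$z = \frac{\bar{p} - p_0}{\sigma_{\bar{p}}}$$
where:
$$\sigma_{\bar{p}} = \sqrt{\frac{p_0(1 - p_0)}{n}}$$
• Rejection Rule
One-Tailed, $H_0: p \le p_0$: Reject $H_0$ if $z > z_\alpha$
One-Tailed, $H_0: p \ge p_0$: Reject $H_0$ if $z < -z_\alpha$
Two-Tailed, $H_0: p = p_0$: Reject $H_0$ if $|z| > z_{\alpha/2}$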
Example: NSC
• Two-Tailed Test about a Population Proportion: Large n
For a Christmas and New Year’s week, the National Safety Council estimated that 500 people would be killed and 25,000 injured on the nation’s roads. The NSC claimed that 50% of the accidents would be caused by drunk driving.
A sample of 120 accidents showed that 67 were caused by drunk driving. Use these data to test the NSC’s claim with $\alpha = 0.05$.
Example: NSC
• Two-Tailed Test about a Population Proportion: Large n
– Hypotheses
$H_0: p = .5$
$H_a: p \ne .5$
– Test Statistic
$$\sigma_{\bar{p}} = \sqrt{\frac{p_0(1 - p_0)}{n}} = \sqrt{\frac{.5(1 - .5)}{120}} = .045644$$
$$z = \frac{\bar{p} - p_0}{\sigma_{\bar{p}}} = \frac{(67/120) - .5}{.045644} = 1.278$$
Example: NSC
• Two-Tailed Test about a Population Proportion: Large n
– Rejection Rule
Reject $H_0$ if z < -1.96 or z > 1.96
– Conclusion
Do not reject $H_0$.
For z = 1.278, the p-value is .201. Rejecting $H_0$ would exceed the maximum allowed risk of committing a Type I error, because the p-value (.201) is greater than $\alpha = .05$.
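The NSC calculation can be checked with a short Python sketch. It assumes the standard-library `statistics.NormalDist` in place of a normal table; the function name `one_proportion_z_test` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alpha=0.05):
    """Two-tailed large-sample z test of H0: p = p0."""
    p_bar = successes / n
    # The standard error uses the hypothesized proportion p0, as on the slides.
    sigma_p_bar = sqrt(p0 * (1 - p0) / n)
    z = (p_bar - p0) / sigma_p_bar
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    return z, p_value, abs(z) > z_critical


# NSC data: 67 of 120 sampled accidents involved drunk driving, H0: p = .5
z, p_value, reject = one_proportion_z_test(67, 120, 0.5)
print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject}")
```

With the NSC figures this reproduces the slide values of z and the p-value to three decimals and leads to the same "do not reject" decision.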
Hypothesis Testing and Decision Making
• In many decision-making situations the decision maker may want, and in some cases may be forced, to take action with both the conclusion "do not reject $H_0$" and the conclusion "reject $H_0$."
• In such situations, it is recommended that the hypothesis-testing procedure be extended to include consideration of making a Type II error.
Calculating the Probability of a Type II Error in Hypothesis Tests about a Population Mean
1. Formulate the null and alternative hypotheses.
2. Use the level of significance $\alpha$ to establish a rejection rule based on the test statistic.
3. Using the rejection rule, solve for the value of the sample mean that identifies the rejection region.
4. Use the results from step 3 to state the values of the sample mean that lead to the acceptance of $H_0$; this defines the acceptance region.
5. Using the sampling distribution of $\bar{x}$ for any value of $\mu$ from the alternative hypothesis, and the acceptance region from step 4, compute the probability that the sample mean will be in the acceptance region.
Example: Metro EMS (revisited)
• Calculating the Probability of a Type II Error
1. Hypotheses are: $H_0: \mu \le 12$ and $H_a: \mu > 12$
2. Rejection rule is: Reject $H_0$ if z > 1.645
3. Value of the sample mean that identifies the rejection region:
$$z = \frac{\bar{x} - 12}{3.2/\sqrt{40}} = 1.645$$
$$\bar{x} = 12 + 1.645\left(\frac{3.2}{\sqrt{40}}\right) = 12.8323$$
4. We will accept $H_0$ when $\bar{x} \le 12.8323$
Example: Metro EMS (revisited)
• Calculating the Probability of a Type II Error
5. Probabilities that the sample mean will be in the acceptance region:
$$z = \frac{12.8323 - \mu}{3.2/\sqrt{40}}$$
Values of $\mu$    z        $\beta$    $1 - \beta$
14.0               -2.31    .0104      .9896
13.6               -1.52    .0643      .9357
13.2               -0.73    .2327      .7673
12.83               0.00    .5000      .5000
12.8                0.06    .5239      .4761
12.4                0.85    .8023      .1977
12.0001             1.645   .9500      .0500
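As a rough check of the table, the Type II error probabilities can be recomputed in Python. This sketch assumes the Metro EMS values used on the slides ($\mu_0 = 12$, $\sigma = 3.2$, n = 40, $\alpha = .05$) and uses the standard-library `statistics.NormalDist`; the name `type_ii_error` is illustrative. Results may differ from the table in the third or fourth decimal because the table rounds z to two decimals.

```python
from math import sqrt
from statistics import NormalDist

MU0, SIGMA, N, ALPHA = 12.0, 3.2, 40, 0.05

std_err = SIGMA / sqrt(N)
# Upper edge of the acceptance region for the sample mean (about 12.83).
cutoff = MU0 + NormalDist().inv_cdf(1 - ALPHA) * std_err


def type_ii_error(mu_true):
    """P(sample mean falls in the acceptance region | true mean = mu_true)."""
    z = (cutoff - mu_true) / std_err
    return NormalDist().cdf(z)


for mu in (14.0, 13.6, 13.2, 12.8, 12.4):
    beta = type_ii_error(mu)
    print(f"mu = {mu:5.1f}  beta = {beta:.4f}  power = {1 - beta:.4f}")
```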
Example: Metro EMS (revisited)
• Calculating the Probability of a Type II Error
Observations about the preceding table:
– When the true population mean $\mu$ is close to the null hypothesis value of 12, there is a high probability that we will make a Type II error.
– When the true population mean $\mu$ is far above the null hypothesis value of 12, there is a low probability that we will make a Type II error.
Power of the Test
• The probability of correctly rejecting $H_0$ when it is false is called the power of the test.
• For any particular value of $\mu$, the power is $1 - \beta$.
• We can show graphically the power associated with each value of $\mu$; such a graph is called a power curve.
Determining the Sample Size for a Hypothesis Test About a Population Mean
$$n = \frac{(z_\alpha + z_\beta)^2 \sigma^2}{(\mu_0 - \mu_a)^2}$$
where
$z_\alpha$ = z value providing an area of $\alpha$ in the tail
$z_\beta$ = z value providing an area of $\beta$ in the tail
$\sigma$ = population standard deviation
$\mu_0$ = value of the population mean in $H_0$
$\mu_a$ = value of the population mean used for the Type II error
Note: In a two-tailed hypothesis test, use $z_{\alpha/2}$, not $z_\alpha$.
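The sample-size formula can be evaluated with a small Python sketch. The inputs below are partly hypothetical: $\sigma = 3.2$ and $\mu_0 = 12$ are borrowed from the Metro EMS example, while $\beta = .10$ and $\mu_a = 13$ are assumed purely for illustration. The function name is also illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean_test(alpha, beta, sigma, mu0, mu_a, two_tailed=False):
    """n = (z_alpha + z_beta)^2 * sigma^2 / (mu0 - mu_a)^2, rounded up."""
    # A two-tailed test uses z_{alpha/2} instead of z_alpha.
    tail_area = alpha / 2 if two_tailed else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail_area)
    z_beta = NormalDist().inv_cdf(1 - beta)
    n = (z_alpha + z_beta) ** 2 * sigma ** 2 / (mu0 - mu_a) ** 2
    return ceil(n)


# Assumed inputs: alpha = .05, beta = .10, sigma = 3.2, mu0 = 12, mu_a = 13
n_one_tailed = sample_size_for_mean_test(0.05, 0.10, 3.2, 12, 13)
n_two_tailed = sample_size_for_mean_test(0.05, 0.10, 3.2, 12, 13, two_tailed=True)
print(n_one_tailed, n_two_tailed)
```

The result is rounded up because a fractional observation cannot be sampled and rounding down would leave $\beta$ above its target.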
Relationship among $\alpha$, $\beta$, and n
• Once two of the three values are known, the other can be computed.
• For a given level of significance $\alpha$, increasing the sample size n will reduce $\beta$.
• For a given sample size n, decreasing $\alpha$ will increase $\beta$, whereas increasing $\alpha$ will decrease $\beta$.
Inferences About the Difference Between the Proportions of Two Populations
• Sampling Distribution of $\bar{p}_1 - \bar{p}_2$
• Interval Estimation of $p_1 - p_2$
• Hypothesis Tests about $p_1 - p_2$
Sampling Distribution of $\bar{p}_1 - \bar{p}_2$
• Expected Value
$$E(\bar{p}_1 - \bar{p}_2) = p_1 - p_2$$
• Standard Deviation
$$\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
• Distribution Form
If the sample sizes are large ($n_1 p_1$, $n_1(1 - p_1)$, $n_2 p_2$, and $n_2(1 - p_2)$ are all greater than or equal to 5), the sampling distribution of $\bar{p}_1 - \bar{p}_2$ can be approximated by a normal probability distribution.
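Interval Estimation of $p_1 - p_2$
• Interval Estimate
$$\bar{p}_1 - \bar{p}_2 \pm z_{\alpha/2}\, \sigma_{\bar{p}_1 - \bar{p}_2}$$
• Point Estimator of $\sigma_{\bar{p}_1 - \bar{p}_2}$
$$s_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$$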
Example: MRA
MRA (Market Research Associates) is conducting research to evaluate the effectiveness of a client’s new advertising campaign. Before the new campaign began, a telephone survey of 150 households in the test market area showed 60 households “aware” of the client’s product. The new campaign has been initiated with TV and newspaper advertisements running for three weeks. A survey conducted immediately after the new campaign showed 120 of 250 households “aware” of the client’s product.
Do the data support the position that the advertising campaign has increased awareness of the client’s product?
Example: MRA
• Point Estimator of the Difference Between the Proportions of Two Populations
The point estimate of $p_1 - p_2$ is
$$\bar{p}_1 - \bar{p}_2 = \frac{120}{250} - \frac{60}{150} = .48 - .40 = .08$$
$p_1$ = proportion of the population of households “aware” of the product after the new campaign
$p_2$ = proportion of the population of households “aware” of the product before the new campaign
$\bar{p}_1$ = sample proportion of households “aware” of the product after the new campaign
$\bar{p}_2$ = sample proportion of households “aware” of the product before the new campaign
Example: MRA
• Interval Estimate of $p_1 - p_2$: Large-Sample Case
For $\alpha = .05$, $z_{.025} = 1.96$:
$$.48 - .40 \pm 1.96\sqrt{\frac{.48(.52)}{250} + \frac{.40(.60)}{150}}$$
$$.08 \pm 1.96(.0510)$$
$$.08 \pm .10$$
or -.02 to +.18
– Conclusion
At a 95% confidence level, the interval estimate of the difference between the proportions of households aware of the client’s product before and after the new advertising campaign is -.02 to +.18.
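The MRA interval can be reproduced with a brief Python sketch. It assumes `statistics.NormalDist` for the z value rather than a table lookup, and the function name `diff_proportions_interval` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportions_interval(x1, n1, x2, n2, confidence=0.95):
    """Large-sample interval estimate of p1 - p2 (unpooled standard error)."""
    p1_bar, p2_bar = x1 / n1, x2 / n2
    std_err = sqrt(p1_bar * (1 - p1_bar) / n1 + p2_bar * (1 - p2_bar) / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * std_err
    diff = p1_bar - p2_bar
    return diff - margin, diff + margin


# MRA data: 120 of 250 aware after the campaign, 60 of 150 aware before
low, high = diff_proportions_interval(120, 250, 60, 150)
print(f"{low:.2f} to {high:.2f}")
```

Because the interval includes zero, it does not by itself establish that awareness increased.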
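Hypothesis Tests about $p_1 - p_2$
• Hypotheses
$H_0: p_1 - p_2 \le 0$
$H_a: p_1 - p_2 > 0$
• Test Statistic
$$z = \frac{(\bar{p}_1 - \bar{p}_2) - (p_1 - p_2)}{\sigma_{\bar{p}_1 - \bar{p}_2}}$$
• Point Estimator of $\sigma_{\bar{p}_1 - \bar{p}_2}$ where $p_1 = p_2$
$$s_{\bar{p}_1 - \bar{p}_2} = \sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where:
$$\bar{p} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$$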
Example: MRA
• Hypothesis Tests about $p_1 - p_2$
Can we conclude, using a .05 level of significance, that the proportion of households aware of the client’s product increased after the new advertising campaign?
$p_1$ = proportion of the population of households “aware” of the product after the new campaign
$p_2$ = proportion of the population of households “aware” of the product before the new campaign
– Hypotheses
$H_0: p_1 - p_2 \le 0$
$H_a: p_1 - p_2 > 0$
Example: MRA
• Hypothesis Tests about $p_1 - p_2$
– Rejection Rule
Reject $H_0$ if z > 1.645
– Test Statistic
$$\bar{p} = \frac{250(.48) + 150(.40)}{250 + 150} = \frac{180}{400} = .45$$
$$s_{\bar{p}_1 - \bar{p}_2} = \sqrt{.45(.55)\left(\frac{1}{250} + \frac{1}{150}\right)} = .0514$$
$$z = \frac{(.48 - .40) - 0}{.0514} = \frac{.08}{.0514} = 1.56$$
– Conclusion
Do not reject $H_0$, because z = 1.56 is less than 1.645.
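A minimal Python sketch of the pooled two-proportion test, applied to the MRA data, is shown here. The function name `two_proportion_z_test` is illustrative, and `statistics.NormalDist` stands in for the normal table.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, alpha=0.05):
    """Upper-tailed test of H0: p1 - p2 <= 0 using the pooled estimate."""
    p1_bar, p2_bar = x1 / n1, x2 / n2
    # Pooled proportion; equal to (n1*p1_bar + n2*p2_bar) / (n1 + n2).
    pooled = (x1 + x2) / (n1 + n2)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_bar - p2_bar) / std_err
    z_critical = NormalDist().inv_cdf(1 - alpha)
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value, z > z_critical


# MRA data: 120 of 250 aware after the campaign, 60 of 150 aware before
z, p_value, reject = two_proportion_z_test(120, 250, 60, 150)
print(f"z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject}")
```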
Test of Independence: Contingency Tables
1. Set up the null and alternative hypotheses.
2. Select a random sample and record the observed frequency, $f_{ij}$, for each cell of the contingency table.
3. Compute the expected frequency, $e_{ij}$, for each cell.
$$e_{ij} = \frac{(\text{Row } i \text{ Total})(\text{Column } j \text{ Total})}{\text{Sample Size}}$$
Test of Independence: Contingency Tables
4. Compute the test statistic.
$$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$$
5. Reject $H_0$ if $\chi^2 > \chi^2_\alpha$ (where $\alpha$ is the significance level and with n rows and m columns there are (n - 1)(m - 1) degrees of freedom).
Example: Finger Lakes Homes (B)
• Contingency Table (Independence) Test
Each home sold can be classified according to price and to style. Finger Lakes Homes’ manager would like to determine if the price of the home and the style of the home are independent variables.
The number of homes sold for each model and price for the past two years is shown below. For convenience, the price of the home is listed as either $65,000 or less or more than $65,000.
Price        Colonial  Ranch  Split-Level  A-Frame
≤ $65,000    18        6      19           12
> $65,000    12        14     16           3
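Example: Finger Lakes Homes (B)
• Contingency Table (Independence) Test
– Hypotheses
H0: Price of the home is independent of the style of the home that is purchased
Ha: Price of the home is not independent of the style of the home that is purchased
– Expected Frequencies
The observed frequencies with their row and column totals are:
Price        Colonial  Ranch  Split-Level  A-Frame  Total
≤ $65,000    18        6      19           12       55
> $65,000    12        14     16           3        45
Total        30        20     35           15       100
Each expected frequency is (row total)(column total)/100. For example, the expected frequency for Colonial homes in the lower price group is (55)(30)/100 = 16.5.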
Example: Finger Lakes Homes (B)
• Contingency Table (Independence) Test
– Test Statistic
$$\chi^2 = \frac{(18 - 16.5)^2}{16.5} + \frac{(6 - 11)^2}{11} + \dots + \frac{(3 - 6.75)^2}{6.75}$$
= .1364 + 2.2727 + . . . + 2.0833 = 9.1486
– Rejection Rule
With $\alpha = .05$ and (2 - 1)(4 - 1) = 3 d.f., $\chi^2_{.05} = 7.81$
Reject $H_0$ if $\chi^2 > 7.81$
– Conclusion
We reject $H_0$, the assumption that the price of the home is independent of the style of the home that is purchased.
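The full contingency-table calculation can be sketched in plain Python. The function name `chi_square_independence` is illustrative, and the critical value 7.81 is taken from the slide rather than computed, since the Python standard library has no chi-square quantile function.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom, expected table)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    # e_ij = (row i total)(column j total) / sample size
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (f - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for f, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof, expected


# Rows: price groups; columns: Colonial, Ranch, Split-Level, A-Frame
observed = [[18, 6, 19, 12], [12, 14, 16, 3]]
chi2, dof, expected = chi_square_independence(observed)

CRITICAL_VALUE = 7.81  # table value for alpha = .05 with 3 d.f., from the slide
print(f"chi-square = {chi2:.4f}, d.f. = {dof}, reject H0: {chi2 > CRITICAL_VALUE}")
```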