Hypothesis Testing
• What is Hypothesis Testing?
• Testing for the population mean
• One-tailed testing
• Two-tailed testing
• Tests Concerning Proportions
• Types of Errors
Hypothesis Testing
A hypothesis is a statement about the value of a
population parameter, developed for the purpose of
testing. Examples of hypotheses made about a
population parameter are:
– The mean monthly income for systems analysts is $3,625 (a statement about
a population mean μ).
– Twenty percent of all restaurant customers return for another meal
within a month (a statement about a population proportion π).
Hypothesis Testing
Hypothesis testing is a procedure, based on sample evidence and probability
theory, used to determine whether the hypothesis is a reasonable statement
and should not be rejected, or is unreasonable and should be rejected.
Steps:
1. Identify what is given or implied by the problem.
2. Decide if the z or the t distribution is to be used.
3. Find the critical values and the accept/reject regions.
4. Find the z or t value of the sample and check if it falls in an accept or a reject region.
Hypothesis Testing
o H0: null hypothesis and H1: alternate hypothesis
o H0 and H1 are mutually exclusive and collectively exhaustive
o H0 is always presumed to be true
o H1 has the burden of proof
Hypothesis Testing
In problem solving, look for key words and convert them into
symbols. Some key words include: “improved”, “better than”, “as
effective as”, “different from”, “has changed”, …
3 possible situations:
H0: μ = value, H1: μ ≠ value
H0: μ ≤ value, H1: μ > value
H0: μ ≥ value, H1: μ < value
Possible Keywords:
• is there a change?
• has not changed
• is larger than
• is better than
• has improved
• is less than
• is less effective
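Hypothesis Testing (Two-Tailed)
H0: μ = value
H1: μ ≠ value
Reject H0 if:
$$|Z_{\bar{X}}| > Z_{\alpha/2} \quad\text{or}\quad |t_{\bar{X}}| > t_{\alpha/2,\ df = n-1}$$
where
$$Z_{\bar{X}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}, \qquad t_{\bar{X}} = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
[Figure: sampling distribution centered at 0. Critical values at −Zα/2 and Zα/2, with an area of α/2 in each tail. H0 is accepted between the two critical values.]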
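Hypothesis Testing (One-Tailed)
Left-tailed test:
H0: μ ≥ value
H1: μ < value
Reject H0 if:
$$Z_{\bar{X}} < -Z_{\alpha} \quad\text{or}\quad t_{\bar{X}} < -t_{\alpha,\ df = n-1}$$
Right-tailed test:
H0: μ ≤ value
H1: μ > value
Reject H0 if:
$$Z_{\bar{X}} > Z_{\alpha} \quad\text{or}\quad t_{\bar{X}} > t_{\alpha,\ df = n-1}$$
[Figure: two sampling distributions centered at 0. Left-tailed test: CV = −Zα, with area α in the left tail. Right-tailed test: CV = Zα, with area α in the right tail. H0 is accepted on the other side of the critical value.]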
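The rejection rules for the z case can be sketched in a few lines of Python. This is only an illustration using the standard library; the function names `z_critical` and `reject_h0` and the tail labels "two", "left" and "right" are assumptions made for this sketch, not part of the slides.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Critical z value(s) for significance level alpha.

    tail is "two", "left" or "right" (labels chosen for this sketch).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
        return (-z, z)
    if tail == "left":
        return std_normal.inv_cdf(alpha)       # all of alpha in the left tail
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)   # all of alpha in the right tail
    raise ValueError("tail must be 'two', 'left' or 'right'")


def reject_h0(z, alpha, tail):
    """Return True if the sample z value falls in the rejection region."""
    cv = z_critical(alpha, tail)
    if tail == "two":
        return z < cv[0] or z > cv[1]
    if tail == "left":
        return z < cv
    return z > cv


print(z_critical(0.05, "two"))    # two critical values, symmetric about 0
print(z_critical(0.05, "right"))  # single critical value
```

A one-tailed test puts all of α in one tail, so its critical value is closer to 0 than the two-tailed value at the same significance level.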
Hypothesis Testing (for the mean μ)
Hypothesis Testing (for the proportion π)
Example 26, page 356
50% of students change their major within the first year. A
random sample of 100 students revealed that 48 students
changed their major within the first year. Has there been a
significant decrease in the proportion of students who
changed their major? Test at a 0.05 level of significance.
H0: π ≥ 0.5
H1: π < 0.5
p = 48/100 = 0.48
α = 0.05 => z of CV = -1.65 => H0 is rejected if z < -1.65
$$z = \frac{p - \pi}{\sqrt{\dfrac{\pi(1-\pi)}{n}}} = \frac{0.48 - 0.5}{\sqrt{\dfrac{0.5(1-0.5)}{100}}} = -0.4$$
-0.4 > -1.65 => H0 is not rejected. There is not enough evidence that the
proportion of students changing their major has decreased.
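The arithmetic of this example can be checked with a short Python sketch. The helper name `one_proportion_z` is made up for illustration. The critical value is computed from the standard normal distribution instead of being read from a table, so it comes out as about -1.645 rather than the rounded -1.65.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(x, n, pi0):
    """z statistic for a sample proportion x/n against the hypothesized pi0."""
    p = x / n
    return (p - pi0) / sqrt(pi0 * (1 - pi0) / n)


# Values from the example: 48 of 100 students, H0: pi >= 0.5, alpha = 0.05
z = one_proportion_z(48, 100, 0.5)
critical = NormalDist().inv_cdf(0.05)  # left-tailed critical value
reject = z < critical

print(round(z, 2), round(critical, 3), reject)
```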
Decisions and Consequences
Null Hypothesis    Researcher accepts H0    Researcher rejects H0
H0 is true         Correct decision         Type I error (α)
H0 is false        Type II error (β)        Correct decision
Type II Error (β)
• Is the probability that the null hypothesis is NOT rejected when
it is actually false.
• Example (page 356): At a plant manufacturing pins from steel, past
experience indicates that the mean tensile strength of all incoming shipments, µ0, is
10,000 psi and the standard deviation σ is 400 psi.
• To make a decision, the manufacturer sets up the following rule for the quality
control inspector: take a sample of 100 steel bars.
• At the .05 significance level, if the sample mean strength X-bar falls between
9,922 psi and 10,078 psi, accept the lot. Otherwise the lot is rejected.
• Suppose that the unknown population mean of an incoming lot, designated by µ1,
is really 9,900 psi. What is the probability that the inspector will fail to
reject the shipment (type II error)?
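Type II Error (β)
The z value of the critical value X̄c, measured from the true mean µ1, is:
$$z = \frac{\bar{X}_c - \mu_1}{\sigma/\sqrt{n}}$$
Then
$$z = \frac{9{,}922 - 9{,}900}{400/\sqrt{100}} = \frac{22}{40} = 0.55$$
The area between 0 and z = 0.55 is .2088. So the probability of a type II error,
β = P(z > 0.55), is 0.5 - .2088 = .2912.
Always draw a picture of the normal curve and areas to
solve!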
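As a cross-check, β can also be computed directly as the probability that the sample mean lands inside the acceptance interval when the true mean is 9,900 psi. The sketch below uses only the numbers given in the example; the variable names are chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

sigma, n = 400, 100            # population SD and sample size from the example
lower, upper = 9922, 10078     # acceptance interval for the sample mean
mu1 = 9900                     # true mean of the incoming lot

# Sampling distribution of the sample mean when the true mean is mu1
xbar_dist = NormalDist(mu=mu1, sigma=sigma / sqrt(n))

# beta = probability that the sample mean still falls in the acceptance interval
beta = xbar_dist.cdf(upper) - xbar_dist.cdf(lower)
print(round(beta, 4))
```

The upper limit of 10,078 psi lies about 4.45 standard errors above µ1, so the area beyond it is negligible and the result agrees with the table-based value of .2912.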
Chapter 11
Two-Sample Tests of Hypothesis
Our Objectives

Conduct a test of hypothesis about the difference between two
independent population means.

Conduct a test of a hypothesis about the difference between two
population proportions.

Conduct a test of a hypothesis about the mean difference between
paired or dependent observations.

Understand the difference between dependent and independent
samples.
Two-Sample Tests of Hypothesis

We take samples from two populations and compare the population
means.

In the one-sample test of hypothesis, we took a sample from a
population and compared the sample statistic to the population
parameter.

Example: Is there a difference in the mean value of residential real
estate sold by male agents and female agents in a particular area?
Two-Sample Test of Hypothesis:
Independent Samples
• Example: A financial accountant wishes to know whether there is a difference in
the mean rate of return for high yield mutual funds and global mutual
funds.
• There are two independent populations: high yield mutual
funds and global mutual funds.

If there is a difference between the population means, then we expect that
there is a difference between the sample means.

If each of the two samples has more than 30 observations, we can reason that the
distribution of the difference in the sample means is Normal.

Mean of the distribution of the differences:
$$\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2$$

If this mean is zero, we conclude that there is no difference in the two population means.
If it is positive or negative, we conclude that the two populations do not have
the same mean.
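Two-Sample Test of Hypothesis:
Independent Samples with known population SD σ
H0: μ1 = μ2
H1: μ1 ≠ μ2
or, equivalently,
H0: μ1 - μ2 = 0
H1: μ1 - μ2 ≠ 0
$$\bar{X}_1 \sim \mathrm{Norm}\!\left(\mu_1, \frac{\sigma_1^2}{n_1}\right) \quad\text{and}\quad \bar{X}_2 \sim \mathrm{Norm}\!\left(\mu_2, \frac{\sigma_2^2}{n_2}\right)$$
then
$$(\bar{X}_1 - \bar{X}_2) \sim \mathrm{Norm}\!\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$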
Two-Sample Test of Hypothesis:
Independent Samples with known population SD σ
Standardize the distribution of the differences.
The test statistic for the difference between two means is:
$$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
The variance of the distribution of differences in
sample means is:
$$\sigma^2_{\bar{X}_1 - \bar{X}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$
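The variance formula relies on the two samples being independent. A short derivation, under that assumption, shows where the sum comes from:

```latex
\begin{aligned}
\operatorname{Var}(\bar{X}_1 - \bar{X}_2)
  &= \operatorname{Var}(\bar{X}_1) + \operatorname{Var}(\bar{X}_2)
     - 2\operatorname{Cov}(\bar{X}_1, \bar{X}_2) \\
  &= \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}
     \qquad \text{since } \operatorname{Cov}(\bar{X}_1, \bar{X}_2) = 0
     \text{ for independent samples.}
\end{aligned}
```

The variances add even though the means are subtracted, which is why the denominator of the z statistic contains a sum.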
Steps:
1. Identify what is given or implied by the problem.
2. Decide if the z or the t distribution is to be used.
3. Find the critical values and the accept/reject regions.
4. Find the z or t value of the sample and check if it falls in an accept or a reject region.
Example 2, page 374
First population:
A sample of 65 observations is selected. Population SD = 0.75.
Sample mean = 2.67.
Second population:
A sample of 50 observations is selected. Population SD = 0.66.
Sample mean = 2.59.
Use a 0.08 significance level.
H0: μ1 ≤ μ2
H1: μ1 > μ2
a) Is this a one-tailed or a two-tailed test?
b) State the decision rule.
c) Compute the value of the test statistic.
d) What is your decision regarding H0?
e) What is the p-value?
Example 2 (solution)
a) It is a one-tailed test.
b) For α = 0.08 and a one-tailed test, we reject H0 if z > 1.41
(CV = 1.41).
c)
$$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{2.67 - 2.59}{\sqrt{\dfrac{0.75^2}{65} + \dfrac{0.66^2}{50}}} = 0.607$$
d) 0.607 < 1.41, so we fail to reject H0.
e) The p-value of the sample is
$$P(z > 0.607) = 0.5 - 0.2291 = 0.2709$$
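The same calculation can be sketched in Python using the summary statistics from the example. The function name `two_sample_z` is an assumption for this sketch. The p-value is computed from the exact normal distribution, so it differs slightly from the table-based 0.2709.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, sigma1, n1, xbar2, sigma2, n2):
    """z statistic for the difference of two means with known population SDs."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (xbar1 - xbar2) / standard_error


# Summary statistics from the example, alpha = 0.08, H1: mu1 > mu2
z = two_sample_z(2.67, 0.75, 65, 2.59, 0.66, 50)
critical = NormalDist().inv_cdf(1 - 0.08)   # right-tailed critical value
p_value = 1 - NormalDist().cdf(z)           # area to the right of z

print(round(z, 3), round(critical, 2), round(p_value, 3))
```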
Two-Sample Test of Hypothesis
(Proportions)
Standardize the distribution of the differences.
The test statistic for the difference between two
proportions is:
$$z = \frac{p_1 - p_2}{\sqrt{\dfrac{p_c(1-p_c)}{n_1} + \dfrac{p_c(1-p_c)}{n_2}}}$$
p1 is the first sample proportion (p1 = x1/n1)
p2 is the second sample proportion (p2 = x2/n2)
pc is the pooled proportion

The pooled estimate of the population proportion is computed using the
formula:
$$p_c = \frac{x_1 + x_2}{n_1 + n_2}$$
Example 12, page 378
Single People:
A sample of 400 people is selected. 120 had at least one
accident in the past three years.
Married People:
A sample of 600 people is selected. 150 had at least one
accident in the past three years.
Use a 0.05 significance level. Is there a significant difference in
the proportion of single and married people having accidents?
Example 12 (solution)
H0: πm = πs
H1: πm ≠ πs
At the 0.05 significance level, the critical values are z = 1.96 and z = -1.96
(two-tailed test). The accept region is between -1.96 and 1.96.
$$p_c = \frac{x_1 + x_2}{n_1 + n_2} = \frac{120 + 150}{400 + 600} = \frac{270}{1000} = 0.27$$
$$z = \frac{p_1 - p_2}{\sqrt{\dfrac{p_c(1-p_c)}{n_1} + \dfrac{p_c(1-p_c)}{n_2}}} = \frac{(120/400) - (150/600)}{\sqrt{\dfrac{0.27(1-0.27)}{400} + \dfrac{0.27(1-0.27)}{600}}} = 1.74$$
Since -1.96 < 1.74 < 1.96, H0 is not rejected. There is no significant difference
in the proportion of married and single people who have accidents.
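A minimal Python sketch of the pooled two-proportion test, using the counts from this example. The function name `two_proportion_z` is made up for illustration, and the two-tailed p-value is an addition that the slides do not compute.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference of two proportions, using the pooled estimate."""
    p1, p2 = x1 / n1, x2 / n2
    pc = (x1 + x2) / (n1 + n2)                       # pooled proportion
    standard_error = sqrt(pc * (1 - pc) / n1 + pc * (1 - pc) / n2)
    return (p1 - p2) / standard_error, pc


# Singles: 120 of 400; married: 150 of 600; alpha = 0.05, two-tailed
z, pc = two_proportion_z(120, 400, 150, 600)
critical = NormalDist().inv_cdf(1 - 0.05 / 2)        # about 1.96
p_value = 2 * (1 - NormalDist().cdf(abs(z)))         # two-tailed p-value
reject = abs(z) > critical

print(round(pc, 2), round(z, 2), round(p_value, 3), reject)
```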
Two-Sample Test of Hypothesis: Independent
Samples with unknown population SD σ and at least one
of the samples is less than 30.
o Use the following t distribution if:
o Independent Samples
o Both samples have unknown but equal population SD
o At least one of the samples is less than 30

We use the t statistic. We compute the t value using the formula:
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
sp squared is the pooled estimate of the population variance.
We use n1+n2-2 degrees of freedom.
So to find the value of t, 3 steps are performed:
1. compute s1 and s2
2. compute sp
3. determine t

Pooled variance is computed using the formula:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
Where s1 squared is the variance of the 1st sample;
s2 squared is the variance of the 2nd sample.
Example 15, page 384
Men Examination Scores:
72 69 98 66 85 76 79 80 77
Women Examination Scores:
81 67 90 78 81 80 76
Is it reasonable to conclude that women score higher than men?
Use the 0.01 significance level.
Example 15, page 384 (solution)
H0: μf ≤ μm
H1: μf > μm
Use Appendix B2 to obtain Critical Values.
For significance level 0.01, a one-tailed test and df = n1+n2-2 = 14,
we obtain t = 2.624 for the Critical Value.
Accept H0 if the sample t is less than 2.624.
Sample statistics: sf = 6.88, sm = 9.49, X̄f = 79, X̄m = 78
$$s_p^2 = \frac{(n_f - 1)s_f^2 + (n_m - 1)s_m^2}{n_f + n_m - 2} = \frac{(7 - 1)(6.88)^2 + (9 - 1)(9.49)^2}{7 + 9 - 2} = 71.749$$
$$t = \frac{\bar{X}_f - \bar{X}_m}{\sqrt{s_p^2\left(\dfrac{1}{n_f} + \dfrac{1}{n_m}\right)}} = \frac{79 - 78}{\sqrt{71.749\left(\dfrac{1}{7} + \dfrac{1}{9}\right)}} = 0.234$$
t = 0.234 < 2.624, so we accept H0.
Two-Sample Test of Hypothesis: Independent
Samples with unknown population SD σ and at least one
of the samples is less than 30.
o Use the following t distribution if:
o Independent Samples
o Both samples have unknown population SDs that cannot be assumed equal
o At least one of the samples is less than 30

We use the t statistic. We compute the t value using the formula:
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Use the following for degrees of freedom (round down if not an integer):
$$df = \frac{\left[(s_1^2/n_1) + (s_2^2/n_2)\right]^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$
Example 22, page 388
Klein Models:
5.0, 4.5, 3.4, 3.4, 6.0, 3.3, 4.5, 4.6, 3.5, 5.2, 4.8,
4.4, 4.6, 3.6, 5.0
Clairborne Models:
3.1, 3.7, 3.6, 4.0, 3.8, 3.8, 5.9, 4.9, 3.6, 3.6, 2.3, 4.0
Is it reasonable to conclude that Clairborne Models earn more?
Use the 0.05 significance level and assume the population
standard deviations are not the same.
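Example 22, page 388 (solution)
H0: μk ≤ μc
H1: μk > μc
Sample statistics: X̄k = 4.387, sk = 0.795, nk = 15; X̄c = 3.858, sc = 0.881, nc = 12
$$df = \frac{\left(\dfrac{0.795^2}{15} + \dfrac{0.881^2}{12}\right)^2}{\dfrac{(0.795^2/15)^2}{14} + \dfrac{(0.881^2/12)^2}{11}} = 22.5$$
which is rounded down to df = 22.
Use Appendix B2 to obtain Critical Values.
For significance level 0.05, a one-tailed test and df = 22,
we obtain t = 1.717 for the Critical Value.
Accept H0 if the sample t is less than 1.717.
$$t = \frac{4.387 - 3.858}{\sqrt{\dfrac{0.795^2}{15} + \dfrac{0.881^2}{12}}} = 1.619$$
t = 1.619 < 1.717, so we fail to reject the null hypothesis.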
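The unequal-variance t statistic and its degrees of freedom can be sketched in Python from the raw data of this example. The function name `unequal_variance_t` is an assumption for this sketch, and the critical value 1.717 is the table value quoted in the solution, not something the code computes.

```python
from math import floor, sqrt
from statistics import mean, variance

# Raw data from the example
klein = [5.0, 4.5, 3.4, 3.4, 6.0, 3.3, 4.5, 4.6, 3.5, 5.2, 4.8,
         4.4, 4.6, 3.6, 5.0]
clairborne = [3.1, 3.7, 3.6, 4.0, 3.8, 3.8, 5.9, 4.9, 3.6, 3.6, 2.3, 4.0]


def unequal_variance_t(sample1, sample2):
    """t statistic and rounded-down df when population SDs are not assumed equal."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1   # s1^2 / n1
    v2 = variance(sample2) / n2   # s2^2 / n2
    t = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, floor(df)           # round df down, as the slides instruct


t, df = unequal_variance_t(klein, clairborne)
T_CRITICAL = 1.717  # one-tailed, alpha = 0.05, df = 22 (table value from the example)

print(round(t, 2), df, t > T_CRITICAL)
```

Because the code works from unrounded means and standard deviations, t comes out near 1.62, which matches the hand calculation of 1.619 to two decimals.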