Statistics
Week 8
Fundamentals of Hypothesis Testing: One-Sample Tests
1
Goals of this note
After completing this note, you should be able to:
• Formulate null and alternative hypotheses for applications involving a single population mean or proportion
• Formulate a decision rule for testing a hypothesis
• Use the p-value approach to test the null hypothesis for both mean and proportion problems
• Explain what Type I and Type II errors are
2
What is a Hypothesis?
• A hypothesis is a claim (assumption) about a population parameter:
  • population mean
    Example: The average number of TV sets in U.S. homes is equal to three ( μ = 3 )
  • population proportion
    Example: A marketing company claims that it receives 8% responses from its mailing ( p = .08 )
3
The Null Hypothesis, H0
• States the assumption to be tested
  Example: The average number of TV sets in U.S. homes is equal to three ( H0: μ = 3 )
• Is always about a population parameter, not about a sample statistic
  Correct: H0: μ = 3
  Incorrect: H0: X̄ = 3
4
The Null Hypothesis, H0
(continued)
• Begins with the assumption that the null hypothesis is true
  • Similar to the notion of innocent until proven guilty
• Refers to the status quo
• Always contains the “=”, “≤” or “≥” sign
• May or may not be rejected
5
The Alternative Hypothesis, H1
• Is the opposite of the null hypothesis
  • e.g.: The average number of TV sets in U.S. homes is not equal to 3 ( H1: μ ≠ 3 )
• Challenges the status quo
• Never contains the “=”, “≤” or “≥” sign
• Is generally the hypothesis that is believed (or needs to be supported) by the researcher
6
Hypothesis Testing
• We assume the null hypothesis is true
• If the null hypothesis is rejected, the sample provides statistical evidence for the alternative hypothesis (it is supported at the chosen level of significance, not proven with certainty)
• If the null hypothesis is not rejected, we have proven nothing, as the sample size may have been too small
7
Hypothesis Testing Process
[Diagram: flow from population to sample to decision]
1. Claim: the population mean age is 50. (Null Hypothesis: H0: μ = 50)
2. Now select a random sample from the population.
3. Suppose the sample mean age is 20: X̄ = 20
4. Is X̄ = 20 likely if μ = 50? If not likely, REJECT the null hypothesis.
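Sampling Distribution of X̄
• There are two cutoff values (critical values), defining the regions of rejection
H0: μ = 50
H1: μ ≠ 50
[Figure: sampling distribution of X̄ centered at 50, with an area of α/2 in each tail. Below the lower critical value: reject H0. Between the lower and upper critical values (the likely sample results): do not reject H0. Above the upper critical value: reject H0. The value 20 is marked on the X̄ axis.]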
9
Level of Significance, α
• Defines the unlikely values of the sample statistic if the null hypothesis is true
  • Defines the rejection region of the sampling distribution
• Is designated by α (level of significance)
  • Typical values are .01, .05, or .10
• Is the complement of the confidence coefficient
• Is selected by the researcher before sampling
• Provides the critical value of the test
10
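The link between α and the critical value can be sketched in a few lines of Python. This is only an illustration: it assumes a Z (standard normal) test statistic, which the slide itself does not specify, and the function name `critical_values` and its `tail` labels are hypothetical. `statistics.NormalDist` is part of the Python standard library.

```python
from statistics import NormalDist

def critical_values(alpha, tail="two"):
    """Return the standard normal critical value(s) for a given alpha.

    tail is "two", "upper", or "lower" (labels chosen for this sketch).
    """
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "two":
        upper = z.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
        return (-upper, upper)
    if tail == "upper":
        return (z.inv_cdf(1 - alpha),)    # all of alpha in the upper tail
    if tail == "lower":
        return (z.inv_cdf(alpha),)        # all of alpha in the lower tail
    raise ValueError("tail must be 'two', 'upper', or 'lower'")

# Typical significance levels from the slide
for alpha in (0.01, 0.05, 0.10):
    lower, upper = critical_values(alpha)
    print(f"alpha = {alpha:.2f}: reject H0 if Z < {lower:.3f} or Z > {upper:.3f}")
```

A smaller α pushes the critical values further into the tails, so the rejection region shrinks.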
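Level of Significance and the Rejection Region
Level of significance = α
[Figure: three sampling distributions with the rejection region shaded; the boundary of each shaded region represents the critical value.]
• Two-tail test: H0: μ = 3, H1: μ ≠ 3. Rejection region of area α/2 in each tail.
• Upper-tail test: H0: μ ≤ 3, H1: μ > 3. Rejection region of area α in the upper tail.
• Lower-tail test: H0: μ ≥ 3, H1: μ < 3. Rejection region of area α in the lower tail.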
11
Errors in Making Decisions
• Type I Error
  • When a true null hypothesis is rejected
  • The probability of a Type I Error is α
    • Called the level of significance of the test
    • Set by the researcher in advance
• Type II Error
  • Failure to reject a false null hypothesis
  • The probability of a Type II Error is β
12
Example
Possible Jury Trial Outcomes

                 The Truth
Verdict          Innocent         Guilty
Innocent         No error         Type II Error
Guilty           Type I Error     No Error
13
Outcomes and Probabilities
Possible Hypothesis Test Outcomes
Key: Outcome (Probability)

                      Actual Situation
Decision              H0 True               H0 False
Do Not Reject H0      No error (1 - α)      Type II Error (β)
Reject H0             Type I Error (α)      No Error (1 - β)
14
Type I & II Error Relationship
• Type I and Type II errors cannot happen at the same time
  • Type I error can only occur if H0 is true
  • Type II error can only occur if H0 is false
• If the Type I error probability ( α ) increases, then the Type II error probability ( β ) decreases, all else being equal
15
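The trade-off between α and β can be checked numerically. The sketch below assumes a two-tail Z test with a known population standard deviation, and every number in it (hypothesized mean 3.0, true mean 2.8, σ = 0.8, n = 25) is hypothetical, chosen only for illustration. The function name `type_ii_error` is likewise an assumption of this sketch.

```python
import math
from statistics import NormalDist

def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """Beta for a two-tail Z test of H0: mu = mu0 when the true mean is mu_true."""
    se = sigma / math.sqrt(n)                     # standard error of the sample mean
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tail critical value
    # "Do not reject H0" region, expressed on the scale of the sample mean
    lower, upper = mu0 - z_crit * se, mu0 + z_crit * se
    # Actual sampling distribution of the sample mean when H0 is false
    sampling = NormalDist(mu_true, se)
    # Beta = probability the sample mean lands in the non-rejection region
    return sampling.cdf(upper) - sampling.cdf(lower)

for alpha in (0.01, 0.05, 0.10):
    beta = type_ii_error(alpha, mu0=3.0, mu_true=2.8, sigma=0.8, n=25)
    print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}, power = {1 - beta:.3f}")
```

With the sample size and the true mean held fixed, each increase in α narrows the non-rejection region, so β falls and the power (1 - β) rises.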
p-Value Approach to Testing
• p-value: Probability of obtaining a test statistic as extreme as or more extreme than ( ≤ or ≥ ) the observed sample value, given that H0 is true
  • Also called the observed level of significance
16
p-Value Approach to Testing
(continued)
• Convert the sample statistic (e.g. X̄) to a test statistic (e.g. t statistic)
• Obtain the p-value from a table or computer
• Compare the p-value with α
  • If p-value < α, reject H0
  • If p-value ≥ α, do not reject H0
17
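The "obtain the p-value, then compare with α" steps can be sketched as follows. For simplicity this sketch assumes the test statistic is a Z value, because the Python standard library has a normal distribution but no t distribution. The function names `p_value_from_z` and `decide`, and the observed statistic of -2.0, are hypothetical.

```python
from statistics import NormalDist

def p_value_from_z(z, tail="two"):
    """p-value for an observed Z statistic under H0."""
    cdf = NormalDist().cdf
    if tail == "lower":      # H1: parameter < hypothesized value
        return cdf(z)
    if tail == "upper":      # H1: parameter > hypothesized value
        return 1 - cdf(z)
    if tail == "two":        # H1: parameter != hypothesized value
        return 2 * cdf(-abs(z))
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

def decide(p_value, alpha):
    """Decision rule from the slide: reject H0 only if p-value < alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"

p = p_value_from_z(-2.0)  # hypothetical observed statistic, two-tail test
print(f"p-value = {p:.4f}: {decide(p, alpha=0.05)}")
```

Note that the same statistic gives a one-tail p-value half the size of the two-tail p-value, which is why the choice of tail has to be made before looking at the data.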
8 Steps in Hypothesis Testing
1. State the null hypothesis, H0, and the alternative hypothesis, H1
2. Choose the level of significance, α
3. Choose the sample size, n
4. Determine the appropriate test statistic to use
5. Collect the data
6. Compute the p-value for the test statistic from the sample result
7. Make the statistical decision: Reject H0 if the p-value is less than α
8. Express the conclusion in the context of the problem
18
Hypothesis Tests for the Mean
[Diagram: Hypothesis Tests for μ, split into two cases]
• σ Known
• σ Unknown
19
Hypothesis Testing Example
Test the claim that the true mean number of TV sets in U.S. homes is equal to 3.
1. State the appropriate null and alternative hypotheses
   H0: μ = 3
   H1: μ ≠ 3 (This is a two-tail test)
2. Specify the desired level of significance
   Suppose that α = .05 is chosen for this test
3. Choose a sample size
   Suppose a sample of size n = 100 is selected
20
Hypothesis Testing Example
(continued)
4. Determine the appropriate test
   σ is unknown, so this is a t test
5. Collect the data
   Suppose the sample results are n = 100, X̄ = 2.84, s = 0.8
6. So the test statistic is:
   $t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{2.84 - 3}{0.8 / \sqrt{100}} = \frac{-.16}{.08} = -2.0$
   The two-tail p-value for t = -2.0 with n - 1 = 99 degrees of freedom is .048
21
Hypothesis Testing Example
(continued)
7. Is the test statistic in the rejection region?
   Reject H0 if the p-value is < α; otherwise do not reject H0
   The p-value .048 is < α = .05, so we reject the null hypothesis
22
Hypothesis Testing Example
(continued)
8. Express the conclusion in the context of the problem
   Since the p-value .048 is < α = .05, we have rejected the null hypothesis, which supports the alternative hypothesis
   Conclusion: There is sufficient evidence that the mean number of TVs in U.S. homes is not equal to 3
   If we had failed to reject the null hypothesis, the conclusion would have been: There is not sufficient evidence to reject the claim that the mean number of TVs in U.S. homes is 3
23
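The worked example can be reproduced with a short Python sketch. The Python standard library has no t distribution, so this sketch integrates the t density numerically with Simpson's rule. That is an implementation choice made here, not the method used in the slides, and in practice a statistics package would normally supply the p-value. The function names `t_pdf`, `t_two_tail_p`, and `one_sample_t` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_two_tail_p(t, df, steps=2000):
    """Two-tail p-value P(|T| >= |t|); steps must be even for Simpson's rule."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3          # P(0 <= T <= |t|)
    return 2 * (0.5 - central_half)       # area in both tails

def one_sample_t(xbar, mu0, s, n):
    """t statistic and two-tail p-value for H0: mu = mu0 (sigma unknown)."""
    t = (xbar - mu0) / (s / math.sqrt(n))
    return t, t_two_tail_p(t, df=n - 1)

# Sample results from the example: n = 100, sample mean 2.84, s = 0.8
t_stat, p_value = one_sample_t(xbar=2.84, mu0=3, s=0.8, n=100)
alpha = 0.05
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"t = {t_stat:.2f}, p-value = {p_value:.3f}, decision: {decision}")
```

The result agrees with the slides: t is about -2.0 and the p-value is about .048, so H0 is rejected at α = .05.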
One Tail Tests
• In many cases, the alternative hypothesis focuses on a particular direction
  H0: μ ≥ 3
  H1: μ < 3
  This is a lower-tail test since the alternative hypothesis is focused on the lower tail below the mean of 3
  H0: μ ≤ 3
  H1: μ > 3
  This is an upper-tail test since the alternative hypothesis is focused on the upper tail above the mean of 3
24
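Lower Tail Tests
H0: μ ≥ 3
H1: μ < 3
• There is only one critical value, since the rejection area is in only one tail
[Figure: sampling distribution centered at 3 on the X̄ axis, with area α shaded in the lower tail. Reject H0 below the critical value -tα; do not reject H0 above it.]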
25
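Upper Tail Tests
H0: μ ≤ 3
H1: μ > 3
• There is only one critical value, since the rejection area is in only one tail
[Figure: sampling distribution centered at 3 on the X̄ axis, with area α shaded in the upper tail. Do not reject H0 below the critical value tα; reject H0 above it.]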
26
Assumptions of the One-Sample t Test
• The data are randomly selected
• The population is normally distributed, or the sample size is over 30 and the population is not highly skewed
27
Hypothesis Tests for Proportions
• Involves categorical values
• Two possible outcomes
  • “Success” (possesses a certain characteristic)
  • “Failure” (does not possess that characteristic)
• Fraction or proportion of the population in the “success” category is denoted by p
28
Proportions
(continued)
• Sample proportion in the success category is denoted by ps
  $p_s = \frac{X}{n} = \frac{\text{number of successes in sample}}{\text{sample size}}$
• When both np and n(1-p) are at least 5, the distribution of ps can be approximated by a normal distribution with mean and standard deviation
  $\mu_{p_s} = p, \qquad \sigma_{p_s} = \sqrt{\frac{p(1-p)}{n}}$
29
Hypothesis Tests for Proportions
• The sampling distribution of ps is approximately normal, so the test statistic is a Z value:
  $Z = \frac{p_s - p}{\sqrt{\frac{p(1-p)}{n}}}$
[Diagram: Hypothesis Tests for p, split into two cases]
• np ≥ 5 and n(1-p) ≥ 5: use the Z test statistic
• np < 5 or n(1-p) < 5: not discussed in this chapter
30
Example: Z Test for Proportion
A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses. Test at the α = .05 significance level.
Check:
n p = (500)(.08) = 40 ≥ 5
n(1-p) = (500)(.92) = 460 ≥ 5
31
Z Test for Proportion: Solution
H0: p = .08
H1: p ≠ .08
α = .05
n = 500, ps = .05
Critical Values: ± 1.96
[Figure: standard normal distribution with rejection regions of area .025 in each tail, beyond z = -1.96 and z = 1.96; the test statistic -2.47 falls in the lower rejection region.]
Test Statistic:
$Z = \frac{p_s - p}{\sqrt{\frac{p(1-p)}{n}}} = \frac{.05 - .08}{\sqrt{\frac{.08(1-.08)}{500}}} = -2.47$
p-value for -2.47 is .0134
Decision: Reject H0 at α = .05
Conclusion: There is sufficient evidence to reject the company’s claim of 8% response rate.
32
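The proportion example can be checked with a minimal Python sketch that uses only the standard library. The function name `z_test_proportion` is hypothetical. The sketch applies the np ≥ 5 and n(1-p) ≥ 5 check from the slides and the two-tail alternative H1: p ≠ .08.

```python
import math
from statistics import NormalDist

def z_test_proportion(successes, n, p0):
    """Two-tail Z test of H0: p = p0; returns (ps, z, p_value)."""
    # Normal approximation check from the slides
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("normal approximation not appropriate: need np >= 5 and n(1-p) >= 5")
    ps = successes / n                            # sample proportion
    std_error = math.sqrt(p0 * (1 - p0) / n)      # standard deviation of ps under H0
    z = (ps - p0) / std_error
    p_value = 2 * NormalDist().cdf(-abs(z))       # area in both tails
    return ps, z, p_value

# Mailing example: 25 responses out of 500, claimed response rate 8%
ps, z, p_value = z_test_proportion(successes=25, n=500, p0=0.08)
alpha = 0.05
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"ps = {ps:.2f}, Z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```

Both approaches agree here: the Z value lies beyond the critical value -1.96, and the p-value is below α = .05.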
Potential Pitfalls and Ethical Considerations
• Use randomly collected data to reduce selection biases
• Do not use human subjects without informed consent
• Choose the level of significance, α, before data collection
• Do not employ “data snooping” to choose between one-tail and two-tail tests, or to determine the level of significance
• Do not practice “data cleansing” to hide observations that do not support a stated hypothesis
• Report all pertinent findings
33
Summary
• Addressed hypothesis testing methodology
• Discussed critical value and p-value approaches to hypothesis testing
• Discussed Type I and Type II errors
• Performed a two-tail t test for the mean (σ unknown)
• Performed a Z test for the proportion
• Discussed one-tail and two-tail tests
• Addressed pitfalls and ethical issues
34