Introduction to Hypothesis Testing

Lecture 2: Thu, Jan 16
• Hypothesis Testing – Introduction (Ch 11)
• Concepts of testing
• Tests of Hypothesis (Sigma known)
– Rejection Region method
– P-value method
– Two-tail test example
• Relationship between tests and C.I.
Introduction
• The purpose of hypothesis testing is to determine
whether there is enough statistical evidence in favor of
a certain belief about a parameter.
• Examples
– Does the statistical evidence in a random sample of potential
customers support the hypothesis that more than 10% of the
potential customers will purchase a new product?
– Is a new drug effective in curing a certain disease? A sample of
patients is randomly selected. Half of them are given the drug
while the other half are given a placebo. The improvement in the
patients’ condition is then measured and compared.
Hypothesis Testing in the Courtroom
• Null hypothesis: The defendant is innocent
• Alternative (research) hypothesis: The defendant is
guilty
• The goal of the procedure is to determine whether there
is enough evidence to conclude that the alternative
hypothesis is true. The burden of proof is on the
alternative hypothesis.
• Two types of errors:
– Type I error: Reject null hypothesis when null hypothesis is
true (convict an innocent defendant)
– Type II error: Do not reject null hypothesis when null is false
(fail to convict a guilty defendant)
Concepts of Hypothesis Testing
• The critical concepts of hypothesis testing.
– Example 11.1
• The manager of a department store is thinking about
establishing a new billing system for the store’s credit customers
• The new system will be cost effective only if the mean monthly
account (μ) is more than $170.
– There are two hypotheses about a population mean:
• H0: The null hypothesis, μ = 170
• H1: The alternative hypothesis, μ > 170 (what you want to prove)
• Assume the null hypothesis is true (μ = 170).
– Sample from the customer population, and build a statistic
related to the parameter hypothesized (the sample mean).
– Pose the question: How probable is it to obtain a
sample mean at least as extreme as the one observed
from the sample, if H0 is correct?
• Assume the null hypothesis is true (μ = 170).
• Common sense suggests the following.
– If the sample mean x̄ is much larger than 170, then the mean μ is
likely to be greater than 170. Reject the null hypothesis.
– When the sample mean is close to 170, it is not
implausible that the mean μ is 170. Do not reject the
null hypothesis.
Types of Errors
• Two types of errors may occur when deciding whether to
reject H0 based on the statistic value.
– Type I error: Reject H0 when it is true.
– Type II error: Do not reject H0 when it is false.
• Example continued
– Type I error: Reject H0 (μ = 170) in favor of H1 (μ > 170)
when the real value of μ is 170.
– Type II error: Do not reject H0 (μ = 170) when
the real value of μ is greater than 170.
Testing the Population Mean When the
Population Standard Deviation is Known
• Example 11.1
– A new billing system for a department store will be cost-effective
only if the mean monthly account is more than $170.
– A sample of 400 accounts has a mean of $178.
– If accounts are approximately normally distributed with
σ = $65, can we conclude that the new system will be
cost effective?
Testing the Population Mean (σ is Known)
• Example 11.1 – Solution
– The population of interest is the credit accounts at
the store.
– We want to know whether the mean account for all
customers is greater than $170.
H1: μ > 170
– The null hypothesis must specify a single value of
the parameter μ,
H0: μ = 170
Approaches to Testing
• There are two approaches to testing whether the
sample mean supports the alternative
hypothesis (H1):
– The rejection region method, which is needed when
testing by hand (and can also be used when testing is
supported by statistical software).
– The p-value method, which is mostly used when
statistical software is available.
The Rejection Region Method
The rejection region is a range of values such
that if the test statistic falls into that range, the
null hypothesis is rejected in favor of the
alternative hypothesis.
The Rejection Region Method for a Right-Tail Test
Example 11.1 – solution continued
• Recall:
H0: μ = 170
H1: μ > 170
• Therefore, it seems reasonable to reject the null hypothesis and
believe that μ > 170 if the sample mean is sufficiently large.
[Figure: the region to the right of the critical value of the sample mean is labeled "Reject H0 here".]
The Rejection Region Method for a Right-Tail Test
Example 11.1 – solution continued
• Define a critical value x̄_L for x̄ that is just large enough
to reject the null hypothesis.
• Reject the null hypothesis if
x̄ > x̄_L
Determining the Critical Value for the
Rejection Region
• Let the probability of committing a Type I error
be α (also called the significance level).
• Find the value of the sample mean that is just
large enough so that the actual probability of
committing a Type I error does not exceed α.
Determining the Critical Value for a Right-Tail Test
Example 11.1 – solution continued
P(commit a Type I error) = P(reject H0 given that H0 is true)
= P(x̄ > x̄_L given that H0 is true), which is allowed to be α.
Since P(Z > z_α) = α, we have:
$$ z_\alpha = \frac{\bar{x}_L - 170}{65 / \sqrt{400}} $$
[Figure: sampling distribution of x̄ with mean μ_x̄ = 170; the area to the right of x̄_L is α.]
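Determining the Critical Value for a Right-Tail Test
Example 11.1 – solution continued
Solving for the critical value:
$$ z_\alpha = \frac{\bar{x}_L - 170}{65 / \sqrt{400}} \quad\Longrightarrow\quad \bar{x}_L = 170 + z_\alpha \frac{65}{\sqrt{400}} $$
If we select α = 0.05, z_.05 = 1.645, so
$$ \bar{x}_L = 170 + 1.645 \cdot \frac{65}{\sqrt{400}} \approx 175.34 $$
[Figure: sampling distribution of x̄ with mean μ_x̄ = 170; the area to the right of x̄_L is α = 0.05.]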
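The critical value calculation can be checked numerically. The following is a minimal sketch using only Python's standard library; the function name critical_value_right_tail and its parameter names are illustrative and do not come from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def critical_value_right_tail(mu0, sigma, n, alpha):
    """Smallest sample mean that rejects H0: mu = mu0 in a right-tail z-test.

    Hypothetical helper for illustration; assumes sigma is known.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # value with P(Z > z_alpha) = alpha
    return mu0 + z_alpha * sigma / sqrt(n)


# Example 11.1: mu0 = 170, sigma = 65, n = 400, alpha = 0.05
x_L = critical_value_right_tail(170, 65, 400, 0.05)
print(f"critical value of the sample mean: {x_L:.3f}")
```

With these inputs the result is roughly 175.35, which agrees with the 175.34 used in the lecture to within rounding.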
Determining the Critical Value for a Right-Tail Test
Reject the null hypothesis if
x̄ > 175.34
Conclusion
Since the sample mean (178) is greater than
the critical value of 175.34, there is sufficient
evidence to infer that the mean monthly
balance is greater than $170 at the 5%
significance level.
The standardized test statistic
– Instead of using the statistic x̄, we can use the
standardized value z:
$$ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} $$
– Then, for a one-tail test, the rejection region becomes
z > z_α
The standardized test statistic
• Example 11.1 - continued
– We redo this example using the standardized test
statistic.
Recall: H0: μ = 170
H1: μ > 170
– Test statistic:
$$ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{178 - 170}{65 / \sqrt{400}} = 2.46 $$
– Rejection region: z > z_.05 = 1.645.
The standardized test statistic
• Example 11.1 - continued
Reject the null hypothesis if
Z > 1.645
Conclusion
Since Z = 2.46 > 1.645, reject the null
hypothesis in favor of the alternative
hypothesis.
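The standardized version of the test can be sketched in a few lines of Python. The helper name z_statistic is an assumption made for illustration; only the numbers come from Example 11.1.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(x_bar, mu0, sigma, n):
    """Standardized test statistic for a mean when sigma is known."""
    return (x_bar - mu0) / (sigma / sqrt(n))


z = z_statistic(178, 170, 65, 400)        # Example 11.1
z_crit = NormalDist().inv_cdf(1 - 0.05)   # z_.05 for a right-tail test
reject_h0 = z > z_crit
print(f"z = {z:.2f}, critical z = {z_crit:.3f}, reject H0: {reject_h0}")
# prints: z = 2.46, critical z = 1.645, reject H0: True
```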
P-value Method
– The p-value provides information about the amount of
statistical evidence that supports the alternative
hypothesis.
– The p-value of a test is the probability of observing a
test statistic at least as extreme as the one computed,
given that the null hypothesis is true.
– Let us demonstrate the concept on Example 11.1
P-value Method
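The probability of observing a
test statistic at least as extreme as 178,
given that μ = 170, is
$$ P(\bar{x} > 178 \text{ when } \mu = 170) = P\left(z > \frac{178 - 170}{65 / \sqrt{400}}\right) = P(z > 2.4615) = .0069 $$
[Figure: sampling distribution of x̄ with mean μ_x̄ = 170; the p-value is the area to the right of x̄ = 178.]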
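The p-value for Example 11.1 can be reproduced with the standard normal CDF. This is a small sketch using Python's standard library; the function name p_value_right_tail is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def p_value_right_tail(x_bar, mu0, sigma, n):
    """P(sample mean >= x_bar) when H0: mu = mu0 is true and sigma is known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    return 1 - NormalDist().cdf(z)  # upper-tail area beyond z


p = p_value_right_tail(178, 170, 65, 400)  # Example 11.1
print(f"p-value = {p:.4f}")
```

The printed value should be close to the .0069 given in the lecture.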
Interpreting the p-value
• Because the probability that the sample mean will
assume a value of more than 178 when μ = 170 is so
small (.0069), there are reasons to believe that
μ > 170.
• In addition note that observing a value of 178 when the
true mean is 170 is rare, but under the alternative
hypothesis, observing a value of 178 becomes more
probable.
• We can conclude that the smaller the p-value the more
statistical evidence exists to support the alternative
hypothesis.
Interpreting the p-value
• Describing the p-value
– If the p-value is less than 1%, there is overwhelming
evidence that supports the alternative hypothesis.
– If the p-value is between 1% and 5%, there is strong
evidence that supports the alternative hypothesis.
– If the p-value is between 5% and 10%, there is weak
evidence that supports the alternative hypothesis.
– If the p-value exceeds 10%, there is no evidence that
supports the alternative hypothesis.
The p-value and the Rejection Region
Methods
– The p-value can be used when making decisions
based on rejection region methods as follows:
• Define the hypotheses to test, and the required
significance level α.
• Perform the sampling procedure, calculate the test
statistic and the p-value associated with it.
• Compare the p-value to α. Reject the null hypothesis only
if p-value < α; otherwise, do not reject the null hypothesis.
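The decision rule can be sketched as a short loop that applies both rules to several sample means. In the sketch below, 172 and 176 are hypothetical sample means added for illustration; only 178 comes from Example 11.1, and the function name is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def right_tail_z_test(x_bar, mu0, sigma, n, alpha):
    """Return (z, p_value, reject) for H0: mu = mu0 vs H1: mu > mu0."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value, p_value < alpha


alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha)
decisions = {}
for x_bar in (172, 176, 178):  # 172 and 176 are hypothetical; 178 is from Example 11.1
    z, p_value, reject = right_tail_z_test(x_bar, 170, 65, 400, alpha)
    # The p-value rule and the rejection region rule should agree.
    assert reject == (z > z_crit)
    decisions[x_bar] = reject
    print(f"x_bar = {x_bar}: p-value = {p_value:.4f}, reject H0: {reject}")
```

For each sample mean, comparing the p-value with α gives the same decision as comparing z with the critical value, which is why the two methods can be used interchangeably.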
Conclusions of a Test of Hypothesis
• If we reject the null hypothesis, we conclude that
there is enough evidence to infer that the alternative
hypothesis is true.
• If we do not reject the null hypothesis, we conclude
that there is not enough statistical evidence to infer
that the alternative hypothesis is true.
• Remember the truth of the alternative hypothesis is
what we are investigating. The conclusion focuses
on the validity of the alternative hypothesis.
A Two-Tail Test
• Example 11.2
– AT&T has been challenged by competitors who argued that
their rates resulted in lower bills. A statistics practitioner
determines that the mean and standard deviation of monthly
long-distance bills for all AT&T residential customers are
$17.09 and $3.87 respectively.
A random sample of 100 customers is selected and
customers’ bills recalculated using a leading competitor’s
rates. The sample mean of customers’ bills is $17.55.
Assuming the standard deviation is the same (3.87), can we
infer at the 5% significance level that there is a difference
between AT&T’s bills and the competitor’s bills (on the
average)?
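One way to carry out this test is sketched below, assuming the usual two-tail z-test with H0: μ = 17.09 and H1: μ ≠ 17.09. The hypotheses are inferred from the wording of the example, and the function name two_tail_z_test is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_tail_z_test(x_bar, mu0, sigma, n, alpha):
    """Return (z, p_value, reject) for H0: mu = mu0 vs H1: mu != mu0."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    # In a two-tail test, values far from mu0 in either direction are extreme.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


z, p_value, reject = two_tail_z_test(17.55, 17.09, 3.87, 100, alpha=0.05)
print(f"z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject}")
```

Here z is about 1.19, which is below the two-tail critical value z_.025 = 1.96, so under these assumptions the null hypothesis is not rejected at the 5% level.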
Two-Tail Tests and C.I.
• Note that both tests and C.I. are computed based on
the sampling distribution of the mean. To illustrate, in Example 11.2 the
95% C.I. for the population mean is [16.79, 18.31], which
includes 17.09.
• Thus we cannot conclude that there is sufficient
evidence to infer that the population mean differs from
17.09.
• Use of a C.I. has the advantage of simplicity but has two
important drawbacks:
– Lack of correspondence to one-tail tests
– No p-value type information
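The interval quoted above can be reproduced from the Example 11.2 figures (sample mean 17.55, σ = 3.87, n = 100). This is a minimal sketch; the function name z_confidence_interval is an assumption for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for a population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


low, high = z_confidence_interval(17.55, 3.87, 100)  # Example 11.2
print(f"[{low:.2f}, {high:.2f}] contains 17.09: {low <= 17.09 <= high}")
# prints: [16.79, 18.31] contains 17.09: True
```

Because 17.09 lies inside the 95% interval, the two-tail test at the 5% level does not reject H0: μ = 17.09.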
Problem 11.54
Many Alpine ski centers base their projections of revenues
and profits on the assumption that the average Alpine
skier skis 4 times per year.
To investigate the validity of this assumption, a random
sample of 63 skiers is drawn and each is asked to report
the number of times they skied the previous year.
Assume that the population standard deviation is 2, and
the sample mean is 4.84. Can we infer at the 10% level
that the assumption is wrong?
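A possible setup for this problem, using the rejection region method for a two-tail test, is sketched below. It assumes H0: μ = 4 against H1: μ ≠ 4, which is one reading of "the assumption is wrong"; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Problem 11.54 inputs, with assumed hypotheses H0: mu = 4, H1: mu != 4
mu0, sigma, n, x_bar, alpha = 4, 2, 63, 4.84, 0.10

z = (x_bar - mu0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # the two tails share alpha, so use alpha / 2
reject = abs(z) > z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

Under these assumptions z is about 3.33, well beyond the critical value of 1.645, so the sketch rejects H0 at the 10% level.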
• Suggested Problems: 11.6,11.42, 11.44
• Next Time: Finish Chapter 11 (Section 11.4),
Begin Chapter 12