Transcript
```BA 275
Agenda
- Statistical Inference for Small Sample Size
- Statistical Inference for Two-Sample Problems
Quiz #4: Question 1
Past experience indicates that the monthly long-distance telephone
bill is normally distributed with a mean of \$17.85 and a
(population) standard deviation \$3.87. After an advertising
campaign aimed at increasing long-distance telephone usage, a
random sample of 25 household bills was taken. You are
concerned whether the campaign was successful, and would
like to perform a test to find out.
1. What are the null and the alternative hypotheses?
2. If the sample mean turns out to be \$15, do you reject the null
hypothesis? Why or why not? Assume α = 5%.
3. If the sample mean turns out to be \$29.13, do you reject the null
hypothesis? Why or why not? Assume α = 5%.
4. Finally, the actual sample mean in your sample is \$19.13. Do
you reject the null hypothesis? Was the campaign successful?
Assume α = 5%.
Quiz #4: Question 2
The Survey of Study Habits and Attitudes (SSHA) is a psychological test
that measures the motivation, attitude toward school, and study habits
of students. The mean score for U.S. college students is about 115,
and the standard deviation is about 30. A teacher suspects that older
students (30 years or older) have better attitudes toward school and
wishes to test H0: μ = 115 vs. Ha: μ > 115. To test the hypothesis, she
decides to use a sample of n = 25, and to reject the null hypothesis H0
if the sample mean > 126.
1. Find the probability of a Type I error.
2. Find the probability of a Type II error when μ = 135.
3. Find the power of the test when μ = 135.
4. Suppose the sample mean turns out to be 131.2. Find the p-value of
the test.
5. Construct a 95% confidence interval for the mean SSHA score for
older students. Write your final answer in the following format: ( point
estimate ) ± ( margin of error )
Central Limit Theorem (CLT)
- σ is unknown but n is large
If X ~ N(μ, σ²), then X̄ ~ N(μ, σ²/n) ≈ N(μ, s²/n).
If X ~ any distribution with a mean μ and variance σ²,
then X̄ ~ N(μ, σ²/n) ≈ N(μ, s²/n), given that n is large.
Central Limit Theorem (CLT)
- σ is unknown and n is small
If X ~ N(μ, σ²), then X̄ ~ N(μ, σ²/n).
With s in place of σ: X̄ ~ t(μ, s²/n) with degrees of freedom = n – 1.
If X ~ any distribution with a mean μ and variance σ²,
then X̄ ~ N(μ, σ²/n), given that n is large.
[Figure: t distribution with 5 degrees of freedom vs. Normal(0, 1)]
Example 1 (Example 3 from 2-7-07)
- A random sample of 10 one-bedroom apartments (Ouch, a small
sample) from your local newspaper gives a sample mean of
\$541.5 and sample standard deviation of \$69.16. Assume α =
5%.
Q1. Does the sample give good reason to believe that the mean
rent of all advertised apartments is greater than \$500 per
month? (Need H0 and Ha, rejection region and conclusion.)
Q2. Find the p-value.
Q3. Construct a 95% confidence interval for the mean rent of all
advertised apartments.
Q4. What assumption is necessary to answer Q1-Q3?
Example 2
- A bank wonders whether omitting the annual credit card fee for
customers who charge at least \$2400 in a year would increase
the amount charged on its credit card the following year. A
random sample of 51 customers is chosen to see if the mean
amount charged increases from the previous year under the
no-fee offer. The mean increase is \$342 and the standard deviation
is \$108.
- Q1. Let α = 5%. State H0 and Ha and carry out a t test.
Approximate the p-value.
- Q2. Give a 95% confidence interval for the mean amount of the
increase.
- Q3. Suppose that the bank wanted to be quite certain of
detecting a mean increase of μ = \$100 in the amount charged.
Is n = 51 enough to detect the increase?
SG Demo
Two-Sample Inference on μ1 – μ2
Population #1 (mean μ1) -> Sample #1 (sample mean X̄1)
Population #2 (mean μ2) -> Sample #2 (sample mean X̄2)
Parameter of interest: μ1 – μ2, estimated by X̄1 – X̄2
Example 3
              Sample size   Sample mean   Sample Std.
U.S. Sales    30            \$14,545      \$1,989
Japan Sales   50            \$15,243      \$1,842
Q1. Do we have enough evidence to claim that the auto retail price in
Japan is higher?
Q2. If so, by how much?
Example 4
Do government employees take longer coffee breaks than private
sector workers?
Summary Statistics: Number of Minutes of Coffee Breaks
                         Sample Mean   Sample Std.   Sample Size
Government Workers       25.83         6.31          6
Private Sector Workers   19.83         3.06          6
- Example 1: see the slides from 2-7-2007.
- Example 2:
- Q1. H0: μ = 0 vs. Ha: μ > 0. Given α = 5% and df = n
– 1 = 50, the rejection region is defined as: Reject H0
if t > 1.676 (or if the sample mean > 1.676 x 108 / SQRT(51) = 25.3462).
Since t = (342 – 0) / (108 / SQRT(51)) = 22.6145 > 1.676, we should reject H0.
- Q2. 342 ± 2.009 x 108 / SQRT(51) = 342 ± 30.38
- Q3. Power = P( detecting a mean increase of \$100 ) =
P( being able to reject H0 when true μ = \$100 ) = P(
the sample mean > 25.3462 when μ = \$100 ) =
1.0000. Note that 25.3462 came from the rejection
region in Q1.
T Table (Table D)
```
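The five parts of Quiz #4, Question 2 all follow from the sampling distribution of the sample mean when σ is treated as known. The sketch below is one way to check the arithmetic, not the course's official solution. It uses only Python's standard library, and the variable names are illustrative. It assumes the confidence interval in part 5 is centered on the observed sample mean of 131.2 and uses the known σ = 30.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from Quiz #4, Question 2.
mu0 = 115.0      # mean under H0
sigma = 30.0     # population standard deviation (treated as known)
n = 25           # sample size
cutoff = 126.0   # decision rule: reject H0 if the sample mean > 126
mu_alt = 135.0   # alternative mean used for Type II error and power
xbar = 131.2     # observed sample mean (parts 4 and 5)

se = sigma / sqrt(n)   # standard error of the sample mean
z = NormalDist()       # standard normal distribution

# Part 1: P(reject H0 | mu = 115)
type_i = 1 - z.cdf((cutoff - mu0) / se)
# Part 2: P(fail to reject H0 | mu = 135)
type_ii = z.cdf((cutoff - mu_alt) / se)
# Part 3: power = 1 - P(Type II error)
power = 1 - type_ii
# Part 4: one-sided p-value for the observed sample mean
p_value = 1 - z.cdf((xbar - mu0) / se)
# Part 5: margin of error for a 95% confidence interval (assumed z-based)
margin = z.inv_cdf(0.975) * se

print(f"Type I error  = {type_i:.4f}")
print(f"Type II error = {type_ii:.4f}")
print(f"Power         = {power:.4f}")
print(f"p-value       = {p_value:.4f}")
print(f"95% CI        = {xbar} ± {margin:.2f}")
```

The same pattern applies to Question 1 by swapping in its mean, standard deviation, and sample means.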
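Examples 3 and 4 in the transcript give only summary statistics, and the slides do not say whether a pooled or an unpooled procedure is intended. The sketch below assumes the unpooled (Welch) two-sample t statistic with the Welch-Satterthwaite degrees of freedom. The helper name `welch_t` and the direction of each subtraction are choices made here for illustration. The resulting statistic still has to be compared against a t table such as Table D.

```python
from math import sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t statistic, approximate df) for H0: mu1 - mu2 = 0.

    Assumption: unpooled variances (Welch), which the slides do not confirm.
    """
    v1 = sd1 ** 2 / n1          # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2          # estimated variance of the second sample mean
    se = sqrt(v1 + v2)          # standard error of the difference in means
    t = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Example 3: Japan minus U.S. (Ha: the Japan mean is higher)
t3, df3 = welch_t(15243, 1842, 50, 14545, 1989, 30)

# Example 4: government minus private (Ha: government breaks are longer)
t4, df4 = welch_t(25.83, 6.31, 6, 19.83, 3.06, 6)

print(f"Example 3: t = {t3:.3f}, df = {df3:.1f}")
print(f"Example 4: t = {t4:.3f}, df = {df4:.1f}")
```

A pooled-variance test would use a different standard error and n1 + n2 – 2 degrees of freedom, so its results can differ, especially in Example 4 where the two sample standard deviations are far apart.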