Interval Estimation II

Summer 2004
a) The amount of rent paid by students in a large class follows a Normal
distribution with population mean µ = €70 and population standard
deviation σ = €3.5.
(i) What range of values contains approximately 95% of all rents paid by
students in this class?
[2 Marks]
(ii) Write down the approximate sampling distribution of the means of all
possible samples of size 100 drawn randomly from this class.
[2 Marks]
(iii) If 500 such samples were chosen at random and for each sample you
calculated a 90% confidence interval for the true mean, how many such
intervals would you expect to contain this population mean?
[2 Marks]
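One way to check the arithmetic behind parts (i) to (iii) is sketched below. It assumes the usual Normal results: the central 95% lies within about 1.96σ of µ, sample means have standard error σ/√n, and each 90% interval covers µ with probability 0.90. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 70.0, 3.5        # population mean and sd from the question (euro)
n = 100                      # sample size in part (ii)

# (i) Central 95% of a Normal population: mu +/- z * sigma, with z close to 1.96.
z95 = NormalDist().inv_cdf(0.975)
rent_range = (mu - z95 * sigma, mu + z95 * sigma)

# (ii) Sample means are Normal with mean mu and standard error sigma / sqrt(n).
standard_error = sigma / sqrt(n)

# (iii) Each 90% interval covers mu with probability 0.90.
expected_covering = 0.90 * 500

print(f"(i)   about {rent_range[0]:.1f} to {rent_range[1]:.1f}")
print(f"(ii)  Normal(mean={mu}, sd={standard_error:.2f})")
print(f"(iii) about {expected_covering:.0f} of 500 intervals")
```

Using the rounder rule of thumb µ ± 2σ in part (i) gives €63 to €77, which is close to the range computed here.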
Sampling distribution of the Mean II
Summer 2002
(b) The director of quality at a light bulb factory needs to estimate the
average life of a large shipment of light bulbs. A random sample of 100
light bulbs indicated a sample average life of 350 hours with a sample
standard deviation of 100 hours.
i) Construct and interpret a 95% confidence interval estimate of the
true average life of light bulbs in this shipment.
[8 Marks]
ii) Do you think the manufacturer has the right to state that the light
bulbs last an average of 380 hours? Explain.
[2 Marks]
iii) Does the population have to be normally distributed here for the
interval to be valid? Explain.
[2 Marks]
iv) Explain why an observed value of 320 hours is not unusual, even
though it is outside the 95% confidence interval you have
calculated.
[2 Marks]
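A minimal sketch of the interval in part i) follows. It assumes the large-sample Normal critical value (about 1.96) is acceptable because n = 100; using Student's t with 99 df (about 1.98) would widen the interval only slightly.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, s = 100, 350.0, 100.0   # sample size, mean and sd from the question (hours)

z = NormalDist().inv_cdf(0.975)  # about 1.96 for 95% confidence
margin = z * s / sqrt(n)         # margin of error = z * s / sqrt(n)
ci = (xbar - margin, xbar + margin)

print(f"95% CI: {ci[0]:.1f} to {ci[1]:.1f} hours")
print("380 inside interval:", ci[0] <= 380 <= ci[1])
```

The interval describes the mean life, not individual bulbs: a single bulb lasting 320 hours is only 0.3 sample standard deviations below the sample mean, which is why it is not unusual in part iv).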
What happens if you can only take a “small” sample?
n < 30, population Normal:
$\bar{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
Student’s t-distribution
Density curves for Student’s t
[Figure 7.6.1: Student(df) density curves for various df (df = 2, df = 5, and df = ∞, i.e., Normal(0,1)), plotted from -4 to 4.]
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.
The Student’s t-distribution
• is mound shaped and centered at zero like
the Normal(0,1) but is more variable
• depends on the degrees of freedom df, which
is equal to n-1
• becomes more and more like the
Normal(0,1) distribution as df becomes larger
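The convergence of Student's t towards the Normal(0,1) can be checked numerically. The sketch below is one possible approach, not a library routine: it integrates the t density with Simpson's rule and finds the upper 2.5% point by bisection. The helper names (`t_pdf`, `t_upper_tail`, `t_critical`) are made up for this illustration.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(upper_prob, df):
    """Find t with P(T > t) = upper_prob by bisection."""
    lo, hi = 0.0, 1.0
    while t_upper_tail(hi, df) > upper_prob:   # widen until the root is bracketed
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > upper_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z = NormalDist().inv_cdf(0.975)
for df in (2, 5, 15, 100):
    print(f"df = {df:3d}: t = {t_critical(0.025, df):.3f}")
print(f"Normal  : z = {z:.3f}")
```

The printed critical values shrink towards the Normal value as df grows, which is the "more variable for small df" behaviour described in the bullets.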
Reading Student’s t table
t Tables
TABLE 7.6.1 Extracts from the Student's t-Distribution Table
Entries are the t-values t_df(prob) that cut off the desired upper-tail probability (prob) under the Student(df) density.

        Desired upper-tail prob
df      .20     .15     .10     .05     .025    .01     .005    .001    .0005   .0001
6       0.906   1.134   1.440   1.943   2.447   3.143   3.707   5.208   5.959   8.025
7       0.896   1.119   1.415   1.895   2.365   2.998   3.499   4.785   5.408   7.063
8       0.889   1.108   1.397   1.860   2.306   2.896   3.355   4.501   5.041   6.442
…       …       …       …       …       …       …       …       …       …       …
10      0.879   1.093   1.372   1.812   2.228   2.764   3.169   4.144   4.587   5.694
…       …       …       …       …       …       …       …       …       …       …
15      0.866   1.074   1.341   1.753   2.131   2.602   2.947   3.733   4.073   4.880
…       …       …       …       …       …       …       …       …       …       …
∞       0.842   1.036   1.282   1.645   1.960   2.326   2.576   3.090   3.291   3.719
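Reading the table for a confidence interval means converting the confidence level into an upper-tail probability, (1 - level)/2, and then picking the row for the right df. The sketch below encodes only the rows of the extract above; the function name `t_from_table` is hypothetical and anything not tabulated raises an error rather than being guessed.

```python
from math import inf

# Column headings and rows copied from the extract of Table 7.6.1.
UPPER_TAIL_PROBS = (0.20, 0.15, 0.10, 0.05, 0.025, 0.01, 0.005, 0.001, 0.0005, 0.0001)
T_TABLE = {
    6:   (0.906, 1.134, 1.440, 1.943, 2.447, 3.143, 3.707, 5.208, 5.959, 8.025),
    7:   (0.896, 1.119, 1.415, 1.895, 2.365, 2.998, 3.499, 4.785, 5.408, 7.063),
    8:   (0.889, 1.108, 1.397, 1.860, 2.306, 2.896, 3.355, 4.501, 5.041, 6.442),
    10:  (0.879, 1.093, 1.372, 1.812, 2.228, 2.764, 3.169, 4.144, 4.587, 5.694),
    15:  (0.866, 1.074, 1.341, 1.753, 2.131, 2.602, 2.947, 3.733, 4.073, 4.880),
    inf: (0.842, 1.036, 1.282, 1.645, 1.960, 2.326, 2.576, 3.090, 3.291, 3.719),
}

def t_from_table(confidence, df):
    """Two-sided critical value: look up upper-tail prob (1 - confidence) / 2."""
    upper_prob = (1 - confidence) / 2
    for column, prob in enumerate(UPPER_TAIL_PROBS):
        if abs(prob - upper_prob) < 1e-9:
            return T_TABLE[df][column]      # KeyError if df is not in the extract
    raise ValueError(f"upper-tail probability {upper_prob} is not tabulated")

# A 95% interval with n = 11 observations uses df = 10 and the .025 column.
print(t_from_table(0.95, 10))
```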
Comparing t and Z values
Confidence level    t value with 5 d.f.    Z value
90%                 2.015                  1.65
95%                 2.571                  1.96
99%                 4.032                  2.58
For small samples, the t value is larger than the Z value;
hence the t interval is wider than the Z interval.
100(1-α)% ‘Small Sample’ Confidence Interval for µ
In repeated sampling, 100(1-α)% of
intervals calculated in this manner,
$\bar{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
(with n-1 df), will contain µ.
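The repeated-sampling claim can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is Normal with µ = 70 and σ = 3.5 (borrowed from the rent question), samples have n = 6 so that df = 5, and the 95% critical values 2.571 (t) and 1.96 (Z) are the ones quoted in the comparison of t and Z values. The seed and number of repetitions are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(1)                  # fixed seed so the run is repeatable
mu, sigma = 70.0, 3.5           # assumed Normal population
n, reps = 6, 4000               # small samples, so df = n - 1 = 5
t_crit, z_crit = 2.571, 1.96    # 95% critical values: t with 5 df, and Z

hits_t = hits_z = 0
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = mean(sample)
    se = stdev(sample) / sqrt(n)            # estimated standard error s / sqrt(n)
    hits_t += abs(xbar - mu) <= t_crit * se  # does the t interval contain mu?
    hits_z += abs(xbar - mu) <= z_crit * se  # does the Z interval contain mu?

coverage_t = hits_t / reps
coverage_z = hits_z / reps
print(f"t intervals containing mu: {coverage_t:.3f}")
print(f"Z intervals containing mu: {coverage_z:.3f}")
```

The t intervals should contain µ in roughly 95% of samples, while the narrower Z intervals fall short of the nominal level when s is estimated from only six observations.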
Assumptions and Conditions
– The data arise from a random sample or
suitably randomized experiment. Randomly
sampled data (particularly from an SRS) are
ideal.
– The data are from a population that follows a
Normal model
• Beware skewed data
• Beware outliers
µ = 4.36
n = 20, x̄ = 4.6, s = 3.75
95% C.I. for µ?
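A worked version of this example might look as follows. The critical value 2.093 (t with 19 df, upper-tail probability .025) is an assumption taken from a full t table, since df = 19 is not in the extract shown earlier.

```python
from math import sqrt

n, xbar, s = 20, 4.6, 3.75     # summary statistics from the example
mu = 4.36                      # population mean quoted in the example
t_crit = 2.093                 # assumed: t with 19 df, upper-tail prob .025

margin = t_crit * s / sqrt(n)  # t * s / sqrt(n)
ci = (xbar - margin, xbar + margin)

print(f"95% CI for mu: {ci[0]:.2f} to {ci[1]:.2f}")
print("contains mu = 4.36:", ci[0] <= mu <= ci[1])
```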
Dates: 1359, 1260, 1344, 1249, 1220, 1205, 1217, 1228, 1155, 1315, 1311, 1271
[Plot of Dates; axis labels 1150, 1250, 1350]
n = 12, x̄ = 1261, s = 61.2
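The summary statistics for the dates can be reproduced, and a 95% interval built from them, as sketched below. Two assumptions are made: the values 1150, 1250 and 1350 are read as axis labels of a plot rather than data (the remaining twelve values do reproduce n = 12, mean 1261 and s = 61.2), and the critical value 2.201 (t with 11 df, upper-tail probability .025) comes from a full t table.

```python
from math import sqrt
from statistics import mean, stdev

# The twelve dates, excluding 1150, 1250 and 1350 (assumed to be axis labels).
dates = [1359, 1260, 1344, 1249, 1220, 1205, 1217, 1228, 1155, 1315, 1311, 1271]

n = len(dates)
xbar = mean(dates)
s = stdev(dates)               # sample standard deviation (divisor n - 1)
t_crit = 2.201                 # assumed: t with 11 df, upper-tail prob .025

margin = t_crit * s / sqrt(n)
ci = (xbar - margin, xbar + margin)

print(f"n = {n}, mean = {xbar:.0f}, s = {s:.1f}")
print(f"95% CI for mu: {ci[0]:.0f} to {ci[1]:.0f}")
```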
So What Do We Know?
• We now have techniques for inference
about a mean from small samples. We can
create confidence intervals and test
hypotheses.
• For small samples from a Normal population, the standardized
sample mean (with s in place of σ) follows Student’s
t-distribution and not the Normal.
• The t-model is a family of distributions
indexed by degrees of freedom.
Are You Normal? How Can You Tell?
• When you actually have your own data,
you must check to see whether a Normal
model is reasonable.
• Looking at a histogram of the data is a
good way to check that the underlying
distribution is roughly unimodal and
symmetric.
Are You Normal? (cont.)
• A more specialized graphical display that
can help you decide whether a Normal
model is appropriate is the Normal
probability plot.
• If the distribution of the data is roughly
Normal, the Normal probability plot
approximates a diagonal straight line.
Deviations from a straight line indicate that
the distribution is not Normal.
Are You Normal? (cont.)
• Nearly Normal data have a histogram and
a Normal probability plot that look
somewhat like this example:
[Figure: histogram and Normal probability plot of nearly Normal data]
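The coordinates of a Normal probability plot can be computed without any plotting library. The sketch below pairs each sorted observation with a Normal(0,1) quantile and summarizes straightness by a correlation. The plotting position (i - 0.5)/n is one common convention and is an assumption here, as are the function names; the sample is the set of twelve dates from the earlier example, with 1150, 1250 and 1350 treated as axis labels.

```python
from math import sqrt
from statistics import NormalDist, mean

def normal_plot_points(data):
    """Pair each sorted value with the Normal(0,1) quantile for its rank."""
    n = len(data)
    ordered = sorted(data)
    # (i - 0.5) / n is one common choice of plotting position (an assumption).
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

def straightness(points):
    """Correlation of the two coordinates; values near 1 suggest a straight line."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

sample = [1359, 1260, 1344, 1249, 1220, 1205, 1217, 1228, 1155, 1315, 1311, 1271]
points = normal_plot_points(sample)
for q, value in points:
    print(f"{q:6.2f}  {value}")
print(f"correlation: {straightness(points):.3f}")
```

A correlation close to 1 is consistent with a roughly straight plot, but it is only a rough summary; looking at the plotted points remains the check the slides recommend.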
Are You Normal? (cont.)
• A skewed distribution might have a
histogram and Normal probability plot like
this:
[Figure: histogram and Normal probability plot of a skewed distribution]
What Can Go Wrong?
• Don’t use Normal models when the
distribution is not unimodal and symmetric.
• Don’t use the mean and standard
deviation when outliers are present: both
can be distorted by outliers.
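The effect of a single outlier on the mean and standard deviation can be seen with a small experiment. The sketch below reuses the twelve dates from the earlier example (again treating 1150, 1250 and 1350 as axis labels) and appends one hypothetical value, 2000, that is not in the original data and is added purely for illustration.

```python
from statistics import mean, stdev

dates = [1359, 1260, 1344, 1249, 1220, 1205, 1217, 1228, 1155, 1315, 1311, 1271]
with_outlier = dates + [2000]   # hypothetical outlier, not part of the data

for label, values in (("original", dates), ("with outlier", with_outlier)):
    print(f"{label:13s} mean = {mean(values):7.1f}   sd = {stdev(values):6.1f}")
```

One extreme value shifts the mean noticeably and inflates the standard deviation several times over, so a t interval computed from the contaminated sample would be both off-centre and much wider.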