Interval Estimation II
Summer 2004

a) The amount of rent paid by students in a large class follows a Normal distribution with population mean µ = €70 and population standard deviation σ = €3.5.
(i) What range of values contains approximately 95% of all rents paid by students in this class? [2 Marks]
(ii) Write down the approximate sampling distribution of the means of all possible samples of size 100 drawn randomly from this class. [2 Marks]
(iii) If 500 such samples were chosen at random and for each sample you calculated a 90% confidence interval for the true mean, how many such intervals would you expect to contain this population mean? [2 Marks]

Sampling distribution of the Mean II

Summer 2002

(b) The director of quality at a light bulb factory needs to estimate the average life of a large shipment of light bulbs. A random sample of 100 light bulbs indicated a sample average life of 350 hours with a sample standard deviation of 100 hours.
i) Construct and interpret a 95% confidence interval estimate of the true average life of light bulbs in this shipment. [8 Marks]
ii) Do you think the manufacturer has the right to state that the light bulbs last an average of 380 hours? Explain. [2 Marks]
iii) Does the population have to be normally distributed here for the interval to be valid? Explain. [2 Marks]
iv) Explain why an observed value of 320 hours is not unusual, even though it is outside the 95% confidence interval you have calculated. [2 Marks]

What happens if you can only take a “small” sample?

n < 30, population Normal:

$\bar{x} \pm t_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$

Student’s t-distribution

Density curves for Student’s t

Figure 7.6.1 Student(df) density curves for various df: df = 2, df = 5, and df = ∞ [i.e., Normal(0,1)].

• is mound shaped and centered at zero like the Normal(0,1) but is more variable
• depends on the degrees of freedom df, which is equal to n-1
• As df becomes larger, the t distribution becomes more and more like the Normal(0,1) distribution

From Chance Encounters by C.J. Wild and G.A.F.
Seber, © John Wiley & Sons, 2000.

Reading Student’s t table

t Tables

[Figure: Student(df) density with the desired upper-tail prob shaded to the right of the t-value t_df(prob).]

TABLE 7.6.1 Extracts from the Student's t-Distribution Table

        Desired upper-tail prob
df      .20     .15     .10     .05     .025    .01     .005    .001    .0005   .0001
6       0.906   1.134   1.440   1.943   2.447   3.143   3.707   5.208   5.959   8.025
7       0.896   1.119   1.415   1.895   2.365   2.998   3.499   4.785   5.408   7.063
8       0.889   1.108   1.397   1.860   2.306   2.896   3.355   4.501   5.041   6.442
…       …       …       …       …       …       …       …       …       …       …
10      0.879   1.093   1.372   1.812   2.228   2.764   3.169   4.144   4.587   5.694
…       …       …       …       …       …       …       …       …       …       …
15      0.866   1.074   1.341   1.753   2.131   2.602   2.947   3.733   4.073   4.880
…       …       …       …       …       …       …       …       …       …       …
∞       0.842   1.036   1.282   1.645   1.960   2.326   2.576   3.090   3.291   3.719

Comparing t and Z values

Confidence level    t value with 5 d.f.    Z value
90%                 2.015                  1.65
95%                 2.571                  1.96
99%                 4.032                  2.58

For small samples, the t value is larger than the Z value, hence the t interval is wider than the Z interval.

100(1-α)% ‘Small Sample’ Confidence Interval for µ

$\bar{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$ (with n-1 df)

In repeated sampling, 100(1-α)% of intervals calculated in this manner will contain µ.

Assumptions and Conditions

– The data arise from a random sample or suitably randomized experiment. Randomly sampled data (particularly from an SRS) are ideal.
– The data are from a population that follows a Normal model.
• Beware skewed data
• Beware outliers

µ = 4.36
n = 20, x̄ = 4.6, s = 3.75

95% C.I. for µ?

Dates: 1359 1260 1344 1249 1220 1205 1217 1228 1155 1315 1311 1271
n = 12, x̄ = 1261, s = 61.2
[Figure: dot plot of the dates; axis labels 1150, 1250, 1350.]

So What Do We Know?

• We now have techniques for inference about a mean from small samples. We can create confidence intervals and test hypotheses.
• The sampling distribution for the mean (for small samples) follows Student’s t-distribution and not the Normal.
• The t-model is a family of distributions indexed by degrees of freedom.

Are You Normal? How Can You Tell?
• When you actually have your own data, you must check to see whether a Normal model is reasonable.
• Looking at a histogram of the data is a good way to check that the underlying distribution is roughly unimodal and symmetric.

Are You Normal? (cont.)

• A more specialized graphical display that can help you decide whether a Normal model is appropriate is the Normal probability plot.
• If the distribution of the data is roughly Normal, the Normal probability plot approximates a diagonal straight line. Deviations from a straight line indicate that the distribution is not Normal.
• Nearly Normal data have a histogram and a Normal probability plot that look somewhat like this example: [figure omitted]
• A skewed distribution might have a histogram and Normal probability plot like this: [figure omitted]

What Can Go Wrong?

• Don’t use Normal models when the distribution is not unimodal and symmetric.
• Don’t use the mean and standard deviation when outliers are present; both can be distorted by outliers.
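
The Normality check and the small-sample interval from the slides can be sketched together in Python. This is a minimal sketch under stated assumptions, not the course's own method. The function names are hypothetical. The 12 observations are read from the Dates slide, taking 1150, 1250 and 1350 to be axis labels (the remaining values reproduce the stated mean of 1261 and s of 61.2). The plotting positions (i - 0.5)/n are one common convention. The critical values 2.201 (df = 11) and 2.093 (df = 19) come from a standard t table and do not appear in the extract of Table 7.6.1. Treating the n = 20 slide as the same kind of interval exercise is also an assumption.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# The 12 observations from the Dates slide (assumed: 1150, 1250, 1350 are axis labels).
dates = [1359, 1260, 1344, 1249, 1220, 1205, 1217, 1228, 1155, 1315, 1311, 1271]


def normal_plot_points(data):
    """Return (theoretical Normal(0,1) quantile, sorted observation) pairs.

    Plotted against each other, roughly Normal data fall near a straight line.
    The plotting positions (i - 0.5) / n are an assumed convention.
    """
    n = len(data)
    std_normal = NormalDist()  # Normal(0, 1)
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, sorted(data)))


def t_interval(n, xbar, s, t_crit):
    """Small-sample interval: xbar +/- t_crit * s / sqrt(n), with n - 1 df."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


# Upper-tail 0.025 critical values from a standard t table (not in the extract).
T_CRIT_DF11 = 2.201
T_CRIT_DF19 = 2.093

# Step 1: coordinates of a Normal probability plot for the dates.
points = normal_plot_points(dates)
for quantile, value in points:
    print(f"{quantile:7.3f}  {value}")

# Step 2: 95% interval for the dates, using the sample mean and standard deviation.
dates_low, dates_high = t_interval(len(dates), mean(dates), stdev(dates), T_CRIT_DF11)
print(f"Dates, 95% interval: {dates_low:.1f} to {dates_high:.1f}")

# Step 3: 95% interval from the summary statistics n = 20, x-bar = 4.6, s = 3.75.
low20, high20 = t_interval(20, 4.6, 3.75, T_CRIT_DF19)
print(f"n = 20, 95% interval: {low20:.2f} to {high20:.2f}")
```

Under these assumptions the Dates interval comes out at roughly 1222 to 1300, and the n = 20 interval at roughly 2.85 to 6.36. The printed pairs are only the coordinates of the Normal probability plot; judging whether they lie close to a straight line still needs a plot or a careful look.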