Summer 2004

(a) The amount of rent paid by students in a large class follows a Normal distribution with population mean µ = €70 and population standard deviation σ = €3.5.
(i) What range of values contains approximately 95% of all rents paid by students in this class? [2 Marks]
(ii) Write down the approximate sampling distribution of the means of all possible samples of size 100 drawn randomly from this class. [2 Marks]
(iii) If 500 such samples were chosen at random and for each sample you calculated a 90% confidence interval for the true mean, how many such intervals would you expect to contain this population mean? [2 Marks]

Sampling distribution of the Mean II

Summer 2002

(b) The director of quality at a light bulb factory needs to estimate the average life of a large shipment of light bulbs. A random sample of 100 light bulbs indicated a sample average life of 350 hours with a sample standard deviation of 100 hours.
(i) Construct and interpret a 95% confidence interval estimate of the true average life of light bulbs in this shipment. [8 Marks]
(ii) Do you think the manufacturer has the right to state that the light bulbs last an average of 380 hours? Explain. [2 Marks]
(iii) Does the population have to be normally distributed here for the interval to be valid? Explain. [2 Marks]
(iv) Explain why an observed value of 320 hours is not unusual, even though it is outside the 95% confidence interval you have calculated. [2 Marks]

What happens if you can only take a “small” sample?

n < 30, population Normal:

$\bar{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$

where the sample standard deviation s stands in for the unknown σ.

Student’s t-distribution

[Figure 7.6.1: Student(df) density curves for various df: df = 2, df = 5, and df = ∞, i.e., Normal(0,1).]

The Student’s t-distribution:
• is mound shaped and centered at zero, like the Normal(0,1), but is more variable
• depends on the degrees of freedom df, which equal n − 1
• becomes more and more like the Normal(0,1) distribution as df becomes larger

From Chance Encounters by C.J. Wild and G.A.F.
Seber, © John Wiley & Sons, 2000.

Reading Student’s t table

TABLE 7.6.1 Extracts from the Student's t-Distribution Table

Each entry is t_df(prob), the t-value that has the desired upper-tail probability prob to its right under the Student(df) density.

      Desired upper-tail prob
df    .20    .15    .10    .05    .025   .01    .005   .001   .0005  .0001
6     0.906  1.134  1.440  1.943  2.447  3.143  3.707  5.208  5.959  8.025
7     0.896  1.119  1.415  1.895  2.365  2.998  3.499  4.785  5.408  7.063
8     0.889  1.108  1.397  1.860  2.306  2.896  3.355  4.501  5.041  6.442
…     …      …      …      …      …      …      …      …      …      …
10    0.879  1.093  1.372  1.812  2.228  2.764  3.169  4.144  4.587  5.694
…     …      …      …      …      …      …      …      …      …      …
15    0.866  1.074  1.341  1.753  2.131  2.602  2.947  3.733  4.073  4.880
…     …      …      …      …      …      …      …      …      …      …
∞     0.842  1.036  1.282  1.645  1.960  2.326  2.576  3.090  3.291  3.719

Comparing t and Z values

Confidence level    t value with 5 d.f.    Z value
90%                 2.015                  1.65
95%                 2.571                  1.96
99%                 4.032                  2.58

For small samples, the t value is larger than the Z value, hence the t interval is wider than the Z interval.

100(1-α)% ‘Small Sample’ Confidence Interval for µ

$\bar{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$ (with n − 1 df)

In repeated sampling, 100(1-α)% of intervals calculated in this manner will contain µ.

Assumptions and Conditions

– The data arise from a random sample or suitably randomized experiment. Randomly sampled data (particularly from an SRS) are ideal.
– The data are from a population that follows a Normal model.
  • Beware skewed data
  • Beware outliers

Example

µ = 4.36
n = 20, $\bar{x}$ = 4.6, s = 3.75
95% C.I. for µ?

Dates

1359 1260 1344 1249 1350 1220 1205 1217 1228 1155 1250 1315 1150 1311 1271
n = 12, $\bar{x}$ = 1261, s = 61.2

So What Do We Know?

• We now have techniques for inference about a mean from small samples. We can create confidence intervals and test hypotheses.
• When σ is estimated by s from a small sample, the standardized sample mean follows Student’s t-distribution with n − 1 df, not the Normal.
• The t-model is a family of distributions indexed by degrees of freedom.

Are You Normal? How Can You Tell?
• When you actually have your own data, you must check to see whether a Normal model is reasonable.
• Looking at a histogram of the data is a good way to check that the underlying distribution is roughly unimodal and symmetric.

Are You Normal? (cont.)

• A more specialized graphical display that can help you decide whether a Normal model is appropriate is the Normal probability plot.
• If the distribution of the data is roughly Normal, the Normal probability plot approximates a diagonal straight line. Deviations from a straight line indicate that the distribution is not Normal.

Are You Normal? (cont.)

• Nearly Normal data have a histogram and a Normal probability plot that look somewhat like this example:
[Figure: histogram and Normal probability plot of nearly Normal data.]

Are You Normal? (cont.)

• A skewed distribution might have a histogram and Normal probability plot like this:
[Figure: histogram and Normal probability plot of skewed data.]

What Can Go Wrong?

• Don’t use Normal models when the distribution is not unimodal and symmetric.
• Don’t use the mean and standard deviation when outliers are present; both can be distorted by outliers.
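
Worked sketch: small-sample t interval

The small-sample interval x̄ ± t·s/√n from the earlier slides can be sketched in Python as follows. This is only an illustration under stated assumptions. The function names (t_pdf, t_upper_tail, t_critical, t_interval) are made up for this example. Instead of reading the printed t table, the critical value is found numerically (Simpson's rule plus bisection), which is a substitute technique and not the method used in the slides. A 95% level is assumed for the Dates summary, which the slides do not state.

```python
import math


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps is even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the probability lies above zero; subtract the area on [0, t].
    return 0.5 - total * h / 3


def t_critical(upper_tail_prob, df):
    """t-value with the given upper-tail probability, found by bisection.

    Assumption: the answer lies in [0, 100], which covers the df and
    probabilities that appear in typical printed tables.
    """
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > upper_tail_prob:
            lo = mid  # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(xbar, s, n, confidence=0.95):
    """Return (lower, upper) for xbar +/- t * s / sqrt(n) with n - 1 df."""
    alpha = 1 - confidence
    t_star = t_critical(alpha / 2, n - 1)
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Critical values for 5 df, to compare with the "t and Z values" table.
for level in (0.90, 0.95, 0.99):
    print(level, round(t_critical((1 - level) / 2, 5), 3))

# Summary statistics taken from the two example slides.
print(t_interval(4.6, 3.75, 20))    # n = 20 example
print(t_interval(1261, 61.2, 12))   # Dates example, 95% level assumed
```

With these inputs the first interval runs from roughly 2.8 to 6.4, so it contains µ = 4.36, and the Dates interval runs from roughly 1222 to 1300. Where SciPy is available, scipy.stats.t.ppf gives the same critical values directly.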