
STAT 2507 Solutions for Assignment # 4 Fall 2008

Note:
1. Some answers to the lab part may vary from one student to another. For such cases, the answers given here should serve as guidelines only, as they correspond to one replication that one instructor made.
2. A typo was made in lab question 5. In step 3, we should read let c1=(c2 <= 4 and c3 >= 4) instead of let c1=(c2 >= 4 and c3 <= 4). Consequently, the TAs are asked to give each student they marked the full 6 marks for that question.

Part I. Lab questions.

Use only the blanks left to answer lab questions. Provide all histograms and boxplots you are asked to print, but DO NOT print the data you are asked to generate.

1. Continuous distributions: Generate and store in column c1 10,000 values from the uniform distribution on the interval [3,7] as follows:

random 10000 c1;
uniform 3 7.

[3] a. Use the mean command to find the sample mean x̄ of these data. 5.002

Note: The mean µ of a uniform distribution over an interval [a, b] is simply the middle of this interval, i.e. µ = (a + b)/2.

[3] b. What is the mean µ of the uniform distribution on the interval [3,7]? 5
Compare µ to the value x̄ you found in part a). Both are very close.

Generate and store in column c2 1,000 values from the exponential distribution with parameter λ = .125 as follows:

random 1000 c2;
exponential 8.

Note: The mean µ and the standard deviation σ of such a distribution are both equal to 1/λ = 8, and this is the value you are asked to enter in the command above.

[3] c. Use the desc command to find the sample mean x̄ and sample standard deviation s for these 1,000 data. x̄ = 8.170 and s = 8.319
Are x̄ and s close to the value 1/λ = 8? Yes, fairly close.

[3] d. Print (and include in your assignment) the histogram of the 1,000 values you generated from this exponential distribution. What is the shape of this distribution? Skewed to the right.

2. Normal distribution: Generate and store in column c3 10,000 values from the standard normal distribution as follows:

random 10000 c3;
normal.

[3] a. Print (and include in your assignment) the histogram for these data. What is the shape of this histogram? Bell-shaped (symmetric)

[3] b. What is the value on the horizontal axis around which the histogram seems to be symmetric? x = 0

[3] c. Use Minitab to find the sample mean x̄ and the standard deviation s for these data. x̄ = 0.00615 and s = 0.98863

[3] d. What are the mean µ and the standard deviation σ of the standard normal distribution? 0 and 1, respectively

3. Standardization procedure: Generate and store in column c4 10,000 values from the normal distribution with µ = 6.5 and σ = 3 as follows:

random 10000 c4;
normal 6.5 3.

a. Print (and include in your assignment) the histogram for these data [1]. What is the value on the horizontal axis around which the histogram seems to be symmetric? x = 6.5 [2]

Construct and store in column c5 the data set z_i (i = 1, ..., 10,000) obtained from the previously generated data set x_i by the standardization procedure z_i = (x_i − µ)/σ by typing:

let c5=(c4-6.5)/3

b. Print (and include in your assignment) the histogram for the z_i's [1]. Around which value does it seem to be symmetric? x = 0 [1]
What are the sample mean z̄ and standard deviation s for this new data set? -0.0018 and 1.0021, respectively [2]
Why are they close to 0 and 1? If X has a normal (µ, σ) distribution, then Z := (X − µ)/σ has a normal (0, 1) distribution. [2]

4. Central limit theorem (CLT) at work (you can use a new Minitab worksheet). Generate and store in columns c3-c902 100 samples, of size n = 900 each, from the Poisson distribution with parameter µ = 9 as follows:

random 100 c3-c902;
poisson 9.

Note: This may take a few moments as you are generating 900 × 100 = 90,000 values.

Create and store in column c1 the 100 values of x̄ based on the 100 samples of the same size n = 900 as follows:

rmean c3-c902 c1

a. Print (and include in your assignment) the boxplot of c3 [1]. According to the position of the median, what can you conclude about the shape of this data set? Rather symmetric, as the median seems to be in the middle of the box. Note: Your graph might also be skewed to the right. [2]

[3] b. Use the desc command to find the sample mean and sample standard deviation of c3. 8.890 and 3.203, respectively.

[3] c. Print (and include in your assignment) the boxplot for the data in column c1. What can you conclude about the shape of the data in c1? Fairly symmetric, as its median is very close to the mean = 9.

[3] d. Use desc to find the sample mean and sample standard deviation of c1. 9.016 and .095, respectively.
Are they close to 9 and 3/30? Yes: 9.016 is close to 9 and .095 is close to 3/30 = .1.
Why? According to the CLT, X̄ will be approximately normally distributed with mean µ = 9 and standard deviation σ/√n = 3/30, since for Poisson (9), σ² = µ = 9, and the sample sizes are all equal to 900.

5. Confidence interval (CI) for a mean: We want to build 100 confidence intervals (CIs) with confidence level (1 − α)100% = 95% for the mean µ of a Poisson distribution via the following steps:

Step 1. Generate and store in columns c6-c405 100 samples of size 400 each from Poisson with parameter µ = 4 as follows:

random 100 c6-c405;
poisson 4.

Step 2. Use columns c4 and c5 to store respectively the means and the standard deviations of the 100 samples you generated in step 1, as follows:

rmean c6-c405 c4
rstd c6-c405 c5

Step 3. Store the lower bound and the upper bound of your 95% CIs in c2 and c3 respectively by typing successively:

let c2=c4-1.96*c5/20
let c3=c4+1.96*c5/20

Then create a column c1 containing 1 or 0 according to whether the corresponding interval [c2 , c3] covers µ or not, by typing:

let c1=(c2 >= 4 and c3 <= 4)

(This is the condition as printed in the assignment; per Note 2 above, it should read let c1=(c2 <= 4 and c3 >= 4).)

Finally, count the entries of column c1 to find how many CIs cover the value µ = 4 by typing:

tally c1

[3] a. What is the percentage of confidence intervals that contain the true value µ = 4? 96%

[3] b. How do you compare this percentage to the confidence level 95%? The two percentages are very close.

Part II. Long-answer questions

1. The variable X has a binomial distribution with parameters n = 50,000 and p = 1/1,000. Hence we can approximate it by a normal distribution with µ = np = 50 and σ = √(np(1 − p)) = √49.95 = 7.07. We get

a. P(X ≥ 60) = P((X − 50)/7.07 ≥ (60 − 50)/7.07) = P(Z ≥ 1.41) = 1 − P(Z < 1.41) = 1 − .9207 = .0793

b. As the probability of observing 60 or more is small (about .08), we would say that observing 60 children with a genetic defect is rather unusual. If we observe 60, we might cast doubt on the stated ratio of children affected. Maybe it is higher than 1 per 1,000.

2. a. X̄ has mean µ = 106 and standard deviation σ/√n = 12/6 = 2.

b. Since n = 36 > 30, we can use the CLT (central limit theorem) and consider X̄ as normally distributed.
P(X̄ > 110) = P((X̄ − µ)/(σ/√n) > (110 − µ)/(σ/√n)) = P(Z > (110 − 106)/2) = 1 − P(Z ≤ 2) = 1 − .9772 = .0228

c. P(|X̄ − µ| ≤ 4) = P(|Z| ≤ 2) = P(Z ≤ 2) − P(Z ≤ −2) = .9772 − .0228 = .9544.

3. a. As n = 50 > 30, by the CLT, X̄ can be considered as having a normal distribution with mean µ = 68,500 and standard deviation σ/√n = 3500/√50 = 495 dollars.

b. Since Z = (X̄ − µ)/(σ/√n) has approximately a standard normal distribution, with probability .95 we would expect X̄ to fall in µ ± 1.96σ/√n = 68,500 ± 1.96(495) = [67,530 , 69,470].

c. P(X̄ ≥ 70,000) = P(Z ≥ (70,000 − 68,500)/495) = 1 − P(Z < 3.03) = 1 − .9988 = .0012.

d. Yes, this would be rather unusual since, according to part c), the probability of that happening is very small. If we happen to observe a sample mean of $70,000, we might conclude that the true average salary is above $68,500.

4. The sample mean and sample standard deviation are respectively X̄ = 4.3 and s = 2.6. The sample size is n = 40 > 30. We want an 80% confidence interval (CI). Hence α = .2 and zα/2 = z.1 = 1.28 (from the normal table). Hence the desired CI is
X̄ ± 1.28 s/√n = 4.3 ± 1.28 (2.6/6.32) = [3.78 , 4.82]

5. We have n = 600, p̂ = 250/600 = .42 and 1 − p̂ = .58. Here α = .05 and therefore zα/2 = 1.96. The 95% CI for p is then given by
p̂ ± 1.96 √(p̂(1 − p̂)/n) = .42 ± 1.96 √(.42 × .58/600) = [.38 , .46].

6. For the 90% CI, zα/2 = 1.64. Note to TAs: please also accept either of the two other approximations, zα/2 = 1.645 or zα/2 = 1.65. For the 98% CI, zα/2 = 2.33. p̂ = 25/200 = .125 and 1 − p̂ = .875. We get the desired CIs as follows:

a. .125 ± 1.64 √(.125 × .875/200) = [.086 , .163]

b. .125 ± 2.33 √(.125 × .875/200) = [.071 , .179]

The last CI is wider since it has a larger confidence level: 98% instead of 90%.
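
The Part II answers above were obtained from the normal table with z-values rounded to two decimals. As a rough cross-check, the following Python sketch recomputes the same tail probabilities and confidence intervals with the standard library. The function names upper_tail, mean_ci and proportion_ci are illustrative choices, not part of the assignment (which used Minitab and the normal table), and because the code uses unrounded z-values its results may differ from the answers above in the last reported digit.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
Z = NormalDist()


def upper_tail(x, mu, sd):
    """Return P(X >= x) for X approximately Normal(mu, sd)."""
    return 1.0 - NormalDist(mu, sd).cdf(x)


def mean_ci(xbar, s, n, level):
    """Large-sample CI for a mean: xbar +/- z * s / sqrt(n)."""
    z = Z.inv_cdf(0.5 + level / 2.0)  # z_{alpha/2}, e.g. about 1.96 for 95%
    half_width = z * s / sqrt(n)
    return xbar - half_width, xbar + half_width


def proportion_ci(p_hat, n, level):
    """Large-sample CI for a proportion: p_hat +/- z * sqrt(p_hat(1-p_hat)/n)."""
    z = Z.inv_cdf(0.5 + level / 2.0)
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


if __name__ == "__main__":
    # Question 1a: normal approximation to Binomial(50000, 1/1000).
    n, p = 50_000, 1 / 1_000
    print("Q1a", upper_tail(60, n * p, sqrt(n * p * (1 - p))))
    # Question 2b: sample mean with mu = 106 and sigma/sqrt(n) = 2.
    print("Q2b", upper_tail(110, 106, 12 / sqrt(36)))
    # Question 3c: sample mean with mu = 68500 and sigma/sqrt(n) = 3500/sqrt(50).
    print("Q3c", upper_tail(70_000, 68_500, 3500 / sqrt(50)))
    # Question 4: 80% CI for a mean.
    print("Q4 ", mean_ci(4.3, 2.6, 40, 0.80))
    # Question 5: 95% CI for a proportion.
    print("Q5 ", proportion_ci(250 / 600, 600, 0.95))
    # Question 6: 90% and 98% CIs for the same proportion.
    print("Q6a", proportion_ci(25 / 200, 200, 0.90))
    print("Q6b", proportion_ci(25 / 200, 200, 0.98))
```

Computing z from the confidence level with inv_cdf, rather than hard-coding 1.28, 1.64, 1.96 or 2.33, keeps the three kinds of interval on one code path; substituting the rounded table values should give numbers closer to the hand calculations.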
