CLABE Statistics
Homework assignment - Problem sheet 7

1. The following data represent the number of days absent last year in a population of 5 employees of a small company: 1; 3; 6; 7; 12.

(a) compute the population mean and standard deviation;
(b) select all possible samples of size 2 and compute the sampling distribution of the mean;
(c) compute the mean of the sample means, compute the standard error;
(d) select all possible samples of size 3 and compute the sampling distribution of the mean;
(e) compute the mean of the sample means, compute the standard error;
(f) compare the shape of sampling distributions for (b) and (d).

SOLUTION

(a) If $X$ is the random variable describing the number of days absent last year in the population, then the distribution of $X$ is as follows

    x           1     3     6     7     12
    P(X = x)    1/5   1/5   1/5   1/5   1/5

that is, the distribution of $X$ is discrete uniform on the five observed values.

[Figure: probability mass function of X; horizontal axis x, vertical axis probability]

It is not difficult to compute $E(X) = 5.8$, $E(X^2) = 47.8$ and $\mathrm{Var}(X) = E(X^2) - E(X)^2 = 14.16$, so that $\mathrm{SD}(X) = \sqrt{14.16} \approx 3.76$.

(b) The computation of the probability distribution of $\bar{X}$ for $n = 2$ requires the enumeration of all the $5^2 = 25$ possible ordered samples of size 2, drawn with replacement:

    x̄            1      2      3      3.5    4      4.5    5
    P(X̄ = x̄)     0.04   0.08   0.04   0.08   0.08   0.08   0.08

    x̄            6      6.5    7      7.5    9      9.5    12
    P(X̄ = x̄)     0.04   0.16   0.04   0.08   0.08   0.08   0.04

[Figure: probability mass function of the sample mean (n=2); horizontal axis sample mean, vertical axis probability]

(c) The application of the properties of linear combinations of random variables gives $E(\bar{X}) = E(X) = 5.8$, whereas $\mathrm{Var}(\bar{X}) = \mathrm{Var}(X)/2 = 14.16/2 = 7.08$, so that $\mathrm{SD}(\bar{X}) = \sqrt{7.08} = 2.66$.
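The enumeration in (b) and the moments in (c) can be checked mechanically. The following is a minimal Python sketch, assuming ordered samples drawn with replacement (which is what the probabilities in (b) imply); the function names `sampling_distribution` and `moments` are illustrative and not part of the original sheet.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 3, 6, 7, 12]  # days absent, from Problem 1


def sampling_distribution(values, n):
    """Exact pmf of the sample mean over all len(values)**n ordered
    samples of size n drawn with replacement (an assumption of this sketch)."""
    counts = Counter(Fraction(sum(s), n) for s in product(values, repeat=n))
    total = len(values) ** n
    return {mean: Fraction(c, total) for mean, c in sorted(counts.items())}


def moments(pmf):
    """Mean and variance of a discrete distribution given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, var


pmf2 = sampling_distribution(population, 2)
mean2, var2 = moments(pmf2)

# Print the table of (b), then the mean, variance and standard error of (c).
for value, prob in pmf2.items():
    print(float(value), float(prob))
print(float(mean2), float(var2), float(var2) ** 0.5)
```

Exact fractions are used so that the probabilities can be compared with the table without rounding noise. Calling `sampling_distribution(population, 3)` enumerates the $5^3 = 125$ samples needed for the case $n = 3$.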
(d) The computation of the probability distribution of $\bar{X}$ for $n = 3$ requires the enumeration of all the $5^3 = 125$ possible ordered samples of size 3, drawn with replacement:

    x̄      P(X̄ = x̄)      x̄      P(X̄ = x̄)      x̄      P(X̄ = x̄)
    1      0.008         13/3   0.048         7      0.056
    5/3    0.024         14/3   0.072         22/3   0.048
    7/3    0.024         5      0.048         8      0.024
    8/3    0.024         16/3   0.096         25/3   0.072
    3      0.032         17/3   0.024         26/3   0.024
    10/3   0.048         6      0.032         9      0.024
    11/3   0.048         19/3   0.072         10     0.024
    4      0.024         20/3   0.072         31/3   0.024
                                              12     0.008

[Figure: probability mass function of the sample mean (n=3); horizontal axis sample mean, vertical axis probability]

(e) The application of the properties of linear combinations of random variables gives $E(\bar{X}) = E(X) = 5.8$, whereas $\mathrm{Var}(\bar{X}) = \mathrm{Var}(X)/3 = 14.16/3 = 4.72$, so that $\mathrm{SD}(\bar{X}) = \sqrt{4.72} = 2.17$.

(f) It is clear that (i) the probability mass function of $\bar{X}$ approaches the normal density as $n$ increases and (ii) the standard error of $\bar{X}$ decreases as $n$ increases. For completeness, we give below the probability mass functions of $\bar{X}$ for $n = 4$ and $n = 5$.

[Figure: probability mass function of the sample mean (n=4); horizontal axis sample mean, vertical axis probability]

[Figure: probability mass function of the sample mean (n=5); horizontal axis sample mean, vertical axis probability]

2. The diameter of Ping-Pong balls (denoted by $X$) manufactured at a large factory is expected to be approximately normally distributed with a mean of 3.3cm and a standard deviation of 0.6cm. What is the probability that a randomly selected Ping-Pong ball will have a diameter

(a) Less than 3.25cm?
(b) Between 3.25cm and 3.35cm?
(c) Between what two values (symmetrically distributed around the mean) will 60% of the Ping-Pong balls fall (in terms of diameter)?

If many random samples of 25 Ping-Pong balls are selected

(d) What distribution will the sample means follow?
(e) What proportion of the sample means will be less than 3.25cm?
(f) What proportion of the sample means will be between 3.25cm and 3.35cm?
(g) Between what two values (symmetrically distributed around the mean) will 60% of the sample means be?
(h) Which is more likely to occur: an individual ball between 3.25 and 3.35cm or a sample mean between 3.25 and 3.35cm? Explain.

SOLUTION

(a) $P(X < 3.25) = 0.4669$;
(b) $P(3.25 < X < 3.35) = 0.0664$;
(c) $P(2.8 < X < 3.8) = 0.6$;
(d) $\bar{X}$ is approximately normal with $E(\bar{X}) = 3.3$ and $\mathrm{Var}(\bar{X}) = 0.6^2/25$, so that $\mathrm{SD}(\bar{X}) = 0.6/5 = 0.12$;
(e) $P(\bar{X} < 3.25) = 0.3385$;
(f) $P(3.25 < \bar{X} < 3.35) = 0.3231$;
(g) $P(3.2 < \bar{X} < 3.4) = 0.6$;
(h) The values 3.25 and 3.35 are symmetrically distributed around $E(X) = E(\bar{X}) = 3.3$, and since $\mathrm{Var}(X) > \mathrm{Var}(\bar{X})$ the distribution of $\bar{X}$ is more concentrated around the mean. It follows that a sample mean between 3.25 and 3.35cm is more likely than an individual ball between 3.25 and 3.35cm, as the values in (b) and (f) confirm.

3. In 1992, Canadians voted in a referendum on a new constitution. In the province of Quebec, 42.4% of those who voted were in favor of the new constitution. A random sample of 100 voters from the province was taken.

(a) What is the mean of the distribution of the sample proportion in favor of a new constitution?
(b) What is the variance of the sample proportion?
(c) What is the standard error of the sample proportion?
(d) What is the probability that the sample proportion is bigger than 0.5?

SOLUTION

(a) If we denote by $P$ the sample proportion in favor of a new constitution, then $E(P) = 0.424$;
(b) $\mathrm{Var}(P) = \frac{0.424(1 - 0.424)}{100} = 0.00244$;
(c) $\mathrm{SE}(P) = \sqrt{\frac{0.424(1 - 0.424)}{100}} \approx 0.0494 \approx 0.05$;
(d) By the central limit theorem, we can use the normal distribution to compute the approximate probability that the sample proportion is bigger than 0.5, that is $P(P > 0.5) \approx 0.0643$. This value uses the rounded standard error 0.05, giving $z = (0.5 - 0.424)/0.05 = 1.52$; with the unrounded standard error 0.0494, $z \approx 1.54$ and the probability is about 0.062.

4.
A random sample of 12 employees in a large manufacturing plant found the following figures for number of hours of overtime worked in the last month:

    22  16  28  12  18  36  23  11  41  29  26  31

Use unbiased estimation procedures to find point estimates for the following:

(a) the population mean;
(b) the population variance;
(c) the variance of the sample mean;
(d) the population proportion of employees working more than 30 hours of overtime in this plant in the last month.

SOLUTION

(a) If we denote by $\mu$ the population mean, then the sample mean $\bar{X}$ is an unbiased estimator of $\mu$. For the data of this exercise, $\hat{\mu} = \bar{x} = 24.42$.

(b) If we denote by $\sigma^2$ the population variance, then the sample variance $S_X^2$ is an unbiased estimator of $\sigma^2$. For the data of this exercise, $\hat{\sigma}^2 = s_X^2 = 85.72$.

(c) The variance of the sample mean is $\mathrm{Var}(\bar{X}) = \sigma^2/12$, and $\widehat{\mathrm{Var}}(\bar{X}) = S_X^2/12$ is an unbiased estimator of $\mathrm{Var}(\bar{X})$. This follows by noticing that $\mathrm{Var}(\bar{X})$ is a linear transformation of $\mathrm{Var}(X)$, so that $E[\widehat{\mathrm{Var}}(\bar{X})] = E(S_X^2/12) = E(S_X^2)/12 = \sigma^2/12$. For the data of this exercise, $\widehat{\mathrm{Var}}(\bar{X}) = s_X^2/12 = 85.72/12 = 7.14$.

(d) If we denote by $\pi$ the population proportion of employees working more than 30 hours of overtime in this plant in the last month, then the corresponding sample proportion $P$ is an unbiased estimator of $\pi$. Three of the 12 observations (36, 41 and 31) exceed 30, so for the data of this exercise $\hat{\pi} = 3/12 = 0.25$.

5. Suppose that $X_1$ and $X_2$ are a random sample of observations from a population with mean $\mu$ and variance $\sigma^2$. Consider the following three point estimators $W$, $Y$ and $Z$ of $\mu$:

$W = \frac{1}{2} X_1 + \frac{1}{2} X_2$

$Y = \frac{1}{4} X_1 + \frac{3}{4} X_2$

$Z = \frac{1}{3} X_1 + \frac{2}{3} X_2$

(a) Show that all three estimators are unbiased;
(b) which of the estimators is most efficient?
(c) Find the relative efficiency of W with respect to each of the other two estimators.
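
SOLUTION

(a) The three expected values are

$E(W) = \frac{1}{2} E(X_1) + \frac{1}{2} E(X_2) = \mu$

$E(Y) = \frac{1}{4} E(X_1) + \frac{3}{4} E(X_2) = \mu$

$E(Z) = \frac{1}{3} E(X_1) + \frac{2}{3} E(X_2) = \mu$

(b) Since $X_1$ and $X_2$ are independent, the three variances are

$\mathrm{Var}(W) = \frac{1}{4} \mathrm{Var}(X_1) + \frac{1}{4} \mathrm{Var}(X_2) = \frac{\sigma^2}{2}$

$\mathrm{Var}(Y) = \frac{1}{16} \mathrm{Var}(X_1) + \frac{9}{16} \mathrm{Var}(X_2) = \frac{5\sigma^2}{8}$

$\mathrm{Var}(Z) = \frac{1}{9} \mathrm{Var}(X_1) + \frac{4}{9} \mathrm{Var}(X_2) = \frac{5\sigma^2}{9}$

so $W$ has the smallest variance and is the most efficient of the three.

(c) The variance ratios are

$\frac{\mathrm{Var}(W)}{\mathrm{Var}(Y)} = \frac{4}{5} < 1$ and $\frac{\mathrm{Var}(W)}{\mathrm{Var}(Z)} = \frac{9}{10} < 1$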
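
The arithmetic of Problem 5 can be verified with exact fractions. This is a minimal sketch, assuming $X_1$ and $X_2$ are independent with common variance $\sigma^2$, so that the variance of a weighted sum is the sum of squared weights times $\sigma^2$; the helper `variance_factor` and the dictionary names are illustrative only.

```python
from fractions import Fraction


def variance_factor(weights):
    """Variance of sum(w_i * X_i) in units of sigma^2, assuming the X_i
    are independent and share the same variance."""
    return sum(Fraction(w) ** 2 for w in weights)


# Weights of X1 and X2 in each estimator, as defined in Problem 5.
estimators = {
    "W": (Fraction(1, 2), Fraction(1, 2)),
    "Y": (Fraction(1, 4), Fraction(3, 4)),
    "Z": (Fraction(1, 3), Fraction(2, 3)),
}

# (a) An estimator of this form is unbiased when its weights sum to 1.
unbiased = {name: sum(w) == 1 for name, w in estimators.items()}

# (b) Variance of each estimator as a multiple of sigma^2.
factors = {name: variance_factor(w) for name, w in estimators.items()}

# (c) Variance ratios of W against Y and Z.
ratio_wy = factors["W"] / factors["Y"]
ratio_wz = factors["W"] / factors["Z"]

print(unbiased)
print(factors)
print(ratio_wy, ratio_wz)
```

Working in units of $\sigma^2$ keeps the check independent of any particular population, and the ratios come out as exact fractions that can be compared directly with part (c).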