Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
͵–Ƭ EŽǀĞŵďĞƌϮϬϭϯdžĂŵŝŶĂƚŝŽŶƐ /ŶĚŝĐĂƚŝǀĞ^ŽůƵƚŝŽŶƐ dŚĞ ŝŶĚŝĐĂƚŝǀĞ ƐŽůƵƚŝŽŶ ŚĂƐ ďĞĞŶ ǁƌŝƚƚĞŶ ďLJ ƚŚĞ džĂŵŝŶĞƌƐ ǁŝƚŚ ƚŚĞ Ăŝŵ ŽĨ ŚĞůƉŝŶŐ ĐĂŶĚŝĚĂƚĞƐ͘ dŚĞ ƐŽůƵƚŝŽŶƐŐŝǀĞŶĂƌĞŽŶůLJŝŶĚŝĐĂƚŝǀĞ͘/ƚŝƐƌĞĂůŝnjĞĚƚŚĂƚƚŚĞƌĞĐŽƵůĚďĞŽƚŚĞƌĂƉƉƌŽĂĐŚĞƐůĞĂĚŝŶŐƚŽĂǀĂůŝĚ ĂŶƐǁĞƌ ĂŶĚ ĞdžĂŵŝŶĞƌƐ ŚĂǀĞ ŐŝǀĞŶ ĐƌĞĚŝƚ ĨŽƌ ĂŶLJ ĂůƚĞƌŶĂƚŝǀĞ ĂƉƉƌŽĂĐŚ Žƌ ŝŶƚĞƌƉƌĞƚĂƚŝŽŶ ǁŚŝĐŚ ƚŚĞLJ ĐŽŶƐŝĚĞƌƚŽďĞƌĞĂƐŽŶĂďůĞ ͵Ǧͳͳͳ͵ Solution 1 : The percentages of customers who bought newspaper A from a magazine stall in city K for Monday to Friday in a randomly selected week are: 62% i. 55% 63% 58% 62% Mean = (62% + 55% + 63% + 58% + 62%)/5 = 60% Median = (5+1)/2 i.e. the 3rd observation when the data is sorted from smallest to largest = 62% (2 Marks) ii. a% and b% are the respective percentages of customers who bought newspaper A from the stall for Saturday and Sunday in that week. a) Using the whole week’s data, the median will be the (7+1)/2 i.e. the 4th observation when the data is sorted from smallest to largest. The value of the median will be least when both a% and b% are no more than the smallest value of 5-days’ data i.e. less than 55%. In which case, the value of median will be 58% as it will be the 4th observation when the data is sorted from smallest to largest. (1 Mark) b) We are told that the mean and the median after adding the Saturday and Sunday’s data will remain the same as without these two days data. Without loss of generality, assume a ч b. Setting the means equal: ܽ ܾ ͷ כͲ Ͳ ൌ ݅Ǥ ݁Ǥܽ ܾ ൌ ͳʹͲ Setting the medians equal … In order that the median is 62% we have 2 possibilities: x Possibility 1: a ч 62 and b ш 62 x Possibility 2: a, b ш 62 Given that we must have a + b = 120, we can only have possibility 1. So a pair of possible values of (a, b) can be (55, 65). (3 Marks) iii. The data is relevant for only 1 week. So, it cannot be really inferred that the mean and median will exceed 50% in every other week. Hence the stall keeper’s claims cannot be substantiated for want of sufficient credible data. (1 Mark) [Total Marks-7] ʹ ͵Ǧͳͳͳ͵ Solution 2 : We can assume that the events concerning the first urn are independent of events concerning the second urn. Let Ri denote drawing a red ball from urn i and let Bi denote drawing a blue ball from urn i. Let n be the number of blue balls in the second urn. Then: ͲǤͶͶ ൌ Զ൫ሺܴଵ ܴ תଶ ሻ ሺܤଵ ܤ תଶ ሻ൯ ൌ Զሺܴଵ ܴ תଶ ሻ Զሺܤଵ ܤ תଶ ሻǥ ݉݁ݒ݅ݏݑ݈ܿݔ݁ݕ݈݈ܽݑݐݑ ൌ Զሺܴଵ ሻ כԶሺܴଶ ሻ Զሺܤଵ ሻ כԶሺܤଶ ሻǥ ݅݊݀݁݁ܿ݊݁݀݊݁ ͳ ݊ Ͷ כ כ ͳͲ ͳ ݊ ͳͲ ͳ ݊ Ͷ ݊ ൌ ֜ ݊ ൌ Ͷ ͳͲ ͳͲ݊ ൌ [Total Marks- 3] Solution 3 : The lifetime, T, of an electronic device is a random variable having a probability density function: ்݂ ሺݐሻ ൌ ͲǤͷ݁ ିǤହ௧ Ǣ ݐ Ͳ The device is given an efficiency value V = 5 if it fails before time t = 3. Otherwise, it is given a value V = 2 T. Thus V is defined over the range v ш 5. Note: If t < 3, v = 5; If t ш 3, v = 2t ш 6. i. Let fV denote the probability density function of V. ൏ ͷǡ ݂ ሺݒሻ ൌ ͲܸͷǤ ൌ ͷǡ ݂ ሺݒሻ ൌ Զሺܸ ൌ ͷሻ ൌ Զሺܶ ൏ ͵ሻ ଷ ൌ න ͲǤͷ݁ ିǤହ௧ ݀ ݐൌ ͳ െ ݁ ିଵǤହ ͷ ൏ ݒ൏ ǡ ݂ ሺݒሻ ൌ ͲͷǤ ǡ ͵ ͵Ǧͳͳͳ͵ Զሺܸ ݒሻ ൌ Զሺܸ ൏ ͷሻ Զሺܸ ൌ ͷሻ Զሺͷ ൏ ܸ ൏ ሻ Զሺ ܸ ݒሻ ൌ Ͳ ሺͳ െ ݁ ିଵǤହ ሻ Ͳ Զሺ ʹܶ ݒሻ ൌ ሺͳ െ ݁ ିଵǤହ ሻ Զሺ͵ ܶ ͲǤͷݒሻ Ǥହ௩ ൌ ሺͳ െ ݁ ିଵǤହ ሻ න ͲǤͷ݁ ିǤହ௧ ݀ݐ ଷ ൌ ሺͳ െ ݁ ିଵǤହ ሻ ሺ݁ ିଵǤହ െ ݁ ିǤଶହ௩ ሻ ൌ ͳ െ ݁ ିǤଶହ௩ Thus, ݂ ሺݒሻ ൌ ݀ ݀ ሾԶሺܸ ݒሻሿ ൌ ሾͳ െ ݁ ିǤଶହ௩ ሿ ൌ ͲǤʹͷ݁ ିǤଶହ௩ ݀ݒ ݀ݒ Thus, the PDF (fv) of V is given as below: ݒ൏ ͷǡͷ ൏ ݒ൏ Ͳ ݂ ሺݒሻ ൌ ൞ ͳ െ ݁ ିଵǤହ ݒൌͷ ͲǤʹͷ݁ ିǤଶହ௩ ݒ (4 Marks) ii. The expected efficiency value of the electronic device is given as: ஶ ॱሾܸሿ ൌ ͷ݂ ሺͷሻ න ݂ݒ ሺݒሻ݀ݒ ஶ ൌ ͷሺͳ െ ݁ ିଵǤହ ሻ න ͲǤʹͷି ݁ݒǤଶହ௩ ݀ݒ ஶ ൌ ͷሺͳ െ ݁ ିଵǤହ ሻ න ͲǤʹͷሺ ݔ ሻ݁ ିǤଶହሺ௫ାሻ ݀ ݔǥ ݔൌ ݒെ ஶ ൌ ͷሺͳ െ ݁ ିଵǤହ ሻ ൌ ͷሺͳ െ ݁ ିଵǤହ ሻ ݁ ିଵǤହ ݁ ିଵǤହ ஶ න ͲǤʹͷ݁ݔ ିǤଶହ௫ ݀ ݔ ݁ כͶ ݁ ିଵǤହ න ͲǤʹͷ݁ ିǤଶହ௫ ݀ݔ ିଵǤହ ͳ כ ஶ ͲǤʹͷିǤଶହ୶ ൌ ሺͲǤʹͷሻ ൩ ஶ ͲǤʹͷିǤଶହ୶ ൌ ሺͲǤʹͷሻ ൌ ͷሺͳ ݁ ିଵǤହ ሻǤ ǤǤͳͳ (3 Marks) [ ܛܓܚ܉ۻܔ܉ܜܗ܂െ ૠሿ Ͷ ͵Ǧͳͳͳ͵ Solution 4 : Consider a random sample, X1, X2 … Xn from a normal N(ђ, ʍ2) distribution, with sample meanܺത and sample varianceܵ ଶ. i. To show: ͳ ଶ ܵ ൌ ൫ܺ െ ܺ ൯ ʹ݊ሺ݊ െ ͳሻ ଶ ୀଵ ୀଵ By definition: ͳ ሺܺ െ ܺതሻଶ ܵ ൌ ݊െͳ ଶ ୀଵ ͳ ଶ ܴ ܵܪൌ ൫ܺ െ ܺ ൯ ʹ݊ሺ݊ െ ͳሻ ୀଵ ୀଵ ͳ ଶ ൌ ൫ܺ െ ܺത ܺത െ ܺ ൯ ʹ݊ሺ݊ െ ͳሻ ୀଵ ୀଵ ͳ ଶ ቂሺܺ െ ܺതሻଶ െ ʹሺܺ െ ܺതሻ൫ܺ െ ܺത൯ ൫ܺ െ ܺത൯ ቃ ൌ ʹ݊ሺ݊ െ ͳሻ ୀଵ ୀଵ ͳ ଶ ൌ ݊ሺܺ െ ܺതሻଶ െ ʹ ሺܺ െ ܺതሻ൫ܺ െ ܺത൯ ݊൫ܺ െ ܺത൯ ʹ݊ሺ݊ െ ͳሻ ୀଵ ୀଵ ୀଵ ୀଵ ୀଵ ୀଵ ͳ ൌ ݊ሺ݊ െ ͳሻܵ ଶ െ ʹ ሺܺ െ ܺതሻ ൫ܺ െ ܺത൯ ݊ሺ݊ െ ͳሻܵ ଶ ʹ݊ሺ݊ െ ͳሻ ൌ ͳ ሾ݊ሺ݊ െ ͳሻܵ ଶ െ Ͳ ݊ሺ݊ െ ͳሻܵ ଶ ሿ ʹ݊ሺ݊ െ ͳሻ ሺܺ െ ܺതሻ ൌ ܺ െ ݊ܺത ൌ Ͳ൩ ୀଵ ୀଵ ൌ ܵ ଶ Ǥ (4 Marks) ͷ ͵Ǧͳͳͳ͵ ii. Consider any pair of (I, j) such that i т j, x E(Xi – Xj) = E(Xi) – E(Xj) = ђ - ђ = 0 x Var(Xi – Xj) = Var(Xi) + Var(Xj) – 2 Cov(Xi, Xj) = ʍ2 + ʍ2 – 2 * 0 = 2 ʍ2 [Cov(Xi, Xj) = 0 as Xi & Xj are independent] Thus: E[(Xi – Xj)2] = Var(Xi – Xj) + [E(Xi – Xj)]2 = 2 ʍ2 ܧሺܵ ଶሻ ͳ ଶ ൌ ܧ ൫ܺ െ ܺ ൯ ʹ݊ሺ݊ െ ͳሻ ୀଵ ୀଵ ͳ ଶ ൌ ܧቂ൫ܺ െ ܺ ൯ ቃ ʹ݊ሺ݊ െ ͳሻ ୀଵ ୀଵ ͳ ଶ ൌ ܧቂ൫ܺ െ ܺ ൯ ቃ ൌ Ͳ ʹ݊ሺ݊ െ ͳሻ ୀଵ ஷୀଵ ͳ ʹߪ ଶ ൌ ʹ݊ሺ݊ െ ͳሻ ୀଵ ஷୀଵ ൌ ͳ ߪʹ כଶ ݊ כሺ݊ െ ͳሻ ʹ݊ሺ݊ െ ͳሻ ൌ ߪ ଶ Thus, S2 is an unbiased estimator of ʍ2. (3 Marks) [Total Marks- 7] Solution 5 : There are two independent random variables X and Y with probability density functions g and h respectively, where for any x > 0, we have: ݃ሺݔሻ ൌ ͵ଵ଼ ݔଵ ݁ ିଷ௫ ݁ ିଷ௫ Ǣ ݄ሺݔሻ ൌ ͵ ݔହ ͳǨ ͷǨ We have: S = X + Y The probability density function of S is given as (for any s > 0): ஶ ݂ௌ ሺݏሻ ൌ න ݃ሺݔሻ ݄ሺ ݏെ ݔሻ݀ݔ ିஶ [Using the convolution formula given] ௦ ൌ න ͵ଵ଼ ݔଵ ͵Ǧͳͳͳ͵ ݁ ିଷ௫ ݁ ିଷሺ௦ି௫ሻ ͵ כሺ ݏെ ݔሻହ ݀ݔሾ ͲƬ ݏെ ݔ Ͳሿ ͳǨ ͷǨ ௦ ͵ଶସ ିଷ௦ ݁ න ݔଵ ሺ ݏെ ݔሻହ ݀ݔ ൌ ͳǨ ͷǨ ௦ ͵ଶସ ିଷ௦ ଶଶ ݔଵ ݔହ ൌ ݁ ݏන ቀ ቁ ቀͳ െ ቁ ݀ݔ ݏ ݏ ͳǨ ͷǨ ௦ ݔଵ ݔହ ݔ ͵ଶସ ଶଷ ିଷ௦ ൌ ݁ ݏන ቀ ቁ ቀͳ െ ቁ ݀ ቀ ቁ ݏ ݏ ݏ ͳǨ ͷǨ ଵ ͵ଶସ ଶଷ ିଷ௦ ͳǨ ͷǨ ʹ͵Ǩ ൌ ݁ ݏ כන ݑଵ ሺͳ െ ݑሻହ ݀ݑ ʹ͵Ǩ ͳǨ ͷǨ ͳǨ ͷǨ ݔ ቂ ݑൌ ቃ ݏ ൌ ͵ଶସ ଶଷ ିଷ௦ ݁ ݏ ͳ כ ʹ͵Ǩ ܽܽݐ݁ܤ݂ܽݕݐ݈ܾܾ݅݅ܽݎ݈ܽݐݐ݄݁ݐݏ݈ܽݑݍ݁݀݊ܽݎ݃݁ݐ݄݊݅݁ݐݏሺͳͺǡ ሻ݀݅݊݅ݐݑܾ݅ݎݐݏ ൌ ͵ଶସ ଶଷ ିଷ௦ ݁ ݏ ʹ͵Ǩ [Total Marks- 5] Solution 6 : Let N be the number of claims on a motor insurance policy in one year. Suppose the claim amounts X1, X2 … are independent and identically distributed random variables, independent of N. Let S be the total amount claimed in one year for that insurance policy. i. The mean and variance of S are given as: x E(S) = E(N)*E(X1) x Var(S) = E(N)*Var(X1) + Var(N)*[E(X1)]2. (1 Mark) ii. For the ith policy, the first two moments for the number of claims under Option 1 are as below: x Mean = ɴi x Variance = ɴi (1 – ɴi) ͵Ǧͳͳͳ͵ Thus, if Si denotes the total claim amount under ith policy, we have: x E(Si) = ɴi ђ x Var(Si) = ɴi ʍ2 + ɴi (1 – ɴi) ђ2 Therefore, the mean of T is ଵ ଵ ଵ ܧሺܶሻ ൌ ܧ൭ ܵ ൱ ൌ ܧሺܵ ሻ ൌ ߤ ߚ ୀଵ ୀଵ ୀଵ The variance of T is ଵ ܸܽݎሺܶሻ ൌ ܸܽ ݎ൭ ܵ ൱ ଵ ୀଵ ൌ ܸܽݎሺܵ ሻ ǥ ܵ ୀଵ ଵ ൌ ሾߚ ߪ ଶ ߚ ሺͳ െ ߚ ሻߤଶ ሿ ୀଵ (3 Marks) iii. For the ith policy, the first two moments for the number of claims under Option 2 are as below: x Mean = ɴi x Variance = ɴi Thus, if Si denotes the total claim amount under ith policy, we have: x E(Si) = ɴi ђ x Var(Si) = ɴi ʍ2 + ɴi ђ2 Therefore, the mean of T’ is ଵ ଵ ଵ ܧሺܶԢሻ ൌ ܧ൭ ܵ ൱ ൌ ܧሺܵ ሻ ൌ ߤ ߚ ୀଵ ୀଵ ୀଵ The variance of T’ is ଵ ܸܽݎሺܶԢሻ ൌ ܸܽ ݎ൭ ܵ ൱ ଵ ୀଵ ൌ ܸܽݎሺܵ ሻ ǥ ܵ ୀଵ ଵ ൌ ሾߚ ߪ ଶ ߚ ߤଶ ሿ ୀଵ ͺ ͵Ǧͳͳͳ͵ Comparing with the mean and variance of T: ଵ ܧሺܶԢሻ ൌ ߤ ߚ ൌ ܧሺܶሻ ୀଵ ଵ ܸܽݎሺܶ ᇱ ሻ ൌ ሾߚ ߪ ଶ ߚ ߤଶ ሿ ୀଵ ଵ ଵ ଶ ൌ ሾߚ ߪ ߚ ሺͳ െ ߚ ሻߤଶ ሿ ߤ ߚ ଶ ଶ ୀଵ ୀଵ ଵ ൌ ሺܶሻ ߤଶ ߚ ଶ ᇣᇧ ᇧᇤᇧ ୀଵ ᇧᇥ வ ܸܽݎሺܶሻ (4 Marks) [Total Marks- 8] Solution 7 : Consider a random variable U that has a uniform distribution on [0, 1] and let F be the cumulative distribution function of the standard normal distribution. Define a random variable X as below: ଵ ଵ െି ܨଵ ቀܷ ଶቁ Ͳ ൏ ଶ ܺൌቐ . ଵ ଵ െି ܨଵ ቀܷ െ ଶቁ ଶ ൏ ܷ ͳ i. The range of X for each sub-part is: x ൌ Ͳ ֜ ܺ ൌ െ ିଵ ሺͲǤͷሻ ൌ Ͳ ቊ ՜ ͳൗʹ െ֜ ൌ െ ିଵ ሺͳሻ ՜ െλ x ቊ ՜ ͳൗʹ ֜ ൌ െ ିଵ ሺͲሻ ՜ λ ൌ ͳ ֜ ܺ ൌ െ ିଵ ሺͲǤͷሻ ൌ Ͳ For -ь < x ч 0, ͳ ܲሾെλ ൏ ܺ ݔሿ ൌ ܲ െλ ൏ െି ܨଵ ൬ܷ ൰ ݔ൨ ʹ ͳ ൌ ܲ ܨሺെݔሻ ܷ ൏ ͳ൨ ʹ ͳ ͳ ൌ ܲ ͳ െ ܨሺݔሻ െ ܷ ൏ ൨ ʹ ʹ ͻ ͵Ǧͳͳͳ͵ ͳ ͳ ൌ ܲ െ ܨሺݔሻ ܷ ൏ ൨ ʹ ʹ ͳ ͳ ൌ െ െ ܨሺݔሻ൨ǥ ሺͲǡ ͳሻ ʹ ʹ ൌ ܨሺݔሻ For 0 ч x < +ь, ͳ ܲሾ ݔ ܺ ൏ λሿ ൌ ܲ ݔ െି ܨଵ ൬ܷ െ ൰ ൏ λ൨ ʹ ͳ ൌ ܲ Ͳ ൏ ܷ െ ܨሺെݔሻ൨ ʹ ͳ ͳ ൌ ܲ ൏ ܷ ͳ െ ܨሺݔሻ൨ ʹ ʹ ͵ ͳ ൌ ܲ ൏ ܷ െ ܨሺݔሻ൨ ʹ ʹ ͳ ͵ ൌ െ ܨሺݔሻ൨ െ ǥ ሺͲǡ ͳሻ ʹ ʹ ൌ ͳ െ ܨሺݔሻ Thus, X has a standard normal distribution. (7 Marks) ii. Given u = 0.619, ͳ ݔൌ െି ܨଵ ൬ͲǤͳͻ െ ൰ ʹ ൌ െି ܨଵ ሺͲǤͳͳͻሻ ൌ െି ܨଵ ሺͳ െ ͲǤͺͺͳሻ ൌ െି ܨଵ ൫ͳ െ ܨሺͳǤͳͺሻ൯ ൌ െି ܨଵ ൫ܨሺെͳǤͳͺሻ൯ ൌ െሺെͳǤͳͺሻ ൌ ͳǤͳͺ Given u = 0.483, ͳ ݔൌ െି ܨଵ ൬ͲǤͶͺ͵ ൰ ʹ ିଵ ሺͲǤͻͺ͵ሻ ൌ െܨ ൌ െ2.12 (3 Marks) [Total Marks- 10] ͳͲ ͵Ǧͳͳͳ͵ Solution 8 : Let ɽ ೡ {0, 1 … 10} be the number of weapon-producing nuclear plants in country A. Let X be the number of nuclear plants found to be producing nuclear weapons by the secret agent. Thus X is a random variable with possible values 0, 1 or 2. i. The decision making process of country B can be formulated as below: “Test H0: ɽ = 0 against H1: ɽ > 0” This is equivalent to testing none of the plants are weapon-producing under null hypothesis against at least 1 plant is weapon-producing under alternate hypothesis. (1 Mark) ii. The type of error made by B if she does not invade A when some of the nuclear plants in A are indeed producing weapons is Type II Error. (1 Mark) iii. The critical region adopted by country B is given as: [ x ೡ {0, 1, 2} : x > 0 ] as H0 will be rejected if at least 1 plant is found to be weapon-producing. (1 Mark) iv. The probability of a Type I error is given by P[Reject H0 | H0 is true] i.e. P[X > 0 |ɽ = 0]. Given ɽ = 0, this means there are no weapon producing nuclear plants. This implies X = 0 with probability 1. Therefore, the probability of a Type I error = ܲሾܺ Ͳȁߠ ൌ Ͳሿ ൌ Ͳ (1 Mark) [Total Marks- 4] Solution 9 : For the study involving ‘n’ independent three-toss experiments, the frequencies a, b, c, d, e, f, g and h of the eight possible outcome sequences and the associate probabilities are as follows: HHH a ͳ ଶ ߠ ʹ ͳ ߠሺͳ െ ߠሻ ʹ ͳ ሺͳ െ ߠሻଶ ʹ ͳ ߠሺͳ െ ߠሻ ʹ TTH e THT f HTT g TTT h ͳ ଶ ߠ ʹ ͳ ߠሺͳ െ ߠሻ ʹ i. HHT b ͳ ሺͳ െ ߠሻଶ ʹ HTH c ͳ ߠሺͳ െ ߠሻ ʹ THH d Let S be a random variable denoting the total number of outcomes of type HH or TT observed in the above ‘n’ three-toss experiments. Thus, the observed value of S in terms of frequencies as: ͳͳ ͵Ǧͳͳͳ͵ s = # of outcomes of the type HH or TT = a * #{HHH} + b* #{HHT} + c * #{HTH} + d * #{THH} + e * #{TTH} + f * #{THT} + g * #{HTT} + h * #{TTT} =a*2+b*1+c*0+d*1+e*1+f*0+g*1+h*2 = 2a + b + d + e + g + 2h ii. (2 Marks) The likelihood equation for the given data will be: ܮ൫ߠǢܺ൯ ௗ ͳ ͳ ͳ ͳ ͳ ͳ ൌ ߠ ଶ ൨ כ ߠሺͳ െ ߠሻ൨ כ ሺͳ െ ߠሻଶ ൨ כ ߠሺͳ െ ߠሻ൨ כ ߠሺͳ െ ߠሻ൨ כ ሺͳ െ ߠሻଶ ൨ ʹ ʹ ʹ ʹ ʹ ʹ ͳ ͳ ଶ כ ߠሺͳ െ ߠሻ൨ כ ߠ ൨ ʹ ʹ ͳ ାାାௗାାାା ൌ൬ ൰ ߠ כଶାାௗାାାଶ כሺͳ െ ߠሻାଶାௗାାଶା ʹ ߠ ן௦ ሺͳ െ ߠሻଶି௦ NB: x s = 2a + b + d + e + g + 2h x n=a+b+c+d+e+f+g+h x 2n – s = 2*( a + b + c + d + e + f + g + h) – (2a + b + d + e + g + 2h) = b + 2c +d + e + 2f + g Taking logarithm of the likelihood function, ݈൫ߠǢܺ൯ ൌ ݈݃ כ ݏ ሺߠሻ ሺʹ݊ െ ݏሻ ݈݃ כ ሺͳ െ ߠሻ Ǥ Ǥ Ǥીǣ ݊ʹ ݏെ ݏ ߲݈ ൌ െ ߲ߠ ߠ ͳ െ ߠ డ Solving: డఏ ൌ Ͳǡwe get ݏ ʹ݊ ߲ଶ݈ ݏ ʹ݊ െ ݏ ൏ Ͳ ֜ ݉ܽ݉ݑ݉݅ݔ ቈ݄݇ܿ݁ܥǣ ଶ ൌ െ ଶ െ ฬ ሺͳ െ ߠሻଶ ఏ ߠ ߲ߠ ಾಽಶ ߠொ ൌ (4 Marks) ͳʹ iii. ͵Ǧͳͳͳ͵ For large n, ߠொ is approximately normal, and is unbiased with variance given by the CramerRao lower bound, that is: ߠொ ൎ ܰሺߠǡ ܤܮܴܥሻ ǡ ൌ ͳ ߲ଶ݈ െ ܧ ଶ ൨ ߲ߠ Now, ߲ଶ݈ ʹ݊ െ ܵ ܵ െ ܧቈ ଶ ൌ ܧ ଶ ൨ ሺͳ െ ߠሻଶ ߲ߠ ߠ ൌ ͳ ͳ ܧ כሺܵሻ כሾʹ݊ െ ܧሺܵሻሿ ଶ ሺͳ െ ߠሻଶ ߠ ൌ ͳ ͳ ߠ݊ʹ כ כሾʹ݊ െ ʹ݊ߠሿǥ ܧ݁ܿ݊݅ݏሺܵሻ ൌ ʹ݊ߠ ଶ ሺͳ െ ߠሻଶ ߠ ͳ ͳ ൌ ʹ݊ ൨ ߠ ͳെߠ ʹ݊ ൌ ߠሺͳ െ ߠሻ The asymptotic variance of the MLE of ɽ is ɽ (1 – ɽ) / 2n. Thus, the approximate asymptotic distribution of the MLE of ɽ is: ߠொ ൎ ܰ ቈߠǡ ߠሺͳ െ ߠሻ ʹ݊ (3 Marks) iv. In a particular study involving 1000 three-toss experiments, the observed frequencies were: a 135 b 131 c 125 d 125 e 123 f 115 g 125 h 121 Here: x n = 1000 x s = 2a + b + d + e + g + 2h = 1016 ߠொ ൌ ݏ ͳͲͳ ൌ ൌ ͲǤͷͲͺ ʹ݊ ʹͲͲͲ A large-sample 95% confidence interval for ɽ is given as: ߠொ ൫ͳ െ ߠொ ൯ ߠொ േ ݖǤଶହ כඨ ʹ݊ ͳ͵ ͵Ǧͳͳͳ͵ ͲǤͷͲͺ Ͳ כǤͶͻʹ ൌ ͲǤͷͲͺ േ ͳǤͻ כඨ ʹͲͲͲ ൌ ͲǤͷͲͺ േ ͲǤͲʹʹ ൌ ሺͲǤͶͺǡ ͲǤͷ͵Ͳሻ (4 Marks) v. It has been suggested that the above model is incorrect in its assumption that the probability of a head on the first toss is 0.5. In order to test this, we can set up the following hypothesis testing problem: H0: p = 0.5 against H1: p т 0.5 Here: p = probability of getting a head on first toss. Let X denote the number of heads observed in the first toss in the above n (=1000) 3-toss experiments. As a random variable, X follows Binomial (n, p) distribution. Given n is large and so using Normal approximation, a 95% asymptotic confidence interval for ‘p’ is stated as below: Ƹ േ ͳǤͻ כ ܺ ඥ݊Ƹ ሺͳ െ Ƹ ሻ Ƹ ൌ ݊ ݊ The given data reveals that: X = a + b + c + g = 516. This means Ƹ ൌ ͲǤͷͳǤ The 95% confidence interval: 0.516 ± 0.031 = (0.485, 0.547) contains the value 0.5. Hence, we can conclude that the criticism does not hold water at the 5% level of significance. (5 Marks) [Total Marks-18] ͳͶ ͵Ǧͳͳͳ͵ Solution 10 : i. Comments on the plot x The centers of the distributions differ for all the four cities. Thus there is a prima facie case for suggesting that the underlying means are different. x The difference between the mean time taken to commute to office in peak hours and nonpeak hours are in the order City A (highest), City D (lowest). x The variation in the data for City C is lowest compared to City D which appears to be highest. However, with only 7 observations for each city, we cannot be sure that there is a real underlying difference in variance. (2 Marks) ii. Following are the assumptions underlying analysis of variance: x x x The populations must be normal. The populations have a common variance. The observations are independent. (2 Marks) iii. We are carrying out the following test: ܪ : The mean of differences is same for each city against ܪଵ : The mean of differences are not the same for all of the cities To carry out the ANOVA, we must first compute the Sum of Squares ൌ ͳǡͶͻͷ െ ଵଷమ ଶ଼ ൌ ͳǡͳͳǤͳͳ ͳͲ͵ଶ ͳ ൌ ͷͳǤͳͳ ൌ ሺͶଶ ͶͶଶ ͺଶ ሺെͳ͵ሻଶ ሻ െ ʹͺ ܵܵோ ൌ ்ܵܵ െ ܵܵ ൌ ͲͲǤͲͲ The ANOVA table is: Source Treatments Residual Total ܪ ǡ ܨൌ ଵଶǤସ ଶହǤ df 3 24 27 SS 516.11 600.00 1,116.11 MS 172.04 25.00 F 6.88 ൌ Ǥͺͺǡ ܨଷǡଶସ . ͳͷ ͵Ǧͳͳͳ͵ The 5% critical point is 3.009, so we have sufficient evidence to reject ܪ at the 5% level. Therefore it is reasonable to conclude that there are underlying differences between the cities. (4 Marks) iv. Analysis of the mean differences Since, ݕതଵ כൌ ͻǤͳͶǢݕതଶ כൌ ǤʹͻǢݕതଷ כൌ ͳǤͳͶǢݕതସ כൌ െͳǤͺ we can write: ݕതଵ כ ݕതଶ כ ݕതଷ כ ݕതସכ ߪො ଶ ൌ ܵܵோ ൌ ʹͷ ݊െ݇ The least significant difference between any pair of means is: ͳ ͳ ͳ ͳ ݐଶସǡǤଶହ ߪ כොඨ൬ ൰ ൌ ʹǤͲͶ כξʹͷ כඨ൬ ൰ ൌ ͷǤͷʹ Now we can examine the difference between each of the pairs of means. If the difference is less than the least significant difference then there is no significant difference between the means. We have: ݕതଵ כെ ݕതଶ כൌ ʹǤͺͷǢݕതଶ כെ ݕതଷ כൌ ͷǤͳͷǢݕതଷ כെ ݕതସ כൌ ͵ǤͲͲ Observing that all these 3 differences are less than 5.52, we underline these pairs to show that they have no significant difference: ഥ כ ࢟ ഥ כ ࢟ ഥכ ഥ כ ࢟ ࢟ Examining to see if the first two groups can be combined: ݕതଵ כെ ݕതଷ כൌ ͺǤͲͲ There is a significant between means 1 and 3, so we cannot combine the first two groups. Examining to see if the last two groups can be combined: ͳ ͵Ǧͳͳͳ͵ ݕതଶ כെ ݕതସ כൌ ͺǤͳͷ There is a significant between means 2 and 4, so we cannot combine the last two groups. Therefore the diagram remains as before. (6 Marks) [Total Marks-14] Solution 11 : i. Fitted Linear Regression Equation The relevant summary statistics to fit the equation are: σ ݔଶ = 12,666.58; σ ݕଶ = 119,026.9; n = 12. σ = ݔ385.2; σ = ݕ1,162.5; σ = ݕݔ38,191.41; ͵ͺͷǤʹ ଶ ൰ ൌ ͵ͲͳǤ ܵ௫௫ ൌ ݔെ ݊ ݔൌ ͳʹǤͷͺ െ ͳʹ כ൬ ͳʹ ͵ͺͷǤʹ ͳͳʹǤͷ ൰൬ ൰ ൌ ͺͷǤͳ ܵ௫௬ ൌ ݕݔെ ݊ ݕݔൌ ͵ͺͳͻͳǤͶͳ െ ͳʹ כ൬ ͳʹ ͳʹ ͳͳʹǤͷ ଶ ଶ ଶ ܵ௬௬ ൌ ݕെ ݊ ݕൌ ͳͳͻͲʹǤͻͲ െ ͳʹ כ൬ ൰ ൌ ͶͲͻǤͳ ͳʹ ଶ ଶ The coefficients of the regression equation are: ୶୷ ͺͷǤͳ ൌ ൌ ʹǤͻͲ ୶୶ ͵ͲͳǤ ͳͳʹǤͷ ͵ͺͷǤʹ ߙො ൌ LJ െ ߚመ כLJ ൌ ൬ ൰ െ ʹǤͻͲ כ൬ ൰ ൌ ͵Ǥͺ ͳʹ ͳʹ ܠൌ Ǥ ૠૡ Ǥ ૢܠ ෝ ࢼ Therefore, the fitted regression line is: ܡൌ ࢻ ߚመ ൌ (4 Marks) ii. Confidence interval for ɴ Assuming normal errors with a constant variance: 95% confidence interval for ɴ: ߚመ േ ݐିଶ ሺʹǤͷͲΨሻ ݏ כǤ ݁Ǥ ሺߚመ ሻ ͳ ǣݏǤ ݁Ǥ ൫ߚመ ൯ ൌ ඨ ͵Ǧͳͳͳ͵ ߪො ଶ ܵ௫௫ ܵ௫௬ ଶ ͳ ቈܵ െ ൌ ͵ͺǤͲ ߪො ൌ ݊ െ ʹ ௬௬ ܵ௫௫ ଶ ͵ͺǤͲ ݏǤ ݁Ǥ ൫ߚመ ൯ ൌ ඨ ൌ ͳǤͳ͵ ͵ͲͳǤ ͻͷΨ ϐ ȾǣʹǤͻͲ േ ʹǤʹʹͺ ͳ כǤͳ͵ ൌ ሺͲǤ͵ͺǡ ͷǤͶʹሻ (5 Marks) iii. 95% confidence intervals for the mean IBM share price ͳ ሺݔ െ ݔҧ ሻଶ ݕො௫బ േ ݐିଶ ሺʹǤͷͲΨሻඨߪො ଶ ቆ ቇ ܵ௫௫ ݊ The Dell Share price is US $ 40 (x0). ݕො௫బ ൌ ͵Ǥͺ ʹǤͻͲ כͶͲ ൌ ͳͳͻǤͺ Thus, 95% Confidence interval: ଵ = ͳͳͻǤͺ േ ʹǤʹʹͺ כට͵ͺǤͲ כቂଵଶ ሺସିଷଶǤଵሻమ ଷଵǤ ቃ = ͳͳͻǤͺ േ ʹǤʹʹͺ Ͳͳ כǤͷͻͺͻ = ሺͻǤͳǡ ͳͶ͵Ǥ͵ͻሻ (4 Marks) [Total Marks-13] Solution 12 : We are carrying out the following test: ǣɊ ൌ ͵ͲͲݒȀݏଵ ǣɊ ൏ ͵ͲͲ Under ǡ ͳͺ ഥ െ ͵ͲͲ ͵ͲΤξ ͵Ǧͳͳͳ͵ ̱ሺͲǡͳሻ The test statistic is ʹͻͲ െ ͵ͲͲ ͵ͲΤξͲ ൌ െʹǤͷͺͳͻ From the Tables, -2.3263 is the critical value for a left-sided one-tailed 1% test. The test statistic of -2.5819 is less than this critical value, so we do not have sufficient evidence to accept . Therefore, it can be concluded that the quality control suspicions are true at the 1% level of significance. [Total Marks-4] XXXXXXXXXXXXXXXXXXXX ͳͻ