HERIOT-WATT UNIVERSITY

17.1SB2 STATISTICS II

Friday 21st March 2003, 9:30-11:30

You should attempt all questions. A total of 105 marks is available. Full credit will be given for a score of 100 marks.

Approved electronic calculators may be used.

1. A group of 10 students consists of 5 sets of twins. Suppose that 3 students are selected at random without replacement.

a) Show that the total number of such selections containing no set of twins is 80. [4]

b) Hence, or otherwise, show that the probability that the selection contains a pair of twins is 1/3. [3]

2. Suppose that a fair coin is tossed 5 times. The results of the tosses are independent of each other. Let X be the number of heads that are observed.

a) Specify fully the distribution of X. [2]

b) Let A be the event that more than 1 head is observed.

i) Describe the event A^c ("not A") in terms of the number of heads observed. [2]

ii) Calculate P(A^c) and, hence, P(A). [3]

3. Let X be a continuous random variable whose p.d.f. is f(x) = θ x^(θ-1), 0 ≤ x ≤ 1, and f(x) = 0 elsewhere, where θ > 0.

a) Sketch (roughly) the graph of f(x) for θ = 2, and show the probability P(X > 0.5) as an area under this graph. [3]

b) Let Y = X². State the range of Y and show that its p.d.f. is given by f_Y(y) = (θ/2) y^(θ/2 - 1), 0 < y < 1. [5]
(Hint: The function y = g(x) = x² is increasing over 0 < x < 1. Its inverse g⁻¹ can be defined over the range of Y.)

c) By evaluating an appropriate integral, show that E(X) = θ/(θ + 1). [5]

d) Let the values in a random sample of size 5 from this distribution be

0.41  0.84  0.89  0.94  0.98

Calculate θ̂, the method-of-moments estimate of θ, from these data. [5]

4. a) Let X be a random variable. Explain what is meant by the cumulative distribution function (c.d.f.) of X, F_X(x). State the properties that a c.d.f. must satisfy and describe the difference between the c.d.f. of a discrete random variable and that of a continuous random variable. [7]

b) Let X ~ Bin(8, 0.3). Use statistical tables to evaluate P(X < 4).
[3]

c) Let X ~ Bin(16, 0.7). Use statistical tables to evaluate P(13 > X > 8). [4]

d) Let X be a random variable whose probability density function is f_X(x) = 2 - 2x, 0 < x < 1, and f_X(x) = 0 otherwise. Determine the c.d.f., F_X(x), for 0 < x < 1 by evaluating an appropriate integral. [4]

5. A gambler wishes to test whether or not a given die is fair (i.e. all 6 outcomes are equiprobable). To do this he throws the die 120 times and notes the score for each throw. The results are shown in the following table.

Score (x):         1   2   3   4   5   6
Frequency (f_x):  11  16  24  16  25  28

a) Compute the expected frequency for each score for a sample of size 120, assuming the die is fair. [1]

b) Compare the expected numbers with the actual numbers given above by calculating the value of the χ² statistic and estimating and commenting on the associated p-value. Do you think the die is fair? [7]

6. The concentration (mg/litre) of a certain blood protein was measured on a random sample of 30 healthy males. The results were (in ascending order):

10  13  14  16  17  18  18  19  19  21  22  24
26  27  29  29  36  40  40  40  41  42  43  44
44  45  46  46  49  50

i) Calculate the median and the quartiles Q1 and Q3 for these data. [7]

ii) Construct a stem-and-leaf diagram for the data. Does the plot support the suggestion that the distribution of concentrations is Normal? [4]

7. Let X be a random variable whose distribution is N(μ, σ²).

a) State the distribution of the quantity Z = (X - μ)/σ. [2]

b) Suppose that X ~ N(80, 9). Use statistical tables to calculate the following probabilities:

i) P(X > 83); [3]

ii) P(77 < X < 80). [5]

8. Let X_1, …, X_n be independent, identically distributed (i.i.d.) random variables whose common distribution has mean μ and variance σ².

a) Show that the mean and variance of Y = Σ X_i (summed over i = 1, …, n) are given by nμ and nσ² respectively, and use the Central Limit Theorem to identify approximately the distribution of Y.
[5]

b) The lifetime, T, measured in years, of a certain brand of light bulb has mean 2.0 and variance 5.0. Use the Central Limit Theorem to calculate an approximation to the probability that the sum of the lifetimes of 30 randomly selected bulbs is less than 58 years. [6]

9. The population distribution of the amount of nicotine (measured in g) in a single cigarette of a certain brand (brand A) is known to be Normal with unknown mean μ_A. Eight cigarettes are selected at random and their nicotine content measured. The resulting data are:

8.3  11.3  10.6  9.8  11.0  10.6  10.1  10.0

For these data Σ x_i = 81.7 and Σ x_i² = 840.35.

a) Evaluate the sample mean and the sample variance for these data. [4]

b) Evaluate a 95% confidence interval for the unknown mean μ_A. [5]

c) Suppose that, for a sample of size 10 of a second brand of cigarette (brand B), the sample mean and standard deviation are 11.0 and 0.9 respectively. Assuming that the population variances for brands A and B are equal, calculate a 95% confidence interval for the difference in the mean nicotine content of brands A and B. Comment on any difference in the mean nicotine content of the brands. (Hint: First calculate the pooled sample variance.) [6]

END OF QUESTION PAPER
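
The following is not part of the paper. It is a minimal Python sketch for checking two of the numerical results by brute force: the count of 80 and the probability of 1/3 in Question 1, and the value of the goodness-of-fit statistic in Question 5. The labelling of the students (students 2k and 2k+1 are taken to be twins) and all variable names are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import combinations

# Question 1: label the students 0..9 so that students 2k and 2k+1 are twins
# (an assumed labelling; any pairing into 5 sets of twins gives the same count).
students = range(10)
selections = list(combinations(students, 3))

# A selection contains no set of twins when its 3 students come from 3 different pairs.
no_twins = [s for s in selections if len({i // 2 for i in s}) == 3]
p_twins = Fraction(len(selections) - len(no_twins), len(selections))

# Question 5: chi-squared goodness-of-fit statistic for a fair die.
observed = [11, 16, 24, 16, 25, 28]
expected = sum(observed) / len(observed)  # 120 throws / 6 scores = 20 per score
chi_sq = sum((o - expected) ** 2 / expected for o in observed)

print(len(no_twins), p_twins, round(chi_sq, 2))
```

The enumeration covers all 120 possible selections, so it confirms the counting argument rather than replacing it. The statistic from Question 5 would still need to be compared with chi-squared tables on 5 degrees of freedom to estimate the p-value.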