The written Master's Examination
Option Statistics and Probability
FALL 2009

Full points may be obtained for correct answers to 8 questions. Each numbered question (which may have several parts) is worth the same number of points. All answers will be graded, but the score for the examination will be the sum of the scores of your best 8 solutions.

Use separate answer sheets for each question. DO NOT PUT YOUR NAME ON YOUR ANSWER SHEETS. When you have finished, insert all your answer sheets into the envelope provided, then seal and print your name on it.

Any student whose answers need clarification may be required to submit to an oral examination.

MS Exam, Option Statistics and Probability, FALL 2009

1. (Stat 401) Let $U$ and $V$ be two independent, standard normal random variables.
(a) Find $P(U/V \ge 1)$.
(b) Find the distribution of $U/V$. (Hint: you may start with the joint distribution of $U/V$ and $V$.)

2. (Stat 411) Let $X_1, \dots, X_n$ be independent, identically distributed Poisson random variables with parameter $\theta$, where $\theta > 0$.
(i) Show that $Y = \sum_{i=1}^{n} X_i$ is complete sufficient.
(ii) Use the Rao-Blackwell theorem to derive the MVUE (Minimum Variance Unbiased Estimator) of $P(X_1 = 0)$.

3. (Stat 411) Let [...] and [...] be two independent random samples from normal distributions [...] and [...] respectively.
[1] Find the likelihood ratio for testing [...] against [...].
[2] Rewrite [...] as a function of [...] which has a well-known distribution.

4. (Stat 416) Consider the following data, which give the miles per gallon of a make of car before and after application of a newly developed gasoline additive.

Car       1     2     3     4     5     6     7
Before   17.2  21.6  19.5  19.1  22.0  18.7  20.3
After    18.3  20.8  20.9  21.2  22.7  18.6  21.9

State the basic model assumptions under which the Wilcoxon Signed Rank Test can be used. Set up the null hypothesis and the alternative, determine the value of the test statistic, and explain how you would determine the p-value. When would you use the Sign Test instead?

5. (Stat 431)
1- How would you draw a simple random sample of size 100 from a finite population of size 20,000?
2- A simple random sample of size 10 was drawn from a finite population of size 80. These 10 units were surveyed and the related data on a single survey variable Y were collected. However, it was later observed that, in recording the data in a computer file, one of the 10 units was missing from the file. What will be the HT estimator of the population total based on the remaining 9 data points? Indicate if you need to make any assumptions.
3- Based on the following survey sampling plan, how would you estimate the population mean and its standard error if, upon implementation of the sampling design, the sample {3} is selected?

Samples in the support of the design:   {1, 2, 3}   {3, 4, 5}   {2, 4, 5}   {3}   {1, 4, 5}
Prob. distribution over the support:       3/9         2/9         2/9      1/9      1/9

6. (Stat 461) Suppose that coin 1 has probability 0.7 of coming up heads, and coin 2 has probability 0.6 of coming up heads. If the coin flipped today comes up heads, then we select coin 1 to flip tomorrow, and if it comes up tails, then we select coin 2 to flip tomorrow. If the coin initially flipped is equally likely to be coin 1 or coin 2, then what is the probability that the coin flipped on the third day after the initial flip is coin 2?

7. (Stat 471) Consider a matrix $A$ of order $m \times n$ and let $b$ be a column $m$-vector. Prove that exactly one of the following holds. Either $Ax = b$ has a solution $x \ge 0$, or $A^T y \ge 0$, $b^T y < 0$ has a solution $y$, but not both.

8. (Stat 471) What are complementary slackness conditions for two linear programming problems that are dual to each other?
Use the complementary slackness conditions to show that for a transportation problem with 3 warehouses and 4 cities and with cost matrix

$$C = \begin{pmatrix} 5 & 7 & 9 & 6 \\ 6 & 7 & 10 & 5 \\ 7 & 6 & 8 & 1 \end{pmatrix},$$

and with supplies 120, 140, 100 from warehouses 1, 2, and 3 respectively and with demands 100, 60, 80, and 120 at cities 1, 2, 3 and 4 respectively, the shipping schedule:

Ship from warehouse 1 to city 1: 100 units
Ship from warehouse 1 to city 3: 20 units
Ship from warehouse 2 to city 2: 60 units
Ship from warehouse 2 to city 3: 60 units
Ship from warehouse 2 to city 4: 20 units
Ship from warehouse 3 to city 4: 100 units

is optimal.

9. (Stat 473) Given the bimatrix game $(A, B)$ of order $m \times n$ with Nash equilibrium strategies $p = (p_1, p_2, \dots, p_m)$, $q = (q_1, q_2, \dots, q_n)$, show that they are optimal strategies for the two players in case $A = -B$.

10. (Stat 481) Four corn varieties were tested for their production in an experiment with 4 blocks and the following yield data $y_{ij}$ were obtained:

            Block
Variety    1      2      3      4
A          9.3    9.4    9.6   10.0
B          9.4    9.3    9.8    9.9
C          9.2    9.4    9.5    9.7
D          9.7    9.6   10.0   10.2

(1). Assume that there is no interaction effect between the corn variety and the blocks. Write down the appropriate model and necessary restrictions to analyze the data.
(2). Denote by $\bar{y}_{i\cdot}$ and $\bar{y}_{\cdot j}$ the row mean and column mean respectively. We have

$$\sum_{i=1}^{4}\sum_{j=1}^{4} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 = 0.30, \qquad \sum_{i=1}^{4}\sum_{j=1}^{4} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2 = 0.70, \qquad \sum_{i=1}^{4}\sum_{j=1}^{4} (y_{ij} - \bar{y}_{\cdot\cdot})^2 = 1.11.$$

Construct the ANOVA Table:

Source    S.S.    DF    MS    F

(3). Formulate the hypotheses and draw your conclusions for both effects based on the above ANOVA table at significance level 0.05. [Given: F(0.05,3,9)=3.86, F(0.05,3,12)=3.49]

11. (Stat 481) To analyze a dataset with 10 observations of $(x_i, y_i)$ and $\sum x_i = 18$, $\sum y_i = 20$, $\sum x_i^2 = 40$, $\sum y_i^2 = 50$, $\sum x_i y_i = 30$, one may consider the simple linear regression model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$; $\varepsilon \sim N(0, \sigma^2 I_n)$.
(a) Find the least-squares estimate of the simple linear regression model.
(b) Perform a test of $H_0: \beta_1 = 0$ against $H_a: \beta_1 < 0$. Use $\alpha = 0.05$.
(c) Estimate $E(y \mid x = 2)$ and construct a 95% confidence interval for $E(y \mid x = 2)$.
(d) Predict the value of $y$ at $x = 2$, and construct a 95% confidence interval for $y$. Hint: you need to consider the effect of $\varepsilon$ on the prediction of $y$.
[Given: z(0.05) = 1.645, z(0.025) = 1.96, t(0.05;8) = 1.86, t(0.025;8) = 2.306]

Statistics 401&481 - MS Exam Fall Semester 2009

1. Let $U$ and $V$ be two independent, standard normal random variables.
(a) Find $P(U/V \ge 1)$.
(b) Find the distribution of $U/V$. (Hint: you may start with the joint distribution of $U/V$ and $V$.)

Solution:
(a)
$$P(U/V \ge 1) = P(U \ge V \ge 0) + P(U \le V \le 0) = \int_0^{\infty}\!\int_0^{u} \phi(u)\phi(v)\,dv\,du + \int_{-\infty}^{0}\!\int_{u}^{0} \phi(u)\phi(v)\,dv\,du$$
$$= \int_0^{\infty} \phi(u)\left(\Phi(u) - 1/2\right)du + \int_{-\infty}^{0} \phi(u)\left(1/2 - \Phi(u)\right)du = \int_{1/2}^{1} (t - 1/2)\,dt + \int_0^{1/2} (1/2 - t)\,dt = 1/4,$$
where $t = \Phi(u)$, and $\phi(\cdot)$ and $\Phi(\cdot)$ are the pdf and cdf of the standard normal distribution, respectively.

(b) Let $X = U/V$ and $Y = V$; then $U = XY$ and $V = Y$. Therefore, the joint pdf of $(X, Y)$ is
$$f_{X,Y}(x, y) = f_{U,V}(u, v)\left|\det\begin{pmatrix} \partial u/\partial x & \partial u/\partial y \\ \partial v/\partial x & \partial v/\partial y \end{pmatrix}\right| = \phi(xy)\,\phi(y)\left|\det\begin{pmatrix} y & x \\ 0 & 1 \end{pmatrix}\right|$$
$$= \frac{1}{\sqrt{2\pi}}\exp\left\{-\frac{x^2 y^2}{2}\right\}\frac{1}{\sqrt{2\pi}}\exp\left\{-\frac{y^2}{2}\right\}|y| = \frac{1}{2\pi}\exp\left\{-\frac{(1 + x^2)y^2}{2}\right\}|y|, \quad \forall x, y \in \mathbb{R},$$
and the marginal pdf of $X$ is
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy = \int_{-\infty}^{\infty} \frac{1}{2\pi}\exp\left\{-\frac{(1 + x^2)y^2}{2}\right\}|y|\,dy = \frac{1}{\pi(1 + x^2)}\int_0^{\infty} \exp\left\{-\frac{(1 + x^2)y^2}{2}\right\}d\left(\frac{(1 + x^2)y^2}{2}\right) = \frac{1}{\pi(1 + x^2)}, \quad \forall x \in \mathbb{R}.$$
[Note: This proof shows that the ratio of two independent standard normal random variables follows a standard Cauchy distribution.]

Stat 411, Estimation problem, Fall 2009

Let $X_1, \dots, X_n$ be independent, identically distributed Poisson random variables with parameter $\theta$, where $\theta > 0$.
(i) Show that $Y = \sum_{i=1}^{n} X_i$ is complete sufficient.
(ii) Use the Rao-Blackwell theorem to derive the MVUE (Minimum Variance Unbiased Estimator) of $P(X_1 = 0)$.

Solution.
(i) The p.d.f. of a Poisson random variable is
$$f(x; \theta) = \begin{cases} \dfrac{e^{-\theta} e^{x \ln \theta}}{x!}, & x = 0, 1, \dots \\ 0, & \text{otherwise.} \end{cases}$$
Clearly this belongs to the regular exponential class. Hence $Y$ is complete sufficient.

(ii) Let
$$U = \begin{cases} 1, & \text{if } X_1 = 0 \\ 0, & \text{otherwise.} \end{cases}$$
Clearly $E(U) = P(X_1 = 0) = e^{-\theta}$ for all $\theta$; hence $U$ is unbiased for $P(X_1 = 0) = e^{-\theta}$. For $y = 0, 1, \dots$
$$E(U \mid y) = P(X_1 = 0 \mid Y = y) = \frac{P(X_1 = 0)\,P\left(\sum_{i=2}^{n} X_i = y\right)}{P\left(\sum_{i=1}^{n} X_i = y\right)} = \frac{e^{-\theta}\, e^{-(n-1)\theta}\,((n-1)\theta)^y / y!}{e^{-n\theta}\,(n\theta)^y / y!} = \left(\frac{n-1}{n}\right)^{y}.$$
Hence by the Rao-Blackwell theorem $\left(\frac{n-1}{n}\right)^{\sum_{i=1}^{n} X_i}$ is the MVUE for $P(X_1 = 0)$.

Stat 411 (chapter 8, 9), Fall 2009:

Let [...] and [...] be two independent random samples from normal distributions [...] and [...] respectively.
[1] Find the likelihood ratio for testing [...] against [...].
[2] Rewrite [...] as a function of [...] which has a well-known distribution. What is it?

Solution:
[1] Under [...], write [...]. The likelihood function attains its maximum at [...], where [...] and [...]. Under [...], the likelihood function attains its maximum at [...]. Therefore, the likelihood ratio statistic [...].
[2] Write [...]. Then [...] and [...]. [...] follows a normal distribution with mean [...] and variance [...].

Statistics 431-MS Exam Fall Semester 2009

1- How would you draw a simple random sample of size 100 from a finite population of size 20,000?
Answer: Write an algorithm which allows you to draw one unit at a time without replacement until you have drawn 100 units. See Chapter 1 in the text book for the course (see below).

2- A simple random sample of size 10 was drawn from a finite population of size 80. These 10 units were surveyed and the related data on a single survey variable Y were collected. However, it was later observed that, in recording the data in a computer file, one of the 10 units was missing from the file. What will be the HT estimator of the population total based on the remaining 9 data points? Indicate if you need to make any assumptions.
Answer: If it is reasonable to assume that the missing data could have been from any of the 10 units in the sample with equal probability, then the remaining 9 units in the sample form a simple random sample of size 9 from the frame of 80. Under SRS(80, 9) every unit has inclusion probability 9/80, so the HT estimator of the population total is 80/9 times the sum of the 9 observations. Now consult Chapter 4 of the text book under SRS(80, 9) for the remaining parts of the question.

3- Based on the following survey sampling plan, how would you estimate the population mean and its standard error if, upon implementation of the sampling design, the sample {3} is selected?

Samples in the support of the design:   {1, 2, 3}   {3, 4, 5}   {2, 4, 5}   {3}   {1, 4, 5}
Prob. distribution over the support:       3/9         2/9         2/9      1/9      1/9

Answer: We note that the frame has 5 units in it, and the selected probability sample is {3}. Thus we need to compute the first-order inclusion probability for unit 3 under the given sampling plan. Since unit 3 is in three samples, {1, 2, 3}, {3, 4, 5} and {3}, the probability of unit 3 being selected under this sampling plan is 3/9 + 2/9 + 1/9 = 6/9. Therefore, the HT estimator of the population mean is [(observation made on unit 3) × 9/6]/5. As for the standard deviation of this HT estimator, we observe that the second-order inclusion probabilities for all 10 pairs of units are positive for this sampling plan, and therefore we can have an unbiased estimator of the variance of the given HT estimator even though we have only a single observation at hand. Now you can use expression (3.15) in the text, utilizing the first part of the expression to carry out the estimation, and then take the positive square root of it for the standard deviation.

Text Book: Hedayat, A.S., Sinha, B.K. (1991). Design and Inference in Finite Population Sampling. Wiley, New York.

Masters exam questions:

VII. Stat 471: Linear and Non Linear Programming: Consider a matrix $A$ of order $m \times n$ and let $b$ be a column $m$-vector. Prove that exactly one of the following holds. Either $Ax = b$ has a solution $x \ge 0$, or $A^T y \ge 0$, $b^T y < 0$ has a solution $y$, but not both.
Solution: If the LP problem max $0 \cdot x$ subject to $Ax = b$, $x \ge 0$ has an optimal solution, then that solution must first be feasible. In case $A^T y \ge 0$ for some $y$, then for the feasible $x \ge 0$ we will have $x^T A^T y \ge 0$, so that $x^T A^T y = y^T A x = y^T b \ge 0$, and thus $b^T y < 0$ is not possible. In case the set $\{x : Ax = b,\ x \ge 0\}$ is empty, then since min $b^T y$ subject to $A^T y \ge 0$ has a feasible solution $y = 0$, it cannot have an optimal solution, for otherwise it would contradict the duality theorem by asserting an optimal solution to its dual, max $0 \cdot x$ subject to $Ax = b$, $x \ge 0$, which by assumption has not even a feasible solution. The statement that the minimum problem has a feasible solution but no optimal solution implies that there is some $y$ with $A^T y \ge 0$ for which $b^T y < 0$, where 0 is the objective function's value at the feasible point $y^* = 0$.

VIII. Stat 471: Linear and Non Linear Programming: What are complementary slackness conditions for two linear programming problems that are dual to each other? Use the complementary slackness conditions to show that for a transportation problem with 3 warehouses and 4 cities and with cost matrix

$$C = \begin{pmatrix} 5 & 7 & 9 & 6 \\ 6 & 7 & 10 & 5 \\ 7 & 6 & 8 & 1 \end{pmatrix}$$

and with supplies 120, 140, 100 from warehouses 1, 2, and 3 respectively and with demands 100, 60, 80, and 120 at cities 1, 2, 3 and 4 respectively, the shipping schedule:

Ship from warehouse 1 to city 1: 100 units
Ship from warehouse 1 to city 3: 20 units
Ship from warehouse 2 to city 2: 60 units
Ship from warehouse 2 to city 3: 60 units
Ship from warehouse 2 to city 4: 20 units
Ship from warehouse 3 to city 4: 100 units

is optimal.

Solution: Given the linear programming problem, say min $(c, x)$ subject to $Ax = b$, $x \ge 0$, the dual is max $(b, y)$ subject to $A^T y \le c$, where $y$ is unrestricted. The complementary slackness theorem says that at optimal solutions $x^*$, $y^*$ to the two problems, when they exist, we will have $[(Ax^*)_i - b_i]\,y^*_i = 0$ for each $i$, and similarly $(A^T y^* - c)_j\, x^*_j = 0$ for each $j$.
Further, for feasible solutions $\bar{x}$, $\bar{y}$ these complementary conditions, when satisfied, imply that they are also optimal. Thus if we can check the feasibility of our solution, and the feasibility of a dual solution constructed assuming slackness, we can check the optimality. Our transportation problem has dual max $\sum_j d_j v_j + \sum_i s_i u_i$ subject to $u_i + v_j \le c_{ij}$ for all $i, j$. First let us pretend our shipping schedule is optimal, and thus obtain the necessary slackness conditions $u_1 + v_1 = c_{11} = 5$, $u_1 + v_3 = 9$, $u_2 + v_2 = 7$, $u_2 + v_3 = 10$, $u_2 + v_4 = 5$, $u_3 + v_4 = 1$. Starting arbitrarily with $u_1 = 0$ gives the solution $v_1 = 5$, $v_3 = 9$, $u_2 = 1$, $v_2 = 6$, $v_4 = 4$, $u_3 = -3$. These give a feasible solution to the dual, $u_i + v_j \le c_{ij}$. Thus, since the given shipping schedule is feasible, it is optimal by the sufficiency part of complementary slackness for the two feasible solutions.

IX. Stat 473 Game Theory: Given the bimatrix game $(A, B)$ of order $m \times n$ with Nash equilibrium strategies $p = (p_1, p_2, \dots, p_m)$, $q = (q_1, q_2, \dots, q_n)$, show that they are optimal strategies for the two players in case $A = -B$.

Solution: Since the mixed strategies $p$, $q$ constitute a Nash equilibrium, we have $p^T A q \ge x^T A q$ for every mixed strategy $x$. Thus $v = p^T A q \ge (Aq)_i$ for all coordinates $i$. Similarly, the equilibrium condition $p^T B q \ge p^T B y$ for every mixed strategy $y$ says $p^T B q = -p^T A q = -v \ge p^T B e_j = -p^T A e_j$, which implies $(A^T p)_j \ge v$ for all $j$. Thus $p$, $q$ are optimal for the zero-sum game $A$ with value $v$.

STAT 481 - Fall 2009 (Jing Wang)

Four corn varieties were tested for their production in an experiment with 4 blocks and the following yield data $y_{ij}$ were obtained:

            Block
Variety    1      2      3      4
A          9.3    9.4    9.6   10.0
B          9.4    9.3    9.8    9.9
C          9.2    9.4    9.5    9.7
D          9.7    9.6   10.0   10.2

(1). Assume that there is no interaction effect between the corn variety and the blocks. Write down the appropriate model and necessary restrictions to analyze the data.
The model for the data is $Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}$, $i = 1, \dots, 4$, $j = 1, \dots, 4$, where $\mu$ is the overall mean, $\tau_i$ is the treatment (variety) effect such that $\sum_{i=1}^{4} \tau_i = 0$, and $\beta_j$ is the block effect such that $\sum_{j=1}^{4} \beta_j = 0$. The errors $\varepsilon_{ij}$ are assumed to be i.i.d. and to follow a normal distribution with constant variance, i.e. $\varepsilon_{ij} \sim N(0, \sigma^2)$.

(2). Denote the row mean by $\bar{y}_{i\cdot}$ and the column mean by $\bar{y}_{\cdot j}$ respectively. We have

$$\sum_{i=1}^{4}\sum_{j=1}^{4} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 = 0.3, \qquad \sum_{i=1}^{4}\sum_{j=1}^{4} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2 = 0.7, \qquad \sum_{i=1}^{4}\sum_{j=1}^{4} (y_{ij} - \bar{y}_{\cdot\cdot})^2 = 1.11.$$

Construct the ANOVA Table:

Source      Sum of Squares   D.F.   MS       F
Treatment   0.30             3      0.100     8.182
Block       0.70             3      0.233    19.091
Error       0.11             9      0.012
Total       1.11             15

(3). Formulate the hypotheses and draw your conclusions for both effects based on the above ANOVA table at significance level 0.05.

Hypothesis for treatment effects $\tau$: $H_0: \tau_1 = \tau_2 = \tau_3 = \tau_4 = 0$ vs. $H_1$: not all $\tau_i = 0$. From the ANOVA table, $F_{trt} = 8.182 > F(0.05, 3, 9) = 3.86$, which leads to the conclusion that there is a significant treatment (variety) effect.

Hypothesis for block effects $\beta$: $H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$ vs. $H_1$: not all $\beta_j = 0$. As in (2), $F_{block} = 19.091 > F(0.05, 3, 9) = 3.86$, which suggests that the block effect is also significant.

11. To analyze a dataset with 10 observations of $(x_i, y_i)$ and $\sum x_i = 18$, $\sum y_i = 20$, $\sum x_i^2 = 40$, $\sum y_i^2 = 50$, $\sum x_i y_i = 30$, one may consider the simple linear regression model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$; $\varepsilon \sim N(0, \sigma^2 I_n)$.
(a) Find the least-squares estimate of the simple linear regression model.
(b) Perform a test of $H_0: \beta_1 = 0$ against $H_a: \beta_1 < 0$. Use $\alpha = 0.05$.
(c) Estimate $E(y \mid x = 2)$ and construct a 95% confidence interval for $E(y \mid x = 2)$.
(d) Predict the value of $y$ at $x = 2$, and construct a 95% confidence interval for $y$. Hint: you need to consider the effect of $\varepsilon$ on $y$.
Solution:
(a)
$$\hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{\sum x_i y_i - n^{-1} \sum x_i \sum y_i}{\sum x_i^2 - n^{-1} \left(\sum x_i\right)^2} = \frac{30 - 18 \times 20/10}{40 - 18^2/10} = -0.789,$$
and
$$\hat{\beta}_0 = \bar{y} - \bar{x}\hat{\beta}_1 = 2 - 1.8 \times (-0.789) = 3.420.$$

(b) Under $H_0$, $t_1 = \dfrac{\hat{\beta}_1}{s(\hat{\beta}_1)} \sim t(n - 2)$, where
$$s(\hat{\beta}_1) = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}} \quad \text{and} \quad s^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1}.$$
The rejection region is $\{t_1 < -t(0.05; 8)\} = \{t_1 < -1.86\}$ (this is a one-sided test!), and the observed $t_1 = -2.06$. So we have strong evidence to reject $H_0$.

(c) The point estimate is $\hat{E}(y \mid x = 2) = 3.420 - 0.789 \times 2 = 1.842$, and
$$s^2(\hat{E}(y \mid x = 2)) = s^2\left(\frac{1}{n} + \frac{(2 - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right) = 0.117.$$
The 95% C.I. for $E(y \mid x = 2)$ is $\hat{E}(y \mid x = 2) \pm t(0.025; 8) \times s(\hat{E}(y \mid x = 2)) = (1.053, 2.631)$.

(d) The point estimate is $\hat{y} = 3.420 - 0.789 \times 2 = 1.842$, and
$$s^2(\hat{y}) = s^2\left(1 + \frac{1}{n} + \frac{(2 - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right) = 1.228.$$
The 95% C.I. (or prediction interval) for $y$ is $\hat{y} \pm t(0.025; 8) \times s(\hat{y}) = (-0.713, 4.397)$.
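
The arithmetic in the regression solution above can be checked from the summary statistics alone. The following Python sketch is an illustration, not part of the original solution sheet; all variable and function names are hypothetical. It recomputes the estimates with the variance estimate used above, $s^2 = \sum (y_i - \bar{y})^2/(n - 1)$. For comparison, and as an assumption about what a reader may prefer, it also evaluates the same formulas with the residual mean square $SSE/(n - 2)$, the estimator more commonly paired with the $t(n - 2)$ reference distribution.

```python
import math

# Summary statistics stated in the problem (n = 10 observations).
n = 10
sum_x, sum_y = 18.0, 20.0
sum_x2, sum_y2, sum_xy = 40.0, 50.0, 30.0
t_025 = 2.306  # t(0.025; 8), as given in the exam

x_bar, y_bar = sum_x / n, sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n      # sum of (x_i - x_bar)^2
s_yy = sum_y2 - sum_y ** 2 / n      # sum of (y_i - y_bar)^2
s_xy = sum_xy - sum_x * sum_y / n   # sum of (x_i - x_bar)(y_i - y_bar)

beta1 = s_xy / s_xx
beta0 = y_bar - x_bar * beta1


def summarize(s2, x0=2.0):
    """Return (t statistic, CI for E(y|x0), prediction interval) for a variance estimate s2."""
    t1 = beta1 / math.sqrt(s2 / s_xx)
    fit = beta0 + beta1 * x0
    leverage = 1.0 / n + (x0 - x_bar) ** 2 / s_xx
    half_mean = t_025 * math.sqrt(s2 * leverage)
    half_pred = t_025 * math.sqrt(s2 * (1.0 + leverage))
    return t1, (fit - half_mean, fit + half_mean), (fit - half_pred, fit + half_pred)


# Variance estimate used in the solution above.
s2_solution = s_yy / (n - 1)
# Assumption for comparison: residual mean square SSE / (n - 2).
sse = s_yy - beta1 * s_xy
s2_residual = sse / (n - 2)

t_sol, ci_sol, pi_sol = summarize(s2_solution)
t_res, ci_res, pi_res = summarize(s2_residual)

print(f"beta1 = {beta1:.3f}, beta0 = {beta0:.3f}")
print(f"solution's s^2 = {s2_solution:.3f}: t1 = {t_sol:.2f}, CI = {ci_sol}, PI = {pi_sol}")
print(f"residual   s^2 = {s2_residual:.3f}: t1 = {t_res:.2f}, CI = {ci_res}, PI = {pi_res}")
```

Both variance estimates put $t_1$ below $-1.86$, so the one-sided test rejects $H_0$ either way; the interval widths, however, depend on which estimate is used.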