LAST NAME (Please Print):
FIRST NAME (Please Print):
HONOR PLEDGE (Please Sign):

Statistics 111 Midterm 4

• This is a closed book exam.
• You may use your calculator and a single page of notes.
• The room is crowded. Please be careful to look only at your own exam. Try to sit one seat apart; the proctors may ask you to randomize your seating a bit.
• Report all numerical answers to at least two correct decimal places or (when appropriate) write them as a fraction.
• All question parts count for 1 point.

1. Suppose a network has 5 nodes, and each pair of nodes is independently linked by an edge with probability 0.25. What is the probability that node A is not connected to some other node? What is the probability that nodes A and B are connected by a shortest path of length 2?

2. Consider the Rayleigh distribution with parameters \(\theta_0\) and \(\theta_1\), which has cumulative distribution function

\[ F(t) = 1 - \exp\left(-\theta_0 t - \tfrac{1}{2}\theta_1 t^2\right). \]

When will the Rayleigh distribution have increasing failure rate? Write a modified cdf which can describe bathtub-shaped failure rates. Suppose one observes one random failure time \(T_1 = 4\) from a Rayleigh distribution with parameter \(\theta_0 = 0\). Find the maximum likelihood estimate of \(\theta_1\).

3. John is bidding against Yoko to own a first edition of the Theory of Games and Economic Behavior. It is a sealed-bid auction. He believes that Yoko’s bid (in dollars) for the book will be uniformly distributed between $100 and $150. His own top-dollar value for the book is $140. What bid should he make in order to maximize his expected profit?

4. Dr. Evil has bred a new species of cockroach, whose lifespan (in years) is exponentially distributed with parameter \(\lambda = 0.5\). In contrast, ordinary cockroaches have lifespans that are exponential with \(\lambda = 1.2\). (Recall that the mean of an exponential is \(1/\lambda\).) Dr. Evil releases his cockroaches into the wild. If they are viable, then the U.N. will declare a world cockroach emergency.
To assess the threat, an entomologist collects an egg from Dr. Evil’s island, hatches it, and observes its lifespan. She will declare a cockroach emergency if it lives more than 2.3 years. In words, what is her alternative hypothesis? What is her \(\alpha\) level? What is the power of her test? The hatchling lives 1.7 years. What is her significance probability?

5. A physician wants to predict lifespan from some sensible explanatory variables. To assess predictive accuracy, she uses cross-validation. But her sample includes many identical twins. How will this affect her estimate of predictive accuracy?

6. You want to describe the probability that someone dies between the ages of 20 and 30. Should you use a competing risks model or a Cox proportional hazards model? Why?

7. Suppose that the baseline hazard function for a bridge has the Weibull distribution, with

\[ F(x) = 1 - \exp\left[-(x/\lambda)^k\right], \qquad f(x) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} \exp\left[-(x/\lambda)^k\right] \]

for \(k, \lambda > 0\). What is the hazard function for a bridge? An engineer fits a Cox proportional hazards model to the lifespan of a bridge (in decades). Her covariates are average annual traffic load (in millions), percentage of rebar, and whether or not the bridge has a cantilever span (1 for yes), and her estimates for the corresponding coefficients are -2, 0.15, and -0.5, respectively. Suppose the Tacoma Narrows Bridge carries 3 million cars per year on average, has 20% rebar, and uses a cantilever span. The Minneapolis I-35W bridge carries 5 million cars per year, has 30% rebar, and does not use a cantilever span. What is the hazard ratio for Tacoma Narrows compared to I-35W? (Tacoma Narrows is in the numerator.) Which bridge is safer?

8. To succeed, the Reese’s Co. requires both chocolate and peanut butter. They have two suppliers for peanut butter (A, B), and three suppliers for chocolate (C, D, E).
Over the next year, the failure probabilities for these suppliers are as follows:

company   failure probability
A         0.5
B         0.3
C         0.4
D         0.8
E         0.3

If the failures for each supplier are independent, what is the probability that Reese’s fails in the next year? Why should you question the assumption that failures are independent?

9. When does multicollinearity occur? What is the effect of multicollinearity?

10. What is a “large-world” social network?

11. In the Holland-Leinhardt model, assume that the baseline connectivity in the population is 0.2, that Tarzan has expansiveness 0.5, Jane has attractiveness 0.3, and the tendency to reciprocity is 0.4. What is the probability of an edge from Tarzan to Jane?

12. List all, and only, true statements.
A. Dunbar’s number represents the maximum number of close friends one can cognitively manage.
B. As points cluster more tightly around a line, the correlation increases.
C. Georg Simmel studied the six-degrees-of-separation theory.
D. In high-dimensional regression, most data sets are not multicollinear.
E. The mean of the residuals is the average of the dependent variables.
F. Roger Boisjoly questioned the accuracy of extrapolation.
G. People tend to overestimate common risks.
H. People tend to maximize their expected utility.
I. The typical utility curve for money is concave (i.e., looks like the graph of ln x).
J. In high dimensions, the number of possible models that can be fit explodes.
K. When there are many independent variables, one needs lots of data in order to estimate dependent values accurately.
L. Statistics is a good thing to know.