Stat 493: Homework #1 – Answers, with discussion where needed

1) Thanks for this information. It makes things a bit less impersonal. For my part, I’ve been on the faculty of the Statistics Dept at ISU since 1998. Before that I was a researcher at Savannah River Ecology Lab in Aiken, SC, for 11 years. My PhD is in Ecology and I have an MS in Statistics. I teach a variety of applied statistics classes: 401/402, our yearlong statistical methods course for graduate students in agriculture and biology; part of 415, our advanced methods course for non-stats graduate students; 500, our one-semester methods course for statistics graduate students; and 534, Ecological Statistics. I do a lot of statistical consulting, especially for graduate students and faculty in agriculture and the biological sciences. My choice of topics for this course reflects the issues and misconceptions I encounter in consulting.

On the personal side, I’m married with 2 boys, now 11 and 13. When I have time for hobbies, they include cross-country skiing, bicycling, playing classical music, and Scottish dancing. I’m teaching 3 classes this semester: 402, 493, and part of 415, so there isn’t much time for hobbies until summer.

2) This question gave most of you some trouble. You need to look at who (or what) assigned the treatment(s) and how it was done. If something other than the experimenter chose who got what treatment, then the study isn’t randomized. For example, in c and d, the burnt areas were not chosen by the investigator, so those aren’t randomized experiments. The same is true when the experimenter assigns treatments by a fixed rule rather than at random, as in a.

Remember also that there are two forms of randomness: random sampling from a population and random assignment of treatments (see the diagram in ‘Types of Randomness’). The samples in e are not randomly chosen from some population, but treatments are randomly assigned to those samples. It is the treatment assignment that matters for deciding whether a study is a randomized experiment.

a) Not randomized. Left leg is always the control.
b) Randomized. Half samples are randomly assigned to treatment = method.
c) Not randomized. Treatment = burnt, which was not manipulated by the investigator.
d) Not randomized. Same reason, although this study is replicated.
e) Randomized. You manipulated the treatment and randomly assigned it.

3) This gave a few of you some trouble. If you’re one of those, you might want to look back over the ‘Using incomplete information’ page, especially the diagram. Remember, the population is the set of things you would like to study. The sample is the subset actually used in your study. Similarly, the parameter is the quantity you want to know; the statistic is what you calculate from the sample.

A number of you provided the response variable, not the parameter (e.g., answered 3c with “optimism (1-5) score about future of agriculture” or answered 3b with “Phosphorus levels in soil samples”). The parameter is the population version of the statistic. So, in 3a, the statistic is the average leptin concentration in the blood on the five sampled days. The parameter is the mean leptin concentration during the month of December.

a) Randomly sampled. Population = blood of this specific animal over days in Dec. Statistic = average leptin conc., parameter = mean leptin conc.
b) Not randomly sampled. Population = all possible soil samples from the field. Statistic = average P concentration, parameter = mean P concentration in the field.
c) Not randomly sampled. Population = conference attendees who work in Iowa. Statistic = average, parameter = mean. You could also argue that the population = conference attendees who work in IA and returned your questionnaire. You have completely enumerated this population, so there is no sampling.
d) Not randomly sampled. Population = fellow students in your two classes. No sampling, because you have a complete census. The parameter and the statistic are the same quantity.
e) Random sample. Population = all pigs on your farm (problem is unclear here). Parameter = population standard deviation, sigma.
Statistic = sample standard deviation.

4) Questions 10 and 11 were the most commonly missed. Again, look back at the ‘Using incomplete information’ page for 10 and 13.
10: #1, parameter
11: #2, all values are the same
13: #4, a sample
14: #2, the distribution is skewed

5) Almost everyone got most of this and all of 6 correct. The biggest problem was making a histogram in Excel. Many of you plotted an index plot (X = observation number, Y = value). Although this looks sort of like a histogram (compare the book’s Figures 1.4 and 1.5), the axes are completely different. In a histogram, groups of values are on the X axis (e.g., tree height in Figure 1.4) and the frequency of each group is on the Y axis.

6) Water: mean = 7.13, median = 1.5, variance = 453
Fowl: mean = 75.63, median = 11.5, variance = 42197
I didn’t print out charts. See me if you had problems with this or questions. The distributions of both variables are highly skewed. For both variables, the median is the most appropriate measure of the typical individual.

7) a) mean = 7.27, standard deviation = 1.63
b) median = 7.1. It occurred sometime in 1977/78.
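The histogram point in 5 and the mean-versus-median point in 6 may be easier to see in a small worked example. What follows is only a minimal sketch in Python rather than Excel. The data values are invented for illustration (they are not the water or fowl data), and the helper names `index_plot_points` and `histogram_counts` are hypothetical, not from the course materials.

```python
from statistics import mean, median, variance

# Hypothetical right-skewed counts, invented for illustration only.
# These are NOT the homework's water or fowl data.
values = [0, 1, 1, 2, 2, 3, 5, 8, 40, 120]
bin_width = 20  # assumed bin width; any sensible width works


def index_plot_points(data):
    """Points for an index plot: X = observation number, Y = value."""
    return [(i + 1, v) for i, v in enumerate(data)]


def histogram_counts(data, width):
    """Counts for a histogram: X = group of values, Y = frequency.

    Returns {lower edge of bin: number of values in that bin}.
    Bins that contain no values are omitted.
    """
    counts = {}
    for v in data:
        lower = (v // width) * width  # lower edge of the bin holding v
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


hist = histogram_counts(values, bin_width)

# An index plot keeps one point per observation.
print("Index plot points:", index_plot_points(values))

# A histogram keeps one bar per group of values.
for lower, freq in hist.items():
    print(f"{lower}-{lower + bin_width - 1}: {'*' * freq}")

# In skewed data, a few large values pull the mean far above the median.
print("mean =", mean(values))
print("median =", median(values))
print("variance =", variance(values))
```

With these made-up values, eight of the ten observations fall in the lowest bin, and the mean (18.2) is far above the median (2.5). The same pattern appears in the water and fowl summaries in 6, which is why the median is the better description of a typical individual there.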