Summary of Initial Class Material 8/28/06

Key concepts from Chapter 1:

Statistics is the art and science of designing studies and analyzing the data that those studies produce. Its ultimate goal is translating data into knowledge and understanding of the world around us. In short, statistics is the art and science of learning from data.

Design: Planning how to obtain data to answer the question of interest.
Description: Summarizing the data that are obtained.
Inference: Making decisions and predictions based on the data.

Example: In the exit poll about the Gray Davis recall, 54% of a sample of 3160 voters said they had voted to recall Davis. Stating that the sample fraction was 54% is an example of descriptive statistics. However, using that figure to predict that Governor Davis was going to lose the recall would be an example of inference.

Populations and samples: The population is the total set of subjects in which we are interested. The sample is a subset of that population on whom we have (or plan to have) data.

Example: Figures released Wednesday by the National Association of Realtors show a continued implosion in the housing market, with inventories up sharply as prices are softening. Sales were down 11.4 percent in the past year, to a seasonally adjusted annual rate of 6.33 million, compared with 6.60 million in June. Was this a population or a sample?

Descriptive statistics and inferential statistics: Descriptive statistics refers to methods of summarizing the data, usually either as numerical values (e.g. averages and percentages) or through graphs. Inferential statistics refers to methods of making a decision or prediction about a population, based on data from a sample of that population.

Parameters and statistics: A parameter is a numerical summary of an entire population (e.g. the proportion of all voters in California who voted for Governor Davis’ recall). A statistic is a numerical summary based on a sample (e.g.
the proportion of voters in an exit poll who stated they had voted for the recall).

Probability is also critical for developing statistical inference methods. For example, if the recall vote were actually supported by less than half the voters, what is the probability that there would be 54% in favor among a sample of 3160 voters?

Types of Data (Section 2.1)

Variables are either categorical or quantitative. A quantitative variable may be discrete or continuous.

A frequency table is a list of all the possible values of a variable, together with the number of cases in which each value occurs.

The proportion of the observations associated with a given category is the count of observations in that category, divided by the total number of observations. If the proportion is multiplied by 100, it is called a percentage. Proportions and percentages are also called relative frequencies.

Questions for Homework 1

1.22. In a University of Wisconsin (UW) study about alcohol abuse among students, 100 of the 40,858 members of the student body in Madison were sampled and asked to complete a questionnaire. One question was, “On how many days in the past week did you consume at least one alcoholic drink?”
a. Identify the population and the sample.
b. For the 40,858 students at UW, one characteristic of interest was the number of students who would reply “zero” to this question. For the 100 students sampled, suppose 29% gave this response. Does this mean that 29% of the entire population of UW students would make this response? Explain.
c. Is the numerical summary of 29% a sample statistic or a population parameter?

1.31. Pick up a recent issue of a national newspaper such as The New York Times or USA Today. Identify an article that used statistical methods. Did it use descriptive statistics, or inferential statistics, or both? Explain.
(Note: as already explained in the introductory handout, I am willing to accept local as well as national newspapers, or web-based news such as www.nytimes.com or www.cnn.com.)

2.6. Identify each of the following variables as continuous or discrete.
a. The length of time to run a marathon.
b. The number of people in line at a box office to purchase theater tickets.
c. The weight of a dog.
d. The number of people you have dated in the past month.

2.8. In the 2002 General Social Survey, respondents answered the question, “How many children have you ever had?” Results were:

Number of Children    Count
0                     799
1                     469
2                     657
3                     481
4                     185
5                     73
6                     40
7                     22
8 or more             34

a. Is the variable, number of children, categorical or quantitative?
b. Is the variable, number of children, discrete or continuous?
c. Add proportions and percentages to this frequency table.
d. Which response is the mode?
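
The definitions of frequency table, proportion, and percentage from the Types of Data section can be tried out on a small data set. The sketch below is only an illustration: the list of responses is made up (it is not data from the UW study or the General Social Survey), and the function name frequency_table is a hypothetical helper, not something from the textbook. Note that it lists only the values that actually occur in the data, whereas a full frequency table would also show possible values with a count of zero.

```python
from collections import Counter

# Hypothetical responses (made up for illustration): number of days in the
# past week on which each of 10 students had at least one alcoholic drink.
responses = [0, 2, 0, 1, 3, 0, 2, 0, 1, 0]


def frequency_table(values):
    """Return rows of (value, count, proportion, percentage), sorted by value."""
    counts = Counter(values)
    total = len(values)
    rows = []
    for value in sorted(counts):
        count = counts[value]
        proportion = count / total          # count divided by total observations
        rows.append((value, count, proportion, 100 * proportion))
    return rows


table = frequency_table(responses)

print(f"{'Value':>5} {'Count':>5} {'Proportion':>10} {'Percent':>8}")
for value, count, proportion, percentage in table:
    print(f"{value:>5} {count:>5} {proportion:>10.2f} {percentage:>7.1f}%")
```

The counts always add up to the number of observations, so the proportions add up to 1 and the percentages to 100, which is a quick check on any frequency table built by hand.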
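
The probability question raised in the Chapter 1 summary (how likely is 54% in favor among 3160 sampled voters if the recall did not really have majority support?) can be explored numerically. The sketch below rests on several assumptions that are not stated in the notes: the sample is treated as a simple random sample with independent responses, true support is set to exactly 50% (the boundary case; any lower value would make the result smaller still), and "54% in favor" is read as "at least 54% in favor". The function name binomial_upper_tail is a hypothetical helper.

```python
from math import ceil, comb

n = 3160                    # sample size from the exit poll
p_hat = 0.54                # reported sample proportion in favor of the recall
k_min = ceil(p_hat * n)     # smallest whole count that is at least 54% of n


def binomial_upper_tail(n, k_min):
    """P(X >= k_min) for X ~ Binomial(n, 1/2), computed with exact integers.

    With true support of 1/2, each of the 2**n yes/no response patterns is
    equally likely, so the probability is the number of patterns with at
    least k_min "yes" answers divided by 2**n.
    """
    favorable = sum(comb(n, k) for k in range(k_min, n + 1))
    return favorable / 2**n


prob = binomial_upper_tail(n, k_min)
print(f"P(at least {k_min} of {n} in favor, given true support of 50%) = {prob:.2e}")
```

Under these assumptions the probability comes out very small, which is the kind of reasoning that lets a sample result be used to make an inference about the population.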