Survey
Chapter 12 Notes: Sample Surveys

The entire group of individuals that we want information about is called the ________________.

A ___________ is an attempt to gather information about every individual member of the population. Problems with a census: cost; time needed to complete; sometimes testing can destroy the item.

A _____________ is a part of the population that we actually examine in order to gather information.

The _____________________ refers to the method used to choose the sample from the population.

The number of individuals in a sample is called the ___________________. It is the sample size, not the fraction of the population sampled, that determines how well the sample represents the population.

The design of a study is _____________ if it systematically favors certain outcomes.

A _________________________ is a numerically valued attribute of a model for a population. We rarely expect to know the true value of a population parameter, but we do hope to estimate it from sampled data.

______________ are values calculated for sampled data. Those that correspond to, and thus estimate, a population parameter are of particular interest. The term “sample statistic” is sometimes used, usually to parallel the corresponding term “population parameter.”

A sample is said to be ____________________ if the statistics computed from it accurately reflect the corresponding population parameters.

A list of individuals from whom the sample is drawn is called the _____________________. Individuals who may be in the population of interest, but who are not in the sampling frame, cannot be included in any sample.

___________________ is the natural tendency of randomly drawn samples to differ, one from another. Sometimes, unfortunately, called _________________________, sampling variability is no error at all, but just the natural result of random sampling.

Just Checking
Various claims are often made for surveys. Why is each of the following claims not correct?
a. It is always better to take a census than to draw a sample.
b. Stopping students on their way out of the cafeteria is a good way to sample if we want to know about the quality of the food there.
c. We drew a sample of 100 from the 3000 students in a school. To get the same level of precision for a town of 30,000 residents, we’ll need a sample of 1,000.
d. A poll taken at a statistics support Web site garnered 12,357 responses. The majority said they enjoy doing statistics homework. With a sample size that large, we can be pretty sure that most Statistics students feel this way, too.
e. The true percentage of all Statistics students who enjoy the homework is called a “population statistic.”

I. Probability sampling
Obtaining a randomly chosen sample refers to the way in which the sample is chosen, not the specific members of the population that happen to end up in the sample.

A. A sample is a ___________________________________ if it is selected so that:
   each member of the population is equally likely to be chosen and the members of the sample are chosen independently of one another; OR
   every set of n units has an equal chance to be the sample actually selected.

The best defense against bias is _________________________, in which each individual is given a fair, random chance of selection.

“Classic” method: put names in a hat and draw until you have the desired sample size. More commonly, number the names and use a random number table or other source of random numbers to select the sample.

Sources of bias in probability samples:

____________________________ is a bias introduced to a sample when individuals can choose on their own whether to participate in the sample. Samples based on voluntary response are always invalid and cannot be recovered, no matter how large the sample size.

_____________________________ occurs when some groups in the population are left out of the process of choosing the sample (e.g., a telephone survey excludes those without phones).
Occasionally random sampling may give a sample that is not representative of the population, but it is still random.

__________________________ occurs when the method of observation tends to produce values that systematically differ from the true value in some way (e.g., an improperly calibrated scale is used to weigh items; poorly worded survey questions).

______________________________ occurs when individuals chosen for the sample can’t be contacted or refuse to cooperate.

Difficulties with probability sampling: there may be no easy way to list all members of a population or to contact some people. Can use:
   voter registration lists: not everybody is registered
   list of all addresses: homeless people are left out; multiple people live at the same address
   telephone numbers: unlisted phone numbers; people with multiple numbers are overrepresented; cell phones

B. Other probability sampling methods for obtaining large samples:

1. ______________________: randomly choose some starting point; then select every kth element in the population. Easier than random sampling; also guarantees the sample is taken from throughout the population. Be careful how the population is ordered. (E.g., take every 10th student at CVHS; start at a randomly selected point; still requires a list.)

2. ________________________: divide the population into sections or clusters; randomly select a few of those sections and then choose all members from the selected sections. (E.g., randomly select 10 third period classes and survey all students in each class selected; a faster way to obtain a sample; assumes classes are relatively heterogeneous.)

3. ___________________________: subdivide the population into at least two different subpopulations (strata) that share some characteristic; then draw a random sample from each stratum. The purpose is to ensure that the sample is more representative of the population than an SRS might be; we also might be interested in results from separate strata. (E.g., select an SRS from each grade level at BHS, 9th, 10th, 11th, and 12th, in proportion to the number of students in each level.)

__________________________: uses multiple strata (e.g., stratify by gender as well as grade level; select a proportionately sized SRS from each).

II. Other sampling techniques (non-probability: BAD!)
(All hypothesis testing that we do requires probability sampling.)

_________________________: the easiest way to obtain a sample is to choose it without any random mechanism (also called haphazard sampling). Simply uses a sample that is readily available; often biased.

1. ________________________________: when people participate in a survey by voluntarily responding (radio, TV, Internet, texting, etc.). People who care enough to respond usually are not representative of the whole population. Not random. STATISTICAL INFERENCE SHOULD NOT BE DONE ON A VOLUNTARY POLL.

2. _______________________: a form of convenience sampling where an “expert” selects a sample s/he considers representative. Not random; no objective way to quantify this kind of sample.

3. _________________________: type of convenience sampling using clusters and strata. Interviewers have detailed strata definitions and assigned locations, and must find a fixed number of subjects for each stratum; still not random.

Random sampling is needed for further calculations, but it is often difficult to do, especially for large populations. (Convenience sampling is easy to do but not very useful: it may be representative or balanced for the variables in the strata but not for the population as a whole.)

Just Checking
1. We need to survey a random sample of the 300 passengers on a flight from San Francisco to Tokyo. Name each sampling method described below.
a. Pick every 10th passenger as people board the plane.
b. From the boarding list, randomly choose 5 people flying first class and 25 of the other passengers.
c. Randomly generate 30 seat numbers and survey the passengers who sit there.
d. Randomly select a seat position (right window, right center, right aisle, etc.) and survey all the passengers sitting in those seats.

A _____________________ is a study that asks questions of a sample drawn from some population in the hope of learning something about the entire population. Polls taken to assess voter preferences are common sample surveys.

A ______________________ yields the information we are seeking about the population we are interested in.

Before setting out to survey, think about these principles:

Basic Principle #1: A survey is not merely a collection of questions, thrown together without purpose:
_____________________________________________________________________________
_________________________________________________________________

Basic Principle #2: Both parties to the survey have responsibilities:
   The interviewer’s work must be mostly done in advance; _________________________________________________________________
   The interviewee’s task, having agreed to answer questions, is to ____________.

Basic Principle #3: A prime task of the interviewer at the question design stage is to help the interviewee be ________________.

The interviewee’s tasks – and survey helps:
1. Comprehension: understand the survey directions and each question asked
2. Retrieve information from memory to answer the question; most factual answers are approximations to the truth, often reconstructed to answer the question
3. Formulate and report a response

A ___________ is a small trial run of a survey to check whether questions are clear. A pilot study can reduce errors due to ambiguous questions.

The ________________ of questions is the most important influence on the answers given to a sample survey. Confusing or leading questions can introduce measurement bias. Even minor changes in wording can change the outcome of a survey; pretest the survey (often in interview format) to check the clarity of questions.
Never trust the results of a sample survey until you have read the exact questions posed. Also look at the sampling design, the size of the sample (larger samples generally give more precise results, but a large sample does not remove bias), the amount of nonresponse, and the date of the survey.
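
As a rough illustration of the probability sampling methods in section I, the sketch below draws a simple random sample, a systematic sample, a cluster sample, and a stratified sample from a made-up school roster. The roster layout (400 students, four grades of 100, twenty third period classes of 20), the function names, the seed, and the sample sizes are all assumptions chosen for illustration; they are not part of the notes.

```python
import random
from collections import defaultdict


def simple_random_sample(population, n, rng):
    """SRS: every set of n units has an equal chance of being selected."""
    return rng.sample(population, n)


def systematic_sample(population, k, rng):
    """Pick a random start among the first k units, then take every kth unit."""
    start = rng.randrange(k)
    return population[start::k]


def cluster_sample(population, cluster_of, n_clusters, rng):
    """Randomly select whole clusters and keep every member of each one."""
    clusters = defaultdict(list)
    for unit in population:
        clusters[cluster_of(unit)].append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


def stratified_sample(population, stratum_of, fraction, rng):
    """Draw an SRS from each stratum, sized in proportion to the stratum."""
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        size = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical roster: (student id, grade level, third period class number).
students = [(i, 9 + i // 100, i % 20) for i in range(400)]
rng = random.Random(12)  # fixed seed so the sketch is repeatable

srs = simple_random_sample(students, 40, rng)
systematic = systematic_sample(students, 10, rng)
by_cluster = cluster_sample(students, lambda s: s[2], 2, rng)
stratified = stratified_sample(students, lambda s: s[1], 0.10, rng)

for label, sample in [("SRS", srs), ("systematic", systematic),
                      ("cluster", by_cluster), ("stratified", stratified)]:
    grades = sorted({grade for _, grade, _ in sample})
    print(label, "size:", len(sample), "grades represented:", grades)
```

All four samples here contain 40 students, but only the stratified sample is guaranteed by design to contain exactly 10 students from each grade. The cluster sample depends on the chosen classes being reasonably mixed, and the systematic sample depends on how the roster is ordered, which matches the cautions given for those methods above.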