Chapter 12 Notes Sample Surveys
The entire group of individuals that we want
information about is called
the________________.
A ___________ is an attempt to gather
information about every individual member of the
population. Problems with a census: cost; time
needed to complete; sometimes testing can
destroy the item being tested.
A _____________ is a part of the population that
we actually examine in order to gather
information.
The _____________________ refers to the
method used to choose the sample from the
population.
The number of individuals in a sample is called
the ___________________. It is the sample size
itself, not the fraction of the population
sampled, that determines how well the sample
represents the population.
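A rough numerical sketch of this point, assuming a simple random sample drawn without replacement and a true proportion of 0.5 (both are assumptions chosen only for illustration). The function name `standard_error` is hypothetical; it applies the usual standard error of a sample proportion together with the finite population correction.

```python
import math

def standard_error(n, N, p=0.5):
    """Standard error of a sample proportion for a simple random sample
    of size n drawn without replacement from a population of size N.
    p = 0.5 is an assumed true proportion, used only for illustration."""
    fpc = math.sqrt((N - n) / (N - 1))  # finite population correction
    return math.sqrt(p * (1 - p) / n) * fpc

# Same sample size (100), very different fractions of the population sampled
for N in (3_000, 30_000, 3_000_000):
    se = standard_error(100, N)
    print(f"N = {N:>9,}  n = 100  fraction = {100 / N:.5f}  SE = {se:.4f}")
```

Under these assumptions the standard error stays close to 0.05 for all three population sizes. Raising the sample size is what lowers it.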
The design of a study is _____________ if it
systematically favors certain outcomes.
A _________________________ is a numerically valued attribute of a model for a population. We
rarely expect to know the true value of a population parameter, but we do hope to estimate it from
sampled data.
______________ are values calculated for sampled data. Those that correspond to, and thus
estimate, a population parameter, are of particular interest. The term “sample statistic” is sometimes
used, usually to parallel the corresponding term “population parameter.”
A sample is said to be ____________________ if the statistics computed from it accurately reflect the
corresponding population parameters.
A list of individuals from whom the sample is drawn is called the _____________________.
Individuals who may be in the population of interest, but who are not in the sampling frame, cannot be
included in any sample.
___________________ is the natural tendency of randomly drawn samples to differ, one from
another. Sometimes, unfortunately, called _________________________, sampling variability is no
error at all, but just the natural result of random sampling.
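Sampling variability can be seen in a small simulation. This is only a sketch: the population of 3,000 students, the 40% "yes" rate, the sample size of 100, and the seed are all made-up values, and `sample_proportion` is a hypothetical helper name.

```python
import random

random.seed(12)  # fixed seed so the run is repeatable; the value is arbitrary

# Hypothetical population: 3,000 students, 40% of whom answer "yes" (coded 1)
population = [1] * 1200 + [0] * 1800

def sample_proportion(pop, n):
    """Proportion of 'yes' answers in one simple random sample of size n."""
    return sum(random.sample(pop, n)) / n

# Five samples drawn the same way from the same population
proportions = [sample_proportion(population, 100) for _ in range(5)]
print(proportions)
```

The five proportions typically differ from one another and from the true 0.40, even though nothing was done incorrectly in any of the samples.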
Just Checking
Various claims are often made for surveys. Why is each of the following claims not correct?
a. It is always better to take a census than to draw a sample.
b. Stopping students on their way out of the cafeteria is a good way to sample if we want to know about
the quality of the food there.
c. We drew a sample of 100 from the 3,000 students in a school. To get the same level of precision for a
town of 30,000 residents, we’ll need a sample of 1,000.
d. A poll taken at a statistics support Web site garnered 12,357 responses. The majority said they enjoy
doing statistics homework. With a sample size that large, we can be pretty sure that most Statistics
students feel this way, too.
e. The true percentage of all Statistics students who enjoy the homework is called a “population statistic.”
I. Probability sampling: "Randomly chosen" describes the way in which the sample
is chosen, not the specific members of the population that happen to end up in the sample.
A. A sample is a ___________________________________ if it is selected so that:
• each member of the population is equally likely to be chosen and the members of the sample
are chosen independently of one another;
OR
• every set of n units has an equal chance to be the sample actually selected.
The best defense against bias is _________________________, in which each individual is given a
fair, random chance of selection.
“Classic” method—put names in hat and draw until have desired sample size; more commonly,
number names and use random number table or other source of random numbers to select sample.
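The "number the names, then use random numbers" method might be sketched as follows. The roster, the sample size, and the function name `simple_random_sample` are hypothetical, and Python's pseudorandom generator stands in for a random number table.

```python
import random

def simple_random_sample(names, n, seed=None):
    """Number the names 0..len(names)-1, then draw n distinct numbers at
    random, so every set of n names is equally likely to be chosen."""
    rng = random.Random(seed)
    chosen_numbers = rng.sample(range(len(names)), n)  # without replacement
    return [names[i] for i in chosen_numbers]

# Hypothetical list of 200 names
roster = [f"Student {i:03d}" for i in range(1, 201)]
print(simple_random_sample(roster, 5, seed=1))
```

Passing a seed only makes the draw repeatable for checking; leave it out to get a fresh sample each time.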
Sources of bias in probability samples:
• ____________________________ is a bias introduced to a sample when individuals can
choose on their own whether to participate in the sample. Samples based on voluntary
response are always invalid and cannot be recovered, no matter how large the sample size.
• _____________________________ occurs when some groups in the population are left out of
the process of choosing the sample (e.g., a telephone survey excludes those without phones).
Occasionally random sampling may give a sample that is not representative of the
population, but it is still random.
• __________________________ occurs when the method of observation tends to produce
values that systematically differ from the true value in some way (e.g., an improperly calibrated
scale is used to weigh items; survey questions are poorly worded).
• ______________________________ occurs when individuals chosen for the sample can’t be
contacted or refuse to cooperate.
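A small simulation can suggest why a larger sample does not repair a sample in which people decide for themselves whether to answer. Everything here is assumed for illustration: the population of 100,000, the true "yes" share of 30%, and the response rates of 0.6 and 0.1. `voluntary_poll` is a hypothetical name.

```python
import random

random.seed(7)  # arbitrary seed for repeatability

# Hypothetical population of 100,000 people; 30% hold the opinion "yes" (coded 1)
population = [1] * 30_000 + [0] * 70_000

def voluntary_poll(pop, reach):
    """The poll reaches `reach` people; each decides whether to answer.
    Assumed rates: 'yes' holders answer with probability 0.6,
    'no' holders with probability 0.1."""
    reached = random.sample(pop, reach)
    responses = [x for x in reached if random.random() < (0.6 if x == 1 else 0.1)]
    return sum(responses) / len(responses), len(responses)

for reach in (500, 5_000, 50_000):
    estimate, n_resp = voluntary_poll(population, reach)
    print(f"responses = {n_resp:>6}  estimated 'yes' share = {estimate:.2f}  (true share 0.30)")
```

With these assumed rates the estimate settles near 0.72 rather than 0.30. More responses make the wrong answer more stable, not more correct.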
Difficulties with probability sampling—may be no easy way to list all members of a population or to
contact some people. Can use: voter registration lists—not everybody is registered; list of all
addresses—homeless, multiple people at same address; telephone numbers—unlisted phone
numbers, people with multiple numbers are overrepresented; cell phones
B. Other probability sampling methods for obtaining large samples:
1. ______________________—randomly choose some starting point; then select every kth
element in the population. Easier than random sampling; also guarantees sample is taken
from throughout the population, but be careful how the population is ordered (e.g., take every 10th
student at CVHS; start at a randomly selected point; a list is still required).
2. ________________________—divide the population into sections or clusters; randomly
select a few of those sections and then choose all members from
the selected sections
(e.g., randomly select 10 third-period classes and survey all students in each class selected;
this is a faster way to obtain a sample, but it assumes classes are relatively heterogeneous).
3. ___________________________—subdivide the population into at least two different
subpopulations (strata) that share some characteristic; then draw a random sample from
each stratum. The purpose is to ensure that the sample is more representative of the
population than an SRS might be; we also might be interested in results from separate strata (e.g.,
select an SRS from each grade level at BHS, 9th through 12th, in proportion to the number of
students in each level).
__________________________: uses multiple strata (e.g., stratify by gender as well as
grade level and select a proportionately sized SRS from each).
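The three numbered methods above can be compared side by side in a short sketch. The school layout (4 grade levels, 10 classes per grade, 25 students per class), the seed, and the function names are all hypothetical.

```python
import random

rng = random.Random(2024)  # arbitrary seed for repeatability

# Hypothetical school: 4 grade levels, 10 classes per grade, 25 students per class
students = [
    {"id": f"{grade}-{cls}-{k}", "grade": grade, "cls": f"{grade}-{cls}"}
    for grade in (9, 10, 11, 12)
    for cls in range(10)
    for k in range(25)
]

def systematic_sample(units, k):
    """Random start among the first k units, then every kth unit after it."""
    start = rng.randrange(k)
    return units[start::k]

def cluster_sample(units, key, n_clusters):
    """Randomly pick n_clusters whole clusters and keep every member of each."""
    clusters = sorted({u[key] for u in units})
    chosen = set(rng.sample(clusters, n_clusters))
    return [u for u in units if u[key] in chosen]

def stratified_sample(units, key, fraction):
    """Draw an SRS from each stratum, sized in proportion to the stratum."""
    sample = []
    for stratum in sorted({u[key] for u in units}):
        members = [u for u in units if u[key] == stratum]
        sample.extend(rng.sample(members, round(len(members) * fraction)))
    return sample

print(len(systematic_sample(students, 10)))            # every 10th student
print(len(cluster_sample(students, "cls", 4)))         # 4 whole classes
print(len(stratified_sample(students, "grade", 0.1)))  # 10% of each grade
```

In this setup each method selects 100 of the 1,000 students, but the samples differ in makeup: the cluster sample comes from only 4 classes, while the stratified sample always contains exactly 25 students from each grade.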
II. Other sampling techniques (non-probability—BAD!): (All hypothesis testing that we do
requires probability sampling.)
_________________________—easiest way to obtain a sample is to choose it without any random
mechanism (also called haphazard sampling). Simply uses a sample that is readily available—often
biased.
1. ________________________________—when people participate in a survey by voluntarily
responding (radio, TV, Internet, texting, etc.). People who care enough to respond usually are not
representative of the whole population. Not random. STATISTICAL INFERENCE SHOULD NOT
BE DONE ON A VOLUNTARY POLL.
2. _______________________—a form of convenience sampling where an “expert” selects a
sample s/he considers representative. Not random—no objective way to quantify this kind of
sample.
3. _________________________—type of convenience sampling using clusters and strata.
Interviewers have detailed strata definitions, assigned locations and must find fixed number of
subjects for each stratum; still not random.
Random sampling is needed for further calculations, but it is often difficult to do, especially for large
populations. (Convenience sampling is easy to do but not very useful—may be representative/
balanced for variables in strata but not for population as a whole.)
Just Checking
1. We need to survey a random sample of the 300 passengers on a flight from San Francisco to
Tokyo. Name each sampling method described below.
a. Pick every 10th passenger as people board the plane.
b. From the boarding list, randomly choose 5 people flying first class and 25 of the
other passengers.
c. Randomly generate 30 seat numbers and survey the passengers who sit there.
d. Randomly select a seat position (right window, right center, right aisle, etc.) and survey
all the passengers sitting in those seats.
A _____________________ is a study that asks questions of a sample drawn from some population
in the hope of learning something about the entire population. Polls taken to assess voter
preferences are common sample surveys.
A ______________________ yields the information we are seeking about the population we are
interested in. Before setting out to survey think about these principles:
Basic Principle #1:
A survey is not merely a collection of questions, thrown together without purpose—
_____________________________________________________________________________
_________________________________________________________________
Basic Principle #2:
Both parties to the survey have responsibilities:
• The interviewer’s work must be mostly done in advance;
_________________________________________________________________
• The interviewee’s task, having agreed to answer questions, is to ____________.
Basic Principle #3:
A prime task of the interviewer at the question design stage is to help the interviewee be
________________.
The interviewee’s tasks – and survey helps:
1. Comprehension—understand the survey directions and each question asked



2. Retrieve information from memory to answer the question— most factual answers are
approximations to the truth, often reconstructed to answer the question

3. Formulate and report a response


A ___________ is a small trial run of a survey to check whether questions are clear. A pilot study
can reduce errors due to ambiguous questions.
The ________________ of questions is the most important influence on the answers given to a
sample survey. Confusing or leading questions can introduce measurement bias. Even minor
changes in wording can change the outcome of a survey—pretest survey (often in interview format) to
check clarity of questions.
Never trust the results of a sample survey until you have read the exact questions posed. Also look
at the sampling design, the size of the sample (larger samples generally give more accurate results),
the amount of non-response and the date of the survey.