Sample Design Part 1 - NMSU College of Business

Sample Design: Part 1
Slide 1
The goal of this lecture on sample design is to introduce you to basic samples and terminology,
to discuss the various stages of selecting a sample, to introduce you to two basic types of
samples—probability and non-probability—and to explain the different types of samples in each
of those categories and when you might use them, to introduce you to the notion of sampling
error, and to close with a brief discussion of Internet and panel samples.
Slide 2
As the caption to this cartoon states, “In sampling you examine less than the whole body of data
to reach a conclusion about that body. Thus, when you taste your first bite of something and say
‘It needs salt’, you’re sampling.”
Slide 3
Perhaps a non-food context might help. In the four photos shown on this slide, the first one is at
normal resolution and the subsequent three photos are of lower and lower resolution. The face
remains recognizable when going from photo #1 to #2. By photo #4, it’s almost impossible to
discern a face. This photo analogy hints at issues about sample size and representativeness.
Slide 4
I’ll begin by introducing you to some basic sampling terms: population or universe, population
element, census, and sample.
Slide 5
A population or universe is any complete group. It could be a group of people, sales territories,
or stores. It’s the total group from which information is needed. Note that a population need not
be huge. For example, the students enrolled in this course represent the population of students
enrolled in this course.
Slide 6
To take a census means studying all the elements comprising a population. The U.S.
Government attempts a population census every 10 years. Of course, the population could be
small, like the students enrolled in this course. If I surveyed all those students, then I’d be taking
a census.
Slide 7
In contrast to a census, a sample is a subset of a larger population. Cost and time are the main
reasons for drawing a sample. A census is expensive unless the population is small and
accessible. Typically, a population is large and not readily accessible, in which case drawing a
representative sample is the only cost- and time-effective way to assess the population.
Slide 8
Researchers progress through the following stages as they draw a sample: (1) defining the
target population, (2) selecting a sampling frame, (3) deciding whether to use a probability or
non-probability sampling method, (4) planning the procedure for selecting sampling units, (5)
determining the sample size, (6) drawing the sample, and (7) conducting the field work.
Slide 9
The first step in defining a target population is to consider the research objectives. For example,
if I’m interested in conducting a study on different advertising appeals or attempting to fine-tune
a new questionnaire scale, then an appropriate population is students. Alternatively, if I were a
marketing manager interested in adding a product to my existing product line, the population is
unlikely to be students, unless the product is sweatshirts or PCs. To operationally define a
population like NMSU alumni, I might contact the alumni office and request a complete list of
alumni. If the population is NMSU marketing majors, then I might acquire a list of those students
from the COB. Although such a list won’t be perfect (it will include some non-majors and non-students, and exclude some current majors), I can assume that nearly all current marketing majors
appear on that list. When selecting a target population, the convenience and cost of alternative
samples should be considered. An operationally defined population that’s less expensive to
access should be used, ceteris paribus, because the savings per completed questionnaire could
be put toward collecting a larger sample.
Slide 10
The second step is to select a sampling frame. This will only be relevant for probability or
scientific samples. A sampling frame is the list of elements from which a sample may be drawn.
Alternative sampling frames might include mailing and commercial lists, but they can be
problematic if they’re not representative of the target population. For example, for my
dissertation research I needed to identify people who relocated within the last six months. At the
time, R.L. Pope maintained the ‘new movers list’, so I paid $4,000 for contact information for
40,000 households. Unfortunately, the list did not perform as advertised; the claim was that 80% of households had
relocated in the last six months. In fact, one person I contacted hadn’t moved in 57 years! Sadly,
I would have been better off randomly phoning people across the U.S. Although other
commercially available lists may perform better, all such lists can be problematic.
Slide 11
The third step is to identify the sampling units. Sampling units can be persons, households,
businesses, et cetera. Several years ago, I conducted a study for the NMSU Marketing
Department to assess the skills of and needed by undergraduate marketing majors, so I
surveyed several groups. One group was Fortune 500 companies. You can’t survey a company, but
you can survey its personnel director, so the sampling unit was the personnel directors of those
companies (few of whom opted to respond). Sometimes the primary sampling
unit is of interest; other times it’s the secondary sampling unit (as per cluster sampling).
Slide 12
The fourth step is to decide if it’s best to draw a probability (scientific) or non-probability (non-scientific) sample. For a probability sample, every population element has a known non-zero
probability of selection; for a non-probability sample, the probability of selecting a particular
member of the population is unknown. This known or unknown probability of selecting
population elements differentiates the two types of samples. A sample is scientific if the
probability of selecting any member is known, and non-scientific if the probability of selecting
any member is unknown; the latter case makes it difficult to extrapolate from the sample to the
population.
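As a rough illustration of the "known non-zero probability" idea, the Python sketch below draws a simple random sample from a small frame. The frame of 200 students, the sample size of 20, and the function names are all assumptions made up for this example; they do not come from the lecture. Under simple random sampling without replacement, every element on the frame has the same selection probability, n / N.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements; every element has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


def inclusion_probability(frame_size, n):
    """Known, non-zero selection probability under simple random sampling."""
    return n / frame_size


# Hypothetical sampling frame of 200 enrolled students (illustrative only).
frame = [f"student_{i:03d}" for i in range(200)]
sample = simple_random_sample(frame, 20, seed=1)
probability = inclusion_probability(len(frame), len(sample))
print(len(sample), probability)
```

For a non-probability sample there is no counterpart to `inclusion_probability`: the chance that any given person ends up in the sample cannot be computed, which is why extrapolating to the population is difficult.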
Slide 13 (No Audio)
Slide 14
When are non-probability samples preferred to probability samples, and vice versa? Here are
five considerations:

Cost is an issue, and non-probability samples are far cheaper than probability samples.
For exploratory research, non-probability samples often are serviceable. For conclusive
research (from which managers draw conclusions and act accordingly), samples should be
representative and hence reliable, so probability samples are preferred.

If non-sampling errors are a major error component, then non-probability sampling is
acceptable because sampling error is a lesser error source. Alternatively, if sampling
errors are the largest error component, then a probability sample is preferred.

If the population is relatively homogeneous (members are relatively similar to one
another), then intra-population variability is low and a non-probability sample may be
sufficiently representative of the larger population. Hence, it may be possible to draw a
representative sample without probability sampling. Alternatively, if the population is
highly variable or heterogeneous, then a probability sample is preferred because there’s
more control over its representativeness.

If statistics—which are irrelevant for non-probability samples—are needed to extrapolate
from a sample to the population, then probability sampling is required.

If operational considerations are critical, then a non-probability sample is preferred. Such
samples are easier and faster to collect, more accessible, and less expensive.
Probability samples tend to require great care in selection and are more expensive
because creating a sample frame is an expensive proposition.
In summary, non-probability samples are well suited to exploratory research and are acceptable
when non-sampling errors are larger than sampling errors, the population is similar across
members, statistical analyses are not required, and operational considerations are critical. In
contrast, probability samples are preferred if decisions will be made based on the research,
sampling error is the largest component of total error, population elements tend to be
heterogeneous, statistical analyses are needed to extrapolate to a larger population, and
operational considerations are less critical.
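The homogeneity consideration can be made concrete with a small simulation. The sketch below is only an illustration under invented assumptions: two artificial populations with the same mean (50) but different spreads (standard deviations of 2 and 20), samples of 30, and 500 repeated draws. None of these numbers come from the lecture. It measures how much the sample mean moves from one sample to the next.

```python
import random
import statistics


def spread_of_sample_means(population, n, draws, seed):
    """Standard deviation of the sample mean across repeated random samples."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n)) for _ in range(draws)]
    return statistics.stdev(means)


# Two made-up populations: same center, very different variability.
rng = random.Random(0)
homogeneous = [rng.gauss(50, 2) for _ in range(5000)]
heterogeneous = [rng.gauss(50, 20) for _ in range(5000)]

spread_homogeneous = spread_of_sample_means(homogeneous, 30, 500, seed=1)
spread_heterogeneous = spread_of_sample_means(heterogeneous, 30, 500, seed=1)
print(spread_homogeneous, spread_heterogeneous)
```

When members are similar to one another, almost any set of 30 gives roughly the same answer, so who happens to be selected matters less. When members differ widely, the answer depends heavily on who is selected, which is the situation in which control over selection is most valuable.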
Slide 15
At this point, a discussion on the different types of non-probability and probability samples is
warranted. I’ll start with non-probability samples because it’s easier to contrast the probability
samples after discussing non-probability samples. The four types of non-probability samples are
convenience, judgmental, quota, and snowball. Then I’ll talk about probability samples: random,
systematic, stratified, and cluster.
Slide 16 (No Audio)
Slide 17
I’ve included this cartoon to illustrate the problem with televised polls that news networks run.
Even when such polls are labeled non-scientific, viewers tend to tune out that caveat
because they don’t know what it means. The cartoon shows a person who cares a lot about the
outcome of a television-program-instigated survey and will continue to dial the 900 number to
register his vote. Baseball fans are encouraged to submit multiple All-Star ballots; in fact,
Internet-based ballots allow voting for the starting team and the last player elected to the All-Star team. As a result, votes for the starting team are weighted by fan interest; it isn’t a one
person, one vote election.
Slide 18
Convenience samples also are called haphazard or accidental samples. Convenience sampling
is a procedure for obtaining people or units that are convenient to the researcher. Convenience
is not necessarily a problem and convenience sampling is desirable for some types of research.
In particular, convenience samples are very useful for pre-testing questionnaires because they
are inexpensive and can help to identify poorly worded questions and poorly formatted
questionnaires.
Slide 19
The sampling error introduced by convenience sampling is twofold. First, certain elements of the
target population will be systematically excluded. Second, elements that are not members of the
target population will be included. As a result, the sample will only include a biased subset of the
target population.
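The two problems just described can be checked mechanically once the target population and the sample are written down. The sketch below uses an invented target population of four students tagged by type and an invented convenience sample; the names, the groups, and the `coverage_report` helper are all hypothetical and serve only to illustrate the idea.

```python
def coverage_report(target, sample):
    """Report sample members outside the target and target groups never reached."""
    in_target = [person for person in sample if person in target]
    outsiders = [person for person in sample if person not in target]
    groups_in_target = set(target.values())
    groups_reached = {target[person] for person in in_target}
    return {
        "outsiders": outsiders,
        "groups_missed": sorted(groups_in_target - groups_reached),
    }


# Hypothetical target population, keyed by person, with the group each belongs to.
target = {
    "ana": "day student",
    "ben": "day student",
    "cy": "evening student",
    "di": "online student",
}

# Hypothetical convenience sample: whoever walked past at noon.
# "eve" is a campus visitor and not a member of the target population.
convenience_sample = ["ana", "ben", "eve"]

report = coverage_report(target, convenience_sample)
print(report)
```

In this toy case the evening and online students are excluded because of where and when the sample was collected, not by chance, and one non-member slipped in.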
Slide 20
This classic Pogo cartoon makes the point once more that a convenience sample may be great
for some purposes but not for trying to generalize to a larger population.
Slide 21
A second type of non-probability sample is a judgment sample, also referred to as a purposive
sample. The judgment in this case is the judgment of the researcher; an experienced person
selects the sample based on his or her judgment about appropriate characteristics required of
sample members. One example of such samples is very familiar: television networks use
judgment sampling to forecast the winner before all votes are tallied. These forecasts are based
on exit interviews of voters as they leave the polling precinct. It is cost prohibitive to place
interviewers at every precinct in the U.S. However, there’s historical data on the districts whose
voters are most predictive of the election. With historical data, the networks can identify these
key precincts and station interviewers at key polling places. The judgment here is informed
judgment based on historical voting patterns. If conducted properly, exit polling provides
accurate forecasts; the misleading exit polls for the Bush-Kerry presidential election were a
byproduct of poor questioning rather than poor sampling.
Slide 22
Unlike convenience sampling, in which people who are and aren’t members of a population are
surveyed, the judgment sample includes only relevant members of a population. However, if
certain members of the population are systematically excluded from the sample, then the survey
results may be biased.
Slide 23
A quota sample is a sample in which various population subgroups are represented on
pertinent sample characteristics to the extent desired by researchers.
Slide 24
Sex and age are two characteristics that a researcher might want to control. For example, the
researchers for this chewing gum study controlled respondents such that 45% are male and
55% are female. The same can be done for age of respondents and mouthwash use or gender
and razor blade consumption. For mouthwash, the field service could be required to solicit
responses so that 45% of the sample was male and 55% was female, 20% were between the
ages of 16 and 24, 25% were between 25 and 34, et cetera.
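A minimal sketch of how a field service might turn the 45% / 55% split into head counts and then fill them is shown below. The sample size of 20, the stream of respondents, and both function names are assumptions for illustration only. Note that respondents are accepted in the order they happen to arrive, which is what keeps a quota sample a non-probability sample.

```python
def quota_targets(sample_size, percent_shares):
    """Convert percentage quotas into head counts for a given sample size."""
    return {
        group: sample_size * percent // 100
        for group, percent in percent_shares.items()
    }


def fill_quotas(respondents, targets):
    """Accept respondents in arrival order until every group's quota is full."""
    counts = {group: 0 for group in targets}
    accepted = []
    for person_id, group in respondents:
        # Skip anyone whose group is already full or has no quota at all.
        if counts.get(group, 0) < targets.get(group, 0):
            counts[group] += 1
            accepted.append(person_id)
        if counts == targets:
            break
    return accepted, counts


# Quotas from the lecture example: 45% male, 55% female.
targets = quota_targets(20, {"male": 45, "female": 55})

# Hypothetical arrival stream in which every third respondent is female.
stream = [(i, "female" if i % 3 == 0 else "male") for i in range(100)]
accepted, counts = fill_quotas(stream, targets)
print(targets, len(accepted))
```

Additional control characteristics, such as the age brackets mentioned above, could be handled the same way by using a (sex, age bracket) pair as the group key.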
Slide 25
The last of the four non-probability samples is called a snowball sample. For such samples, a
small initial set of respondents is selected by a probability method and additional respondents
are recommended by these initial respondents. Assume I want to survey a group that is rare in
the population and there’s no good mailing or phone list available. I might randomly dial the
phone, looking for people who qualify as members of the target population. If I dial enough
numbers, I’ll eventually find one person. It’s likely that one person knows other similar people. At
the end of the interview, I’d ask that person for three other people who are qualified to
participate in my study due to similar interests, predispositions, and behaviors. For example,
suppose there was no good list of NMSU alumni. I could start dialing phone numbers in New
Mexico until I contacted an NMSU alum. At the end of the interview, I would ask for three more
alums, then contact those three people and ask each of them for three more alums, and so on.
Basically, the sample would snowball from one or a handful of initial respondents who would
identify other qualified respondents, who would in turn identify more respondents, et cetera.
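The referral process described above can be sketched as a breadth-first walk over a referral list. Everything in the example is hypothetical: the alum names, who refers whom, and the `snowball_sample` function. The limit of three referrals per respondent follows the lecture's example.

```python
from collections import deque


def snowball_sample(seeds, referrals, max_size, per_person=3):
    """Interview seeds first, then the people they refer, wave by wave."""
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        # Ask each respondent for at most `per_person` further names.
        for referred in referrals.get(person, [])[:per_person]:
            if referred not in seen:
                seen.add(referred)
                queue.append(referred)
    return sampled


# Hypothetical referral data: each alum names other alums they know.
referrals = {
    "alum_a": ["alum_b", "alum_c", "alum_d"],
    "alum_b": ["alum_a", "alum_e"],
    "alum_c": ["alum_f", "alum_g", "alum_h", "alum_i"],
}

result = snowball_sample(["alum_a"], referrals, max_size=6)
print(result)
```

Because later respondents are reachable only through earlier ones, the final sample depends heavily on who the first respondents happen to know.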