Chapter 8
Producing Data: Sampling
Objectives (BPS chapter 8)
Producing Data: Sampling

Observation versus experiment

Population versus sample

Sampling methods

How to sample badly

Simple random samples

Other sampling designs

Caution about sample surveys

Learning about populations from samples (inference)
Experiments vs. Observational Studies

Experiment
– Experimenter deliberately imposes some treatment on individuals in order to
observe their responses.
– Experimenter determines which units receive which treatments (ideally
using some form of random allocation).
– Studies whether the treatment causes change in the response.

Observational study
– Compares units that happen to have received each of the treatments.
– Often useful for identifying possible causes of effects, but cannot
reliably establish causation.

Only properly designed and executed experiments can reliably
demonstrate causation.
Population

The complete collection of all subjects or
objects (scores, people, measurements, and
so on) that are being studied.

The collection is complete in the sense that it
includes all subjects to be studied.
• Census: The collection of data from every individual in a population.
• Sample: A subset of elements drawn from a population from which we collect data.

The sample must be representative of the entire population.

A sampling design describes exactly how to choose
a sample from the population.
[Figure: a population and the individuals that make it up.]
Sampling Frame
List of individuals that could possibly be selected
for the sample (not necessarily the same as
the population)
[Figure: diagrams of 17 numbered individuals illustrating a census, a list of individuals (sampling frame), and a sample.]
Example

Suppose we are interested in the average age
of all VIU students.

The relevant population is all VIU students
(including students at all campuses).

Possible Sampling Frame: List of VIU students
at the Nanaimo campus.
Example Cont.

A sample could be the students in this Math 161
class, or 50 randomly selected VIU students
at the Nanaimo campus.

If we use the ages of all VIU students, then we
have a _________.
Thought Question
Popular magazines often contain surveys
that ask their readers to answer questions
about hot topics in the news. Do you think
the responses the magazines receive are
representative of public opinion? Explain
why or why not.
Thought Question
Suppose you access an online listing of all
courses at your institution, alphabetized by
department, to determine what proportion
of all courses have a statistics course as a
prerequisite. If you decide to sample 50
courses in order to get a representative
sample of courses, how would you select
them? Would it be appropriate to simply
select the first 50 courses listed?
Bad Sampling Plans
• Convenience sampling
– Selecting individuals who are easiest to reach.
– Problem: the sample might not be representative of the target population.
Convenience Sampling
• Sampling mice from a large cage to study how a drug affects physical activity
– A lab assistant reaches into the cage to select the mice one at a time until 10 are chosen.
• Which mice will likely be chosen?
– Could this sample yield biased results?
Bad Sampling Plans
• Voluntary response sampling
– Allowing individuals to choose to be in the sample.
– Problem: people with strong opinions (or feelings) about the issue tend to respond.
– Example: RateMyProfessor.com
Voluntary Response
• To prepare for her book Women and Love, Shere Hite sent questionnaires to 100,000 women asking about love, sex, and relationships.
– 4.5% responded
– Hite used those responses to write her book
• Moore (Statistics: Concepts and Controversies, 1997) noted:
– respondents “were fed up with men and eager to fight them…”
– “the anger became the theme of the book…”
– “but angry women are more likely” to respond
CNN on-line surveys:
Bias: People have to care enough about an issue to bother replying.
This sample is probably a combination of people who hate “wasting the
taxpayers’ money” and “animal lovers.”
Bias
The design of a statistical study is biased if
it systematically favours certain outcomes.
Convenience Sampling and Voluntary
Response Sampling often produce biased
samples.
Polls and Surveys
• Carelessly collected data are subject to a high degree of bias, even if the sample size is large.
• To avoid bias, samples must be randomly chosen.
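
The point that a large sample does not cure a bad design can be illustrated with a small simulation. This is only a sketch under invented assumptions: the population size, the 30% share holding a strong opinion, and the response rates (10% versus 2%) are hypothetical numbers chosen for illustration, not data from any survey.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: True = holds a strong opinion (30% of 100,000).
population = [True] * 30_000 + [False] * 70_000
true_share = sum(population) / len(population)

# Probability sample: a simple random sample of 1,000 people.
srs = rng.sample(population, 1_000)
srs_share = sum(srs) / len(srs)

# Voluntary response: everyone is asked, but (by assumption) people with a
# strong opinion reply 10% of the time and everyone else 2% of the time.
def responds(strong_opinion):
    return rng.random() < (0.10 if strong_opinion else 0.02)

volunteers = [person for person in population if responds(person)]
vol_share = sum(volunteers) / len(volunteers)

print(f"true share of strong opinions:  {true_share:.2f}")
print(f"SRS of {len(srs)}:                    {srs_share:.2f}")
print(f"{len(volunteers)} voluntary responses:       {vol_share:.2f}")
```

Under these assumptions the voluntary-response group is several times larger than the random sample, yet its estimate lands far from the true share, while the smaller random sample stays close to it.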
Avoiding Bias
• We select a sample in order to get information about some population.
• How can we choose a sample that fairly represents the population?
• Probability Sample: A sample chosen by chance. We must know what samples are possible and what chance (probability) each possible sample has.
Simple Random Sampling
• Each individual in the population has the same chance of being chosen for the sample.
• Each group of individuals of the required size (n) in the population has the same chance of being the sample actually selected.
Simple Random Sample (SRS)
A simple random sample (SRS) of size n
consists of n individuals from the population
chosen in such a way that every set of n
individuals has an equal chance of being
selected.
How to choose an SRS
– Label each individual in the population with a unique number (same # of digits)
• You can label up to 10 items with one digit (0, 1, …, 9)
• You can label up to 100 items with two digits (00, 01, …, 99)
• For 20 items, say, you can label them 01, 02, …, 20.

– “Drawing names (numbers) out of a hat”
– Random number table (see Table B on pg. 686 of text)
– Computer software (www.randomizer.org) or see textbook website (http://bcs.whfreeman.com/bps4e) (or CD): Statistical Applets – Simple Random Sample
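
The labelling and software steps above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook applet: the 20-item population and the helper names `label_population` and `simple_random_sample` are hypothetical, and Python's standard `random.sample` stands in for "computer software".

```python
import random

def label_population(items):
    """Give every item a numeric label with the same number of digits."""
    width = len(str(len(items)))  # e.g. 20 items -> 2 digits
    return {f"{i:0{width}d}": item for i, item in enumerate(items, start=1)}

def simple_random_sample(labelled, n, seed=None):
    """Draw n distinct labels; every set of n labels is equally likely."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(labelled), n)
    return {label: labelled[label] for label in chosen}

# Hypothetical population of 20 items, labelled 01, 02, ..., 20.
population = [f"item {i}" for i in range(1, 21)]
labelled = label_population(population)
sample = simple_random_sample(labelled, n=5, seed=1)
print(sorted(sample))
```

Passing a seed only makes the draw repeatable for checking; leave it out to get a fresh sample each time.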
Simple Random Sampling
Example: Courses with Statistics Prerequisite
Suppose there are 800 courses at an institution, alphabetized
by department (and numbered 001-800), and you decide to
randomly select 50 of them to determine what proportion of all
the courses have a statistics course as a prerequisite. Use a
random number table to select which 50 courses to sample.
Page 686 of textbook:
Pick a line and column at random: suppose we get line 111, column 3
Random numbers: 605 130 929 700 412 712 …
TRY: Use line 126, column 1:
Random numbers:
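
As a sketch of how the table is read for the line 111 example above: take the digits three at a time, skip any group that is not a valid label (000, or anything above 800), and skip repeats. The function name `read_labels` is made up for this illustration, and only the digits quoted above are used, so the run stops well short of 50 courses.

```python
def read_labels(digits, label_width, max_label, n):
    """Read a digit string in groups of label_width and keep valid labels."""
    chosen = []
    for i in range(0, len(digits) - label_width + 1, label_width):
        label = int(digits[i:i + label_width])
        # Skip groups that are not labels (0 or above max_label) and repeats.
        if 1 <= label <= max_label and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            break
    return chosen

# Digits quoted for line 111, column 3 (spaces removed).
line_111 = "605130929700412712"
print(read_labels(line_111, label_width=3, max_label=800, n=50))
# → [605, 130, 700, 412, 712]
```

The group 929 is skipped because no course carries that label. In practice you would keep reading further along the table until 50 distinct labels are found.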
Systematic Sample
• Randomly select a member of the sampling frame for the sample.
• Using a set procedure or rule, select the rest of the individuals for the sample.
– For example, randomly select an individual from the sampling frame, and then select every 25th member of the sampling frame to be in the sample.
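
One common way to carry this out is sketched below. It assumes the random starting individual is chosen among the first 25 members of the frame, so that stepping by 25 runs through the whole list; the 800-label frame and the function name `systematic_sample` are hypothetical.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Pick a random start among the first k members, then every k-th after it."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # index 0 .. k-1
    return frame[start::k]

# Hypothetical sampling frame: labels 1..800 in list order.
frame = list(range(1, 801))
sample = systematic_sample(frame, k=25, seed=3)
print(len(sample), sample[:4])
```

With 800 members and k = 25 the sample always has 32 members; only the starting point is random.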
Stratified Random Sample
• First, divide the population into groups of similar individuals, called strata.
• Second, choose a separate simple random sample in each stratum.
• Third, combine these simple random samples to form the full sample.
– By contrast, if only certain groups are (randomly) chosen to be used, and all subjects in these groups make up the sample, then we have a cluster sample.
– For cluster sampling, the population is often divided according to geographic regions (called clusters).
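
The difference between the two designs may be easier to see side by side. The sketch below uses an invented population of four campuses with 30 students each; the campus names, sizes, and function names are assumptions made for illustration only.

```python
import random

def stratified_sample(strata, n_per_stratum, rng):
    """Take an SRS of n_per_stratum from every stratum, then combine them."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(clusters[name])
    return sample

# Hypothetical groups: campuses A-D, 30 students each, e.g. "A-0" .. "A-29".
groups = {f"campus {c}": [f"{c}-{i}" for i in range(30)] for c in "ABCD"}
rng = random.Random(7)

strat = stratified_sample(groups, 5, rng)  # 5 students from every campus
clus = cluster_sample(groups, 2, rng)      # all students from 2 campuses
print(len(strat), len(clus))
```

The stratified sample touches every group but only a few members of each; the cluster sample takes everyone, but only from the groups that were drawn.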
Multistage Sample
• Divide the population of interest into groups.
• Randomly select some of those groups.
• Divide the resulting collection of individuals into smaller groups.
• Randomly select some of those groups.
• Continue dividing the resulting collection of individuals into groups and randomly selecting some of those groups until you can simply list all of the resulting individuals and randomly select n of them for your sample.
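
A two-stage version of this procedure is sketched below: one round of group selection, then an SRS from the individuals that remain. Real multistage designs may repeat the grouping step several times. The departments, course codes, and the function name `multistage_sample` are all hypothetical.

```python
import random

def multistage_sample(groups, n_groups, n, rng):
    """Stage 1: randomly pick groups. Stage 2: SRS of n from their members."""
    chosen = rng.sample(sorted(groups), n_groups)
    pool = [member for g in chosen for member in groups[g]]
    return chosen, rng.sample(pool, n)

# Hypothetical population: 6 departments with 40 courses each.
departments = {
    f"dept {d}": [f"{d}{num}" for num in range(100, 140)]
    for d in ("BIOL", "CHEM", "ECON", "MATH", "PSYC", "STAT")
}
rng = random.Random(11)
chosen, sample = multistage_sample(departments, n_groups=3, n=10, rng=rng)
print(chosen)
print(sample)
```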

Probability Sampling Plans
• Simple random sampling (SRS)
• Systematic sampling
• Stratified random sampling
• Cluster sampling
• Multistage sampling
Steps for Designing a Study
1. Identify your objective
2. Develop a plan: Experiment or
Observational study
3. Use a random procedure to collect data
4. Analyze the data and form conclusions
_________________ Sampling: use results that are readily available
[Illustration: “Hey! Do you believe in the death penalty?”]
__________________: selection so that each individual has an equal chance of being selected
_____________ Sampling: select some starting point and then select every
kth element in the population
_______________ Sampling: subdivide the population into subgroups (strata) that
share the same characteristic, then draw a sample
from each stratum
__________________ Sampling: divide the population into sections (or clusters);
randomly select some of those clusters; choose all
members from selected clusters
Thought Question
When surveying students on their opinions on
their professor’s teaching methods, do you
think it matters who conducts the interviews?
Explain your answer with an example.
Sources of Error in Surveys

Random sampling reduces bias in choosing a
sample and allows control of variability.

Sampling in the real world is more complex
and less reliable than we might hope for.

Confidence statements do not reflect all
sources of error that are present in sampling.
• Sampling Errors – Errors that are caused by the act of taking a sample.
Random Sampling Error: the difference between a sample result and the true population result; such an error results from chance sample fluctuations.
– Measured by the margin of error.
• Nonsampling Errors – Errors that are not related to the act of taking a sample.
Example: Sample data that are incorrectly collected, recorded, or analyzed (such as using a defective instrument, or copying the data incorrectly).
Nonsampling errors can be much larger than the sampling errors.
Sampling Errors
• Using the wrong sampling frame.
Undercoverage: excluding some units in the population.
Sampling Errors

Disasters
– Using voluntary response (self selection)
– Using a convenience or haphazard sample

cannot extend results to the population of interest
(need a broad cross-section of the population)
Nonsampling Errors
• Difficulties
– Processing errors (data entry, calculations)
– Wording of questions / Response error
• Disasters
– Nonresponse (cannot contact subjects or they do not respond)
Sources of Nonsampling Errors
Non-response bias:
Cannot contact subjects or they do not respond.
– Nonrespondents often behave or think differently from respondents.
– Low response rates can lead to huge biases.
Processing Errors:
Data that are incorrectly collected, recorded, calculated, etc.
Nonsampling errors cont.

Survey format effects:
Factors such as question order, questionnaire layout, and whether the questionnaire
is self-administered or given by an interviewer can affect the results.

Interviewer effects:
Different interviewers asking the same questions can tend to obtain
different answers.

Response bias: Fancy term for lying when you think you should not
tell the truth. Like if your family doctor asks: “How much do you drink?”
Or a survey of female students asking: “How many men do you date per
week?” People also simply forget and often give erroneous answers to
questions about the past.
Concerns when Asking Survey Questions
• Deliberate bias
• Unintentional bias
• Desire to please
• Asking the uninformed
• Unnecessary complexity
• Ordering of questions
• Confidentiality and anonymity
Confidentiality and Anonymity
• Confidential answer
– The respondent is known, but the information is kept secret.
– Facilitates follow-up studies.
• Anonymous answer
– The respondent is not known, or cannot be linked to his/her response.
– Usually yields more truthful answers.
Dealing with errors

Statistical methods are available for estimating the
likely size of sampling errors.
– The margin of error measures the random sampling error.

All we can do with nonsampling errors is to try to
minimize them at the study-design stage.

Pilot Survey: One tests a survey on a relatively small
group of people to try to identify any problems with the
survey design before conducting the survey proper.
Learning about Populations from Samples
The techniques of inferential statistics allow us to draw inferences or
conclusions about a population from a sample.
– Your estimate of the population is only as good as your sampling
design. Be sure to eliminate possible biases.
– Your sample is only an estimate, and if you randomly sampled
again, you would probably get a somewhat different result.
– The bigger the random sample, the better: larger samples vary less from
one sample to the next. We’ll get back to this in later chapters.
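
A short simulation can make the last two points concrete. It is a sketch only: the population of 10,000 ages between 17 and 60 is invented, and the spread of the sample means is summarized with a plain standard deviation rather than a formal margin of error.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 10,000 students.
population = [rng.randint(17, 60) for _ in range(10_000)]

def spread_of_sample_means(n, repeats=200):
    """Draw many SRSs of size n and report how much their means vary."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.pstdev(means)

small = spread_of_sample_means(25)
large = spread_of_sample_means(400)
print(f"population mean age:           {statistics.mean(population):.2f}")
print(f"spread of means, n = 25:       {small:.2f}")
print(f"spread of means, n = 400:      {large:.2f}")
```

Each repeated sample gives a somewhat different mean, and the means from the larger samples cluster more tightly around the population value.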
Quiz
For each study, identify: the population, the sample, the sampling method
and any possible biases.
1) To assess the opinions of students at VIU regarding campus safety, a
reporter interviews 15 students he meets walking on the campus late at
night who are willing to give their opinions.
2) An SRS of 1200 adult Americans is selected and asked: “In light of the
huge national deficit, should the government at this time spend additional
money to establish a national system of health insurance?” Thirty-nine
percent of those responding answered yes.