Lecture 11 of 47C5 Social
Research Process I:
Sampling in Quantitative
Research I
Paul Lambert, 14.10.03, 4-5pm
1
47C5: Survey research lectures
Lecture 8: The Survey Method
Intro. to & qualities of survey method
Lecture 9: Using Secondary Datasets
Data access and issues
Lectures 11/12: Sampling
Sample design, data collection / analysis
2
Resources for lectures 8,9,11,12
• Lecture slides on WebCT site
• 2 Reading lists:
– Initial list in 47C5 unit outlines
– Some additions on further list on WebCT site
3
Web Resources for lectures
8,9,11,12
• Slides and additional reading list also at:
http://staff.stir.ac.uk/paul.lambert/teaching.htm
• Some other internet resources (cf De Vaus 2002)
http://trochim.human.cornell.edu/kb/
http://statcomp.ats.ucla.edu/survey
4
L11/12: Surveys and Sampling
Lecture 11:
1) Role of sampling in social surveys
2) Types of sampling methods
Lecture 12:
3) Good practice in survey conduct
4) Robust analysis of survey data
5
Part 1: Role of sampling in
survey research
• Surveys can be censuses
• More often samples from wider population
• Several sampling methods select cases
• Aim: representative of wider population
6
Inference
• Key idea is inference
= confidence in our ability to generalise
Sampling inference = application of statistical
theories in order to estimate probabilities
that a sample result is ‘likely to have been
unrepresentative’
7
The ‘normal’ (Gaussian) curve
8
Theories of sampling methods
Sampling and probability theories tell us
that any particular random sample is most
likely to have the same properties as the
wider population. We can then estimate the
probability that sample results of a
particular nature could have arisen by
chance, rather than because they are the
same as the population result.
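The idea above can be checked by simulation. This is a minimal sketch, not part of the original slides: the population of heights, the sample size of 100 and the 2 cm threshold are all assumed values chosen for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 heights in metres (assumed values).
population = [random.gauss(1.70, 0.10) for _ in range(10_000)]
pop_mean = statistics.mean(population)

# Draw many simple random samples of 100 cases and record each sample mean.
sample_means = [
    statistics.mean(random.sample(population, 100))
    for _ in range(2_000)
]

# Share of samples whose mean misses the population mean by more than 2 cm.
far_off = sum(abs(m - pop_mean) > 0.02 for m in sample_means) / len(sample_means)

print(f"population mean:      {pop_mean:.3f}")
print(f"mean of sample means: {statistics.mean(sample_means):.3f}")
print(f"share of samples off by more than 0.02: {far_off:.3f}")
```

Most sample means sit close to the population mean, and only a small share of samples are badly unrepresentative. Inferential statistics estimate that share from theory rather than by repeated sampling.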
9
If the cases in sample surveys
were selected at random, then
we can use sampling theories,
and thus ‘inference’
10
‘Inferential data analysis’
• Variable-by-case matrix data analysis for
generalising findings to population
• Often distinguished from ‘descriptive’ data
analysis (results of sample only)
• Key: joint influence of
– 1) size of sample
– 2) strength of data pattern
in increasing confidence about generalisations
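The joint influence of sample size and strength of pattern can be sketched with one common test statistic. The slides do not name a particular test; the two-proportion z statistic below, the group sizes and the proportions are assumptions for illustration only.

```python
import math

def z_for_difference(p1: float, p2: float, n_per_group: int) -> float:
    """z statistic for the gap between two sample proportions,
    with n_per_group cases in each group (pooled standard error)."""
    pooled = (p1 + p2) / 2  # equal group sizes, so the pooled value is the average
    se = math.sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    return (p1 - p2) / se

# Vary the sample size and the strength of the pattern (the gap).
for n in (50, 200, 800):
    for p1, p2 in ((0.55, 0.50), (0.65, 0.50)):
        z = z_for_difference(p1, p2, n)
        print(f"n={n:4d}  gap={p1 - p2:.2f}  z={z:.2f}")
```

A larger z means more confidence that the gap would generalise. It grows with a bigger gap and with a bigger sample; quadrupling the sample size doubles it.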
11
Statistical inference
..causes confusion; one of hardest parts of
survey data analysis to understand..
Phrases: ‘significance level’, ‘p-value’,
‘confidence interval’, ‘hypothesis testing’, ..
Meaning: Whether results would probably
generalise to a larger population
(if sample is treated as random)
See: Refs for L11 part 1 (supplementary list)
12
Critiques of survey generalisation
1) Part of the ‘fall of survey methods’ 1960s:
• Sampling is not representative
→ Sampling is systematically biased
• Inferential conclusions too carelessly made
and too strongly stated
• See for example Cicourel 1964
13
Critiques of survey generalisation
2) Deconstructing inference (1980s)
• Inferential methods over-relied upon
→ Survey analysis becomes theory-free hunt
for ‘significant’ patterns
• Inference needed less than often suggested
• Bad variable analysis (operationalisation) affects
inference results, eg (non-)parametric variables;
data clustering; …
• See for example Rose and Sullivan 1996 p192-5
14
Contemporary survey research
Tends to use 2 strategies to address critiques:
Large scale, often secondary, rigorous methods
or
Small scale, primary, claims carefully qualified
15
Terms in sample survey analysis
• Population: all cases of interest
• Sampling frame: list of all potential cases
• Sample: cases selected for analysis
• Sampling method: technique for selecting
cases from sampling frame
• Sampling fraction: proportion of cases
from population selected for sample (n/N)
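The terms above can be lined up in a few lines of code. A minimal sketch with made-up cases: the names, the incomplete frame and the sizes N = 8 and n = 4 are assumptions for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Population: all cases of interest (N = 8, hypothetical).
population = [f"person_{i}" for i in range(1, 9)]

# Sampling frame: the list we can actually select from.
# Here one case is missing, as real frames are often incomplete.
sampling_frame = population[:7]

# Sampling method: simple random selection from the frame (n = 4).
sample = random.sample(sampling_frame, 4)

# Sampling fraction: n / N.
sampling_fraction = len(sample) / len(population)
print(sample, sampling_fraction)
```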
16
Survey analysis:
‘variable-by-case matrix’
Cases ↓   Variables →
1         1    17   1.73   A   .
2         1    18   1.85   B   .
3         2    17   1.60   C   .
4         2    18   1.69   A   .
.         .    .    .      .   .
N         .    .    .      .   .
17
Sample Surveys (case selection)
Populn. cases ↓: 1 2 3 4 5 6 7 8 (N=8)
Sample cases ↓: 1 2 3 4 (n=4)

Sample case   Variables →
1             1    17   1.73   A
2             1    18   1.85   B
3             2    17   1.60   C
4             2    18   1.69   A
18
Part 2: Sampling methods and
techniques
= Ways of selecting cases from population
2.1 Random (probabilistic): generalisable, inferential statistics, fewer applications
2.2 Non-random (opportunistic; purposive): harder to generalise, inference contested, more widely used
19
2.1a Simple Random Sample
• A statistical method used to choose cases
randomly (eg random numbers)
• Every case in population has exactly the same
chance of being in sample
• Most data analysis techniques initially
designed for simple random samples
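A simple random sample can be drawn with random numbers as follows. A minimal sketch: the frame of 1,000 numbered cases and the helper name `simple_random_sample` are hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Select n distinct cases; every case has the same chance (n / N)."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: cases numbered 1 to 1000.
frame = list(range(1, 1001))
sample = simple_random_sample(frame, 100, seed=42)

print(len(sample), sorted(sample)[:5])
```

Selection is without replacement, so no case appears twice in the sample.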
20
2.1b Systematic Random Sample
• Like the SRS, select cases from anywhere
in the whole population
• An easier selection method: choose every
(n)th person for the sample
• Danger of ‘periodicity’ if original
population order has any structure → bias
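The periodicity danger can be shown with a deliberately structured list. A minimal sketch: the frame of flats, in which every 10th entry is a corner flat, is an invented example.

```python
import random

def systematic_sample(frame, step, seed=None):
    """Take every `step`-th case after a random start within the first interval."""
    start = random.Random(seed).randrange(step)
    return frame[start::step]

# Hypothetical frame with structure: every 10th flat is a corner flat,
# so corner flats are 10% of the population.
frame = [{"id": i, "corner": i % 10 == 0} for i in range(1000)]

sample = systematic_sample(frame, step=10, seed=3)
share_corner = sum(case["corner"] for case in sample) / len(sample)

print(f"corner flats in population: 0.10, in sample: {share_corner:.2f}")
```

Because the sampling interval matches the period in the list, the sample contains either only corner flats or none at all, depending on the random start. Neither result reflects the population.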
21
Problems with sample methods
selecting from whole population
• The ‘random’ part means it is always
possible to get a population coverage quite
different from known structures
• If total population is large or dispersed, then
coverage of random parts of it is expensive
and time consuming: few surveys use
random sampling from whole of UK
22
2.1c Stratified random samples
• Modifies random sample to ensure even (or
‘boost’) coverage of population groups
– split sampling frame by stratification factors
– select random samples within each factor
– final sample has correct proportions of each
– Example: select 490 M and 510 F
• Properties: proportionate sample, correct
representations; but more expensive & complex,
should use ‘weights’ for analysis
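The 490 M / 510 F example can be reproduced with a proportionate stratified draw. A minimal sketch: the frame of 4,900 men and 5,100 women and the helper name `stratified_sample` are assumptions.

```python
import random

def stratified_sample(frame, stratum_of, total_n, seed=None):
    """Proportionate stratified sample: random selection within each stratum."""
    rng = random.Random(seed)
    # Split the sampling frame by the stratification factor.
    strata = {}
    for case in frame:
        strata.setdefault(stratum_of(case), []).append(case)
    # Sample within each stratum in proportion to its size.
    # (Rounding can make the total differ slightly from total_n in general.)
    sample = []
    for cases in strata.values():
        n_stratum = round(total_n * len(cases) / len(frame))
        sample.extend(rng.sample(cases, n_stratum))
    return sample

# Hypothetical frame: 4,900 men and 5,100 women.
frame = [("M", i) for i in range(4900)] + [("F", i) for i in range(5100)]
sample = stratified_sample(frame, lambda case: case[0], total_n=1000, seed=7)

counts = {sex: sum(1 for s, _ in sample if s == sex) for sex in ("M", "F")}
print(counts)  # {'M': 490, 'F': 510}
```

A ‘boost’ design would instead fix a larger number for a small group, which is why weights are then needed in analysis.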
23
2.1d Multistage cluster samples
• i) Select clusters of population at random
• ii) Sample randomly within clusters
• Eg: clusters = local authorities in UK
– With qualifications, may still be treated as
‘random’ for analysis purposes
– Big reduction in costs if face-to-face contacts
→ Most widely favoured sampling method
in large scale survey collections
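The two stages can be sketched as below. The 32 authorities of 500 pupils each, the choice of 4 clusters and 100 pupils per cluster, and the helper name are all hypothetical. Clusters here are chosen with equal probability; real designs often select clusters with probability proportional to size.

```python
import random

def multistage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage i: pick clusters at random. Stage ii: sample at random within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

# Hypothetical population: 32 local authorities, 500 pupils in each.
clusters = {
    f"authority_{a:02d}": [f"pupil_{a:02d}_{i}" for i in range(500)]
    for a in range(32)
}

sample = multistage_cluster_sample(clusters, n_clusters=4, n_per_cluster=100, seed=11)

for name, pupils in sample.items():
    print(name, len(pupils))
```

Interviewers only need to visit 4 areas rather than all 32, which is where the cost saving comes from.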
24
Example: Multistage cluster
sample
• Interest: attitudes of Scottish school pupils
• Resources: 400 interviews with pupils
25
Interviews by area: Shetlands 2, Highlands 40, Islands 20, Moray 20, Aberdeen 40, Perth 20, Edinburgh 100, Argyll 24, Borders 10, Glasgow 124 (total 400)
26
Interviews by area: Moray 40, Stirling 60, Edinburgh 150, Glasgow 150 (total 400)
27
Stirling 60
30 young people at
Balfron School
and
30 young people at
Stirling High
28
2.1e Longitudinal random
samples
• Longitudinal = interest in study over time
• ‘Panel’ and ‘cohort’ samples
– recontact an initially random sample
– Problems of attrition
• Retrospective sample
– Rely on recall evidence of random selection
– Problems of selective recall
29
Issues in random sampling
• Only as good as underlying sampling
frame (a good one may not be available, or
not be as good as we think)
• Data analysis methods need adapting for
stratified / clustered designs
• Other survey factors interact with sample
selection issues, eg poor interviewers may
discourage certain cases from response
30
2.2) Opportunistic sampling
• More often in social research, sample
design is ‘opportunistic’ (‘purposive’)
– Random sampling is expensive
– Random sampling is complex
– Some purported random samples are actually
purposive (understanding of ‘random’)
31
2.2a Quota sampling
• Fill up quotas of groups of interest
• Quotas can ensure:
– overall representation (cf systematic)
– broad topic coverage (eg types of voter)
• Example: market researchers in street;
telephone call centres vetting contacts
• Biases: issues in how a quota ‘fills up’
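How a quota ‘fills up’ can be sketched as below. The stream of 200 passers-by and the quotas of 10 men and 10 women are invented for illustration.

```python
import random

def quota_sample(stream, group_of, quotas):
    """Accept cases in arrival order until every group's quota is full."""
    remaining = dict(quotas)
    sample = []
    for case in stream:
        group = group_of(case)
        if remaining.get(group, 0) > 0:
            sample.append(case)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled
    return sample

random.seed(5)  # fixed seed so the sketch is repeatable

# Hypothetical stream of passers-by in the street.
passers_by = [{"id": i, "sex": random.choice("MF")} for i in range(200)]
sample = quota_sample(passers_by, lambda p: p["sex"], {"M": 10, "F": 10})

print(len(sample), [p["id"] for p in sample])
```

The quotas guarantee the group totals, but within each group the sample is simply whoever arrived first. Nothing in the procedure makes those cases random.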
32
2.2b Snowball sampling
• Also ‘focussed enumeration’
• Technique for contacting cases from
populations rare / difficult to access
• Ask first obtained contact for suggested
further contacts
→ snowball gathers size…
• Eg – smaller ethnic minority groups
• Problem: social networks are non-random!
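The referral process can be sketched as a walk over a contact network. The names and the network below are made up, and breadth-first referral is only one possible way to model the technique.

```python
from collections import deque

def snowball_sample(contacts, seeds, max_size):
    """Each respondent suggests further contacts, who are approached in turn."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for referral in contacts.get(person, []):
            if referral not in seen:
                seen.add(referral)
                queue.append(referral)
    return sample

# Hypothetical contact network (who would suggest whom).
contacts = {
    "Asha": ["Bilal", "Chen"],
    "Bilal": ["Asha", "Dara"],
    "Chen": ["Asha"],
    "Dara": ["Bilal"],
    "Esi": ["Femi"],   # a separate network, never reached from Asha
    "Femi": ["Esi"],
}

sample = snowball_sample(contacts, seeds=["Asha"], max_size=10)
print(sample)  # ['Asha', 'Bilal', 'Chen', 'Dara']
```

Esi and Femi belong to the population but can never enter the sample, because selection follows social ties from the first contact.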
33
2.2c Convenience sampling
• Samples whatever cases from population
were easiest to reach, eg personal contacts
• Often no other sampling strategy involved
• Biases likely in convenience process
• Examples: …most social research survey
examples are ‘convenience’..!
34
Random vs Opportunistic
• Random difficult and expensive – mainly
government funded resources
• Most people who have conducted a survey
have conducted an opportunistic one
• Much data analysis / inference assumes
random sample, but not applied to
• But opportunistic data is often robust…
35
More on sampling methods
• Refs for sampling methods / properties: eg
Gilbert 2001 chpt5; De Vaus 2001 chpt6;
Bryman 2001 chpt4.
• Research reports: most important is
documentation of sampling process / issues
– To be open about research
– To consider unintentional mistakes
36
Summary on sampling methods
• Good sampling not a panacea
– Other elements of surveying equally crucial
• Many statistical methods assume random
sample
• For good sampling, use secondary data..
• All samples have some value, but non-random
ones need careful context
37