Survey
Lecture 11 of 47C5 Social Research Process I:
Sampling in Quantitative Research I
Paul Lambert, 14.10.03, 4-5pm

47C5: Survey research lectures
Lecture 8: The Survey Method
    Intro. to & qualities of survey method
Lecture 9: Using Secondary Datasets
    Data access and issues
Lectures 11/12: Sampling
    Sample design, data collection / analysis

Resources for lectures 8, 9, 11, 12
• Lecture slides on WebCT site
• 2 reading lists:
  – Initial list in 47C5 unit outlines
  – Some additions on further list on WebCT site

Web resources for lectures 8, 9, 11, 12
• Slides and additional reading list also at:
  http://staff.stir.ac.uk/paul.lambert/teaching.htm
• Some other internet resources (cf De Vaus 2002):
  http://trochim.human.cornell.edu/kb/
  http://statcomp.ats.ucla.edu/survey

L11/12: Surveys and Sampling
Lecture 11:
  1) Role of sampling in social surveys
  2) Types of sampling methods
Lecture 12:
  3) Good practice in survey conduct
  4) Robust analysis of survey data

Part 1: Role of sampling in survey research
• Surveys can be censuses
• More often they are samples from a wider population
• Several sampling methods can be used to select cases
• Aim: a sample that is representative of the wider population

Inference
• Key idea is inference = confidence in our ability to generalise
• Sampling inference = application of statistical theories in order to estimate the probability that a sample result is ‘likely to have been unrepresentative’

The ‘normal’ (Gaussian) curve
[Figure: the normal curve]

Theories of sampling methods
Sampling and probability theories tell us that any particular random sample is most likely to have the same properties as the wider population.
We can then estimate the probability that sample results of a particular nature could have arisen by chance, rather than because they are the same as the population result.
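The claim that a random sample is most likely to resemble the wider population can be checked with a small simulation. The sketch below is illustrative only: the population of heights, the tolerance of 0.02 and the function names `sample_means` and `share_close` are assumptions made for this example, not part of the lecture.

```python
import random
import statistics

def sample_means(population, sample_size, n_samples, seed=0):
    """Draw repeated simple random samples and return each sample's mean."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]

def share_close(means, target, tolerance):
    """Proportion of sample means lying within `tolerance` of `target`."""
    return sum(abs(m - target) <= tolerance for m in means) / len(means)

# Hypothetical population: 10,000 heights in metres (made-up values).
_rng = random.Random(42)
population = [_rng.gauss(1.70, 0.10) for _ in range(10_000)]
pop_mean = statistics.mean(population)

# For each sample size, how often does a random sample land close
# to the population mean?
for n in (10, 100, 1000):
    means = sample_means(population, n, n_samples=500)
    print(n, share_close(means, pop_mean, tolerance=0.02))
```

Under these assumptions, the share of samples close to the population mean rises with the sample size, while a small share of samples still lands further away by chance alone. Estimating that chance is what sampling inference does.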
If the cases in a sample survey were selected at random, then we can use sampling theories, and thus ‘inference’.

‘Inferential data analysis’
• Variable-by-case matrix data analysis for generalising findings to the population
• Often distinguished from ‘descriptive’ data analysis (results of the sample only)
• Key: joint influence of
  – 1) size of sample
  – 2) strength of data pattern
  in increasing confidence about generalisations

Statistical inference
• Causes confusion; one of the hardest parts of survey data analysis to understand
• Phrases: ‘significance level’, ‘p-value’, ‘confidence interval’, ‘hypothesis testing’, ..
• Meaning: whether results would probably generalise to a larger population (if the sample is treated as random)
• See: Refs for L11 part 1 (supplementary list)

Critiques of survey generalisation
1) Part of the ‘fall of survey methods’ (1960s):
• Sampling is not representative; sampling is systematically biased
• Inferential conclusions too carelessly made and too strongly stated
• See for example Cicourel 1964

Critiques of survey generalisation
2) Deconstructing inference (1980s):
• Inferential methods over-relied upon; survey analysis becomes a theory-free hunt for ‘significant’ patterns
• Inference needed less than often suggested
• Bad variable analysis (operationalisation) affects inference results, eg (non-)parametric variables; data clustering; …
• See for example Rose and Sullivan 1996 p192-5

Contemporary survey research
Tends to use 2 strategies to address critiques:
• Large scale, often secondary, rigorous methods
or
• Small scale, primary, claims carefully qualified

Terms in sample survey analysis
• Population: all cases of interest
• Sampling frame: list of all potential cases
• Sample: cases selected for analysis
• Sampling method: technique for selecting cases from the sampling frame
• Sampling fraction: proportion of cases from the population selected for the sample (n/N)

Survey analysis: ‘variable-by-case matrix’

Cases   Variables
1       1    17    1.73    A    .
2       1    18    1.85    B    .
3       2    17    1.60    C    .
4       2    18    1.69    A    .
.       .    .     .       .    .
.       .    .     .       .    .
N       .    .     .       .    .

Sample Surveys (case selection)
[Diagram: n=4 cases are selected from a population of N=8 cases; the selected cases make up the variable-by-case matrix]

Cases   Variables
1       1    17    1.73    A
2       1    18    1.85    B
3       2    17    1.60    C
4       2    18    1.69    A

Part 2: Sampling methods and techniques
= Ways of selecting cases from the population
2.1 Random (probabilistic)
    Generalisable, inferential statistics, fewer applications
2.2 Non-random (opportunistic; purposive)
    Harder to generalise, inference contested, more widely used

2.1a Simple Random Sample
• A statistical method is used to choose cases randomly (eg random numbers)
• Every case in the population has exactly the same chance of being in the sample
• Most data analysis techniques were initially designed for simple random samples

2.1b Systematic Random Sample
• Like the SRS, selects cases from anywhere in the whole population
• An easier selection method: choose every (n)th person for the sample
• Danger of ‘periodicity’ if the original population order has any structure or bias

Problems with sample methods selecting from whole population
• The ‘random’ part means it is always possible to get a population coverage quite different from known structures
• If the total population is large or dispersed, then coverage of random parts of it is expensive and time consuming: few surveys use random sampling from the whole of the UK

2.1c Stratified random samples
• Modifies the random sample to ensure even (or ‘boost’) coverage of population groups
  – split sampling frame by stratification factors
  – select random samples within each factor
  – final sample has correct proportions of each
  – Example: select 490 M and 510 F
• Properties: proportionate sample, correct representations; but more expensive & complex, should use ‘weights’ for analysis

2.1d Multistage cluster samples
• i) Select clusters of population at random
• ii) Sample randomly within clusters
• Eg: clusters = local authorities in UK
  – With qualifications, may still be treated as ‘random’ for analysis purposes
  – Big reduction in costs if face-to-face contacts
• Most widely favoured sampling method in large scale survey collections

Example: Multistage cluster sample
• Interest: attitudes of Scottish school pupils
• Resources: 400 interviews with pupils

[Map of Scotland: number of interviews by area]
Shetlands 2
Highlands 40
Islands 20
Moray 20
Aberdeen 40
Perth 20
Edinburgh 100
Argyll 24
Borders 10
Glasgow 124

[Map of Scotland: number of interviews by area]
Moray 40
Stirling 60
Edinburgh 150
Glasgow 150

Stirling 60
30 young people at Balfron School and 30 young people at Stirling High

2.1e Longitudinal random samples
• Longitudinal = interest in study over time
• ‘Panel’ and ‘cohort’ samples
  – Recontact an initially random sample
  – Problems of attrition
• Retrospective sample
  – Rely on recall evidence of random selection
  – Problems of selective recall

Issues in random sampling
• Only as good as the underlying sampling frame (a good one may not be available, or may not be as good as we think)
• Data analysis methods need adapting for stratified / clustered designs
• Other survey factors interact with sample selection issues, eg poor interviewers may discourage certain cases from responding

2.2) Opportunistic sampling
• More often in social research, sample design is ‘opportunistic’ (‘purposive’)
  – Random sampling is expensive
  – Random sampling is complex
  – Some purported random samples are actually purposive (understanding of ‘random’)

2.2a Quota sampling
• Fill up quotas of groups of interest
• Quotas can ensure:
  – overall representation (cf systematic)
  – broad topic coverage (eg types of voter)
• Example: market researchers in the street; telephone call centres vetting contacts
• Biases: issues in how a quota ‘fills up’

2.2b Snowball sampling
• Also ‘focussed enumeration’
• Technique for contacting cases from populations that are rare / difficult to access
• Ask the first obtained contact for suggested further contacts; the snowball gathers size…
• Eg: smaller ethnic minority groups
• Problem: social networks are non-random!
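The snowball procedure described in 2.2b (ask each obtained contact for further contacts) can be sketched as a walk over a contact network. This is a minimal illustration under stated assumptions: the network `contacts`, the parameter `referrals_per_case` and the function name `snowball_sample` are hypothetical and not taken from the lecture.

```python
import random
from collections import deque

def snowball_sample(contacts, seeds, target_size, referrals_per_case=2, seed=0):
    """Grow a sample by asking each recruited case for further contacts.

    contacts: dict mapping each case to the people they could refer.
    seeds: the first cases the researcher manages to reach.
    """
    rng = random.Random(seed)
    sample = []
    seen = set(seeds)          # cases already recruited or queued
    queue = deque(seeds)
    while queue and len(sample) < target_size:
        case = queue.popleft()
        sample.append(case)
        # Only people not yet recruited or queued can be newly referred.
        new = [c for c in contacts.get(case, []) if c not in seen]
        rng.shuffle(new)
        for referred in new[:referrals_per_case]:
            seen.add(referred)
            queue.append(referred)
    return sample

# Hypothetical network: two groups with no links between them.
contacts = {
    "A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B"], "D": ["B"],
    "X": ["Y"], "Y": ["X"],
}
print(snowball_sample(contacts, seeds=["A"], target_size=4))
```

In this made-up network, a snowball started from case A can only ever reach A, B, C and D. Cases X and Y have no chance of selection, which is one way of seeing why non-random social networks are a problem for this method.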
2.2c Convenience sampling
• Samples whatever cases from the population were easiest to reach, eg personal contacts
• Often no other sampling strategy involved
• Biases likely in the convenience process
• Examples: …most social research survey examples are ‘convenience’..!

Random vs Opportunistic
• Random is difficult and expensive
  – mainly government funded resources
• Most people who have conducted a survey have conducted an opportunistic one
• Much data analysis / inference assumes a random sample, but is not applied to one
• But opportunistic data is often robust…

More on sampling methods
• Refs for sampling methods / properties: eg Gilbert 2001 ch. 5; De Vaus 2001 ch. 6; Bryman 2001 ch. 4.
• Research reports: most important is documentation of the sampling process / issues
  – To be open about research
  – To consider unintentional mistakes

Summary on sampling methods
• Good sampling is not a panacea
  – Other elements of surveying are equally crucial
• Many statistical methods assume a random sample
• For good sampling, use secondary data..
• All samples have some value, but non-random ones need careful context
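The random selection methods of section 2.1 (simple random, systematic, stratified and multistage cluster samples) can be sketched in a few lines each. This is a rough illustration, not a survey-grade implementation: the function names, the frame of 10,000 cases, the cluster names and the equal allocation of 100 interviews per cluster are all assumptions made for the example. Only the 490 M / 510 F split and the total of 400 interviews echo the slides.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every case in the frame has the same chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Take every k-th case after a random start (assumes n <= len(frame))."""
    k = len(frame) // n
    start = rng.randrange(k)
    return frame[start::k][:n]

def stratified_sample(frame, n, stratum_of, rng):
    """Proportionate allocation: sample within each stratum by its share.

    Rounding means the total can differ slightly from n in general.
    """
    strata = {}
    for case in frame:
        strata.setdefault(stratum_of(case), []).append(case)
    sample = []
    for cases in strata.values():
        share = round(n * len(cases) / len(frame))
        sample.extend(rng.sample(cases, share))
    return sample

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage i: pick clusters at random. Stage ii: sample within each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {c: rng.sample(clusters[c], n_per_cluster) for c in chosen}

rng = random.Random(1)

# Hypothetical sampling frame: 4,900 men and 5,100 women.
frame = [("M", i) for i in range(4900)] + [("F", i) for i in range(5100)]

srs = simple_random_sample(frame, 1000, rng)
systematic = systematic_sample(frame, 1000, rng)
stratified = stratified_sample(frame, 1000, lambda case: case[0], rng)
print(sum(1 for sex, _ in stratified if sex == "M"),
      sum(1 for sex, _ in stratified if sex == "F"))

# Hypothetical clusters: 10 areas of 500 pupils; 4 areas x 100 = 400 interviews.
clusters = {f"area_{j}": [f"pupil_{j}_{i}" for i in range(500)] for j in range(10)}
interviews = two_stage_cluster_sample(clusters, 4, 100, rng)
print(sorted(interviews))
```

The simple random sample will usually come out close to, but not exactly at, the 49/51 split of the frame, whereas the stratified version fixes the split by design. The cluster version concentrates all 400 interviews in four areas, which is where the saving in fieldwork cost comes from.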