Lecture 4 - Survey design
• Sampling
• Sample size/precision
• Data collection issues
• Sources of bias
• Critical review of survey reports
1
Why do surveys?
• Information on particular population
– prevalence of a disease
– behaviour, knowledge, attitude
• Planning of services
• Collect information on data not routinely
available:
– e.g., mental health status, health behaviours
• Repeat surveys to monitor trends (serial cross-sectional studies)
2
Bias and precision of the survey
estimates
• Bias:
– selection bias relates to sample selection
– information bias relates to information
collected (measurements)
• Precision
– relates to sample size
3
Study bias and precision vs
measurement validity and
reliability
• Bias/validity:
– does measurement/study estimate reflect true
state of affairs
• Precision/reliability
– if measurement/study is repeated, will similar
result be obtained?
4
Reasons to sample
• Reduce cost
• Increase accuracy and quality of data
collected
5
Definitions
• Sampling unit
– person or group (e.g., household)
• Sampling frame
– list of sampling units in the population
• censuses
• electoral lists
• telephone lists
• are institutional populations excluded (e.g., prisons,
nursing homes)?
6
Target and study population
• Target population:
– population for generalization of results
• Study population:
– population for collection of data
– may be total target population or a sample
7
Types of sample
• Non-representative
– convenience
– volunteers
• Representative
– simple random
– systematic
– cluster
– multistage
8
Simple random sample
• Each sampling unit in the population has
equal probability of being included
• Sampling with replacement:
– each unit placed back in pool
• Sampling without replacement (usual
method):
– each unit selected is kept out of pool
9
Simple random sample (cont’d)
• Methods:
– manual
– tables of random numbers
– computer-generated random numbers
10
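As a minimal sketch of the computer-generated method, the Python standard library can draw a simple random sample with or without replacement. The sampling frame `frame`, the sample size of 50, and the seed are illustrative assumptions, not values from the lecture.

```python
import random

# Hypothetical sampling frame: unit IDs 1..1000 (assumed for illustration)
frame = list(range(1, 1001))
rng = random.Random(42)  # fixed seed so the draw can be reproduced

# Without replacement (usual method): a selected unit is kept out of the pool
srs_without = rng.sample(frame, k=50)

# With replacement: each draw is made from the full pool, so repeats are possible
srs_with = rng.choices(frame, k=50)
```

In both cases every unit in the frame has the same probability of being chosen on a given draw.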
Systematic sample
• Select every nth individual from a list
– can use existing numbers
– e.g., patient appointments, medical records
• Advantages:
– Does not require complete sampling frame
– Simple to carry out
• Disadvantages:
– May be unsuitable for cyclic or ordered data
(e.g., every 5th patient when only 5 patients are seen per day)
11
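Selecting every nth individual can be sketched as follows. The function name `systematic_sample`, the random start, and the list of appointment numbers are assumptions made for illustration; the lecture does not specify how the starting point is chosen.

```python
import random

def systematic_sample(units, n, rng=None):
    """Select every k-th unit after a random start, where k = len(units) // n."""
    rng = rng or random.Random()
    k = len(units) // n          # sampling interval
    if k < 1:
        raise ValueError("requested sample is larger than the list")
    start = rng.randrange(k)     # random start within the first interval
    return units[start::k][:n]

# Hypothetical list of 100 appointment numbers
appointments = list(range(1, 101))
sample = systematic_sample(appointments, 20, random.Random(1))
```

With an interval of 5 and a list that cycles every 5 entries, every selected unit would occupy the same position in the cycle, which is the problem with cyclic data noted above.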
Stratified sampling
• Separate sample selected from different
strata of population
• Requires separate sampling frame for each
stratum
• Useful if there are small but important
subgroups of the population (e.g., very old,
very young, institutionalized, sick)
12
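A rough sketch of stratified sampling, assuming one sampling frame per stratum and a separate sampling fraction for each. The stratum names, frame sizes, and fractions are hypothetical; they are chosen only to show a small subgroup being oversampled.

```python
import random

def stratified_sample(frames, fractions, rng):
    """Draw a simple random sample within each stratum at its own sampling fraction."""
    sample = {}
    for stratum, units in frames.items():
        n = round(len(units) * fractions[stratum])
        sample[stratum] = rng.sample(units, n)
    return sample

# Hypothetical frames: one list of unit IDs per age stratum
frames = {
    "65-74": [f"a{i}" for i in range(600)],
    "75-84": [f"b{i}" for i in range(300)],
    "85+": [f"c{i}" for i in range(100)],
}
# Assumed fractions that oversample the small, oldest stratum
fractions = {"65-74": 0.05, "75-84": 0.10, "85+": 0.30}
sample = stratified_sample(frames, fractions, random.Random(7))
```

Here each stratum contributes 30 units even though the oldest stratum is only a tenth of the hypothetical population.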
Cluster sampling
• Sampling unit is a group (e.g., household,
village, school)
• Step 1: Simple random sample of groups
• Step 2: All members of group included in
sample
• Advantages:
– enumeration of population not needed
– more efficient use of resources
13
Multistage sampling
• Larger units sampled in first stage, smaller
units later
• e.g.:
– stage 1 - sample of towns
– stage 2 - sample of city blocks or census tracts
– stage 3 - sample of households
14
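The difference between cluster and multistage sampling can be sketched with a single function. The name `two_stage_sample`, the towns, and the household counts are hypothetical, and only two stages are shown for brevity.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage 1: sample clusters. Stage 2: sample units within each selected cluster.

    Passing n_per_cluster=None keeps every member, which is cluster sampling.
    """
    chosen = rng.sample(sorted(clusters), n_clusters)
    result = {}
    for name in chosen:
        members = clusters[name]
        if n_per_cluster is None or n_per_cluster >= len(members):
            result[name] = list(members)
        else:
            result[name] = rng.sample(members, n_per_cluster)
    return result

# Hypothetical population: 10 towns, each with 40 households
towns = {f"town{t}": [f"t{t}-h{h}" for h in range(40)] for t in range(10)}
rng = random.Random(3)
multistage = two_stage_sample(towns, n_clusters=3, n_per_cluster=5, rng=rng)
cluster = two_stage_sample(towns, n_clusters=3, n_per_cluster=None, rng=rng)
```

Only the selected towns need a household list, which is why enumeration of the whole population is not needed.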
Sampling for “hidden populations”
• Homosexual men:
– gay bars, newspapers
• Injection drug users:
– convenience sample (e.g., treatment facilities)
– snowball sampling (through networks)
• Capture-recapture methods
– identify biases of sampling method
15
Planning a survey
• Define target population
• Select method of sampling
– sampling unit, sampling frame, etc
• Calculate sample size
• Define survey data collection methods
• Non-respondents
– number of attempts to reach
– different days, times
16
Sample size estimations
• Requirements:
– level of precision (width of confidence interval)
– expected variability (estimated from previous
studies, pilot study, or literature)
17
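For a prevalence estimate, the two requirements can be combined in the standard normal-approximation formula n = z² p(1 − p) / d², where d is the half-width of the confidence interval. The sketch below assumes simple random sampling, no finite-population correction, and no design effect; the input values are illustrative.

```python
import math

def sample_size_for_proportion(p, half_width, z=1.96):
    """Sample size so that a 95% CI (z = 1.96) for a proportion has the given half-width."""
    return math.ceil(z**2 * p * (1 - p) / half_width**2)

# Expected prevalence 10%, desired precision of +/- 2 percentage points
n_expected = sample_size_for_proportion(p=0.10, half_width=0.02)

# Most conservative case (p = 0.5), desired precision of +/- 5 percentage points
n_conservative = sample_size_for_proportion(p=0.50, half_width=0.05)
```

Expected variability enters through p(1 − p), which is largest at p = 0.5, so that value is often used when no prior estimate is available.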
Design of questionnaires
• List study variables
• Collect existing questions and instruments
• Adapt and/or develop new questions
• Format questionnaire
• Pre-testing (timing, responses, clarity, etc.)
• Revise, determine priorities, shorten
18
Question wording: clarity
• Use concrete rather than abstract terms, e.g.,
– During a typical week, how many hours do you
spend doing vigorous exercise?
– Not: How much exercise do you get?
• Avoid jargon, technical terms, slang
• Avoid double-negatives (Do you disagree that
doctors should not make house calls?)
• Use active vs passive voice (Has a doctor ever told
you vs Have you ever been told by a doctor?)
19
Question wording: clarity
– Break long sentences into short ones (20 words
or fewer)
– Use good grammar but use informal style
– Avoid hypothetical questions
– Evaluate reading level (normally not more than
8th grade)
20
Question wording: neutrality
• Do not suggest desirable response, e.g.:
– Not: do you ever drink alcohol?
– Better: how often do you drink alcohol?
• Give permission to give undesirable response e.g.:
– Sometimes people forget to take medications
their doctor prescribes. Do you ever forget (or
how often do you forget) to take your
medications?
21
Question wording
• Introduce attitude questions, e.g.:
– People have different opinions about their
medical care. We are interested in your opinion.
• Avoid double-barreled questions
– How much coffee or tea do you drink each day?
• Avoid assumptions
– How much help do you get from your family?
22
Response wording
• Make them short
• Use as few options as possible
• Consider different types of non-response:
– refuse
– don’t know
– no opinion
– not applicable
– omission by subject or interviewer
23
Response wording
• Make sure responses are mutually exclusive
(or give instructions to “check all that
apply”)
• Consider use of response card for multiple
questions with same set of responses
24
Organization of questionnaire
• Group questions by subject matter
• Introduce each group with short descriptive
statement (e.g., now I am going to ask you
some questions about your use of health
services)
• Begin with more emotionally neutral
questions
• More sensitive questions (e.g., income,
sexual function) near end of questionnaire
25
Organization of questionnaire
• interviewer-administered: repeat time frame
fairly frequently
• self-administered: repeat time frame at top
of each page or each set of questions, e.g.:
During the past year, how many times have you:
– Visited a doctor?
– Been a patient in an emergency department?
– Been admitted to hospital?
26
Organization of questionnaires
• Group questions with similar response scale
• Format skip patterns
– screener questions
– branching questions
• Time frame
– group questions that ask about same time frame
– “usual” behavior vs specified time period
– assist respondent with milestones to help define
reference time frame
27
Questionnaire mode
• Face-to-face
• Telephone
• Mail
• Other:
– diaries
• Mixed mode
28
Face-to-face interviews:
advantages
• reduce items with no response
• easier for older, less educated, lack of
fluency in language
• some formats easier to administer:
– skip patterns to avoid irrelevant questions
– open-ended questions - can probe for more
complete response
29
Face-to-face interviews:
disadvantages
• cost
• time
• effort (interviewer training, evaluation of
inter-rater reliability)
• interviewer biases
• differences in sociodemographic
characteristics of interviewer and subject
30
Telephone interviews:
advantages
• less expensive than face-to-face
• reduce items with non-response
• some formats easier to administer:
– skip patterns to avoid irrelevant questions
– open-ended questions - can probe for more complete
response
• large, representative samples can be organized from one
office
• avoids bias associated with appearance of interviewer
31
Telephone interviews:
disadvantages
• misses households without a telephone
• misses those with unlisted phone numbers
• bias when calls made during the day
• multiple calls may be needed
• perceived as intrusive by some
• difficult to administer items with multiple
response options
32
Mailed questionnaires:
advantages
• least expensive
• can be coordinated from one office
• social desirability minimized
• inconsistent results on completeness of
reporting (e.g., for number of MD visits)
33
Mailed questionnaires:
disadvantages
• relatively low response rates
– may be improved by: multiple mailings, cover letter, letterhead,
advance notice, token of appreciation, stamped self-addressed envelope (SSAE)
• difficult to get information on non-respondents
– differences between early and late responders
• items may be omitted: 5-10% may be unusable
• cannot control order of questions
• postal strikes
34
Analysis of surveys
• Missing data
– exclude
– imputation: e.g., based on characteristics of
respondents
– sensitivity of estimate to method of imputation
• Weighting of estimates
– for stratified samples
35
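Weighting matters when strata are sampled at different fractions. The sketch below weights each stratum's prevalence by its share of the population; all counts are hypothetical and the function name `weighted_prevalence` is an assumption.

```python
def weighted_prevalence(strata):
    """Combine stratum prevalences, weighting each by its population size."""
    total = sum(s["population"] for s in strata)
    return sum(s["population"] * s["cases"] / s["sampled"] for s in strata) / total

# Hypothetical strata: population size, number sampled, cases found in the sample
strata = [
    {"population": 6000, "sampled": 300, "cases": 6},
    {"population": 3000, "sampled": 300, "cases": 30},
    {"population": 1000, "sampled": 300, "cases": 90},
]

# Unweighted estimate pools the samples and over-represents the small stratum
crude = sum(s["cases"] for s in strata) / sum(s["sampled"] for s in strata)
weighted = weighted_prevalence(strata)
```

In this made-up example the pooled estimate is 14% while the weighted estimate is 7.2%, because the smallest stratum has the highest prevalence and was heavily oversampled.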
Analysis of surveys (cont’d)
• Crude estimates, confidence intervals
– Continuous data: Mean, median, quartile
– Categorical data: proportion
– Confidence intervals to describe precision
36
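The crude summaries for continuous data can be computed with the Python standard library, as in this minimal sketch. The data are invented, and the confidence interval uses a normal approximation, which is an assumption; a small sample like this one would normally call for a t-based interval.

```python
import math
import statistics

# Hypothetical continuous responses (e.g., hours of exercise per week)
hours = [0, 1, 2, 2, 3, 3, 4, 5, 6, 10]

mean = statistics.mean(hours)
median = statistics.median(hours)
q1, q2, q3 = statistics.quantiles(hours, n=4)  # quartile cut points

# Approximate 95% CI for the mean: mean +/- 1.96 standard errors
se = statistics.stdev(hours) / math.sqrt(len(hours))
ci_mean = (mean - 1.96 * se, mean + 1.96 * se)
```

The single large value pulls the mean above the median, which is one reason to report both for skewed data.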
Bias and precision of the survey
estimates
• Bias:
– selection bias relates to sample selection
– information bias relates to information
collected
• Precision
– relates to sample size
37
Selection bias in surveys
• Does the final analysis sample represent the
original target population?
• Sources of bias:
– sampling method
– non-response
– missing data
38
Information bias in surveys
• Bias in measurement of outcomes
• Sources of information bias:
– non-validated measurement instrument
– unblinded or poorly trained data collectors
– response set
– etc.
39
Critical review of an article
describing prevalence or incidence
(Loney et al, 1998)
• Are the study methods valid?
• What is interpretation of the results?
• What is the applicability of the results?
40
Are the study methods valid?
• Appropriate study design and sampling
methods
• Appropriate sampling frame
• Adequate sample size
• Suitable outcome
• Unbiased measurement of outcome
• Adequate response rate
41
What is interpretation of the
results?
• Are the estimates of prevalence or incidence
given with confidence intervals and in detail
by subgroup, if appropriate?
42
What is the applicability of the
results?
• Are the study subjects and the setting
described in detail and similar to those of
interest to you?
43
CSHA (Canadian Study of Health and Aging): Are the study methods
valid?
• Appropriate study design and sampling
methods
• Appropriate sampling frame
• Adequate sample size
• Suitable outcome
• Unbiased measurement of outcome
• Adequate response rate
44
CSHA: study design and sampling methods
• Prevalence survey with 2 analytic studies
appended
• Target population: Canadian population
aged 65 and over
• Exclusions:
– Yukon and NW territories
– Indian reserves, military units
– persons with life-threatening illnesses
– not fluent in French or English
45
CSHA: Appropriate study design
and sampling methods (cont’d)
• 18 study centres across Canada
• 36 cities and surrounding rural areas
– selected for accessibility to study centres
– included 60% of population aged 65+
46
Sampling frame: community sample
• Sampling frame for community sample:
– Medicare (provincial health insurance plans)
– In Ontario: used Enumeration Composite
Record (aggregate based on election records
and municipal records)
• Stratified random sampling by age:
– 65-74
– 75-84 (twice sampling fraction of 65-74)
– 85+ (2.5x sampling fraction of 75-84)
47
Sampling frame: institutional
sample
• Nursing homes, chronic care facilities,
collective dwellings (e.g., convents)
• 3 centres sampled from insurance lists
• Other centres used multistage sampling:
– stratified sample of institutions:
• small (up to 25 beds)
• medium (26 - 100 beds)
• large (more than 100 beds)
– random sampling within selected institutions
48
Sampling (cont’d)
• A person who could not be contacted or who
refused was replaced with another of the
same age group, sex, and geographic
region.
• Target for each region:
– 1800 from community sample
– 250 in institutional sample
49
Adequate sample size?
• Target sample in each region:
• 1800 in community
• 250 in institutions
• Assuming institutional prevalence of 50%
– 95% CI of ±6%
• Assuming community prevalence of 5%
– 95% CI of ±1%
50
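The precision figures on this slide can be checked with the normal-approximation half-width for a proportion, z × sqrt(p(1 − p)/n). This is a sketch under the assumption of simple random sampling within each setting; it ignores the stratified design, and the function name `ci_half_width` is hypothetical.

```python
import math

def ci_half_width(p, n, z=1.96):
    """Half-width of the normal-approximation 95% CI for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# Institutional sample: assumed prevalence 50%, target n = 250
institution = ci_half_width(p=0.50, n=250)   # about 0.062

# Community sample: assumed prevalence 5%, target n = 1800
community = ci_half_width(p=0.05, n=1800)    # about 0.010
```

Both values agree with the slide when the quoted percentages are read as half-widths of the interval, in percentage points.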
Suitable outcome
• 2-stage process
• 3MS screen in subject’s home
• all with positive screen (score of <78) and
random sample of those with negative
screen referred for clinical evaluation
• DSM III-R criteria for final diagnosis
51
Unbiased measurement of
outcome
• Interviewers and clinical team (nurse,
psychometrician, neuropsychologist,
physician) were blind to screening result
• Negative screens included
52
Response rate: community sample
• 19,398 people on community sample lists
– 3,753 had died, were wrong age, had left study area, or
institutionalized
– 1,020 could not speak English or French
– 534 away or in hospital during study period
• 14,091 (72.6%) eligible for study
– 1,601 could not be contacted
– 3,482 refused
• 9,008 participated (63.9% of those eligible)
• 8,949 screened (59 who could not be screened referred for
clinical assessment)
53
Response rate: community sample (cont)
• Among those with positive screen (1,614):
– 508 (31%) refused clinical assessment
• Among sample of those with negative
screen (731):
– 228 (31%) refused clinical assessment
• Total participation rate (screening and
clinical assessment): 0.69 × 0.64 = 0.44
54
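The community-sample arithmetic on the two preceding slides can be reproduced step by step. The counts are taken from the slides; the variable names are illustrative.

```python
# Counts from the community sample slides
listed = 19398
ineligible = 3753 + 1020 + 534          # died/moved/wrong age, language, away
eligible = listed - ineligible
participated = eligible - 1601 - 3482   # minus not contacted and refused

eligible_share = eligible / listed              # share of list that was eligible
screening_rate = participated / eligible        # participation among eligible

# Clinical assessment: proportion who did not refuse
clinical_rate_positive = 1 - 508 / 1614
clinical_rate_negative = 1 - 228 / 731

# Overall participation, using the rounded rates quoted on the slide
overall = 0.69 * 0.64
```

Multiplying the two stage-specific rates shows how moderate losses at each stage compound into an overall participation of under half.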
Response rate: institutional sample
• 1,817 subjects in sample
– 154 died, assigned to wrong age group, left
study area or institution
– 46 could not speak French or English
– 31 in hospital
• 1,586 (87.3%) eligible
– 50 could not be contacted
– 281 refused
• 1,255 (79.1%) participated in screening
55
CSHA: interpretation of the results
• Prevalence: 8.0% overall
• 95% CI given
• Subgroups:
– age group (65-74, 75-84, 85+)
– sex
– setting (community or institution)
– region of Canada
56