Sampling
Class 7

Goals of Sampling
- Representation of a population
- Representation of a specific phenomenon or behavior that is infrequent in the population
- Ensuring sufficient power for statistical analysis

Types of Samples
- Probability Samples
  - Simple Random Samples
  - Stratified Random Samples
  - Cluster Samples
  - Matched Samples (Case Controls)
- Non-Probability Samples
  - Systematic Samples
  - Quota Samples
  - Purposive Samples
  - Theoretical Samples

Simple Random Samples
- We use simple random samples when we don't know how a phenomenon is distributed in the population, or when we assume that the probability of an event is equal for all persons in the population, or when we assume that the population characteristics that may bear on the phenomena being studied are evenly distributed in the population (EPSEM, the equal probability of selection method)
- Examples
  - Monitoring the Future – annual survey of high school youths
  - News polls
  - General Social Survey

Stratified Random Samples
- We use stratified random samples when we believe that these population characteristics are not evenly distributed; in that case a simple random sample would not ensure representativeness of the population.
- Stratification means that we first identify specific population characteristics or groups, and then sample cases within each group.
- Examples
  - School research
- Selection of stratifying variables?
  - Theoretical concerns
  - Demographic concerns
- We oversample when we need sufficient cases of a population that has a low base rate in the overall population, and when even stratification procedures may not yield sufficient cases for comparison of these groups
  - Example – Adolescent Health – oversample kids who engage in risky behavior

Cluster Samples
- Cluster samples are used when subjects are widely dispersed spatially or socially. Thus, we identify the social or spatial units first, take a sample of these, and then sample specific subjects within each of the sampled units.
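The two steps of a cluster sample (sample the units, then sample subjects within each sampled unit) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `two_stage_cluster_sample`, the `schools` frame, and the sample sizes are all hypothetical and do not come from the slides.

```python
import random


def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Draw a two-stage cluster sample.

    clusters: dict mapping a cluster name to a list of subjects (hypothetical frame).
    Stage 1 samples clusters at random; stage 2 samples subjects within each one.
    """
    rng = random.Random(seed)
    # Stage 1: simple random sample of the social or spatial units.
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for name in chosen:
        members = clusters[name]
        # Stage 2: simple random sample of subjects inside the chosen unit.
        k = min(n_per_cluster, len(members))
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical sampling frame: 10 schools with 30 students each.
schools = {f"school_{i}": [f"s{i}_{j}" for j in range(30)] for i in range(10)}
picked = two_stage_cluster_sample(schools, n_clusters=3, n_per_cluster=5, seed=42)
for school, students in picked.items():
    print(school, students)
```

Only the sampled schools need a full roster, which is the practical appeal when subjects are widely dispersed.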
- This approach, sampling units first and then subjects within them, is called a multi-stage cluster sampling procedure
- Example: Lawyer Satisfaction Study
  - Stratify by type of practice and area of law (e.g., oversample patent lawyers), BUT let other characteristics (e.g., demographics) vary naturally
  - Question – how, in this example, should we deal with years of practice?

Case Controls
- Matched Samples
- Matched Cases
  - Fair Housing Checks
  - Employment Discrimination Checks
- Matching on other sampling units?
  - Schools, Job Types, Type of dwelling

Non-Probabilistic Samples

Systematic Samples
- Convenient but flawed. You sample based on a consistent parameter, but with a sample whose representativeness of the population is uncertain.
- The best-known examples are election exit polls and market research at a shopping mall.

Quota Sample
- Ensures adequate representation of specific groups, but not with the goal of constructing a representative sample of the population.
- Generally useful when a phenomenon is not randomly distributed but concentrated, or when practical issues prevent the use of probability-based techniques.
- Example – survey of second-generation immigrants

Purposive Samples
- Useful in generalizing to a specific phenomenon when the independent variable is not widely distributed. For example, we may want to look at the effects of particular occupations on job satisfaction, but these occupations may be rare (e.g., driving instructors, stenographers). We sample by identifying these individuals and conducting observations on as many as are needed to make valid statistical inferences.
- Examples:
  - People with unusual jobs (e.g., driving instructors, stenographers)
  - Consumers of unusual products
  - Persons with rare diseases

Theoretical Samples (a.k.a. snowball samples)
- Sampling on the dependent variable when it is not widely distributed and its population parameters are unknown (precluding other sampling techniques).
- Examples: People engaged in rare and hard-to-find behaviors
- These raise problems in inference, but there are considerable strengths in internal validity

Technological Samples (?)
- How valid are net-based surveys?
  - Sample frame
  - What do such samples represent?
  - Who is missing?
- Case Study – Knowledge Networks (www.knowledgenetworks.com)
  - Random-digit dialing telephone methodology to recruit sample
  - Uses known probabilities of selection associated with geographical locations
  - Confirmation by Express Mail delivery and instructions for telephone enrollment
  - Panel is about 40,000 members
  - Average stay of 3 years on the panel
  - Participating households get free hardware, software, Internet service, email accounts, and technical support
  - Member households get about one multimedia survey per month, and three surveys in total per month
  - Commercial and academic surveys
- Advantages?
  - Lack of interviewer reduces bias
  - Broader range of stimulus materials

Issues in Sample Construction
- Sample attrition and mortality
- Sample size
- Over-samples to compensate for low base rates or specific theoretical questions
- Practical limitations in sampling
- Sample error
  - The degree of error for a particular sampling design:
    $s = \sqrt{\dfrac{P \times Q}{n}}$
    where P and Q are parameters (for a proportion, Q = 1 − P), n = sample size, and s = standard error
  - http://www.dssresearch.com/toolkit/secalc/error.asp
- Sample weighting

Power Considerations
- Power is the ability of a test to detect relationships that exist in the population
- Statistical power is generally defined as the ability of a design to reject the null hypothesis when it is false. In other words, power gives you the probability of not making a Type II Error. Therefore, Power = 1 − β
- When a study has low power, effect size estimates will be less precise (have wider confidence intervals) and we may incorrectly conclude that the cause and effect do not covary.
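To make Power = 1 − β concrete, here is a minimal sketch that approximates the power of a two-sided comparison of two group means. It assumes a normal (z-test) approximation, equal group sizes, and a standardized effect size d; the function name `power_two_group` and the example values are hypothetical, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def power_two_group(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group z-test.

    d: standardized difference between group means (assumed effect size).
    n_per_group: number of cases in each group (assumed equal).
    alpha: Type I error rate.
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)  # rejection threshold for the test statistic
    shift = d * sqrt(n_per_group / 2)  # expected test statistic under the alternative
    # Probability of landing in either rejection region when the effect is real.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


for n in (20, 64, 200):
    power = power_two_group(d=0.5, n_per_group=n)
    print(f"n per group = {n:3d}  power = {power:.3f}  beta = {1 - power:.3f}")
```

With d = 0 the function returns alpha itself, which is a useful sanity check: when there is no effect, the only rejections are Type I errors.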
- Type I Error – False Positive (α)
- Type II Error – False Negative (β)
- Power = 1 − β
- It’s easy to get statistical significance with a large sample, but the result is not terribly important (theoretically) if the effect size is quite small
- The (hypothetical) effect size is determined from what is practically or theoretically important and significant. So, you want to specify a difference between groups that is meaningful, that is worth detecting.
- By convention, a power of less than 0.80 is taken to suggest a weakness in the sampling design of a study.
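The link between a meaningful effect size and the 0.80 power threshold can be sketched as a sample-size calculation. This is a rough illustration that assumes a two-sided, two-group comparison under a normal approximation; the function name `n_per_group_for_power` and the effect sizes tried are hypothetical, and an exact t-test calculation would give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_for_power(d, power=0.80, alpha=0.05):
    """Approximate cases needed per group to detect effect size d.

    Uses the normal-approximation formula n = 2 * ((z_alpha + z_power) / d) ** 2,
    where d is the standardized difference judged worth detecting (an assumption).
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the Type I error rate
    z_power = z.inv_cdf(power)  # quantile matching the target power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Smaller differences worth detecting require much larger samples.
for d in (0.8, 0.5, 0.2):
    print(f"effect size d = {d}: about {n_per_group_for_power(d)} cases per group")
```

Because n scales with 1/d², halving the effect size you want to detect roughly quadruples the required sample, which is why the effect size should be fixed on substantive grounds before the sample is drawn.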