Principles and Strategies of Quantitative Data Analysis
SLC515 Research Methods for Socio-Legal Studies and Criminology 2007/2008

Outline
* Foundation of QR: Positivism
* Validity and reliability in QR
* Core issues of concern in QR
* Critique of QR

Outline
* Descriptive and inferential statistics
* Inferential Statistics
  * Crosstabs
  * Correlation
  * Regression

Positivism - Historical Facts
* 16/17th century: blooming of European thought
  * Beginning of modern science
* Auguste Comte (1798-1857): Sociology as the “queen” of social sciences
  * “Social physics”; idea of progress, social science works like natural science
  * Precise and certain methods, basing theoretical laws on sound empirical observation
  * Knowledge is derived from empirical evidence
* 19th century: natural sciences gain influence; impacts thinking in social science

Positivism - Emile Durkheim (1858-1917)
* Human and material phenomena are equally real but human phenomena cannot be reduced to pure material facts.
* Social facts - society as a moral reality, expressed in institutions such as law, religion etc. which are external to us and constrain us.
* Sociologists should describe characteristics of facts and explain how they came into being.
* Explanation of social facts by causes: single cause effect, law-like relationship
* Same general methods of scientific inquiry can be used.

Positivism - Key Elements
* Social research as ‘science’
* Universal laws - testing theories
* Cause and effect relationships between variables
* Solid methods
* Value neutral
* Objectivity

Science is …

| Positivism | Interpret. Soc. Science | Critical perspective |
|---|---|---|
| Based on strict rules, procedures | Just common sense (no science) | Between the positions of positivism and interpretivism |
| Deductive | Inductive | Emancipating, empowering |
| Nomothetic (based on laws) | Relies on interpretations | Brain-washed, misled, conditioned |
| Value free | Not value free | Not value free |

Adapted from Sarantakos (1993) Table 2.2, page 38.

Purpose of research …

| Positivism | Interpr. Soc. Science | Critical perspective |
|---|---|---|
| To explain facts/causes/effects | To interpret the world | To get below the surface; to expose real relations |
| To predict | To understand social life | To disclose myths and illusions |
| | | Emphasises removing false beliefs/ideas, emancipation and empowerment |

Adapted from Sarantakos (1993) Table 2.2, page 38 & 39.

Theories, Hypotheses and Research Design
* Deductive approach
* Theory testing
* Derive hypotheses from theory (if… then… sentences) and test them
* Research design:
  * Cross-sectional survey
  * Longitudinal design
  * Case study design
  * Comparative design

Operationalisation
* Translation of a theoretical concept into something that can be measured
* Example (natural sciences): temperature - degrees Celsius, velocity - km/h
* How do you measure the frustration caused by unemployment or the level of alienation in a society or a society’s satisfaction with its government?
* Operational definition through indicators

Indicators
* Difference between a measure and an indicator (quantities versus complex concepts)
* An indicator is employed as though it were a measure of a concept.
* Example: job satisfaction

Research Sites and Subjects
* Depending on research design, methods and sources of data
* Establish an appropriate setting:
  * Decisions are involved: where? and who?
* Sampling strategies
  * Probability and non-probability sampling
  * Representative sample - generalizability

Collecting and Processing Data
* Depends on the chosen research design:
  * Experiments: pre- and post-testing
  * Survey interviews: questionnaire and interviews
  * Etc.
* Gathered information is then transformed into ‘data’
* Information will be quantified - coding - to be processed by a computer

Analysing Data and Research Findings
* Statistical techniques/analysis, special software
* Results/findings have to be interpreted based on theoretical reflections in the beginning (verification/falsification of hypotheses)
* Objectives of QR:
  * Support or reject theoretical concepts or findings of other studies
  * Detecting trends, patterns
  * Uncover common sense knowledge
  * Building typologies

Writing up Findings and Conclusion
* Results enter the public domain
  * Conference paper, article, report, thesis, book
* Significance and validity of findings
* Implications? (policy advice etc.)
* Presentation of quantitative data differs from that in qualitative research

Validity and Reliability
* Validity, reliability and generalizability are measures of the quality, rigour and wider potential of research
* Validity = are you observing what you want to observe (construct validity)
  * Is your set of indicators really measuring what you want to measure?
* Reliability = are the measures, devised for the concept, consistent (stability of measure)
  * Stability over time, consistency of indicators (internal reliability) and observers (inter-observer consistency)

Core Issues of Concern in QR
* Measurement
* Causality - Explanation (dependent and independent variable)
* Generalisation (representative sample)
* Replication
* Testing theory

Critique of QR
* Positivism vs. interpretive social sciences
* Objectivity?
* Generalization - but too simplistic?
* Causality

Contrasting Qualitative and Quantitative Research

| Quantitative | Qualitative |
|---|---|
| Numbers | Words/Text |
| Researcher’s view | Participant’s view |
| Researcher distant | Researcher close |
| Theory testing | Theory emergent |
| Static | Process |
| Structured | Unstructured |
| Generalization | Contextual |
| Hard, reliable data | Rich, deep data |
| Macro | Micro |
| Behaviour | Meaning |
| Artificial setting | Natural setting |

Descriptive and Inferential Statistics
* Univariate
  * Descriptive analysis of one variable (column in data set)
* Bivariate
  * Relationship between two variables
  * Dependent and independent variables
  * Relation between dependent and independent variables
  * Differences between dependent and independent variables

Bivariate Analysis
* Questions we can ask:
  * Is the relationship significant?
  * If so, how strong is the relationship?
  * In which direction does the relationship go?
    * Positive relationships
    * Negative relationships
* Some statistical tests:
  * Crosstabulation
  * Correlation
  * Regression analysis

Cross Tabs
* All levels of measurement are allowed
* Cross tabs express common frequencies of the categories of two different variables
* Significance test: chi-square
* How strong?: lambda, gamma, r²
* Which direction?: gamma, tau

Example

| | ? | Highschool | Coll | Uni | Row total |
|---|---|---|---|---|---|
| Working class | 748 (50.4%) | 87 (13.9%) | 10 (10.2%) | 21 (5.1%) | 866 (33.1%) |
| Middle class | 694 (46.8%) | 474 (75.6%) | 67 (68.4%) | 266 (64.9%) | 1501 (57.3%) |
| Upper class | 42 (2.8%) | 66 (10.5%) | 21 (21.4%) | 123 (30.0%) | 252 (9.6%) |
| Column total | 1484 (56.7%) | 627 (23.9%) | 98 (3.7%) | 410 (15.7%) | 2619 (100%) |

Cell percentages are column percentages; percentages in the totals are shares of all 2619 cases.

Example

Grouped literacy rates * Grouped GDP Crosstabulation

| Grouped literacy rates | | Very low GDP | Low GDP | Medium GDP | High GDP | Very high GDP | Total |
|---|---|---|---|---|---|---|---|
| Very low literacy | Count | 5 | 1 | 0 | 0 | 0 | 6 |
| | % within Grouped literacy rates | 83.3% | 16.7% | .0% | .0% | .0% | 100.0% |
| | % within Grouped GDP | 17.9% | 3.6% | .0% | .0% | .0% | 5.6% |
| Low literacy | Count | 15 | 4 | 0 | 0 | 0 | 19 |
| | % within Grouped literacy rates | 78.9% | 21.1% | .0% | .0% | .0% | 100.0% |
| | % within Grouped GDP | 53.6% | 14.3% | .0% | .0% | .0% | 17.8% |
| Medium literacy | Count | 5 | 7 | 5 | 1 | 2 | 20 |
| | % within Grouped literacy rates | 25.0% | 35.0% | 25.0% | 5.0% | 10.0% | 100.0% |
| | % within Grouped GDP | 17.9% | 25.0% | 22.7% | 16.7% | 8.7% | 18.7% |
| High literacy | Count | 3 | 16 | 17 | 5 | 21 | 62 |
| | % within Grouped literacy rates | 4.8% | 25.8% | 27.4% | 8.1% | 33.9% | 100.0% |
| | % within Grouped GDP | 10.7% | 57.1% | 77.3% | 83.3% | 91.3% | 57.9% |
| Total | Count | 28 | 28 | 22 | 6 | 23 | 107 |
| | % within Grouped literacy rates | 26.2% | 26.2% | 20.6% | 5.6% | 21.5% | 100.0% |
| | % within Grouped GDP | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% |

Correlation
* Correlations measure statistical associations, but do not allow any inferences about causal patterns
* Requirement:
  * Ordinal and interval data
  * Normally distributed and linear relation
* Coefficient:
  * Pearson’s r (interval data)
  * Spearman’s correlation coefficient (ordinal data)

Example

Correlations

| | | People who read (%) | Gross domestic product / capita |
|---|---|---|---|
| People who read (%) | Pearson Correlation (r) | 1 | .552 |
| | Sig. (2-tailed) | | .000 |
| | N | 107 | 107 |
| Gross domestic product / capita | Pearson Correlation (r) | .552 | 1 |
| | Sig. (2-tailed) | .000 | |
| | N | 107 | 109 |

Regression Analysis
* You can visualize correlation in a scatter diagram
* Regression line
* Regression coefficients
* Formula: y=a+b(x)
* Requirements:
  * Interval data
  * Normally distributed
  * Linear relationship

Example

Controlling for Variables
* Purpose of controlling for variables: e.g. exploration of variables that intervene in the relationship between other variables
* Example: examination of the relationship between GDP and literacy rates in different regions of the world
  * Independent variable: GDP
  * Dependent variable: literacy
  * Control variable: regions

Example
* Pearson’s r was used to measure the correlation between GDP and literacy rates in three regions of the world

| Region | Pearson’s r |
|---|---|
| OECD countries | r=0.616 |
| Latin America | r=0.608 |
| Africa | r=0.421 |

* Conclusion: The strength of the association between GDP and literacy rates varies between different regions. In some, GDP is a better predictor of literacy rates than in others.
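
Worked sketch: bivariate statistics in code

The bivariate techniques covered above (chi-square for crosstabs, Pearson's r, and the regression line y=a+b(x)) can be tried out in a few lines of Python. This is a minimal sketch, not the procedure used to produce the slides. The crosstab counts are copied from the social class by education example. The GDP and literacy values are hypothetical illustration data, not the data behind the reported r=.552. The function names `chi_square` and `pearson_r_and_line` are assumptions of this sketch.

```python
from math import sqrt

# Observed counts from the social class by education crosstab
# (rows: working, middle, upper class; columns in the order the table lists them).
observed = [
    [748, 87, 10, 21],
    [694, 474, 67, 266],
    [42, 66, 21, 123],
]


def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a crosstab of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (count - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def pearson_r_and_line(xs, ys):
    """Return (r, a, b): Pearson's r and the coefficients of y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / sqrt(sxx * syy)   # strength and direction of the linear association
    b = sxy / sxx               # slope of the least-squares regression line
    a = mean_y - b * mean_x     # intercept
    return r, a, b


stat, dof = chi_square(observed)
print(f"chi-square = {stat:.1f}, degrees of freedom = {dof}")

# Hypothetical illustration data (NOT from the slides):
# x = GDP per capita in arbitrary units, y = literacy rate in %.
gdp = [1, 2, 3, 4, 5]
literacy = [40, 55, 60, 80, 85]
r, a, b = pearson_r_and_line(gdp, literacy)
print(f"r = {r:.3f}, regression line: y = {a:.1f} + {b:.1f}(x)")
```

With 6 degrees of freedom the critical chi-square value at the 5% level is about 12.59, so a statistic above that value indicates a significant relationship between the two variables. A statistics package would normally also report the exact significance level, as in the "Sig. (2-tailed)" row of the correlation output.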