Sociological methodology
Quantification
Petr Soukup

Outline of the lecture
1. Data, variables, values in quantitative research
2. Data preparation, data entry, coding of variables
3. Presentation of data, basics of statistics (Quantification)
4. Advanced statistical techniques - review
5. Statistical software
6. Data archives and their usage

1. Data, variables
• Variables: 3 types
• Variables and questions
• Variable = 1 question?
• Numerical data: necessary for computer processing

Data, variables
• Values of variables: natural for interval variables; artificial (assigned by the researcher) for ordinal and nominal variables
• Data matrix: rows (respondents), columns (variables)
• Technical names of variables (questionnaire)

2. Open-ended questions and the problem of coding
1) Predefined coding schemes
Examples: ISCO, EGP, ISEI
2) Creating our own coding scheme - example
Collapsing of categories

3. Basics of statistics (Quantification)
1) Frequencies
2) Central tendency
3) Crosstabulations

Basics of statistics - Frequencies
Frequencies: absolute and relative (percentages)
Interpretation
Example in MS Excel (EVS data)

Basics of statistics - Central tendencies
MEAN: arithmetic mean of the values (sum of the values divided by the number of values)
MEDIAN: the value in the middle of the ordered data
MODE: the most frequent value in our data

Basics of statistics - Dispersion
Range: difference between maximum and minimum
Standard deviation: dispersion for interval variables (interpretation of standard deviation)
Examples in Excel

Bivariate statistics
Nominal and ordinal variables:
Crosstabulations: total counts, row and column percentages
Chi-square test of independence
Interval variables:
Correlation
Correlation coefficient and its values
Examples in Excel (EVS data)

Functions in Excel:
FREQUENCY: prepares a frequency table
AVERAGE: computes the arithmetic mean of the values (sum of the values divided by the number of values)
MEDIAN: computes the median, the value in the middle of the ordered data
MODE: computes the mode, the most frequent value in our data
MAX: maximum of a variable
MIN: minimum of a variable
STDEV: standard deviation

4. Advanced statistics - review
More than two variables
Usually called modeling
Examples:
Loglinear modeling
Regression modeling
Structural Equation Modeling
Etc.
Usually called Multivariate or Multidimensional Statistics (try Google)

5. Statistical software - review
A) General statistical software
Can prepare data (enter data, add labels, clean data, transform data) and also compute individual statistical procedures. These packages include many statistical procedures, so they can be used in nearly every situation.
List of the most common:

SPSS (Statistical Package for Social Sciences)
Origin: USA
13 modules
More info: http://www.spss.com/ or http://www.acrea.cz/
Current version: 20
Price: approx. 10 thousand USD for the full system
Trial version: 15 days

SAS (Statistical Analysis System)
Origin: USA
More info: http://www.sas.com/offices/europe/czech/index/index.html
Again sold as individual modules
Current version: 10
Price: approx. 50 thousand USD
Trial version: no

5. Statistical software - review
List of the most common (continued):

STATISTICA
Origin: USA
More info: http://www.statsoft.cz/page/index.php
Again sold as individual modules
Current version: 8
Trial version: 30 days
Price: approx. 5 thousand USD
StatSoft Textbook: see http://www.statsoft.cz/page/index2.php?pg=navigace&nav=31

STATA
Origin: USA
More info: http://www.stata.com/capabilities/statisticalcap.html
Again sold as individual modules
Current version: 10
Trial version: no
Price: approx. 3 thousand USD

5. Statistical software - review
List of the most common (continued):

R (R project)
Freeware, open-source software
Download: http://www.r-project.org/
Current version: 2.7
Disadvantages: no menu, weaker graphical abilities
Advantages: the quickest to include new statistical procedures, very low hardware requirements, many user forums

5. Statistical software - review
B) Special statistical software
Can be used for specialized (advanced) statistical procedures. Usually it is necessary to prepare the data in some general statistical software (see above).
Examples:
AMOS: software for Structural Equation Modeling
HLM: software for Multilevel Modeling
lEM: software for Latent Class Analysis
etc.
There are hundreds of such packages; some are freeware, some are commercial (and usually not cheap).

6. Data archives in social sciences
Store data from quantitative surveys (sometimes also from qualitative ones)
- Usually national archives
- There are many archive associations
What can be reached via data archives?
- Original data in many formats (for SPSS, SAS, Excel, etc.)
- Original questionnaire
- Codebook: information about individual variables, their values and their labels (sometimes also frequency tables for all variables included in the data file)

6. Data archives in social sciences - Additional services
• Possibility to find data on an individual topic
• Possibility to find the list of books and articles based on a selected data file
• Possibility to compute basic statistics without downloading the data (frequencies, crosstabulations, correlations), etc.
Note: Some operations are free; for others it is necessary to register or pay.

6. Data archives in social sciences
Example: Czech data archive (http://archiv.soc.cas.cz)
Uses the NESSTAR system
Other archives and their associations: see LINKS

Thanks for your attention
[email protected]
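
Supplementary sketch for section 3 (Basics of statistics)
The measures listed under frequencies, central tendency, dispersion and bivariate statistics, and the Excel functions FREQUENCY, AVERAGE, MEDIAN, MODE, MAX, MIN and STDEV, have close counterparts in Python's standard library. The following is only a minimal illustration under stated assumptions: the helper names (frequency_table, describe, crosstab, pearson_r) are hypothetical, and all numbers are made up rather than taken from the EVS data.

```python
from collections import Counter
import statistics

# Hypothetical answers on a 1-5 scale (made-up values, not EVS data).
values = [1, 2, 2, 3, 3, 3, 4, 5]
# Hypothetical second variable for the same eight respondents.
gender = ["m", "f", "f", "m", "f", "m", "f", "m"]


def frequency_table(data):
    """Return {value: (absolute count, relative frequency in percent)}."""
    counts = Counter(data)
    n = len(data)
    return {v: (c, 100.0 * c / n) for v, c in sorted(counts.items())}


def describe(data):
    """Central tendency and dispersion for one variable."""
    return {
        "mean": statistics.mean(data),      # sum of values / number of values
        "median": statistics.median(data),  # middle of the ordered data
        "mode": statistics.mode(data),      # most frequent value
        "range": max(data) - min(data),     # maximum minus minimum
        "stdev": statistics.stdev(data),    # sample standard deviation (n - 1)
    }


def crosstab(rows, cols):
    """Absolute counts for each combination of two categorical variables."""
    return Counter(zip(rows, cols))


def pearson_r(x, y):
    """Pearson correlation coefficient for two interval variables."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


if __name__ == "__main__":
    for value, (count, percent) in frequency_table(values).items():
        print(f"value {value}: n = {count}, {percent:.1f} %")
    print(describe(values))
    print(dict(crosstab(gender, values)))
    print(pearson_r([1, 2, 3, 4], [2, 4, 5, 9]))
```

The sample standard deviation (dividing by n - 1) is used here because that is what Excel's STDEV computes. The chi-square test of independence is left out of the sketch; it would be computed from the counts that crosstab returns.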