Using statistics in the analysis of quantitative data

A good way to use this material for detailed study is to print the whole file and then run the slide show while reading the text from the printed version. This will allow you to use the links and animations that are included in some of the slides.

Suggested print settings for use in the print dialogue box:
Print range: All
Print what: Notes Pages (from the drop-down box)
Then tick: Black & White, Scale to fit paper

Types of data

Data type                                  Example
Nominal or Categorical                     Eye colour
Ordinal                                    Job seniority
Interval (parametric / non-parametric)     Language comprehension test score; IQ
Ratio (parametric / non-parametric)        Age

Uses of statistics

Use of statistics                                          Inferential or non-inferential
Describing a sample                                        Non-inferential
Looking for relationships between variables in a sample    Non-inferential
Estimating parameters in a population                      Inferential
Testing hypotheses                                         Usually used inferentially but can be used non-inferentially

SPSS task

Entering data

Describing a sample

SPSS calculation of mean

Descriptive Statistics
                      N     Mean
attitude to school    40    9.95
Valid N (listwise)    40

Finding the spread of scores in a sample

Standard Deviation

S = \sqrt{\sum (x - \bar{x})^2 / n}

Standard Deviation

Descriptive Statistics
                      N     Range   Minimum   Maximum   Mean   Std. Deviation
attitude to school    40    18      1         19        9.95   4.15
Valid N (listwise)    40

Finding how scores are distributed

[Figure: Distribution of attitude scores]

Properties of the Normal Distribution

                                Mean   Standard Deviation   Range                               Cases contained
Mean ± 1 standard deviation     10     2                    10-2 to 10+2, i.e. from 8 to 12     68%
Mean ± 2 standard deviations    10     2                    10-4 to 10+4, i.e. from 6 to 14     95%

Checking normality

Descriptive Statistics
                      N           Mean        Std. Deviation   Skewness                 Kurtosis
                      Statistic   Statistic   Statistic        Statistic   Std. Error   Statistic   Std. Error
attitude to school    40          9.95        4.15             -.074       .374         .197        .733
Valid N (listwise)    40

An overall test for normality

Tests of Normality
                      Kolmogorov-Smirnov(a)
                      Statistic   df    Sig.
attitude to school    .109        40    .200*
*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction

Median and Mode for ordinal data

Statistics: confidence in speaking time 1
N Valid      40
N Missing    0
Median       4.0000
Mode         4.00
Range        6.00

confidence in speaking time 1
Value    Frequency   Percent   Valid Percent   Cumulative Percent
1.00     1           2.5       2.5             2.5
2.00     2           5.0       5.0             7.5
3.00     8           20.0      20.0            27.5
4.00     10          25.0      25.0            52.5
5.00     8           20.0      20.0            72.5
6.00     6           15.0      15.0            87.5
7.00     5           12.5      12.5            100.0
Total    40          100.0     100.0

Describing ordinal data

Bar charts (no gaps)

Nominal data - Mode

Statistics: school
N Valid      40
N Missing    0
Mode         1

school
Value       Frequency   Percent   Valid Percent
school 1    14          35.0      35.0
school 2    13          32.5      32.5
school 3    13          32.5      32.5
Total       40          100.0     100.0

Describing nominal data – Bar Chart

Describing nominal data – Pie Chart

[Figure: pie chart with segments for school 1, school 2 and school 3]

Exploring relationships between data

Correlation

[Figure: scatter plot "IQ and attitude to school"; axes: IQ score, attitude to school]

Correlation

Correlations
                                             IQ score   attitude to school
IQ score              Pearson Correlation    1.000      .564**
                      Sig. (2-tailed)        .          .000
                      N                      40         40
attitude to school    Pearson Correlation    .564**     1.000
                      Sig. (2-tailed)        .000       .
                      N                      40         40
**. Correlation is significant at the 0.01 level (2-tailed).
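The "Sig. (2-tailed)" entry in the correlation table can be checked by hand. The following is a minimal sketch, assuming the usual t test for a Pearson correlation, t = r * sqrt((n - 2) / (1 - r^2)) on n - 2 degrees of freedom. The values r = .564 and n = 40 come from the table; the function name and the critical value 2.712 (taken from standard t tables for 38 degrees of freedom at the two-tailed 0.01 level) are assumptions introduced here, not part of the SPSS output.

```python
import math


def correlation_t(r: float, n: int) -> float:
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Values reported in the SPSS correlation table.
r, n = 0.564, 40
t = correlation_t(r, n)

# Assumption: two-tailed 0.01 critical value for 38 df, read from t tables.
T_CRIT_01 = 2.712

print(f"t = {t:.2f} on {n - 2} degrees of freedom")
print("significant at the 0.01 level (2-tailed):", abs(t) > T_CRIT_01)
```

A t value well above the critical value agrees with the ** flag that SPSS attaches to the coefficient.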
[Figure: scatter plot "IQ and attitude to school"; axes: IQ score, attitude to school]

Review of meaning and importance of linearity
• http://www.aiaccess.net/English/Glossaries/GlosMod/Flash/e_gm_fla_covariance.htm
• http://www.fon.hum.uva.nl/Service/Statistics.html

Extreme groups – a warning

[Figure: scatter plot "IQ and attitude to school"; axes: IQ score, attitude to school]

Correlation - effect of measurement error

[Figure: three slides plotting Test result against motivation, showing Actual points and then Measured points]

Correlation & Regression

Correlations
                                             scaled iq   attitude to school
scaled iq             Pearson Correlation    1.000       .564**
                      Sig. (2-tailed)        .           .000
                      N                      40          40
attitude to school    Pearson Correlation    .564**      1.000
                      Sig. (2-tailed)        .000        .
                      N                      40          40
**. Correlation is significant at the 0.01 level (2-tailed).

[Figure: scatter plot; axes: scaled iq, attitude to school]

Correlations
                                             IQ score   attitude to school
IQ score              Pearson Correlation    1.000      .564**
                      Sig. (2-tailed)        .          .000
                      N                      40         40
attitude to school    Pearson Correlation    .564**     1.000
                      Sig. (2-tailed)        .000       .
                      N                      40         40
**. Correlation is significant at the 0.01 level (2-tailed).

[Figure: scatter plot "IQ and attitude to school"; axes: IQ score, attitude to school]

Spearman Correlation

Ordinal data

Correlations (Spearman's rho)
                                                                    listening comprehension   listening comprehension
                                                                    score time 1              score time 2
listening comprehension score time 1    Correlation Coefficient    1.000                     .878**
                                        Sig. (2-tailed)            .                         .000
                                        N                          40                        40
listening comprehension score time 2    Correlation Coefficient    .878**                    1.000
                                        Sig. (2-tailed)            .000                      .
                                        N                          40                        40
**. Correlation is significant at the .01 level (2-tailed).

Chi squared test of association

Nominal data

gender * school Crosstabulation
                             school 1   school 2   school 3   Total
male      Count              7          7          6          20
          Expected Count     7.0        6.5        6.5        20.0
female    Count              7          6          7          20
          Expected Count     7.0        6.5        6.5        20.0
Total     Count              14         13         13         40
          Expected Count     14.0       13.0       13.0       40.0

Chi-Square Tests
                      Value      df   Asymp. Sig. (2-sided)
Pearson Chi-Square    .154(a)    2    .926
N of Valid Cases      40
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 6.50.

Chi squared showing an association

HAIRCOL * EYECOL Crosstabulation
                                     EYECOL
                                     blue    brown   other   Total
HAIRCOL   black    Count             18      3       6       27
                   Expected Count    9.6     7.7     9.6     27.0
          brown    Count             6       15      3       24
                   Expected Count    8.6     6.9     8.6     24.0
          blond    Count             6       6       21      33
                   Expected Count    11.8    9.4     11.8    33.0
Total              Count             30      24      30      84
                   Expected Count    30.0    24.0    30.0    84.0

Chi-Square Tests
                      Value        df   Asymp. Sig. (2-sided)
Pearson Chi-Square    36.853(a)    4    .000
N of Valid Cases      84
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 6.86.

Calculating chi-squared from cell values
• http://www.physics.csbsju.edu/stats/contingency.html

Item analysis, reliability and validity

Cronbach’s Alpha

RELIABILITY ANALYSIS - SCALE (ALPHA)

N of Cases = 28.0

Statistics for Scale
Mean      Variance   Std Dev   N of Variables
3.2500    10.4167    3.2275    10

Item-total Statistics
           Scale Mean        Scale Variance    Corrected Item-      Squared Multiple   Alpha if
           if Item Deleted   if Item Deleted   Total Correlation    Correlation        Item Deleted
ITE0001    3.1071            8.7659            .7222                .                  .8610
ITE0003    2.9643            7.8876            .8968                .                  .8437
ITE0005    2.8929            8.0992            .7487                .                  .8547
ITE0007    2.9643            8.3320            .7052                .                  .8587
ITE0009    2.9643            8.4061            .6744                .                  .8611
ITE0002    2.9643            9.2950            .3244                .                  .8863
ITE0004    2.8214            8.6706            .5027                .                  .8746
ITE0006    2.8571            9.2381            .3080                .                  .8892
ITE0008    2.7500            8.9352            .4015                .                  .8827
ITE0010    2.9643            7.8876            .8968                .                  .8437

Reliability Coefficients: 10 items
Alpha = .8782
Standardized item alpha = .8839

Estimating population values

Terminology
• Population (described by parameters)
• Sample (described by statistics)

Sampling

Samples that allow statistical generalisation
• random
• systematic
• stratified random
• cluster
• multi-stage

Samples that don’t allow statistical generalisation
• quota
• convenience
• snowball

Making it practicable whilst retaining validity

Calculating required sample sizes
• http://StatPages.org
• http://www.jalt.org/test/bro_25.htm and related web pages

Statistics and parameters

                      Statistics of sample   Parameters of population
Mean                  m                      μ
Standard deviation    s                      σ
Correlation           r                      ρ

Statistics and parameters

Statistics of sample   Best estimate of the population parameter
m                      \mu = m
s                      \sigma = s \sqrt{n / (n - 1)}
r                      \rho = r (for large samples >30)

95% confidence limits for the population mean - large samples

m - 1.96 s / \sqrt{n - 1} to m + 1.96 s / \sqrt{n - 1}

Calculation of confidence intervals

Mean
• http://glass.ed.asu.edu/stats/analysis/mci.html
Correlation
• http://glass.ed.asu.edu/stats/analysis/rci.html
Standard deviation
• Walpole R (1982) Introduction to Statistics, 3rd Edition, pp277-8; 482

Confidence interval for σ²

Walpole R. (1982) Introduction to Statistics, 3rd Edn. New York: Macmillan, pp277-8

The Surprising Effect of Population Size

As long as the population is at least ten times as large as the sample, the size of the population has almost no influence on the accuracy of sample estimates. The margin of error for a sample size of 1000 is about 3% whether the number of people in the population is 30,000 or 200 million.
You can make a good check on how salty a well-stirred bowl of soup is by tasting one spoonful, whatever the size of the bowl.

What’s the surprise? There is no effect!
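The "about 3%" figure for a sample of 1000 can be reproduced with a short calculation. This is a minimal sketch under stated assumptions that are not in the slides: a simple random sample, a proportion near 50% (the worst case for a percentage), a 95% confidence level (z = 1.96), and the standard finite population correction sqrt((N - n) / (N - 1)). The function name is hypothetical.

```python
import math


def margin_of_error(n: int, population: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a sample proportion.

    Assumes simple random sampling without replacement from a finite
    population and applies the finite population correction.
    """
    standard_error = math.sqrt(p * (1.0 - p) / n)
    finite_correction = math.sqrt((population - n) / (population - 1))
    return z * standard_error * finite_correction


# Sample of 1000 drawn from populations of very different sizes.
for population in (10_000, 30_000, 200_000_000):
    moe = margin_of_error(1000, population)
    print(f"population {population:>11,}: margin of error = {moe:.2%}")
```

Under these assumptions the margin of error stays close to 3% across all three population sizes, because the correction factor is nearly 1 once the population is at least ten times the sample.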