APPENDIX B
Data Preparation and Univariate Statistics
• How are computers used in data collection and analysis?
• How are collected data prepared for statistical analysis?
• How are missing data treated in statistical analyses?
• When is it appropriate to delete data before they are analyzed?
• What are descriptive statistics and inferential statistics?
• What determines how well the data in a sample can be used to predict population parameters?

Preparing Data for Analysis
• Collecting the data
  1. Ask participants to fill out a questionnaire.
  2. Ask participants to enter their responses via keyboard into a computer.
• Analyzing the data
  1. SPSS contains a spreadsheet data editor, an output editor, and a syntax editor.
  2. SPSS contains subprograms that compute statistical analyses such as frequency distributions, descriptive statistics, ANOVA, correlation, and regression.
• Entering the data into the computer
  1. Use coding systems: label variables.
  2. Keep notes: you will forget which variable name refers to which data.
  3. Save and back up the data.
  4. Check and clean the data.

Missing Data
• When the respondent has decided not to answer a question because it is inappropriate or because the respondent has personal reasons for not doing so:
  1. Think carefully about whether all questions are appropriate.
  2. Save respondents from embarrassing situations.
• When the respondent forgot to answer a question or completely missed an entire page of the questionnaire:
  1. Test the research procedure before you carry it out.
  2. Check the respondents' answers before they leave.
• When the research requires the respondents to participate more than once: the attrition problem.

Deleting and Retaining Data
• When do we delete variables?
  Cases in which the reliability analysis indicates that the variable did not measure the same thing that the other variables measured.
• When do we delete responses?
  Cases in which a respondent gave a very extreme score (an outlier).
• When do we delete participants?
  Cases in which the respondent did not understand the instructions or was not able to perform the task.
• How do we trim the data?
  By removing scores that are more than 3 standard deviations above or below the variable's mean.
• When do we transform the data?
  Cases in which items must be reverse-scored, or the data are skewed.

Conducting Statistical Analysis
• Descriptive Statistics: Statistical approach in which the researcher summarizes the pattern of scores observed on a measured variable.
• Inferential Statistics: Statistical approach in which the researcher draws inferences about the total population based on the pattern of scores observed in a sample of respondents.
[Figure: diagram with the labels "Your Data", "Analysis", and "Population"]

Summation Notation
Sample data ($X_1, X_2, X_3, X_4, X_5$): $X_1 = 6$, $X_2 = 5$, $X_3 = 2$, $X_4 = 7$, $X_5 = 3$
$\sum_{i=1}^{N} X_i = X_1 + X_2 + X_3 + X_4 + X_5 = 6 + 5 + 2 + 7 + 3 = 23$
The summation starts from $i = 1$ and runs to $N$ (in this case, $N = 5$).

Rounding
The APA Publication Manual generally suggests rounding presented figures (both descriptive and inferential statistics) to two decimal places. In the examples here, the p value is shown to three decimal places.
$\pi = 3.14159265\ldots \rightarrow 3.14$
$\sqrt{3} = 1.732\ldots \rightarrow 1.73$
$p = 0.0041\ldots \rightarrow .004$

Computing Descriptive Statistics
• Frequency Distribution: A table that indicates how many, and in most cases what percentage, of the individuals in the sample fall into each of a set of categories (e.g. bar chart, grouped frequency distribution, histogram, frequency curve, stem and leaf plot).
• Central Tendency: The point in the distribution around which the data are centered (e.g. mean, median, mode).
• Dispersion: The extent to which the scores are all tightly clustered around the central tendency (e.g. range, variance, standard deviation).

Frequency Distribution
Sample data: $X_1 = 6$, $X_2 = 5$, $X_3 = 2$, $X_4 = 7$, $X_5 = 3$, $X_6 = 4$, $X_7 = 6$, $X_8 = 2$, $X_9 = 1$, $X_{10} = 8$
[Figures: bar chart, histogram, and frequency curve of the sample data]

Central Tendency
Sample data: $X_1 = 6$, $X_2 = 5$, $X_3 = 2$, $X_4 = 7$, $X_5 = 3$, $X_6 = 4$, $X_7 = 6$, $X_8 = 2$, $X_9 = 1$, $X_{10} = 8$
• The Mean (average): The sum of all of the scores divided by the sample size.
  $\bar{X} = \frac{\sum X}{N} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N} = \frac{6 + 5 + 2 + 7 + 3 + 4 + 6 + 2 + 1 + 8}{10} = \frac{44}{10} = 4.4$
• The Median: The score at which half of the observations are greater and half are smaller.
  Sorted scores: 1, 2, 2, 3, 4, 5, 6, 6, 7, 8. With an even number of scores, the median is the average of the two middle scores: $\frac{4 + 5}{2} = 4.5$
• The Mode: The most frequently occurring value in a variable.
  Sorted scores: 1, 2, 2, 3, 4, 5, 6, 6, 7, 8. Both 2 and 6 occur twice, so this sample has two modes.

Dispersion
• The Range: The distance between the largest (the maximum) and the smallest (the minimum) observed values of the variable.
• The Variance ($S^2$): The sum of squares, $\sum (X_i - \bar{X})^2$, divided by $N$.
• The Standard Deviation ($S$): The square root of the variance.

The Variance and the Standard Deviation
Mean: $\bar{X} = 4.4$

Score        Deviation score              Squared deviation
X            $X - \bar{X}$                $(X - \bar{X})^2$
X1 = 6       6 - 4.4 =  1.6               2.56
X2 = 5       5 - 4.4 =  0.6               0.36
X3 = 2       2 - 4.4 = -2.4               5.76
X4 = 7       7 - 4.4 =  2.6               6.76
X5 = 3       3 - 4.4 = -1.4               1.96
X6 = 4       4 - 4.4 = -0.4               0.16
X7 = 6       6 - 4.4 =  1.6               2.56
X8 = 2       2 - 4.4 = -2.4               5.76
X9 = 1       1 - 4.4 = -3.4              11.56
X10 = 8      8 - 4.4 =  3.6              12.96
Sum          $\sum (X - \bar{X}) = 0$     $\sum (X - \bar{X})^2 = 50.4$

Sum of Squares
$SS = \sum (X - \bar{X})^2 = \sum X^2 - \frac{(\sum X)^2}{N} = 244 - \frac{1936}{10} = 244 - 193.6 = 50.4$

Variance and Standard Deviation
Variance: $S^2 = \frac{SS}{N} = \frac{50.4}{10} = 5.04$
Standard Deviation (SD): $S = \sqrt{S^2} = \sqrt{5.04} = 2.24$

Standard Score (Z score)
The distance of a score from the mean of the variable, expressed in standard deviation units. Z scores are used to compare two scores that come from distributions with different means and different standard deviations (SD).
Taro received a score of 80 on a test. The average was 50, and the standard deviation was 15.
Susan received a score of 75 on a test. The average was 60, and the standard deviation was 10.
$Z = \frac{X - \bar{X}}{s}$
$Z_{Taro} = \frac{80 - 50}{15} = 2.0$
$Z_{Susan} = \frac{75 - 60}{10} = 1.5$
Taro's score is further above his test's mean than Susan's is above hers.
[Figure: number lines showing each score relative to its mean, and the two Z scores (1.5 and 2.0) on a common scale]

Standard Normal Distribution
Hypothetical population distribution of standard scores when the original scores are normally distributed. $\mu = 0$, $\sigma = 1$

Region                                   Percentage of scores in each region
$-1 < Z < 0$, or $0 < Z < 1$             34.13%
$-2 < Z < -1$, or $1 < Z < 2$            13.59%
$-3 < Z < -2$, or $2 < Z < 3$            2.15%
$Z < -3$, or $Z > 3$                     0.13%

Working with Inferential Statistics
Example: A researcher wants to estimate the average GPA of all of the psychology majors at UM.
• Population: $\mu$ = mean of the population; $\sigma$ = standard deviation of the population
• Sample (descriptive statistics of 100 students): $\bar{X} = 3.40$ (mean of the sample); $S = 2.23$ (standard deviation of the sample)

Unbiased Estimator
The sample mean ($\bar{X}$) is an unbiased estimator of the population mean $\mu$.
The sample standard deviation ($s$), however, is not an unbiased estimator of the population standard deviation $\sigma$.
How can we estimate $\sigma$ using the sample data? Divide the sum of squares by $N - 1$ instead of $N$:
$\hat{S} = \sqrt{\frac{SS}{N - 1}}$

The Standard Error
If we take all possible samples of $N = 100$ from a given population, the resulting distribution of the sample means has $\mu_{\bar{X}} = \mu$.
The distribution would be normally distributed, with a standard deviation known as the standard error of the mean (or simply the standard error).
The standard error is symbolized as $S_{\bar{X}}$:
$S_{\bar{X}} = \frac{s}{\sqrt{N - 1}}$

Confidence Intervals
The range of scores within which the population mean is likely to fall. The exact width of the confidence interval is determined with a statistic known as Student's t.
Example: We sampled 100 students.
Degrees of freedom = 100 - 1 = 99
If we set alpha = .05, the appropriate t value = 1.99 (see Table C, Appendix E).
Standard error: $S_{\bar{X}} = \frac{2.23}{\sqrt{99}} = .22$
Lower limit = $\bar{X} - t(S_{\bar{X}})$ = 3.40 - 1.99(.22) = 2.96
Upper limit = $\bar{X} + t(S_{\bar{X}})$ = 3.40 + 1.99(.22) = 3.84
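
The worked examples in this appendix can be checked with a few lines of code. The following is a minimal Python sketch, not part of the original slides (which use SPSS). It assumes the ten sample scores, the Taro and Susan test scores, and the GPA example exactly as given above. The names `z_score` and `confidence_interval` are hypothetical helpers introduced only for illustration, and the t value of 1.99 is copied from the slides rather than computed.

```python
import math
import statistics

# Sample data from the slides (X1..X10).
scores = [6, 5, 2, 7, 3, 4, 6, 2, 1, 8]
n = len(scores)

# Central tendency.
mean = statistics.mean(scores)
median = statistics.median(scores)
modes = sorted(statistics.multimode(scores))  # every value tied for most frequent

# Dispersion. The slides divide the sum of squares by N (not N - 1).
value_range = max(scores) - min(scores)
ss_definitional = sum((x - mean) ** 2 for x in scores)
ss_computational = sum(x ** 2 for x in scores) - sum(scores) ** 2 / n
variance = ss_definitional / n
sd = math.sqrt(variance)


def z_score(x, center, spread):
    """Distance of x from the mean (center), in standard deviation units."""
    return (x - center) / spread


z_taro = z_score(80, 50, 15)
z_susan = z_score(75, 60, 10)


def confidence_interval(sample_mean, s, sample_size, t_value):
    """Return (lower, upper) using the slides' standard error s / sqrt(N - 1).

    Assumption: the standard error is rounded to two decimals first, as the
    slides do (.22), so that the limits match the worked example.
    """
    standard_error = round(s / math.sqrt(sample_size - 1), 2)
    margin = t_value * standard_error
    return sample_mean - margin, sample_mean + margin


# GPA example: mean 3.40, s = 2.23, N = 100, t = 1.99 (taken from the slides).
lower, upper = confidence_interval(3.40, 2.23, 100, 1.99)

print(f"mean={mean}, median={median}, modes={modes}, range={value_range}")
print(f"SS={ss_definitional:.2f}, variance={variance:.2f}, SD={sd:.2f}")
print(f"z_Taro={z_taro:.1f}, z_Susan={z_susan:.1f}")
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

Both forms of the sum of squares (the definitional form and the computational form) are computed so that they can be compared. If the standard error is not rounded before use, the confidence limits shift slightly in the second decimal place, which is why the sketch rounds it in the same way as the slides.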