Handout Two: Describing/Explaining Quantitative Data and Introduction to SPSS
EPSE 592 Experimental Designs and Analysis in Educational Research
Instructor: Dr. Amery Wu

About Analysis of Variance Designs
• Measurement of the data: quantitative
• Type of statistical inference: descriptive and inferential
• Type of modeling: summative/descriptive and explanatory/predictive

Analysis of Variance Design

                      Measurement of Data
Type of Inference     Quantitative               Categorical
Descriptive           Summative/Descriptive      Summative/Descriptive
                      Explanatory/Predictive     Explanatory/Predictive
Inferential           Summative/Descriptive      Summative/Descriptive
                      Explanatory/Predictive     Explanatory/Predictive

Goals of Today’s Class
• Review of “Describing and Explaining Quantitative Data”
• Brushing up your SPSS

Computing the Standard Deviation of a Sample: D2CAR
D: Deviation (each score minus the mean)
2: Square (square each deviation)
C: Collection (add up the squared deviations)
A: Average (average the collected squares)
R: Square Root (take the square root of that average)
Lab Activity: see Excel file “Mean, SD, Z Scores, & Pearson’s Correlation”

Interpretation of the Standard Deviation of a Sample
Individuals in a sample differ in their values of the DV. These differences are what we are interested in studying. The standard deviation (SD) is a summative measure of the extent to which individuals in a sample differ. Within a sample, some individuals have values close to the mean, others far from it. The SD is, roughly, the average distance from the mean across the n individuals in a given sample (strictly, the square root of the average squared deviation).

Transforming the Raw Scores to the Z Scores
Lab Activity: see Excel file “Mean, SD, Z Scores, & Pearson’s Correlation”

Interpretations of the Z Scores
The Z score transformation re-scales the data to have a center of 0 and a unit of 1; this is called standardization. That is, the mean of the Z scores of a sample is 0, and the SD is 1. A person’s Z score indicates how many standard units (SD units) he/she is from the mean, on either side. A Z score shows a person’s standing on the scale (-∞ to ∞) relative to others in the sample.
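The D2CAR steps and the Z score transformation can be sketched in a few lines of Python. This is only an illustration, not the course's Excel lab file: the scores are made up, the function names are hypothetical, and the "Average" step divides by n - 1 (the usual sample convention, and an assumption here, since the handout does not state the divisor).

```python
import math

def sample_sd(scores):
    """Sample standard deviation following the D2CAR steps."""
    n = len(scores)
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]   # D: deviation from the mean
    squared = [d ** 2 for d in deviations]    # 2: square each deviation
    collected = sum(squared)                  # C: collect (sum) the squares
    averaged = collected / (n - 1)            # A: average (assumed divisor n - 1)
    return math.sqrt(averaged)                # R: square root

def z_scores(scores):
    """Re-scale scores to have a mean of 0 and an SD of 1."""
    mean = sum(scores) / len(scores)
    sd = sample_sd(scores)
    return [(x - mean) / sd for x in scores]

# Hypothetical raw scores, not taken from the course data.
scores = [50, 60, 70, 80, 90]
sd = sample_sd(scores)
z = z_scores(scores)

print("SD:", round(sd, 3))
print("Z scores:", [round(v, 3) for v in z])
print("Mean of Z:", round(sum(z) / len(z), 3), "SD of Z:", round(sample_sd(z), 3))
```

Whatever the raw scores are, the last line should show a mean of 0 and an SD of 1 for the Z scores, which is what standardization means.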
To read a Z score in practice: if Mary’s Z score is -1.75, she is 1.75 standard units below (to the left of) the mean. Note that the Z score transformation does not normalize a skewed raw score distribution; the shape of the distribution stays the same.

Computing the Pearson’s Correlation r
Lab Activity: see Excel file “Mean, SD, Z Scores, & Pearson’s Correlation”

Use & Interpretation of Pearson’s r
• Both X and Y should be quantitative data.
• X and Y are assumed to be linearly related.
• It is a standardized measure (-1 to 1) for quantifying the covariation between the two variables.
• A positive r indicates that people with high (low) X scores tend to have high (low) Y scores. A negative r indicates that people with high X scores tend to have low Y scores. If there is no linear trend between the scores of X and Y, then r is zero.
• The square of r, the coefficient of determination, provides an estimate of the proportion of overlapping variance between X and Y (i.e., the degree to which the two sets of numbers vary together).

Use of Data in the Course
Data Source: Special thanks to Professor Susan J. Henly from the School of Nursing, University of Minnesota, for the SPSS data file presented in today’s class.
Ethics for Data Use: Under the guidelines of the Behavioral Research Ethics Board (BREB), UBC, data circulated in this course cannot be used for purposes other than the learning activities required by this course, unless they are open to public use.

Description of Professor Susan J. Henly’s Data
This data set includes 40 participants (20 boys) who were randomly assigned to the treatment group (a new method to reduce injection pain) or the control group (just do it quickly!). Immediately after the injection, the children were asked to rate their pain on a 0-100 scale, while a nurse observer who could not hear their response also rated their pain based on their behavioural cues.
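Returning to Pearson’s r from the slides above, one way to see why it is a "standardized" measure is to compute it as the average product of paired Z scores. The sketch below assumes the n - 1 divisor throughout and uses made-up X and Y values; the helper names are hypothetical and this is not the course's Excel lab file.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    """Sample SD with the n - 1 divisor (an assumption of this sketch)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def pearson_r(x, y):
    """Pearson's r as the sum of products of paired Z scores over n - 1."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of scores")
    mx, my = mean(x), mean(y)
    sx, sy = sample_sd(x), sample_sd(y)
    zx = [(a - mx) / sx for a in x]
    zy = [(b - my) / sy for b in y]
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

# Hypothetical paired scores for five people.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2  # coefficient of determination

print("r =", round(r, 4))
print("r squared =", round(r_squared, 4))
```

Here r is positive, so people with higher X tend to have higher Y, and r squared estimates the proportion of variance the two variables share.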
The dependent variable (i.e., the data) we are modeling (describing/summarizing or explaining/predicting) today is the level of pain reported by the child, “kidrate”.
Q: Judging by the above description, what was the research question? What type of design was used? What type of data was collected? And what kind of inference could be made?

Quantitative Methodology Network
• Research Question
• Design: Experimental, Observational
• Inference: Descriptive vs. Inferential, Relational vs. Causal
• Model: Descriptive/Summative, Explanatory/Predictive
• Data: Continuous, Categorical

This Is Where We Will Be Today (A, Blue Cell)

                      Measurement of Data
Type of Inference     Continuous     Categorical
Descriptive           A              B
Inferential           C              D

• Remember, what we are doing is modeling the data in one of two ways:
  1. Describe/Summarize
  2. Explain/Predict
  Data = Model + Residual
• Note that the inferences remain at the sample level, with no intention to generalize to the population. Namely, neither C nor D is covered today.

Describing/Summarizing Central Tendency by Using Numbers

Statistics: kidrate
N Valid      40
N Missing    0
Mean         65.3250
Median       64.5000
Mode         77.00

Q1: In your opinion, which statistic best characterizes the central tendency of kidrate, and why?
Q2: Can you tell approximately whether the distribution of kidrate is normal, positively skewed, or negatively skewed, and how?
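The "Data = Model + Residual" idea above can be made concrete with a small sketch. It assumes the mean is used as the model: the overall mean for describing/summarizing, and each group's own mean for explaining/predicting. The scores and the group labels "a" and "b" are hypothetical and are not the course data.

```python
# Hypothetical scores for two groups (not the course data).
groups = {
    "a": [40, 55, 70],
    "b": [60, 75, 90],
}

# Flatten into parallel lists of group labels and scores.
labels = [g for g, values in groups.items() for _ in values]
scores = [s for values in groups.values() for s in values]

# Summative/descriptive model: every score is modeled by the overall mean.
grand_mean = sum(scores) / len(scores)
summative_residuals = [s - grand_mean for s in scores]

# Explanatory/predictive model: each score is modeled by its own group's mean.
group_means = {g: sum(v) / len(v) for g, v in groups.items()}
explanatory_residuals = [s - group_means[g] for g, s in zip(labels, scores)]

def sum_of_squares(residuals):
    return sum(r ** 2 for r in residuals)

print("Overall mean:", grand_mean, "Group means:", group_means)
print("Residual sum of squares, overall-mean model:", sum_of_squares(summative_residuals))
print("Residual sum of squares, group-mean model:", sum_of_squares(explanatory_residuals))
```

In both models each score equals its model value plus its residual, and the residuals sum to zero. If group membership helps explain the scores, the group-mean model leaves smaller residuals than the overall-mean model.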
kidrate: frequency table

Value     Frequency   Percent   Valid Percent   Cumulative Percent
35.00     1           2.5       2.5             2.5
36.00     1           2.5       2.5             5.0
38.00     1           2.5       2.5             7.5
41.00     1           2.5       2.5             10.0
46.00     2           5.0       5.0             15.0
47.00     1           2.5       2.5             17.5
50.00     2           5.0       5.0             22.5
51.00     1           2.5       2.5             25.0
52.00     1           2.5       2.5             27.5
54.00     1           2.5       2.5             30.0
57.00     1           2.5       2.5             32.5
59.00     1           2.5       2.5             35.0
60.00     2           5.0       5.0             40.0
61.00     2           5.0       5.0             45.0
63.00     1           2.5       2.5             47.5
64.00     1           2.5       2.5             50.0
65.00     2           5.0       5.0             55.0
69.00     2           5.0       5.0             60.0
70.00     1           2.5       2.5             62.5
71.00     1           2.5       2.5             65.0
72.00     1           2.5       2.5             67.5
73.00     1           2.5       2.5             70.0
74.00     1           2.5       2.5             72.5
75.00     1           2.5       2.5             75.0
77.00     3           7.5       7.5             82.5
83.00     1           2.5       2.5             85.0
84.00     1           2.5       2.5             87.5
86.00     1           2.5       2.5             90.0
95.00     1           2.5       2.5             92.5
100.00    3           7.5       7.5             100.0
Total     40          100.0     100.0

Describing/Summarizing Dispersion by Using Numbers

Descriptives: kidrate

Statistic                                        Value       Std. Error
Mean                                             65.3250     2.74221
95% Confidence Interval for Mean, Lower Bound    59.7784
95% Confidence Interval for Mean, Upper Bound    70.8716
5% Trimmed Mean                                  65.0556
Median                                           64.5000
Variance                                         300.789
Std. Deviation                                   17.34327
Minimum                                          35.00
Maximum                                          100.00
Range                                            65.00
Interquartile Range                              25.25
Skewness                                         .289        .374
Kurtosis                                         -.374       .733

Q1: How can the minimum and maximum help detect aberrant data points?
Q2: By looking at the mean and SD, can you tell whether the data are normally distributed, positively skewed, or negatively skewed?

Describing & Summarizing Distribution by Using Pictures
[Histogram of kidrate: x-axis kidrate (30.00 to 100.00), y-axis Frequency; Mean = 65.325, Std. Dev. = 17.34327, N = 40]
Q: What are the advantages and disadvantages of displaying data using a histogram?

Describing & Summarizing Distribution by Using Pictures
Q: What are the advantages and disadvantages of displaying data using a stem and leaf plot?

Describing & Summarizing Distribution by Using Pictures
[Boxplot of kidrate: axis from 30.00 to 100.00]

Explaining/Predicting the Data (kidrate) by Gender: Using Numbers

Statistics: kidrate

Statistic                 Boy         Girl
Mean                      62.5000     68.1500
Median                    61.5000     71.5000
Mode                      60.00a      77.00a
Std. Deviation            14.89790    19.45920
Variance                  221.947     378.661
Skewness                  .249        .131
Std. Error of Skewness    .512        .512
Range                     59.00       65.00
Minimum                   36.00       35.00
Maximum                   95.00       100.00
a. Multiple modes exist. The smallest value is shown.

Explaining/Predicting the Data (kidrate) by Gender: Using Pictures

Revisiting the Concept of Statistical Modeling Using Mean
Data = Model + Res.
Kidrate = Mean + Res.

Variable View of SPSS Data Editor: to specify the format of the spreadsheet

Data View of SPSS Data Editor: to enter and view the raw data

Lab Activity: Hands on SPSS (Statistics)
Please report the following statistics for the variable “Nurse-rated Pain”.
Instruction: Analyze/Descriptive Statistics/Frequencies/Variables (enter Nurse-rated Pain)/Statistics…
• Central Tendency: Mean, Median, Mode
• Dispersion: Minimum/Maximum, Range, Quartiles/Interquartile, SD/Variance
Alternatively, you can use
Instruction: Analyze/Descriptive Statistics/Descriptives/Variables (enter Nurse-rated Pain)/Options

Lab Activity: Hands on SPSS (Graphs)
Please report the histogram for the variable “Nurse-rated Pain”.
Instruction: Analyze/Descriptive Statistics/Frequencies/Variables (enter Nurse-rated Pain)/Charts/Histograms
Alternatively, you can use the Graphs menu
Instruction: Graphs/Histogram/Variable (enter Nurse-rated Pain)

Lab Activity: Hands on SPSS (Explore)
My personal preference for describing a continuous variable is to use the following command, which gives crucial and comprehensive information in both numbers and pictures.
Instruction: Analyze/Descriptive Statistics/Explore/Dependent list (enter “Nurse-rated Pain”)

Becoming a Competent User of SPSS
How do I remember all these commands & paths?
1. There is no need to memorize them! Explore the drop-down menus.
2. Your navigation of SPSS should be guided by the conceptual frameworks and the statistical methods you learned in this or previous stats courses.
3. SPSS is just a tool, not a brain! Be a clever user!

Supplemental Learning Resources for SPSS
You can find very useful YouTube tutorials on various SPSS tools and analyses. They take less time to learn from than reading texts. As necessary, read the following chapters from the website of the Social Science Research and Instructional Council (SSRIC): http://www.csubak.edu/ssric-trd/spss/spsfirst.htm
• Chapter One: Getting Started With SPSS for Windows
• Chapter Two: Creating a Data File
• Chapter Three: Transforming Data
• Chapter Four: Univariate Statistics

This Is Where We Have Been Today

                      Measurement of Data
Type of Inference     Continuous                 Categorical
Descriptive           Summative/Descriptive      Summative/Descriptive
                      Explanatory/Predictive     Explanatory/Predictive
Inferential           Summative/Descriptive      Summative/Descriptive
                      Explanatory/Predictive     Explanatory/Predictive
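As a cross-check on the kidrate output shown earlier in this handout, the 40 raw scores can be rebuilt from the frequency table and summarized with Python's standard `statistics` module. This is a sketch, not SPSS itself: the variable names are hypothetical, and the quartile calculation relies on the module's default "exclusive" method, which happens to reproduce the interquartile range reported for these data.

```python
import math
import statistics

# Value -> frequency, copied from the kidrate frequency table.
frequency = {
    35: 1, 36: 1, 38: 1, 41: 1, 46: 2, 47: 1, 50: 2, 51: 1, 52: 1, 54: 1,
    57: 1, 59: 1, 60: 2, 61: 2, 63: 1, 64: 1, 65: 2, 69: 2, 70: 1, 71: 1,
    72: 1, 73: 1, 74: 1, 75: 1, 77: 3, 83: 1, 84: 1, 86: 1, 95: 1, 100: 3,
}

# Expand the table back into one score per child.
kidrate = [value for value, count in frequency.items() for _ in range(count)]

n = len(kidrate)
mean = statistics.mean(kidrate)
median = statistics.median(kidrate)
modes = statistics.multimode(kidrate)      # every value tied for most frequent
variance = statistics.variance(kidrate)    # sample variance (n - 1 divisor)
sd = statistics.stdev(kidrate)             # sample SD (n - 1 divisor)
std_error = sd / math.sqrt(n)              # standard error of the mean
q1, q2, q3 = statistics.quantiles(kidrate, n=4)
iqr = q3 - q1
value_range = max(kidrate) - min(kidrate)

print("N:", n)
print("Mean:", mean, "Median:", median, "Modes:", modes)
print("Variance:", round(variance, 3), "SD:", round(sd, 5))
print("Std. error of mean:", round(std_error, 5))
print("Range:", value_range, "IQR:", iqr)
```

Two values, 77 and 100, each occur three times, so the data have two modes; the Statistics table lists only 77.00, which is consistent with the footnote in the by-gender table ("Multiple modes exist. The smallest value is shown").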