Methods of Research and Enquiry
Basic Statistics and Correlational Research
by Dr. Daniel Churchill
R&D in IT in Education

What is statistics?

Statistics is a body of mathematical techniques or processes for gathering, organizing, analyzing, and interpreting numerical data.

Basic Concepts

Measurement – assigning a number to an observation based on certain rules
Variable – a measured characteristic (e.g., age, grade level, test score, height, gender)
Constant – a measure that has only one value
Continuous variable – can take any value within a range (e.g., height)
Discrete variable – has a finite number of distinct values between any two given points (e.g., age in whole years between 30 and 50)

Independent variables – purported causes
Dependent variables – purported effects
Example: Two instructional strategies, co-operative groups and traditional lectures, were used during a three-week social studies unit. Students' exam scores were analyzed for differences between the groups.
  The independent variable is the instructional approach (of which there are two levels).
  The dependent variable is the students' achievement.
Obj. 2.3

Population – the entire group of elements that have at least one characteristic in common
Sample – a small group of observations selected from the total population
Parameter – a measure of a characteristic of an entire population
Statistic – a measure of a characteristic of a sample
Statistics – a method

Descriptive statistics – classify, organize, and summarize numerical data about a particular group of observations (e.g., the number of students in HK, the mean maths grade, the ethnic make-up of students)
Inferential statistics – involve selecting a sample from a defined population and studying it in order to draw conclusions about that population
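As a rough illustration of the parameter/statistic distinction above, the Python sketch below treats a small made-up list of scores as a whole population, draws a random sample from it, and computes the mean of each. The scores, the sample size of 5, and the random seed are assumptions chosen only for illustration; none of them come from the lecture.

```python
import random
from statistics import mean

# Hypothetical population: test scores for every student in a small group.
population = [52, 61, 47, 75, 68, 59, 63, 70, 55, 66, 58, 72]

# Parameter: a measure computed on the entire population.
population_mean = mean(population)

# Sample: a smaller group of observations selected from the population.
rng = random.Random(1)  # fixed seed (an arbitrary choice) so the run is repeatable
sample = rng.sample(population, 5)

# Statistic: the same measure computed on the sample only.
sample_mean = mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

Reporting the sample mean simply as a summary of those five students is a descriptive use; treating it as an estimate of the population mean is an inferential use.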
Descriptive and inferential statistics are not mutually exclusive.

Probability and Level of Significance

Studies yield statistical results that are used to decide whether to retain or reject the null hypothesis.
The decision is made in terms of probability, not certainty.
Once we obtain a sample statistic, we compare the obtained value to the appropriate critical value (from tables).
Most often, a result is considered statistically significant at the 5% probability level (p = .05).

Data Collection: Measurement Scales

Nominal – categories (gender, ethnicity, etc.)
Ordinal – ordered categories (rank in class, order of finish, etc.)
Interval – equal intervals (test scores, attitude scores, etc.)
Ratio – absolute zero (time, height, weight, etc.)
Obj. 2.1

Measurement Scales

Watch videos from Learner.org: http://learner.org/resources/series158.html
Watch Video 5, Variation About the Mean.

Statistical measures

Measures of central tendency or averages:
Mean
Median – a point in an array, above and below which one-half of the scores fall
Mode – the score that occurs most frequently in a distribution

Organizing Data

Source: http://www.learnactivity.com/lo/

Example

Here is a set of maths test scores (raw scores) for a class of 31 students:

37, 42, 52, 58, 58, 61, 65, 73, 74, 42, 39, 56, 66, 54, 57, 59, 69, 72, 67, 61, 51, 56, 63, 78, 45, 63, 72, 48, 63, 48, 54

Organizing measurements – stem and leaf

Stem   Leaf
3      7 9
4      2 2 5 8 8
5      1 2 4 4 6 6 7 8 8 9
6      1 1 3 3 3 5 6 7 9
7      2 2 3 4 8

Organizing measurements – frequency tables

Test Score   Frequency   Midpoint   Cumulative Frequency   Percent   Cumulative Percent
36-40        2           38         31                     6         100
41-45        3           43         29                     10        94
46-50        2           48         26                     6         84
51-55        4           53         24                     13        77
56-60        6           58         20                     19        65
61-65        6           63         14                     19        45
66-70        3           68         8                      10        26
71-75        4           73         5                      13        16
76-80        1           78         1                      3         3
N = 31

The cumulative columns count the scores in an interval and in all higher intervals. Percentages are rounded to whole numbers.

Organizing measurements – Histogram

[Figure: histogram of the test scores; x-axis: Test Score (intervals 36-40 to 76-80); y-axis: Frequency]

Organizing measurements – Mean

The mean

$\bar{X} = \frac{\sum X}{N}$

$\bar{X}$ = mean
$\sum$ = sum of
$X$ = scores in a distribution
$N$ = number of scores

It is the base from which many measures are computed.

$\bar{X} = \frac{37 + 58 + 74 + \dots + 72 + 63}{31} \approx 58$

Organizing measurements – Mode and Median

Mode – the score that occurs most frequently in a distribution: 63
Median – a point in an array, above and below which one-half of the scores fall: 58 (the 16th of the 31 ordered scores)

Organizing measurements – Histogram

[Figure: the same histogram with the mean, median and mode marked]

Statistical measures

Measures of spread or dispersion:
Range – the difference between the highest and the lowest scores, plus one
Standard deviation – average distance from the mean (also see calculator)
Variance – squared standard deviation
Z-score – the number of standard deviations a score lies from the mean: $z = \frac{\text{score} - \text{mean}}{SD}$

Basic Formulas for Sample

Variance: $S^2 = \frac{\sum (X - \bar{X})^2}{n}$
Standard deviation: $S = \sqrt{S^2}$

Normal Distribution

Source: http://en.wikipedia.org/wiki/Normal_distribution

Source: http://noppa5.pc.helsinki.fi/koe/flash/histo/histograme.html

Z-Score Example

$z = \frac{X - \bar{X}}{S}$

Example: compare a student's performance on Maths and English tests when the student's scores, the class means, and the class standard deviations are known.

Subject   Student's Score   Class Mean   Class S
English   50                45           5
Maths     68                56           6

$z_{\text{English}} = \frac{50 - 45}{5} = +1$
$z_{\text{Maths}} = \frac{68 - 56}{6} = +2$

Relative to the class, the student did better in Maths (two standard deviations above the mean) than in English (one standard deviation above the mean).

[Figure: $z_{\text{English}}$ and $z_{\text{Maths}}$]

Z score vs. T score, and Percentile Rank

Correlational Studies

Attempt to describe the predictive relationships between or among variables
The predictor variable is the variable from which the researcher is predicting
The criterion variable is the variable to which the researcher is predicting
Objectives 10.1 & 10.2

Relationship Studies

General purpose
  Gain insight into variables that are related to other variables relevant to educators
    Achievement
    Self-esteem
    Self-concept
Two specific purposes
  Suggest subsequent interest in establishing cause and effect between variables found to be related
  Control for variables related to the dependent variable in experimental studies
Objectives 5.1 & 5.2

Correlational Data

Income/month ($)   Expenditure/month ($)
4000               4000
4000               5000
5000               6000
2000               2000
9000               6000
4000               2000
7000               5000
8000               6000
9000               9000
5000               3000

Scatter Diagram

[Figure: scatter diagram of the income and expenditure data; x-axis: Income (0 to 10000); y-axis: Expenditure (0 to 10000); the point (2000, 2000) is labelled]

Source:
http://noppa5.pc.helsinki.fi/koe/corr/index.html

Correlation Coefficients

The general rule
  +.95 is a strong positive correlation
  +.50 is a moderate positive correlation
  +.20 is a low positive correlation (small correlation)
  -.26 is a low negative correlation
  -.49 is a moderate negative correlation
  -.95 is a strong negative correlation
Predictions
  Between .60 and .70 is adequate for group predictions
  Above .80 is adequate for individual predictions
Objective 3.3 & 3.5

Conducting a Prediction Study

Identify a set of variables
  Limit to those variables logically related to the criterion
Identify a population and select a sample
Identify appropriate instruments for measuring each variable
  Ensure appropriate levels of validity and reliability
Collect data for each instrument from each subject
  Typically data are collected at different points in time
Compute the results
  Regression coefficient
  Regression equation

Hypotheses for Correlation

$H_0: r = 0$
$H_A: r \neq 0$

Collecting Measurement

Instrument – a tool used to collect data
Test – a formal, systematic procedure for gathering information
Assessment – the general process of collecting, synthesizing, and interpreting information
Obj. 3.1 & 3.2

The Process

Participant and instrument selection
  Minimum of 30 subjects
  Instruments must be valid and reliable
    Higher validity and reliability allow smaller samples
    Lower validity and reliability require larger samples
Design and procedures
  Collect data on two or more variables for each subject
Data analysis
  Compute the appropriate correlation coefficient
Objectives 2.2 & 2.3

Selection of a Test

Sources of test information, e.g.:
  Mental Measurements Yearbook (MMY)
  Buros Institute
  ETS Test Collection

Types of Correlation Coefficients

The type of correlation coefficient depends on the measurement level of the variables
Pearson r – continuous predictor and criterion variables
  Math attitude and math achievement
Spearman rho – ranked or ordinal predictor and criterion variables
  Rank in class and rank on a final exam
Phi coefficient – dichotomous predictor and criterion variables
  Gender and pass/fail status on a high-stakes test
Objectives 7.1, 7.2, & 7.3

Calculating Pearson Correlation Coefficient

Z-score formula:
$r = \frac{\sum z_x z_y}{N}$

Raw score formula:
$r = \frac{N \sum XY - (\sum X)(\sum Y)}{\sqrt{\left(N \sum X^2 - (\sum X)^2\right)\left(N \sum Y^2 - (\sum Y)^2\right)}}$

Just for information

Critical Values of the Pearson Product-Moment Correlation Coefficient

First you determine the degrees of freedom (df). For a correlation study, the degrees of freedom are 2 less than the number of subjects. With 27 subjects, for example, use the critical value table to find the intersection of alpha .05 (see columns) and 25 degrees of freedom (see rows). The value found at the intersection (.381) is the minimum correlation coefficient needed to confidently state 95 times out of a hundred that the relationship you found with your subjects exists in the population from which they were drawn.
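The two Pearson formulas and the degrees-of-freedom rule above can be sketched in Python as follows. This is a minimal illustration, not part of the lecture: the function names are hypothetical, the income and expenditure pairs are read row by row from the Correlational Data slide (that pairing is an assumption about the flattened table), and the standard deviation uses n in the denominator to match the sample formulas given in these slides.

```python
from math import sqrt

# Income and expenditure per month, read row by row from the
# Correlational Data slide (the pairing is an assumption).
income = [4000, 4000, 5000, 2000, 9000, 4000, 7000, 8000, 9000, 5000]
expenditure = [4000, 5000, 6000, 2000, 6000, 2000, 5000, 6000, 9000, 3000]


def pearson_raw(x, y):
    """Raw score formula for r, built from the sums of X, Y, XY, X^2 and Y^2."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def z_scores(values):
    """z = (X - mean) / S, with S computed using n in the denominator."""
    n = len(values)
    m = sum(values) / n
    s = sqrt(sum((v - m) ** 2 for v in values) / n)
    return [(v - m) / s for v in values]


def pearson_z(x, y):
    """Z-score formula: the sum of the products z_x * z_y, divided by N."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


r_raw = pearson_raw(income, expenditure)
r_z = pearson_z(income, expenditure)
df = len(income) - 2  # degrees of freedom: 2 less than the number of subjects

print(f"r (raw score formula): {r_raw:.3f}")
print(f"r (z-score formula):   {r_z:.3f}")
print(f"degrees of freedom:    {df}")
```

For these ten pairs both formulas give r of about .80 with df = 8. The .381 critical value quoted above is for df = 25, so a different row of the critical value table would be needed for this data set.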
If the absolute value of your correlation coefficient is above .381, you reject your null hypothesis (there is no relationship) and accept the alternative hypothesis: e.g., there is a statistically significant relationship between arm span and height, r(25) = .87, p < .05. If the absolute value of your correlation coefficient were less than .381, you would fail to reject your null hypothesis: there is not a statistically significant relationship between arm span and height, r(25) = .12, p > .05.

Source: http://www.gifted.uconn.edu/siegle/research/Correlation/alphaleve.htm

Prediction and Regression

The position of the line is determined by "b", the slope (the angle), and "a", the intercept (the point where the line intersects the Y-axis).

$Y = bX + a$

Source: http://noppa5.pc.helsinki.fi/koe/corr/index.html

Other Correlation Analyses

Multiple regression
  Two or more variables are used to predict one criterion variable
Canonical correlation
  An extension of multiple regression in which more than one predictor variable and more than one criterion variable are used
Factor analysis
  A correlational analysis used to take a large number of variables and group them into a smaller number of clusters of similar variables called factors

References

Gay, L. R., Mills, G. E., & Airasian, P. (2006). Educational research: Competencies for analysis and applications. Upper Saddle River, NJ: Pearson/Merrill Prentice Hall.
Ravid, R. (2000). Practical statistics for educators (2nd ed.). New York, NY: University Press of America.