Chapter 15 - Standardized Testing

© 2006 The McGraw-Hill Companies, Inc. All rights reserved. Santrock, Educational Psychology, Second Edition, Classroom Update

CHAPTER 15
Standardized Tests and Teaching

Learning Goals
1. Discuss the nature of standardized tests.
2. Compare aptitude and achievement testing and describe current uses of achievement tests.
3. Identify the teacher’s role in standardized testing.
4. Evaluate some key issues in standardized testing.

Standardized Tests and Teaching: The Nature of Standardized Tests
• What Is a Standardized Test?
• The Purposes of Standardized Tests
• Criteria for Evaluating Standardized Tests

The Nature of Standardized Tests
Standardized tests:
• Have uniform procedures for administration and scoring.
• Allow comparison of student scores by age, grade level, local and national norms.
• Attempt to include material common across most classrooms.

Enter the Debate
Should students have to pass a test to earn a high school diploma? YES / NO

Purposes of Standardized Tests
• Diagnose students’ strengths and weaknesses
• Provide information about student progress and program placement
• Contribute to accountability
• Provide information for planning and instruction
• Help in program evaluation
The Nature of Standardized Tests
• Standards-based tests assess skills that students are expected to have mastered before they are permitted to move to the next grade or to graduate.
• High-stakes testing is the use of tests in a way that has important consequences for the student, affecting major educational decisions.

Evaluating Standardized Tests
• Norms: Does the normative group represent all students who may take the test?
• Reliability: Are test scores stable, dependable, and relatively free from error?
• Validity: Does the test measure what it is purported to measure?

Correlation
• The correlation coefficient is a statistical measure of the relationship between two variables.
• Example: r = +0.37
  – The sign indicates the direction of the relationship (positive or negative).
  – The number indicates the strength of the relationship (0.00 to 1.00).

Pearson correlation coefficient
• r = the Pearson coefficient
• r measures the amount that the two variables (X and Y) vary together (i.e., covary), taking into account how much they vary apart
• Pearson’s r is the most common correlation coefficient; there are others.
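The idea that r compares how much X and Y vary together with how much each varies on its own can be sketched in Python. This is only an illustration, not part of the original slides: the function name `pearson_r` and the sample data (hours studied and test scores) are hypothetical, and the code assumes the common sum-of-products over sums-of-squares form of Pearson's r.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for paired scores (hypothetical helper, not from the slides)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How X and Y vary together: sum of products of deviations (SP).
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How X and Y vary separately: sum of squares (SS) for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variability")
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data for five students.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, scores)
print(f"r = {r:+.2f}")  # the sign gives the direction, the size gives the strength
```

For this made-up data SP = 6, SS for hours = 10, and SS for scores = 6, so r is 6 divided by the square root of 60, a positive and fairly strong relationship.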
Computing the Pearson correlation coefficient
• To put it another way:

$$ r = \frac{\text{degree to which } X \text{ and } Y \text{ vary together}}{\text{degree to which } X \text{ and } Y \text{ vary separately}} $$

• Or:

$$ r = \frac{\text{covariability of } X \text{ and } Y}{\text{variability of } X \text{ and } Y \text{ separately}} $$

Sum of Products of Deviations
• Measuring X and Y individually (the denominator):
  – compute the sums of squares for each variable
• Measuring X and Y together: Sum of Products (SP)
  – Definitional formula:

$$ SP = \sum (X - \bar{X})(Y - \bar{Y}) $$

  – Computational formula:

$$ SP = \sum XY - \frac{\sum X \sum Y}{n} $$

• n is the number of (X, Y) pairs

Correlation Coefficient
• The equation for Pearson’s r:

$$ r = \frac{SP}{\sqrt{SS_X \, SS_Y}} $$

• Expanded form:

$$ r = \frac{\sum XY - \dfrac{\sum X \sum Y}{n}}{\sqrt{\left(\sum X^2 - \dfrac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \dfrac{(\sum Y)^2}{n}\right)}} $$

Correlation Coefficient Interpretation
Coefficient Range    Strength of Relationship
0.00 - 0.20          Practically None
0.20 - 0.40          Low
0.40 - 0.60          Moderate
0.60 - 0.80          High Moderate
0.80 - 1.00          Very High

Reliability
• Test-retest: The extent to which a test yields the same score when given to a student on two different occasions
• Alternate-forms: Two different forms of the same test are given on two different occasions to determine the consistency of the scores
• Split-half: The test items are divided into two halves, and scores on the halves are compared to determine test score consistency

Methods of Studying Reliability
• Interrater reliability: The consistency of a test to measure a skill, trait, or domain across examiners. This type of reliability is most important when responses are subjective or open-ended.
Source: Terry Overton, Assessing Learners with Special Needs, 5e. Copyright ©2006 by Pearson Education, Inc., Upper Saddle River, New Jersey 07458. All rights reserved.

Types of Validity
• Content: The test’s ability to sample the content that is being measured
• Criterion-related:
  1. Concurrent: The relation between a test’s score and other available criteria
  2. Predictive: The relationship between a test’s score and future performance
• Construct: The extent to which there is evidence that a test measures a particular construct

Factor Analysis
A statistical technique that uses the correlations between observed variables to estimate common factors and the structural relationships linking factors to observed variables. Two observed variables can correlate because of their relationships with a common factor. [Diagram not reproduced.]

Standardized Tests and Teaching: Aptitude and Achievement Tests
• Comparing Aptitude and Achievement Tests
• Types of Standardized Achievement Tests
• District and National Tests
• High-Stakes State-Mandated Tests

Aptitude vs. Achievement Tests
• Aptitude tests: Predict a student’s ability to learn a skill or accomplish a task. (Stanford-Binet, Wechsler, SAT when used to predict success)
• Achievement tests: Measure what the student has learned or mastered. (California Achievement, Iowa Basic Skills, SAT when used to determine what has been learned)

High-Stakes State-Mandated Tests
Possible advantages:
• Improved student performance
• More teaching time
• Higher student expectations
• Identification of poor-performing schools/teachers
• Improved confidence in schools
Criticisms:
• “Dumbing down” and more emphasis on rote memorization
• Less time for problem-solving and critical thinking skills
• Teachers “teaching to the test”
• Discrimination against low-SES and ethnic minority children
National Assessment of Educational Progress
A federal “census-like” exam of students’ knowledge, skills, understanding, and attitudes

Subject    Years        Grades          Trend
Reading    1992–2000    4th             no improvement
Reading    1992–1998    8th and 12th    no improvement
Math       1990–2000    4th and 8th     improvement
Math       1990–2000    12th            decline
Science    1996–2000    4th and 8th     no change
Science    1996–2000    12th            decline

Standardized Tests and Teaching: The Teacher’s Role
• Preparing Students to Take Standardized Tests
• Administering Standardized Tests
• Using Standardized Test Scores to Plan and Improve Instruction
• Understanding and Interpreting Test Results

The Don’ts of Standardized Testing
DON’T:
• Teach to the test
• Use the standardized test format for classroom tests
• Describe tests as a burden
• Tell students that important decisions will be made solely on the results of a single test
• Use previous forms of the test to prepare students
• Convey a negative attitude about the test

Descriptive Statistics
Descriptive statistics are the mathematical procedures that are used to describe and summarize data.

Counting the Data: Frequency
Look at the set of data that follows. A tally mark was made each time a score occurred.
• Which number most likely represents the average score?
• Which number is the most frequently occurring score?

A Frequency Distribution
Score    Tally          Frequency
100      1              1
99       1              1
98       11             2
94       11             2
90       11111          5
89       11111 11       7
88       11111 11111    10
82       11111 1        6
75       11             2
74       1              1
68       1              1
60       1              1

• Average score? 88
• Most frequent score? 88
This frequency count represents data that closely represent a normal distribution.
Descriptive Statistics: Frequency Polygons
Data: 100, 99, 98, 98, 94, 94, 90, 90, 90, 90, 90, 89, 89, 89, 89, 88, 88, 75, 75, 74, 68, 60
[Figure: frequency polygon of the data; x-axis: Scores (60 to 100); y-axis: frequency (1 to 5)]

Measures of Central Tendency
Measures of central tendency provide information about the average or typical score in a data set. Central tendency = representative or typical value in a distribution.
• Mean: The numerical average of a group of scores
• Median: The score that falls exactly in the middle of a data set
• Mode: The score that occurs most often

Mean
• Same thing as an average
• Computed by summing all the scores (sigma, Σ) and dividing by the number of scores (N):

$$ M = \frac{\sum X}{N} $$

To find the mean, simply add the scores and divide by the number of scores in the set of data.
98 + 94 + 88 + 75 = 355
Divide by the number of scores: 355 / 4 = 88.75

Median: the middlemost point in a set of data
Steps to computing the median:
1. Line up scores from highest to lowest
2. Count up to the middle score
   • If there is 1 middle score, that’s the median
   • If there are 2 middle scores, the median is their average

Data Set 1: 100, 99, 99, 98, 97, 96, 90, 88, 85, 80, 79
Median: 96
Data Set 2: 100, 99, 98, 97, 86, 82, 78, 72, 70, 68
The median is 84 for this set. 84 represents the middlemost point in this set of data.

Mode: the most frequently occurring score in a set of data
Find the modes for the following sets of data:
Data Set 3: 99, 89, 89, 89, 89, 75
Mode: 89
Data Set 4: 99, 88, 88, 87, 87, 72, 70
88 and 87 are both modes for this set of data. This is called a bimodal distribution.

Measures of Variability (Dispersion)
Range: the distance between the highest and lowest scores in a set of data.
100 - 65 = 35
35 is the range in this set of scores.
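The mean, median, mode, and range examples above can be checked with Python's standard `statistics` module. This is a minimal sketch rather than part of the slides: the variable names are illustrative, and because the slide gives only the highest (100) and lowest (65) scores for the range, the `range_scores` list is a hypothetical data set with those endpoints.

```python
import statistics

# Scores taken from the slide examples.
mean_scores = [98, 94, 88, 75]
data_set_1 = [100, 99, 99, 98, 97, 96, 90, 88, 85, 80, 79]   # 11 scores, one middle score
data_set_2 = [100, 99, 98, 97, 86, 82, 78, 72, 70, 68]       # 10 scores, two middle scores
data_set_3 = [99, 89, 89, 89, 89, 75]
data_set_4 = [99, 88, 88, 87, 87, 72, 70]
# Hypothetical scores: only the endpoints 100 and 65 come from the slide.
range_scores = [100, 92, 88, 71, 65]

mean_value = statistics.mean(mean_scores)      # sum of scores / number of scores
median_1 = statistics.median(data_set_1)       # the single middle score
median_2 = statistics.median(data_set_2)       # average of the two middle scores
modes_3 = statistics.multimode(data_set_3)     # list of most frequent scores
modes_4 = statistics.multimode(data_set_4)     # two modes means a bimodal set
score_range = max(range_scores) - min(range_scores)

print("mean:", mean_value)
print("medians:", median_1, median_2)
print("modes:", modes_3, modes_4)
print("range:", score_range)
```

`statistics.multimode` (Python 3.8 and later) is used instead of `statistics.mode` because it returns every tied score, which is what the bimodal example needs.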
Variance: describes the total amount that a set of scores varies from the mean.
1. Subtract the mean from each score. When the mean for a set of data is 87, subtract 87 from each score.
2. Square each difference (multiply each difference by itself).

Score    Score - Mean      Squared difference
100      100 - 87 = 13     13 x 13 = 169
98       98 - 87 = 11      11 x 11 = 121
95       95 - 87 = 8       8 x 8 = 64
91       91 - 87 = 4       4 x 4 = 16
85       85 - 87 = -2      -2 x -2 = 4
80       80 - 87 = -7      -7 x -7 = 49
60       60 - 87 = -27     -27 x -27 = 729
                           Total: 1,152

3. Sum the squared differences. The total (1,152) is the sum of squares.
4. Divide the sum of squares by the number of scores. 1,152 divided by 7 = 164.5714. This number represents the variance for this set of data.

Standard Deviation: represents the typical amount that a score is expected to vary from the mean in a set of data.
5. To find the standard deviation, find the square root of the variance. For this set of data, find the square root of 164.5714. The standard deviation for this set of data is about 12.83, or 13 when rounded to the nearest whole number.

Ceiling and Floor Effects
• Ceiling effects
  – Occur when scores can go no higher than an upper limit and “pile up” at the top
  – e.g., scores on an easy exam
  – Cause negative skew
• Floor effects
  – Occur when scores can go no lower than a lower limit and pile up at the bottom
  – e.g., household income
  – Cause positive skew

Skewed Frequency Distributions
[Figure panels (a) to (c) not reproduced.]
• Normal distribution (a)
• Skewed right (b)
  – Fewer scores right of the peak
  – Positively skewed
  – Can be caused by a floor effect
• Skewed left (c)
  – Fewer scores left of the peak
  – Negatively skewed
  – Can be caused by a ceiling effect

Understanding Descriptive Statistics
The Normal Distribution: A “bell-shaped” curve in which most of the scores are clustered around the mean; the farther from the mean, the less frequently the score occurs.
[Figure: Bell Curve]
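The variance and standard deviation steps worked through earlier in this section (scores 100, 98, 95, 91, 85, 80, 60) can be reproduced in a few lines of Python. This is a sketch under one stated assumption: like the slide, it divides by the number of scores (the population form). A sample variance would divide by one less than the number of scores. The variable names are illustrative only.

```python
import math
import statistics

scores = [100, 98, 95, 91, 85, 80, 60]
n = len(scores)

mean = sum(scores) / n                          # 609 / 7 = 87
deviations = [x - mean for x in scores]         # step 1: subtract the mean
squared = [d ** 2 for d in deviations]          # step 2: square each difference
sum_of_squares = sum(squared)                   # step 3: sum the squared differences
variance = sum_of_squares / n                   # step 4: divide by the number of scores
std_dev = math.sqrt(variance)                   # step 5: square root of the variance

print("mean:", mean)
print("sum of squares:", sum_of_squares)
print("variance:", round(variance, 4))
print("standard deviation:", round(std_dev, 2))

# Cross-check against the standard library's population formulas.
assert math.isclose(variance, statistics.pvariance(scores))
assert math.isclose(std_dev, statistics.pstdev(scores))
```

Dividing by n matches `statistics.pvariance` and `statistics.pstdev`; `statistics.variance` and `statistics.stdev` would give slightly larger values because they divide by n - 1.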
[Figure: Commonly Reported Test Scores Based on the Normal Curve]

Z Scores
• When values in a distribution are converted to Z scores, the distribution will have
  – Mean of 0
  – Standard deviation of 1
• Useful
  – Allows variables to be compared to one another even when they are measured on different scales, have very different distributions, etc.
  – Provides a generalized standard of comparison

Z Scores
• To compute a Z score, subtract the mean from a raw score and divide by the SD:

$$ Z = \frac{X - M}{SD} $$

• To convert a Z score back to a raw score, multiply the Z score by the SD and then add the mean:

$$ X = (Z)(SD) + M $$

Standardized Tests and Teaching: Issues in Standardized Testing
• Standardized Tests, Alternative Assessments, High-Stakes Testing
• Diversity and Standardized Testing

Issues in Standardized Testing
Alternative Assessments
• Assessments of oral presentations
• Real-world problems
• Projects
• Portfolios
Diversity and Standardized Tests
• Gaps on standardized tests have been attributed to environmental rather than hereditary factors
• Special concern in creating culturally unbiased tests

Crack the Case: Standardized Tests
1. What are the issues involved in this situation?
2. Examine Ms. Carter’s testing procedures. What does she do incorrectly? How might this reduce the validity of the students’ scores?
3. How would you answer each of the parents’ questions?
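The Z-score conversions described earlier in this section can be sketched as two small Python functions. The function names and the two example tests (with their means and standard deviations) are hypothetical and chosen only to show how scores on different scales become comparable.

```python
def z_score(raw, mean, sd):
    """Convert a raw score to a Z score: subtract the mean, divide by the SD."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd

def raw_score(z, mean, sd):
    """Convert a Z score back to a raw score: multiply by the SD, add the mean."""
    return z * sd + mean

# Hypothetical tests measured on different scales.
z_test_a = z_score(95, mean=80, sd=10)   # raw 95 on a test with M = 80, SD = 10
z_test_b = z_score(60, mean=50, sd=5)    # raw 60 on a test with M = 50, SD = 5

print("Z on test A:", z_test_a)
print("Z on test B:", z_test_b)
print("back to raw:", raw_score(z_test_a, mean=80, sd=10))
```

In this made-up case the lower raw score (60 on test B) is the stronger performance relative to its own distribution, which is the kind of cross-scale comparison Z scores allow.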
Reflection & Observation
Reflection:
• What standardized tests have you taken?
• How have these tests affected your perceptions of competence?
Observation:
• What are some of the mother’s concerns regarding her son’s standardized test scores?
• What error does the teacher make in interpreting one of the test scores?
• How would you explain this score?
