Lecture 12: Interpreting Standardized Test Results
In addition to preparing and using classroom tests, you will be expected to administer and
interpret standardized tests.
A. What Makes a Test Standardized?
They are administered, scored, and interpreted in a standardized (consistent) manner.
Standardization permits comparisons across classrooms, schools, and school districts.
These comparisons are made using a representative norm group.
1. ADMINISTERED: remember that there are standard directions, a specific
way to respond to the question (e.g., fill in the bubble correctly), a given number of proctors, and an
appropriate setting.
2. SCORED: remember that the use of selected-response items ensures
objective scoring. Remember also that constructed-response items (e.g.,
ILLINOIS WRITING ASSESSMENT) are scored using a defined rubric, and the consistency
of scoring is continually checked.
3. INTERPRETED: we will discuss five score interpretations
A. PERCENTILE (OR PERCENTILE RANK)
B. GRADE-EQUIVALENT SCORES
C. STANINES
D. Z SCORES
B. ADVANTAGES and DISADVANTAGES of each of these five types of score reporting
1. PERCENTILE (OR PERCENTILE RANK)
advantage –
disadvantage – dependent on the quality of the norm group
2. GRADE-EQUIVALENT SCORES
advantage – easy to communicate
disadvantage – history of misinterpretation
3. STANINE
advantage – given the existence of measurement error, it is a report of
general performance
disadvantage –
4. Z SCORES
advantage – expressed in standard deviation units (easy to understand)
disadvantage –
C. Important Terms:
_____________________ – student’s performance is compared to a standard of performance
called a criterion. Test items are carefully selected from a specified set of skills that
make up the goal. (Absolute interpretations)
_____________________ – student’s performance is compared to the performance of others.
When referring to standardized test results, the others are those individuals from the norm
group. (Relative interpretations)
_______________ – typically, a stratified random sample of individuals chosen to represent
the population of individuals about whom inferences will be made. A sample of examinees that is
geographically, ethnographically, racially, parochially, and gender-wise similar to the population
that will take the test. These individuals take the test first, their results are analyzed, and
then your students’ scores can be interpreted relative to this group’s performance.
I. Comparing Individual Performance to the Group
As you have seen, we can divide the score scale into performance categories using the mean and
standard deviation:
                     Well Below     Below          Average        Above          Well Above
                     Average        Average                       Average        Average
Standard deviation   X < -2         -2 ≤ X < -1    -1 ≤ X ≤ +1    +1 < X ≤ +2    X > +2
Percentile Rank
Raw score*           Below 60       60 to 65       65 to 75       75 to 80       Above 80
*given a hypothetical mean = 70 and standard deviation = 5.
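The category boundaries in the table can be sketched in a few lines of Python. This is only an illustration under the handout's hypothetical mean of 70 and standard deviation of 5; the function name `performance_category` is made up for this example.

```python
def performance_category(score, mean, sd):
    """Place a raw score in one of the five categories from the table."""
    # Distance of the score from the mean, in standard deviation units.
    distance = (score - mean) / sd
    if distance < -2:
        return "Well Below Average"
    if distance < -1:
        return "Below Average"
    if distance <= 1:
        return "Average"
    if distance <= 2:
        return "Above Average"
    return "Well Above Average"


# Hypothetical scores, using the handout's mean = 70 and standard deviation = 5.
for score in (58, 62, 70, 78, 83):
    print(score, performance_category(score, mean=70, sd=5))
```

A score that falls exactly on a boundary follows the inequalities in the table: 65 is exactly one standard deviation below the mean, so it still counts as Average.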
II. Calculating and Using Z Scores
Z-Scores
i. Z = (X - X̄) / s = (Score – Mean) / standard deviation
ii. the distribution of z-scores has a mean = 0
iii.
iv. Z-scores express the distance of a raw score from the mean in standard
deviation units.
Well Below Average    Z < -2
Below Average         -2 ≤ Z < -1
Average               -1 ≤ Z ≤ +1
Above Average         +1 < Z ≤ +2
Well Above Average    Z > +2
Practice Calculating Z-Scores
Individual Test Score   Test Mean   Standard Deviation   Z = (X - X̄) / s
Calculus: 90            98          4                    (90 - 98) / 4 = -8 / 4 = -2
Biology: 85             70          5                    (85 - 70) / 5 = 15 / 5 = 3
U.S. History: 40        45          3                    (40 - 45) / 3 = -5 / 3 = -1.7
Spanish: 22             21          3                    (22 - 21) / 3 = 1 / 3 = 0.33
English: 88             85          5
Music: 50               80          15
Geography: 90           85          3
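The four worked rows can be checked with a short Python sketch. The function name `z_score` and the list layout are choices made for this example, not part of the handout; the remaining practice rows can be added to the list in the same way.

```python
def z_score(score, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (score - mean) / sd


# (test, individual score, test mean, standard deviation) from the worked rows.
worked_rows = [
    ("Calculus", 90, 98, 4),
    ("Biology", 85, 70, 5),
    ("U.S. History", 40, 45, 3),
    ("Spanish", 22, 21, 3),
]

for name, score, mean, sd in worked_rows:
    print(f"{name}: z = {z_score(score, mean, sd):.2f}")
```

Printing two decimal places shows U.S. History as -1.67, which the handout rounds to -1.7.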
III. More Common Standardized Test Scores
A. Stanines (Standard Nine Point Scale)
i. Range from 1 (lowest) to 9 (highest).
ii. Stanines break the distribution into 9 intervals; stanines 2 through 8 are each half a
standard deviation wide, and stanines 1 and 9 cover the tails.
iii. Stanines of 1, 2, and 3 reflect below-average achievement in the subject,
compared to the norm group.
iv. Stanines of 4, 5, and 6 reflect average achievement in the subject compared
to the norm group.
v. Stanines of 7, 8, and 9 reflect above-average achievement in the subject
compared to the norm group.
vi. They provide a rough approximation of performance that takes into account
the error of measurement.
B. Percentile Ranks
Percentile Ranks - indicate a student’s relative position in a group by giving the percentage
of scores the student surpassed. A percentile rank of 80 indicates that the student surpassed
80% of the other students in the norm group who took this same test.
i. Range from 1 to 99.
ii. Depend on the quality of the norm group.
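As a rough sketch of this definition, the Python below counts how many norm-group scores a student surpassed. The norm group here is a small made-up list for illustration only (a real norm group is far larger), and the function name `percentile_rank` is an assumption of this example.

```python
def percentile_rank(score, norm_scores):
    """Percentage of norm-group scores that the given score surpassed."""
    surpassed = sum(1 for s in norm_scores if s < score)
    rank = round(100 * surpassed / len(norm_scores))
    # Reported percentile ranks range from 1 to 99.
    return min(99, max(1, rank))


# Hypothetical norm group of 20 scores, invented for this example.
norm_group = [52, 55, 58, 60, 61, 63, 64, 66, 67, 68,
              70, 71, 72, 73, 75, 76, 78, 80, 83, 88]

print(percentile_rank(78, norm_group))  # surpassed 16 of 20 scores
```

Because the result depends entirely on who is in `norm_group`, the same raw score can earn a very different percentile rank against a different norm group.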
Below Average: below 25th
Average: 25th - 75th
Above Average: above 75th
Cautions:
1. Percentile scores do not indicate the percent correct.
2. Percentile scores cannot be manipulated arithmetically (ordinal scale).
3. Percentile scores are based on frequency, so small score increases at the
middle of the distribution result in large percentile rank changes, while even
large score increases at the tails of the distribution do not result in large
percentile rank changes.
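Caution 3 can be illustrated numerically. The sketch below assumes that scores in the norm group are normally distributed, which the handout does not state, and uses Python's standard `statistics.NormalDist`; the helper name `percentile_from_z` is made up for this example.

```python
from statistics import NormalDist


def percentile_from_z(z):
    """Percent of a normal distribution falling below a given z-score."""
    return 100 * NormalDist().cdf(z)


# The same half-standard-deviation gain, once in the middle and once in the tail.
middle_gain = percentile_from_z(0.5) - percentile_from_z(0.0)
tail_gain = percentile_from_z(2.5) - percentile_from_z(2.0)

print(f"Gain from z = 0 to z = 0.5: {middle_gain:.1f} percentile points")
print(f"Gain from z = 2 to z = 2.5: {tail_gain:.1f} percentile points")
```

Under this assumption, the gain near the mean is roughly 19 percentile points, while the same raw-score gain far out in the tail is under 2 points.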
Approximate comparisons of Scores
Stanine   1      2      3       4       5       6       7       8       9
P. Rank   <4     4-10   11-22   23-39   40-59   60-76   77-88   89-95   96+
Z         < -2   -1.5   -1      -0.5    0       0.5     1       1.5     2+
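The percentile-rank row of this comparison can be turned into a small lookup. This is a sketch that simply encodes the approximate cutoffs listed above; the names `STANINE_LOWER_BOUNDS` and `stanine_from_percentile` are invented for the example.

```python
from bisect import bisect_right

# Lowest percentile rank belonging to stanines 2 through 9,
# read from the approximate comparison above.
STANINE_LOWER_BOUNDS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine_from_percentile(percentile_rank):
    """Approximate stanine (1 to 9) for a percentile rank (1 to 99)."""
    # Count how many lower bounds the rank has reached, then shift to 1-9.
    return bisect_right(STANINE_LOWER_BOUNDS, percentile_rank) + 1


for rank in (3, 10, 50, 90, 99):
    print(rank, stanine_from_percentile(rank))
```

A percentile rank of 50 falls in stanine 5, and a percentile rank of 10 falls in stanine 2, matching the ranges in the comparison.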
C. Grade Equivalent Scores
Grade Equivalent Scores - identify the grade level (year.month) at which a typical student
might obtain the same observed raw score as the examinee
i. Grade equivalents range from (year.0) to (year.9) for every grade level (1
through 12).
ii. They are based on the assumption that students learn an equal amount of
information during each of the 9 months of the school year and nothing
during the summer months.
iii. They are based on the performance of three contiguous norm groups.
iv. The scale is created using mathematical interpolation of observed
performance of the 3 contiguous norm groups and extrapolation of
anticipated performance of students who have never taken the test.
v. Grade equivalent scores are only appropriate to interpret performance in
basic skills areas.
vi. Every grade equivalent score of (year.0) corresponds to the median
performance for that grade level, which has serious implications for
interpretation.
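The interpolation and extrapolation idea can be sketched as follows. This is a simplified illustration, not a test publisher's actual procedure: it assumes straight-line growth between grade medians, and the three median raw scores are made-up numbers. The function name `grade_equivalent` is also an assumption of this example.

```python
def grade_equivalent(raw_score, medians):
    """Estimate a grade equivalent by linear interpolation between medians.

    medians: list of (grade_level, median_raw_score) pairs, one per norm group.
    Scores outside the observed range are extrapolated from the nearest pair.
    """
    points = sorted(medians, key=lambda pair: pair[1])
    # Walk the adjacent pairs; stop at the first segment that can hold the score.
    # If none does, the loop ends on the last segment, which is then extrapolated.
    for (grade_lo, raw_lo), (grade_hi, raw_hi) in zip(points, points[1:]):
        if raw_score <= raw_hi:
            break
    slope = (grade_hi - grade_lo) / (raw_hi - raw_lo)
    return round(grade_lo + (raw_score - raw_lo) * slope, 1)


# Hypothetical median raw scores for three contiguous norm groups.
hypothetical_medians = [(3.0, 20), (4.0, 30), (5.0, 40)]

print(grade_equivalent(33, hypothetical_medians))  # between the grade 4 and 5 medians
print(grade_equivalent(45, hypothetical_medians))  # beyond the highest norm group
```

The second call shows why extrapolated values deserve caution: no student in any norm group actually earned that score, so the grade equivalent is a projection rather than an observation.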
Interpretations:
Rodney is a fourth grade student who has taken a standardized test. For math
computation he received a Grade Equivalent of 7.3. This means that he performed as well on
this fourth grade test as a typical seventh grade student in the third month of seventh grade
would. It does not mean that he can do seventh grade work, only that he is performing very
well for a fourth grader!
Mary is a ninth grade student who has taken a standardized test. Her language total was
a G.E. of 8.6 and a stanine of 5. Her performance is average, not below grade level!
Rachel is a sixth grade student who has taken a standardized test. Her language score
was a G.E. of 3.5 and she placed in the 10th percentile. Her performance is significantly below
grade level.
Cautions:
1. A G.E. score is NOT an estimate of the grade in which a student should be placed.
2. Don’t expect all students to gain 1.0 grade each year.
3. G.E. scores on different tests are not comparable.
4. G.E. scores that are below grade level may not be low. Remember, 50% of
the students in the fourth grade norm group received G.E. scores < 4.0
because they fell below the median, and your students who place below
the median will as well.
Types of Standardized Tests
1. Criterion-referenced Achievement Tests
a. Developed by school districts, state departments of education, and
commercial testing companies. Their purpose is to measure student
progress toward stated curriculum goals.
b. Example:
2. Norm-referenced Achievement Tests
a. Used primarily to compare students’ achievement to that of a large,
representative group of students at the same grade level, called the
norm group. Their purpose is to ascertain whether a student’s or a
group’s achievement level is above average, average, or below average
when compared to the norm group.
b. Examples:
3. Scholastic Aptitude Tests
a. Developed by commercial test publishers, these tests measure students’
thinking and reasoning skills rather than curriculum-based skills. Student
performance is compared to a norm group. Their purpose is to predict
how well the student will achieve in given subjects, special training
programs, higher education, and certain careers.
b. Examples:
Which standardized test should you use?
They are all expensive, so many schools just use a norm-referenced achievement test.
While many companies now report criterion-referenced scores along with norm-referenced
interpretations, interpret them carefully: norm-referenced tests typically have high item
difficulty and discrimination indices and sample content selectively.
Just like on your tests, the items must match the instructional objectives. If the curriculum of
your classroom does not match the instructional criteria of the test, students will not perform
well.
How Should Teachers Help Students Prepare for Standardized Tests?
1. General teacher instruction on objectives not determined by looking at the objectives
measured on a standardized test
2. Teaching general test taking skills
3. Specific instruction based on objectives that specifically match those on the
standardized tests
4. Instruction on specific objectives where the practice follows the same format as the
test questions