SES1 Part Two

Basic Quantitative Concepts
• Measurement and assessment are built on quantitative concepts
• Scales of measurement
– Nominal
– Ordinal
– Ratio
– Equal interval

Measurement Scales
• There are four levels for measuring variables and data. Each level provides more information than the one before it
– Nominal
– Ordinal
– Interval
– Ratio

Ratio Scale
• Not used a great deal in education. It has equal intervals and a true zero, so scores can be compared as ratios or percentages.
• Example: His weight was half as much as Tom's.
• Not the same as percentile scores.

Nominal
• A nominal scale defines several mutually exclusive categories (remember, this is quantitative; there are no gray areas)
– Gender, race, type of school
– The term nominal also refers to the nature of the data that are collected: you can simply count the number of participants in each category (fourth, fifth, sixth grade)

Measurement Scales
• In education, ordinal and equal-interval scales are by far the most used
• Nominal and ratio scales are rarely used in education

Ordinal Scale
• Categories are rank ordered, with no indication of how far apart they are: top, middle, bottom; first, second, third
• It ranks things from better to worse or from worse to better
– Example: good, better; or novice, intermediate, expert
– High risk, low risk, emerging, reader

Ordinal Scale
• Used a great deal in education: serious, more serious, most serious
• In an ordinal scale, the size of the difference between levels is unknown; we cannot determine how much better an intermediate is than a novice
• Because the distance between levels is unknown and presumed unequal, ordinal values cannot be averaged
• For example, an emerging and an expert cannot be averaged to get a proficient

Equal-Interval Scales
• Like an ordinal scale, levels are ordered from best to worst or vice versa
• The difference is that you know the magnitude of the difference between the levels
• The distances between levels are also equal, so you can add, subtract, and average them
– Ex.
Length or weight

Frequency Distribution
• Suppose you have a set of scores for a class. A frequency distribution organizes these scores into a format that makes it easier to see how subjects performed. The two most commonly used formats are the table and the graph

Distribution
• Sets of scores (a class set of test scores, or a school set) can be described in terms of four characteristics
– Mean
– Variance
– Skew
– Kurtosis

Mean
• Mean is just a term for the average
– 95 + 88 + 99 + 91 + 69 = 442
– 442 ÷ 5 = 88.4

Variance
• Variance describes how far the scores in a set are spread out from the mean
• It is calculated with a rather complicated formula, which I will show you next
• It is a key concept in understanding most statistics

[Slide: variance formula]

Skew
• Refers to the symmetry of a distribution of scores
– In a symmetrical set, the scores above the mean mirror the scores below the mean; they are not skewed
– If a set of scores represents all good grades, because the students learned the material well or the test was easy, the chart of the distribution would be skewed
– The opposite can also happen when the test scores are not good

Bell-Shaped Curve
• Symmetrical
• Used to determine many standards
– A, B, C
– IQ
– SAT

Negatively Skewed Curve
• Shows most students scored well or high on the measure

Positively Skewed Curve
• Shows most students scored poorly or low on the measure

Kurtosis
• Describes the peakedness of a curve
• Platykurtic: flat curve
• Leptokurtic: sharply peaked (fast-rising) curve

Measures of Central Tendency
• Mean: average
• Median: middle
• Mode: most frequent

Average
• Is a general description of a group as a whole
• Already demonstrated

Mode
• The mode is the most frequent score
• Some distributions have two modes (bimodal); some have more than two

Median
• Important concept in understanding certain scores
• The median is the point where 50% of test takers are above the point and 50% of test takers are below
the point

Mean
• The mean is the same thing as the average
• It is symbolized by [symbol not reproduced]

Measures of Dispersion
• There are three measures of dispersion
– Range
– Variance
– Standard deviation

Range
• Is the distance between the extremes, the highest and lowest scores
• A crude measure of dispersion because it uses only two data points
• What if the scores were 35, 88, 89, 94, 95, 98, 99, 97, 96?
• The range is 64 (99 − 35), but it is deceiving; the scores are really clustered in the high 80s and 90s

Variance
• Shows how far numbers are spread from the mean
• Some data have high variance, some have low
• Already saw video on this

Standard Deviation
• It is the positive square root of the variance
• It is often used as a unit of measurement, similar to an inch or a foot
• When scores are equal interval, standard deviation

Standard Deviation
• On a normal (bell) curve, on each side of the mean, about 34% of scores fall within the first SD
– About 13.6% fall between the first and second SD
– About 2.1% fall between the second and third SD
– About 0.13% fall beyond the third SD

Correlation
• Measures how much two variables are related. The value of one variable can predict the other; when one occurs, so does the other
• Positive correlation: when one variable increases, the other variable increases
• Negative correlation: when one variable increases, the other decreases

Correlation
• Correlation coefficient: measured by a number from −1 to 1. It indicates the direction and strength of the relationship. Coefficients of .10 to .30 are small, .40 to .60 are moderate, and .80 to 1.0 are high; the sign gives the direction

Objective vs. Subjective Scoring
• Objective: based on observable qualities, not on emotion
• Subjective: based on qualities that rely on personal impression

Activity
• List the strengths and weaknesses of subjective and objective scoring

Summarizing Student Performances
• As you assess students, you have a number of ways to determine whether they understand the knowledge you are interested in assessing.
• Some are very simple and provide limited amounts of information, while others are more detailed and relay a variety of information

Activity
• Meet in your groups
• As I describe the different methods for summarizing student data, think of examples of when each would be used, or of a test that uses it
• Save these and we will share them at the end

Dichotomous Scoring
• Used when an item has one right or wrong answer
• It either is right, or it is wrong

Partial Answer
• Partial-answer scoring is used when there are multiple steps and you are looking for more than a right or wrong answer
• The criteria could include the steps or procedure used in the process, in addition to the right answer

Summary Index
• Used when you have multiple items and are concerned with performance on all of the items as a whole
• The sum of the correct items is the most basic summary index

Other Summary Indexes
• Often a raw score alone does not provide enough information
• Raw scores are then converted into more meaningful scores
• There are five basic types
– Percent correct
– Percent accuracy
– Rate of correct responses
– Fluency
– Retention

Percent Correct
• Divide the number correct by the number possible, then multiply by 100
– 40 correct, 50 possible: 40 ÷ 50 = 0.80; 0.80 × 100 = 80%
– Usually used when students have ample time to complete the test (a power test)
– Very frequently used

Accuracy
• The number of correct responses divided by the number of attempted responses, multiplied by 100
– 150 ÷ 175 ≈ 0.86; 0.86 × 100 = 86% correct
• Yields a higher score than percent correct when not every item is attempted; reveals whether a child has a skill but lacks automaticity or processing speed
• Provides important information for intervention.
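The two percentage indexes above differ only in their denominator, which a short Python sketch can make concrete. The function names and the error handling are illustrative assumptions, not part of the slides; the numbers reuse the slide examples.

```python
def percent_correct(num_correct: int, num_possible: int) -> float:
    """Percent correct: correct responses relative to all items on the test."""
    if num_possible <= 0:
        raise ValueError("num_possible must be positive")
    # Multiply before dividing so round cases such as 40 of 50 come out exact.
    return num_correct * 100 / num_possible


def percent_accuracy(num_correct: int, num_attempted: int) -> float:
    """Accuracy: correct responses relative to attempted responses only."""
    if num_attempted <= 0:
        raise ValueError("num_attempted must be positive")
    return num_correct * 100 / num_attempted


if __name__ == "__main__":
    print(percent_correct(40, 50))            # slide example: 40 of 50 possible
    print(round(percent_accuracy(150, 175)))  # slide example: 150 of 175 attempted
```

Because a student can attempt at most as many items as the test contains, accuracy is never lower than percent correct for the same number of correct responses.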
Accuracy
• Often these percentages are given labels
– Above 90%: mastery
– Below 90%: non-mastery
– The labels are somewhat arbitrary
• The other form is labeling them as an instructional level
– 95% or above: independent
– Between 85% and 94%: instructional
– Below 85%: frustration

Accuracy
• Independent: the point where a student can perform without assistance
• Instructional: the point that has enough challenge that a student is likely to be successful, but is not guaranteed success
• Frustration: the level that is too difficult for a student

Fluency
• The number of correct responses per minute
• Measures what a child can do automatically, what is at their fingertips

Retention
• Refers to the percentage of learned information that is recalled
• Sometimes called maintenance or recall of what has been learned
• Calculated the same way
– Divide the number recalled (15) by the number originally learned (20)
– 15 ÷ 20 = 0.75; 0.75 × 100 = 75% retained

Feedback
• Dichotomous Scoring
• Partial Answer
• Summary Index
– Percent correct
– Percent accuracy
– Rate of correct responses
– Fluency
– Retention

Interpretation of Test Performance
• There are three common ways to interpret student performance
– Criterion-Referenced Interpretations
– Achievement Standards-Referenced Interpretations
– Norm-Referenced Interpretations

Criterion-Referenced Interpretations
• When you are interested in a student's knowledge of a single fact, you compare the performance against an objective and absolute standard (criterion) of performance
• To be considered criterion-referenced, there must be a clear, objective response to each portion of the question if partial credit is to be given

Achievement Standards-Referenced Interpretations
• In large-scale assessments, school districts measure the degree to which they are meeting state and national achievement levels.
• To do so, the indices consist of four components
– Level of performance
– Objective criteria
– Examples
– Cut scores

Achievement Standards-Referenced Interpretations
• Level of performance
– A range of levels is assigned to bands of performance: below basic, basic, proficient, advanced
• Objective criteria
– Each level of performance is described by a precise, objective description that can be quantified
• Examples
– Examples of student work at each level
• Cut scores
– These scores delineate student performance at each level

Norm-Referenced Interpretations
• Sometimes testers are interested in how their students compare to the performance of other students, usually those with similar demographic characteristics
– Grade, age, gender, and so forth
• To make these comparisons, the students' scores are transformed into derived scores

Derived Scores
• Two types
– Developmental scores
– Relative standing scores

Derived Scores
• Developmental scores: there are two types
– Developmental equivalents
– Developmental quotients

Derived Scores
• Developmental equivalents may be based on age or grade
• Developmental equivalents are based on the average performance of individuals in a given grade or age
• If the average fifth grader got 25 correct on the test, and Jim scored 25, he would have a grade equivalent of fifth grade
• Grade equivalents are expressed as decimals: 5.3 means fifth grade, third month
• Age equivalents are expressed in years and months with a hyphen: 7-11 means seven years, 11 months

Interpreting Scores
• Often scores are misinterpreted
• They should be interpreted as the average performance of that age level
• There are five problems with these scores
– Systematic misrepresentation
– Need for interpolation and extrapolation
– Promotion of typological thinking
– Implication of a false standard of performance
– Tendency for scales to be ordinal, not interval

Interpreting Scores
• Systematic misrepresentation
– Because a child earns a 12-0 age equivalent, it does not mean the child performed as a 12-0 child. They got the same score, but they may have attacked the problems differently
– A younger child and an older child could get the same score but could have gone about getting it very differently

Interpreting Scores
• Need for interpolation and extrapolation
– The scores that students are compared to are estimates; there may not be a sample for each grade, age, and month
– Levels were determined by estimates based on the students who did take the test

Interpreting Scores
• Promotion of typological thinking
– The logic behind these equivalents is that average children perform the same
– What is an average child? One who lives in a family with 1.2 siblings, 2.3 cars, and 0.8 dogs?
– In other words, there is no average family

Interpreting Scores
• Implication of a false standard of performance
– Often the performance level stated is not accurate
– Think of how you get an average
– Look at this set of numbers: 95 + 90 + 85 + 80 + 75 + 70 + 65 = 560; 560 ÷ 7 = 80

Interpreting Scores
• The average in this case is 80
• 80 would be the score the equivalent is based on (though a real norm uses many more numbers)
• About half of the scores are higher; therefore, is it really representative as an age or grade equivalent?

Interpreting Scores
• Tendency for scales to be ordinal, not equal interval
– Because the scales are ordinal, with unequal distances between levels, you cannot add, subtract, or average them

Interpreting Scores
• When interpreting a score, you must always relate it to chronological age
• You need to compare MA (mental age) and CA (chronological age) simultaneously
• IQ = (MA ÷ CA) × 100
• A developmental age of 120 months is great for a fourth grader but bad for an eighth grader

Interpreting Scores
• Percentile scores indicate the percentage of test takers who scored at or below a given score
• A raw score (58) converts to a percentile rank of 68.
This means that this person scored the same as or better than 68% of the test takers
• Scores can range from .1 to 99.9
• The 50th percentile is the median

Interpreting Scores
• Percentile scores are sometimes presented in bands; the two most common are deciles and quartiles (the PSSA uses quintiles)
– Deciles are bands 10 percentile ranks wide
– The first decile is .1 to 9.9
– The second is 10 to 19.9
– The tenth is 90 to 99.9

Interpreting Scores
• Quartiles are percentile bands 25 percentiles wide
– First quartile: .1 to 24.9
– Second quartile: 25 to 49.9

Interpreting Scores
• Percentile scores are great for comparing a child with others. They let you see how a student is doing in math relative to other students
• Done all the time with heights and weights
• You cannot compare across tests; you cannot say that Tom is 10 percentile points better in reading than in math
• You also need to interpret them in context
• See next slide

Activity
• If a child is at the 40th percentile locally and the 85th percentile nationally, is this the sign of a high- or low-achieving school district?
• You are in a meeting. Describe to a parent what this means

Activity
• It means that 60% of the students in the district scored at or above the 85th percentile nationally.
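The percentile rank and the decile and quartile bands described in the Interpreting Scores slides can be sketched in Python as follows. This is a minimal illustration under stated assumptions: it uses the "same or better than" definition from the slides, the norm group is made-up data, the function names are hypothetical, and a rank of exactly 100 is clamped into the top band. Published tests use their own norm tables and rounding rules.

```python
def percentile_rank(score: float, norm_scores: list[float]) -> float:
    """Percent of norm-group scores that are at or below the given score."""
    if not norm_scores:
        raise ValueError("norm_scores must not be empty")
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return at_or_below * 100 / len(norm_scores)


def decile(rank: float) -> int:
    """Band 10 percentile ranks wide: 1 covers up to 9.9, 10 covers 90 and up."""
    return min(int(rank // 10) + 1, 10)


def quartile(rank: float) -> int:
    """Band 25 percentile ranks wide: 1 covers up to 24.9, 4 covers 75 and up."""
    return min(int(rank // 25) + 1, 4)


if __name__ == "__main__":
    # Hypothetical norm group of ten raw scores (illustrative data only).
    norm_group = [52, 61, 67, 70, 74, 78, 81, 85, 90, 96]
    rank = percentile_rank(78, norm_group)
    print(rank, decile(rank), quartile(rank))
```

The same raw score gets a different rank against a different norm group, which is why the local and national percentiles in the activity can differ so much.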
Interpreting Scores
• Standard scores
– There are many different types of standard scores
– T-scores and z-scores are the most popular

Interpreting Scores
• Standard scores are scores with a predetermined mean and standard deviation
• The most common is the z-score
• The mean is 0
• The standard deviation is 1
• Positive scores are above the mean and negative scores are below the mean
• The larger the absolute value, the farther you are from the mean

Interpreting Scores: z-score

Interpreting Scores: T-score
• T-scores have a mean of 50 and a standard deviation of 10
• 60 is one standard deviation above the mean

Interpreting Scores: Standard Scores
• IQ scores are standard scores with a mean of 100 and a standard deviation of 15

Interpreting Scores: Standard Scores
• As I mentioned, standard scores have predetermined means and standard deviations
• Figure out the mean and standard deviation for SAT math or reading

Interpreting Scores: Standard Scores
• Standard scores are used frequently, and the mean and standard deviation can be set arbitrarily
• SAT: mean = 500; +1 SD = 600, +2 SD = 700, +3 SD = 800; −1 SD = 400, −2 SD = 300, −3 SD = 200
• This is why the lowest you can get on any SAT section is 200

Interpreting Scores: Standard Scores
• NCE: normal curve equivalents are standard scores with a mean of 50 and a standard deviation of 21.06

Interpreting Scores: Standard Scores
• NCE: a normal curve equivalent allows meaningful comparison between different test sections within a test. For example, if a student receives NCE scores of 53 on the Reading test and 45 on the Mathematics test, you can correctly say that the Reading score is eight points higher than the Mathematics score.
• NCEs are represented on a scale of 1 to 99. This scale coincides with the national percentile scale at 1, 50, and 99.

Interpreting Scores: Standard Scores
• NCEs have the advantage of being based on an equal-interval scale.
That is, the difference between two successive scores on the scale is the same over all parts of the scale. This means that, unlike percentiles, you can average NCE scores to compare groups of students.
• You can also convert average NCEs to national percentiles (NPs) for a more meaningful understanding of the scores. This is because NCEs and NPs have a consistent relationship

Norms
• Normative groups allow us to compare one person's performance to the performance of others
• It is important to know what norm group you are being compared to
• The assumption always is that you are being compared to the general population; make sure

Norms
• Take our example
• It means that 60% of the students in the district scored at or above the 85th percentile nationally.
• What help is this standardized test in placing students in a high math class?
• Be sure you know the characteristics of the norm group and how it compares to what you are using it for

Characteristics of a Norm Group
• Gender
– Girls develop physically before and faster than boys
– After puberty, it shifts
– Some say there are different role expectations
– Even though people see differences, on most tests gender differences are small
– If differences are pronounced, then you should have different norm groups

Characteristics of a Norm Group
• Age
– Is important; different abilities develop at different times
– Younger (preschool) children show differences every couple of months; once children are school age, a few months should not make a difference. Look for differences every six months to a year

Characteristics of a Norm Group
• Grade
– All achievement tests should use this
– Some subjects might be off, but math and reading should be accurate

Characteristics of a Norm Group
• Acculturation
– Very imprecise concept; should not be relied on

Characteristics of a Norm Group
• Race and culture
– An important concept to understand and watch for, but in some cases there is not a lot you can do with it
• There is an overrepresentation of certain groups; be
aware of this
• Yet if there are needs, you need to provide support
– Steps have been taken to correct this problem

Characteristics of a Norm Group
• Geography
– I never have considered this
– There are differences between different areas of the country

Characteristics of a Norm Group
• Proportional representation: there should be representation for each subgroup. K-12 is 13 groups; split by male and female, that is 26
• Must be up to date: you do not want data from the 1950s

Characteristics of a Norm Group