Download Introduction - UNT College of Education

Data Analysis Instructional Module
Introduction
This module is an example designed to help 5th grade math teachers analyze data from
math scores and use the information to improve teaching strategies. It is intended to
improve school scores and increase student performance, and it gives teachers the
opportunity to exercise initiative, self-direction, and creativity. The module takes you
through the data analysis step by step so that you can create your own analysis of your
classes.
Data Analysis
When looking at state assessment data, we must examine the information with several
different analysis tools. We will be looking at descriptive analysis, correlation,
t-tests, z scores, ANOVA, growth models, regression models, and score plotting.
The Kentucky School Testing System is the state assessment system for the
Commonwealth of Kentucky. Currently, the components are the Kentucky Core Content
Test (KCCT), Alternate Assessment, the ACT, PLAN and the nonacademic component.
Schools and districts receive the Interim Performance Report (IPR), which provides
detailed information and performance results. Electronic files containing the data that
appear on the IPR are available. This information is helpful for a math department's
self-improvement plan; however, more detailed information is also needed. Therefore, we
must look at the assessment data of students on an individual basis as well.
Kentucky Assessment Information
The purpose of the KCCT assessment is to measure the students’ mastery of the
state curriculum. The results are used to hold districts, schools, and educators
accountable.
The summative data is received too late in the school year for us to take action
and make changes. Assessment should be used to provide immediate feedback that helps
teachers understand each child's knowledge and deficiencies, rather than for general
school accountability statistics. The test simply shows us what was learned throughout
the year.
Analysis of Our School as an Organization
Before we begin change in our school, we should analyze our organization and
decide whether what we have been doing is working or whether an intervention is needed
and a change for the better should be made. We should look into getting away from the
bureaucratic system and moving towards a learning organization. A learning organization
will allow us to adapt to new challenges and continue to evolve. It will also utilize our
teachers by harnessing the conscious thinking of individuals in achieving the goals of
the organization.
Change
After analyzing our school as an organization, it is time to consider making
changes to improve our students' achievement. The significant changes proposed for the
math department are listed below.

All teachers use the same curriculum

Curriculum will be based on the Common Core Standards

Students will take a series of benchmark/interim (MAP) tests to ensure
each student is on track. Also, benchmark tests will be designed to
correlate with the KCCT

Feedback will be given to students and parents on a regular basis

The assessment will guide the classroom lesson plans
Formative Assessments
Before we begin to analyze data from test scores we must first understand the
importance of quality assessments, which are designed to help our school assess the
progress of each student. Notice this is a different goal from school accountability.
Analysis of test scores is only effective if the school provides consistent,
well-written benchmark tests that are aligned to the standards, which in turn are
aligned to the end-of-year test. The benchmark tests should also be vertically linked
from year to year. These tests should be given throughout the year to determine whether
a student is staying on track and progressing. Another important quality of a good
formative assessment is immediate feedback. Students, teachers, and parents
must know immediately if the student needs help in a specific area. Once quality
formative assessments are in place, data analysis of the scores will be a very effective
way to improve student achievement.
Standard deviation is a useful tool for measuring the spread of student achievement.
Most students should score in the middle range; however, the tests should also be able
to distinguish high and low abilities. This can be achieved by creating tests at the
correct difficulty level, so that the test provides information on all levels of
performance.
Examples of Data Analysis from Assessment Systems
Steps to Retrieve Data from Eduphoria – A regional system used by school
districts
1. Go to https://eduphoria.friscoisd.org
2. Log in using your user name and password
3. Click on Aware
4. Scroll down the left column and click on 5th grade math
5. Open the class or classes you are going to analyze
6. On the top right of the screen select student objective break down
7. Once the information is retrieved select export to Excel at the top of the web page
8. Repeat these procedures for as many classes as you wish to analyze
9. Your data should now be in Excel. (See example)
Descriptive Analysis
Before we begin using Data Analysis in Excel, it must be installed. To install Data
Analysis, go to Tools and select Add-Ins.
Next, select the Analysis ToolPak and click OK.
Once the Analysis ToolPak is installed, you are ready to begin analyzing data.
To analyze descriptive data in Excel, select Tools, Data Analysis.
Select Descriptive Statistics and click OK.
In the Input Range, select the data you want to use, check Summary statistics, then press OK.
Your data should look similar to this.
Definitions of Excel’s Descriptive Statistics
Mean - A simple measure of the central tendency of the data is the mean (or average):
the sum of all values divided by the number of values.
Standard Error - The standard error of a method of measurement or estimation is the
standard deviation of the sampling distribution associated with the estimation method.
For the mean, it is the standard deviation divided by the square root of the count.
Median - The numeric value separating the higher half of a sample, a
population, or a probability distribution, from the lower half. The median of a finite list of
numbers can be found by arranging all the observations from lowest value to highest
value and picking the middle one. If there is an even number of observations, then there
is no single middle value; the median is then defined to be the mean of the two middle
values.
Mode - The value that occurs the most frequently in a data set or a probability
distribution.
Standard Deviation - The standard deviation of a statistical population, a data set, or
a probability distribution is the square root of its variance. Standard deviation is a
widely used measure of variability or dispersion, being algebraically more tractable
though practically less robust than the expected deviation or average absolute
deviation. (Wikipedia)
Sample Variance - A common measure of the spread of observations in a distribution.
The variance is equal to the square of the standard deviation.
Kurtosis - A measure of the "peakedness" of the probability distribution of a real-valued
random variable. Higher kurtosis means more of the variance is the result of infrequent
extreme deviations, as opposed to frequent modestly sized deviations.
Skewness - A measure of the asymmetry of the probability distribution of a real-valued
random variable. The skewness value can be positive or negative, or even undefined.
Qualitatively, a negative skew indicates that the tail on the left side of probability density
function is longer than the right side and the bulk of the values (including the median) lie
to the right of the mean. A positive skew indicates that the tail on the right side is longer
than the left side and the bulk of the values lie to the left of the mean. A zero value
indicates that the values are relatively evenly distributed on both sides of the mean,
typically but not necessarily implying a symmetric distribution.
Range - The simplest measure of the spread of your data is the range which tells you the
distance between your most extreme data values.
Sum – Sum of all values.
Count – Number of values.
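For readers who want to check Excel's output outside the spreadsheet, the definitions above can be sketched in Python using only the standard library. This is a minimal illustration, not the module's method: the function name `describe` and the sample scores are hypothetical, and the sketch assumes the sample (n - 1) forms of standard deviation and variance.

```python
import math
import statistics


def describe(scores):
    """Return summary statistics similar to a descriptive statistics table."""
    n = len(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
    return {
        "mean": statistics.mean(scores),
        "standard_error": sd / math.sqrt(n),  # standard error of the mean
        "median": statistics.median(scores),
        "mode": statistics.mode(scores),
        "standard_deviation": sd,
        "sample_variance": statistics.variance(scores),
        "range": max(scores) - min(scores),
        "sum": sum(scores),
        "count": n,
    }


# Hypothetical percentage scores, not taken from the module's data set.
sample_scores = [70, 80, 80, 90, 100]
summary = describe(sample_scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Kurtosis and skewness are left out of the sketch because the standard library does not provide them.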
Descriptive Statistics Analysis of State Assessment Scores 2009-2010 using Texas
Standards – Can be replaced with Kentucky or Common Core Standards
Mean scores for each objective are listed below
1. (M 1) 86.3% - Demonstrate an understanding of numbers, operations, and quantitative
reasoning.
2. (M 2) 82.2% - Demonstrate an understanding of patterns, relationships, and algebraic
reasoning.
3. (M 3) 69.8% -Demonstrate an understanding of geometry and spatial reasoning.
4. (M 4) 87.9% - Demonstrate an understanding of the concepts and uses of
measurement.
5. (M 5) 69.8% - Demonstrate an understanding of probability and statistics.
6. (M 6) 91.1% - Demonstrate an understanding of the mathematical processes and tools
used in problem solving.
Raw Score - 36.3
Percentage score - 82.5%
In conclusion, M3 and M5 are the overall deficiencies on the STATE ASSESSMENT test.
Both objectives received a failing average score (69.8%). M3 and M5 will need to be a
focus, and benchmark tests will be analyzed to determine the correlation between the
benchmarks and the objectives.
Correlation
To use the correlation function in Excel, follow the same procedure as for descriptive
statistics, except select Correlation.
The correlation coefficient tells us the strength and direction of the relationship
between 2 variables. A perfect positive correlation is 1.0. A perfect negative
correlation is -1.0.
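As a rough cross-check of the correlation coefficient, the calculation can be sketched in Python. This assumes the Pearson correlation coefficient; the function name `pearson_r` and the two score lists are hypothetical and are not taken from the module's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical objective scores and total percentage scores for four students.
objective_scores = [60, 70, 80, 90]
total_scores = [65, 72, 85, 88]
print(round(pearson_r(objective_scores, total_scores), 3))
```

A value near 1.0 means students who score high on the objective also tend to score high overall; a value near 0 means the two sets of scores are only weakly related.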
Jacob Cohen's effect size guidelines for correlation
.31 to .5 moderate effect
.5 to 1 large effect
Below is a chart to visualize correlations plotted on an x y axis. (Wikipedia)
Correlation of State Assessment Objective Scores and Total Percentage Score
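This correlation chart tells us that M3 and M5 have the lowest correlation with the final
score. A low correlation means that scores on these objectives track the total score less
closely; it does not by itself show low performance. It is the mean scores from the
descriptive analysis that identify M3 and M5 as the lowest performing objectives.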
T-Test
A T-test is used to compare the means of 2 sets of data. 3 types of T-Tests are listed
below.
T-Test: Paired Two Sample for Means - This test compares 2 measurements taken on the
same subjects. Because each value in the first sample is paired with a value in the
second, the two sample sizes must be equal.
This test is good for a before and after comparison. For example, you could test a group
of people during a weight-loss study. You are comparing the same group of people.
Comparing separate groups of data
T-Test: Two-Sample Assuming Equal Variances – This test compares 2 separate
groups of data. This test could be used to compare the test scores of males and females.
T-Test: Two-Sample Assuming Unequal Variances – This test should be used if the
standard deviations between the groups are different. Use this method if the scores look
very different.
Important data analysis
P-Value - A value below 0.05 is considered statistically significant
T-Value - The difference between the group means divided by the standard error of that
difference
http://www.socialresearchmethods.net
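To make the t value concrete, the following Python sketch computes the statistic for the unequal-variances case (often called Welch's t-test) together with its approximate degrees of freedom. The function name `welch_t` and the sample data are hypothetical. The sketch stops short of the P value, which requires the t distribution; Excel reports it in the output table.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for a two-sample t-test with unequal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # Squared standard error contributed by each group (sample variance / n).
    se_sq_a = statistics.variance(sample_a) / n_a
    se_sq_b = statistics.variance(sample_b) / n_b
    t = (mean_a - mean_b) / math.sqrt(se_sq_a + se_sq_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_sq_a + se_sq_b) ** 2 / (
        se_sq_a ** 2 / (n_a - 1) + se_sq_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical scores for the classes of two teachers.
class_a = [1, 2, 3, 4, 5]
class_b = [2, 4, 6, 8, 10]
t_value, df = welch_t(class_a, class_b)
print(f"t = {t_value:.4f}, df = {df:.2f}")
```

The larger the absolute t value, the larger the difference between the means relative to the variability in the scores.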
For this example we will compare the STATE ASSESSMENT test scores from students
of two separate teachers.
Click tools, Data Analysis, T-Test: Two Sample Assuming Unequal Variances
Variable 1 Range will be column A
Variable 2 Range will be column B
Alpha should stay at .05
Click ok
Results: The average scores are very close, and the two-tailed P value is above .05.
Therefore, the difference is not statistically significant.
The next T Test will compare test scores of males and females
We will be using T-Test: Two-Sample Assuming Unequal Variances
Click tools, Data Analysis, T-Test: Two Sample Assuming Unequal Variances
Variable 1 Range will be column A
Variable 2 Range will be column B
Alpha should stay at .05
Click ok
Results: No statistically significant difference
The next T test will compare Objective 3 from the STATE ASSESSMENT results to
Objective 3 from the benchmark test. The benchmark test is designed to align with the
STATE ASSESSMENT scores.
Click tools, Data Analysis, T-Test: Two Sample Assuming Unequal Variances
Variable 1 Range will be column B
Variable 2 Range will be column C
Alpha should stay at .05
Click ok
Results: The benchmark test is too easy
Z Score
Z Score - Indicates how many standard deviations a datum is above or below the mean. It
is a dimensionless quantity derived by subtracting the population mean from an individual
raw score and then dividing the difference by the population standard deviation. The Z
score should be implemented in STATE ASSESSMENT score analysis when comparing
several years of scores because the STATE ASSESSMENT test is not vertically linked.
See chart below (Wikipedia):
The standard score is
$$z = \frac{x - \mu}{\sigma}$$
where:
x is a raw score to be standardized;
μ is the mean of the population;
σ is the standard deviation of the population.
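The standard score calculation can be sketched in a few lines of Python. The function name `z_scores` and the sample scores are hypothetical; the sketch assumes the population standard deviation, matching the definition above.

```python
import statistics


def z_scores(scores):
    """Convert raw scores to z scores using the population mean and standard deviation."""
    mu = statistics.mean(scores)
    sigma = statistics.pstdev(scores)  # population standard deviation
    return [(x - mu) / sigma for x in scores]


# Hypothetical raw scores for eight students in one test year.
raw_scores = [2, 4, 4, 4, 5, 5, 7, 9]
for raw, z in zip(raw_scores, z_scores(raw_scores)):
    print(f"raw {raw}: z = {z:+.2f}")
```

Computing z scores separately for each year puts the years on a common scale, which is what makes a year-to-year comparison for an individual student possible.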
The Z score analysis below shows a comparison of the STATE ASSESSMENT scores
from 2009 and 2010. This chart allows us to compare 2 years of scores and analyze
students on an individual basis. A large difference between the z scores from one year to
another tells us that a student made significant change. Therefore, this chart helps us
identify students that are maintaining progress, making significant progress, or getting off
track.
Analysis of Variance - ANOVA
Analysis of variance (ANOVA) is a statistical method used to determine whether there is
a difference in means among more than two groups of data.
Select Tools, Data Analysis and select ANOVA single factor
Three different classes, organized by teacher
Results of ANOVA Single Factor
The ANOVA summary statistics tell us that Column 3, which was True's class, had the
highest average score. The P-value is .011967. This is less than .05; therefore, the
difference among the class means is statistically significant.
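The F statistic behind a single factor ANOVA can be sketched in Python as the ratio of the variation between groups to the variation within groups. The function name `one_way_anova_f` and the three small groups are hypothetical, and the sketch does not compute the P value, which requires the F distribution.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a single factor ANOVA."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (value - mean) ** 2
        for group, mean in zip(groups, group_means)
        for value in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical scores for three classes.
classes = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_value, df_between, df_within = one_way_anova_f(classes)
print(f"F = {f_value:.2f} with ({df_between}, {df_within}) degrees of freedom")
```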
Growth Model
A growth model is used to track the growth of a student over a period of time. A
growth model must have at least 3 data points to analyze. Before a growth model can be
built, the data must be organized like the example below.
Now we will find the intercept. The intercept estimates the mean score at the starting
point (where X is 0). This gives us a starting place. Next, click an empty cell to the
right of the data and type the labels Intercept and Slope. See example below.
The intercept will be calculated by using the formula
=INTERCEPT(C3:C44,B3:B44). To have Excel calculate the intercept, click on an
empty cell, then click on the fx button at the top. Select All as the category and
INTERCEPT as the function.
The known Y’s will be the test score column and the known X’s will be the years.
The slope will be calculated by using the formula =SLOPE(C3:C44,B3:B44). Click fx
and then select SLOPE.
Then select the data the same way it was selected for the intercept. Click OK. You should
now have a slope of 57.17857. This tells us that the average test score improved about 57
points per test. This is very good growth in student progress.
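The same least-squares slope and intercept that the SLOPE and INTERCEPT functions return can be sketched in Python. The function name `slope_intercept` and the three data points are hypothetical and are not the values from the module's spreadsheet.

```python
def slope_intercept(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: test number (known X's) and scale score (known Y's).
test_numbers = [1, 2, 3]
scores = [2100, 2150, 2230]
slope, intercept = slope_intercept(test_numbers, scores)
print(f"slope = {slope}, intercept = {intercept}")
```

As in the Excel formulas, the scores play the role of the known Y's and the test numbers or years play the role of the known X's.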
Next, we will calculate if a child is on track or off track. This is a growth model for an
individual student. This will be useful so that the spreadsheet will automatically update
itself once the data is entered to help identify students immediately. For this next
example, we will be using sample data.
Growth 125, Target 2400, Actual 2100, No. of Tests Remaining 2
This information will be used to calculate whether a student is on track. The formula
is =(F3)/((G3-H3)/I3), which divides the student's growth by the growth still needed per
remaining test. The outcome for this example is .833333. Anything less than 1.0 is off
track; 1.0 or greater is on track. Therefore, the student in the example is off track.
The column to the right says off track in red. This was created by applying conditional
formatting.
Once formatted, the cell that says off track will be determined by the result in cell K3.
Notice that if a variable changes, the growth for example, the off track indicator can
change to on track.
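The on track calculation can be written out as a short Python sketch using the sample values from the example. The function names `on_track_ratio` and `status` are hypothetical, and the guard against invalid inputs is an added assumption that is not part of the spreadsheet.

```python
def on_track_ratio(growth, target, actual, tests_remaining):
    """Growth per test divided by the growth still needed per remaining test."""
    if tests_remaining <= 0 or target <= actual:
        # Assumption: the ratio is only meaningful while the target is still ahead.
        raise ValueError("need tests remaining and a target above the actual score")
    required_growth = (target - actual) / tests_remaining
    return growth / required_growth


def status(ratio):
    """Label a ratio the same way as the spreadsheet's indicator column."""
    return "on track" if ratio >= 1.0 else "off track"


# Sample values from the example: Growth 125, Target 2400, Actual 2100, 2 tests remaining.
ratio = on_track_ratio(125, 2400, 2100, 2)
print(f"{ratio:.6f} -> {status(ratio)}")
```

With these values the student needs 150 points per remaining test but is growing 125 points per test, so the ratio falls below 1.0.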
Regression Model
The first regression model will be a change model used to get predicted scores. At least
two sets of scores will be needed for this analysis. To begin, click on Data Analysis, then
Regression.
The Y input range will be the STATE ASSESSMENT scores and the X input range will
be the years
Your summary output should look similar to the one below. Notice that the significance
is .0007. This is below .05; therefore, the result is statistically significant.
Next, copy the residual values and paste them next to the STATE ASSESSMENT scores
so that you can see the student, year of test, score, and residuals. The residual is the
actual score minus the predicted score. See below for example.
The next column will be the gain. The gain is (actual - predicted) / standard deviation
of residuals. For this example, the gain will be calculated by the formula
=(D2-E2)/87.47. The 87.47 is the standard deviation of the residuals.
The advantage to using a change model is to be able to analyze students on an individual
basis. Keep in mind this looks at how each individual is doing based on everyone else in
the class.
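The change model can be sketched end to end in Python: fit the regression line, take the residuals, and divide each residual by the standard deviation of the residuals. The function name `standardized_gains` and the data are hypothetical, and the sketch assumes the sample standard deviation of the residuals, which may differ slightly from the figure Excel reports in its regression output.

```python
import statistics


def standardized_gains(xs, ys):
    """Return (actual - predicted) / standard deviation of residuals for each point."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    predicted = [intercept + slope * x for x in xs]
    residuals = [y - p for y, p in zip(ys, predicted)]
    sd_residuals = statistics.stdev(residuals)  # assumed: sample standard deviation
    return [r / sd_residuals for r in residuals]


# Hypothetical data: X values (for example, years) and Y values (scores).
x_values = [1, 2, 3, 4]
y_values = [2, 4, 5, 9]
gains = standardized_gains(x_values, y_values)
for x, gain in zip(x_values, gains):
    print(f"x = {x}: gain = {gain:+.2f}")
```

A positive gain means the score was higher than the regression line predicted; a negative gain means it was lower.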
Score Plotting
Score plotting is a way to visualize student progress over time on an individual and group
basis. When plotting scores the Z score should always be used and not the raw scores.
The Z scores will give a more accurate analysis of student progress.
To begin, we must have at least 3 separate test z scores to analyze. See below
Click on insert, picture, chart
Select line and click next
The data range will be the three columns of test scores and the series will be in rows.
Click Finish
You will now have a line graph that plots student progress. This allows us to visualize
student progress individually and as a group. The 2009 data was copied to the 2011 data
for demonstration purposes only, since there must be at least 3 test scores. This is why
the lines end up in the same location where they started.