NWEUG
2015
STUDENT RETENTION
PREDICTION
USING DATA MINING TOOLS AND
BANNER DATA
Admir Djulovic
Dennis Wilson
Eastern Washington University
Business Intelligence
Coeur d’Alene, Idaho
SESSION RULES OF ETIQUETTE

Please turn off your cell phone/pager

If you must leave the session early, please do so as discreetly as
possible

Please avoid side conversations during the session
Thank you for your cooperation!
INTRODUCTION

Focus: Why are first-time freshmen leaving in their first year?

Benefits of attending this session

You will learn how we use Banner and Data Mining tools to identify
students at risk

Learn about factors that influence student retention

We will share our results and findings
AGENDA
1. Why are first-time freshmen leaving in their first year?
2. Retention Data Mining Model Creation
3. Results and Findings
4. Future Work
5. Questions
WHY STUDENTS ARE LEAVING IN THE
FIRST YEAR?


What are the factors that cause students to leave the university?

Pre-enrollment Information (i.e. SAT and ACT test scores)

Poor academic performance

Financial hardship
We want to determine the data-driven factors that influence student
retention
RETENTION DATA MINING MODEL
CREATION
• The model uses existing student and financial data in Banner to predict
how many first-time freshmen will or will not return the following Fall term
• Determine which student attributes would provide the greatest
benefit within these constrained categories:
• Pre-enrollment information
• Financial Information
• Housing Information
• Financial Aid Information
• Determine which predictive data mining algorithms to use
STUDENT ATTRIBUTES USED TO BUILD
THE MODEL



Special Attributes

ID – unique record identifier

RETAINEDNXTYR (Known Outcome/Target variable): Student retained next
year (0: No, 1: Yes)
Pre-Enrollment Attributes

Age

Gender

SAT Scores in Reading, Math and Writing

Previous GPA (typically high school GPA)
Term Related Attributes

Account Balance

Cumulative GPA

Successive term GPA

Living on or off campus

Financial aid received or not
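The attribute list above could be held as one record per student. The sketch below is only an illustration: the class name and field names are hypothetical and are not Banner column names; only RETAINEDNXTYR and its 0/1 coding come from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StudentRecord:
    """One row of model input (hypothetical field names)."""

    # Special attributes
    student_id: str                    # ID: unique record identifier
    retained_next_year: Optional[int]  # RETAINEDNXTYR: 0 = No, 1 = Yes, None = unknown

    # Pre-enrollment attributes
    age: int
    gender: str
    sat_reading: int
    sat_math: int
    sat_writing: int
    previous_gpa: float                # typically high school GPA

    # Term-related attributes
    account_balance: float
    cumulative_gpa: float
    term_gpa: float
    lives_on_campus: bool
    received_financial_aid: bool

    def is_labeled(self) -> bool:
        """True when the outcome is known, so the row can be used for training."""
        return self.retained_next_year is not None


# Made-up values, for illustration only.
example = StudentRecord(
    student_id="S0001", retained_next_year=1,
    age=18, gender="F", sat_reading=520, sat_math=540, sat_writing=500,
    previous_gpa=3.4, account_balance=0.0, cumulative_gpa=3.1, term_gpa=3.0,
    lives_on_campus=True, received_financial_aid=True,
)
```

Rows for a cohort whose outcome is not yet known would carry `retained_next_year=None` and be scored rather than used for training.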
Table 1: Normalized Weights of
Independent Variables Using
Relief Statistical Method
(All weights above 0.5 are deemed
important in determining student
retention.)
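To make the Relief weighting concrete, here is a minimal sketch for a binary target. It makes several assumptions that the slides do not state: every instance is visited once (classic Relief samples instances at random), distances are Manhattan distances on min-max scaled features, and the final weights are min-max normalized to [0, 1] before applying the 0.5 cutoff. The sample data and feature names are made up.

```python
def relief_weights(rows, labels):
    """Raw Relief weights for numeric features and a binary label."""
    n_features = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(n_features)]
    highs = [max(r[j] for r in rows) for j in range(n_features)]
    spans = [(hi - lo) or 1.0 for lo, hi in zip(lows, highs)]
    # Scale every feature to [0, 1] so differences are comparable.
    scaled = [[(r[j] - lows[j]) / spans[j] for j in range(n_features)] for r in rows]

    def distance(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))

    weights = [0.0] * n_features
    for i, row in enumerate(scaled):
        hits = [scaled[k] for k in range(len(scaled)) if k != i and labels[k] == labels[i]]
        misses = [scaled[k] for k in range(len(scaled)) if labels[k] != labels[i]]
        if not hits or not misses:
            continue
        near_hit = min(hits, key=lambda other: distance(row, other))
        near_miss = min(misses, key=lambda other: distance(row, other))
        for j in range(n_features):
            # Reward features that differ on the nearest miss,
            # penalize features that differ on the nearest hit.
            weights[j] += abs(row[j] - near_miss[j]) - abs(row[j] - near_hit[j])
    return [w / len(rows) for w in weights]


def normalize(weights):
    """Min-max normalize weights to [0, 1] (an assumed normalization)."""
    lo, hi = min(weights), max(weights)
    span = (hi - lo) or 1.0
    return [(w - lo) / span for w in weights]


# Illustrative data: column 0 is cumulative GPA, column 1 is age.
feature_names = ["cumulative_gpa", "age"]
rows = [(3.5, 20), (3.6, 19), (3.4, 18), (1.5, 20), (1.6, 18), (1.4, 19)]
labels = [1, 1, 1, 0, 0, 0]  # RETAINEDNXTYR

normalized = normalize(relief_weights(rows, labels))
important = [name for name, w in zip(feature_names, normalized) if w > 0.5]
```

In this toy data GPA separates the two classes and age does not, so only the GPA column passes the cutoff.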
Table 2: Normalized Weights of
Independent Variables Using
Information Gain Statistical
Method
(All weights above 0.5 are deemed
important in determining student
retention.)
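Information gain measures how much knowing an attribute reduces the entropy of the target. The sketch below computes it for a discrete attribute; continuous attributes such as GPA or balance would first need to be binned, and how the presenters handled that is not stated in the slides. Function names and sample values are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of the target minus its entropy after splitting on the attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Illustrative data only.
retained = [1, 1, 0, 0]         # RETAINEDNXTYR
lives_on_campus = [1, 1, 0, 0]  # perfectly aligned with the target
gender_flag = [1, 0, 1, 0]      # unrelated to the target

gain_campus = information_gain(lives_on_campus, retained)
gain_gender = information_gain(gender_flag, retained)
```

An attribute that splits the cohort into pure groups earns the full entropy of the target as its gain, while an unrelated attribute earns none.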
Table 3: Normalized Weights of
Independent Variables Using Chi
Squared Statistics Method
(All weights above 0.5 are deemed
important in determining student
retention.)
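The chi-squared statistic compares the observed counts of each attribute value and outcome pair with the counts expected if the two were independent. A minimal sketch for discrete attributes follows; the function name and data are illustrative, and the normalization step that produces the weights in the table is not shown because the slides do not describe it.

```python
from collections import Counter


def chi_squared(values, labels):
    """Pearson chi-squared statistic for a discrete attribute against the target."""
    n = len(labels)
    observed = Counter(zip(values, labels))
    value_totals = Counter(values)
    label_totals = Counter(labels)
    statistic = 0.0
    for value, value_count in value_totals.items():
        for label, label_count in label_totals.items():
            expected = value_count * label_count / n
            statistic += (observed[(value, label)] - expected) ** 2 / expected
    return statistic


# Illustrative data only.
retained = [1, 1, 0, 0]       # RETAINEDNXTYR
received_aid = [1, 1, 0, 0]   # strongly associated with the target
gender_flag = [1, 0, 1, 0]    # independent of the target

stat_aid = chi_squared(received_aid, retained)
stat_gender = chi_squared(gender_flag, retained)
```

Larger values indicate a stronger association between the attribute and retention.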
DATA USED
• First-time, full-time freshmen – Fall cohort (could be applied to any
population)
• Cohort groups of data
• Fall 2006 – 2011 Freshmen to train the model
• Fall 2013 Freshmen to test model
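The cohort split described above can be sketched as a filter on the entry year. The `cohort_year` key is a hypothetical field name, and the records are made up.

```python
TRAIN_COHORT_YEARS = range(2006, 2012)  # Fall 2006 through Fall 2011
TEST_COHORT_YEAR = 2013                 # Fall 2013


def split_by_cohort(records):
    """Split records into training and test sets by Fall cohort year."""
    train = [r for r in records if r["cohort_year"] in TRAIN_COHORT_YEARS]
    test = [r for r in records if r["cohort_year"] == TEST_COHORT_YEAR]
    return train, test


# Illustrative records only.
records = [
    {"id": "A", "cohort_year": 2006},
    {"id": "B", "cohort_year": 2011},
    {"id": "C", "cohort_year": 2012},
    {"id": "D", "cohort_year": 2013},
]
train, test = split_by_cohort(records)
```

Splitting by cohort rather than at random keeps the test set strictly later in time than the training data, which mirrors how the model would be used on a new incoming class.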
ALGORITHM SELECTION
• The following predictive algorithms have been used in many research papers
TRAINING THE MODEL USING
HISTORICAL DATA
• Historical Data:
   • From 2006 through 2012
• Test Data:
   • 2013 Academic Year
MODEL(S) TRAINING AND TESTING
PHASE
MODEL(S) ACCURACY
MODEL(S) ACCURACY CONT.
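The accuracy figures from these slides did not survive in the transcript. As a reminder of how such a figure is usually computed, the sketch below compares known outcomes with predictions for a held-out cohort. The function names and the sample values are illustrative, not the presenters' results.

```python
def confusion_counts(actual, predicted):
    """Count outcomes, treating 1 (retained) as the positive class."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1
        elif a == 0 and p == 0:
            counts["tn"] += 1
        elif a == 0 and p == 1:
            counts["fp"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy(actual, predicted):
    """Share of students whose RETAINEDNXTYR value was predicted correctly."""
    counts = confusion_counts(actual, predicted)
    return (counts["tp"] + counts["tn"]) / len(actual)


# Illustrative values only.
actual = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1]
counts = confusion_counts(actual, predicted)
score = accuracy(actual, predicted)
```

Because most freshmen are retained, overall accuracy alone can look good even when few at-risk students are caught, so the individual counts are worth inspecting as well.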
APPLYING THE MODEL(S) USING THE
NEW DATASET
APPLYING MODELS USING NEW
DATASET

Academic Year 2013-2014
RESULTS AND FINDINGS

Winter Balance vs RETAINEDNXTYR (0:No; 1:Yes)

Winter Living on Campus vs RETAINEDNXTYR (0:No; 1:Yes)

Winter Received Financial Aid vs RETAINEDNXTYR (0:No; 1:Yes)
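The charts for these comparisons are not in the transcript. A comparison of a yes/no attribute against RETAINEDNXTYR can be sketched as a retention rate per group. The attribute key `on_campus` is a hypothetical name and the records are made up, so the numbers say nothing about the actual findings.

```python
from collections import defaultdict


def retention_rate_by_group(records, attribute):
    """Return the share of retained students for each value of the attribute."""
    totals = defaultdict(int)
    retained = defaultdict(int)
    for record in records:
        group = record[attribute]
        totals[group] += 1
        retained[group] += record["RETAINEDNXTYR"]  # 0: No, 1: Yes
    return {group: retained[group] / totals[group] for group in totals}


# Illustrative records only.
records = [
    {"on_campus": True, "RETAINEDNXTYR": 1},
    {"on_campus": True, "RETAINEDNXTYR": 1},
    {"on_campus": False, "RETAINEDNXTYR": 1},
    {"on_campus": False, "RETAINEDNXTYR": 0},
]
rates = retention_rate_by_group(records, "on_campus")
```

The same function works for the financial aid flag; a continuous attribute such as the Winter balance would need to be grouped into ranges first.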
HOW COULD THIS RETENTION MODEL HELP?


Provide early warning of students at risk

Lists can be provided to different offices for student outreach

Improve student retention
Use it to forecast future student retention
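The uses listed above can be sketched in a few lines, assuming the model outputs a per-student probability of not returning. That output format, the function names, and the 0.5 risk threshold are all assumptions made for illustration.

```python
def build_outreach_list(risk_by_student, threshold=0.5):
    """Return (student_id, risk) pairs at or above the threshold, highest risk first."""
    flagged = [(sid, risk) for sid, risk in risk_by_student.items() if risk >= threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


def expected_retained(risk_by_student):
    """Forecast the number of returning students as the sum of return probabilities."""
    return sum(1.0 - risk for risk in risk_by_student.values())


# Illustrative scores only: probability that each student does NOT return.
risk_by_student = {"A": 0.9, "B": 0.2, "C": 0.6}
outreach = build_outreach_list(risk_by_student)
forecast = expected_retained(risk_by_student)
```

The ranked list is what an advising or financial aid office would receive for outreach, and the forecast gives an enrollment estimate for the following Fall term.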
EXAMPLES

Not returning due to low GPA

(0:No; 1:Yes)
EXAMPLES CONT.

Not returning due to a high balance

(0:No; 1:Yes)
FUTURE WORK
Attributes for future consideration
• Student Attendance List
• Student Credit Hours
• Repeat Class Indicator
• Types of Financial Aid
• Major
• College
• Residency
• Other Attributes
SESSION SUMMARY

We have demonstrated how Banner data and data mining tools
are used to identify students at risk

We have demonstrated how predictive models are created and
how they work


Factors that contribute to a student’s dropping out

Data mining Algorithms used
We have demonstrated how retention models can be used as an early
warning system to identify students at risk
QUESTIONS & ANSWERS
THANK YOU!
Admir Djulovic, Dennis Wilson