WHO ARE YOUR AT-RISK STUDENTS?
USING DATA MINING TO TARGET
INTERVENTION EFFORTS
Lalitha Agnihotri, Ph.D., Senior Systems Analyst, DWH
Alex Ott, Ed.D., Associate Dean, Academic & Enrollment Services
Niyazi Bodur, Ph.D., VP, Information Technology & Infrastructure
New York Institute of Technology
EDUCAUSE Annual Conference
October 16th, 2013
Presentation Description and Goals
• Learn how to improve targeted intervention by building a model to identify and classify at-risk students using data at your institution.
• Gain an understanding of the complete life cycle of the At-Risk Student Identification Model.
Targeted Intervention for At-Risk Students
The Goal: Early targeted intervention based on risk factors for each at-risk student to improve retention
Rationale for Key Elements:
• Early
• Targeted intervention
• Risk factors for each student
Before the Model, All We Had
Was…
Students At Risk (STAR) Model
Version 1.0
Data sources:
• Admissions data
• Registration/Placement test data
• Survey data
Method:
• Combine all risk variables into an aggregated measure.
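The aggregation step in Version 1.0 can be pictured as a simple count of risk flags. The sketch below is only an illustration: the flag names and the example student are hypothetical, and the slides do not give the actual formula beyond saying that the risk variables were combined and equally weighted.

```python
# Hypothetical risk flags; the real STAR 1.0 variables are not listed in the slides.
RISK_FLAGS = ["low_placement_score", "late_registration", "unsure_of_major"]


def aggregate_risk(student: dict) -> int:
    """Equally weighted aggregate: each flag that is set adds 1 to the score."""
    return sum(1 for flag in RISK_FLAGS if student.get(flag, False))


# Made-up example record, for illustration only.
example = {
    "low_placement_score": True,
    "late_registration": False,
    "unsure_of_major": True,
}
score = aggregate_risk(example)
print(score)
```

Because every flag contributes the same amount, a score like this cannot express that one factor matters more than another, or that a factor might actually point toward retention.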
Version 1.0 Report Output:
Major Challenges with STAR 1.0
• Limited attributes.
• Attributes of unknown strength, relevance, or even direction.
• Attributes equally weighted.
• Static Excel document: Big effort in getting all the attributes in one place.
Data Mining
Data Mining Classification
• Given a collection of records (training set): each record contains a set of attributes, and one of the attributes is the class.
  Record layout: Student ID | Attributes | Class
• Goal: previously unseen records should be assigned a class as accurately as possible.
• Find a model for the class attribute as a function of the values of the other attributes.
• Select the model that performs the best.
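The workflow in the bullets above (train on labeled records, score on unseen records, keep the best model) can be sketched in a few lines. This is a minimal illustration using two deliberately simple model types, not the algorithms or data used in the presentation; all attribute names and records are made up.

```python
from collections import Counter

# Toy records: (attributes, class). All values are invented for illustration.
train = [
    ({"work_hours": "31+", "remedial_math": "yes"}, "not_returning"),
    ({"work_hours": "0", "remedial_math": "no"}, "returning"),
    ({"work_hours": "1-10", "remedial_math": "no"}, "returning"),
    ({"work_hours": "31+", "remedial_math": "no"}, "not_returning"),
    ({"work_hours": "0", "remedial_math": "yes"}, "returning"),
]
holdout = [  # "previously unseen" records used only for evaluation
    ({"work_hours": "31+", "remedial_math": "yes"}, "not_returning"),
    ({"work_hours": "1-10", "remedial_math": "yes"}, "returning"),
    ({"work_hours": "0", "remedial_math": "no"}, "returning"),
]


def fit_majority(records):
    """Baseline model: always predict the most common class in the training set."""
    majority = Counter(label for _, label in records).most_common(1)[0][0]
    return lambda attrs: majority


def fit_one_rule(records, attribute):
    """One-rule model: for each value of one attribute, predict its most common class."""
    by_value = {}
    for attrs, label in records:
        by_value.setdefault(attrs[attribute], Counter())[label] += 1
    table = {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}
    fallback = fit_majority(records)  # used for attribute values never seen in training
    return lambda attrs: table.get(attrs[attribute], fallback(attrs))


def accuracy(model, records):
    """Fraction of records whose predicted class matches the true class."""
    return sum(model(attrs) == label for attrs, label in records) / len(records)


candidates = {
    "majority": fit_majority(train),
    "one_rule_work_hours": fit_one_rule(train, "work_hours"),
    "one_rule_remedial_math": fit_one_rule(train, "remedial_math"),
}
scores = {name: accuracy(model, holdout) for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Scoring on records that were held out of training is what makes the comparison meaningful: a model that only memorizes the training set would look good on it and still fail on new students.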
STAR Model: Version 2.0 with Data Mining and Automated Tools
1. Built and automated the full dataset in our Data Warehouse
2. Used Data Mining tools (SQL Server Analysis Services) to train multiple dynamic statistical models
3. Enterprise solution
SQL Build Data → SSAS Modeling → DMX Prediction Query → SSRS Report
Models Trained
• Logistic Regression
• Naïve Bayes
• Neural Network
• Decision Trees
• Ensemble
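The slides do not say how the ensemble combined the individual models. One common approach, shown here purely as an assumption, is to average each model's predicted probability that a student will not return and then apply a cutoff. The probabilities and the 0.5 threshold below are hypothetical.

```python
def ensemble_probability(model_probs: dict) -> float:
    """Average the per-model probabilities that a student will not return."""
    return sum(model_probs.values()) / len(model_probs)


# Hypothetical per-model outputs for one student (not from the presentation).
probs = {
    "logistic_regression": 0.70,
    "naive_bayes": 0.55,
    "neural_network": 0.60,
    "decision_trees": 0.35,
}
p = ensemble_probability(probs)
at_risk = p >= 0.5  # the 0.5 cutoff is an assumption, not a documented setting
print(round(p, 2), at_risk)
```

Majority voting over each model's predicted class is an equally plausible alternative; which one was used here is not stated.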
Data Mining Knowledge Discovery:
BIG Picture
Attribute | Value
Work Hours Per Week | 31+
Work Hours Per Week | 0
Major Certainty | Not Sure
Career Goal Certainty | Not sure
Stafford Loan Amount | >$5,900
Remedial Math | Registered
Completion Plan | Undecided
NYITScholarshipAmount | >$13,000
Work Hours Per Week | 1-10
Developmental Math | Not Registered
CareerGoalCertainty | Fairly Sure
College Reading Strategies | Registered
(Chart columns: Favors Students Returning | Favors Students Not Returning. The bars showing which outcome each value favors are not preserved in this transcript.)
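One common way to decide which outcome an attribute value "favors" is to compare how often the value occurs among returning students with how often it occurs among non-returning students. The sketch below illustrates that idea only: the slides do not describe how the chart was computed, and the records are invented toy data that do not reproduce the presentation's findings.

```python
from collections import Counter

# Toy (work_hours, outcome) pairs, invented for illustration.
records = [
    ("31+", "not_returning"), ("31+", "not_returning"), ("31+", "returning"),
    ("0", "returning"), ("0", "returning"), ("0", "not_returning"),
    ("1-10", "returning"), ("1-10", "returning"),
]


def favors(records, value):
    """Return the class in which `value` is relatively more common."""
    class_totals = Counter(outcome for _, outcome in records)
    value_counts = Counter(outcome for v, outcome in records if v == value)
    # Rate of this value within each class, i.e. an estimate of P(value | class).
    rates = {c: value_counts[c] / class_totals[c] for c in class_totals}
    return max(rates, key=rates.get)


for value in ("31+", "0", "1-10"):
    print(value, "favors", favors(records, value))
```

Comparing rates within each class, rather than raw counts, keeps the larger class from dominating simply because it has more students.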
Data Mining Knowledge Discovery:
Detailed Picture
Model Significance And Results
 | Version 1.0 | Version 2.0
Foundation | Desktop | Enterprise
Methodology | Manual calculation | Data mining, combining more than one method
Data | Manually collected local data | Enterprise data
Number of Variables and Strengths | Limited and equally weighted | Fairly high; weights depend on the data mining model
Discovery | Indication of whether a student is returning or not | Uncovers the big picture for the University and the individual variables for each student, based on the prediction of returning or not

Model Name | Recall | Precision | Accuracy
Manual | 34% | 42% | 59%
Logistic Regression | 64% | 55% | 68%
Neural Network | 54% | 55% | 67%
Naïve Bayes | 49% | 67% | 73%
Decision Trees | 39% | 72% | 72%
Ensemble | 75% | 56% | 69%
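For readers less familiar with the three measures, the sketch below shows how recall, precision, and accuracy are computed from a confusion matrix. It assumes the positive class is "student does not return" (the slides do not state this explicitly), and the counts are made up for illustration rather than taken from the presentation.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute recall, precision, and accuracy from confusion-matrix counts.

    tp: at-risk students correctly flagged
    fp: students flagged who actually returned
    fn: at-risk students the model missed
    tn: returning students correctly left unflagged
    """
    total = tp + fp + fn + tn
    return {
        "recall": tp / (tp + fn),        # share of truly at-risk students that were caught
        "precision": tp / (tp + fp),     # share of flagged students that were truly at risk
        "accuracy": (tp + tn) / total,   # share of all students classified correctly
    }


# Made-up counts for 100 students, for illustration only.
metrics = classification_metrics(tp=30, fp=20, fn=10, tn=40)
print(metrics)
```

For an intervention program, recall is often the measure to watch: a missed at-risk student receives no outreach, whereas a false alarm only costs some extra advising time.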
So How Did the Model
Actually Perform?
Model Name: | 2011 Fall New Students Data | 2012 Fall New Students Data
Recall: | 75% | 73%
Precision: | 56% | 54%
Accuracy: | 69% | 59%
Key Takeaways
• Success depends on productive partnership between IT and business.
• Data is the KEY.
• Data mining is a process.
• Select attributes based on (retention) research and particulars of your school.
Questions?
• Lalitha Agnihotri, [email protected]
• Alexander Ott, [email protected]
• Niyazi Bodur, [email protected]