Predictive Analytics Demystified:
Techniques and Applications
Larry D. Long, Michigan State University
works.bepress.com/ldlong
@LongLarryD
Monday, March 13, 2017
Room Lone Star B – Grand Hyatt
Data Mining
Common Data Mining Approaches
• Clustering
• Associations (Pattern Mining)
• Classification & Regression
• Anomaly Detection
Supervised vs. Unsupervised
Supervised:
• The outcome variable is known
• Analysts use a “target” variable to train the
model
Unsupervised:
• The outcome variable isn’t known
• The computer looks for similarities in the data
and puts cases into groups (clustering)
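The unsupervised case above can be sketched in R with the iris data used later in this deck. This is a minimal illustration, not the presenter's analysis: the choice of k-means (base R `kmeans`), three clusters, the seed, and `nstart` are all assumptions.

```r
# Unsupervised: cluster the four measurements WITHOUT showing Species to the algorithm
data(iris)
features <- iris[, 1:4]            # measurements only; the outcome column is left out

set.seed(1)                        # assumed seed, only for reproducibility
clusters <- kmeans(features, centers = 3, nstart = 20)  # assumed: 3 groups

# Afterwards, compare the discovered groups with the known species labels
print(table(cluster = clusters$cluster, species = iris$Species))
```

A supervised model would instead be given `Species` as the target variable during training.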
Classification
Tid  Attrib1  Attrib2  Attrib3  Class
1    Yes      Large    125K     No
2    No       Medium   100K     No
3    No       Small    70K      No
4    Yes      Medium   120K     No
5    No       Large    95K      Yes
6    No       Medium   60K      No
7    Yes      Large    220K     No
8    No       Small    85K      Yes
9    No       Medium   75K      No
10   No       Small    90K      Yes
Learn Model

Tid  Attrib1  Attrib2  Attrib3  Class
11   No       Small    55K      ?
12   Yes      Medium   80K      ?
13   Yes      Large    110K     ?
14   No       Small    95K      ?
15   No       Large    67K      ?
Apply Model
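The learn/apply steps on this slide can be sketched in R. The slide does not say which classifier is used, so this sketch assumes a decision tree from the `rpart` package; the control settings are also assumptions, chosen only so a tree can split a 10-row table. Attrib3 is entered in thousands.

```r
library(rpart)  # assumed method: a decision tree; the slide names no specific one

# Cases 1-10: outcome (Class) is known
train <- data.frame(
  Attrib1 = factor(c("Yes","No","No","Yes","No","No","Yes","No","No","No")),
  Attrib2 = factor(c("Large","Medium","Small","Medium","Large",
                     "Medium","Large","Small","Medium","Small")),
  Attrib3 = c(125, 100, 70, 120, 95, 60, 220, 85, 75, 90),
  Class   = factor(c("No","No","No","No","Yes","No","No","Yes","No","Yes"))
)

# Cases 11-15: outcome is unknown ("?")
test <- data.frame(
  Attrib1 = factor(c("No","Yes","Yes","No","No"), levels = levels(train$Attrib1)),
  Attrib2 = factor(c("Small","Medium","Large","Small","Large"),
                   levels = levels(train$Attrib2)),
  Attrib3 = c(55, 80, 110, 95, 67)
)

# Learn Model: fit on the labeled cases (settings loosened for the tiny sample)
fit <- rpart(Class ~ Attrib1 + Attrib2 + Attrib3, data = train, method = "class",
             control = rpart.control(minsplit = 2, minbucket = 1, cp = 0, xval = 0))

# Apply Model: fill in the unknown Class column
test$Class <- predict(fit, newdata = test, type = "class")
print(test)
```

With only ten training rows the predictions are illustrative at best; a real model needs far more cases.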
Classification Methods
• Linear Regression
• Decision Trees
• Random Forests
• Neural Networks
• Boosting
Range: Concrete/Explainable to “Black box”
Potential Analyses
• Analyzing dining data for intervention
• Course combinations for success/struggle
• Public computer use/frequency
• Identifying “disengaged” students
Software
Freeware:
• R
• Rattle (package in R)
• Weka
• KNIME
Products:
• SPSS Modeler
• SAS
• RapidInsights
• Watson Analytics
Pattern Mining
Combination                                                 AP
Male, Black, ISS210, MTH1825 (15) ==> Probation (8)         53%
Male, Black, MTH1825, UGS101 (20) ==> Probation (10)        50%
Black, ISS210, MTH1825, UGS101 (18) ==> Probation (10)      56%
Black, ISS210, MTH1825 (40) ==> Probation (19)              48%
Black, ISS210, UGS101 (29) ==> Probation (13)               45%
Black, CEM141, MTH116 (22) ==> Probation (9)                41%
Black, UGS101, WRA150 (20) ==> Probation (8)                40%
Male, China, EC202, ISB202, MTH124 (27) ==> Probation (12)  44%
Male, China, ISB202, MTH124 (45) ==> Probation (20)         44%
Male, China, EC202, ISB202 (34) ==> Probation (14)          41%
China, ISB202, MTH103 (17) ==> Probation (10)               59%
China, CEM151, MTH132 (19) ==> Probation (8)                42%
Male, Hispanic, WRA0102, WRA1004 (20) ==> Probation (10)    50%
Multi-Racial, MTH1825, UGS101 (13) ==> Probation (8)        62%
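The AP percentages appear to be the share of students in each combination who ended up on probation, that is, the second count divided by the first. That reading is an assumption (the slide does not define AP), but it reproduces every listed value, as this short R check shows. The vector names are illustrative.

```r
# Counts copied from the rules above, in order
n_combination <- c(15, 20, 18, 40, 29, 22, 20, 27, 45, 34, 17, 19, 20, 13)
n_probation   <- c( 8, 10, 10, 19, 13,  9,  8, 12, 20, 14, 10,  8, 10,  8)

# Assumed definition: AP = probation count / combination count, as a rounded percent
ap <- round(100 * n_probation / n_combination)
print(ap)

# Percentages as listed on the slide
ap_slide <- c(53, 50, 56, 48, 45, 41, 40, 44, 44, 41, 59, 42, 50, 62)
print(all(ap == ap_slide))
```

In association-rule terms this ratio is usually called the rule's confidence.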
Example using R
Download R
www.r-project.org
www.rstudio.com
Example using R
> library(datasets) #Load datasets
> data(iris) #Load Iris dataset
> head(iris) #Display first 6 entries
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
Example using R
> library(randomForest) #Load randomForest package
> model <- randomForest(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data=iris)
> print(model)
> #test must be a data frame of new cases with the same four measurement columns
> pred <- predict(model, newdata = test) #Score new test dataset based on the model
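The slide scores a dataset called `test` without showing where it comes from. One common way to get one is to hold out part of the data before training. The sketch below assumes a 45-row holdout (30% of iris) and an arbitrary seed; neither comes from the slides, and the model here is trained on the remaining rows rather than on all of iris.

```r
library(randomForest)
data(iris)

set.seed(42)                                  # assumed seed, for reproducibility
test_rows <- sample(nrow(iris), size = 45)    # assumed split: 45 of 150 rows held out
train <- iris[-test_rows, ]                   # rows used to learn the model
test  <- iris[test_rows, ]                    # rows the model never sees in training

# Species ~ . uses all four measurement columns as predictors
model <- randomForest(Species ~ ., data = train)

pred <- predict(model, newdata = test)        # score the held-out cases
conf <- table(actual = test$Species, predicted = pred)
accuracy <- sum(diag(conf)) / sum(conf)       # share of held-out cases classified correctly

print(conf)
print(accuracy)
```

Scoring held-out rows gives a more honest estimate than scoring the rows the model was trained on.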
Confusion Matrix
           setosa versicolor virginica class.error
setosa         50          0         0        0.0%
versicolor      0         47         3        6.0%
virginica       0          4        46        8.0%
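The class.error column and an overall accuracy figure can both be derived from the counts in this matrix. A short R sketch, with rows taken as the actual species and columns as the predicted species (the layout `randomForest` prints); the variable names are illustrative:

```r
species <- c("setosa", "versicolor", "virginica")

# Counts from the iris confusion matrix above (rows = actual, columns = predicted)
conf <- matrix(c(50,  0,  0,
                  0, 47,  3,
                  0,  4, 46),
               nrow = 3, byrow = TRUE,
               dimnames = list(actual = species, predicted = species))

# Per-class error: share of each actual class that was misclassified
class_error <- 1 - diag(conf) / rowSums(conf)

# Overall accuracy: correctly classified cases (the diagonal) over all cases
accuracy <- sum(diag(conf)) / sum(conf)

print(round(100 * class_error, 1))
print(round(100 * accuracy, 1))
```

For this matrix, 143 of the 150 flowers sit on the diagonal, so overall accuracy is about 95.3%.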
Predictive Analytics
Confusion Matrix
Overall accuracy is 77%.
Is this a good model?
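One way to approach the question is to compare accuracy with what a trivial rule would achieve. The matrix for this slide is not in the transcript, so the counts below are hypothetical, chosen only so that accuracy comes out at 77%. They are not the presenter's data.

```r
# HYPOTHETICAL counts (not from the slides): 100 cases, 77 classified correctly
cm <- matrix(c(70, 7,
               16, 7),
             nrow = 2, byrow = TRUE,
             dimnames = list(actual = c("No", "Yes"), predicted = c("No", "Yes")))

accuracy   <- sum(diag(cm)) / sum(cm)              # (70 + 7) / 100
baseline   <- max(rowSums(cm)) / sum(cm)           # always predict the larger class
recall_yes <- cm["Yes", "Yes"] / sum(cm["Yes", ])  # share of actual "Yes" cases found

print(c(accuracy = accuracy, baseline = baseline, recall_yes = recall_yes))
```

In this made-up case the model is no more accurate than always predicting "No", and it finds fewer than a third of the "Yes" cases, so 77% alone would not show that the model is good.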
Implications for Practice
• Establishing a Taskforce
• Forming Cross-departmental Partnerships
• Getting Buy-in
• Educating Faculty and Staff
Implications for Policy
• What protocols are already in place?
• Who should intervene?
• Ethical concerns and student privacy
Implications for Assessment
• Incorporate an assessment plan into the
overall plan
• Mine existing data to understand students’
experiences
– Residential density
– Course enrollment by location
– Challenging courses
Questions?
Larry Long
[email protected]
@LongLarryD
Thank you for
joining us today!
Please remember to complete
your customized online
evaluation following the
conference.
See you in Philly in 2018!