Predictive Analytics Demystified: Techniques and Applications
Larry D. Long, Michigan State University
works.bepress.com/ldlong
@LongLarryD
Monday, March 13, 2017
Room Lone Star B – Grand Hyatt

Data Mining

Common Data Mining Approaches
• Clustering
• Associations (Pattern Mining)
• Classification & Regression
• Anomaly Detection

Supervised vs. Unsupervised
Supervised:
• The outcome variable is known
• Analysts use a “target” variable to train the model
Unsupervised:
• The outcome variable isn’t known
• The computer looks for similarities in the data and puts cases into groups (clustering)

Classification
Training set (Learn Model):
  Tid  Attrib1  Attrib2  Attrib3  Class
  1    Yes      Large    125K     No
  2    No       Medium   100K     No
  3    No       Small    70K      No
  4    Yes      Medium   120K     No
  5    No       Large    95K      Yes
  6    No       Medium   60K      No
  7    Yes      Large    220K     No
  8    No       Small    85K      Yes
  9    No       Medium   75K      No
  10   No       Small    90K      Yes
Test set (Apply Model):
  Tid  Attrib1  Attrib2  Attrib3  Class
  11   No       Small    55K      ?
  12   Yes      Medium   80K      ?
  13   Yes      Large    110K     ?
  14   No       Small    95K      ?
  15   No       Large    67K      ?

Classification Methods
• Linear Regression
• Decision Trees
• Random Forests
• Neural Networks
• Boosting
Scale shown on slide: Concrete/Explainable to “Black box”

Potential Analyses
• Analyzing dining data for intervention
• Course combinations for success/struggle
• Public computer use/frequency
• Identifying “disengaged” students

Software
Freeware:
• R
• Rattle (package in R)
• Weka
• KNIME
Products:
• SPSS Modeler
• SAS
• RapidInsights
• Watson Analytics

Pattern Mining
  Combination                                                        AP
  Male, Black, ISS210, MTH1825 (15) ==> Probation (8)                53%
  Male, Black, MTH1825, UGS101 (20) ==> Probation (10)               50%
  Black, ISS210, MTH1825, UGS101 (18) ==> Probation (10)             56%
  Black, ISS210, MTH1825 (40) ==> Probation (19)                     48%
  Black, ISS210, UGS101 (29) ==> Probation (13)                      45%
  Black, CEM141, MTH116 (22) ==> Probation (9)                       41%
  Black, UGS101, WRA150 (20) ==> Probation (8)                       40%
  Male, China, EC202, ISB202, MTH124 (27) ==> Probation (12)         44%
  Male, China, ISB202, MTH124 (45) ==> Probation (20)                44%
  Male, China, EC202, ISB202 (34) ==> Probation (14)                 41%
  China, ISB202, MTH103 (17) ==> Probation (10)                      59%
  China, CEM151, MTH132 (19) ==> Probation (8)                       42%
  Male, Hispanic, WRA0102, WRA1004 (20) ==> Probation (10)           50%
  Multi-Racial, MTH1825, UGS101 (13) ==> Probation (8)               62%

Example using R
Download R
www.r-project.org
www.rstudio.com

Example using R
  > library(datasets)  # Load the datasets package
  > data(iris)         # Load the iris dataset
  > head(iris)         # Display the first 6 entries
    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
  1          5.1         3.5          1.4         0.2  setosa
  2          4.9         3.0          1.4         0.2  setosa
  3          4.7         3.2          1.3         0.2  setosa
  4          4.6         3.1          1.5         0.2  setosa
  5          5.0         3.6          1.4         0.2  setosa
  6          5.4         3.9          1.7         0.4  setosa

Example using R
  > library(randomForest)  # Load the randomForest package
  > model <- randomForest(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data = iris)
  > print(model)
  > pred <- predict(model, newdata = test)  # Score new data with the model; `test` must be a data frame with the same four predictor columns

Confusion Matrix
               setosa  versicolor  virginica  class.error
  setosa           50           0          0         0.0%
  versicolor        0          47          3         6.0%
  virginica         0           4         46         8.0%

Predictive Analytics

Confusion Matrix
Overall accuracy is 77%. Is this a good model?

Implications for Practice
• Establishing a Taskforce
• Forming Cross-departmental Partnerships
• Getting Buy-in
• Educating Faculty and Staff

Implications for Policy
• What protocols are already in place?
• Who should intervene?
• Ethical concerns and student privacy

Implications for Assessment
• Incorporate an assessment plan into the overall plan
• Mine existing data to understand students’ experiences
  – Residential density
  – Course enrollment by location
  – Challenging courses

Questions?
Larry Long
[email protected]
@LongLarryD

Thank you for joining us today!
Please remember to complete your customized online evaluation following the conference.
See you in Philly in 2018!
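
Supplementary sketch for the Supervised vs. Unsupervised slide: the unsupervised case, where the computer groups similar cases without an outcome variable, might look like the following in base R. This is only an illustration on the iris data used later in the deck; the choice of k-means, of three clusters, and of the seed are assumptions, not something the talk specifies.

```r
# Unsupervised example: cluster iris measurements without using Species.
data(iris)
set.seed(1)                        # arbitrary seed so the run is repeatable

features <- iris[, 1:4]            # the four numeric columns; outcome is left out
clusters <- kmeans(features, centers = 3, nstart = 10)  # 3 groups is an assumption

# Species is used only afterwards, to see how the groups line up with it.
print(table(cluster = clusters$cluster, species = iris$Species))
```

In a supervised analysis the Species column would instead be handed to the model as the target, as in the randomForest example on the slides.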
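
Supplementary sketch for the Pattern Mining slide: the AP percentages are consistent with each rule's confidence, that is, the number of students on probation divided by the number of students matching the combination. The base R snippet below recomputes the first three rows from the counts on the slide; the data frame and its column names are illustrative and do not come from the talk.

```r
# Counts copied from the first three rules on the Pattern Mining slide.
# Column names (combination, n_matching, n_probation) are illustrative.
rules <- data.frame(
  combination = c("Male, Black, ISS210, MTH1825",
                  "Male, Black, MTH1825, UGS101",
                  "Black, ISS210, MTH1825, UGS101"),
  n_matching  = c(15, 20, 18),
  n_probation = c(8, 10, 10),
  stringsAsFactors = FALSE
)

# Confidence of "combination ==> Probation", rounded to a whole percentage.
rules$ap_pct <- round(100 * rules$n_probation / rules$n_matching)

print(rules)
```

The three computed values (53, 50, 56) match the first three AP entries on the slide.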
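
Supplementary sketch for the R example: the slides call predict() on a data frame named test that is never created. One way to make the example self-contained is to hold out part of iris as the test set, as below. The 30-row holdout and the seed are arbitrary assumptions, and this differs from the slides, where the model is fit on all 150 rows; it also assumes the randomForest package is installed.

```r
library(randomForest)  # assumes install.packages("randomForest") has been run

data(iris)
set.seed(42)                                 # arbitrary seed, for repeatability
test_rows <- sample(nrow(iris), size = 30)   # hold out 30 of the 150 rows
train <- iris[-test_rows, ]
test  <- iris[test_rows, ]

# Fit on the training rows only, then score the held-out rows.
model <- randomForest(Species ~ ., data = train)
pred  <- predict(model, newdata = test)

# Confusion matrix and accuracy on the held-out rows.
print(table(actual = test$Species, predicted = pred))
print(mean(pred == test$Species))
```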
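
Supplementary sketch for the two Confusion Matrix slides: one common way to approach "Is this a good model?" is to compare overall accuracy with what always guessing the largest class would achieve. The deck does not include the counts behind the 77% figure, so the snippet below uses the iris confusion matrix from the earlier slide instead; the variable names are illustrative.

```r
# Confusion matrix from the iris slide (rows = actual class, columns = predicted).
classes <- c("setosa", "versicolor", "virginica")
cm <- matrix(c(50,  0,  0,
                0, 47,  3,
                0,  4, 46),
             nrow = 3, byrow = TRUE,
             dimnames = list(actual = classes, predicted = classes))

accuracy    <- sum(diag(cm)) / sum(cm)      # share of all cases classified correctly
class_error <- 1 - diag(cm) / rowSums(cm)   # per-class error, as in class.error
baseline    <- max(rowSums(cm)) / sum(cm)   # accuracy of always guessing the largest class

print(round(c(accuracy = accuracy, baseline = baseline), 3))
print(round(class_error, 2))
```

Here accuracy is 143/150 (about 95%) against a baseline of 50/150 (about 33%). The same comparison applied to the 77% model would depend on how unbalanced its outcome classes are, which the slide does not state.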