Loan Default Model
Saed Sayad
www.ismartsoft.com

Data Mining Steps
1. Problem Definition
2. Data Preparation
3. Data Exploration
4. Modeling
5. Evaluation
6. Deployment

1. Problem Definition
Build a loan default prediction model for small businesses, using historical data to assess the likelihood of default by an obligor.

Data Mining Team
• Modeler
• Domain Expert
• DBA
• Analyst

2. Data Preparation
• Number of Cases: 35,500
• Number of Defaults: 2,500 (7%)
• Number of Variables: 25
• Total balance for all cases: $554,000,000
• Total balance for defaults: $58,000,000 (10.4%)

3. Data Exploration
• Univariate Analysis: Frequency, Average, Min, Max, ...; Bar, Line, Pie, ... Charts
• Bivariate Analysis: Correlation, Z test, ...; Combination Charts

Data Exploration - Univariate
[Chart: Months in Business]

Data Exploration - Bivariate
[Chart: Months in Business and Default; axis: Default%]

4. Modeling
• Classification: Bayesian, Decision Tree, Logistic Regression, Neural Network, SVM
• Regression: Linear Regression, Robust Regression
• Clustering: Hierarchical, K-Means
• Association: Apriori

Modeling - Classification
Logistic Regression
[Diagram: Age → f → Default (Y or N)]

Logistic Regression Model
[Chart: Linear Model and Logistic Model; x-axis: Months in Business; y-axis: Default, from 0 to 1]

5. Evaluation
• Charts: Gain Chart, Lift Chart, K-S Chart
• Stats: Confusion Matrix, Mean Square Error, Variables Contribution

Evaluation - Variables Contribution
Variables Contribution - Top 10
• Maximum Number of Delinquency
• Maximum Line Usage
• Number of Delinquent Days
• Credit Score
• Total Number of Delinquency
• Total Number of Delinquent Days
• Business Type
• Months in Business
• Number of Line of Credit
• Total Balance
[Bar chart; x-axis: contribution, 0.0% to 35.0%]

Evaluation - Confusion Matrix
                      Positive Cases    Negative Cases
Predicted Positive    247 (3%)          264 (3%)
Predicted Negative    313 (4%)          7,343 (90%)
Total cases: 8,167

Evaluation - Gain Chart
[Gain chart; x-axis: Population% (10%, 50%, 100%); y-axis: Default% (10%, 58%, 100%)]

Return On Investment
• Total Number of Loans = 8,167
• Total Number of Defaults = 560
• Total Balance for Defaults = $12,281,589
• Top 10% Random
  – Number of Defaults = 56
  – Total Balance = $1,230,000
• Top 10% Model
  – Number of Defaults = 305
  – Total Balance = $7,655,772
600% ROI

6. Deployment
• SQL: Batch Scoring
• HTML: Web-based Scoring

Questions?
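
The numbers on the Confusion Matrix and Return On Investment slides can be cross-checked with a few lines of arithmetic. The sketch below is not part of the original deck: the variable and function names are hypothetical, the counts are copied from the slides, and reading the "600% ROI" figure as the ratio of model-selected to randomly selected default balance is an assumption.

```python
# Illustrative cross-check of the slide figures (names are hypothetical).

# Confusion Matrix slide: rows are predictions, columns are actual cases.
tp, fp = 247, 264    # predicted positive: actual defaults, actual non-defaults
fn, tn = 313, 7343   # predicted negative: actual defaults, actual non-defaults

# Return On Investment slide.
total_loans = 8167
total_defaults = 560
random_defaults, random_balance = 56, 1_230_000    # top 10% chosen at random
model_defaults, model_balance = 305, 7_655_772     # top 10% ranked by the model


def confusion_metrics(tp, fp, fn, tn):
    """Return standard summary metrics derived from the four cell counts."""
    total = tp + fp + fn + tn
    return {
        "total": total,
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),   # share of predicted defaults that defaulted
        "recall": tp / (tp + fn),      # share of actual defaults that were caught
    }


metrics = confusion_metrics(tp, fp, fn, tn)

# Lift of the model over random selection in the top 10% of loans.
lift_by_count = model_defaults / random_defaults
lift_by_balance = model_balance / random_balance   # assumed basis of "600% ROI"

# Share of all defaults that fall in the model's top 10%.
capture_rate = model_defaults / total_defaults

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
    print(f"lift by count: {lift_by_count:.2f}")
    print(f"lift by balance: {lift_by_balance:.2f}")
    print(f"capture rate in top 10%: {capture_rate:.1%}")
```

The four cells sum to the 8,167 loans on the ROI slide, and the positive column sums to its 560 defaults, so the two slides appear to describe the same evaluation set. The balance ratio comes out a little above 6, which is consistent with the 600% figure under the assumption stated above.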