Spar Nord Bank’s application score - a rating system for new retail customers
Rasmus Waagepetersen
Spar Nord Bank, DK-9100 Aalborg
22-05-2017

Basel II background
• Basel II: regulations for calculation of capital requirements (solvency).
• Capital requirement depends on the bank’s risk profile.
• Three types of risk:
  • credit risk: the risk that a customer does not pay back his/her loan (i.e. default)
  • market risk: e.g. the risk that stock holdings lose value
  • operational risk: e.g. breakdown of computer systems or fraud
Capital requirement related to credit risk: own capital at least 8% of risk weighted assets.
Risk weighted assets: each asset (loan) is multiplied by a risk weight depending on probability of default (PD) and loss given default (LGD).

Risk weight formula
• Based on binomial mixed model:
  • latent variable Z: state of economy
  • given Z, the indicators of default X_1,…,X_n are conditionally independent Bernoulli variables with conditional PD P(X_i=1|Z)=F(a_i+bZ), where F() is the standard normal distribution function (binomial GLM with probit link).
• a_i controls the size of PD for the i’th exposure, b controls the correlation between defaults.
• Portfolio loss L=L_1*X_1+…+L_n*X_n (L_i: loss given default for the i’th customer).
• Risk weight formula based on asymptotic formula for the 99% quantile of L (value at risk) (n tends to infinity, L_i tends to zero):
  $q_{0.99}(L) \approx \sum_i L_i F(a_i + b\,q_{0.99}(Z))$

IRB: internal rating based approach
• Risk weight: $F(a_i + b\,q_{0.99}(Z))$
• value of b: supplied by Basel II regulations (correlation depending on loan type)
• value of a_i: obtained from PD_i=P(X_i=1) estimated internally (IRB).
• IRB: PD_i and LGD_i are estimated from the bank’s internal assessment of risk/the bank’s own historical data.
• Central ingredient: a rating system places loans/customers in rating classes which are differentiated with respect to risk (PD and LGD).
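
The step from an internally estimated PD_i to the risk weight can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the regulatory formula as implemented by the bank: it assumes Z is standard normal (the slides only call it the state of the economy), in which case PD_i = F(a_i / sqrt(1 + b^2)). The function names and the value of b used in the example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# F() on the slides: standard normal distribution function.
STD_NORMAL = NormalDist()


def intercept_from_pd(pd, b):
    """Solve PD = F(a / sqrt(1 + b^2)) for a (assumes Z ~ N(0, 1))."""
    return sqrt(1.0 + b * b) * STD_NORMAL.inv_cdf(pd)


def stressed_pd(pd, b, level=0.99):
    """Conditional PD F(a + b * q_level(Z)) at the given quantile of Z."""
    a = intercept_from_pd(pd, b)
    return STD_NORMAL.cdf(a + b * STD_NORMAL.inv_cdf(level))


def portfolio_quantile(losses, pds, b, level=0.99):
    """Asymptotic quantile of L: sum of L_i * F(a_i + b * q_level(Z))."""
    return sum(l * stressed_pd(p, b, level) for l, p in zip(losses, pds))


if __name__ == "__main__":
    # Hypothetical exposure: PD of 1% and an illustrative b of 0.4.
    print(stressed_pd(0.01, 0.4))
    print(portfolio_quantile([100.0, 200.0], [0.01, 0.02], 0.4))
```

With b = 0 defaults do not depend on Z and the stressed PD equals the unconditional PD; a larger b raises the 99% quantile of the loss.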

Rating system for new retail customers/application score
• Rating of new customers based on variables such as age, type of housing, income, assets, debts, …
• NB: for existing customers additional information is available: transaction behaviour, overdrafts, cash flow, … (behavioural score)

Empirical model
• Rating system may be based on direct estimation of probability of default (logistic regression).
• Problem: low quality of historical data. Missing variables or incorrect records.
• Common problem: use of quantitative methods for credit risk management is still quite new in conventional Danish banks.
• “Problem”: frequency of default is quite low (1% within a one-year time span) in historical data. Hence large data sets are needed in order to fit a differentiated model (binary observations provide limited information).

Expert model
• Aim: construct a model which, based on customer variables, gives the rating that an experienced Spar Nord Bank credit officer would give based on the same variables.
• Advantage: historical data were obtained in a period of favourable economic conditions, whereas ratings from experienced bank people may reflect knowledge of difficult times (1990s).
• Acceptance among users: model reflects best practice.

Basic model for rating systems
Diagram: customer variables (age, capital, income, …) are combined with weights w1, w2, w3, … into a score, which is converted into a red, yellow or green rating (traffic light).
• Problem: obtain weights that give a proper balance between the variables contributing to the score
• Problem: convert the score into a rating consistent with the rating of an experienced bank person
• Consultants in PWC or the like will suggest various ad hoc solutions
• Credit people are trained to assess customers, not to assign weights
• Better solution: let credit people rate customers and leave computation of weights to the statistician

Statistical model based on expert ratings
Weights w1, w2, …: parameters to be estimated in a regression model for expert ratings given customer variables.
Data:
• Population of around 3000 customer cases (case 1: age, capital, …; case 2: age, capital, …; case 3: age, capital, …; …)
• Ratings from a panel of 19 experts (red, green, green, …)

Rating scale and design of experiment
6-step scale:
  1  Green above average
  2  Green average
  3  Green below average
  4  Yellow above average
  5  Yellow below average
  6  Red
Two rating workshops, one week apart:
1. 15 experts each rated 105 cases (25 cases common to all experts)
2. 17 experts each rated 102 cases (13 also took part in the first workshop)
In total 3321 cases rated.
The 25 common cases enable direct comparison of experts.
Stratified sample of customer cases for each expert: capital, debt factor (i.e. debt/income) and good/bad status (“manual” classification)

Rating of common cases 1-4
• 25 cases rated by all experts
• G/V indicates good/bad status
• Considerable variation for average customers

Threshold model for rating data
• Score: weighted sum of customer variables, s=x1*w1+x2*w2+…
• Expert assessment (latent variable): V=s+U where E(U)=0
• Thresholds: V below T1 yields red, V between T1 and T2 yields yellow below average, etc.
• Figure: greater probability for a red rating with score S1 than with score S2

Interpretation of V=s+U
• experts only see the customer variables and not the score (the score is a mathematical construction).
• U reflects rating variation: an expert may assign different ratings to customers with the same score (measurement error)
• moreover: 1. variation between experts. 2. variation between workshops.
• Obvious: variance component model (later)
• Logistic distribution for U yields cumulative logistic regression/proportional odds model:
  $P(R \le r) = \frac{\exp(\tilde{T}_r + s)}{\exp(\tilde{T}_r + s) + 1}$, i.e. $\log\frac{P(R \le r)}{P(R > r)} = \tilde{T}_r + s$

Variables in model
• Basic variables: age, size of household, type of housing, type of loan (fixed or variable interest rate, with or without amortization), income, assets, debts
• Derived variables: single parent, capital, debt factor, income per person in household, solvency ratio (capital/assets)
• Interactions: capital/age, capital/debt factor, type of housing/debt factor, age/debt factor
In total 76 parameters (grouped quantitative variables) estimated from ~2900 customer cases (two “extreme experts” omitted)

Evaluation of model
1. does the model fit the expert ratings?
2. is it useful for identifying weak customers (sensitivity)?
3. is the model’s assessment of risk concordant with empirical risk?
4. does the model classify too many good customers as weak (specificity)?

Deviations between model ratings and expert ratings
• Model rating: rating with highest probability according to the model
• For 90% of expert ratings there is at most one step of deviation from the model rating.
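
Under the proportional odds model the class probabilities follow by differencing the cumulative probabilities, and the model rating is the class with the largest probability. The sketch below illustrates this in Python. It assumes the convention P(R <= r) = exp(T_r + s) / (exp(T_r + s) + 1) with rating 1 as the best class, and the threshold values are invented for illustration; they are not the fitted ones.

```python
from math import exp


def logistic(x):
    """Logistic distribution function, written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + exp(-x))
    e = exp(x)
    return e / (1.0 + e)


def rating_probabilities(score, thresholds):
    """Return [P(R=1), ..., P(R=6)] for increasing thresholds T_1 < ... < T_5."""
    cumulative = [logistic(t + score) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(R = r) = P(R <= r) - P(R <= r - 1)
        previous = c
    return probs


def model_rating(score, thresholds):
    """Model rating: the class (1..6) with the highest probability."""
    probs = rating_probabilities(score, thresholds)
    return 1 + max(range(len(probs)), key=probs.__getitem__)


# Hypothetical thresholds, for illustration only.
EXAMPLE_THRESHOLDS = [-8.0, -5.0, -3.0, -2.0, 0.0]

if __name__ == "__main__":
    for s in (9.77, 2.73, -1.11):
        print(s, model_rating(s, EXAMPLE_THRESHOLDS),
              [round(p, 3) for p in rating_probabilities(s, EXAMPLE_THRESHOLDS)])
```

Returning the full probability vector rather than only the most probable class is what makes it possible to report the precision of a rating alongside the rating itself.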

Comparison of model ratings and expert ratings of common customer cases
• “model” plot shows model probabilities for each rating
• We can provide both the most probable rating and the precision of the rating

Standardized residuals (consider ratings in {1,…,6} as quantitative variables)
• Boxplot for each expert

Validity of proportional odds assumption
  $\log\frac{P(R \le r)}{P(R > r)} = \tilde{T}_r + s$
• Plot empirical estimates of log odds after grouping according to estimated score
• Note: very small odds when r=1 and for groups with small scores, hence sensitive to outliers

Sensitivity: model ratings of weak customer cases 2006-2007 and 2008 (new customers identified as weak by internal credit surveillance team)
• 2006-2007: 72% rated red or yellow
• 2008: 78% rated red or yellow

Specificity: model ratings of “strong” customers
• Tricky issue: definition of a strong customer?
• Model ratings of customer cases with behavioural score 1-3 two years after first loan: 7% red, 19% yellow

Basel II
• Strict Basel II definition of default: devaluation of loan or loss
• Only ~30 defaults in data set with ~3500 customer cases
• Dimension reduction: expert model reduces large number of variables to just one number (score)
• Estimate PD using logistic regression with score as covariate:

Rating class   Mean score   Estimated PD (log. reg.)   Confidence interval   Proportion default (with bootstrap CI)
1               9.77        0.001                      (0.000; 0.003)        0.002 (0.000; 0.006)
2               6.35        0.003                      (0.001; 0.005)        0.000 -
3               4.14        0.005                      (0.003; 0.009)        0.004 (0.000; 0.009)
4               2.73        0.008                      (0.006; 0.013)        0.012 (0.002; 0.025)
5               1.36        0.013                      (0.009; 0.019)        0.009 (0.003; 0.016)
6              -1.11        0.028                      (0.018; 0.042)        0.045 (0.025; 0.068)

Variance components
Decomposition of latent expert assessment:
  $V_{ijk} = s + U_{ij} + U_{ijk}$
$V_{ijk}$: k’th rating of the i’th expert.
$j = 1,2$: rating workshop
$U_{ij}$: variation between experts (normally distributed)
$U_{ijk}$: variation between ratings for the same expert (logistic distribution)

Potential advantages of variance component model
• In a model without expert effects, the opinions of experts who rated two data sets count more than those of experts who rated only one data set.
• More appropriate quantification of variation in data.
• Need numerical integration to compute likelihood and predictive probabilities:
  $P(R_{ijk} \le r) = E\left[P(R_{ijk} \le r \mid U_{ij})\right] = E\left[\frac{\exp(\tilde{T}_r + s + U_{ij})}{\exp(\tilde{T}_r + s + U_{ij}) + 1}\right]$

Results for variance component model (GLIMMIX)
• variance for logistic distribution: 3.29
• estimated variance for expert effects: 0.55
• with 2 “extreme experts” omitted: 0.31
• largest variance component: “measurement error” (i.e. logistic distribution)
Figure: predictions of expert effects

Representation of rating
Bar represents probabilities of red, green and yellow, i.e. a representation of model certainty
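
The predictive probabilities of the variance component model, integrated over the normally distributed expert effect, can be collapsed into the red, yellow and green bar. A minimal Python sketch follows. It uses a plain midpoint rule rather than whatever integration scheme GLIMMIX applies, takes the expert variance 0.31 from the results above, assumes the convention P(R <= r | U) = exp(T_r + s + U) / (exp(T_r + s + U) + 1) with rating 1 as the best class, and uses invented thresholds.

```python
from math import exp, sqrt
from statistics import NormalDist


def logistic(x):
    """Logistic distribution function, written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + exp(-x))
    e = exp(x)
    return e / (1.0 + e)


def marginal_cumulative(threshold, score, expert_var, n_nodes=201):
    """E[logistic(threshold + score + U)] for U ~ N(0, expert_var).

    Midpoint rule on [-6 sd, 6 sd]; the mass outside is negligible.
    """
    if expert_var == 0.0:
        return logistic(threshold + score)
    sd = sqrt(expert_var)
    density = NormalDist(0.0, sd).pdf
    lo, hi = -6.0 * sd, 6.0 * sd
    step = (hi - lo) / n_nodes
    total = 0.0
    for k in range(n_nodes):
        u = lo + (k + 0.5) * step
        total += logistic(threshold + score + u) * density(u) * step
    return total


def traffic_light(score, thresholds, expert_var):
    """Collapse the six class probabilities into green (1-3), yellow (4-5), red (6)."""
    cumulative = [marginal_cumulative(t, score, expert_var) for t in thresholds]
    cumulative.append(1.0)
    probs = [c - p for c, p in zip(cumulative, [0.0] + cumulative[:-1])]
    return {"green": sum(probs[:3]), "yellow": sum(probs[3:5]), "red": probs[5]}


# Hypothetical thresholds, for illustration only.
EXAMPLE_THRESHOLDS = [-8.0, -5.0, -3.0, -2.0, 0.0]

if __name__ == "__main__":
    print(traffic_light(2.73, EXAMPLE_THRESHOLDS, 0.31))
```

Averaging over the expert effect pulls the cumulative probabilities towards 0.5 compared with a model without expert effects, so the bar becomes slightly less certain.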