watsonwyatt.com
CAS 2008 Spring Meeting
Joint Meeting CIA/SOA/CAS
A Survey of P&C Predictive
Modeling Applications
Gaétan Veilleux, FCAS, MAAA
June 18, 2008
What is Predictive Modeling?
A statistical process which estimates the value of an
observed item (dependent variable) based upon the
values of other explanatory variables.
Copyright © Watson Wyatt Worldwide. All rights reserved

P&C Predictive Modeling applications
• Generalized Linear Models (GLM)
• Data mining and other methods
– Artificial neural networks
– Classification and regression trees (CART)
– Multivariate adaptive regression splines (MARS)
– Cluster analysis
– Principal components analysis / factor analysis

Generalized linear models
$E[Y] = \mu = g^{-1}(X\beta + \xi)$
$\mathrm{Var}[Y] = \phi \, V(\mu) / \omega$
• Consider all factors simultaneously
• Allow for the nature of the random process
• Provide diagnostics
• Robust and transparent
• Increasingly a global standard
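
The two formulas on this slide give the mean as an inverse link applied to a linear predictor plus an offset, and the variance as a scale parameter times a variance function divided by a prior weight. The sketch below evaluates both for a single record. It assumes a log link and a power variance function, and the design row, coefficients, and exposure are hypothetical values chosen for illustration, not fitted results from the presentation.

```python
import math

def glm_mean(x_row, beta, offset=0.0):
    """Mean under a log link: mu = exp(x . beta + offset)."""
    linear_predictor = sum(x * b for x, b in zip(x_row, beta)) + offset
    return math.exp(linear_predictor)

def glm_variance(mu, scale=1.0, weight=1.0, power=1.0):
    """Variance = scale * V(mu) / weight, with V(mu) = mu ** power.

    power=1 is the Poisson variance function, power=2 the gamma one.
    """
    return scale * mu ** power / weight

# Hypothetical design row: intercept, young-driver flag, urban-area flag.
x_row = [1.0, 1.0, 0.0]
beta = [-2.0, 0.5, 0.3]   # illustrative coefficients, not fitted values
exposure = 0.5            # half a policy year, entered as a log offset

mu = glm_mean(x_row, beta, offset=math.log(exposure))
var = glm_variance(mu, scale=1.0, weight=1.0, power=1.0)
print(f"expected claim count: {mu:.4f}, variance: {var:.4f}")
```

With a log link the offset term makes the expected claim count proportional to exposure, which is one common reason for including it in frequency models.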

Insurance applications of GLMs
• Ratemaking
• Underwriting
• Marketing
• Retention
• Expense analysis
• Claims management
• Risk management / reinsurance
• Sales channel
• Reserving

Applications
• Ratemaking
– Revise existing rating factor relativities with multivariate analysis
– Introduce new rating variables or underwriting tiers
– Re-define territorial boundaries
– Re-define vehicle classifications
– Unbundle homeowners by-peril
– Understand effect of proposed rate changes at renewal (including moderator algorithms)
– Define rating plan that optimizes profit while retaining required volume

Ratemaking objective
[Diagram: Age, Sex, Vehicle, Area, Claim Limit → Rating Plan → Premium]

Modeling the cost of claims
[Diagram: Age, Sex, Vehicle, Area, Claim Limit → Model → Expected cost of claims]

Modeling the cost of claims
BI:  Freq x Amt = Cost 1
PD:  Freq x Amt = Cost 2
MED: Freq x Amt = Cost 3
COL: Freq x Amt = Cost 4
OTC: Freq x Amt = Cost 5
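
The per-coverage structure on this slide (frequency times amount gives a cost for each coverage) can be sketched as a small calculation. The frequencies and amounts below are hypothetical numbers for one policy, and summing the five costs into a total is an assumption about how the pieces are combined, since the slide only shows the individual products.

```python
# Hypothetical modeled frequency and average claim amount by coverage
# for a single policy (illustrative values only).
coverages = {
    "BI":  {"freq": 0.010, "amt": 12000.0},
    "PD":  {"freq": 0.040, "amt": 2500.0},
    "MED": {"freq": 0.005, "amt": 4000.0},
    "COL": {"freq": 0.060, "amt": 3000.0},
    "OTC": {"freq": 0.030, "amt": 1000.0},
}

# Cost for each coverage = frequency x amount, as on the slide.
cost_by_coverage = {
    name: model["freq"] * model["amt"] for name, model in coverages.items()
}

# Assumption: the overall expected cost is the sum of the coverage costs.
total_cost = sum(cost_by_coverage.values())

for name, cost in cost_by_coverage.items():
    print(f"{name}: {cost:.2f}")
print(f"Total: {total_cost:.2f}")
```

Modeling frequency and amount separately lets each coverage respond to rating factors in its own way before the results are combined.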

GLM output (significant factor)
[Figure: log of multiplier (left axis) and exposure in years (right axis) by vehicle symbol, 1 to 13. Series: one-way relativities, parameter estimate, approximate 95% confidence interval. P value = 0.0%]
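
The chart on this slide plots parameter estimates on a "log of multiplier" scale with an approximate 95% confidence interval. A minimal sketch of how such a log-scale estimate could be turned into a multiplicative relativity follows. It assumes a log link, a base level with multiplier 1.0, and an interval of roughly two standard errors either side of the estimate. The function name and the input values are hypothetical.

```python
import math

def relativity(estimate, std_error, z=2.0):
    """Convert a log-scale parameter estimate into a multiplier.

    Returns (point, lower, upper), where the bounds are the exponentials
    of estimate -/+ z standard errors (z=2 is roughly a 95% interval).
    """
    point = math.exp(estimate)
    lower = math.exp(estimate - z * std_error)
    upper = math.exp(estimate + z * std_error)
    return point, lower, upper

# Hypothetical estimate and standard error for one vehicle symbol.
point, lower, upper = relativity(0.4, 0.05)
percent_vs_base = (point - 1.0) * 100.0

print(f"multiplier {point:.3f} (about {lower:.3f} to {upper:.3f})")
print(f"difference from base level: {percent_vs_base:+.1f}%")
```

Because the interval is built on the log scale and then exponentiated, it is not symmetric around the multiplier.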

Age - sex interaction
[Figure: example job, Run 5 Model 3, small interaction, third party material damage, numbers. Log of multiplier (left axis) and exposure (right axis) by age of driver (17-21 to 70+), split by sex of driver. Series for female and male: unsmoothed estimate, smoothed estimate, approximately 2 SEs from estimate. P level = 0.0%, rank 6/6]

Impact analysis
[Figure: example job, age of driver. Count of records (left axis) and loss ratio (right axis) by the ratio of risk premium to current premium tariff, in bands from 0.450-0.500 to 2.400-2.450. Legend: age bands 17-21 to 70+, and claims / earned premium]

Applications
• Underwriting
– Provide guidelines on debits/credits
– Produce scorecards to automate some elements of risk selection
• Marketing
– Improve direct mail conversion rate for most profitable risks

Scoring
[Figure: distribution of score. Number of policies (left axis) and actual loss ratio (right axis) by score based on expected loss ratio, 0 to 100]
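
The scoring chart compares a score derived from expected loss ratio with the actual loss ratio in each score group. One way this could be done is sketched below: policies are ranked by expected loss ratio, mapped to a rank-based score, and the actual loss ratio is then computed per score. The ranking scheme, the convention that a higher score means a higher expected loss ratio, and all data values are assumptions made for illustration.

```python
def assign_scores(expected_loss_ratios, n_scores=100):
    """Map each policy to a rank-based score in 1..n_scores.

    Assumption: a higher score means a higher expected loss ratio.
    """
    n = len(expected_loss_ratios)
    order = sorted(range(n), key=lambda i: expected_loss_ratios[i])
    scores = [0] * n
    for rank, i in enumerate(order):
        scores[i] = 1 + rank * n_scores // n
    return scores

def actual_loss_ratio_by_score(scores, losses, premiums):
    """Total losses divided by total premium within each score."""
    totals = {}
    for score, loss, premium in zip(scores, losses, premiums):
        loss_sum, premium_sum = totals.get(score, (0.0, 0.0))
        totals[score] = (loss_sum + loss, premium_sum + premium)
    return {s: l / p for s, (l, p) in sorted(totals.items())}

# Hypothetical book of four policies, scored into two groups.
expected = [0.9, 0.4, 1.3, 0.6]
losses = [900.0, 300.0, 1500.0, 500.0]
premiums = [1000.0, 1000.0, 1000.0, 1000.0]

scores = assign_scores(expected, n_scores=2)
by_score = actual_loss_ratio_by_score(scores, losses, premiums)
print(scores)
print(by_score)
```

If the score is informative, the actual loss ratio should rise with the score, which is the pattern a chart like the one on this slide is meant to check.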

Applications
• Retention
– Understand effect of capping rate changes at renewal
– Develop lifetime customer value model
• Expense analysis
– Vary acquisition costs by other criteria
• Claims management
– Develop fraud scorecard
– Advise how TPAs affect claim costs
– Analyze the drivers of claim cost and hence loss control
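
The retention bullet on capping rate changes at renewal can be illustrated with a simple capping rule. This is a minimal sketch under stated assumptions: the function name, the symmetric 10% limits, and the premiums are hypothetical, and real capping or moderator rules may be considerably more detailed.

```python
def capped_renewal_premium(current, indicated,
                           max_increase=0.10, max_decrease=0.10):
    """Limit the renewal premium to a band around the current premium.

    max_increase and max_decrease are fractions of the current premium.
    """
    floor = current * (1.0 - max_decrease)
    ceiling = current * (1.0 + max_increase)
    return min(max(indicated, floor), ceiling)

# Hypothetical renewals: (current premium, indicated premium).
renewals = [(1000.0, 1300.0), (1000.0, 1050.0), (1000.0, 700.0)]
capped = [capped_renewal_premium(cur, ind) for cur, ind in renewals]

for (cur, ind), cap in zip(renewals, capped):
    print(f"current {cur:.0f}, indicated {ind:.0f}, charged {cap:.0f}")
```

Comparing indicated and charged premiums across a book shows how much of a proposed change the cap defers, which can then feed a retention analysis.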

Applications
• Risk management / reinsurance
– Determine which risks to cede
• Sales channel
– Align compensation with expected profitability
• Reserving
– Provide additional method to assist reserving actuaries with ultimate projections
– Identify predictors of “serious” claims

P&C Predictive Modeling applications
• Generalized Linear Models (GLM)
• Data mining and other methods
– Artificial neural networks
– Classification and regression trees (CART)
– Multivariate adaptive regression splines (MARS)
– Cluster analysis
– Principal components analysis / factor analysis

Data Mining
• aka Knowledge Discovery in Databases (KDD)
• Broad range of methods
• Good at discovery, weak at estimation
• Many (most) are not being applied to P&C insurance
• ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
– Evolutionary spectral clustering by incorporating temporal smoothness
– Making generative classifiers robust to selection bias
– Nonlinear adaptive distance metric learning for clustering

Data Mining – 5 Common Techniques
• Artificial neural networks
– Non-linear predictive models that learn through training
– Resemble biological neural networks in structure
• Decision trees
– Tree-shaped structures that represent sets of decisions
– These decisions generate rules for the classification of a dataset

Data Mining – 5 Common Techniques (2)
• Genetic algorithms
– Optimization techniques
– Genetic combination, mutation, and natural selection
• Nearest neighbor
– Classification of each record based on a combination of the classes of the k record(s) most similar to it in a historical dataset
• Rule induction
– Extraction of useful if-then rules from data based on statistical significance
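
The nearest neighbor description on this slide (classify a record from the classes of the k most similar historical records) can be sketched in a few lines. The sketch assumes Euclidean distance as the similarity measure and a simple majority vote as the way of combining classes. The features, labels, and records are hypothetical.

```python
import math
from collections import Counter

def knn_classify(record, history, k=3):
    """Classify `record` by majority vote among its k nearest neighbors.

    `history` is a list of (features, label) pairs; similarity is
    measured by Euclidean distance between feature tuples.
    """
    nearest = sorted(history, key=lambda item: math.dist(record, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical historical records: two scaled features and a risk label.
history = [
    ((1.0, 1.0), "low"),
    ((1.2, 0.8), "low"),
    ((0.9, 1.1), "low"),
    ((5.0, 5.0), "high"),
    ((5.5, 4.5), "high"),
]

label_a = knn_classify((1.1, 1.0), history, k=3)
label_b = knn_classify((5.2, 4.8), history, k=3)
print(label_a, label_b)
```

In practice the features would usually be rescaled first, since Euclidean distance is dominated by whichever feature has the largest numeric range.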

Artificial Neural Networks
• Identify structural components for a GLM
– Variables
– Binning
– Interactions
[Diagram: network with Input, Hidden, and Output layers]
• Fraud detection
– Staged accidents
– Other PM techniques
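
The input, hidden, and output layers named on this slide can be illustrated with a single forward pass through a small network. This is a minimal sketch assuming one hidden layer and sigmoid activations throughout. The weights are hypothetical and untrained, so the output has no predictive meaning.

```python
import math

def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: input layer -> hidden layer -> single output."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)

# Hypothetical network: 3 inputs, 2 hidden nodes, 1 output.
inputs = [0.5, 1.0, 0.0]
hidden_weights = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]]
hidden_biases = [0.0, -0.1]
output_weights = [1.5, -2.0]
output_bias = 0.25

score = forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias)
print(f"network output: {score:.4f}")
```

Training would adjust the weights and biases to reduce prediction error on known outcomes, a step that this sketch leaves out.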

Classification and Regression Trees - CART
• Decision tree based method
• Binary recursive partitioning
• Brute force non-parametric method
• Response is discontinuous
• Doesn’t capture strong linear relationships well
Applications
• Variable selection
• Binning
• Identify predictors of “serious” claims
[Diagram: example tree]
N = 100,000
– Area = {1, 2, 3}: N = 41,127
– Area = {others}: N = 58,873
  – Density <50: N = 11,245
  – Density 50-100: N = 44,885
  – Density >100: N = 2,743
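
The "binary recursive partitioning" and "brute force" bullets on this slide can be illustrated by the search for a single split. The sketch below tries every threshold on one numeric predictor and keeps the one that minimizes the total squared error around the two group means. Squared error is an assumed splitting criterion, and the density and cost values are hypothetical. Applying the same search again within each resulting group would give the recursive tree.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_binary_split(xs, ys):
    """Brute-force search for the threshold on x that minimizes total SSE.

    Returns (threshold, total_sse), or None if x has no distinct values.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best

# Hypothetical data: population density and average claim cost.
density = [10, 20, 30, 120, 150, 200]
cost = [100.0, 110.0, 105.0, 300.0, 310.0, 290.0]

threshold, total_sse = best_binary_split(density, cost)
print(f"split at density {threshold}, total SSE {total_sse}")
```

The fitted response is a constant within each group, which matches the slide's point that the response is discontinuous and that strong linear relationships are captured poorly.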

Multivariate Adaptive Regression Splines - MARS
• Multivariate non-parametric regression procedure
• Brute force
• Response is continuous
• Piece-wise linear segments to describe non-linear relationships
Applications
• Variable selection
• Binning
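
The piece-wise linear segments mentioned on this slide are commonly built from hinge functions that are zero on one side of a knot and linear on the other. The sketch below only evaluates a model with one given knot. The knot location, coefficients, and function names are hypothetical, and the knot search and term selection that a full MARS procedure performs are not shown.

```python
def hinge(x, knot):
    """Zero below the knot, rising linearly above it."""
    return max(0.0, x - knot)

def mirrored_hinge(x, knot):
    """Zero above the knot, rising linearly below it."""
    return max(0.0, knot - x)

def piecewise_linear(x, intercept, terms):
    """Evaluate intercept + sum of coefficient * basis(x, knot)."""
    return intercept + sum(coef * basis(x, knot) for coef, basis, knot in terms)

# Hypothetical model with a single knot at age 25:
# slope 1 below the knot and slope 2 above it.
terms = [(2.0, hinge, 25.0), (-1.0, mirrored_hinge, 25.0)]

values = {x: piecewise_linear(x, 10.0, terms) for x in (20.0, 25.0, 30.0)}
print(values)
```

Because every basis function is continuous, the fitted response is continuous as well, with the slope changing only at the knots.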

Cluster Analysis
• Seek to identify homogeneous subgroups
• Average linkage or centroid methods
• No good literature explaining which is best
• Minimize within-group variation and maximize between-group variation
Applications
• Vehicle symbols
• Segmenting/Tiering
• Fraud detection
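
The objective stated on this slide, minimizing within-group variation while maximizing between-group variation, can be made concrete by computing both quantities for a candidate grouping. The sketch below uses sums of squares on a single variable. The two groupings and their values are hypothetical, and it scores a given grouping without searching for the best one.

```python
def variation_split(groups):
    """Return (within_ss, between_ss) for a list of numeric groups."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    within = 0.0
    between = 0.0
    for group in groups:
        mean = sum(group) / len(group)
        within += sum((v - mean) ** 2 for v in group)
        between += len(group) * (mean - grand_mean) ** 2
    return within, between

# Two hypothetical ways of grouping the same six values.
tight = [[1, 2, 3], [10, 11, 12]]
mixed = [[1, 2, 10], [3, 11, 12]]

tight_within, tight_between = variation_split(tight)
mixed_within, mixed_between = variation_split(mixed)
print(f"tight: within {tight_within:.2f}, between {tight_between:.2f}")
print(f"mixed: within {mixed_within:.2f}, between {mixed_between:.2f}")
```

The two quantities always add up to the same total for a fixed dataset, so lowering within-group variation necessarily raises between-group variation.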

Principal Components/Factor Analysis
• Reduce number of variables
• Detect structure
• Consecutive factors are uncorrelated with (orthogonal to) each other
Applications
• Economic models such as trend
• Transform/reduce variables
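
The orthogonality of consecutive factors and the idea of reducing the number of variables can both be seen in a two-variable case, where the principal components have a closed form. This is a minimal sketch using the eigen-decomposition of a 2x2 sample covariance matrix. The data are hypothetical and deliberately perfectly correlated, so that one component carries all the variance.

```python
import math

def covariance_2d(xs, ys):
    """Sample variances and covariance of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    syy = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return sxx, sxy, syy

def principal_components_2d(sxx, sxy, syy):
    """Eigenvalues and unit eigenvectors of [[sxx, sxy], [sxy, syy]].

    Returns [(eigenvalue, (vx, vy)), ...] with the largest eigenvalue first.
    """
    center = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    first = (math.cos(angle), math.sin(angle))
    second = (-math.sin(angle), math.cos(angle))
    return [(center + radius, first), (center - radius, second)]

# Hypothetical data: the second variable is exactly twice the first.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

components = principal_components_2d(*covariance_2d(xs, ys))
(var1, dir1), (var2, dir2) = components
dot = dir1[0] * dir2[0] + dir1[1] * dir2[1]
print(f"variances: {var1:.3f}, {var2:.3f}; dot product of directions: {dot:.1e}")
```

When the second variance is close to zero, as here, the second component can be dropped with little loss of information, which is the variable reduction the slide refers to.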

ISO Innovative Analytics - Risk Analyzer
Modeling Techniques Employed
• Variable Selection – univariate analysis, transformations, known relationship to loss
• Sampling
• Regression / general linear modeling
• Sub models/data reduction – neural nets, splines, principal component analysis, variable clustering
• Spatial Smoothing – with parameters related to auto insurance loss patterns

Quotes
"Prediction is very difficult, especially if it's about the future."
- Niels Bohr, Nobel laureate in Physics
"I have seen the future and it is very much like the present, only longer."
- Kehlog Albran, The Profit
"A good forecaster is not smarter than everyone else, he merely has his ignorance better organized."
- Anonymous