A Pragmatic Overview of Predictive Analytics Applications
Lee Sarkin (South Africa)
Gavin Maistry (Singapore)
Agenda
1 External Trends Driving Analytics
2 Analytics Concepts for Actuaries
• Philosophies and key concepts
• Analytics Ecosystem
3 Use Cases and Pitfalls
• Application Triage
• Experience Analysis and Pricing Models
• Lapse / churn
• Cross Selling and Targeted Sales
• Claims Rules Engine and Fraud
• Unstructured Text Mining
• Predictive return to work for DI claimants
• Non-Life
• And more…
4 What does this mean for the pragmatic actuary?
• Know your business model!
• Feedback loops
• Post model development
• Unintended consequences
• Blind spots
• Types of Errors
• The Gap between a model and ‘basis’
• Who wins in an arms race?
External Trends
We have an interface to visualise ourselves as a collective…
Visualising Ourselves by Aaron Koblin
“What's really going to make big data go mainstream is…
…the ability to connect not just with data scientists and technologists but…business people.
And absolutely one of the keys to that is visualization, is being able to show people…
…not just tell people, not just show numbers or charts…
…but to have those visualizations come alive.”
CHRIS SELLAND, VICE PRESIDENT OF MARKETING AND BUSINESS DEVELOPMENT, TABLEAU
External Trends
How are the trends related?
[Word cloud of related trends]
Big Data
Augmented and virtual worlds
3D Printing
Location-based services
Telematics
Smart Home
Computing Everywhere
Industrialization 4.0
Digitalization
Wearable Devices
Predictive Analytics
Robotics/Drones
Internet of Things
Open Data
Collaborative Consumption
Citizen Development
Mobile Health Services
User Centered Design
Crowdsourcing
Virtual Assistant Systems
On-Demand-Everything
Context-aware Computing
Integrated Systems
Digital Identity
Cybersecurity
Autonomous Systems and Devices
Automated Decision Taking
Digitization
Risk-based Security
Web 4.0
Web-Scale IT
Software-defined Anything
Haptic Technologies
New Payment Models
Cloud/Client Architecture
Source: Big Data Analytics @ Munich Re / Wolfgang Hauner, 27 October 2016
External Trends
So what is Predictive Analytics?
The study and application of statistical / mathematical models to help predict future behaviour.
It utilizes technology and data to uncover relationships and patterns that can be used to
predict behaviour or events, forecasting probabilities and trends.
- Harvard Business Review, October 2012
Data Analytics is a Combination of Methods, Technology, Data and People
Technology
• Hardware (compute power)
• Software (SAS, R, Spark, …)
Data
• Internal Data
• External Data
• Structured Data
• Unstructured Data
Methods
• Regression Models
• Machine Learning Models
• Text Mining
People
• Data Scientists
• Data Engineers
• Business People
When does it become BIG Data?
40,000,000,000,000,000,000,000
• 43 zettabytes of data will probably be generated by 2020
• 300 times the volume in 2005
Byte: Yes or No
Kilobyte: 4 KB Commodore VC 20
Megabyte: 3.5 inch floppy disk
Gigabyte: Data contained in a library floor
Terabyte: 4 TB in Memory Big Data Platforms
Petabyte: Petabyte Storage Big Data Platform
Exabyte: All words ever spoken by humans
Zettabyte: Google, Facebook, Microsoft…
Source: IBM
How much data is generated every minute?
Source: www.domo.com and SAS
External Trends
Brief History of Statistical Learning
The Opportunity Set – Examples for Life Insurers
Applications of predictive analytics can significantly improve a wide variety of core operations for life insurance companies
Pricing and product development
• Are current best estimate assumptions adequate?
• What would be an accurate pricing basis for a promising new product?
• What are relevant risk drivers and how are they affecting my current portfolio?
• Which features make my products most appealing for certain target groups?
Sales & marketing
• Where do I have concrete up-selling opportunities in my existing book?
• Which clients are most likely to take up cross-selling offers?
• Do I have the right target groups in focus for sales campaigns?
• Which of my distribution channels / offices are really performing best?
Inforce management
• How profitable is my business?
• Which customers are at risk to lapse their policy? Which should I try to retain and how?
• Does my portfolio composition meet my pricing assumptions?
Underwriting
• For which client groups can I simplify the underwriting process to improve the customer experience?
• How can I reduce the need for medical exams to lower the cost of underwriting?
• How can I use my underwriting resources more efficiently?
Claims
• How good is my risk selection process? Am I attracting poor risks?
• How can I streamline the claims process?
Example: why streamline the UW process?
Integrating Analytics in our Business
Retention
Claims
Business quality
Lead generation
Media spend
Analytics enables the most value when embedded in broader processes…
Our Customers’ Experiences and Journeys!
Survey of Current and Future Applications by Insurers
% of responses: “To which function do you apply Predictive Analytics?”
70 insurers surveyed for Bain’s benchmarking database in 2015
External Trends
Unprecedented coverage of machine learning algorithms
Reasons to advance predictive analytics in actuarial science
• Identifying the most significant variables and quantifying their effect and interactions, particularly for large numbers (>50!) of variables
• Automating variable selection
• Quantifying and optimizing the predictive power of models
• A method for knowing when you’re over-fitting the data
• Reducing operational risk through statistical code that can be applied consistently
• Easy-to-maintain models
• Facilitates audit trails
External Trends
Charting a new course
External Trends
Where to start?
Data modelling -> statistical learning paradigm
Advanced analytics has a steep learning curve
External Trends
Powerful IT Infrastructure and Software
Multi-core processing with large in-memory analytics software
Analytics Concepts for Actuaries
Modelling objectives
• Optimise the model’s ability to predict unseen test data
• Understand which predictors and interactions are most significant and be able to interpret their effects on the target variable
• Post-development considerations: how will the model be integrated in practice?
Analytics Concepts for Actuaries
Predictive Modellers have Culture!
[Diagram: X → Natural Complexity → Y]
Known: Data Modelling Culture
Unknown: Algorithmic Modelling Culture
Analytics Concepts for Actuaries
Types of Advanced analytics methods
Supervised methods
• Regression (numeric target): risk rate, frequency, severity, loss cost
• Classification (categorical target): underwriting decision; does the customer have life assurance?
Unsupervised methods
Do
Analytics Concepts for Actuaries
Advanced analytics (AA) methods can be divided into two groups
Analytics Concepts for Actuaries
Types of models
• Linear models (Y is a linear function of X)
• Generalised Linear Models (g(E[Y]) is a linear function of X, where g is the link function)
• Mixed Effects Models (fixed and random effects)
• Linear Model Selection and Regularisation
• Non-linearity (GAM, GAMM, etc.)
• Tree-based methods
• Other machine learning methods
• Unsupervised methods
Analytics Concepts for Actuaries
The “honest” predictive power
• Training and Testing Errors – train the model on the blue and test on the red
[Figure: the data split into five parts, numbered 1 to 5]
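The five numbered parts in the figure suggest a five-fold split, where each part takes a turn as the test set. The following is a minimal Python sketch of that idea, not the presenters' implementation. The straight-line model, the synthetic data and the function names (`k_fold_indices`, `fit_line`, `cross_validate`) are all assumptions made for illustration.

```python
import random


def k_fold_indices(n_rows, n_folds=5, seed=0):
    """Shuffle the row indices and deal them into n_folds disjoint folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_squared_error(model, xs, ys):
    """Average squared gap between observed y and the fitted line."""
    intercept, slope = model
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cross_validate(xs, ys, n_folds=5, seed=0):
    """Return one (training error, testing error) pair per held-out fold."""
    results = []
    for held_out in k_fold_indices(len(xs), n_folds, seed):
        held_out_set = set(held_out)
        train = [i for i in range(len(xs)) if i not in held_out_set]
        train_x, train_y = [xs[i] for i in train], [ys[i] for i in train]
        test_x, test_y = [xs[i] for i in held_out], [ys[i] for i in held_out]
        model = fit_line(train_x, train_y)  # the model never sees the held-out fold
        results.append((mean_squared_error(model, train_x, train_y),
                        mean_squared_error(model, test_x, test_y)))
    return results


# Synthetic data (an assumption for illustration): y = 1 + 2x plus noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(50)]
ys = [1 + 2 * x + rng.gauss(0, 1) for x in xs]

results = cross_validate(xs, ys)
for fold_number, (train_error, test_error) in enumerate(results, start=1):
    print(f"fold {fold_number}: train MSE {train_error:.3f}, test MSE {test_error:.3f}")
```

The testing error in each pair is the "honest" figure, because it is measured on rows the fit never used.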
Analytics Concepts for Actuaries
Bias-Variance Tradeoff
The test-set error never drops below the irreducible error
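One common way to write the tradeoff is the textbook decomposition of expected squared test error at a point. The notation here is an assumption, not taken from the slides: the data follow $y = f(x) + \varepsilon$, $\hat{f}$ is the fitted model and $(x_0, y_0)$ is an unseen observation.

```latex
\mathbb{E}\left[\left(y_0 - \hat{f}(x_0)\right)^2\right]
  = \operatorname{Var}\left(\hat{f}(x_0)\right)
  + \left[\operatorname{Bias}\left(\hat{f}(x_0)\right)\right]^2
  + \operatorname{Var}(\varepsilon)
```

The variance and squared-bias terms are both non-negative, so the expected test error is at least $\operatorname{Var}(\varepsilon)$, the irreducible error.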
Basics
Automating variable selection: regularisation
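One regularisation method that also performs variable selection is the lasso, which appears later in this deck among the compared methods. The sketch below is a minimal pure-Python version using coordinate descent, assuming the objective (1/2n)·||y − Xβ||² + penalty·||β||₁ with no intercept. The function names and the tiny dataset are hypothetical and chosen only to show coefficients being set exactly to zero.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold, returning 0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + penalty*||b||_1, one coefficient at a time."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared length of feature j
            for i in range(n):
                fit_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, penalty) / (norm / n)
    return beta


# Hypothetical data: y depends on the first column only (roughly y = 2 * x1).
X = [
    [1.0, 1.0, 0.0],
    [2.0, 0.0, 1.0],
    [3.0, 1.0, 1.0],
    [4.0, 0.0, 0.0],
]
y = [2.1, 3.9, 6.2, 7.9]

beta = lasso_coordinate_descent(X, y, penalty=0.5)
selected = [j for j, coefficient in enumerate(beta) if coefficient != 0.0]
print("coefficients:", beta)
print("selected columns:", selected)
```

With this penalty only the first column survives, and its coefficient is shrunk slightly below the unpenalised fit. A larger penalty removes more variables, so in practice the penalty is usually tuned on held-out data.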
External Trends
The Rise of the Use Case, Prototypes and Design Thinking
Agenda
1 External Trends Driving Analytics
2 Key Analytics Concepts
• Types of modelling and models
• Basics
3 Use Cases and Pitfalls
• Application Triage
• Experience Analysis and Pricing Models
• Lapse / churn
• Cross Selling and Targeted Sales
• Claims Rules Engine and Fraud
• Unstructured Text Mining
• Predictive return to work for DI claimants
• Non-Life
• And more…
4 Our Pilot and Partnership Offering
• Global analytics innovation pilots in life, non-life and health
• State-of-the-art analytics IT Infrastructure and Software
• Extensive global analytics expert community with applied experience
• Our strategic business partners
Examples of Use Cases - Pilots
Life
Non-Life
Predictive underwriting
Early loss detection
Experience Analysis / Pricing and Lapse Models
Cross selling, Up selling
Textmining
GeoAnalytics
Targeted sales
Unstructured Text Mining
Telematics
Mobile, Wearable and medical 3rd party Data
AI
Claims resource optimization
Churn Prediction
Predictive Underwriting with Machine Learning
Which factors explain the underwriting outcome, which are not significant?
Only 20 of 58 fields are required to predict the underwriting result
[Chart: variable importance on a scale of 0 to 100. Variables shown: Occupation Code; Q: Sports; Subsidiary; BMI; Job activity; Covers included; Job position; Q: Under treatment; Free time activity; Age; Q: Systemdis./addict./scelet.; Gender; Q: Bike; Entry year; Diff. Age to partner; Sum_insured_Life; Sum_insured_TRANS; Relationship to benef. 1; Relationship to benef. 2; Insurance cover code]
Remarks
• The set of explaining variables differs based on the covers included (as expected)
• Top 10 factors are mainly linked to accidental risk (occupation, activity, job position, free time activity). Explained by the high percentage of cases with accidental covers included
Streamlining the Application Process
Visualising the interactions between questions on the App Form!
Predictive Underwriting with Machine Learning
Which application questions impact the underwriting outcome, which do not?
Impact on probability for standard or loaded/rejected decision
[Chart: impact of each application question. Legend: High, Higher, Low, Lower, No]
Questions shown:
• Currently doing dangerous sports?
• Currently under treatment or advised surgery?
• Internal disease, skeletal condition, addiction?
• Cancer or neuro-psychol. condition in last 10y?
• Motorbike as competition?
• HIV/AIDS?
• Motocross?
• Taken drugs in last 10y?
• Daily use of motorbike?
• Currently pregnant?
• Had treatment or medical exams in last 3y?
• Motorbike?
• Hospitalization in last 3y?
• Family history?
• Smoked in previous 12 months?
• Previous or advised rehabilitation for addiction?
• Pregnancy complication?
• Stopped usual tasks in previous year?
• Plan to visit/reside abroad?
Remarks
• There are no questions which are always answered with “YES” or “NO”
• Some questions did not have any impact in the model (the data could not explain why)
• Just because factors did not have any impact in the model does not mean the related questions could be waived (impact on selection given, e.g., HIV question, rehabilitation for addiction) → careful consideration required
Cross-Selling with Machine Learning
Analysis of different product portfolios for product development and targeted sales
Organisation: Clients who are no longer active are removed. The remaining data is split into INSURANCE and no INSURANCE.
Separation: The data is then randomly split into 5 even boxes. Each box contains both INSURANCE and no INSURANCE clients, although the proportion within each box varies.
Testing: To form the first so-called “set of training data”, the first 4 boxes are aggregated again. They are then used for sampling the first buying characteristics.
Random Forest (RF): Using machine learning methods, 300 decision trees are generated, simulating customer characteristics. The simulations show chains of combinations for INSURANCE and no INSURANCE.
Validation: The newly created random forest is then used to back-test the remaining 5th box: how accurately can we forecast who bought INSURANCE and who did not?
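The five steps above can be sketched in pure Python as follows. This is a simplified illustration and not the Munich Re implementation. Single-split trees ("stumps") stand in for full decision trees to keep the code short. The client data is synthetic, and the feature names and the rule generating purchases are assumptions.

```python
import random
from collections import Counter


def majority(labels, default=0):
    """Most common label, or default when the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else default


def fit_stump(rows, labels, features):
    """Choose the feature/threshold split with the fewest training errors."""
    best = None
    for feature in features:
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            left_label = majority(left)
            right_label = majority(right, default=left_label)
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, left_label, right_label)
    return best[1:]


def fit_forest(rows, labels, n_trees=300, seed=0):
    """Bagging: each tree sees a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(rows)) for _ in rows]
        features = rng.sample(range(len(rows[0])), 2)
        forest.append(fit_stump([rows[i] for i in sample],
                                [labels[i] for i in sample], features))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    votes = [left if row[feature] <= threshold else right
             for feature, threshold, left, right in forest]
    return majority(votes)


# Synthetic clients (assumption): (age, number of products, income band).
rng = random.Random(1)
clients = []
for _ in range(250):
    age, n_products, income_band = rng.randint(20, 70), rng.randint(0, 3), rng.randint(1, 5)
    active = rng.random() < 0.9
    bought = int(age + 5 * n_products + rng.gauss(0, 5) > 50)  # hypothetical rule
    clients.append((active, (age, n_products, income_band), bought))

# Organisation: remove clients who are no longer active.
active_clients = [(features, bought) for active, features, bought in clients if active]
# Separation: random split into 5 boxes.
rng.shuffle(active_clients)
boxes = [active_clients[i::5] for i in range(5)]
# Testing: aggregate the first 4 boxes into the training set.
training = [client for box in boxes[:4] for client in box]
# Random Forest: 300 trees.
forest = fit_forest([f for f, _ in training], [b for _, b in training])
# Validation: back-test on the 5th box.
hits = sum(predict_forest(forest, features) == bought for features, bought in boxes[4])
accuracy = hits / len(boxes[4])
print(f"back-test accuracy on box 5: {accuracy:.1%}")
```

A production model would use full-depth trees from an established library. The point here is only the workflow: the fifth box is never used in fitting, so its accuracy is an out-of-sample figure.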
Pricing/Lapse Models from Experience Data
Predictive Analytics Liberates Complex Relationships
‘In every block of marble I see a statue as plain as though it stood before me, shaped and perfect in attitude and action. I have only to hew away the rough walls that imprison the lovely apparition to reveal it to the other eyes as mine see it.’
- Michelangelo
Pricing/Lapse Models from Experience Data
Predicting the data or the future?
• Traditional actual versus expected (A v. E) ratios are calculated with the full experience dataset
• Overfitting creates the risk of fitting random variation (noise)
• Potentially leads to a false sense of predictive power
• Build the model with the training data and test the model with the remainder of the data
• Develop predictive power metrics that are meaningful and don’t mislead!
• Provides an indicator of the model’s ability to predict independent data at a granular level
[Figure: the data split into five parts, numbered 1 to 5]
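As a minimal sketch of an A v. E check on held-out data, the Python below computes the ratio overall and by segment. The record layout, the `basis_rate` function and all figures are hypothetical and exist only to show the calculation.

```python
def actual_vs_expected(records, expected_rate):
    """Observed claims divided by the claims the rate basis predicts."""
    actual = sum(record["claims"] for record in records)
    expected = sum(record["exposure"] * expected_rate(record) for record in records)
    return actual / expected


def actual_vs_expected_by(records, expected_rate, key):
    """The same ratio, calculated separately for each value of one field."""
    segments = {}
    for record in records:
        segments.setdefault(record[key], []).append(record)
    return {value: actual_vs_expected(group, expected_rate)
            for value, group in segments.items()}


def basis_rate(record):
    # Hypothetical fitted basis: 0.001 for non-smokers, 0.002 for smokers.
    return 0.002 if record["smoker"] else 0.001


# Hypothetical test records, kept apart from the data used to fit the basis.
test_records = [
    {"exposure": 1000.0, "claims": 1, "smoker": False},
    {"exposure": 500.0, "claims": 2, "smoker": True},
]

overall = actual_vs_expected(test_records, basis_rate)
by_smoker = actual_vs_expected_by(test_records, basis_rate, "smoker")
print(f"overall testing AvE: {overall:.0%}")
for smoker, ratio in by_smoker.items():
    print(f"smoker={smoker}: AvE {ratio:.0%}")
```

Splitting the ratio by segment is one way to look at fit "at a granular level": an overall ratio can look acceptable while individual segments are well off.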
Pricing/Lapse Models from Experience Data
Why use a predictive model?
• Interpretable standardised effects
• Higher predictive power
• Easier to maintain
Factor          Level   Value
Base                    0.00100
Gender          M       1
Gender          F       0.50
Smoker Status   NS      1
Smoker Status   S       2.00
Duration        0       0.70
Duration        1       0.81
Duration        2       0.94
Duration        3+      1
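A table of this kind is typically read multiplicatively: the predicted rate is the base rate times one factor per rating variable. The slide does not state this, so the multiplicative structure is an assumption here, as are the function and variable names in this sketch.

```python
# Values from the table above; the multiplicative reading is an assumption.
BASE_RATE = 0.00100
FACTORS = {
    "gender": {"M": 1.0, "F": 0.50},
    "smoker": {"NS": 1.0, "S": 2.00},
    "duration": {"0": 0.70, "1": 0.81, "2": 0.94, "3+": 1.0},
}


def predicted_rate(gender, smoker, duration):
    """Base rate multiplied by the factor for each rating variable."""
    duration_level = str(duration) if duration < 3 else "3+"
    return (BASE_RATE
            * FACTORS["gender"][gender]
            * FACTORS["smoker"][smoker]
            * FACTORS["duration"][duration_level])


# Female smoker in duration 1: 0.00100 x 0.50 x 2.00 x 0.81
print(f"{predicted_rate('F', 'S', 1):.5f}")
# Male non-smoker in duration 5 sits at the base level for every factor.
print(f"{predicted_rate('M', 'NS', 5):.5f}")
```

Each factor is a standardised effect, so changing one level changes the rate by the same proportion whatever the other levels are.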
Pricing/Lapse Models from Experience Data
Are our performance metrics reasonable?
[Chart: Predictive Power Metric (left axis, 0 to 10000) and Testing AvE (right axis, 95% to 104%) by model number (1 to 49). Series: predictive power of each candidate model; AvE per candidate model; predictive power of existing rates. Annotation: models to the left outperform the existing rates!]
09.09.2016
Applications of Predictive Analytics
Pricing/Lapse Models from Experience Data
Get better performance + applicable to Big Data
• Compare different Machine Learning algorithms (Support Vector Machines, Random Forests, Boosted Trees, Regression Boosting, Lasso-regularized Regression) with classical GLMs
• Applied to Mortality data
• Additionally: Clear visualization of main and interaction effects
Machine Learning helps in understanding and selecting the most relevant influential factors
Unstructured Text Mining
• Analysis of tweets by location, time period
• Search for key words
• Text cleansing
• Sentiment clouds
• Triggers of emerging risks
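The cleansing and keyword-search steps can be sketched in a few lines of Python. The cleansing rules, the sample messages and the function names below are assumptions for illustration; real pipelines usually add steps such as stop-word removal and stemming.

```python
import re
from collections import Counter


def cleanse(text):
    """Lower-case the text, drop links and punctuation, and split into words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # remove punctuation and symbols
    return text.split()


def keyword_counts(messages, keywords):
    """Count how many messages mention each keyword at least once."""
    counts = Counter()
    for message in messages:
        words = set(cleanse(message))
        for keyword in keywords:
            if keyword in words:
                counts[keyword] += 1
    return counts


# Hypothetical messages standing in for tweets.
messages = [
    "Flood warning in Durban! http://example.com",
    "No FLOOD damage here, just hail.",
    "Hail storm tonight",
]

counts = keyword_counts(messages, ["flood", "hail"])
print(dict(counts))
```

Counting by message rather than by word keeps one long, repetitive message from dominating the totals.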
Insurability of impaired lives
Deriving Underwriting Guidelines for Medical Impairments
AI
Neural Network: Insurance-specific Visual Intelligence
General Object Vision Intelligence: AI Community, e.g., Google, Facebook, …
Insurance-specific Vision Intelligence: Insurance Companies, e.g., Munich Re, …
Images left: used under license from shutterstock.com
Image right: Getty Images
AI
Neural Network
[Diagram: neural network with Input, Hidden and Output layers. Example images are labelled “Pothole identified” or “No pothole identified”]
Images: Getty Images; used under license from shutterstock.com
• System of interconnected nodes, exchanging information
• Weights of connections can be adjusted by supervised/unsupervised “learning”
• Pros: Accuracy usually high, prediction fast
• Cons: “Black box”: acquired knowledge not easily comprehensible, training effort high, appropriate data needed
• Application areas, e.g., speech recognition, computer vision, medical diagnosis, automated trading, game-playing (AlphaGo)
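To make "interconnected nodes" and "weights of connections" concrete, here is a minimal forward pass through a network with one hidden layer, in pure Python. The weights, the three-number input and the 0.5 cut-off are hypothetical. A real image model would have far more inputs and would learn its weights from labelled data rather than have them typed in.

```python
import math


def sigmoid(value):
    """Squash any real number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-value))


def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per node, then sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
            for node_weights, bias in zip(weights, biases)]


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Input -> hidden layer -> single output node."""
    hidden = layer(inputs, hidden_weights, hidden_biases)
    return layer(hidden, [output_weights], [output_bias])[0]


# Hypothetical input features and weights (two hidden nodes).
pixels = [0.2, 0.9, 0.4]
hidden_weights = [[0.5, -1.0, 0.3], [1.2, 0.8, -0.7]]
hidden_biases = [0.1, -0.2]
output_weights = [1.5, -2.0]
output_bias = 0.05

score = forward(pixels, hidden_weights, hidden_biases, output_weights, output_bias)
label = "Pothole identified" if score >= 0.5 else "No pothole identified"
print(f"score {score:.3f}: {label}")
```

"Learning" means adjusting the weight and bias values so that scores like this one match the labels on known examples.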
AI
Potential use-cases of Neural Network Infrastructure Insurance
Detect road damage → Categorize damage → Estimate claim → Trigger repair action
Images: used under license from shutterstock.com
Geospatial Analytics
New data sets are triggering new business ideas
Digital analytics and transformations in insurance
Digital analytics Tools
What does this mean for the pragmatic actuary?
• Know your business model!
• Feedback loops
• Post model development
• Unintended consequences
• Blind spots
• Types of Errors
• The Gap between a model and ‘basis’
• Who wins in an arms race? Arms Manufacturers
Any questions or comments?
Lee Sarkin ([email protected])
Gavin Maistry ([email protected])