Oracle Advanced Analytics, Data Mining, Predictive Analytics,
Big Data, Exalytics—What, Where, When?
Charlie Berger
Sr. Director Product Management, Data Mining and Advanced Analytics
Oracle Corporation
www.twitter.com/CharlieDataMine
Sources for “Big Data” are Growing
• 383+ Million Twitter
accounts (100m+ tweeting)
• 835+ Million Facebook
subscribers
• 1.2+ Billion Mobile Web
users
• Over 6 million OnStar
subscribers
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.
Structured Data & “Big Data”
Structured data from applications.
Semi-structured “Big Data” from social
media and logs, sensors, feeds, etc.
“Big Data” + “Big Data Analytics”
“There was 5 exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days, and the pace is increasing.”
- Google CEO Eric Schmidt
1.8 trillion gigabytes of data was created in 2011…
• More than 90% is unstructured data
• Approx. 500 quadrillion files
• Quantity doubles every 2 years
Requires capability to rapidly:
• Collect and Integrate
• Understand
• Respond and Act
[Chart: gigabytes of data created (in billions), structured vs. unstructured data, 2005 to 2015, y-axis 0 to 10,000. Source: IDC 2011]
Content Provided By Cloudera.
"Big data" warrants innovative processing solutions for a
variety of new and existing data to provide real business
benefits. But processing large volumes or wide varieties of
data remains merely a technological solution unless it is tied
to business goals and objectives.
Customer Information Suddenly Worth Billions
There's another mining boom you may have missed
http://www.theage.com.au/technology/technology-news/digging-for-data-the-new-mining-boom-20120511-1yhu5.html#ixzz1uwlsl0gk
“It's about building algorithms and crunching facts
and numbers. It's mining for data.
Big data is the new business black. It's a catch-all phrase
for the billions of transactions and other bits of information
about their customers, suppliers and operations logged by
businesses and governments the world over every day.
Yesterday's storage problem has become today's strategic asset. Turns out
there's gold in them thar files.
Enterprises are using data analysis not just to improve their everyday business
processes, but also to build predictive models of consumer behavior. Retailers, telcos,
airlines, hotels, health care and credit card companies are among those with
information-rich customer data.”
Sample of Big Data Analytics Use Cases Today
AUTOMOTIVE: Auto sensors reporting location, problems
HIGH TECHNOLOGY / INDUSTRIAL MFG.: Mfg quality; Warranty analysis
OIL & GAS: Drilling exploration sensor analysis
COMMUNICATIONS: Location-based advertising
CONSUMER PACKAGED GOODS: Sentiment analysis of what’s hot, problems
LIFE SCIENCES: Clinical trials; Genomics
MEDIA/ENTERTAINMENT: Viewers / advertising effectiveness
RETAIL: Consumer sentiment; Optimized marketing
TRAVEL & TRANSPORTATION: Sensor analysis for optimal traffic flows; Customer sentiment
FINANCIAL SERVICES: Risk & portfolio analysis
ON-LINE SERVICES / SOCIAL MEDIA: People & career matching; Web-site optimization
UTILITIES: Smart Meter analysis
EDUCATION & RESEARCH: Experiment sensor analysis
HEALTH CARE: Patient sensors, monitoring, EHRs; Quality of care
LAW ENFORCEMENT & DEFENSE: Threat analysis; social media monitoring, photo analysis
Challenged by: Data Volume, Velocity, Variety
Oracle Big Data and Big Data Analytics Platform
Accelerate time to market & reduce risk with end-to-end solution
Stream | Acquire | Organize/Discover | Analyze | Visualize/Decide
Oracle is the industry leader in database and information management.
Oracle provides all the components you need to get real results from your big data initiatives
Oracle Exalytics
BI at the Speed of Thought
• Oracle Exalytics In-Memory Machine is the world's
first engineered system specifically designed to
deliver high performance analysis, modeling and
planning
• Built using industry-standard hardware, market-leading business intelligence software and in-memory database technology
• Oracle Exalytics is an optimized system that delivers
answers to all your business questions with
unmatched speed, intelligence, simplicity and
manageability
• Oracle Exalytics delivers BI and Advanced/Predictive
Analytics at the Speed of Thought
Oracle Real-Time Decisions
Powering the Intelligent Enterprise and Decision Framework
• Decision Management
– Collaborative environment to define decision management strategies
– Business user controls over decision optimization logic
– Cross-channel customer experience management framework
• Learning Engine
– Automatically learn from each interaction and discover important correlations
– Learning can be analyzed by way of user friendly reports
– Learning can be used to make predictions
– Can be deployed independently from decision engine
• Decision Engine
– Combines rules and [automated] predictive models to define contextual, optimal and personalized decision logic
– Decision logic is highly scalable and self-adjusts based on company defined performance goals
– Can make SQL calls to previously built ODM models
[Diagram labels: Choices / Assets; Real-time Context + Historical Data; Performance Goals; Rules & Predictive Models; Decisions; Closed-Loop Learning; Insight & Foresights; Recommendations]
Oracle Advanced Analytics,
Data Mining, Predictive
Analytics, Big Data,
Exalytics—What,
Where, When?
“But the bigger you get, the more likely you are to be
doing extensive data mining and the more likely you
are to be implementing or moving towards in-database
analytics.”
Quote from “Customary Data Warehouse Concepts vs. Hadoop: Forrester Makes the Call”,
Mark Brunelli, Senior News Editor. Published: 11 Aug 2011
http://searchdatamanagement.techtarget.com/news/2240039468/Customary-data-warehouse-concepts-vs-Hadoop-Forrester-makes-the-call?vgnextfmt=print
Oracle’s Big Data/Big Data Analytics Integrated Solution
[Architecture diagram. Stages: Acquire; Organize & Discover; Analyze; Decide.
Components shown: Oracle Big Data Appliance (Cloudera Hadoop, Oracle NoSQL, Open-Source R); Big Data Connectors; Oracle Data Integrator; Oracle Exadata (Oracle Database, Oracle Advanced Analytics: ODM + ORE); Oracle Exalytics (Oracle Business Intelligence); Oracle Real-Time Decisions; Endeca Information Discovery; InfiniBand links between the systems.]
Oracle Advanced Analytics—What, Where, When?
[Decision flowchart]
Questions asked:
• Do I have some data and a problem to solve?
• Does the data fit on a piece of paper?
• Is the data stored in an Oracle Database?
• Is the data outside of the database?
• Is the problem mostly sums, %s, pie charts and maybe a map?
• Is OBIEE fast enough?
• Are you looking for complex patterns and relationships?
• Do you need real-time predictions?
Outcomes:
• Bummer!
• Print it out or use Excel
• Consider storing data in Oracle
• Use OBIEE
• Consider Exalytics
• Consider Predictive Analytics
• Use Oracle Data Mining and/or Oracle R Enterprise
• Consider RTD
Inspired by “Do I Need Hadoop or SQL?” decision flow chart https://s3.amazonaws.com/aaroncordova-published/DataFlowchart.svg
Data Mining / Predictive Analytics
Overview and Value
• Predictive analytics enables you to develop mathematical models
to help you better understand the variables driving success
• Predictive analytics relies on formulas that compare past successes and
failures, and then uses those formulas to predict future outcomes
• Predictive analytics, pattern recognition, and classification problems have been
long used in the financial services and insurance industries
• Predictive analytics is about using statistics, data mining, and game theory to
analyze current and historical facts in order to make predictions about future
events.
• The value of predictive analytics is obvious. The more you understand
customer behavior and motivations, the more effective your marketing will be.
– The more you understand why some customers are loyal and how to attract and retain different customer
segments, the more you can develop relevant, compelling messages and offers.
http://www.marketingprofs.com/articles/2010/3567/the-nine-most-common-data-mining-techniques-used-in-predictive-analytics
Advanced Analytics
Becoming Strategic and Mission Critical
Competing on Analytics, by Tom Davenport
“Some companies have built their very businesses on their ability to collect,
analyze, and act on data.”
Oracle Advanced Analytics Option
Extending the Database into a Comprehensive Advanced Analytics Platform
• Big Data Analytics—Oracle Advanced Analytics Option
– Oracle Data Mining
• SQL & PL/SQL focused in-database data mining and predictive analytics
– Oracle R Enterprise
• Integrates Open Source R statistical programming language
with the Oracle Database
• STRATEGY:
– Extend Oracle Database into comprehensive adv. analytics platform
• More than a “tool” for data analysts
– Enable “next-gen” enterprise-wide advanced analytical applications
Oracle In-Database Advanced Analytics
Comprehensive Advanced Analytics Platform
Oracle R Enterprise
• Popular open source R statistical programming language & environment
• Integrated with database for scalability
• Wide range of statistical and advanced analytical functions
• R embedded in enterprise appls & OBIEE
• Exploratory data analysis
• Extensive graphics
• Open source R (CRAN) packages
• Integrated with Hadoop for HPC
Oracle Data Mining
• SQL kernel; automated knowledge discovery inside the Database
• 12 in-database data mining algorithms
• Text mining
• Predictive analytics applications development environment
• Star schema and transactional data mining
• Exadata "scoring" of ODM models
• SQL Developer/Oracle Data Miner GUI
[Diagram labels: Statistics; Advanced Analytics; Data & Text Mining; Predictive Analytics]
Independent Samples T-Test
(Pooled Variances)
• Query compares the mean of AMOUNT_SOLD between
MEN and WOMEN within CUST_INCOME_LEVEL ranges. Returns
observed t value and its related two-sided significance
SELECT substr(cust_income_level,1,22) income_level,
avg(decode(cust_gender,'M',amount_sold,null)) sold_to_men,
avg(decode(cust_gender,'F',amount_sold,null)) sold_to_women,
stats_t_test_indep(cust_gender, amount_sold, 'STATISTIC','F') t_observed,
stats_t_test_indep(cust_gender, amount_sold) two_sided_p_value
FROM sh.customers c, sh.sales s
WHERE c.cust_id=s.cust_id
GROUP BY rollup(cust_income_level)
ORDER BY 1;
SQL Plus
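As a rough illustration of what a pooled-variance t statistic measures, the sketch below computes it by hand in Python. It does not reproduce Oracle's implementation or its sign convention, and the amounts are made-up values, not rows from the SH sample schema.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample t statistic assuming both groups share one variance."""
    n_a, n_b = len(a), len(b)
    # Weighted average of the two sample variances (the "pooled" variance).
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_err


# Hypothetical amounts sold to two groups of customers.
sold_to_men = [10.0, 12.0, 14.0]
sold_to_women = [11.0, 13.0, 15.0]
t_observed = pooled_t_statistic(sold_to_men, sold_to_women)
print(round(t_observed, 4))
```

A t value near zero, as here, suggests the two group means are not far apart relative to the spread of the data.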
Data Mining Provides
Better Information, Valuable Insights and Predictions
Cell Phone Churners
vs. Loyal Customers
Segment #3:
IF CUST_MO > 7 AND
INCOME < $175K, THEN
Prediction = Cell Phone
Churner, Confidence = 83%,
Support = 6/39
Insight & Prediction
Segment #1:
IF CUST_MO > 14 AND
INCOME < $90K, THEN
Prediction = Cell Phone
Churner, Confidence = 100%,
Support = 8/39
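The confidence and support figures on rules like these can be checked with a few lines of Python. This sketch assumes support means the share of all customers matched by the IF condition and confidence means the share of matched customers that carry the predicted label; the records and field names are hypothetical.

```python
def rule_stats(records, condition, predicted_label):
    """Return (support, confidence) for an IF-condition-THEN-label rule."""
    matched = [r for r in records if condition(r)]
    if not matched:
        return 0.0, 0.0
    correct = sum(1 for r in matched if r["label"] == predicted_label)
    support = len(matched) / len(records)   # how much of the data the rule covers
    confidence = correct / len(matched)     # how often the rule is right when it fires
    return support, confidence


# Hypothetical customers; not the 39 customers shown on the slide.
customers = [
    {"cust_mo": 16, "income": 60_000, "label": "churner"},
    {"cust_mo": 20, "income": 85_000, "label": "churner"},
    {"cust_mo": 15, "income": 120_000, "label": "loyal"},
    {"cust_mo": 5, "income": 50_000, "label": "loyal"},
    {"cust_mo": 18, "income": 70_000, "label": "loyal"},
]

support, confidence = rule_stats(
    customers,
    lambda r: r["cust_mo"] > 14 and r["income"] < 90_000,
    "churner",
)
print(support, confidence)
```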
Customer Months
Source: Inspired from Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management by Michael J. A. Berry, Gordon S. Linoff
Data Mining Provides
Better Information, Valuable Insights and Predictions
Cell Phone Fraud
vs. Loyal Customers
?
Customer Months
Source: Inspired from Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management by Michael J. A. Berry, Gordon S. Linoff
Finding Needles in Haystacks
• Haystacks are usually BIG
• Needles are typically small and rare
Look for What is “Different”
A Real Fraud Example
• My actual credit card statement—Can you see the fraud?
Date     Time      Category  Merchant     Amount
May 22   1:14 PM   FOOD      Monaco Café  $127.38
May 22   7:32 PM   WINE      Wine Bistro  $28.00
…
June 14  2:05 PM   MISC      Mobil Mart   $75.00
June 14  2:06 PM   MISC      Mobil Mart   $75.00
June 15  11:48 AM  MISC      Mobil Mart   $75.00
June 15  11:49 AM  MISC      Mobil Mart   $75.00
May 28   6:31 PM   WINE      Acton Shop   $31.00
May 29   8:39 PM   FOOD      Crossroads   $128.14
June 16  11:48 AM  MISC      Mobil Mart   $75.00
June 16  11:49 AM  MISC      Mobil Mart   $75.00
Callouts: Monaco? Gas Station? All same $75 amount? Pairs of $75?
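The pattern in this statement (identical charges at one merchant a minute apart) is simple enough to express as a rule. The Python sketch below is only an illustration of that idea, not how Oracle Data Mining detects fraud; the year and the five-minute window are assumptions, while merchants, times and amounts are taken from the statement.

```python
from datetime import datetime, timedelta

# (timestamp, merchant, amount); the year 2012 is assumed for illustration.
transactions = [
    ("2012-05-22 13:14", "Monaco Café", 127.38),
    ("2012-05-22 19:32", "Wine Bistro", 28.00),
    ("2012-05-28 18:31", "Acton Shop", 31.00),
    ("2012-05-29 20:39", "Crossroads", 128.14),
    ("2012-06-14 14:05", "Mobil Mart", 75.00),
    ("2012-06-14 14:06", "Mobil Mart", 75.00),
    ("2012-06-15 11:48", "Mobil Mart", 75.00),
    ("2012-06-15 11:49", "Mobil Mart", 75.00),
    ("2012-06-16 11:48", "Mobil Mart", 75.00),
    ("2012-06-16 11:49", "Mobil Mart", 75.00),
]


def find_rapid_repeats(rows, window=timedelta(minutes=5)):
    """Return consecutive charges with the same merchant and amount inside `window`."""
    parsed = sorted(
        (datetime.strptime(ts, "%Y-%m-%d %H:%M"), merchant, amount)
        for ts, merchant, amount in rows
    )
    pairs = []
    for prev, curr in zip(parsed, parsed[1:]):
        same_charge = prev[1] == curr[1] and prev[2] == curr[2]
        if same_charge and curr[0] - prev[0] <= window:
            pairs.append((prev, curr))
    return pairs


suspicious = find_rapid_repeats(transactions)
for first, second in suspicious:
    print(first[0], second[0], first[1], first[2])
```

A hand-written rule like this only catches patterns someone already thought of, which is the motivation for learning what "different" looks like from the data.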
Oracle’s In-Database Advanced Analytics Option
Value Proposition
•10-100x PERFORMANCE
– Integrated features of the Database
– Perform analytics in-DB to eliminate data movement
– Reduce information latency: days-weeks to minutes-hours
•10x LOWER TOTAL COST OF OWNERSHIP
– Eliminate/minimize expensive annual usage fees associated with
traditional stats/mining packages
– Leverage Oracle DB, DW & BI technology platform
In-Database Data Mining
Traditional Analytics (Hours, Days or Weeks)
• Steps shown: Data Extraction; Data Import; Data Preparation and Transformation; Data Mining Model Building; Data Mining Model “Scoring”
• Data moves between: Source Data; Datasets/Work Area; Analytical Processing; Process Output; Target
Oracle Data Mining (Secs, Mins or Hours)
• Steps shown: Embedded Data Prep; Model Building; Model “Scoring”
• Data remains in the Database
• Embedded data preparation
• Cutting edge machine learning algorithms inside the SQL kernel of Database
• SQL: most powerful language for data preparation and transformation
Savings / Results
• Faster time for “Data” to “Insights”
• Lower TCO: eliminates data movement and data duplication
• Maintains security
Big Data Analysis Example
Social Network Analysis (SNA)
• Identify social relationships
– Communities, friends and families
– Hubs, influencers, lone wolves
• SNA-based strategies for Acquisition, Retention, and
Customer Value Growth
– Word of mouth
– Promote positive messages
– Suppress negative effects
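One simple way to picture hubs and lone wolves is to count each person's connections. The Python sketch below does that for a tiny hypothetical call graph; it is a toy degree count, not a description of any Oracle SNA feature, and all names are invented.

```python
def degree_by_member(edges, members):
    """Count how many relationships each member takes part in."""
    degree = {m: 0 for m in members}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


# Hypothetical subscribers and who has called whom.
members = ["ann", "bob", "cat", "dan", "eve"]
calls = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("bob", "cat")]

degree = degree_by_member(calls, members)
top = max(degree.values())
hubs = [m for m, d in degree.items() if d == top]      # best-connected members
lone_wolves = [m for m, d in degree.items() if d == 0]  # members with no connections
print(hubs, lone_wolves)
```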
Big Data Analytics in Retail
Deeper Analytics
Leverage all customer touch-point information
Consider each customer’s demographics and past and most recent POS behavior
Deeper Analytics
POS data: shifting “market basket” items (Target Stores predicts a woman is pregnant)
Product description data—identify product clusters (“Green” products, “Favorite colors”)
Mine both and identify customer segments (“Country Squires”, “Green”, “New Empty Nests”)
Track and monitor shifts in customer behaviors and household purchases
Target promotions for up-selling and cross-selling
Anticipate customer’s likelihood to respond and optimize selling strategies
Deploy predictive models for real-time customer recommendations
1:1 Marketing—treat each customer as an individual relationship
Look for opportunities to combine sales, service, web, call center, payment, etc. data
Customer’s comments with Reps telegraph customer’s sentiment and needs
Geo-localized information provides opportunity for real-time recommendations
Purchases, changing consumption, provide opportunities for cross-selling/up-selling
Big Data Analytics in Communications
Deeper Analytics: Enhanced churn prediction with social network analytics
Consider each customer’s value as part of their social network
Focus retention campaigns on high-value social networks
Identify new prospective high-value customers and their “friends and families”
Target promotions for up-selling and cross-selling to key social network influencers
Identify rotational churners and exclude from retention offers
Deeper Analytics: Predictive network monitoring and anomaly detection
Mine network traffic performance data
Identify patterns in network behavior
Proactively manage networks and customer service levels
Real-time prediction of future degraded service levels
Big Data Analytics in Financial Services
Deeper Analytics: 360° view of the customer
Integrate silos of multi-business CRM data within large corporations
Combine customer data from multiple sources: investments, retail banking, mortgage
Gain 360° perspective of all touch points with a customer
Develop “best” customer profiles and sell them the right product at the right time
Anticipate customer’s needs for new products and services as their lifestyles change
Deeper Analytics: Identify and combat fraud
Real-time check fraud
Transactional data combined with demographic data
Monitor velocity of recent purchases and amounts of checks written vs. historical averages
Flag transactions and individuals that appear “different” from normal behavior
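Comparing a new amount against a customer's historical average can be sketched as a z-score check. The Python example below is a minimal illustration under assumed values: the threshold of 3 standard deviations and the purchase history are hypothetical, and a production system would use far richer features.

```python
from statistics import mean, stdev


def is_unusual(history, new_amount, threshold=3.0):
    """Flag an amount that lies more than `threshold` standard deviations from the historical mean."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # No variation in the history: anything different is unusual.
        return new_amount != mu
    return abs(new_amount - mu) / sigma > threshold


# Hypothetical check amounts previously written by one customer.
history = [42.0, 55.0, 38.0, 61.0, 47.0, 50.0]
print(is_unusual(history, 52.0))
print(is_unusual(history, 900.0))
```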
Big Data Analytics in Insurance
Deeper Analytics
Automated deep analytics for fraud and abuse in claims processing
Include more data in analysis:
Transactional data: trends in frequency of previous claims and transactions
e.g. increasing rate of claims made and amount of claims
Unstructured data: assessors’ reports, police reports, witness interviews
e.g. “fractured” + “wrist” << “broken” + “femur”
Investigate claims that have the highest expected risk: P(fraud) x $ claim amount
Focus scarce investigative resources and create feedback loop for automated analysis
Deeper Analytics
[See http://www.information-management.com/issues/20030701/6995-1.html among other “text mining insurance claims” references]
Individualized auto-insurance policies based on vehicle telematics
Insurers gain insight into customer’s driving habits
More accurate assessments of risks
Individualized pricing based on driving habits and risks
Guide and motivate customers to improve their driving habits
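The "highest expected risk" idea from the claims-fraud section of this slide, fraud probability multiplied by claim amount, can be sketched in a few lines of Python. The claim IDs, probabilities and amounts below are hypothetical; in practice the probabilities would come from a scoring model.

```python
def rank_by_expected_loss(claims, top_n=2):
    """Order claims by P(fraud) * claim amount, highest first."""
    scored = [(c["claim_id"], c["p_fraud"] * c["amount"]) for c in claims]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]


# Hypothetical scored claims.
claims = [
    {"claim_id": "A", "p_fraud": 0.90, "amount": 500.0},
    {"claim_id": "B", "p_fraud": 0.20, "amount": 20_000.0},
    {"claim_id": "C", "p_fraud": 0.60, "amount": 3_000.0},
]

to_investigate = rank_by_expected_loss(claims)
print(to_investigate)
```

Claim A has the highest fraud probability but the smallest expected loss, so ranking by probability alone would send investigators to the wrong place first.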
Big Data Analytics High Performance Operations
Deeper Analytics
Learn from manufacturing, warranty, service data
Devices report back product’s performance
Analyze usage and part failure correlations and patterns
Identify new strategies for improved product design and service plans
Increase product uptime, performance and quality
Deeper Analytics
Characterize and understand all product performance scenarios
Streaming data from multiple sensors, weather, water, etc.
Clustering and response modeling to optimize each scenario
“The USA holds 250 sensors to collect raw data: pressure sensors on the wing;
angle sensors on the adjustable trailing edge of the wing sail .... … But
collecting data was only the beginning. BMW ORACLE Racing also had to
manage that data, analyze it, and present useful results. The team turned to
Oracle Data Mining in Oracle Database 11g.”
SQL Developer 3.0/Oracle Data Miner 11g Release 2 GUI
• Graphical User Interface
for data analyst
• SQL Developer Extension
(OTN download)
• Explore data—discover
new insights
• Build and evaluate data
mining models
• Apply predictive models
• Share analytical workflows
• Deploy SQL Apply
code/scripts
New GUI
Oracle Data Miner Nodes (Partial List)
Tables and Views
Transformations
Explore Data
Modeling
Text
Oracle Data Mining and Unstructured Data
• Oracle Data Mining
mines unstructured
i.e. “text” data
• Include free text and
comments in ODM
models
• Cluster and Classify
documents
• Oracle Text
used to preprocess
unstructured text
Oracle Data Miner 11g Release 2 GUI
Churn Demo—Simple Conceptual Workflow
Churn models to predict
and “profile” likely
churners
Fraud Prediction Demo
-- Reset the demo objects (both statements raise an error if the objects do not exist yet)
drop table CLAIMS_SET;
exec dbms_data_mining.drop_model('CLAIMSMODEL');

-- Settings table: choose the SVM algorithm and turn on automatic data preparation
create table CLAIMS_SET (setting_name varchar2(30), setting_value varchar2(4000));
insert into CLAIMS_SET values ('ALGO_NAME','ALGO_SUPPORT_VECTOR_MACHINES');
insert into CLAIMS_SET values ('PREP_AUTO','ON');
commit;

-- Build the model on CLAIMS, keyed by POLICYNUMBER.
-- The target is null: with SVM classification this builds a one-class
-- (anomaly detection) model, in which class '0' marks atypical rows.
begin
  dbms_data_mining.create_model('CLAIMSMODEL', 'CLASSIFICATION',
    'CLAIMS', 'POLICYNUMBER', null, 'CLAIMS_SET');
end;
/

-- Top 5 most suspicious fraud policy holder claims
select * from
  (select POLICYNUMBER, round(prob_fraud*100,2) percent_fraud,
          rank() over (order by prob_fraud desc) rnk
   from (select POLICYNUMBER,
                prediction_probability(CLAIMSMODEL, '0' using *) prob_fraud
         from CLAIMS
         where PASTNUMBEROFCLAIMS in ('2to4', 'morethan4')))
where rnk <= 5
order by percent_fraud desc;
POLICYNUMBER PERCENT_FRAUD        RNK
------------ ------------- ----------
        6532         64.78          1
        2749         64.17          2
        3440         63.22          3
         654          63.1          4
       12650         62.36          5
Automated Monthly “Application”! Just add:
create view CLAIMS2_30 as
select * from CLAIMS2
where mydate > SYSDATE - 30;
Real-time Prediction for a Customer
• On-the-fly, single record apply with new data (e.g. from
call center)
Select prediction_probability(CLAS_DT_1_1, 'Yes'
USING 7800 as bank_funds, 125 as checking_amount, 20 as
credit_balance, 55 as age, 'Married' as marital_status,
250 as MONEY_MONTLY_OVERDRAWN, 1 as house_ownership)
from dual;
[Diagram: channels that “Get Advice”: Call Center, Social Media, Branch, ECM, BI, Web, Email, CRM, Mobile]
RTD Calling In-Database Predictive Analytics
• When appropriate, RTD can make SQL queries requesting retrieval of previously built in-database predictive models OR additional real-time ODM predictions based on current data
• Operationalize Decisions
• Self-learning models
• Arbitration of scores and KPIs
[Diagram labels: RTD; RTD SQL Call; ODM & ORE; Score Returned; In-Database Algorithms and Data Mining; Realtime Scoring Platform]
Exadata + Data Mining 11g Release 2
“DM Scoring” Pushed to Storage!
Faster
• In 11g Release 2, SQL predicates and Oracle Data Mining models are pushed to
storage level for execution
For example, find the US customers likely to churn:
select cust_id
from customers
where region = 'US'
and prediction_probability(churnmod, 'Y' using *) > 0.8;
Predictive Analytics Applications
(Partial list as of 6/12)
• Oracle CRM Sales Prospector
• SaaS—Prediction of Sales opportunities, what to sell, amount, timing
• Fusion CRM Sales Prediction Engine
• Sales Prediction—Prediction of Sales opportunities, what to sell, amount, timing, etc.
• Oracle Fusion Human Capital Management (HCM)
• Predictive Workforce: employee turnover and performance prediction and “What if?” prediction
• Oracle Industry Data Models—factory installed data mining for specific industries
• Communications Data Model—churn, segmentation, profiling, etc.
• Retail Data Model—market basket analysis, loyalty, etc.
• Airline Data Model—frequent flyer loyalty, FF profiles, targeted promotions, etc.
• Oracle Spend Classification—auto review/real-time correction of submission mistakes
• Oracle Identity Management
• Adaptive Access Manager Real-time Security at user login
• Oracle FMW
• Complex Event Processing integrated with ODM models
• Oracle Advanced Customer Support
• Predictive Incident Monitoring (PIM) Service for Oracle Database customers
• Oracle Retail Customer Analytics
• Market basket analysis application for Retail GBU
CRM Sales Prospector/ Fusion Sales Prediction Engine
Factory Installed PA/ODM Methodologies
Oracle Sales Prospector
Oracle Data Mining
predicts likelihood of
purchases
ODM Predictions
exposed via
Social CRM
Dashboards
Oracle Database 11G
Social CRM schema
ships with
Oracle Database EE
11g + Data Mining
Option
Oracle Data Mining recommends
products customer is likely to buy
Oracle Data
Mining suggests
likely references
Oracle Communications Industry Data Model Example
Better Information for OBIEE Dashboards
ODM’s predictions & probabilities are
available in the Database for reporting
using Oracle BI EE and other tools
Exadata with Analytics and Business Intelligence
Better Together
• In-database data mining builds predictive models that predict customer behavior
• OBIEE’s integrated spatial mapping shows where
Customers “most likely” to be HIGH and VERY HIGH value customers in the future
Fusion HCM Predictive Analytics
Factory Installed PA/ODM Methodologies
Oracle Data Mining’s factory-installed predictive analytics
show employees likely to leave, top reasons, expected
performance and real-time "What if?" analysis
Retail GBU (formerly Retek, GA in Q1FY13)
Market Basket Analysis
Market Basket Analysis to identify co-occurring items found in “baskets”
and potential product bundles
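Counting co-occurring items is the starting point of market basket analysis. The Python sketch below tallies how often each pair of items appears in the same basket; the baskets are hypothetical, and real tools go further by deriving association rules with support and confidence thresholds.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so ("milk", "bread") and ("bread", "milk") share one key.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


# Hypothetical point-of-sale baskets.
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "butter"],
]

counts = pair_counts(baskets)
for pair, n in counts.most_common(2):
    print(pair, n)
```

Pairs that co-occur most often are candidates for product bundles or cross-sell offers.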
R Statistical Programming Language
Open source language and
environment
Used for statistical computing
and graphics
Strength in easily producing
publication-quality plots
Highly extensible with open
source community R packages
Oracle R Enterprise Compute Engines
1. User R Engine on desktop
• R-SQL Transparency Framework intercepts R functions for scalable in-database execution
• Function intercept for data transforms, statistical functions and advanced analytics
• Interactive display of graphical results and flow control as in standard R
• Submit entire R scripts for execution by Oracle Database
2. Database Compute Engine
• Scale to large datasets
• Access tables, views, and external tables, as well as data through DB LINKS
• Leverage database SQL parallelism
• Leverage new and existing in-database statistical and data mining capabilities
3. R Engine(s) spawned by Oracle DB
• Database can spawn multiple R engines for database-managed parallelism
• Efficient data transfer to spawned R engines
• Emulate map-reduce style algorithms and applications
• Enables “lights-out” execution of R scripts
[Diagram labels: R Engine; Open Source R; Other R packages; Oracle R Enterprise packages; SQL; Results; Oracle Database; User tables]
R Graphics
R> boxplot(split(CARSTATS$mpg, CARSTATS$model.year), col = "green")
MPG increases
over time…
Oracle R Enterprise Statistics Engine
https://blogs.oracle.com/R/entry/introduction_to_the_ore_statistics
Significance Tests
Chi-square, McNemar, Bowker
Simple and weighted kappas
Cochran-Mantel-Haenszel correlation
Cramer's V
Binomial, KS, t, F, Wilcox
Distribution Functions
Beta distribution
Binomial distribution
Cauchy distribution
Chi-square distribution
Exponential distribution
F-distribution
Gamma distribution
Geometric distribution
Log Normal distribution
Logistic distribution
Negative Binomial distribution
Normal distribution
Poisson distribution
Sign Rank distribution
Student t distribution
Uniform distribution
Weibull distribution
Density Function
Probability Function
Quantile distribution
Other Functions
Gamma function
Natural logarithm of the Gamma function
Digamma function
Trigamma function
Error function
Complementary error function
Base SAS Equivalents
Freq, Summary, Sort
Rank, Corr, Univariate
Oracle R Enterprise ARIMA Forecasting Script
# January 2008 flights from the ONTIME_S table, pulled into local R memory
year200801 <- ONTIME_S[(ONTIME_S$YEAR == 2008) & (ONTIME_S$MONTH == 1), ]
y <- ore.pull(year200801)
gc()
# Average arrival delay per day of the month, as a time series
delays <- tapply(y$ARRDELAY, y$DAYOFMONTH, mean, na.rm = TRUE)
delays <- ts(delays, start = 1, end = 31, frequency = 1)
# Fit an ARIMA model on a growing window, starting with the first 5 delays, and predict the rest
preds <- c()
ses <- c()
# 1 step predictions
for (i in 5:length(delays)) {
  fit <- arima(delays[1:i], c(1, 2, 1))
  # predict 1 step into the future.
  pred <- predict(fit)
  preds <- c(preds, pred$pred)
  ses <- c(ses, pred$se)
}
plot(5:length(delays), preds, type = 'l', col = 'green',
     ylim = range(c(preds + 2 * ses, preds - 2 * ses)), xlab = "Day of month",
     ylab = "Predicted average delay (in minutes)",
     main = "Average delays by day for January 2008")
lines(5:length(delays), preds + 2 * ses, col = 'red')
lines(5:length(delays), preds - 2 * ses, col = 'red')
points(5:length(delays), as.vector(delays[5:length(delays)]))
# Legend colors match the plotted series: black points, green prediction, red bounds
legend(23, -8, c("Delay", "Predicted delay", "2 se confidence"),
       col = c("black", "green", "red"), lty = c(0, 1, 1), pch = c(1, -1, -1), merge = TRUE)
Oracle BI Applications
Comprehensive, Prebuilt, Best Practice Analytics
Industries: Auto; Comms & Media; Complex Mfg; Consumer Sector; Energy; Financial Services; High Tech; Insurance & Health; Life Sciences; Public Sector; Travel & Trans
Sales: Pipeline Analysis; Forecast Accuracy; Sales Team Effectiveness; Up-sell/Cross-sell; Cycle Times; Lead Conversion
Service & Contact Center: Churn & Service Trends; Service Effectiveness; Customer Satisfaction; Resolution Rates; Service Rep Efficiency; Service Cost
Marketing: Campaign Effectiveness; Customer Insight; Product Propensity; Loyalty & Attrition; Market Basket Analysis; Campaign ROI
Procurement & Spend: Direct / Indirect Spend; Buyer Productivity; Off Contract Purchases; Supplier Performance; Purchase Cycle Time; Employee Expenses
Supply Chain & Order Management: Revenue and Backlog; Inventory; Fulfillment Status; Customer Status; Order Cycle Time; BOM Analysis
Financials: General Ledger; Accounts Receivable; Accounts Payable; Cash Flow; Profitability; Expense Management
Human Resources: Employee Productivity; Compensation; Compliance Reporting; Workforce Profile; Retention Analysis; Return on Human Capital
Callout: Opportunities for predictive analytics/data mining
Oracle BI Suite Enterprise Edition Plus
Source adapters: [...] and Other Operational & Analytic Sources
Oracle In-Database Advanced Analytics
Comprehensive Advanced Analytics Platform
Oracle R Enterprise
• Popular open source statistical programming language & environment
• Integrated with database for scalability
• Wide range of statistical and advanced analytical functions
• R embedded in enterprise appls & OBIEE
• Exploratory data analysis
• Extensive graphics
• Open source R (CRAN) packages
• Integrated with Hadoop for HPC
Oracle Data Mining
• Automated knowledge discovery inside the Database
• 12 in-database data mining algorithms
• Text mining
• Predictive analytics applications development environment
• Star schema and transactional data mining
• Exadata "scoring" of ODM models
• SQL Developer/Oracle Data Miner GUI
[Diagram labels: Statistics; Advanced Analytics; Data & Text Mining; Predictive Analytics]
Oracle Advanced Analytics, Data Mining, Predictive Analytics,
Big Data, Exalytics: What, Where, When?
Answers:
1. Oracle Advanced Analytics Option
2. In-database
3. Start now!
The preceding is intended to outline our general product
direction. It is intended for information purposes only, and may
not be incorporated into any contract. It is not a commitment to
deliver any material, code, or functionality, and should not be
relied upon in making purchasing decisions.
The development, release, and timing of any features or
functionality described for Oracle’s products remains at the
sole discretion of Oracle.