Oracle data mining algorithms

Oracle Advanced Analytics
Oracle Data Miner & Oracle R Enterprise
Data mining, machine learning and data science in general are built primarily on algorithms.
Open-source R is, in addition, a programming language: it was designed primarily for statistics, but as a full language it can do much more.
Oracle R Enterprise (ORE) is R that can run inside the database, without data volume limitations.
Both exist for Oracle Database Enterprise Edition (EE) and for the Big Data Appliance (BDA).
Copyright © 2015, Oracle and/or its affiliates. All rights reserved.
Oracle data mining algorithms (2/7):

Technique: CLASSIFICATION
Application areas: Most commonly used technique for predicting a specific outcome such as response / no response, high / medium / low-value customer, likely to buy / not buy.
Algorithm(s):
• Logistic Regression: classic statistical technique, now available inside the Oracle Database; supports text and transactional data
• Naive Bayes: fast, simple, commonly applicable
• Support Vector Machine: next generation, supports text and wide data
• Decision Tree: popular, provides human-readable rules

Technique: REGRESSION
Application areas: Technique for predicting a continuous numerical outcome such as customer lifetime value, house value, process yield rates.
Algorithm(s):
• Multiple Regression: classic statistical technique, now available inside the Oracle Database; supports text and transactional data
• Support Vector Machine: next generation, supports text and wide data
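
To make the classification idea concrete, here is a minimal Naive Bayes sketch in plain Python. It is an illustration of the general technique only, not Oracle's in-database implementation; the toy attributes (age band, prior customer) and the "buy" / "no_buy" labels are hypothetical.

```python
from collections import Counter, defaultdict
import math

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[label][attribute_index][value] -> number of training rows
    value_counts = defaultdict(lambda: defaultdict(Counter))
    distinct_values = defaultdict(set)  # distinct values seen per attribute
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[label][i][value] += 1
            distinct_values[i].add(value)
    return class_counts, value_counts, distinct_values

def predict(model, row):
    """Return the class with the highest log-posterior for one row."""
    class_counts, value_counts, distinct_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for i, value in enumerate(row):
            count = value_counts[label][i][value]
            # Laplace (add-one) smoothing avoids log(0) for unseen values
            score += math.log((count + 1) / (n + len(distinct_values[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: (age band, prior customer) -> outcome
rows = [("young", "yes"), ("young", "no"), ("old", "yes"),
        ("old", "no"), ("old", "yes"), ("young", "no")]
labels = ["buy", "no_buy", "buy", "no_buy", "buy", "no_buy"]

model = train_naive_bayes(rows, labels)
prediction = predict(model, ("young", "yes"))
print(prediction)
```

In this toy data every prior customer bought, so the "prior customer" attribute dominates the prediction for a young prior customer.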
Oracle data mining algorithms (4/7):

Technique: ATTRIBUTE IMPORTANCE
Application areas: Ranks attributes according to strength of relationship with target attribute. Use cases include finding factors most associated with customers who respond to an offer, factors most associated with healthy patients.
Algorithm(s):
• Minimum Description Length: considers each attribute as a simple predictive model of the target class

Technique: ANOMALY DETECTION
Application areas: Identifies unusual or suspicious cases based on deviation from the norm. Common examples include health care fraud, expense report fraud, and tax compliance.
Algorithm(s):
• One-Class Support Vector Machine: trains on "normal" cases to flag unusual cases
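
The "train on normal cases, flag deviations" idea can be sketched with something far simpler than a One-Class Support Vector Machine. The example below swaps in a plain z-score test as a stand-in; it is not the algorithm named on the slide. The expense amounts and the threshold of 3 standard deviations are assumptions chosen for illustration.

```python
import statistics

def fit_normal_profile(normal_values):
    """Learn a profile (mean, standard deviation) from known-normal cases."""
    return statistics.mean(normal_values), statistics.stdev(normal_values)

def is_anomaly(value, profile, threshold=3.0):
    """Flag a value that lies more than `threshold` standard deviations from the mean."""
    mean, stdev = profile
    if stdev == 0:
        # All training cases were identical: anything different is unusual
        return value != mean
    return abs(value - mean) / stdev > threshold

# Hypothetical expense-report amounts considered normal
normal_expenses = [102.0, 98.5, 110.0, 95.0, 105.5, 99.0, 101.0]
profile = fit_normal_profile(normal_expenses)

# Score two new cases: one typical, one far outside the learned range
flags = [is_anomaly(v, profile) for v in (100.0, 480.0)]
print(flags)
```

A One-Class SVM plays the same role for many attributes at once, learning a boundary around the normal cases instead of a single mean and spread.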
Oracle data mining algorithms (6/7):

Technique: CLUSTERING
Application areas: Useful for exploring data and finding natural groupings. Members of a cluster are more like each other than they are like members of a different cluster. Common examples include finding new customer segments, and life sciences discovery.
Algorithm(s):
• Enhanced K-Means: supports text mining, hierarchical clustering, distance based
• Orthogonal Partitioning Clustering: hierarchical clustering, density based
• Expectation Maximization: clustering technique that performs well in mixed data (dense and sparse) data mining problems

Technique: ASSOCIATION
Application areas: Finds rules associated with frequently co-occurring items, used for market basket analysis, cross-sell, root cause analysis. Useful for product bundling, in-store placement, and defect analysis.
Algorithm(s):
• Apriori: industry standard for market basket analysis
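
For the association technique, a compact sketch of the Apriori idea in plain Python: build frequent itemsets level by level, keeping only candidates whose subsets are all frequent. This is a textbook-style illustration, not Oracle's implementation; the baskets and the 0.5 minimum support are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        # Fraction of baskets that contain every item in the itemset
        return sum(itemset <= b for b in baskets) / n

    # Level 1: frequent single items
    items = {item for b in baskets for item in b}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        current = {c for c in candidates if support(c) >= min_support}
    return frequent

# Hypothetical market baskets
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]
frequent = apriori(baskets, min_support=0.5)

# Confidence of the rule beer -> diapers = support(beer, diapers) / support(beer)
pair = frozenset({"beer", "diapers"})
confidence = frequent[pair] / frequent[frozenset({"beer"})]
print(len(frequent), confidence)
```

Support measures how often items occur together; confidence measures how often the rule's right-hand side appears when the left-hand side does. Both are the usual inputs to bundling and placement decisions.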
Oracle data mining algorithms (7/7):

Technique: FEATURE SELECTION and EXTRACTION
Application areas: Produces new attributes as linear combinations of existing attributes. Applicable for text data, latent semantic analysis, data compression, data decomposition and projection, and pattern recognition.
Algorithm(s):
• Non-negative Matrix Factorization: next generation, maps the original data into the new set of attributes
• Principal Components Analysis (PCA): creates fewer, new composite attributes that represent all the attributes
• Singular Value Decomposition: established feature extraction method that has a wide range of applications
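
A small sketch of what "new attributes as linear combinations of existing attributes" means, using the first principal component. It finds the dominant eigenvector of the covariance matrix by power iteration in plain Python, which is one simple way to compute it and not necessarily how Oracle does. The two-attribute data set is hypothetical.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (means, unit vector) for the direction of greatest variance.

    Assumes the data has non-zero variance; otherwise the norm below is zero.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d)
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication converges to the dominant eigenvector
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v

def project(row, means, component):
    """New composite attribute: a linear combination of the centered originals."""
    return sum((x - m) * c for x, m, c in zip(row, means, component))

# Hypothetical data where the second attribute is roughly twice the first
rows = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0), (5.0, 10.1)]
means, component = first_principal_component(rows)
scores = [project(r, means, component) for r in rows]
print(component, scores)
```

Here two correlated attributes collapse into one score per row, which is the data-compression use the slide mentions.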
Oracle text mining (vs. IBM Watson)
• Sentiment analysis on social media, document-centric and text-centric analyses, and similar applications all use techniques based on the analysis of text.
• There is a persistent myth that Oracle does not have text analytics. Take a look at these:
• Oracle Text on 12c: http://www.oracle.com/technetwork/testcontent/index-098492.html
• Oracle text mining: https://docs.oracle.com/cd/B28359_01/datamine.111/b28129/text.htm
• Tutorial (text mining with Oracle Data Miner): http://www.oracle.com/webfolder/technetwork/tutorials/obe/db/11g/r2/prod/bidw/datamining/ODM11gR2-TextMining.htm
Table 20 – Oracle Data Mining Algorithms that Support Text
Algorithm                          | Mining Function
Naïve Bayes                        | Classification
Generalized Linear Models          | Classification, Regression
Support Vector Machine             | Classification, Regression, Anomaly Detection
k-Means                            | Clustering
Non-Negative Matrix Factorization  | Feature Extraction
Apriori                            | Association Rules
Minimum Description Length         | Attribute Importance
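
Text mining generally relies on turning free text into term-based attributes that algorithms like those in Table 20 can consume. The sketch below shows that idea as a simple bag-of-words count in plain Python. It is only an illustration: the tokenizer is a deliberately naive assumption and does not represent how Oracle Text processes documents, and the sample sentences are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def term_counts(documents):
    """Return (vocabulary, rows); each row holds per-term counts for one document."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows

# Hypothetical social media comments
docs = ["Great service, great price", "Terrible service"]
vocabulary, rows = term_counts(docs)
print(vocabulary)
print(rows)
```

Each document becomes a numeric row with one column per term, the kind of wide, sparse data the earlier slides say Support Vector Machines handle.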
[Screen capture: R in RStudio, captioned "PREDICT FAILURE!"]