Data Mining
Enterprise systems DT211 4
1
Data Mining
• The process of extracting valid,
previously unknown, comprehensible,
and actionable information from large
databases and using it to make crucial
business decisions.
• Involves the analysis of data and the
use of software techniques for finding
hidden and unexpected patterns and
relationships in sets of data.
2
Examples of Applications of Data Mining
• Retail / Marketing
– Predicting response to mailing campaigns
– Market basket analysis
• Banking
– Detecting patterns of fraudulent credit card use.
• Insurance
– Claims analysis
• Medicine
– Identifying successful medical therapies for
different illnesses
3
Data Mining Operations
• Data mining operations include:
– Predictive modelling
– Database segmentation
– Link analysis
– Deviation detection
4
Predictive Modelling
• Similar to the human learning
experience
– uses observations to form a model of the
important characteristics of some
phenomenon.
5
Predictive Modelling
• Applications of predictive modelling
include direct marketing.
• There are two techniques associated
with predictive modelling: classification
and value prediction.
6
Example of Classification
using Tree Induction
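A minimal sketch of tree induction, assuming numeric attributes, binary
splits and the Gini impurity as the split criterion (a common choice, not
one named on the slide). The customer records, the attribute names
years_renting and age, and the rent/buy labels are hypothetical
illustration data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (attribute, threshold) that lowers impurity most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for attr in rows[0]:
        for t in sorted({r[attr] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[attr] <= t]
            right = [l for r, l in zip(rows, labels) if r[attr] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (attr, t), score
    return best


def induce(rows, labels):
    """Grow a tree: a leaf is a label, a node is (attr, threshold, left, right)."""
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    attr, t = split
    left = [(r, l) for r, l in zip(rows, labels) if r[attr] <= t]
    right = [(r, l) for r, l in zip(rows, labels) if r[attr] > t]
    return (attr, t,
            induce([r for r, _ in left], [l for _, l in left]),
            induce([r for r, _ in right], [l for _, l in right]))


def classify(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, tuple):
        attr, t, left, right = tree
        tree = left if row[attr] <= t else right
    return tree


# Hypothetical training records (not taken from the slides).
rows = [
    {"years_renting": 1, "age": 30},
    {"years_renting": 1, "age": 22},
    {"years_renting": 3, "age": 23},
    {"years_renting": 4, "age": 24},
    {"years_renting": 3, "age": 30},
    {"years_renting": 5, "age": 40},
]
labels = ["rent", "rent", "rent", "rent", "buy", "buy"]

tree = induce(rows, labels)
prediction = classify(tree, {"years_renting": 4, "age": 35})
```

The induced tree is nested tuples, so it can be printed and read as a
sequence of yes/no tests, which is what makes tree induction easy to
explain to business users.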
7
Predictive Modelling - Value
Prediction
• Used to estimate a continuous numeric
value that is associated with a database
record.
• Uses traditional statistical
techniques such as linear regression
and nonlinear regression.
• Applications of value prediction include
credit card fraud detection and target
mailing list identification.
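As a sketch of value prediction by linear regression, the function below
fits a straight line by ordinary least squares and uses it to estimate a
numeric value for a new record. The variable names and the data
(years as a customer against annual spend) are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical records: years as a customer, annual spend (in hundreds).
years = [1, 2, 3, 4]
spend = [3, 5, 7, 9]

slope, intercept = fit_line(years, spend)

# Estimate the continuous value for a record not in the training data.
predicted_spend = slope * 5 + intercept
```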
8
Predictive Modelling - Value
Prediction
• Data mining requires statistical
methods that can accommodate nonlinearity, outliers, and non-numeric
data.
9
Database Segmentation
• Aim is to partition a database into an
unknown number of segments, or
clusters, of similar records.
• Applications of database segmentation
include credit card fraud….
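A minimal sketch of segmentation where the number of segments is not
fixed in advance. It uses a simple single-pass "leader" clustering
scheme, chosen here only for illustration: each record joins the first
cluster whose leader lies within a given radius, otherwise it starts a
new cluster. The points and the radius value are hypothetical.

```python
import math


def leader_clusters(points, radius):
    """Group points; the number of clusters emerges from the data."""
    leaders, clusters = [], []
    for p in points:
        for i, leader in enumerate(leaders):
            if math.dist(p, leader) <= radius:
                clusters[i].append(p)
                break
        else:
            # No existing leader is close enough: start a new cluster.
            leaders.append(p)
            clusters.append([p])
    return clusters


# Hypothetical two-attribute records (not taken from the slides).
points = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 4.8), (9.0, 1.0)]
segments = leader_clusters(points, radius=1.0)
sizes = [len(s) for s in segments]
```

The result depends on the radius and on the order of the records, so in
practice the radius would need tuning against the data.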
10
Database Segmentation using a
Scatterplot
11
Link Analysis
• Aims to establish links between
records, or sets of records, in a
database; one such example would
be association discovery….
• Applications include product affinity
analysis.
• Finds items that imply the presence
of other items in the same event.
12
Link Analysis - Associations
Discovery
• Affinities between items are
represented by association rules.
– e.g. ‘When a customer rents property for
more than 2 years and is more than 25
years old, in 40% of cases, the customer
will buy a property. This association
happens in 35% of all customers who rent
properties’.
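The two percentages in a rule like this are usually called confidence
and support. The sketch below computes both for a rule of the same shape,
assuming the common definitions: support is the share of all records
where both the condition and the outcome hold, and confidence is the
share of records meeting the condition that also show the outcome. The
customer tuples are hypothetical and are not meant to reproduce the
40% and 35% figures above.

```python
def rule_stats(records, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent => consequent."""
    n_ante = sum(1 for r in records if antecedent(r))
    n_both = sum(1 for r in records if antecedent(r) and consequent(r))
    support = n_both / len(records)
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Hypothetical customers: (years renting, age, bought a property).
customers = [
    (3, 30, True), (4, 40, True), (5, 28, False), (3, 35, False),
    (6, 50, False), (1, 30, False), (1, 22, False), (3, 20, False),
    (2, 45, True), (1, 19, False),
]

support, confidence = rule_stats(
    customers,
    antecedent=lambda c: c[0] > 2 and c[1] > 25,  # rents > 2 years, age > 25
    consequent=lambda c: c[2],                    # buys a property
)
```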
13
Data Mining and Data
Warehousing
• Data mining requires a single, separate, clean,
integrated, and self-consistent source of
data.
• Data quality and consistency are prerequisites for mining, to ensure the accuracy
of the predictive models. Data warehouses
are populated with clean, consistent data and
have other attributes that are
advantageous to the data mining process:
drill down….
14
Sample types of questions
• “Data Mining is one of the most essential
information technologies to aid strategic
formulation.” Discuss the validity of this
statement.
• Discuss how the different types of data mining
operations can generate meaningful
information for the enterprise.
15