Data Mining Techniques in CRM

DATA MINING
1
Data Mining

Extracting or “mining” knowledge from large amounts of
data

Data mining is the process of autonomously retrieving
useful information or knowledge from large data stores or
sets.

Data mining is a technique for searching large-scale
databases for patterns, used mainly to find previously
unknown correlations between variables.
2
Data Mining Motivation

Changes in the Business Environment
Customers becoming more demanding
Markets are saturated

Databases today are huge:
More than 1,000,000 entities/records/rows
From 10 to 10,000 fields/attributes/variables
Gigabytes and terabytes



Databases are growing at an unprecedented rate
Decisions must be made rapidly
Decisions must be made with maximum knowledge
3
Data Mining: Confluence of Multiple Disciplines
Data mining sits at the intersection of:
Database technology
Machine learning
Statistics
A.I.
Algorithms
Visualization
Other disciplines
4
Visualization
The visual interpretation of complex relationships in multidimensional data.
Graphics tools are used to illustrate data relationships.
Statistics
In data mining, statistics is used for classifying and grouping things.
Machine learning
The ability of a machine to improve its performance based on previous results.
Artificial intelligence
The branch of computer science that deals with writing computer programs that
can solve problems creatively.
Algorithm
A precise rule (or set of rules) specifying how to solve some problem.
5
Why Not Traditional Data Analysis

Tremendous amount of data

High complexity of data
6
Knowledge Discovery in Databases (KDD)

KDD is the non-trivial process of identifying valid, novel,
potentially useful, and ultimately understandable patterns
in data.

Data mining is a step in the KDD process that consists of
applying particular data mining algorithms.
7
Data Mining (cont.)

Data mining is one step of the Knowledge Discovery in
Databases (KDD) process:

Data Warehousing
Data Selection
Data Preprocessing
Data Transformation
Data Mining
Interpretation/Evaluation

Data mining is sometimes referred to as KDD; the terms
DM and KDD tend to be used as synonyms.
8
Steps of a KDD Process
Learning the application domain:
  - relevant prior knowledge and goals of the application
Creating a target data set:
  - data selection
Data cleaning and preprocessing
Data reduction and transformation:
  - find useful features, dimensionality/variable
    reduction, invariant representation
Choosing functions of data mining:
  - summarization, classification, regression, association,
    clustering

9





Choosing the mining algorithm(s)
Data mining:
  - search for patterns of interest
Pattern evaluation and knowledge presentation:
  - visualization, transformation, removing redundant
    patterns, etc.
Use of discovered knowledge
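
The KDD steps listed above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the records, the field names (`age`, `spend`, `region`), the age bands, and the `min_gap` threshold are all hypothetical, and the "mining" step is a simple summarization rather than any particular algorithm from the slides.

```python
# Hypothetical raw records; None marks a missing value.
RAW = [
    {"id": 1, "age": 25, "spend": 120.0, "region": "north"},
    {"id": 2, "age": None, "spend": 80.0, "region": "south"},
    {"id": 3, "age": 41, "spend": 300.0, "region": "north"},
    {"id": 4, "age": 38, "spend": 260.0, "region": "south"},
    {"id": 5, "age": 23, "spend": 90.0, "region": "north"},
]

def select(records, fields):
    """Data selection: keep only the task-relevant fields."""
    return [{f: r[f] for f in fields} for r in records]

def clean(records):
    """Data cleaning: drop records that contain a missing value."""
    return [r for r in records if all(v is not None for v in r.values())]

def transform(records):
    """Data transformation: discretize age into two bands."""
    return [
        {"age_band": "30+" if r["age"] >= 30 else "<30", "spend": r["spend"]}
        for r in records
    ]

def mine(records):
    """Data mining (summarization): average spend per age band."""
    totals = {}
    for r in records:
        s, n = totals.get(r["age_band"], (0.0, 0))
        totals[r["age_band"]] = (s + r["spend"], n + 1)
    return {band: s / n for band, (s, n) in totals.items()}

def evaluate(patterns, min_gap):
    """Pattern evaluation: keep the finding only if the gap is large enough."""
    return abs(patterns["30+"] - patterns["<30"]) >= min_gap

target = select(RAW, ["age", "spend"])
patterns = mine(transform(clean(target)))
interesting = evaluate(patterns, min_gap=100.0)
print(patterns, interesting)
```

Each function maps to one named step, so a step can be swapped (for example, a clustering routine in place of `mine`) without touching the others.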
10
Data Mining Evaluation
Databases -> Data Integration -> Data Warehouse -> Selection -> Task-relevant Data -> Data Mining -> Pattern Evaluation
11
Data Mining: On What Kind of Data?

Relational databases
Data warehouses
Transactional databases
Advanced DB and information repositories:
  - Object-oriented and object-relational databases
  - Spatial databases
  - Text databases and multimedia databases
  - WWW
12
Data Mining Applications:

Banking: loan/credit card approval
  - predict good customers based on old customers

Targeted marketing:
  - identify likely responders to promotions

Fraud detection: telecommunications, financial transactions
  - from an online stream of events, identify fraudulent events
13
Data Mining Applications:

Medicine: disease outcome, effectiveness of treatments
  - analyze patient disease history: find relationships
    between diseases

Molecular/Pharmaceutical: identify new drugs

Scientific data analysis:
  - identify new galaxies by searching for sub clusters

Web site/store design and promotion:

find affinity of visitor to pages and
14
Financial Industry, Banks,
Businesses, E-commerce





Stock and investment analysis
Identify loyal customers vs. risky customers
Predict customer spending
Risk management
Sales forecasting
15
Data Mining in CRM:
Customer Life Cycle


Customer Life Cycle
  - The stages in the relationship between a customer
    and a business
Key stages in the customer lifecycle
  - Prospects: people who are not yet customers but are
    in the target market
  - Responders: prospects who show an interest in a
    product or service
  - Active Customers: people who are currently using
    the product or service
  - Former Customers: may be “bad” customers who did
    not pay their bills or who incurred high costs
16
Data Mining in CRM

DM helps to:

  - Determine the behavior surrounding a particular
    lifecycle event
  - Find other people in similar life stages and
    determine which customers are following similar
    behavior patterns
17
Data Mining in CRM (cont.)
[Figure: Data Warehouse, Customer Profile, Data Mining, Customer Life Cycle Info., Campaign Management]
18
Data Mining Techniques
Descriptive
  - Clustering
  - Association
  - Sequential Analysis
Predictive
  - Classification
      - Decision Tree
      - Rule Induction
      - Neural Networks
      - Nearest Neighbor Classification
  - Regression
19
Predictive DM
Predictive data mining produces a model of the system
described by the given data set. Models are built in order
to estimate unknown values of interest.
Example:
Given a customer’s characteristics, a model
predicts how much the customer will spend on the
next catalog order.
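
As a rough illustration of the catalog-order example, the sketch below fits a one-variable least-squares line and uses it to estimate an unknown value. The data are invented, and the choice of predictor (number of past orders) is an assumption; the slides do not say which customer characteristics or which model would be used.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: past order count vs. spend on the next order.
past_orders = [1, 2, 3, 4, 5]
next_spend = [30.0, 50.0, 70.0, 90.0, 110.0]

slope, intercept = fit_line(past_orders, next_spend)

# Estimate the unknown value for a customer with 6 past orders.
predicted = slope * 6 + intercept
print(slope, intercept, predicted)
```

With these made-up points lying exactly on a line, the fit recovers a slope of 20 and an intercept of 10, so the estimate for six past orders is 130.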
20
Descriptive DM
Descriptive data mining produces new, nontrivial
information based on the available data set.
Descriptive DM is used to learn about and understand
the data.
Example:
Identify and describe groups of customers with
common buying behavior
21
Classification
Classification is the process of sub-dividing a data
set with regard to a number of specific outcomes.
Example:
Given old data about customers and payments, predict
a new applicant’s loan eligibility.
22
Decision Trees
Tree where internal nodes are simple decision rules on one or more
attributes and leaf nodes are predicted class labels

hair    eyes    class
brown   blue    A
brown   brown   B
red     blue    A
dark    blue    B
dark    blue    B
brown   blue    A
dark    brown   B
brown   brown   B
23
Decision Trees:
Learned Predictive Rules
hair = dark: B
hair = red: A
hair = brown:
  eyes = blue: A
  eyes = brown: B
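
The learned rules can be written directly as nested conditions and checked against the training data. In this sketch the eight rows are read off the hair/eyes/class columns of the preceding slide, pairing the values in order; that pairing is an assumption about how the original table was laid out. The function name `classify` is illustrative.

```python
# Training rows (hair, eyes, class), paired in order from the slide's columns.
ROWS = [
    ("brown", "blue", "A"),
    ("brown", "brown", "B"),
    ("red", "blue", "A"),
    ("dark", "blue", "B"),
    ("dark", "blue", "B"),
    ("brown", "blue", "A"),
    ("dark", "brown", "B"),
    ("brown", "brown", "B"),
]

def classify(hair, eyes):
    """Apply the tree: split on hair first, then on eyes for brown hair."""
    if hair == "dark":
        return "B"
    if hair == "red":
        return "A"
    # hair == "brown": the leaf depends on eye color
    return "A" if eyes == "blue" else "B"

correct = sum(classify(hair, eyes) == label for hair, eyes, label in ROWS)
print(f"{correct}/{len(ROWS)} training rows classified correctly")
```

Under that pairing the tree reproduces every training label, which is what one would expect from rules learned on this data.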
24
Rule induction

In rule induction, the actions are given and we have to discover the rule
behind them.

The extraction of useful if-then rules from data based on statistical
significance.

Rule induction is an area of machine learning in which formal rules
are extracted from a set of observations.
Examples

Do not discount both of two items that are frequently bought together;
discount one to pull the other.

Send camcorder offer to VCR purchasers 2-3 months after VCR
purchase.
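
A minimal sketch of extracting if-then rules from observations follows. It uses support and confidence thresholds as a simple stand-in for the statistical significance test mentioned above, and it ignores the timing aspect of the camcorder example. The transactions, thresholds, and function names are hypothetical.

```python
from itertools import permutations

# Hypothetical purchase histories, one set of items per customer.
TRANSACTIONS = [
    {"vcr", "tape"},
    {"vcr", "camcorder"},
    {"vcr", "camcorder", "tape"},
    {"tv"},
    {"vcr", "camcorder"},
    {"tv", "tape"},
]

def rule_stats(transactions, antecedent, consequent):
    """Support and confidence of the rule IF antecedent THEN consequent."""
    with_a = [t for t in transactions if antecedent in t]
    with_both = [t for t in with_a if consequent in t]
    support = len(with_both) / len(transactions)
    confidence = len(with_both) / len(with_a) if with_a else 0.0
    return support, confidence

def induce_rules(transactions, min_support, min_confidence):
    """Try every ordered item pair and keep the rules above both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, c in permutations(items, 2):
        support, confidence = rule_stats(transactions, a, c)
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, c, support, confidence))
    return rules

rules = induce_rules(TRANSACTIONS, min_support=0.5, min_confidence=0.7)
for a, c, s, conf in rules:
    print(f"IF {a} THEN {c} (support={s:.2f}, confidence={conf:.2f})")
```

On this toy data only the VCR/camcorder pair survives both thresholds, in both directions; a real system would also test significance and consider the order of purchases in time.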
25
Neural Networks
Set of nodes connected by directed weighted edges
Useful for learning complex data like handwriting, speech and
image recognition.
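
To make "nodes connected by directed weighted edges" concrete, here is a tiny feed-forward network evaluated by hand. The weights and biases are chosen manually (not learned) so that the network computes XOR with a step activation; the node names and the dictionary layout are assumptions made for this sketch.

```python
def step(x):
    """Step activation: the node fires (1) when its weighted input is positive."""
    return 1 if x > 0 else 0

# Directed weighted edges: (source node, target node) -> weight.
EDGES = {
    ("x1", "h_or"): 1.0, ("x2", "h_or"): 1.0,
    ("x1", "h_and"): 1.0, ("x2", "h_and"): 1.0,
    ("h_or", "out"): 1.0, ("h_and", "out"): -2.0,
}
# Hand-picked bias per non-input node.
BIAS = {"h_or": -0.5, "h_and": -1.5, "out": -0.5}
# Evaluate nodes in an order where all of a node's inputs are already known.
ORDER = ["h_or", "h_and", "out"]

def forward(x1, x2):
    """Propagate the two inputs through the network and return the output."""
    values = {"x1": x1, "x2": x2}
    for node in ORDER:
        total = BIAS[node] + sum(
            w * values[src] for (src, dst), w in EDGES.items() if dst == node
        )
        values[node] = step(total)
    return values["out"]

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", forward(a, b))
```

In practice the weights would be learned from examples rather than set by hand; the point here is only how values flow along the weighted edges.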
Neural networks have broad applicability to real world
business problems and have already been successfully
applied in many industries. Since neural networks are best at
identifying patterns or trends in data, they are well suited for
prediction or forecasting needs including:
26
27
Nearest Neighbor Method
The nearest neighbor algorithm in pattern recognition is a method
for classifying phenomena based upon observable features.
Define proximity between instances, find the neighbors of a new instance,
and assign it the majority class.
The nearest neighbor algorithm is a heuristic: it is not
guaranteed to produce the correct result in every case.
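
The three steps above (define proximity, find neighbors, assign the majority class) can be sketched as follows. Euclidean distance is assumed as the proximity measure, and the training points, labels, and the function name `knn_classify` are made up for illustration.

```python
import math
from collections import Counter

def knn_classify(train, query, k):
    """Label `query` with the majority class among its k nearest training points."""
    # Proximity is Euclidean distance between feature tuples.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical labeled instances: (features, class).
TRAIN = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B"),
]

print(knn_classify(TRAIN, (2.0, 2.0), k=3))  # A
print(knn_classify(TRAIN, (6.0, 7.0), k=3))  # B
```

The result depends on the distance measure and on `k`, so features on very different scales would normally be normalized first.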
Clustering

The art of finding groups in data.

Objective: gather items from a database into sets
according to (unknown) common characteristics.

Example: group existing customers based on time series of
payment history so that similar customers fall in the same
cluster.

Key requirement: Need a good measure of similarity
between instances.
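
One common way to group instances by similarity is k-means, sketched below. The slides do not name a clustering algorithm, so k-means is an assumption here, as are the data (days late on each of the last three payments, per customer), the Euclidean similarity measure, and the choice of starting centroids.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    recomputing each centroid as the mean of its assigned points."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: math.dist(p, centroids[i])
            )
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster))
            if cluster else centroids[i]  # keep an empty cluster's centroid
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical payment histories: days late on the last three payments.
POINTS = [
    (0, 1, 0), (1, 0, 1), (0, 0, 2),       # mostly on time
    (10, 12, 9), (11, 9, 10), (9, 10, 12)  # consistently late
]

# Start from one customer of each kind to keep the example deterministic.
centroids, clusters = kmeans(POINTS, centroids=[POINTS[0], POINTS[3]])
print(clusters)
```

The grouping is only as good as the similarity measure: with a different distance, or with unscaled features, the same customers could end up in different clusters.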
29
Major issues in data mining
Mining different kinds of knowledge in databases.
Expression and visualization of data mining results.
Handling noise and incomplete data.
Pattern evaluation: the interestingness problem.
Efficiency and scalability of data mining algorithms.
Handling relational and complex types of data.
30