MKT 700
Business Intelligence and
Decision Models
Week 6:
Segmentation and Cluster Analysis
What have we seen so far?
Overview: Analytical CRM
Overview: CRISP-DM Methodology
Data Preparation
Legacy Approach: RFM
Customer Value
Where are we going from here?
Classification: Clustering
Classification +: Profiling
Predictive Modeling: Response Probability
Outline for Today

Clustering:
Clustering and Segmentation
 B2C and B2B
 Clustering theory


Lab
Typical Classroom Segmentation (Beer)
[Figure: example beer segments. Labels: Local; Office Worker; Maple Leaf Fans; Blue Collar; Light; Strong; Occasional; Selective; Foodies; Import]
Clusters and Segments

Differences between clusters and
segments
Learning segmentation
 Dynamic segmentation

Consumer Segmentation
Taxonomy





Product usage/loyalty
Buying behaviour
Preferred channel
Family life cycle (stage in life)
Lifestyle (personal values)
Status Levels and Segments (needs + treatment)
[Figure: status levels (Gold, Silver, Bronze) set against marketing segments (Business Customers, Affluent Retired, Young Singles, Families With Kids, Bargain Shoppers, Occasional Buyers). Other labels: Customer; Marketing Staff]
Data Sources for
Segmentation

Internal



Transactions
Surveys & Customer Service
External (Data overlays)




Lists
Census
Taxfiler
Geocoding
Geo-Segmentation in CDA
Birds of a feather f___k together…

Environics (Prizm)
http://www.environicsanalytics.ca/prizm-c2-clusterlookup

Generation5 (Mosaic)
http://www.generation5.ca

Manifold
http://www.manifolddatamining.com/html/lifestyle/lifestyle171.htm

Pitney-Bowes (Mapinfo)
http://www.utahbluemedia.com/pbbi/psyte/psyteCanada.html
B2B Segmentation
Taxonomy
Firm size (employees, sales)
 Industry (SIC, NAICS)
 Buying process
 Value within finished product
 Usage (Production/Maintenance)
 Order size and Frequency
 Expectations

Clustering

Measuring distances (differences) or
proximities (similarities) between
subjects
BI Modeling Techniques

No Target (No dependent variable,
unsupervised learning)
• RFM
• Cluster Analysis

Target (Dependent variable,
supervised learning)
• Regression Analysis
• Decision Trees
• Neural Net Analysis
Measuring distances (two dimensions, x and y)
[Figure: points A, B and C in the x-y plane; Pythagoras]

Measuring distances (two dimensions)
[Figure: triangle A, B, C with sides D(b,a), D(a,c) and D(b,c)]

Measuring distances (two dimensions)
$d_{ac}^2 = d_x^2 + d_y^2$
Euclid
$d_{ac}^2 = \sum_i d_i^2$
$d_{ac} = \left[ \sum_i d_i^2 \right]^{1/2}$
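The Euclidean formula above can be sketched in a few lines of Python. This is only an illustration: the function name `euclidean_distance` and the two sample points are hypothetical and do not come from the slides.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two points with the same number of dimensions."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of dimensions")
    # Sum the squared per-dimension differences d_i, then take the square root.
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# Two hypothetical subjects described by (x, y) scores.
subject_a = (1.0, 2.0)
subject_c = (4.0, 6.0)
d_ac = euclidean_distance(subject_a, subject_c)
print(d_ac)  # 5.0, since dx = 3 and dy = 4
```

The same function works unchanged for more than two dimensions, which is the point of the summation over i.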
Distances between US cities
                ATL   CHI   DEN   HOU    LA   MIA    NY    SF   SEA    DC
Atlanta           0   587  1212   701  1936   604   748  2139  2182   543
Chicago         587     0   920   940  1745  1188   713  1858  1737   597
Denver         1212   920     0   879   831  1726  1631   949  1021  1494
Houston         701   940   879     0  1374   968  1420  1645  1891  1220
Los_Angeles    1936  1745   831  1374     0  2339  2451   347   959  2300
Miami           604  1188  1726   968  2339     0  1092  2594  2734   923
New_York        748   713  1631  1420  2451  1092     0  2571  2408   205
San_Francisco  2139  1858   949  1645   347  2594  2571     0   678  2442
Seattle        2182  1737  1021  1891   959  2734  2408   678     0  2329
Washington_DC   543   597  1494  1220  2300   923   205  2442  2329     0
Cluster Analysis Techniques

Hierarchical Clustering

Metric, small datasets
SPSS Hierarchical Clusters
Dendrogram
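To make the idea of hierarchical clustering concrete, here is a minimal pure-Python sketch that repeatedly merges the two closest clusters of cities, using the distance table above. The slides do not say which linkage rule is used, so average linkage (mean of all pairwise distances between two clusters) is an assumption made here; the names `UPPER`, `distance` and `average_linkage` are illustrative only.

```python
from itertools import combinations

CITIES = ["Atlanta", "Chicago", "Denver", "Houston", "Los_Angeles",
          "Miami", "New_York", "San_Francisco", "Seattle", "Washington_DC"]

# Upper triangle of the distance table: row i holds distances to cities i+1..9.
UPPER = [
    [587, 1212, 701, 1936, 604, 748, 2139, 2182, 543],
    [920, 940, 1745, 1188, 713, 1858, 1737, 597],
    [879, 831, 1726, 1631, 949, 1021, 1494],
    [1374, 968, 1420, 1645, 1891, 1220],
    [2339, 2451, 347, 959, 2300],
    [1092, 2594, 2734, 923],
    [2571, 2408, 205],
    [678, 2442],
    [2329],
]

def distance(i, j):
    """Look up the distance between cities i and j in the symmetric table."""
    if i == j:
        return 0
    i, j = min(i, j), max(i, j)
    return UPPER[i][j - i - 1]

def average_linkage(n):
    """Agglomerative clustering; returns the merges in the order they happen."""
    clusters = [(i,) for i in range(n)]  # start with every city on its own
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Assumed linkage rule: mean pairwise distance between the clusters.
            d = sum(distance(i, j) for i in a for j in b) / (len(a) * len(b))
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges

merges = average_linkage(len(CITIES))
for a, b, d in merges:
    print([CITIES[i] for i in a], "+", [CITIES[i] for i in b], f"at {d:.0f}")
```

The printed merge sequence is the information a dendrogram draws: early merges (small distances) sit near the leaves, late merges near the root. Checking every pair of clusters at every step is also why the approach suits small datasets.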
SPSS Multidimensional Scaling (Euclidean Distance)
 No.  City              Dim 1     Dim 2
  1.  Atlanta           .9575    -.1905
  2.  Chicago           .5090     .4541
  3.  Denver           -.6416     .0337
  4.  Houston           .2151    -.7631
  5.  Los_Angeles     -1.6036    -.5197
  6.  Miami            1.5101    -.7752
  7.  New_York         1.4284     .6914
  8.  San_Francisco   -1.8925    -.1500
  9.  Seattle         -1.7875     .7723
 10.  Washington       1.3051     .4469
Euclidean distance mapping
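One way to see what the two-dimensional map preserves is to recompute Euclidean distances from the SPSS coordinates and compare them with the original city distances. The sketch below does that with a plain Pearson correlation; the helper name `pearson` and the choice of correlation as the comparison measure are assumptions made for illustration, not something the slides prescribe.

```python
import math
from itertools import combinations

# Upper triangle of the city distance table (row i: distances to cities i+1..9),
# in the order Atlanta, Chicago, Denver, Houston, Los_Angeles, Miami,
# New_York, San_Francisco, Seattle, Washington.
UPPER = [
    [587, 1212, 701, 1936, 604, 748, 2139, 2182, 543],
    [920, 940, 1745, 1188, 713, 1858, 1737, 597],
    [879, 831, 1726, 1631, 949, 1021, 1494],
    [1374, 968, 1420, 1645, 1891, 1220],
    [2339, 2451, 347, 959, 2300],
    [1092, 2594, 2734, 923],
    [2571, 2408, 205],
    [678, 2442],
    [2329],
]

# (dimension 1, dimension 2) coordinates from the SPSS output, same city order.
COORDS = [
    (0.9575, -0.1905), (0.5090, 0.4541), (-0.6416, 0.0337), (0.2151, -0.7631),
    (-1.6036, -0.5197), (1.5101, -0.7752), (1.4284, 0.6914), (-1.8925, -0.1500),
    (-1.7875, 0.7723), (1.3051, 0.4469),
]

pairs = list(combinations(range(len(COORDS)), 2))
actual = [UPPER[i][j - i - 1] for i, j in pairs]
mapped = [math.hypot(COORDS[i][0] - COORDS[j][0], COORDS[i][1] - COORDS[j][1])
          for i, j in pairs]

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

r = pearson(actual, mapped)
print(f"correlation between table distances and map distances: {r:.4f}")
```

A correlation close to 1 indicates that the map distances are nearly proportional to the table distances, so the two dimensions lose very little information.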
Cluster Analysis Techniques

Hierarchical Clustering
Metric variables, small datasets

K-means Clustering
Metric, large datasets

Two-Step Clustering
Metric/non-metric, large datasets, optimal clustering
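As a rough illustration of the k-means idea listed above, the following pure-Python sketch alternates between assigning each subject to its nearest centroid and moving each centroid to the mean of its members. It is a textbook version under stated assumptions, not the SPSS procedure: the data points, the starting centroids and the function name `kmeans` are all hypothetical.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Basic k-means on metric data; starts from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:  # nothing moved, so the solution is stable
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical subjects measured on two metric variables.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centroids = kmeans(points, centroids=[(1, 1), (8, 8)])
print(labels)     # [0, 0, 0, 1, 1, 1]
print(centroids)
```

Each pass only compares points with k centroids rather than with every other point, which is one reason the method scales to large datasets. The number of clusters has to be chosen in advance, and the result can depend on the starting centroids.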
Cluster Analysis Techniques
See Chapter 23, SPSS Base Statistics for description of methods
Two-Step Cluster Tutorials

SPSS, Direct Marketing, Chapter 3 and 9
 Help > Case Studies > Direct Marketing > Cluster Analysis
 File to be used: dmdata.sav

SPSS, Base Statistics, Chapter 24
 Analyze > Classify > Two-Step Cluster
 File to be used: Car_Sales.sav
 Help: “Show me”