Integrating E-Commerce and Data Mining:
Architecture and Challenges
WEB-KDD Workshop
August, 2000
Llew Mason
[email protected]
Joint work with
Suhail Ansari, Ron Kohavi, Zijian Zheng
Blue Martini Software
© Copyright 1998-2000, Blue Martini Software. San Mateo California, USA
Outline

E-Commerce: A Killer Domain

Integrated Architecture

Data Collection

Analysis

Challenges

Summary
E-Commerce: A Killer Domain

Data records are plentiful

Electronic collection provides reliable data

Enables closed-loop analysis

Insight can easily be turned into action

Success can be directly measured
e.g., Return on investment (ROI)
Integrated Architecture
[Diagram: Business Data Definition, Stage Data, Customer Interaction, Build Data Warehouse, Analysis, Deploy Results]
Integrated Architecture: Business Data Definition

Business facing

Products, content

Attributes

Shared meta-data
Integrated Architecture: Stage Data

Build store

Test before production

Transform for efficiency

Zero down-time
Integrated Architecture: Customer Interaction

Customer facing

Multiple Touchpoints

Integrated Data Collection
Integrated Architecture: Build Data Warehouse

Build warehouse

Automated using meta-data

Reduces pre-processing

Transform for analysis
Integrated Architecture: Analysis

Data transformations

Exploration

Modeling
Integrated Architecture: Deploy Results

Close the loop

Transfer scores, models

Personalize
Clickstream Logging

Web server logs
  - Logs every HTTP request: filtering required
  - Stateless: must identify users and sessions
  - Captures URLs: must map to content
  - Can’t understand dynamic content

Packet sniffers
  - Streaming data: must parse to understand content
  - Can’t understand encrypted data (SSL)

Solution: Application server logging
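
Because web server logs are stateless, requests have to be grouped into users and sessions after the fact. The sketch below shows one common heuristic, an inactivity timeout. The 30-minute cutoff, the `sessionize` name, and the `(user_key, timestamp, url)` record shape are all assumptions for illustration; the slides do not specify a method, and `user_key` stands in for whatever identifier is available (cookie, IP plus user agent, and so on).

```python
from collections import defaultdict
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed cutoff, not from the slides


def sessionize(requests):
    """Group (user_key, timestamp, url) tuples into per-user sessions.

    A new session starts whenever the gap since the user's previous
    request exceeds SESSION_TIMEOUT.
    """
    by_user = defaultdict(list)
    for user_key, ts, url in requests:
        by_user[user_key].append((ts, url))

    sessions = []
    for user_key, hits in by_user.items():
        hits.sort()  # order each user's requests by time
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit[0] - prev[0] > SESSION_TIMEOUT:
                sessions.append((user_key, current))
                current = []
            current.append(hit)
        sessions.append((user_key, current))
    return sessions
```

Logging inside the application server avoids this guesswork, since the server already knows which session each request belongs to.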
Beyond Clickstream Logging

Business Event Logging

Consider several requests as one logical event
  - Add or remove from shopping cart
  - Initiate or finalize checkout
  - Search
  - Register
  - Personalization rule evaluation

Provides business insight

Difficult to log outside of application server
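
A minimal sketch of what business event logging inside the application could look like. `BusinessEvent`, `EventLog`, and `add_to_cart` are hypothetical names, and the in-memory list stands in for real storage; none of this is taken from the talk or from any particular product.

```python
import json
import time
from dataclasses import dataclass, field, asdict


@dataclass
class BusinessEvent:
    """One logical event, regardless of how many HTTP requests produced it."""
    session_id: str
    event_type: str  # e.g. "add_to_cart", "checkout_start", "search"
    details: dict = field(default_factory=dict)
    timestamp: float = field(default_factory=time.time)


class EventLog:
    """Hypothetical in-memory log; a real system would write to durable storage."""

    def __init__(self):
        self.records = []

    def log(self, session_id, event_type, **details):
        event = BusinessEvent(session_id, event_type, details)
        self.records.append(json.dumps(asdict(event)))
        return event


def add_to_cart(log, session_id, cart, sku, quantity):
    """Application code logs the event at the point where its meaning is known."""
    cart[sku] = cart.get(sku, 0) + quantity
    log.log(session_id, "add_to_cart", sku=sku, quantity=quantity)
```

The event is recorded where the application already knows that a cart changed, which is the information a web server log or packet sniffer would have to reconstruct from URLs.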
Aggregation

Data occurs at multiple granularities
[Diagram: granularity levels including Cities, Customers, Orders, Sessions, and Requests, labeled "Finer Granularity"]

Many interesting attributes need to be aggregated for analysis
Aggregation

Interesting customer attributes

What share of each customer’s wallet was spent on books?

How much is each female customer’s average order amount above the mean value for female customers?

What is the total amount of each customer’s five most recent purchases over $30?

What is the frequency of each customer’s purchases?

How long ago was each customer’s last purchase?
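
Three of the questions above (purchase frequency, time since last purchase, and the total of the five most recent purchases over $30) can be sketched as a roll-up from order rows to one row per customer. The function name, field names, and the use of a plain order count as "frequency" are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def customer_attributes(orders, today, min_amount=30.0, n_recent=5):
    """Aggregate order-level rows into one row of attributes per customer.

    orders: iterable of (customer_id, order_date, amount) tuples.
    """
    by_customer = defaultdict(list)
    for customer_id, order_date, amount in orders:
        by_customer[customer_id].append((order_date, amount))

    attributes = {}
    for customer_id, rows in by_customer.items():
        rows.sort(reverse=True)  # most recent order first
        large = [amount for _, amount in rows if amount > min_amount]
        attributes[customer_id] = {
            "order_count": len(rows),
            "days_since_last_order": (today - rows[0][0]).days,
            "recent_large_total": sum(large[:n_recent]),
        }
    return attributes
```

Each output row sits at customer granularity, so it can be joined with other customer-level attributes before modeling.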
Hierarchies

E-Commerce data contains many hierarchies

How can we use them in analysis?
[Diagram: product hierarchy with nodes Products, Clothing, Books, Mens, and Womens, plus sample values (2, 1, $12, T, F, F)]
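
One possible way to use a hierarchy in analysis is to expand each item into indicator attributes for the hierarchy nodes it falls under, so that algorithms can generalize at any level. This is a sketch of that one option, not the approach the talk prescribes. The `PARENT` map, including the placement of Mens and Womens under Clothing, is an assumed tree, and both function names are hypothetical.

```python
# Hypothetical child -> parent map; the exact tree on the slide is an assumption.
PARENT = {
    "Mens": "Clothing",
    "Womens": "Clothing",
    "Clothing": "Products",
    "Books": "Products",
}


def ancestors(node):
    """Return the path from the root down to node, e.g. Products, Clothing, Mens."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return list(reversed(path))


def hierarchy_flags(node, levels):
    """Boolean indicator per hierarchy node: True if node is at or under it."""
    path = set(ancestors(node))
    return {level: level in path for level in levels}
```

For example, `hierarchy_flags("Mens", ["Clothing", "Books", "Womens"])` marks only `Clothing` as true.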
Analytical Tools

Reporting
  - Who are the top referrers by sales generated?
  - What are the top abandoned products?
  - What are the conversion rates for each product?

OLAP
  - How do sales vary over time in each geographic region?

Modeling Algorithms
  - What characterizes visitors that do not buy?
  - What characterizes customers that prefer promotions?
  - Which are the potential cross-sells and up-sells?

Visualization
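
As an illustration of the reporting question about conversion rates, the sketch below divides purchases by views for each product. That definition, the `conversion_rates` name, and the `(product_id, event_type)` record shape are assumptions; the slides do not define how the rate is computed.

```python
from collections import Counter


def conversion_rates(events):
    """Compute purchases per view for each product.

    events: iterable of (product_id, event_type) pairs, where event_type
    is "view" or "purchase". Other event types are ignored.
    """
    views, purchases = Counter(), Counter()
    for product_id, event_type in events:
        if event_type == "view":
            views[product_id] += 1
        elif event_type == "purchase":
            purchases[product_id] += 1
    # Only products with at least one view get a rate, avoiding division by zero.
    return {p: purchases[p] / n for p, n in views.items()}
```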
E-Commerce Challenges

Make data mining comprehensible

Support multiple granularity levels

Utilize hierarchies

Support date and time types effectively

Support external events and changing data

Identify bots and crawlers

Handle large amounts of data
Summary

Integrated E-Commerce and data mining enables effective closed-loop analysis

Application server logging provides integrated data collection and reduces pre-processing

Powerful data transformations and a broad suite of analysis techniques are needed

There are many challenges ahead