Analysing the footprints of your customer
A case study by ask|net and SAS

Klaus-Peter Huber
Christiane Theusinger

Copyright © 2000 SAS EMEA

e-intelligence
- Web Optimization
- Clickstream Reporting and Analysis
- Web Mining

Web Mining
[Diagram: Data Mining Models]

Challenge and Profit
- Information about user profiles, customer types, purchase behaviour (CRM)
- Better internet offering by analysis of customer behaviour
- Optimization of online marketing activities with fast built response models etc.
- Scoring, to present the right offer or advert at the right time (One-To-One Marketing)

Goals with web mining
- Customer based optimization of the internet shop or offering
- Segmentation of customers
- Analysis of purchase behaviour
- Personalized offerings or adverts

ask|net GmbH
- Internet shop
- sells software
- mainly LOTUS

Project goal
Modelling and implementing a Business Information System for sales and marketing department
Data Warehousing & Data Mining in the Internet shop of ask|net GmbH

Steps
- System analysis
- Data Warehousing (Oracle8)
- OLAP (SAS/EIS)
- Data mining (Enterprise Miner)

OLAP information structure

Web mining questions at ask|net
- Can the behaviour of a customer (clickstream) be used to predict purchase behaviour with data mining?
- Can we define / derive user profiles out of the log file?
- Which factors influence a purchase in the online shop SoftWarehouse?
- How can we optimize the shop?

e-cycle in e-CRM
- Web logfile data
- Customer data
- Application: Scoring offline and online

e-Intelligence Solution
- include, consolidate, cleanse, Data Mining ...
- Analyses: paths, clickstreams, cluster, who buys, cross selling ...
- read, transform, aggregate, schedule, ...
- Reporting: response time, browse time, page hits ...

[Diagram: process overview. Elements: Web Server, Logfiles, Data Warehouse, Flat File, Raw Data, Sequence Analysis, Variable Transformation & Integration, Data Mining Models (SEMMA). Labels: 51 variables, 10 sequence variables, target variable, 66 variables.]

How are the raw data generated?
"Click - Session"
www.SoftWarehouse.de ...

Oracle database

C_VALUE        C_CALLER  C_FQDN       C_REFERER             C_TIME         L_ID
...            ...       ...          ...                   ...            ...
a68adce1ab...  home      xyz.sas.com  www.softwarehouse.de  20-8-99:14:01  DE
a68adce1ab...  catalog   xyz.sas.com  home                  20-8-99:14:02  DE
a68adce1ab...  program   xyz.sas.com  catalog               20-8-99:14:03  DE
a68adce1ab...  product   xyz.sas.com  program               20-8-99:14:05  DE
a68adce1ab...  login     xyz.sas.com  product               20-8-99:14:07  DE
...            ...       ...          ...                   ...            ...

Sequence analysis of the clickstreams
- Find weaknesses in web structure
- Create new variables for data mining
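To make the sequence analysis step concrete, here is a minimal Python sketch that groups click rows like those in the Oracle table into sessions and computes support and confidence for a rule such as addcart ⇒ login. It is not the Enterprise Miner implementation. The definitions are assumptions: support is the share of all sessions in which the first page is later followed by the second, and confidence is that count divided by the sessions containing the first page. Only the first session comes from the table; the other session ids and clicks are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime

# (C_VALUE, C_CALLER, C_TIME). The first session mirrors the table above;
# the other three sessions are hypothetical.
CLICKS = [
    ("a68adce1ab", "home", "20-8-99:14:01"),
    ("a68adce1ab", "catalog", "20-8-99:14:02"),
    ("a68adce1ab", "program", "20-8-99:14:03"),
    ("a68adce1ab", "product", "20-8-99:14:05"),
    ("a68adce1ab", "login", "20-8-99:14:07"),
    ("hypothet01", "product", "20-8-99:15:00"),
    ("hypothet01", "addcart", "20-8-99:15:01"),
    ("hypothet01", "login", "20-8-99:15:02"),
    ("hypothet01", "register", "20-8-99:15:04"),
    ("hypothet02", "home", "20-8-99:16:00"),
    ("hypothet02", "catalog", "20-8-99:16:01"),
    ("hypothet03", "product", "20-8-99:17:00"),
    ("hypothet03", "addcart", "20-8-99:17:02"),
]


def sessions_from_clicks(clicks):
    """Group clicks by session id and return each session's pages in time order."""
    by_session = defaultdict(list)
    for session_id, page, c_time in clicks:
        stamp = datetime.strptime(c_time, "%d-%m-%y:%H:%M")
        by_session[session_id].append((stamp, page))
    return {sid: [page for _, page in sorted(rows)]
            for sid, rows in by_session.items()}


def follows(path, first, then):
    """True if `then` occurs after the earliest occurrence of `first`."""
    if first not in path:
        return False
    return then in path[path.index(first) + 1:]


def rule_stats(sessions, first, then):
    """Return (support %, confidence %) for the rule first => then."""
    total = len(sessions)
    with_first = sum(1 for path in sessions.values() if first in path)
    with_rule = sum(1 for path in sessions.values() if follows(path, first, then))
    support = 100.0 * with_rule / total if total else 0.0
    confidence = 100.0 * with_rule / with_first if with_first else 0.0
    return support, confidence


SESSIONS = sessions_from_clicks(CLICKS)
SUPPORT, CONFIDENCE = rule_stats(SESSIONS, "addcart", "login")
print(f"addcart => login: support {SUPPORT:.1f}%, confidence {CONFIDENCE:.1f}%")
```

With these toy rows, one of the four sessions contains addcart followed by login and two sessions contain addcart, so the sketch reports 25.0% support and 50.0% confidence. The real tool may count repeated pages or time windows differently.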
Results (sequence analysis)

#  Support (%)  Confidence (%)  Rule
1  15.5         36.8            login ⇒ register
2  13.4         31.9            login ⇒ login
3  12.3         38.5            addcart ⇒ login
4  11.2         28.1            addcart ⇒ register
5  0.7          4.6             pay_req ⇒ help
6  0.3          3.6             news ⇒ pay_res

Results (sequence analysis)

       Support (%)  Confidence (%)  Rule
seq_0  28.7         74.6            program ⇒ product ⇒ p_info ⇒ product
seq_1  26.4         60.2            program ⇒ product ⇒ product ⇒ product
seq_2  17.5         80.3            program ⇒ product ⇒ addcart ⇒ freeze
seq_3  17.1         82.6            home ⇒ catalog ⇒ program ⇒ product
seq_4  13.5         95.6            logpost ⇒ catalog ⇒ program ⇒ product
seq_5  12.7         92.2            login ⇒ catalog ⇒ program ⇒ product
seq_6  12.6         93.0            logpost ⇒ product ⇒ addcart ⇒ freeze
...    ...          ...             ...
seq_9  6.7          67.3            product ⇒ login ⇒ register ⇒ regpost

Integration of 'sequence variables'

Click table (one row per click):

C_VALUE        C_CALLER
...            ...
a68adce1ab...  home
a68adce1ab...  catalog
a68adce1ab...  program
a68adce1ab...  product
a68adce1ab...  login
...            ...

One row per session:

c_value        col1  col2     col3     col4     col5
...            ...   ...      ...      ...      ...
a68adce1ab...  home  catalog  program  product  login
...            ...   ...      ...      ...      ...

Concatenated path:

c_value        concat_column
...            ...
a68adce1ab...  homecatalogprogramproductlogin
...            ...

program⇒product⇒p_info⇒product is transformed into '%program%product%p_info%product%'

Integration of the transaction table

C_VALUE        C_CALLER  C_FQDN       C_REFERER             C_TIME         L_ID
...            ...       ...          ...                   ...            ...
a68adce1ab...  home      xyz.sas.com  www.softwarehouse.de  20-8-99:14:01  DE
a68adce1ab...  catalog   xyz.sas.com  home                  20-8-99:14:02  DE
a68adce1ab...  program   xyz.sas.com  catalog               20-8-99:14:03  DE
a68adce1ab...  product   xyz.sas.com  program               20-8-99:14:05  DE
a68adce1ab...  login     xyz.sas.com  product               20-8-99:14:07  DE
...            ...       ...          ...                   ...            ...

Input Variables
- Number of clicks
- Duration of session
- Pages involved
- User profile information
- Referrer address
- Country code
- Target variable: purchase

Predictive Modelling
Which factors influence the purchase of software at the SoftWarehouse?

Predictive Models used
- Decision Trees
- Regression Analysis
- Neural Networks

Underlying Data

Number of clicks:             >= 5
# sessions:                   22527
Total number of purchases:    1642
Response:                     7.29%
Percentage training data:     60%
Percentage validation data:   40%

LOG-transformation of interval variables
[Figures: distribution before transformation; distribution after transformation]

Process Flow Diagram
[Figure: process flow diagram]

Results of the Analysis
- Neural network is the best predictive model
- Decision tree is best for analysing which clickstreams result in the purchase of a product
- Regression analysis is useful for analysing relationships and verifying the results of the decision tree

Results of Data Mining
- Order procedure too long and complicated
  ⇒ Sequence analysis 'addcart⇒login' or 'addcart⇒register'
- Click behaviour is representative
  ⇒ Split variables 'seq4', 'clcklgth' and 'clicks'
  - typical buyer paths
  - 'readers' will buy

Next Steps
- Usage of more data from the customer warehouse for
  - better analysis of behaviour
  - cross- and up-selling chances
  - multi channel integration
- Personalized offerings and advertisement with offline-scoring

Actions (Data Mining)
- Monitor registration/login behaviour ...
  ⇒ Sequence analysis 'login⇒register' or 'login⇒login'
  ... because multiple registrations/user complicate basket analysis

Conclusions
- Sequence Analysis can find weaknesses in web design
- Predictive Modelling finds customer segments and predicts probabilities
- Web-Mining is useful to analyze logfile data
- Ask|net redesigned web design to better fit customer needs
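The variable preparation described in the slides (sequence variables derived from patterns such as '%program%product%p_info%product%', and the LOG transformation of interval variables) can be sketched in Python as follows. This is an illustration under assumptions, not the project's code. The '%...%' pattern is read here as "these pages occur in this order, with any clicks in between", which is how a SQL LIKE pattern would behave; the slides do not confirm that reading. The function and feature names are hypothetical. The two sequences are taken from the results table (seq_0 and seq_3).

```python
import math

# Page sequences taken from the sequence analysis results table.
SEQUENCES = {
    "seq_0": ["program", "product", "p_info", "product"],
    "seq_3": ["home", "catalog", "program", "product"],
}


def matches_sequence(path, pages):
    """True if `pages` occur in `path` in the given order, gaps allowed."""
    position = 0
    for page in pages:
        try:
            # Search only after the previously matched page.
            position = path.index(page, position) + 1
        except ValueError:
            return False
    return True


def session_features(path, sequences):
    """Build input variables for one session: click count, its log, sequence flags."""
    if not path:
        raise ValueError("a session needs at least one click")
    features = {"clicks": len(path), "log_clicks": math.log(len(path))}
    for name, pages in sequences.items():
        features[name] = int(matches_sequence(path, pages))
    return features


# The example session from the transaction table.
EXAMPLE_PATH = ["home", "catalog", "program", "product", "login"]
FEATURES = session_features(EXAMPLE_PATH, SEQUENCES)
print(FEATURES)
```

For the example session, seq_3 is flagged 1 and seq_0 is flagged 0, because the session never visits p_info. Matching on whole page names rather than on one concatenated string avoids accidental hits where one page name is a substring of another.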