Post-processing Decision Trees to Extract Actionable Knowledge

Qiang Yang and Jie Yin
HKUST, Hong Kong, China
and
Charles X. Ling and Tielin Chen
Department of Computer Science, University of Western Ontario, Canada


CRM

• Customer Relationship Management: focus on customer satisfaction to improve profit
• Two kinds of CRM
    Enabling CRM: infrastructure, multiple touch point management, data integration and management, …
        Oracle, IBM, PeopleSoft, Siebel Systems, …
    Intelligent CRM: data mining and analysis, database marketing, customization
        Vendors/products (see later)


From Data Mining to Actions

What to do to help Sammy to get loan approval?

Customer database of applicants:

    Applicant   Income   Married   Cars   Approved?
    Sammy       50K      n         1      ?
    Beatrice    50K      y         1      Yes
    Dylan       80K      n         2      Yes
    Mathew      30K      n         1      No
    Larry       40K      n         0      No
    Basil       80K      n         1      Yes

• Action 1 (Dylan): get higher income, to 80K
• Action 2 (Beatrice): get married!!


Actionable vs. Passive Data Mining

• Improve customer relationship
• What actions to take to change customers from an undesired status to a desired one
• Actions (promotion, communication) bring changes:
    From churn to loyal
    From inactive to active
    From low spending to high spending
    From non-buyers to buyers
• … and still make a profit (the ultimate goal)


Approach: Post-processing Decision Trees

• Mining actions from decision trees
• Bounded action problem
• Bounded segment problem
• Our solutions


Post-processing Decision Trees

1. Get Customer Data (marketing DB)
2. Build Customer Profiles
3. Search Actions for Maximal Profit
4. Action Delivery


Step 1: Get Customer Data

• Marketing DB: segmentation, data preparation, pre-processing…
• Define a “target”: undesired status and desired status

    ID     Name   Age   Sex   Service   Rate   Prof   …   Retained (Target)
    1001   John   50    M     H         L      A      …   Yes
    3010   Sue    25    F     M         H      D      …   No
    …      …      …     …     …         …      …      …   …
    1112   Jack   40    M     M         L      B      …   ???


Step 2: Build Customer Profile on target

• Built automatically by Proactive Solution, with probabilities on the target

[Figure: decision tree with internal nodes Service, Sex and Rate (branch values L, M, H and F, M) and leaf probabilities Prob=0.1, Prob=0.9, Prob=0.2, Prob=0.8 and Prob=0.5]


Step 3: Search Actions for Maximal Profit

• Proactive Solution searches more desired nodes in the profile…

    ID     Name   Age   Sex   Service   Rate   Prof   …   Retained
    …      …      …     …     …         …      …      …   …
    1112   Jack   40    M     M         L      B      …   ???

Jack: …, Service = M, Sex = M, …    Profit = $4000

[Figure: the same decision tree, with each leaf annotated for Jack. The probability gains are measured from the Prob=0.2 leaf.]

    Leaf       Prob gain   Actions                     E. Profit   Cost        E. Net Profit
    Prob=0.1   -0.1        Service: M→H, Rate: ?→L     -$400       $500        -$900
    Prob=0.9   0.7                                     $2800       not given   not given
    Prob=0.2   (no annotation)
    Prob=0.8   0.6                                     $2400       $800        $1600
    Prob=0.5   0.3                                     $1200       $400        $800


Step 4: Action Deployment

    ID     Name   …   Prob diff   Actions                     Action costs   NetProfit
    1112   Jack   …   0.6         Service: M→H                $800           $1600
    3010   Sue        0.5         Rate: L→M, SigAcc: 0→1      $500           $700
    3421   Bill   …               Service: L→M                N/A            $0

• Selective deployment: human intelligence, …
• Customer segmentation by actions


Practical Issue: Resource is Bounded

• Limited number of account managers
    Thus, the number of customer segments is bounded
    Research: how to generate no more than K customer segments, such that the expected net profit is maximized
    We call this the Bounded Segmentation Problem (BSP)
• Limited number of marketing actions: for each segment, find a set of common actions to apply
    Thus, the types of actions are limited
    We call this the Bounded Attribute Set Problem (BASP)
• Both problems are NP-hard.


The Bounded Segmentation Problem

• Resources are bounded!
• Group (potential) negative-class customers into a prespecified number k of customer segments.
• Recommend “near optimal” actions to help each of the k customer segments switch to a more profitable positive class.
• The same actions are applied to every customer in a segment (same manager)
• The expected net profit is to be maximized
• Each action may have a different cost and bring different profits


The Bounded Segmentation Problem is NP-Complete

• Equivalent to the maximum coverage problem.
• NP-hard problem!
• We seek approximate solutions!


The Bounded Segmentation Problem: Greedy Algorithm

1. Discover who the negative-class customers are.
2. Build a decision tree as the classifier.
3. Group negative-class leaf nodes into k customer segments using a greedy algorithm.

• Each customer segment → one action set
• The total profit gain from applying these k action sets is to be maximized.
• The algorithm is based on finding the current largest coverage in linear time


An Example: K=2

[Figure: decision tree over Service (L, H), Status (A, B) and Rate (C, D) with four leaves]

    Leaf   Prob
    L1     0.9
    L2     0.2
    L3     0.8
    L4     0.5

• If we want to find two customer segments (k=2):
• It is more profitable to transform L2→L1 and L4→L3 than any other pair
• Profit gain = ((0.9 - 0.2) × 1 - 0.2) + ((0.8 - 0.5) × 1 - 0.1) = 0.7, where 0.2 and 0.1 are the costs of the two transformations


Experiment on Mutual Fund Data

• GreedyBSP can find k customer segments with maximal profit.
• Its results are very close to those found by OptimalBSP.
• GreedyBSP is more scalable than OptimalBSP.


Summary

• From decision-tree model building to extracting actions for profit
• Goal: maximal net profit
• Resource is bounded
• Design optimization solutions for action extraction: BASP and BSP
• Future: explore more efficient solutions
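

Illustrative sketch (not from the slides)

The net-profit arithmetic of Step 3 and the K=2 example can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' GreedyBSP: it treats "move every customer in a leaf to target leaf T" as one action set, and it greedily adds the target leaf with the largest marginal gain in total expected net profit. The function names, the `cost` dictionary layout, and the two cross-move costs marked hypothetical are assumptions introduced only for illustration.

```python
from typing import Dict, List, Tuple


def expected_net_profit(p_from: float, p_to: float,
                        profit: float, cost: float) -> float:
    """Step 3 rule: (probability gain) * profit - action cost."""
    return (p_to - p_from) * profit - cost


def greedy_segments(leaf_prob: Dict[str, float],
                    negative: List[str],
                    cost: Dict[Tuple[str, str], float],
                    k: int,
                    profit: float = 1.0) -> Tuple[Dict[str, str], float]:
    """Greedily choose at most k target leaves (assumed stand-in for action sets).

    Returns (source leaf -> chosen target leaf, total expected net profit).
    A source leaf with no profitable move is left alone (net profit 0).
    """
    # Net profit of every (source, target) move that has a known cost.
    net = {(s, t): expected_net_profit(leaf_prob[s], leaf_prob[t], profit, c)
           for (s, t), c in cost.items()}
    targets = sorted({t for _, t in net})
    best = {s: 0.0 for s in negative}      # best net profit found per source leaf
    assign: Dict[str, str] = {}            # source leaf -> target leaf
    chosen: List[str] = []

    for _ in range(k):
        candidates = [t for t in targets if t not in chosen]
        if not candidates:
            break
        # Marginal gain of a target: how much it improves on the current best.
        gains = {t: sum(max(net.get((s, t), 0.0) - best[s], 0.0)
                        for s in negative)
                 for t in candidates}
        t_best = max(candidates, key=lambda t: gains[t])
        if gains[t_best] <= 0:
            break                          # no remaining target adds profit
        chosen.append(t_best)
        for s in negative:
            value = net.get((s, t_best), 0.0)
            if value > best[s]:
                best[s] = value
                assign[s] = t_best
    return assign, sum(best.values())


# Leaf probabilities and the two costs 0.2 and 0.1 come from the K=2 slide.
leaf_prob = {"L1": 0.9, "L2": 0.2, "L3": 0.8, "L4": 0.5}
negative = ["L2", "L4"]
cost = {
    ("L2", "L1"): 0.2,
    ("L4", "L3"): 0.1,
    ("L2", "L3"): 0.5,   # hypothetical cost, not given on the slide
    ("L4", "L1"): 0.5,   # hypothetical cost, not given on the slide
}

if __name__ == "__main__":
    # Jack, Step 3: move from the 0.2 leaf to the 0.8 leaf, profit $4000, cost $800.
    print(round(expected_net_profit(0.2, 0.8, 4000, 800), 2))
    assign, total = greedy_segments(leaf_prob, negative, cost, k=2)
    print(assign, round(total, 2))
```

With the slide's numbers, the first call gives Jack's expected net profit of $1600 for the Service: M→H action, and the greedy selection with k=2 pairs L2 with L1 and L4 with L3 for a total gain of about 0.7, matching the example. The per-round rescan of all candidate targets is a simplification; the slides only state that the authors' algorithm finds the current largest coverage in linear time.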