Applying Analytics to Search Engine Marketing (SEM):
(i) Modeling KW Revenue Per Click (RPC)
(ii) Harnessing the Long Tail of KWs

Sameer Chopra, VP, Marketing Analytics, Orbitz Worldwide
March 2011
Proprietary and Confidential

Agenda
• Notes/Caveats
• Importance of deriving Rev Per Click (RPC) for KWs
  – Recap: (i) Google AdWords ad ranking logic, (ii) the Max CPC Bid vs. Actual CPC relationship
• Why is the Long Tail of keywords important?
• Tapping into the Long Tail opportunity: mining the tail for KW addition
• Decision Tree approach for RPCs: meta-data-driven matrix of rules
• Efficiency Curves: measuring efficiency gains
• Q&A

Notes/Caveats
• Speaking to a blend of different experiences and approaches gathered over the years across different firms.
• This is just one of many approaches in use (e.g., (i) two-stage modeling, (ii) predicting Revenue separately from Clicks, etc.).
• Re: the title of the presentation: analytics is applied to many other areas of SEM as well: match type analysis, day-parting analysis, geo-targeting analysis, spend allocation optimization, etc. Those topics clearly cannot be covered in the time available for this talk.

Importance of Deriving Rev-Per-Click (RPC) for KWs
• Given a bid (Max CPC), the actual cost (Actual CPC) is well understood. For ad A, with B the ad ranked immediately below it and QS denoting Quality Score:
  Actual_CPC_A = (Max_CPC_B * QS_B) / QS_A + 1¢ (rounded up to the nearest penny)
• But how much should you bid in the first place? This is where KW valuation (RPC) comes into play.
[Figure: Google AdWords ad ranking logic. Source: Google AdWords Learning Center]

Why is the Long Tail of KWs Important?
• Long Tail: search terms with 4+ words now represent the majority of user queries (compare short, broad head KWs to long-tail phrases: “Nike shoes” vs. “Red Nike shoes for men size 12”).
• Increasing competition (more ads per keyword) leaves advertisers facing higher bid prices and an uphill task of maximizing ROI; the long tail can be a very handy lever.
• Long-tail terms do not have high search volume individually, but in aggregate the Long Tail provides the vast majority of a site’s search traffic and conversions.
• Long-tail user queries are often very specific and highly qualified, with searchers further along the purchase cycle.
• LT KWs are more targeted and more likely to convert than generic, non-branded head KWs.
• Long Tail keywords can offer excellent ROI because they are less competitive and less expensive to bid on for PPC.
• ~70% of user queries have no exact-matched KWs: a big ROI opportunity in the long tail.

Tapping into the Long Tail Opportunity: Mining the Tail for KW Addition
• A common mistake: bidding almost all KWs on Broad Match on auto-pilot. This precludes being relevant and targeted (ad copy / landing page) and imposes a “one size fits all queries” max bid.
• Use Broad Match as a “net” to fish for long-tail queries, the diamonds in the rough:
  – Analyze which queries result in higher conversion: bid these on Exact Match and group them into ad groups. (Exact matching is more targeted and generally results in higher Quality Score, lower CPCs, and higher ROI.)
  – Analyze which queries do not convert: perfect fodder for negative keywords.
• Note: utilizing Broad Match is a good strategy if done effectively; not using BM at all is akin to “throwing the baby out with the bath water” given the value in long-tail queries.
• Account structure: there is an added necessity to get smart about grouping/clustering the Long Tail KWs into tightly themed ad groups with associated copy and landing pages, but it is well worth the effort since you capture searcher intent effectively.
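The Actual CPC formula above can be sketched directly. A minimal illustration; the function name and the example bids and Quality Scores are made up:

```python
import math

def actual_cpc(max_cpc_below: float, qs_below: float, qs_ad: float) -> float:
    """Actual CPC for ad A per the slide's formula:
    (Max_CPC_B * QS_B) / QS_A + 1 cent, rounded up to the nearest penny,
    where B is the ad ranked immediately below A."""
    raw_cents = ((max_cpc_below * qs_below) / qs_ad + 0.01) * 100
    # round() guards against float noise before rounding up to a whole cent
    return math.ceil(round(raw_cents, 6)) / 100

# Hypothetical auction: the ad below bids $1.50 with QS 6; our ad has QS 8.
print(actual_cpc(1.50, 6, 8))  # → 1.14  (raw cost $1.135, rounded up)
```

Note how a higher Quality Score for our own ad (the divisor) directly lowers the price we pay, which is the mechanism behind the "higher QS, lower CPC" point in the Exact Match discussion.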
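The broad-match "net" workflow above can be sketched as a single pass over a search-query report. A hedged sketch: the report rows, the click threshold, and the list names are invented for illustration:

```python
# Hypothetical search-query report rows harvested from broad-match campaigns:
# (query, clicks, conversions, revenue)
report = [
    ("red nike shoes for men size 12", 40, 3, 210.0),
    ("nike shoes history", 55, 0, 0.0),
    ("charming bed and breakfast boston", 25, 2, 380.0),
    ("free nike shoes", 80, 0, 0.0),
    ("red nike running shoes womens", 6, 1, 40.0),  # too little data so far
]

MIN_CLICKS = 20  # assumed threshold: enough traffic to judge a query

exact_match_adds = []  # converters: promote to exact match, with observed RPC
negative_kws = []      # non-converters: fodder for negative keywords
for query, clicks, conversions, revenue in report:
    if clicks < MIN_CLICKS:
        continue  # keep fishing with broad match until there is enough data
    if conversions > 0:
        exact_match_adds.append((query, revenue / clicks))
    else:
        negative_kws.append(query)

print(exact_match_adds)
print(negative_kws)
```

The promoted queries would then be clustered into tightly themed ad groups, as the account-structure note above recommends.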
Decision Tree Approach for RPCs: Meta-Data-Driven Matrix of Rules
TOKENIZATION: categorization/mapping of KW tokens to capture intent; yields useful variables for segmentation and tree splitting, and for KW generation.
Examples:
1. KW = “Red Nike Shoes”
   • Product = ‘Shoes’
   • Color = ‘Red’
   • Brand Name = ‘Nike’
2. KW = “Charming Bed & Breakfast in Boston”
   • Product = ‘B&B’
   • Modifier = ‘Charming’
   • City = ‘Boston’
VARIABLE CREATION: a very important step in building an effective D-tree model is the creative derivation of additional variables.
Examples:
• # of tokens in the KW
• Length of the KW
• KW contains <XYZ>
• Brand Y/N indicator, etc.

Regression Tree: Pseudo-CART Approach *** ILLUSTRATIVE ***
[Figure: illustration of a Decision Tree model for RPCs]
Example: KW = “Looking for a Discounted B&B in SFO that accepts Pets” would get segmented with the KWs in the bottom-right leaf node, giving a D-tree model RPC of $0.35. The weight α_kw is used to incorporate actuals for KWs with lots of information.

Decision Tree Approach (continued)
Some Pros:
• Trees represent rules which are easily understood by humans and easily expressed in SQL/SAS/PMML, etc.
• The path from root node to leaf is easy to follow, so the explanation for any prediction is easy to understand.
• Trees carve up and completely cover the multi-dimensional space, enabling us to assign any new record to an outcome based on which region it falls into.
• Robust/insensitive to outliers, missing values, and skewed data distributions.
• Non-parametric: no assumption that the dependent variable follows any given distribution.
• Relatively little data preparation needed (e.g., a categorical field with hundreds of values is handled by forming groups).
• Data exploration: can relatively quickly (no more than one pass per level of the tree) explore large datasets to determine useful input variables.
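The tokenization and variable-creation steps above can be sketched as follows. The attribute sets are tiny stand-ins; a production system would map tokens against large curated lists of products, brands, cities, modifiers, etc.:

```python
# Toy attribute dictionaries (assumed for illustration)
BRANDS = {"nike"}
COLORS = {"red", "blue"}
PRODUCTS = {"shoes", "b&b", "hotel"}

def tokenize_kw(kw: str) -> dict:
    tokens = kw.lower().split()
    return {
        # token-to-attribute mapping, as in the slide's "Red Nike Shoes" example
        "brand": next((t for t in tokens if t in BRANDS), None),
        "color": next((t for t in tokens if t in COLORS), None),
        "product": next((t for t in tokens if t in PRODUCTS), None),
        # derived variables for segmentation and tree splitting
        "n_tokens": len(tokens),
        "kw_length": len(kw),
        "is_brand": any(t in BRANDS for t in tokens),
    }

print(tokenize_kw("Red Nike Shoes"))
```

The resulting attribute dictionary is exactly the record the tree splits on, and the same mappings can be run in reverse to generate new long-tail KW candidates.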
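Because each root-to-leaf path is a rule, a fitted tree reads naturally as nested conditions. A sketch with hypothetical splits: only the $0.35 bottom-right leaf comes from the slide's example; the attribute names and other leaf values are placeholders:

```python
def tree_rpc(kw_attrs: dict) -> float:
    """Score a tokenized KW by walking root-to-leaf conditions.
    Splits and most RPC values are placeholders; the $0.35 leaf is the
    value quoted in the slide's example."""
    if kw_attrs.get("product") == "b&b":
        if kw_attrs.get("discounted") and kw_attrs.get("pets"):
            return 0.35  # e.g. "Looking for a Discounted B&B ... accepts Pets"
        return 0.55      # placeholder leaf value
    return 0.80          # placeholder leaf value

print(tree_rpc({"product": "b&b", "discounted": True, "pets": True}))  # → 0.35
```

The same rule structure translates line-for-line into SQL CASE expressions or PMML, which is the interpretability advantage listed under the pros.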
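The slide states that a weight α_kw blends a KW's own actuals into the tree estimate for KWs with lots of information, but does not give its functional form. One common choice, assumed here, is a click-volume shrinkage weight:

```python
def blended_rpc(actual_rpc: float, tree_rpc: float, clicks: int,
                k: float = 100.0) -> float:
    """Dampened RPC: alpha_kw grows with the KW's click volume, so
    high-information KWs lean on their own actuals while sparse long-tail
    KWs lean on the tree-segment estimate. The k = 100 damping constant
    and this functional form for alpha_kw are assumptions."""
    alpha = clicks / (clicks + k)
    return alpha * actual_rpc + (1 - alpha) * tree_rpc

print(blended_rpc(1.20, 0.35, clicks=5))    # sparse KW: near the tree's 0.35
print(blended_rpc(1.20, 0.35, clicks=400))  # high-volume KW: near its own 1.20
```

This blending is also what smooths the tree's step-function outputs into a continuous spectrum of RPCs, as noted under the cons.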
Some Cons:
• Over-fitting: can result in large, unstable trees with many leaf nodes that do not lend themselves to easy interpretation or prediction; the tree must be pruned and validated against a holdout/validation data set.
• “Greedy” splitting algorithm: variables are examined hierarchically/sequentially rather than simultaneously.
• Not all data fits neatly into rectangular regions (a deep tree would be needed to approximate a required diagonal split, for instance).
• Usually used for classification; other data mining techniques are generally better suited for continuous estimation.
• A D-tree can only generate as many discrete values as there are leaves, so estimates are discrete, “lumpy”, and step-function-like. (Note: since we use dampening with weights, final predictions form a continuous spectrum of RPCs.)

Efficiency Curves: Measuring Efficiency Gains *** ILLUSTRATIVE CHART ***
[Chart: daily revenue R vs. daily spend S; two fitted curves, R = A*S^n1 and R = B*S^n2 with 0 < n1 < 1 and 0 < n2 < 1; the vertical gap between the curves in the common spend region is the efficiency gain.]
• Note: model fit and business lift do not always go hand in hand. Accuracy by itself is not a sufficient measure of a model’s usefulness to business performance.

Q&A
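The efficiency curves R = A*S^n can be fitted by ordinary least squares in log-log space, since log(R) = log(A) + n*log(S) is linear. The fitting method and the sample observations are assumptions; the slides give only the functional form:

```python
import math

def fit_power_curve(spend, revenue):
    """OLS fit of log(R) = log(A) + n*log(S); returns (A, n)."""
    xs = [math.log(s) for s in spend]
    ys = [math.log(r) for r in revenue]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    n = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    A = math.exp(y_bar - n * x_bar)
    return A, n

# Hypothetical daily (spend, revenue) observations before and after a change:
spend = [100.0, 200.0, 400.0, 800.0]
rev_before = [2.5 * s ** 0.45 for s in spend]
rev_after = [3.0 * s ** 0.50 for s in spend]

A1, n1 = fit_power_curve(spend, rev_before)
A2, n2 = fit_power_curve(spend, rev_after)

S = 300.0  # a point inside the common spend region
gain = A2 * S ** n2 - A1 * S ** n1  # vertical gap = efficiency gain at spend S
print(round(gain, 2))
```

Measuring the gain at a common spend level isolates the efficiency improvement from mere budget changes, which is the point of comparing the two curves over the common spend region.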