Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Adaptive Web Navigation for Wireless Devices Corin Anderson Pedro Domingos Dan Weld Web personalization • Web sites designed “one size fits all” • But one size does not fit all – Visitors will use the site in unforeseen ways – Browser may be resource-constrained – Unintuitive or impossible to use • Personalization adapts and customizes content for each visitor • In most need: mobile web visitors 2 c|net on the desktop •1024x768 screen –Multi-column okay –Many links easily visible •Fast network –Hierarchical link structure okay –Large images, HTML pages 3 c|net on a wireless Palm • Very small screen – Only few lines of text – Lots of scrolling • Slow net connectivity – Following links costly Challenge: automatically improve PC-centric web site for mobile browsing 4 Web site personalizers • An intermediary between server and visitor Visitor Personalizer Web server • Adapts and customizes site for each visitor • Personalizations include: – Add shortcut links shorten long paths – Rearrange content to increase salience – Elide content, replacing with link 5 Old and new • 590AI – Autumn 2000 – Proteus – Shortcut links and content elision – Key ideas: Expected utility and search • 590AI – Spring 2001 – MinPath – Shortcut links – Key ideas: clustering and predictive models 6 Trails • A trail is a sequence of page requests… • …coherent in time… • …and coherent in space 7 Shortcuts • Connect two previously unconnected pages • Savings of shortcut is # links skipped – Don’t forget the link you follow – the shortcut itself! 8 Shortcut link selection problem • Given: – visitor V – trail prefix p0 ,, pi – maximum number of shortcuts m • Output: pi q1,, pi qm – list of shortcuts that minimize the number of links the visitor must follow to reach the visitor’s destination 9 Finding shortcuts • If we know the whole trail p0 ,, pi ,, pn • finding the right shortcut is easy pi pn • Unfortunately, omniscience is hard to come by 10 Expectation • All we really know is the prefix • We must guess the rest of the trail • Idea: – Foreach suffix of trail on site • Calculate the probability of that suffix • Add that probability to the shortcut to the end of that suffix – Return top m shortcuts 11 Predictive models • We don’t really guess the suffix – we try all of them on the site • We calculate each probability using a model of behavior • [[ predict the next request given the past requests, the position in the trail, and the visitor’s identity ]] • We’ve tried a number of variations… 12 Unconditional model • Ignore visitor identity, trail prefix, position • [[equation 1 from paper]] • Of course, the visitor is bound to the links actually on the page; MinPath thus uses: • [[next equation from paper]] 13 Naïve Bayes mixture • Unconditional builds one model for everyone • Intuitively, we suspect that not everyone behaves the same • Cluster visitors, and condition prediction on cluster identity 14 Clustering • Use Expectation-Maximization – Simultaneously cluster, build models • Cluster sequences, not visitors • … 15 Clusters of unconditional models 16 Markov models • • • • Condition on past history First order: one bit of history [[ equation ]] Also build mixtures of Markov models 17 Request position • Training data suggest ordinal position important • We’ll try adding position to unconditional and Markov models 18 Experiments • Train models on 20 days of logs (35,000 trails) • Test on 1.5 days (2,500 trails) • All trails have length >= 3 (3 pages, 2 links) • Measure performance as # links followed to reach destination 19 Results • Compare different models • Compare different cluster sizes • Compare cluster assignment method • MinPath does save navigational effort • Markov models offer most savings • Clustering helps 20 figure 1 from ijcai paper 21 Figure 2 from ijcai paper 22 Clusters at www.cs Anecdotal clusters from www.cs data 23 Summary • • • • Shortcuts for navigation cluster model save 40% 24 Ongoing work • Proteus, MinPath feed on HTML and web graph • But many sites are built from databases and templates • What adaptations are possible given a declarative model of the site? 25