Tallinn University of Technology
Department of Computer Engineering
Applying User Profile Ontology for Mining Web Site Adaptation Recommendations
Tarmo Robal, Ahto Kalja
[email protected], [email protected]

Outline
Introduction
» Web Mining & Adaptive Web Sites
» Recommender Systems
Web Usage Data Capturing
User Profiles Extraction
Recommendations Generation
Summary

Tarmo Robal, Ahto Kalja ADBIS'07, Bulgaria, Varna 29.09-03.10.2007

Introduction
The electronic age
» Internet – enormous source of information
» Competition over users
» Browsing affected by many factors
System feedback
» What is actually going on within the system
» Observe users’ actions & preferences
Constant need for web improvement!

Reaching the Aim
Make browsing easier for a better user experience
Collect usage data
» Exploit a log system
Apply web mining techniques on the collected data to:
» Analyse & Reason
Employ the mining results
» Construct users’ profile ontology
» Adaptive websites & Recommender systems
Continue collecting usage data

Introduction
Research based on the access data of the website of our department
» Dynamic website
» Run by a system kernel developed at our lab
» Holds 118 pages
» Average access rate: 250 sessions daily
» Average number of operations per session: 1.9 (4.3 in sessions with more than 1 page request)
» http://ati.ttu.ee

Web Mining ...
is the use of data mining techniques to automatically discover and extract information from Web documents and services (Perkowitz and Etzioni 2001)
Content Mining
» discovery of document content patterns
Structure Mining
» discovery of hypertext/linking structure patterns
Usage Mining
» discovery of access patterns
Profile Mining
» discovery of user profiles

Adaptive Websites ...
sites that automatically improve their organization and presentation by learning from visitor access patterns
Tactical
» Adaptions triggered in real time
» Adding value to provided information
» Highlighting items
» Recommending items
» Easier browsing
Strategic
» Adaptions triggered on the structure
» Offline & with approval
Towards enhanced web experience!

Recommender Systems
To assist users during browsing
Improved user experience
More relevant information for the user
Based on site’s usage:
» Transparent, i.e. general
» Personalized (i-banking)
Why?
» Constant competition over rating
» Marketing, e-commerce, information portals, ...

Recommender Systems
Users implicitly use a concept model based on their own knowledge of the domain or topic searched, even though mostly they do not know how to represent it!
(Li & Zhong)
If we are able to track users’ actions, we are also able to produce dynamically discovered recommendations
Step towards intelligent web
Basis for adaptive web

Collecting Web Usage Data
Explicit data collection
Implicit data collection
» Transparent to end-user
» Monitor accessed pages
» Time spent on a particular page
» Discover navigational paths
Need for a special log system
» Ability to capture distinct and recurrent user sessions

Web Server Logs NOT Suitable?
Reasons:
» suffer from insufficiencies
» do not allow visitor sessions to be identified
» make it impossible to track recurrent visits
» hold no information about users’ screen resolution
» are not kept for a long period of time
» are of large size
» hold a lot of detailed information about every element accessed on the web server

The Log System
Data collected:
» Page requested
» Client identifier (session ID)
» Request time
» IP and host
» Browser and OS
» Query method and query string
» Site referrer
» Page load time and server load
» Recurrent visit ID (session ID)
» Screen resolution

User Profiles Extraction
Construct user navigational paths from session data
$s = \langle p_i, p_{i+1}, \ldots, p_n \rangle,\ p_i \in P$
» 269 782 paths
Apply further processing
» 87 953 paths
Apply the Locality Model onto discovered paths
» Extract localities L
$L = \langle p_j, p_{j+1}, \ldots, p_m \rangle$, where $p_j \neq p_{j+1} \neq \ldots \neq p_m$
» Size of locality window w?
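
The extraction steps on this slide can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `minimize_path` and `extract_localities` are hypothetical, "further processing" is assumed to mean collapsing immediate repeats of the same page, and a locality is assumed to be a window of w consecutive requests whose pages are all distinct. The page identifiers are borrowed from the worked example later in the deck.

```python
from typing import List, Sequence, Tuple


def minimize_path(path: Sequence[int]) -> List[int]:
    """Collapse immediate repeats of the same page.

    Assumption: this is what "further processing" of a path involves.
    """
    out: List[int] = []
    for page in path:
        if not out or out[-1] != page:
            out.append(page)
    return out


def extract_localities(path: Sequence[int], w: int) -> List[Tuple[int, ...]]:
    """Slide a window of w consecutive requests over the path.

    Assumption: windows that revisit a page (cyclic windows) are dropped,
    so only windows with w distinct pages are kept.
    """
    windows = [tuple(path[k:k + w]) for k in range(len(path) - w + 1)]
    return [win for win in windows if len(set(win)) == w]


# Session taken from the deck's worked example (page IDs of the site).
session = [100, 400, 410, 410, 400, 410, 4110, 410, 460, 430, 430]
path = minimize_path(session)
localities = extract_localities(path, w=3)

for loc in localities:
    print(" - ".join(str(p) for p in loc))
```

Under these assumptions the sketch yields the four localities that the deck's own example ends with: 100 - 400 - 410, 400 - 410 - 4110, 4110 - 410 - 460 and 410 - 460 - 430.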

The Locality Model
If a large number of users frequently access a set of pages, then these pages must be related
The locality L is defined by the user’s most recent sequential activity history within the site during a session
L is constructed based on navigational paths
Users move from one locality L to another, which can be represented by the w latest operations (requests for pages)
[Figure: a window of size w = 3 sliding over the path 100 – 400 – 410 – 400 – 410 – 4110 – 410 – 460 – 430; L = CalculateLocality(st, w), N = FindNextItem(st, w)]

The Locality Model
What’s the size of w?
w has to cover a reasonable amount of page requests
Attributes observed:
» cover percentage for the number of combinations computed from the paths
» average frequency of finding these combinations in paths
» average number of possible localities in path
» the availability of next item for each locality (progress)
The size of w is correlated to the absolute menu depth

Properties observed                   Studied window size w
                                      2      3      4      5
(1) Combination coverage [%]          31.2   35.5   20.7   12.6
(2) Combination frequency             1.1    1.0    1.0    1.0
(3) No. of localities in path         6.3    6.6    6.5    5.9
(4) Availability of next item [%]     76.6   77.4   74.1   76.3

User Profiles Extraction
User session from DW
Navigational path sequence construction
» 100 – 400 – 410 – 410 – 400 – 410 – 4110 – 410 – 460 – 430 – 430
Path minimization – removal of redundant operations
» 100 – 400 – 410 – 400 – 410 – 4110 – 410 – 460 – 430
Filtering of non-relevant paths (e.g. paths with 1 item)
» 100 – 400 – 410 – 400 – 410 – 4110 – 410 – 460 – 430
Extracting localities L with size w
» 100 – 400 – 410
» 400 – 410 – 400
» 410 – 400 – 410
» 400 – 410 – 4110
» 410 – 4110 – 410
» 4110 – 410 – 460
» 410 – 460 – 430
Removal of cyclic localities
» 100 – 400 – 410
» 400 – 410 – 4110
» 4110 – 410 – 460
» 410 – 460 – 430
Extracted user profiles
Ontology

User Profiles Ontology
Frequent user profiles discovered from web usage
Predefined user profile classes
Mapping of Extracted Profiles onto Concepts of Web Ontology
Concepts of Web Ontology for ati.ttu.ee

Inferred Ontology
Definitions for predefined user profile classes
Profiles inferred for predefined user profile class
Users profiled as Students are interested in ...

Producing Recommendations
The Recommendation Engine (RE) determines the type of user online
RE computes recommendations from
» User’s recent actions
» Knowledge from ontologies
» Page ranking
Pages ranked with inverse time weighting algorithm
$$\mathrm{Rank}_p = \sum_{i=1}^{n} \frac{\text{Interest value}(i)}{\text{Age}(i)}$$
» Interest value(i): No. of hits during Age(i)
» Age(i): Days into the past

Producing Recommendations
[Architecture diagram; components: Log System, Usage Data Capturing, Data Mining, Detection of Locality Window Size w, Extracted User Profiles, Web Ontology, Web Site Ontology, Mapping, Profiles Ontology, Ranked Pages, The Locality of User Online (recent w actions), Recommendation Engine (RE), Tactical Adaption, Strategic Adaption, Recommended Sub-Topology, Refined Topology, Web Site]

Tactical Recommendations
Raising / highlighting items during user’s online session
Adding recommended items to existing topology
Providing sub-topologies for targeted user groups
Enhanced (semi-personalized) User Experience

Strategic Recommendations
Deriving recommendations for general site improvement to adjust sites to their users’ preferences
Long-term
Discovering related page-sets according to users’ preferences
Improved Site Structure

Conclusions
Monitoring users’ actions and building concept models from them makes it possible to:
» Classify a user as an individual into one of the conceptual user groups (predefined user profiles)
» Produce recommendations that correlate to that particular individual
» Tactical recommendations
» Strategic recommendations

Summary
Introduction
» Web Mining & Adaptive Web Sites
» Recommender Systems
Web Usage Data Capturing
User Profiles Extraction
Recommendations Generation
Summary

Tallinn University of Technology
Department of Computer Engineering
Thank you! Questions?
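
As a supplementary note to the Producing Recommendations slide, the inverse time weighting used for page ranking can be sketched as follows. This is only one reading of the slide, not the authors' code: it assumes that Age(i) is simply i days into the past, that Interest value(i) is the number of hits recorded on that day, and that a page's rank is the sum of hits divided by age. The function name `rank_page`, the page identifiers and the hit counts are hypothetical.

```python
from typing import Dict, List, Sequence


def rank_page(hits_by_age: Sequence[int]) -> float:
    """Inverse-time-weighted rank of one page.

    hits_by_age[i - 1] is the number of hits i days into the past
    (assumed reading of "Interest value(i)" and "Age(i)").
    """
    return sum(hits / age for age, hits in enumerate(hits_by_age, start=1))


# Hypothetical hit counts per page, most recent day first.
hits: Dict[str, List[int]] = {
    "410": [10, 4, 6],   # steady recent interest
    "460": [0, 0, 30],   # many hits, but three days ago
}

ranks = {page: rank_page(counts) for page, counts in hits.items()}
ranked_pages = sorted(ranks, key=ranks.get, reverse=True)

print(ranks)
print(ranked_pages)
```

With these numbers, page 410 scores 10/1 + 4/2 + 6/3 = 14 and page 460 scores 30/3 = 10, so the page with fewer total hits ranks higher because its hits are more recent.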