Understanding Web Searching
Secondary Readings and So On…
Will Meurer for WIRED
October 7, 2004

Introduction
• Why do we care about how people use the Web?
• Today’s topics (10/7, not the present age):
  – Implicit vs. explicit feedback
  – Representation effectiveness
  – Browser-based activities
  – History mechanisms
  – How do we cater to the people?
  – Resources
  – Research

Implicit vs. Explicit Feedback
Reading Time, Scrolling and… (Kelly & Belkin, 2001)
• Implicit feedback (Morita & Shinoda):
  – Time spent on a page is directly related to user interest. Backed by many studies.
• Explicit feedback (this study):
  – Time spent on a page is similar for relevant and irrelevant content.
• Results suggest:
  – “Generalizability” is severely affected by explicit feedback methods.
  – Spend time to choose the right feedback type!

Implicit vs. Explicit Feedback
Reading Time, Scrolling and… (Kelly & Belkin, 2001)
• Why do the results differ?
  – Relevance was difficult to distinguish this time
  – Participants were truly interested in the content in former studies
  – Users may have rushed to complete the task in this experimental context

Representation Effectiveness
How we really use the Web (Krug, 2000)
Three “facts of life”:
1. “We don’t read pages. We scan them.”
  – Why? hurry, necessity, habit
  – If we are going to read a page in its entirety, we save or print! (ClearType project)

Representation Effectiveness
How we really use the Web (Krug, 2000)
2. “We don’t make optimal choices. We satisfice.”
  – Why? hurry, quick access to and fro, less work than thinking
  – Generally, it’s more productive to guess.

Representation Effectiveness
How we really use the Web (Krug, 2000)
3. “We don’t figure out how things work.”
  – Why? not important, “if it ain’t broke (baroque)…”
  – Is it important to us whether the user understands how it works or not? Why?

Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)
• Users get lost on the Web. Why?
• It is not just interactivity between user and system, but among user, task, and information
• A structure for analyzing browsing behavior is presented and tested: “The Interactivity Framework” or “How we should analyze cognitive strategies”

Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)
• The Interactivity Framework
  – User Level – Web experience, cognitive processes, cognitive style, knowledge (CS majors knew more about SE processes)
  – User Strategies – based on searching structure (or lack of), task nature

SEARCHING CONDITIONS
Dispersed structure
  – Fact finding: look for a database algorithm in Java; look for criteria for the diagnosis of diseases
  – Exploratory: find all the available jobs for a profession
Category structure
  – Fact finding: look for a word definition
  – Exploratory: find all information about the 1997 Nobel Prize for Literature

Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)
  – Information Structure
    • Internal (user’s) representation
    • External (system’s) representation
    • Computational Offloading – How much work does the user have to do to understand, and how much does a representation help?
  – Re-representation – How much it makes problem solving easier or more difficult
  – Graphical Constraining – How it constrains inferences
  – Temporal and Spatial Constraining – How it helps when distributed over time and space

Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)
SEARCHING TASK, by participant experience
Information in Web: dispersed structure
  Specific fact finding (e.g. find criteria for a psychological disease)
    Experienced Web participants:
      • Bottom-up
      • Mixed strategy at the beginning and selecting bottom-up
    Novice Web participants:
      • Start with top-down and change at the end to bottom-up
      • Start typing without knowing why
  Exploratory (e.g. find a job opening)
      • Top-down
Information in Web: category structure
      • Mixed strategy at the beginning and then selecting top-down
      • Top-down
      • Top-down following browser categories
      • Start with bottom-up and change to top-down

Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)
• More Results
  – Experienced users searched with a plan
  – By having a plan you keep a more internal representation and focus your search
  – Inexperienced users were more influenced by external representations
  – Computational Offloading Results
    • Must explain
  – How have these issues changed?

Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)
• Conclusions
  – Cognitive strategies used by the participants depend on how the information is structured.
  – Interaction is a multidimensional concept.
  – Search engine interfaces should be designed to have a less restrictive external representation.

Browser-based Activities
Characterizing Browsing… (Catledge & Pitkow, 1995)
• User study of browsing events at Georgia Tech (xMosaic browser)
• Three main browsing strategies identified:
  – Search browsing – directed search, goal known
  – General purpose browsing – consulting highly likely sources for needed information (dictionary.com)
  – Serendipitous browsing – random
  – Most people use a combination of these

Browser-based Activities
Characterizing Browsing… (Catledge & Pitkow, 1995)
• Results
  – Users were patient 99% of the time for long page loads
  – 1222 unique sites accessed outside of GATech (~16% of Web servers)
  – Paths were calculated (sequences of page navigation)
    • Per session, paths of 7 different sites occurred 5 times
    • Per user, paths of 8 different sites occurred 9 times

Browser-based Activities
Characterizing Browsing… (Catledge & Pitkow, 1995)
• More Results
  – 2% of the retrieved pages were saved or printed
  – Based on user’s slope, browsing strategy categories were applied
  – Slope can also categorize usage patterns of Web documents
  – Users tended to operate in one small
    area of a site

Browser-based Activities
Characterizing Browsing… (Catledge & Pitkow, 1995)
• Design Strategies
  – Users averaged 10 pages per server
    • Make most important info within 2 or 3 jumps from the index
    • Do not put too many links on one page – increases search time (back, forward, back, site map, etc.)
  – Facilitate the likely visitor browsing patterns
    • Maybe make more than one version of your page?
    • Most work well in a “hub and spoke” environment
• The Future
  – Offer site tour based on most frequently traveled paths
  – Alter page design dynamically based on site trends

History Mechanisms (in browsers)
Revisitation Patterns in… (Tauscher & Greenberg, 1997)
• Purpose: Provide empirical data to aid in the development of effective history mechanisms
  – Understand revisitation patterns
  – Evaluate current mechanisms and suggest best practices and methods
• Data Collection
  – Altered version of xMosaic to record activity
  – Survey of users afterward

History Mechanisms (in browsers)
Revisitation Patterns in… (Tauscher & Greenberg, 1997)
• Revisitation Results
  – 58% recurrence rate (>40% are new pages!)
  – As people search they build their vocabulary
  – 7 browsing strategies
    • First-time visits to cluster of pages
    • Revisits to pages
    • Authoring of pages (high reload percentage)
    • Regular use of web-based apps
    • Hub-and-spoke (breadth-first approach)
    • Guided tour (e.g.
      next page links)
    • Depth-first search (following links deeply before returning to the index)

History Mechanisms (in browsers)
Revisitation Patterns in… (Tauscher & Greenberg, 1997)
• Revisitation Results
  – Visit frequency as a function of distance
    • Users mostly revisit recently visited pages (within about 6 jumps)
    • 39% chance that the next URL will match one of the previous 6 pages visited
  – Access frequency
    • 60% of pages visited only once
    • 19% visited twice
    • 8% visited 3 times
    • 4% visited 4 times
  – Locality (not valuable for predicting next page)
    • Most locality sets were small
    • Only 2.5 to 4.5 URLs per set
    • Only 15% of pages were part of a locality set
  – Paths (not valuable for predicting next page)
    • Could these be captured and offered in a history mechanism?
    • Time per page could indicate path

History Mechanisms (in browsers)
Revisitation Patterns in… (Tauscher & Greenberg, 1997)
• Mechanism types
  – Recency Ordered
    • Sequential order based on time accessed
    • Repeated entries for revisitation
    • “Pruned” by keeping only first instance or only last
    • Simple for users to understand (they remember paths)
  – Frequency Ordered
    • Most visited at top, least visited at bottom
    • User interest changes; the latest URLs must build up frequency first
    • How to break ties – last visited, earliest visited
    • When few items are on the list, this suffers
    • Difficult for users to understand

History Mechanisms (in browsers)
Revisitation Patterns in… (Tauscher & Greenberg, 1997)
• Stack-based
  – Recently visited at top
  – Order and availability depend on:
    • Loading – causes page to be added to the top
    • Recalling – changes pointer to the currently displayed page
    • Revisiting – user reloads the page, has no effect on the stack
  – Keeps duplicates
  – Non-persistent vs.
    persistent (between sessions)
  – Better than recency at short distances
  – Users have difficulty understanding this model

History Mechanisms (in browsers)
Revisitation Patterns in… (Tauscher & Greenberg, 1997)
• Hierarchically Structured
  – Recency ordered hyperlink sublists
    • Like recency w/ latest position saved
    • Each URL has its own sublist of links from that page
    • Helps with common linking paths
    • Easier to understand
  – Context-sensitive web subspace
    • Somewhat of a combination of the above-mentioned and stack-based approaches
    • Gives user better understanding of context of his/her searches
    • May be difficult to remember where a certain URL was
    • I THINK this approach would be a great tool

History Mechanisms (in browsers)
Revisitation Patterns in… (Tauscher & Greenberg, 1997)
• Do users actually use history mechanisms?
  – Less than 1% of navigation
  – 3% involve favorites
  – 30% of navigation was back button usage

How do we cater to the people?
• Inter-site browsing strategies are not easy to tackle. How would you control that?
• Why should we attempt to understand user behavior and search strategies?
  – Formulate general design principles (e.g. 3 level depth)
  – Design for multiple searching personalities
  – Understand how to survey your intended users or get feedback most appropriately
  – Identify importance of all aspects of the development process and allocate resources accordingly

How do we cater to the people?
Some Bright Ideas
• Personalized search
  – Learning systems
  – You might also like…
  – www.a9.com (history, favorites, personalized interface)
  – But what about changing for different types of user behavior based on the user’s path history on your server?
    • Researched since 1995 and earlier!
    • What has resulted?
    • Microsoft ASP.NET 2.0 – Web Parts

What resources are out there?
• xMosaic 2.6 download, for those of you so excited
• Architecture of the World Wide Web http://www.w3.org/TR/webarch/
• Sum Sun Sug Gestions http://www.sun.com/980713/webwriting/
• Jakob Nielsen – research on content usability, http://useit.com/alertbox/9710a.html

Research
• Vox Populi: The Public Searching Of The Web (2001)
  – Compares statistics from two studies
  – Shows how public searching changed from 1997 to 1999
• Usage Patterns of a Web-Based Library Catalog (2001), Michael D. Cooper
• Real Life, Real Users, and Real Needs: A Study and Analysis of User Queries on the Web (2000), Jansen, Spink & Saracevic
• Redefining the Browser History in Hypertext Terms (), Mark Ollerenshaw
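
Appendix sketch: recurrence rate and history orderings
To make the History Mechanisms slides (Tauscher & Greenberg, 1997) a little more concrete, here is a minimal Python sketch. It is not the authors' code. It assumes a browsing log is simply a list of URL strings in visit order, that the recurrence rate is the share of visits going to an already-seen URL, and that frequency ties are broken by most recent visit (one of the two options the slides mention). The function names and the sample log are hypothetical.

```python
from collections import Counter


def recurrence_rate(visits):
    """Fraction of visits that return to a URL already seen in the log.

    Assumed definition: (total visits - distinct URLs) / total visits.
    """
    if not visits:
        return 0.0
    return (len(visits) - len(set(visits))) / len(visits)


def recency_ordered(visits, prune=None):
    """History list with the most recently accessed entry at the top.

    prune=None    keeps repeated entries for revisited pages
    prune="last"  keeps each URL only at its latest position
    prune="first" keeps each URL only at its first position
    """
    if prune is None:
        return list(reversed(visits))
    if prune not in ("first", "last"):
        raise ValueError("prune must be None, 'first' or 'last'")
    # Walk the log forward to find first instances, backward for last ones.
    source = visits if prune == "first" else list(reversed(visits))
    seen, kept = set(), []
    for url in source:
        if url not in seen:
            seen.add(url)
            kept.append(url)
    # First instances were collected oldest-first, so flip them to newest-first.
    return list(reversed(kept)) if prune == "first" else kept


def frequency_ordered(visits):
    """Most visited at the top; ties go to the most recently visited URL."""
    counts = Counter(visits)
    last_seen = {url: i for i, url in enumerate(visits)}
    return sorted(counts, key=lambda u: (-counts[u], -last_seen[u]))


if __name__ == "__main__":
    # Hypothetical log: page "a" is a hub the user keeps returning to.
    log = ["a", "b", "a", "c", "b", "a"]
    print(recurrence_rate(log))
    print(recency_ordered(log, prune="last"))
    print(frequency_ordered(log))
```

With only a handful of entries the frequency ordering is decided mostly by the tie-break rule, which is the "when few items are on the list, this suffers" point from the slides.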