Search, Text Mining, Web Site Usability
  Marti Hearst, SIMS
  UCB CS Research Fair

BAILANDO Projects
  Better Access to Information using Language Analysis and Novel Dynamic Organizations

Current BAILANDO Projects
  CHA-CHA & FLAMENCO: Better Search Interfaces (UI support for search)
  LINDI: Text Data Mining
  TANGO: Automated Web Site Usability

Search UIs
  Combine Browsing & Search
  Place Search Results in Context
  Large Category Hierarchies

Cha-Cha
  Students: Mike Chen, Jamie Laflen, Jason Hong, Jimmy Lin, Shiang Chen

Medical Category Hierarchy
  Medicine
    Disease: Migraine, MS
    Anatomy: Carotid Artery, Spinal Cord
    Drugs: Tamoxifen, Steroids

DynaCat (Pratt, Hearst, & Fagan 99)

DynaCat Study
  Design
    Three queries
    24 cancer patients
    Compared three interfaces: ranked list, clusters, categories
  Results
    Participants strongly preferred categories
    Participants found more answers using categories
    Participants took the same amount of time with all three interfaces
  Similar results were found in another study by Chen and Dumais (CHI 2000)

Cat-a-Cone Interface (Hearst & Karadi 97)

FLAMENCO: Improving Search via Large Category Hierarchies
  How to show intersections across category types?
  How to preview related categories in a user-tailored, dynamic manner?

Text Data Mining
  Relationships between information in documents can create new facts, not previously known.

Imagine
  You are a medical researcher
  Your patient has
    spinal inflammation
    numbness in fingers
    low TC levels
    negative results for all tests
  How can you help her?

Idea
  A new way of searching text.
  Link pieces of information together to formulate hypotheses …

LINDI
  Linking Information for New DIscoveries
  Three main parts
    Search UI for building and reusing hypothesis-seeking strategies.
    Statistical language analysis techniques for interpreting the text.
    Backend for interfacing with various databases and translating different formats.

Gathering Evidence
  Spinal Inflammation
  Numbness in fingers
  Low TC Levels

Supporting Cascaded Search Operations
  Find diseases associated with each
    Spinal Inflammation
    Numbness in fingers
    Low TC Levels

New Language Analysis
  First use category labels to retrieve candidate documents
  Then use language analysis to detect causal relationships between concepts
    Title: Magnesium deficiency implicated in increased stress levels.
    Interpretation: <nutrient><reduction> related-to <increase><symptom>
  Use these to find relationships and formulate hypotheses

Statistical Semantic Parsing
  Modern statistical techniques
    Mainly applied to syntactic structure
  Probabilistic knowledge representation
    Represent hypotheses with different degrees of certainty.

Automating Assessment of Web Site Usability

Why Worry?
  Problem: IBM's extranet
    Heavy use of help and search
    Unhappy users
  Solution
    Massive web site redesign
    Focus on info-organization, not the purchasing process.
    Cost: "in the millions"
  Results
    Not announced or trumped up
    Use of "help" decreased 84%
    Sales increased 400%

Web TANGO
  Tool for Assessing NaviGation & Organization
  Goal: automated support for comparing design alternatives
  How:
    Assess usability of the information architecture
    Approximate people’s information-seeking behavior (Monte Carlo simulation)
    Output quantitative usability metrics

Guidelines
  There are many usability guidelines
    A survey of 21 sets of web guidelines found little overlap (Ratner et al. 96)
  Why? Our hypothesis: not empirically validated
  So … let’s figure out what works!
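
The Monte Carlo idea on the Web TANGO slide can be illustrated with a rough sketch. This is not the Web TANGO simulator: the site graphs, the page names, the uniform-random choice of links, and the function `simulate_sessions` are all assumptions made for illustration. It simulates many visitors clicking through a link graph and reports two simple quantitative metrics, the share of visitors who reach a target page and the mean number of clicks they needed.

```python
import random

def simulate_sessions(site, start, target, trials=2000, max_clicks=10, seed=0):
    """Estimate how often a simulated visitor reaches `target` from `start`.

    Assumption: the visitor picks an outgoing link uniformly at random,
    a crude stand-in for real information-seeking behavior.
    """
    rng = random.Random(seed)
    successes = 0
    total_clicks = 0
    for _ in range(trials):
        page = start
        for clicks in range(1, max_clicks + 1):
            links = site.get(page, [])
            if not links:
                break  # dead end: the visitor gives up
            page = rng.choice(links)
            if page == target:
                successes += 1
                total_clicks += clicks
                break
    success_rate = successes / trials
    mean_clicks = total_clicks / successes if successes else float("inf")
    return success_rate, mean_clicks

# Two hypothetical information architectures for the same content.
deep_site = {
    "home": ["products", "about"],
    "products": ["hardware", "home"],
    "hardware": ["support", "products"],
    "about": ["home"],
}
flat_site = {
    "home": ["products", "about", "support"],
    "products": ["home"],
    "about": ["home"],
}

deep = simulate_sessions(deep_site, "home", "support")
flat = simulate_sessions(flat_site, "home", "support")
print("deep: success rate %.2f, mean clicks %.2f" % deep)
print("flat: success rate %.2f, mean clicks %.2f" % flat)
```

Comparing the two result pairs is the kind of side-by-side check of design alternatives the slide describes; a more realistic model would weight link choices by how well the link text matches the visitor's goal.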

An Empirical Study
  Which features distinguish well-designed web pages?

Methodology
  Data collection
    1108 pages
    163 sites
    3 levels per site
  14 metrics
    About 85% accurate
    Text cluster and text positioning counts less accurate

Metrics

Preliminary Results
  Linear regression to predict Webby judges' ratings
    Top 30% vs bottom 30%
  Prediction accuracy:
    72% if categories not taken into account
    83% if categories assessed separately

Goals
  Create empirical foundations for what is still guesswork
  Next step: A free online tool
  Long term goal: A Monte Carlo simulator for comparing potential designs

For More Information
  http://webtango.berkeley.edu
  [email protected]
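
As a rough illustration of the analysis on the Preliminary Results slide, the sketch below fits a least-squares line from one page metric to a rating and checks how well it separates the top 30% of pages from the bottom 30%. It is a simplification under stated assumptions: the study used 14 metrics, while this uses a single hypothetical one; the data are made up; and the helper names `fit_line` and `top_bottom_split` are illustrative, not taken from the project.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def top_bottom_split(ratings, fraction=0.3):
    """Return (bottom, top) index lists for the lowest- and highest-rated pages."""
    order = sorted(range(len(ratings)), key=lambda i: ratings[i])
    k = max(1, round(len(ratings) * fraction))
    return order[:k], order[-k:]

# Made-up toy data, NOT from the study: one metric value and one rating per page.
metric = [2, 4, 5, 7, 8, 10, 11, 13, 14, 16]
rating = [3.1, 2.8, 4.0, 4.4, 5.2, 4.9, 6.1, 6.0, 7.2, 6.9]

slope, intercept = fit_line(metric, rating)
threshold = sum(rating) / len(rating)  # assumed cut-off: the mean rating
bottom, top = top_bottom_split(rating)

def predicted_top(i):
    """True if the fitted line places page i above the cut-off."""
    return slope * metric[i] + intercept >= threshold

# Count top pages predicted high plus bottom pages predicted low.
correct = sum(predicted_top(i) for i in top) + sum(not predicted_top(i) for i in bottom)
accuracy = correct / (len(top) + len(bottom))
print("accuracy on top vs bottom 30%%: %.2f" % accuracy)
```

The sketch fits and scores on the same pages, so it overstates accuracy; an honest estimate would score pages held out from the fit.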