Individualized Knowledge Access
David Karger, Lynn Andrea Stein

Web Search Tools
  - Indices
      - search by keyword
  - Taxonomies

A lot like libraries...
  - Library catalogues
  - Dewey Digital
      - classify by subject
  - Cool site of the day
  - New book shelf, suggested reading
  - Is a universal library enough?

Library/Web Limitations
  - Huge
      - too many answers, mostly irrelevant
  - Only published material
      - miss info known to few, leading-edge content
  - Rigid: all get same search results
      - even if come back and try again
  - The library is the last place we look

Bookshelves First
  - My data: information gathered personally
      - high quality, easy for me to understand
      - not limited to publicly available content
      - annotations
  - My organization: choose own subject arrangement
      - optimize for my kind of searching
  - Adapts to my needs

Then a Friend
  - Leverage: they organize information for their access
      - so quickly find things for me
  - Personal expertise
      - they know things not in any library
  - Trust
      - their recommendations are good
  - Shared vocabulary
      - they know me and what I want

Last the Library
  - Answer usually there but hard to find
      - would be nice to rearrange to my needs
  - For hardest problems, need librarian
      - they have broad knowledge of library
      - but not as deep as an expert on question

Lessons
  - Individualized access: The best tools adapt to individual ways of organizing and seeking data.
  - Individualized knowledge: People know much more than they publish. That knowledge is useful.

Haystack: a Tool for Oxygen
  - Independent but interacting repositories that adapt to their individual users
  - Individualize access
      - My data collection, organization
      - My search tools, with answers for me
  - Leverage individual knowledge
      - Collaborative retrieval with others
      - Motivate people to organize their data for their own benefit and thus for others’

Example
  - Have probabilistic models been used in data mining?
  - My haystack doesn’t know, but “probability” is in lots of mail I got from Tommi Jaakkola
  - Tommi told his haystack that “Bayesian” refers to “probability models”
  - Tommi has read several papers on Bayesian methods in data mining
  - His haystack suggests them to mine

Research Threads
  - Heterogeneous data and metadata
      - archive whatever user wants
  - Human-Computer Interaction
      - let user express/use own organizational rules
      - observe user to detect unexpressed knowledge
  - Machine learning
      - use gathered data to improve performance
  - Collaborative filtering
      - use others’ decisions to help me

My Data
  - Haystack archives anything
      - web pages browsed, email sent and received, documents written, scanned images, home directory, people known, projects worked on
  - And any properties, relationships
      - text of object (if know how)
      - author, title, color, citations, quotations, annotations, quality, last usage
  - Users freely add types, relationships

Gathering My Data
  - Active user input
      - interfaces let user add data, note relationships
  - Mining data from haystack
      - plug-in services opportunistically extract data
      - e.g., find author/title/text in MSWord document
      - or, detect that one document quotes another
  - Observing user
      - plug-ins to other interfaces report user actions
      - web pages browsed, mail sent, queries made

Adaptation
  - Remember user’s attempts to tune a query
      - instead of first query attempt, use last one
      - record items user picked as good matches
      - future similar queries do better right away
  - Stored content shows what user knows/likes
      - modify queries to big search engines
      - filter results coming back
      - personalized “cool site of the day”

Collaborative Access
  - Leverage others’ work organizing data
      - no need to “publish” expertise
      - exposed automatically
      - self interest helps others
  - Privacy/permission concerns
      - allowing exposure easier than publishing
      - much public info: mailing lists, papers read
  - Whose opinions matter?
      - people I mail, w/shared data, referrals
      - collaborative filtering techniques

Conclusion
  - Libraries are not enough
  - Haystack teases out individual knowledge
  - Individualizes information access for user
  - Exposes individual knowledge to benefit community
  - Current status: individual-user prototype. Some data extraction, observation, adapting. Collaborative version in future.
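
Sketch of the Example slide
  The collaborative lookup from the Example slide (my haystack has no answer, so Tommi's haystack suggests papers) can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the Haystack implementation: the `Haystack` class, its fields, the keyword-matching rule, and the document titles are all hypothetical, and the choice of which friends to consult (for instance, people whose mail mentions "probability") is assumed to have been made already.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Haystack:
    """Toy personal repository (hypothetical structure, not the real Haystack)."""

    owner: str
    # Document title -> keywords attached to that document.
    documents: dict[str, set[str]] = field(default_factory=dict)
    # Query term -> terms the owner has said refer to it.
    related: dict[str, set[str]] = field(default_factory=dict)
    # Other haystacks this one may consult (assumed already chosen and permitted).
    friends: list[Haystack] = field(default_factory=list)

    def variants(self, term: str) -> set[str]:
        """The term itself plus whatever this owner has linked to it."""
        return {term} | self.related.get(term, set())

    def search_local(self, terms: set[str]) -> list[str]:
        """Titles where every query term, or one of its variants, is a keyword."""
        return sorted(
            title
            for title, keywords in self.documents.items()
            if all(keywords & self.variants(term) for term in terms)
        )

    def search(self, terms: set[str]) -> list[str]:
        """Answer from my own data first; otherwise collect friends' suggestions."""
        hits = self.search_local(terms)
        if hits:
            return hits
        suggestions: set[str] = set()
        for friend in self.friends:
            suggestions.update(friend.search_local(terms))
        return sorted(suggestions)


# Placeholder data mirroring the slide; the titles are invented for illustration.
tommi = Haystack(
    owner="Tommi",
    documents={
        "placeholder paper A": {"bayesian", "data mining"},
        "placeholder paper B": {"bayesian", "vision"},
    },
    related={"probability models": {"bayesian"}},
)
mine = Haystack(
    owner="me",
    documents={"mail from Tommi": {"probability"}},
    friends=[tommi],
)

query = {"probability models", "data mining"}
print(mine.search_local(query))  # nothing in my own haystack matches
print(mine.search(query))        # suggestions come from Tommi's haystack
```

  The point of the design is that Tommi never publishes anything: the link between "Bayesian" and "probability models" was recorded for his own benefit, and the query against his haystack reuses it.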
