Individualized Knowledge Access
David Karger
Lynn Andrea Stein

Web Search Tools
  Indices
    search by keyword
  Taxonomies
    classify by subject
  Cool site of the day

A lot like libraries...
  Library catalogues
  Dewey Decimal
  New book shelf, suggested reading

Is a universal library enough?

Library/Web Limitations
  Huge
    too many answers, mostly irrelevant
  Only published material
    miss info known to few, leading-edge content
  Rigid: all get same search results
    even if come back and try again
  The library is the last place we look

Bookshelves First
  My data: information gathered personally
    high quality, easy for me to understand
    not limited to publicly available content
    annotations
  My organization: choose own subject arrangement
    optimize for my kind of searching
  Adapts to my needs

Then a Friend
  Leverage
    they organize information for their access
    so quickly find things for me
  Personal expertise
    they know things not in any library
  Trust
    their recommendations are good
  Shared vocabulary
    they know me and what I want

Last the Library
  Answer usually there but hard to find
    would be nice to rearrange to my needs
  For hardest problems, need librarian
    they have broad knowledge of library
    but not as deep as an expert on question

Lessons
  Individualized access: The best tools adapt to individual ways of organizing and seeking data.
  Individualized knowledge: People know much more than they publish. That knowledge is useful.

Haystack: a Tool for Oxygen
  Independent but interacting repositories that adapt to their individual users
  Individualize access
    My data collection, organization
    My search tools, with answers for me
  Leverage individual knowledge
    Collaborative retrieval with others
    Motivate people to organize their data for their own benefit and thus for others’

Example
  Have probabilistic models been used in data mining?
  My haystack doesn’t know, but “probability” is in lots of mail I got from Tommi Jaakkola
  Tommi told his haystack that “Bayesian” refers to “probability models”
  Tommi has read several papers on Bayesian methods in data mining
  His haystack suggests them to mine

Research Threads
  Heterogeneous data and metadata
    archive whatever user wants
  Human-Computer Interaction
    let user express/use own organizational rules
    observe user to detect unexpressed knowledge
  Machine learning
    use gathered data to improve performance
  Collaborative filtering
    use others’ decisions to help me

My data
  Haystack archives anything
    web pages browsed, email sent and received, documents written, scanned images, home directory, people known, projects worked on
  And any properties, relationships
    text of object (if know how)
    author, title, color, citations, quotations, annotations, quality, last usage
  Users freely add types, relationships

Gathering My Data
  Active user input
    interfaces let user add data, note relationships
  Mining data from haystack
    plug-in services opportunistically extract data
    e.g., find author/title/text in MSWord document
    or, detect that one document quotes another
  Observing user
    plug-ins to other interfaces report user actions
    web pages browsed, mail sent, queries made

Adaptation
  Remember user’s attempts to tune a query
    instead of first query attempt, use last one
    record items user picked as good matches
    future similar queries do better right away
  Stored content shows what user knows/likes
    modify queries to big search engines
    filter results coming back
    personalized “cool site of the day”

Collaborative Access
  Leverage others’ work organizing data
    no need to “publish” expertise
    exposed automatically
    self interest helps others
  Privacy/permission concerns
    allowing exposure easier than publishing
    much public info: mailing lists, papers read
  Whose opinions matter?
    people I mail, w/shared data, referrals
    collaborative filtering techniques

Conclusion
  Libraries are not enough
  Haystack teases out individual knowledge
  Individualizes information access for user
  Exposes individual knowledge to benefit community
  Current status: individual-user prototype. Some data extraction, observation, adapting. Collaborative version in future.
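
Sketch: the Example slide as code

The Example slide (my haystack cannot answer, so it turns to Tommi's) can be walked through with a small Python sketch. This is an illustration only, not Haystack's actual design: the `Haystack` class, its fields (`documents`, `refers_to`, `mail_keywords`, `peers`), and the document titles are all hypothetical, and "probability models" is shortened to the single keyword "probability". Each haystack holds its owner's documents and vocabulary, answers from local data first, and otherwise consults peers whose mail overlaps the query.

```python
from dataclasses import dataclass, field


@dataclass
class Haystack:
    """Toy personal repository (hypothetical structure, for illustration)."""
    owner: str
    # title -> keywords the owner attached to that document
    documents: dict = field(default_factory=dict)
    # vocabulary the owner taught the haystack: term -> concept it refers to
    refers_to: dict = field(default_factory=dict)
    # sender name -> keywords seen in mail received from that sender
    mail_keywords: dict = field(default_factory=dict)
    # sender name -> haystack of a person who has allowed access
    peers: dict = field(default_factory=dict)

    def concepts(self, keywords):
        """Keywords plus the concepts the owner says they refer to."""
        extra = {self.refers_to[k] for k in keywords if k in self.refers_to}
        return set(keywords) | extra

    def search_local(self, query):
        """Titles whose keywords, after vocabulary expansion, cover the query."""
        return sorted(
            title
            for title, keywords in self.documents.items()
            if query <= self.concepts(keywords)
        )

    def search(self, query):
        """Answer locally if possible, else ask peers whose mail overlaps the query."""
        hits = self.search_local(query)
        if hits:
            return [(self.owner, title) for title in hits]
        results = []
        for sender, keywords in sorted(self.mail_keywords.items()):
            peer = self.peers.get(sender)
            if peer is not None and query & set(keywords):
                results.extend((sender, t) for t in peer.search_local(query))
        return results


# Tommi taught his haystack that "bayesian" refers to "probability".
tommi = Haystack(
    owner="Tommi",
    documents={
        "placeholder paper A": {"bayesian", "data mining"},
        "placeholder paper B": {"optimization"},
    },
    refers_to={"bayesian": "probability"},
)

# My haystack has nothing relevant, but mail from Tommi mentions "probability".
mine = Haystack(
    owner="me",
    documents={"Notes on web search": {"search", "indices"}},
    mail_keywords={"Tommi": {"probability", "seminar"}},
    peers={"Tommi": tommi},
)

print(mine.search({"probability", "data mining"}))
```

Here the query finds nothing locally, so it is forwarded to Tommi's haystack, where his own vocabulary lets a document tagged "bayesian" match a query about "probability". Nothing had to be published explicitly: Tommi organized his data for his own benefit, and that organization is what helps me.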