Individualized Knowledge Access
David Karger, Lynn Andrea Stein, Mark Ackerman, Ralph Swick

Information Access
  A key task in Oxygen: help people manage and retrieve information
  Three overlapping projects:
    Haystack: information storage and retrieval application clients
    Semantic Web: next-generation metadata
    Volt: collaborative access

Presentation Overview
  Motivation
    Information access behavior and goals
  System Design & Architecture
    Data Model
    Interacting data and UI components
  Working applications
    Base haystack
    Frontpage
    Volt

Motivation

Problem Scenario
  I try solving problems using my data:
    Information gathered personally
    High quality, easy for me to understand
    Not limited to publicly available content
  My organization:
    Personal annotations and meta-data
    Choose own subject arrangement
    Optimize for my kind of searching
    Adapts to my needs

Then Turn to a Friend
  Leverage
    They organize information for their own use
    Let them find things for me too
  Shared vocabulary
    They know me and what I want
  Personal expertise
    They know things not in any library
  Trust
    Their recommendations are good

Last to Library/web
  Answer usually there
  But hard to find
  Wish: rearrange to suit my needs
  Wish: help from my friends in looking

Lessons
  Individualized access
    Best tools adapt to individual ways of organizing and seeking data
  Individualized knowledge
    People know more than they publish
    That knowledge is useful to them and others
  Collaborative use
    Right incentives lead to sharing and joint use

Haystack
  Individualized access
    My data collection, organization
    Search tools tuned for me
  Collaborate to leverage individual knowledge
    Access unpublished information in others’ haystacks
    Self-interest leads to public benefit
  Lens to personalize access to the world library
    Rearrange presentation to suit my personal needs

Example
  Info on probabilistic models in data mining
  My haystack doesn’t know, but “probability” is in lots of email I got from Tommi Jaakkola
  Tommi told his haystack that “Bayesian” refers to “probability models”
  Tommi has read several papers on Bayesian methods in data mining
  Some are by Daphne Koller
  I read/liked other work by Koller
  My Haystack queries “Daphne Koller Bayes” on Yahoo
  Tommi’s haystack can rank the results for me…

System Design

Gathering Data
  Haystack archives anything
    Web pages browsed, email sent and received, address book, documents written
  And any properties, relationships
    Text of object (for text search)
    Author, title, color, citations, quotations, annotations, quality, last usage
  Users freely add types, relationships

Semantic Web
  Arbitrary objects, connected by named links
  No fixed schema
  User extensible
  Sharable by any application
  A new “file system”?
  [Figure: example graph; node labels: HTML, Doc, Haystack, D. Karger, Outstanding]

Gathering Data
  Active user input
    Interfaces let user add data, note relationships
  Mining data from prior data
    Plug-in services opportunistically extract data
  Passive observation of user
    Plug-ins to other interfaces record user actions
  [Figure: architecture diagram; components: Other Users, Data Extraction Services, Machine Learning Services, Spider, Triple Store, Web Observer Proxy, Mail Observer Proxy, Volt, Viewer/Editor, Web Viewer]

Sample Applications
  Because everything uses the Semantic Web constructions, a variety of application clients can share information
    Web Browser: data viewer
    FrontPage: personalized information filter
    Volt: collaboration tool

Haystack via Web
  Web server interface
  Basic operations:
    Insert objects
    View objects
    Queries

Haystack via Web
  Viewer shows one node and associated arrows
  Service notices we’ve archived a directory, so archives the objects it contains (and so on…)

Haystack via Web
  Services detect document type, extract relevant metadata
  Output can specialize by type of object

Mediation
  Haystack can be a lens for viewing data from the rest of the world
  Stored content shows what user knows/likes
    Selectively spider “good” sites
  Filter results coming back
    Compare to objects user has liked in the past
  Can learn over time
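
The Mediation step of filtering incoming results by comparing them to objects the user has liked can be sketched roughly as follows. The slides do not say how Haystack does this comparison, so this is only an illustration: it assumes plain bag-of-words cosine similarity as a stand-in, and the names `term_counts`, `cosine`, and `rank_results` are hypothetical, not part of Haystack.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercased word counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_results(results, liked_texts):
    """Order incoming results by similarity to the user's liked objects.

    Assumption: the user's "profile" is simply the summed word counts of
    everything they have liked; a real system could use richer metadata.
    """
    profile = Counter()
    for text in liked_texts:
        profile.update(term_counts(text))
    scored = [(cosine(term_counts(r), profile), r) for r in results]
    # Sort on the score only, highest first; ties keep their input order.
    return [r for _, r in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Hypothetical data, loosely following the "Example" slide.
liked = [
    "Bayesian networks for data mining",
    "probability models in machine learning",
]
incoming = [
    "stock market news today",
    "Bayesian probability models for mining text",
]
print(rank_results(incoming, liked))
```

With this sample data the Bayesian item is ranked ahead of the stock-market item, because it shares terms with the liked objects and the other result shares none. "Can learn over time" would correspond to adding newly liked objects to `liked_texts` before the next ranking.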
Example: personalized news service

News Service
  Scavenges articles from your favorite news sources
    HTML parsing/extracting services
  Over time, learns types of articles that interest you
    Prioritizes those for display
  Content provider no longer controls viewing experience
    No more ads

Personalized News Service

Collaborative Access
  Want to leverage others’ work in organizing information
  No need to “publish” expertise
    Exposed automatically, without effort
  Self interest helps others

Volt
  Volt is about collaboration between people
  The Haystack architecture allows easy collaboration among individuals
    Semantic Web references to Haystack objects
  Individuals share parts of their Haystack
  Group spaces and shared notebooks

Volt Collaborators
  Those I interact with
    Frequent mail contact
    Frequent visits to their home page
  Those with shared content
    And who have same opinions about content
    Collaborative filtering techniques
  Referrals
    Expertise search engine
    Expertise Beacon

Volt Expertise Beacons
  Group spaces and shared notebooks
  Create individual and group profiles
  Profiles can be used to find other people
  Allows targeted search
    “Who else is working on this project?”
  User controls visibility/privacy

Summary
  Next generation information access
  Semantic Web provides a language and capabilities for meta-data
  Haystack teases out individual knowledge, stores it in a coherent fashion, and allows a variety of application clients to leverage individual meta-data
  Volt turns individual knowledge into a community resource

More Info
  http://haystack.lcs.mit.edu/
  http://www.w3c.org/2001/sw