The Web of Data and LarKC’s role in it
Frank van Harmelen
Vrije Universiteit Amsterdam
Creative Commons License: allowed to share & remix, but must attribute & non-commercial

The Current Information Universe
Linked web-pages, written by people, written for people, used by people...
Many of these pages already come from data, but we can’t link the data....

The Future Information Universe
Linked data, usable by computers! Useful for people!

[Figure: web pages labelled “a web page in English about Frank”, “and another web page about Frank”, “And this page is about Stefano”, “And this page is about LarKC”, “This page is about the Vrije Universiteit”; some pages marked “?”]

How far away is this?
Not very far away!
Rapidly growing Linked Open Data cloud.
Already many billions of facts & rules.
It gets bigger every month.

Full Web-style decoupling: re-usability, independence
• All identifiers are URLs (= on the Web)
– Allows total decoupling of
  • data
  • vocabulary
  • meta-data
[Figure: the triple [<x> IsOfType <T>], with x, T and <person> held by different owners & locations]

For the first time ever, it is now possible to re-use somebody else's knowledge base
• without having to talk to them first (syntax, semantics)
• without having to make copies

Rapid growth: "billion triple challenge" (= machine-reason with a billion facts and rules)
• 2006: “where do we get a billion facts from?”
• 2008: “which billion shall we choose!”

What to do when success is becoming a problem?

The Large Knowledge Collider
a platform for infinitely scalable reasoning on the data-web

Infinite scalability?
parallelisation
• cluster computing
distribution
• “Thinking@home”, “self-computing semantic Web”
approximation
• “almost” is often good enough
• gets better with more resources

First result: MaRVIN
[Figure: MaRVIN architecture: Data Preparation, several Nodes (each with InputPool, Reasoning, Routing, OutputPool), statistics & visualisation, Result Storage]
MaRVIN scales by:
• distribution (over many nodes)
• approximation (sound but incomplete)
• anytime convergence (more complete over time)

Use case: Drug Discovery
Problem: pharmaceutical R&D in early clinical development is stagnating
FDA white paper Innovation or Stagnation (March 2004):
• “developers have no choice but to use the tools of the last century to assess this century's candidate solutions.”
• “industry scientists often lack cross-cutting information about an entire product area, or information about techniques that may be used in areas other than theirs”

“Show me any potential liver toxicity associated with the compound’s drug class, target, structure and disease.” (Q1 Q2 Q3)
Sub-queries Q1, Q2, Q3, by domain:
• Genetics: “Show me all liver toxicity associated with the target or the pathway.”
• Chemistry: “Show me all liver toxicity associated with compounds with similar structure”
• Literature: “Show me all liver toxicity from the public literature and internal reports that are related to the drug class, disease and patient population”

Current NCBI: linking but no inference

Use Case: City on-line
• Our cities face many challenges
• Urban Computing is the ICT way to address them: improve the quality of life
• Where is the traffic moving
• Is public transportation where people are
• Which location attracts most people right now
• Is public transportation where people will be
[Figure: city maps labelled “Is public transportation where the people are?”, “Which landmarks attract more people?”, “Where are people concentrating?”, “Where is traffic moving?”]

Is anybody doing this for real?
• OpenCalais:
– enrich text (news items) with semantic meta-data
– recognise people, places, events, organisations,...
– useful for searching, selecting, personalising, aggregating, summarising, etc
• From early ’09:
– identify “people, places, events, organisations,...” by linking to the Open Data cloud
[Figure: pages labelled “And this page is about LarKC”, “And this page is about Stefano”]

Summarising
The Information Universe of the Future will be a Web of Data
• This Web of Data is rapidly taking shape
• There are compelling use-cases
• Industrial take-up is beginning to happen
• We are building new infrastructure to deal with required scale

Contact Info
Want to ask questions?
Want to play with LarKC?
Want to contribute plugins?
Want to run a use-case?
[email protected]
http://www.larkc.eu
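
Illustration (not from the slides): the decoupling of data, vocabulary and meta-data described in the talk can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the `example.org`, `example.com` and `example.net` URLs, the `Triple` type and the `instances_of` helper are hypothetical, and the standard `rdf:type` property stands in for the slide's "IsOfType". It shows three lists of facts, imagined as published by three different owners, being combined without copying or prior agreement, because every identifier is a URL.

```python
from typing import NamedTuple

# rdf:type is the standard RDF property; here it plays the role of "IsOfType".
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"

# Hypothetical identifiers, assumed for illustration only.
FRANK = "http://example.org/people/frank"
PERSON = "http://example.com/vocab#Person"
ASSERTED_BY = "http://example.net/meta#assertedBy"


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Data, published by one (hypothetical) owner.
data = [Triple(FRANK, RDF_TYPE, PERSON)]

# Vocabulary, published by a different owner at a different location.
vocabulary = [Triple(PERSON, RDF_TYPE, RDFS_CLASS)]

# Meta-data about the data, published by a third party.
metadata = [Triple(FRANK, ASSERTED_BY, "http://example.net/sources/crawl-1")]


def instances_of(triples, type_url):
    """Return the sorted subjects stated to be of the given type."""
    return sorted(
        {t.subject for t in triples if t.predicate == RDF_TYPE and t.obj == type_url}
    )


# Re-use is just a union of facts: shared URLs make the three sources line up.
merged = data + vocabulary + metadata
people = instances_of(merged, PERSON)
print(people)
```

The three lists never reference each other's storage, only each other's URLs, which is the sense in which the owners and locations stay independent.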