Semantics and the Grid or “What does it all mean?”

Slides taken from
http://www.iceage-eu.org/issgc07/sessionDescription.cfm?id=78
http://www.semanticgrid.org/presentations/DeRoureSemanticGrid Web2Stockholm.ppt

Overview
• The Semantic Web
  – Introduction
  – What is the Semantic Web
    • Annotation, Integration, Inference
  – Semantic Web Technologies
    • RDF, RDF Schema and OWL
  – Summary
• Semantic Grid
  – Motivation
  – Examples
    • Service Descriptions, Data Management, Provenance
  – Putting it together
  – The Future

What is the Semantic Web?
• An extension of the current Web…
  – … where information and services are given well-defined and explicitly represented meaning, …
  – … so that it can be shared and used by humans and machines, …
  – … better enabling them to work in cooperation
• How?
  – Promoting information exchange by tagging web content with machine-processable descriptions of its meaning.
  – And technologies and infrastructure to do this

The Semantic Web Vision
• The Web was made possible through established standards
  – TCP/IP for transporting bits down a wire
  – HTTP & HTML for transporting and rendering hyperlinked text
• Applications able to exploit this common infrastructure
  – Result is the WWW as we know it
• Generations
  – 1st generation web mostly handwritten HTML pages
  – 2nd generation (current) web often machine generated/active
    • Both intended for direct human processing/interaction
  – In the next generation web, resources should be more accessible to automated processes
    • To be achieved via semantic markup
    • Metadata annotations that describe content/function

The Syntactic Web / The Semantic Web
[Figure: The Syntactic Web and The Semantic Web]

Where we are Today: the Syntactic Web
[Figure: Resources linked to one another by href links]
• A place where computers do the presentation (easy) and people do the linking and interpreting (hard).
• Why not get computers to do more of the hard work?

What’s the Problem?
• Typical web page markup consists of:
  – Rendering information (e.g., font size and colour)
  – Hyper-links to related content
• Semantic content is accessible to humans but not (easily) to computers…

Information We Can See
CS599 Introduction to Grid Computing
Contents
1 CS599 Introduction to Grid Computing, Fall 2007
1.1 Prerequisites
…
CS599 Introduction to Grid Computing, Fall 2007
….
Prerequisites
Graduate courses in Operating Systems (CS 555) and/or Networks (CS 551)
Course Description
This course provides a graduate-level introduction to wide area distributed computing research, focusing on a wide…
…

What the computer sees…
CS599 Introduction to Grid Computing
Contents
1 CS599 Introduction to Grid Computing, Fall 2007
1.1 Prerequisites
…
CS599 Introduction to Grid Computing, Fall 2007
….
Prerequisites
Graduate courses in Operating Systems (CS 555) and/or Networks (CS 551)
Course Description
This course provides a graduate-level introduction to wide area distributed computing research, focusing on a wide…
…

Solution: XML markup with “meaningful” tags?
<title>CS599 Introduction to Grid Computing</title>
<contents>Contents 1 CS599 Introduction to Grid Computing, Fall 2007 1.1 Prerequisites</contents>
<intro>CS599 Introduction to Grid Computing, Fall 2007</intro>
<prereqs>Prerequisites Graduate courses in Operating Systems (CS 555) and/or Networks (CS 551)</prereqs>
<description>Course Description This course provides a graduate-level introduction to wide area distributed</description>

Still the Machine only sees…
<title>CS599 Introduction to Grid Computing</title>
<contents>Contents 1 CS599 Introduction to Grid Computing, Fall 2007 1.1 Prerequisites</contents>
<intro>CS599 Introduction to Grid Computing, Fall 2007</intro>
<prereqs>Prerequisites Graduate courses in Operating Systems (CS 555) and/or Networks (CS 551)</prereqs>
<description>Course Description This course provides a graduate-level introduction to wide area distributed</description>

Need to Add “Semantics”
• External agreement on meaning of annotations
  – E.g., Dublin Core for annotation of library/bibliographic information
    • Agree on the meaning of a set of annotation tags
  – Problems with this approach
    • Inflexible
    • Limited number of things can be expressed
• Use Ontologies to specify meaning of annotations
  – Ontologies provide a vocabulary of terms
  – New terms can be formed by combining existing ones
    • “Conceptual Lego”
  – Meaning (semantics) of such terms is formally specified
  – Can also specify relationships between terms in multiple ontologies

Ontology in Computer Science
• An ontology is an engineering artifact:
  – It is constituted by a specific vocabulary used to describe a certain reality, plus
  – a set of explicit assumptions regarding the intended meaning of the vocabulary.
    • Almost always including concepts and their classification
    • Almost always including properties between concepts
    • Similar to an object oriented model
• Thus, an ontology describes a formal specification of a certain domain:
  – Shared understanding of a domain of interest
  – Formal and machine manipulable model of a domain of interest

Ontology Languages
• Work on Semantic Web has concentrated on the definition of a collection or “stack” of languages.
  – Used to support the representation and use of metadata
  – Basic machinery that we can use to represent the extra semantic information needed for the Semantic Web
[Figure: the language stack XML, RDF, RDFS (RDF and RDFS together: RDF(S)), OWL, annotated with three roles]
• Annotation: associating metadata to resources (bindings)
• Integration: integrating information sources
• Inference: reasoning over the information we have
  – Could be light-weight (taxonomy)
  – Could be heavy-weight (logic-style)

RDF
• RDF stands for Resource Description Framework
• It is a W3C Recommendation
  – http://www.w3.org/RDF
• RDF is a graphical formalism (+ XML syntax + semantics)
  – for representing metadata
  – for describing the semantics of information in a machine-accessible way
• Provides a simple data model based on triples.
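
A minimal sketch of the triple idea, in plain Python rather than any RDF library. The ex: prefix, the TRIPLES list and the match helper are illustrative assumptions, not part of RDF or of the original slides; the statements mirror the Paul/SemClass example used on the following slides.

```python
from typing import Iterator, List, Optional, Tuple

# A statement is a (subject, predicate, object) triple.
Triple = Tuple[str, str, str]

# Hypothetical statements; "ex:" stands in for a full URI prefix.
TRIPLES: List[Triple] = [
    ("ex:Paul", "ex:presents", "ex:SemClass"),
    ("ex:Paul", "ex:hasName", '"Paul Groth"'),      # object is a literal
    ("ex:SemClass", "ex:preparedBy", "ex:Oscar"),
    ("ex:SemClass", "ex:preparedBy", "ex:Paul"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in triples:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


# Who prepared SemClass? Follow the preparedBy edges out of ex:SemClass.
preparers = sorted(
    obj for _, _, obj in match(TRIPLES, s="ex:SemClass", p="ex:preparedBy")
)
print(preparers)
```

Because every statement has the same three-part shape, one generic pattern matcher is enough to walk the whole graph, which is what makes low-level integration of data from different sources straightforward.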
The RDF Data Model
• Statements are <subject, predicate, object> triples:
  – <Paul, presents, SemClass>
• Can be represented as a graph:
  Paul --presents--> SemClass
• Statements describe properties of resources
• A resource is any object that can be pointed to by a URI
  – URIs: the generic set of all names/addresses that are short strings that refer to resources
  – a document, a picture, a paragraph on the Web, http://users.ecs.soton.ac.uk/pg03r, a book in the library, a real person, isbn://0141184280
• Properties themselves are also resources (URIs)

Linking Statements
• The subject of one statement can be the object of another
• Such collections of statements form a directed, labeled graph
[Figure: graph of the following statements]
  <Paul, hasName, “Paul Groth”>
  <Paul, presents, SemClass>
  <SemClass, preparedBy, Oscar>
  <SemClass, preparedBy, Paul>
  <SemClass, hasHomePage, http://vtcpc.isi.edu/CS599_GridComputing/>
• The object of a triple can also be a “literal” (a string)

RDF Syntax
• RDF has an XML syntax that has a specific meaning:
• Every Description element describes a resource
• Every attribute or nested element inside a Description is a property of that Resource
• We can refer to resources by URIs

<rdf:Description rdf:about="some.uri/person#paul">
  <o:presents rdf:resource="some.uri/session#SemClass"/>
  <o:hasName rdf:datatype="&xsd;string">Paul Groth</o:hasName>
</rdf:Description>
<rdf:Description rdf:about="some.uri/session#SemClass">
  <o:hasHomePage>http://vtcpc.isi.edu/CS599_GridComputing/</o:hasHomePage>
  <o:preparedBy rdf:resource="some.uri/person#oscar"/>
  <o:preparedBy rdf:resource="some.uri/person#paul"/>
</rdf:Description>

What does RDF give us?
• Single (simple) data model.
• Syntactic consistency between names (URIs).
• A mechanism for annotating data and resources.
• Low level integration of data.
[Figure: the language stack (XML, RDF, RDFS, OWL) with the roles Annotation, Integration, Inference]

What doesn’t RDF give us?
• RDF does not give any special meaning to vocabulary
  – Such as subClassOf or type (supporting OO-style modelling)
• So, what’s the difference between this graph...
  <Paul, hasName, “Paul Groth”>
  <Paul, presents, SemClass>
  <SemClass, preparedBy, Paul>
• ... and this one?
  <Paul, isAlsoKnownAs, “Paul Groth”>
  <Paul, talksIn, SemClass>
  <SemClass, presentedBy, Paul>

RDFS: RDF Schema
• RDF Schema is another W3C Recommendation
  – http://www.w3.org/TR/rdf-schema/
• It extends RDF with a schema vocabulary that allows you to define basic vocabulary terms and the relations between those terms
  – Class, type, subClassOf,
  – Property, subPropertyOf, range, domain
  – it gives “extra meaning” to particular RDF predicates and resources
  – this “extra meaning”, or semantics, specifies how a term should be interpreted
• The combination of RDF and RDF Schema is normally known as RDF(S)

Example
[Figure: example schema]
  <Event, eventDate, xsd:date>
  <Event, involves, Person>
  <Personal_Event, subClassOf, Event>
  <Local_Event, subClassOf, Event>
  <Regional_Event, subClassOf, Event>
  <Professor, subClassOf, Person>
  <Researcher, subClassOf, Person>

RDF(S) Inference
[Figure: class-level inference]
  <Person, rdf:type, rdfs:Class>
  <Academic, rdf:type, rdfs:Class>
  <Professor, rdf:type, rdfs:Class>
  <Academic, rdfs:subClassOf, Person>
  <Professor, rdfs:subClassOf, Academic>
  entails <Professor, rdfs:subClassOf, Person>

RDF(S) Inference
[Figure: instance-level inference]
  <Academic, rdf:type, rdfs:Class>
  <Professor, rdf:type, rdfs:Class>
  <Professor, rdfs:subClassOf, Academic>
  <Ewa, rdf:type, Professor>
  entails <Ewa, rdf:type, Academic>

What does RDFS provide?
• Ability to use simple schema/vocabularies to describe our resources
• Consistent vocabulary use and sharing
• Simple inference
• Query mechanisms: SPARQL, SeRQL, RDQL, …
  – SELECT N FROM {N} rdf:type {sti:Event} USING NAMESPACE sti=<http://www.ontogrid.net/StickyNote#>

What is RDFS lacking?
• RDFS is too weak to describe resources in sufficient detail
  – No localised range and domain constraints
    • Can’t say that the range of hasEducationalMaterial is Slides when applied to TheoreticalSession and Code when applied to HandsonSession
      – TheoreticalSession hasEducationalMaterial Slides
      – HandsonSession hasEducationalMaterial Code
  – No existence/cardinality constraints
    • Can’t say:
      – Sessions must have some EducationalMaterial
      – Sessions have at least one Presenter
  – No transitive, inverse or symmetrical properties
    • Can’t say that presents is the inverse property of isPresentedBy

The OWL Family Tree
[Figure: OWL’s lineage. DAML-ONT (DAML) and OIL (OntoKnowledge+Others), drawing on RDF/RDF(S), Frames and Description Logics, were combined by the Joint EU/US Committee into DAML+OIL, which led to OWL (W3C).]

OWL
• W3C Recommendation (February 2004)
• A family of Languages
  – OWL Full
  – OWL DL
  – OWL Lite
• Formal semantics
  – Description Logics (DL/Lite)
  – Relationship with RDF

OWL Basics (on top of RDF and RDFS)
• Set of constructors for concept expressions
  – Booleans: and/or/not
    • A Session is a TheoreticalSession or a HandsonSession
    • Slides are not the same as Code
  – Quantification: some/all
    • Sessions must have some EducationalMaterial
    • Sessions can only have Presenters that have developed Grid applications or Grid middleware
• Axioms for expressing constraints
  – Necessary and Sufficient conditions on classes
    • A Session that hasEducationalMaterial Code is a HandsonSession.
  – Disjointness
    • TheoreticalSessions are disjoint with HandsonSessions
  – Property characteristics: transitivity, inverse

Reasoning
• OWL DL based on a well understood Description Logic (SHOIN(Dn))
  – Formal properties well understood (complexity, decidability)
  – Known reasoning algorithms
  – Implemented systems (highly optimised)
• Because of this, we can reason about OWL ontologies
  – Subsumption reasoning
    • Allows us to infer when one class is a subclass of another
    • Can then build concept hierarchies representing the taxonomy.
    • This is classification of classes.
  – Satisfiability reasoning
    • Tells us when a concept is unsatisfiable
      – i.e. when it is impossible to have instances of the class.
    • Allows us to check whether our model is consistent.
  – Instance Retrieval/Instantiation
    • What are the instances of a particular class C?
    • What are the classes that x is an instance of?

Reasoning Tasks. Classification

What does OWL provide?
• Ability to use complex schema/vocabularies to describe our resources.
• Consistent vocabulary use and sharing.
• Robust data integration techniques
• Complex inference and several reasoning functions
• Query mechanisms: OWL QL

Summary
• Good Things about RDF(S) + OWL
  – They let us describe resources in a machine understandable way
  – Resources can be anything addressable by URIs
  – These languages are standards
  – They allow for different levels of reasoning

Lots of Resources on the Grid
• Web Services
• Computational facilities
• Apparatus
• Disk and networking infrastructure
• Policies
• Workflows
• Programs

The Semantic Grid
“The Semantic Grid is an extension of the current Grid in which information and services are given well-defined and explicitly represented meaning, so that it can be shared and used by humans and machines, better enabling computers and people to work in cooperation”
D. De Roure et al.
http://www.semanticgrid.org

Where things meet…

Motivation: Metadata Matters
• Particularly for the following activities:
  – Information provision and resource discovery
  – Data integration
  – Provenance
  – Systems Configuration
  – Policy representation and reconciliation
• Using:
  – Open, flexible and extensible self describing schemas that don’t have to be nailed down
    • “Let’s describe my data set, or the output format of this tool”
    • Lightweight schemas
    • Decoupled, interoperable systems, which are resistant to syntactic changes
  – Open world
    • “This metadata is no longer valid because...”
  – Data integration across different data models (e.g. RDF)
    • Like policy or resource models
  – Formalization & Reasoning support

Example: Web Services
[Figure: a Photo Lookup Service, provided by USC, with the labels “list of strings” and “campus jpegs”]

Example: Web Services
[Figure: the same Photo Lookup Service, annotated with descriptions]
• list of strings: Keywords, Describe Pictures, Describe things at USC
• campus jpegs: Photos, Taken at USC, Photo taxonomy
• Photo Lookup Service, provided by USC: Availability, It uses Google, It runs on an old 486, It logs your data

OWL-S
• OWL-S is a language based on OWL for the description of Web Services
  – http://www.w3.org/Submission/OWL-S/
• Motivating Tasks
  – Automatic Web service discovery
  – Automatic Web service invocation
  – Automatic Web service composition and interoperation
• Bringing Semantics to Web Services with OWL-S, Martin, D. and Burstein, M. and McDermott, D. and McIlraith, S. and Paolucci, M. and Sycara, K. and McGuinness, D. and Sirin, E. and Srinivasan, N., 2007

OWL-S - Upper Ontology

OWL-S Service Profile

OWL-S and WSDL

Semantics make better registries
• Grimoires
  – http://twiki.grimoires.org/bin/view/Grimoires/
  – A UDDI Service Registry
  – Allows for the addition of arbitrary metadata (in RDF) to describe services.
  – Allows for the registration of workflows and associated metadata descriptions
  – Exposes service descriptions through a WSRF interface
• Benefits
  – More advanced search taking advantage of RDFS-style reasoning
    • e.g. a search for “a photo service” finds the USC photo service
  – Descriptions can be enhanced over time

Taverna Workflow Workbench
From Carole Goble

Composing Services
• Taverna (http://taverna.sourceforge.net)
  – Quite a few users
  – Not very “gridy”
  – Stores all of its workflows and related information in RDF
  – Mainly uses semantics for lookup
• Wings (http://www.isi.edu/ikcap/wings/)
  – Takes advantage of service descriptions, data descriptions, and workflow templates in OWL to automatically generate workflows that can be run with Pegasus.
• McIlraith et al.
  – Using Golog to automatically compose services based on OWL-S descriptions.
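
The registry benefit listed earlier (a search for a photo service finds the USC photo service) rests on RDFS-style subclass reasoning over service descriptions. A minimal sketch in plain Python, assuming a made-up class hierarchy, made-up service names and an in-memory triple list; it does not use or reproduce the Grimoires API.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical registry content: a small class hierarchy plus typed services.
TRIPLES: List[Triple] = [
    ("ex:CampusPhotoService", "rdfs:subClassOf", "ex:PhotoService"),
    ("ex:PhotoService", "rdfs:subClassOf", "ex:ImageService"),
    ("ex:uscPhotoLookup", "rdf:type", "ex:CampusPhotoService"),
    ("ex:weatherLookup", "rdf:type", "ex:WeatherService"),
]


def superclasses(cls: str, triples: List[Triple]) -> Set[str]:
    """Return cls plus every class reachable through rdfs:subClassOf."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def find_instances(wanted: str, triples: List[Triple]) -> List[str]:
    """Resources whose asserted type is wanted or any subclass of wanted."""
    return sorted(
        s
        for s, p, o in triples
        if p == "rdf:type" and wanted in superclasses(o, triples)
    )


# The service is only declared as a CampusPhotoService, yet a search for
# the more general PhotoService still finds it.
photo_services = find_instances("ex:PhotoService", TRIPLES)
print(photo_services)
```

A purely syntactic registry would match only the exact type string; following subClassOf links is the small amount of semantics that lets a general query reach specifically described services.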
Data Management
• Data from multiple institutions
• Data discovery and search
• Integration through semantics
• End-to-end data understanding
  – Data from apparatus
  – Data from lab notebooks
• CombeChem
• Scientific Application Middleware

Provenance of data
Bioinformatics: verification and auditing of “experiments” (e.g. for drug approval)
High Energy Physics: tracking, analysing, verifying data sets in the ATLAS Experiment of the Large Hadron Collider (CERN)

Why not use log files?
INFO: Starting Coyote HTTP/1.1 on http-8443
Jun 25, 2007 4:26:54 PM org.apache.jk.common.ChannelSocket init
INFO: JK2: ajp13 listening on /0.0.0.0:8009
Jun 25, 2007 4:26:54 PM org.apache.jk.server.JkMain start
INFO: Jk running ID=0 time=4/38 config=/Users/pgroth/Develop/jakarta-tomcat-5/conf/jk2.properties
Jun 25, 2007 4:26:54 PM org.apache.catalina.startup.Catalina start
INFO: Server startup in 17476 ms
Jun 25, 2007 4:36:49 PM org.apache.catalina.core.ContainerBase log
INFO: Removing web application at context path /preserv
Jun 26, 2007 8:59:59 AM org.apache.catalina.core.StandardHostDeployer install
INFO: Installing web application at context path /preserv-1.0 from URL jar:file:/Users/pgroth/Develop/jakarta-tomcat-5/webapps/preserv-1.0.war!/
Jun 26, 2007 9:00:03 AM org.apache.catalina.loader.WebappClassLoader validateJarFile
INFO: validateJarFile(/Users/pgroth/Develop/jakarta-tomcat-5/webapps/preserv-1.0/WEB-INF/lib/servlet-api-2.4.jar) - jar not loaded. See Servlet Spec 2.3, section 9.7.2.
Offending class: javax/servlet/Servlet.class
Jun 26, 2007 9:00:08 AM org.apache.catalina.startup.HostConfig deployWARs
SEVERE: Exception while expanding web application archive preserv-1.0.war
java.lang.IllegalStateException: Context path /preserv-1.0 is already in use
    at org.apache.catalina.core.StandardHostDeployer.install(StandardHostDeployer.java:190)
    at org.apache.catalina.core.StandardHost.install(StandardHost.java:832)
    at org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:617)
    at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:431)
    at org.apache.catalina.startup.HostConfig.check(HostConfig.java:1068)
    at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:327)
    at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
    at org.apache.catalina.core.StandardHost.backgroundProcess(StandardHost.java:800)
    at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1619)
    at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1628)
    at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1608)
    at java.lang.Thread.run(Thread.java:613)
Jun 26, 2007 9:03:18 AM org.apache.catalina.core.ContainerBase log
INFO: Removing web application at context path /preserv-1.0

Provenance
• Provides the causal connections between data items
• What is needed is a meaningful representation of the provenance of data, so that it can be reasoned about
• Several groups have worked on provenance in the context of workflows
• Many use Semantic Web Technologies
• The Provenance Challenge
  – http://twiki.ipaw.info/bin/view/Challenge/

Adding Semantics to the Grid
– Semantic-OGSA
– Semantic Grid Reference Architecture
– A low-impact extension of OGSA
– Mixed ecosystem of Grid and Semantic Grid services
  • Services ignorant of bindings
  • Services binding aware but unable to process them
  • Services binding aware and capable of processing (part of) them
– Everything is OGSA compliant
[Figure: Model, Capabilities and Mechanisms, linked by “provide/consume”, “expose” and “use”]
From Carole Goble

OGSA
[Figure: component diagram. Legend: Core, Contrib/Preview, Deprecated; Web Services Components, Non-WS Components. Components by column:]
• Security: Delegation; Community Authorization; Authentication Authorization; Pre-WS Authentication Authorization; Credential Mgmt
• Data Mgmt: Data Replication; Data Access & Integration; Reliable File Transfer; GridFTP; Replica Location
• Execution Mgmt: Grid Telecontrol Protocol; Community Scheduling Framework; Workspace Management; Grid Resource Allocation & Management; Pre-WS Grid Resource Alloc. & Mgmt
• Info Services: WebMDS; Trigger; Index; Pre-WS Monitoring & Discovery
• Common Runtime: Python WS Core; C WS Core; Java WS Core; C Common Libraries; eXtensible IO (XIO)

S-OGSA (OntoKit implementation)
[Figure: the same component diagram as on the OGSA slide, with these labels added: Annotation; Metadata; Reasoning; Ontology; Semantic; Ontology Role-based AuthZ; Semantically Aware]

The Future
• Increasing amounts of RDF data
  – May just be because it’s a nice way to store graphs
• Semantics are becoming easier to integrate with the Grid because of the move towards Web Service technology
• Difficult to mark up services and data
• Concerns about the “friendliness” of tools that use semantics
• Everybody loves Web 2.0, Semantic Grid people do too (now)