Grids, Grid Data Services and OGSA-DAI
Mike Mineter
NeSC-TOE
[email protected]

Acknowledgement
• Many slides from OGSA-DAI team.
• (Some slides from me.)

EU project: RIO31844-OMII-EUROPE

Contents
• What is a Grid?
• What is a Grid Data Service?
• Why the “OGSA-DAI” acronym?!
• Why does OGSA-DAI matter?!
• When should we use OGSA-DAI?

What is a Grid? - 1
• This is

What is a Grid?
[Figure labels: Computers; Data; People]
A Grid is all about the sharing of Resources

A Grid is..
• … all about the sharing of Resources
  – Within and between virtual organisations (= collaborations)
• Resources accessed by abstractions
  – User wants a job to run, wants to access data,…
    » Rarely cares where this happens
• … a set of resources (and enabling services) that share mechanisms for
  – Authentication: communicate identity of user/provider
    • X.509 certificate commonly used in “production grids”
  – Authorisation: what can this user be allowed to do
    • Member of which VO, which group,…
  – Underpinned by agreement across VOs and resource providers
• … infrastructure that builds on the Internet to permit orchestration of services across administrative domains

Web services – software components that are…
• Accessible across a network
• Loosely coupled, defined by the messages they receive / send
• Service description that can be used to create client software
• Based on standards (for which tools do / could exist)
• Developed in anticipation of new uses
[Figure labels: Client; Service (repeated)]

Globus Toolkit 4 Web Services Core
[Figure labels: User Applications; Custom Web Services; Custom WSRF Web Services; GT4 WSRF Web Services; WS-Addressing, WSRF, WS-Notification; WSDL, SOAP, WS-Security; Registry Administration; GT4 Container]
Thanks to J. Schopf, ANL

Focus on Data
[Figure label: Data]
OGSA-DAI enables the sharing of Data Resources

Types of data services
• Many user communities manage data in grid vaults (aka storage elements)
  – Experimental data
    • ….
  – Replicated for resilience
  – And to be close to where computation will happen
• Many new user communities have more diverse data resources
• To facilitate new research, data need to be accessible from Grid infrastructures
• Resources:
  – May pre-date Grids
  – Providers may have current ways to distribute data to users
  – May not be able to replicate data
  – Need AuthN and AuthZ

Motivation
• Grid is about sharing resources
• OGSA-DAI is about sharing structured data resources
[Figure labels: Relational Database; XML Database; Indexed File]

Web: www.omii.ac.uk Email: [email protected]

Life before OGSA-DAI….
• A few examples follow of alternative approaches to sharing data.

Sharing data via web site download
• ZIP up data and put it on a web site
• Pros
  o Easy distribution for providers
  o Easy access for consumers
• Cons
  o Consumers have to download all the data
  o Consumers have to load data into local databases to use it
  o Static snapshot
  o Security

Sharing data via direct access
• Providers tell consumers
  o Database URL – mycomputer.epcc.ed.ac.uk:3306
  o Username – userID
  o Password – password
• Pros
  o Consumers have direct access
• Cons
  o Firewall issues
  o User and password management is hard
  o No consistent security model
  o Hard to use in grid/web service workflows

Sharing data via direct access
• Cons (continued)
  o No server-side layer in which to standardize database heterogeneities
  o Myriad drivers
  o Different APIs across different data types
    • Relational and JDBC
    • XML and XMLDB
    • Indexed files and Lucene
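To make the "different APIs across different data types" point concrete, here is a minimal sketch in Python. It is an illustration only: sqlite3, xml.etree.ElementTree and a plain dictionary are stand-ins (assumptions, not from the slides) for JDBC, XMLDB and Lucene, and the BOOKS data and function names are hypothetical. The same question ("which titles did this author write?") needs three unrelated pieces of client code.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this illustration.
BOOKS = [("Grid Basics", "Smith"), ("Data Services", "Jones"), ("Grid Data", "Smith")]


def titles_from_relational(author):
    """Relational stand-in: SQL through a database driver (sqlite3 here, not JDBC)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE book (title TEXT, author TEXT)")
        conn.executemany("INSERT INTO book VALUES (?, ?)", BOOKS)
        rows = conn.execute(
            "SELECT title FROM book WHERE author = ? ORDER BY title", (author,)
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()


def titles_from_xml(author):
    """XML stand-in: a path expression over a document tree (ElementTree here, not XMLDB)."""
    root = ET.Element("books")
    for title, who in BOOKS:
        ET.SubElement(root, "book", author=who).text = title
    return sorted(b.text for b in root.findall(f"./book[@author='{author}']"))


def titles_from_index(author):
    """Indexed-file stand-in: a key lookup in an index (a dict here, not Lucene)."""
    index = {}
    for title, who in BOOKS:
        index.setdefault(who, []).append(title)
    return sorted(index.get(author, []))
```

Each function returns the same list for the same author, but a consumer with direct access has to learn and maintain all three access styles, which is the heterogeneity a server-side layer could hide.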

Domain-specific web services
• Manipulate data using domain-specific operations, e.g.
  o Book findByISBN(ISBN)
  o List<Book> findByAuthor(Author)
  o List<Book> findByKeyword(Word)
• Pros
  o Fits with grid/web service approach
  o Abstraction hides back-end database details
  o Web services are programming language neutral
  o Operations likely to map well to authorization policies

Domain-specific web services
• Cons
  o Slower than direct access
    • Web service layer
    • SOAP transport overhead – especially for large result sets
  o Domain-specific API prevents use of generic data exploration, mining and manipulation tools
[Figure: Generic Data Linking Application. Labels: Books; Cancer; University Employees; Books written by University employees; University employees in 1932 who have since died of cancer]

OGSA-DAI generic web services
• Manipulate data using OGSA-DAI’s generic web services
[Figure labels: request; data; OGSA-DAI; Relational Database; XML Database; Indexed File]

Importance of workflows
• OGSA-DAI server is close to data
• Query -> Transform -> DeliverToFTP: 3 activities in the workflow
[Figure labels: Access; OGSA-DAI service; Transform; Web Service; FTP Server]

Usage Scenarios
[Figure labels: Data Source; Data Source 1; Data Source 2; Data Source n; OGSA-DAI; Client; FTP Server on Client. Legend: Data message; Control message]

OGSA-DAI 3.0
[Figure labels: OMII; GT; Axis; UNICORE; WS-DAI; ?; gLite; Embedded; OGSA-DAI Core; Resource management; Data Resources; Activity management; Workflow engine; Activities; Persistence and Configuration]

Typical roles
• Researcher
  – Wants to use data from context of known application, easy portal, workflow..
• Data publisher
  – Deploys OGSA-DAI server
  – Determines AuthN and AuthZ policies for their data
  – Establishes activities (= workflow components)
• Informatician / Application developer
  – Deploys client software
  – Uses Java to build workflow
  – Exposes client for…

OGSA-DAI 3.0
• OGSA-DAI has evolved constantly since February 2002
• OGSA-DAI 2.2 released April 2006
• As the number of users grew so did the requirements
  – More effective data streaming
  – Standardisation of activity inputs and outputs
  – Targeting multiple data resources in a single workflow
  – Supporting application-specific presentation layers
• OGSA-DAI 2.2 was not suitable for addressing these
• OGSA-DAI 3.0
  – A complete re-design and re-implementation of OGSA-DAI
  – A stable framework for the future
  – Released September 2007

Where might OGSA-DAI not be suitable?
• OGSA-DAI is not
  – A complete solution to every data-related problem
  – A replacement for or competitor to JDBC
  – Just about accessing relational databases
• It is not suitable if
  – You have a single data resource that isn’t going to change
  – You have no data transformation requirements
  – You want rapid access to data in a single data resource

What is OGSA-DAI?
• An extensible framework
• accessed via web services
• that executes data-centric workflows
• involving heterogeneous data resources
• for the purposes of data access, integration, transformation and delivery
• within a grid
• and is intended as a toolkit for building higher-level application-specific data services

Thank you!
http://www.ogsadai.org.uk
http://omii-europe.org
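
Appendix: a workflow sketch
The Query -> Transform -> DeliverToFTP workflow from the "Importance of workflows" slide can be pictured as three chained activities, each streaming its output into the next. The sketch below is plain Python and does not use the OGSA-DAI API: the function names, the employee table and the in-memory list standing in for the FTP server are all assumptions made for illustration.

```python
import sqlite3


def query(conn, sql):
    """Activity 1 (Query): stream rows out of the data resource."""
    for row in conn.execute(sql):
        yield row


def transform(rows):
    """Activity 2 (Transform): turn each row into one CSV line."""
    for row in rows:
        yield ",".join(str(value) for value in row)


def deliver(lines, sink):
    """Activity 3: stand-in for DeliverToFTP; appends lines to an in-memory sink."""
    count = 0
    for line in lines:
        sink.append(line)
        count += 1
    return count


def run_workflow(sink):
    """Run the three activities next to the data; only the final result leaves."""
    conn = sqlite3.connect(":memory:")  # hypothetical data resource
    try:
        conn.execute("CREATE TABLE employee (name TEXT, year INTEGER)")
        conn.executemany(
            "INSERT INTO employee VALUES (?, ?)", [("Ada", 1932), ("Bob", 1950)]
        )
        rows = query(conn, "SELECT name, year FROM employee ORDER BY name")
        return deliver(transform(rows), sink)
    finally:
        conn.close()
```

Because the activities are generators, rows flow through the pipeline one at a time instead of being collected first, which loosely mirrors the streaming requirement listed for OGSA-DAI 3.0. Calling run_workflow(sink) with an empty list fills the list with the delivered lines and returns how many were delivered.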