Federal University of Rio de Janeiro – COPPE/UFRJ
Author: Wladimir S. Meyer, Doctoral Student
Advisors: Jano Moreira de Souza, Ph.D.; Milton Ramos Ramirez, D.Sc.

Summary
• Introduction
    • Motivation
    • Objectives
    • Related Works
• Framework Description
    • Structure
    • Functioning
    • New functionalities added to Secondo
• The Case Study
• Final Considerations

Introduction
Motivation
• The challenge of integrating spatial databases spread across a computational grid
Objectives
• Add new functionalities to an extensible SDBMS so that it can act as a platform for studying distributed spatial databases in computational grids. This platform should:
    • Be capable of interacting (by itself) with other analogous platforms in a grid
    • Offer some levels of transparency [Özsu and Valduriez 1999]: data independence, network transparency and replication transparency
    • Be modular, so that work can focus only on the experiments being developed
    • Be capable of exchanging "specialized skills" (algebras, in this case)

Introduction
Related Works
• The GGF Data Access and Integration Services Working Group (GGF DAIS-WG) produces many recommendations related to databases in grids [OGSA-DAI-WSRF 05].
    • They are a set of interfaces and services to be implemented outside the DBMS environment
    • Only relational, XML and file system data models are supported
• The OGSA-DAI project implements many of the DAIS-WG recommendations and offers a Java toolkit for clients
• The OGSA-DQP project [Smith et al. 2002] uses OGSA-DAI to support distributed queries over a grid. Only relational databases benefit, and it does not support the new release of OGSA-DAI based on WSRF.

Framework Description - Structure
The framework is composed of:
• A spatial DBMS (*): Secondo [Dieker and Güting 2000] was adopted because of its modularity, formalism and extensibility. It was originally intended for experimental purposes with spatial and spatio-temporal data models [Güting et al. 2004].
• A grid middleware: it offers several services that are used by the SDBMS [Foster 2005]:
    • Job Manager Service (GRAM)
    • Reliable File Transfer Service (RFT)
    • Index Service (MDS)
  Globus Toolkit 4 was chosen because of its web service approach and its set of powerful components.
• A set of tools, added to provide extra functionalities such as:
    • Submitting queries to a set of servers
    • Discovering an algebra in another Secondo, based on algebra description files
    • Importing an algebra
(*) when used with its spatial algebra

Framework Description - Functioning
[Diagram, repeated on each slide of this sequence: a Central Index Service (MDS) holding the global schema and the fragments' map, and four servers, Secondo #1 to Secondo #4. Algebras' description files are shown next to Secondo #1, #2 and #3.]
• QUERY: a query arrives at one server (Secondo #1 in the slides).
• Request servers' status: Secondo #1 asks for the status of the servers that hold the same fragments; the request is answered through each server's MDS.
• Responses: each MDS reports
    • CPU load
    • Total amount of memory
    • Total amount of free memory
    • Number of running processes
    • Number of active processes
    • Number of users logged in
    • Total amount of free space on the hard disk
• Send subqueries: Secondo #1 generates a job description file and a Secondo command file and submits them to the selected nodes using GRAM. The job description file can express a multijob, meaning, for example, that the result of one query must be transferred to another query, which uses it in a second step.
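The status request and node selection steps above can be illustrated with a minimal sketch. It assumes a fragments' map that lists, for each fragment, the servers holding a replica, and one status report per server carrying a few of the metrics listed above. The names `ServerStatus`, `load_score` and `select_nodes`, the example data, and the ranking rule (lowest CPU load, ties broken by most free memory) are all assumptions made for illustration; the slides do not state how Secondo #1 ranks the responses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerStatus:
    """A few of the metrics reported through MDS (hypothetical field names)."""
    cpu_load: float
    free_memory_mb: int
    active_processes: int


def load_score(status):
    # Assumed ranking rule: lower CPU load first, then more free memory.
    return (status.cpu_load, -status.free_memory_mb)


def select_nodes(fragments_map, statuses):
    """Pick, for every fragment, the least loaded server that holds a replica."""
    selection = {}
    for fragment, servers in fragments_map.items():
        # Only servers that answered the status request are candidates.
        candidates = [s for s in servers if s in statuses]
        if not candidates:
            raise LookupError(f"no reachable server holds fragment {fragment!r}")
        selection[fragment] = min(candidates, key=lambda s: load_score(statuses[s]))
    return selection


# Hypothetical fragments' map and status responses.
fragments_map = {
    "hydrography": ["secondo2", "secondo3"],
    "vegetation": ["secondo3", "secondo4"],
}
statuses = {
    "secondo2": ServerStatus(cpu_load=0.80, free_memory_mb=512, active_processes=12),
    "secondo3": ServerStatus(cpu_load=0.20, free_memory_mb=256, active_processes=5),
    "secondo4": ServerStatus(cpu_load=0.20, free_memory_mb=1024, active_processes=7),
}

selected = select_nodes(fragments_map, statuses)
print(selected)
```

Keeping the ranking rule in a separate function makes it easy to experiment with other criteria, such as the number of logged-in users or free disk space, without touching the selection loop.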
• Results as nested lists (RFT): the results of the subqueries are returned as nested lists, transferred with RFT.
• Result: the returned results are aggregated to form a global result.

Framework Description – Modified Secondo
[Architecture diagram, adapted from [Ramirez 2001]. Recoverable labels:]
• Existing Secondo components: Graphical User Interface, Optimizer, Command processor, Query processor, Kernel, algebras Alg 1, Alg 2, Alg 3, ..., Alg n, Storage Manager & tools
• New functionalities: Global Query Plan Processor, Query Plan Maker, Query Execution Monitor
• Data flows: global query, subqueries, results, global result, global schema, fragment location, resources status
• GRAM cli: submits activities (jobs) to the grid
• MDS cli: discovers and monitors registered resources
• Functions named in the diagram: globalQueryPlanProcessor(), submitSubQueries(), requestGlobalSchema(), modifyGlobalSchema(), requestFragmentLocation(), updateFragmentLocation(), monitorResourcesStatus(), lookForAlgebras(), importAlgebra()

Framework Description – New functionalities
Files generated automatically during a job submission:
• Job description file: specifies where and how a job must be executed
• Secondo command file: specifies a set of commands to be run on a Secondo server

Spatial select example (tempBox is constructed with the spatial algebra; creatertree and windowintersect are R-tree algebra operators):
    open database 28433;
    create tempBox:rect;
    update tempBox:=[const rect value(-48.775 -48.771 -25.331 -25.339)];
    let temp=drain_line creatertree [shape];
    query temp drain_line windowintersect [tempBox] consume;
    delete temp;
    delete tempBox;
    close database 28433;

The Case Study
To validate the proposed framework, a geographic database prototype is being built as follows:
Composition:
• 4 computers running Fedora Linux as grid nodes
• All machines running GT4 with the GRAM, MDS and RFT services
• All machines running a modified Secondo (Secondo-grid)
Distributed spatial database design:
• Federated architecture with a global schema
• Thematic fragmentation (themes: Hydrography, Edification, Vegetation)
• The fragments can be replicated
• All themes belong to the same region
[Diagram: the themes distributed over Secondo 1 to Secondo 4.]

The Case Study
• Autonomy: moderate, because each Secondo must update the global schema and the fragments' map when necessary
• Nature of data: cartographic data supplied by the Directory of Geographic Service (Brazilian Army)
• Queries being implemented: spatial select and spatial join

Final Considerations
• This framework is being developed as a platform for experimental purposes; performance is not its main focus
• Many issues were not included in the present work and will be covered in future work: transaction control, an optimizer for distributed queries, security, etc.
• Modules of the framework that are running now:
    • Registering and monitoring modules: based on the global schema, the fragments' map, the servers' status monitor and the algebras' description file
    • Automatic generation of files: job description file and Secondo command file
    • Submission of single queries with GRAM clients

Final Considerations
Next steps:
• Conclude the data transfer module using RFT
• Implement multijob submission with complex queries
• Conclude the infrastructure to import algebras

Thank you!
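The automatic generation of the Secondo command file, listed above among the modules already running, can be pictured as a simple template fill. The sketch below follows the command sequence of the spatial select example and ends every command with a semicolon. The function name `spatial_select_commands`, its parameters and the one-command-per-line layout are assumptions for illustration, not the framework's actual generator.

```python
def spatial_select_commands(database, relation, attribute, box):
    """Build the text of a Secondo command file for a window (spatial select) query.

    `box` holds four coordinates, written in the same order as in the `rect`
    constant of the slide's example.
    """
    if len(box) != 4:
        raise ValueError("box must hold exactly four coordinates")
    coords = " ".join(f"{c:.3f}" for c in box)
    commands = [
        f"open database {database}",
        "create tempBox:rect",
        f"update tempBox:=[const rect value({coords})]",
        # Temporary R-tree over the geometry attribute of the relation.
        f"let temp={relation} creatertree [{attribute}]",
        f"query temp {relation} windowintersect [tempBox] consume",
        # Clean up the temporary objects before closing.
        "delete temp",
        "delete tempBox",
        f"close database {database}",
    ]
    return "".join(cmd + ";\n" for cmd in commands)


# Values taken from the spatial select example in the slides.
script = spatial_select_commands(
    "28433", "drain_line", "shape", (-48.775, -48.771, -25.331, -25.339)
)
print(script)
```

A generator of this kind would produce one such file per selected node, to be submitted together with the job description file.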