Data Management in Geodise
Zhuoan Jiao, Jasmin Wason and Marc Molinari
30-31 January 2003, Edinburgh

Engineering design and optimisation is a computationally intensive process in which data may be generated at different locations and with different characteristics. Data is traditionally stored in flat files, with little descriptive metadata provided by the file system. Our focus is on providing data management by leveraging existing database tools that are not commonly used in engineering, and on making them accessible to users of the system. The main objectives are to provide:

A data management service
    Store and retrieve data files securely from a repository using GridFTP.
    Technical and application-specific metadata is added so that data is easier to search for and locate.

Metadata management services
    Web services provide API access to metadata in databases.
    Use both relational and XML databases.

A familiar interface for engineers
    Work with functions and variables rather than the underlying XML, SOAP, SQL, XPath, etc.

© Geodise Project, University of Southampton, 2003. http://www.geodise.org/

Geodise Database Toolkit

Storage service
    Allows applications to archive data, sent over GridFTP, in file systems curated by Geodise. The benefits are accessibility by a larger community (via authorisation), storage capacity, and a uniform query interface.

Metadata service
    Data can be stored with additional descriptive information detailing technical characteristics (e.g. location, format), ownership, and application-domain-specific metadata.

Query service
    Queries over the metadata database help to locate the needed data intuitively and efficiently.

Authorisation service
    Access rights to data can be granted to an authenticated user based on information stored in the authorisation database.
Example: archive and retrieve data
    Archive data:
    >> fileID = gd_archive('C:\input.dat');
    Retrieve data:
    >> gd_retrieve(fileID, 'E:\tmp')
    ans =
    E:\tmp\input.dat

Example: define metadata and archive a file
    >> m.grids = 1;
    >> m.turb_model = 'sa';
    >> fileID = gd_archive('C:\input.dat', m);

Example: query the metadata
    >> r = gd_query('standard.userID = me & grids < 2');
    To access a value from the first result in the cell array r:
    >> r{1}.turb_model
    ans =
    sa

Example: grant access to users and groups when archiving
    >> m.grids = 1;
    >> m.access.users = {'userA', 'userB'};
    >> m.access.groups = {'groupC'};
    >> fileID = gd_archive('C:\input.dat', m);

Application of XML Toolbox for MATLAB

The following functions are responsible for converting a Matlab variable to and from an XML string:
    xml_format() - Convert a Matlab variable into an XML string.
    xml_parse() - Convert an XML string into a Matlab variable.

Define Matlab variables and convert them:
    >> meta.grids = 1
    >> meta.turb_model = 'sa'
    X = xml_format(meta)
    meta = xml_parse(X)

Type-based XML (easy for converting back to Matlab), X =
    <struct xmlns="http://www.geodise.org/matlab.xsd" idx="0" fields="grids turb_model">
      <double idx="1" name="grids" size="1 1"> 1 </double>
      <char idx="1" name="turb_model" size="1 2">sa</char>
    </struct>

Name-based XML (easy for query), as held in the database:
    <file_metadata type="struct" idx="0" fields="grids turb_model">
      <grids type="double" idx="1" size="1 1"> 1 </grids>
      <turb_model type="char" idx="1" size="1 2">sa</turb_model>
    </file_metadata>

XSLT stylesheets convert between the two forms: type2name (type-based to name-based) and name2type (name-based to type-based).

Data Management Implementation

To increase the usability of the file and metadata management services for engineers, we have implemented a MATLAB toolkit for archiving, querying and retrieving data to and from a Geodise repository.
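The name-based XML layout shown in the XML Toolbox section can be illustrated with a small sketch. It is written in Python rather than MATLAB and does not use the XML Toolbox; the helper name to_name_based_xml, the root element name default, and the mapping of numbers to "double" and strings to "char" are assumptions made for illustration, and only flat metadata (scalars and strings) is handled.

```python
import xml.etree.ElementTree as ET


def to_name_based_xml(metadata, root_name="file_metadata"):
    """Hypothetical helper: build name-based XML from a flat dict.

    Mimics the attribute layout shown on the poster (type, idx, size);
    it is not the XML Toolbox implementation.
    """
    root = ET.Element(root_name, {
        "type": "struct",
        "idx": "0",
        "fields": " ".join(metadata),  # field names in insertion order
    })
    for name, value in metadata.items():
        if isinstance(value, str):
            type_name, size, text = "char", f"1 {len(value)}", value
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            type_name, size, text = "double", "1 1", format(value, "g")
        else:
            raise TypeError(f"unsupported metadata value for {name!r}")
        child = ET.SubElement(
            root, name, {"type": type_name, "idx": "1", "size": size}
        )
        child.text = text
    return root


meta = {"grids": 1, "turb_model": "sa"}
root = to_name_based_xml(meta)
print(ET.tostring(root, encoding="unicode"))

# Because each field is an element named after the field itself,
# a query can address it directly by name.
turb_model = root.findtext("turb_model")
few_grids = float(root.findtext("grids")) < 2
print(turb_model, few_grids)
```

The point of the name-based form is visible in the last lines: a field is located by its element name, which is what makes this layout convenient for path-style queries.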
[Figure: architecture diagram linking Client and Grid. Labels: Geodise Database Toolkit, Matlab Functions, Java clients, .NET, Browser, Java, CoG, Apache SOAP, SOAP, Globus Server, GridFTP, Location Service, Location Database, Authorisation Service, Authorisation Database, Metadata Archive & Query Services, Metadata Database.]

Query Example

[Figure: screenshots numbered 1 to 4. Labels: "MATLAB commands to retrieve files", "Copy and paste".]

Future Work

Grid Data Management
    Replace and enhance some of our functionality with that provided by OGSA-DAI for Grid Database Services.
    E.g. a name mapping interface for authenticating Grid credentials to local ids.
    Automatic collection of data and metadata from a higher-level engineering problem setup GUI.

Manage Matlab Structures
    Some data may take the form of Matlab structures rather than files.
    These can be archived as XML in the repository and then queried.
    The structures can also be retrieved back into the Matlab workspace.

Categorisation of metadata based on XML Schemas
    Group metadata with the same XML Schema in the database.
    Users are not expected to write XML Schemas.
    Generate a simple XML Schema from a metadata structure if one does not already exist to describe it.
    Could help future integration of ontologies.
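The schema-based categorisation listed under Future Work might look something like the following sketch. It is in Python, not MATLAB, and is only one possible reading of the idea: the function name simple_schema, the choice of xs:all, and the mapping of strings to xs:string and everything else to xs:double are all assumptions, and the sample records are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)  # serialise schema elements with the xs: prefix


def simple_schema(metadata, root_name="file_metadata"):
    """Hypothetical helper: derive a minimal XML Schema from a flat dict.

    Fields are sorted so that records with the same fields and types
    always yield the same schema text, whatever their field order.
    """
    schema = ET.Element(f"{{{XS}}}schema")
    root_el = ET.SubElement(schema, f"{{{XS}}}element", {"name": root_name})
    complex_type = ET.SubElement(root_el, f"{{{XS}}}complexType")
    # xs:all lets the fields appear in any order in an instance document.
    group = ET.SubElement(complex_type, f"{{{XS}}}all")
    for name in sorted(metadata):
        value = metadata[name]
        xs_type = "xs:string" if isinstance(value, str) else "xs:double"
        ET.SubElement(group, f"{{{XS}}}element", {"name": name, "type": xs_type})
    return ET.tostring(schema, encoding="unicode")


# Made-up metadata records; the first two share a structure.
records = [
    {"grids": 1, "turb_model": "sa"},
    {"turb_model": "ke", "grids": 3},
    {"mesh": "coarse"},
]

# Group records under the schema generated from their structure.
groups = defaultdict(list)
for record in records:
    groups[simple_schema(record)].append(record)

for schema_text, members in groups.items():
    print(len(members), "record(s) described by:", schema_text)
```

Generating the schema from the data keeps to the stated aim that users are not expected to write XML Schemas themselves.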