gLibrary 2.0 REST Platform
Antonio S. Calanducci – University of Catania - Italy ([email protected])
e-Research Summer Hackfest – Catania (Italy)
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement n° 654237

Outline
• Platform presentation & history
• Features
• Architecture
• Authentication & Authorization
• Deployment
• Under the hood
• How to use gLibrary
• gLibrary 1.0 vs gLibrary 2.0
• References
• Live demo

Introduction to gLibrary 2.0
• A service that provides access to existing data collections or creates new ones
• Exposes access to data collections via REST APIs and JSON
• RESTifies existing databases
• Supports both relational databases with a schema (MySQL, PostgreSQL, etc.) and non-relational, schemaless databases (MongoDB)
• New repositories and collections (i.e. the REST APIs for them) are created and managed at runtime through the gLibrary REST APIs
• We can say that gLibrary provides REST APIs to create REST APIs :)

Terminology
• Repository: a way to group data collections together. The collections can be of different types, heterogeneous, or hosted on different remote servers. A repository is also called a project (or a database, in the RDBMS world). Generally a user is the owner/manager of a repository
• Collection: a set of documents or records. A collection can have a fixed schema (like a database table) or be schemaless (like a JSON document)
• Item: a record or document.
  It’s a set of key-value pairs in JSON format
• Replica (or Attachment): each item can optionally have an associated file, stored on one or more distributed storage servers

Examples:
• Repositories: “sci-gaia”, “my_newproject”, “unict”, “demo”
  e.g.: /v2/repos/my_newproject
• Collections: “patients”, “activities”, “presentations”, “manuscripts”, “music”, “videos”, “invoices”, “running_jobs”, “staged_files”, etc.
  e.g.: /v2/repos/my_newproject/videos
        /v2/repos/my_newproject/invoices
• Item: the details of a given “invoice”, “song” or “job”
  e.g.: /v2/repos/demo/music/32
• Replica: the “pdf” file of an “invoice”, the “mp3” file of a “song”, the “txt” output file of a “job”
  e.g.: /v2/repos/demo/invoices/1432/_replicas/i2jgi34jg34

gLibrary REST APIs to manage REST APIs over data sets
• We follow REST principles to manage resources, using HTTP verbs and proper URI paths:
  • GET to retrieve lists of collections, items, replicas
  • POST to create new repositories, collections, items, replicas
  • PUT to edit/update items, collections, replicas
  • DELETE to delete repositories, collections, items, replicas

Features
• Creation of local datasets (on the gLibrary server) or remote ones (on MySQL, PostgreSQL, MongoDB)
• Support for both schemaless and fixed-schema collections
• Creation of collections from data in existing remote databases (queries are forwarded to the remote host)
• Creation of relations between collections (even of different types or belonging to different, remote databases)
• Powerful query syntax on the URL, offering limit, skip, where, like, logical operators, regexp, comparison, ordering
• User creation and login
• Permissions set per repository and per collection (Access Control Lists)
• Storage of assets on Grid (Disk Pool Manager) or Cloud (OpenStack Swift).
  Direct download/upload from the storage servers (no caching on the gLibrary server)

gLibrary architecture
[Architecture diagram. Labels shown: Clients (browser, mobile apps); glibrary.ct.infn.it server (local database / MongoDB); e-Infrastructure Resources: Grid Storage (DPM) infn-se-03.ct.pi2s2.it, Cloud Storage (Swift) cloud.recas.ba.infn.it, Remote Databases (MySQL, PostgreSQL, MongoDB) running on VM; Certificate token server; User TrackingDB]

Authentication & Authorization
• Authentication: gLibrary provides APIs to create users and sign them in. Each call to the gLibrary REST APIs has to be authenticated: a valid, unexpired token has to be passed in the Authorization HTTP header of every request, e.g.
  curl -H "Authorization: Fsw6tUVzNwp4ftzK4cb3WxwKkvMZ" http://glibrary.ct.infn.it:3500/v2/repos/
• Authorization: Access Control Lists, with permissions (reading, creation, editing) for repositories and collections

Deployment
• The gLibrary server can be installed anywhere: on Windows, macOS, or any Linux distribution
• Requirements:
  • An installation of Node.js (https://nodejs.org)
  • A local or remote MongoDB (https://www.mongodb.com)
• Install it from the source available at:
  • https://github.com/csgf/glibrary
  • (note: use the branch testv2.1)
  • Installation instructions are provided at the link above
• Or create an account on our server at:
  • http://glibrary.ct.infn.it:3500

How to use gLibrary
• From the command line:
  • use curl or Wget to integrate it into your own scripts (e.g. running on a VM or a Grid worker node)
• From RIA web apps, using XMLHttpRequest or any wrapper on top of it (e.g. jQuery $.ajax())
• From any portal/CMS (e.g.
  Liferay, WordPress, Joomla, Drupal, etc.) as long as an HTTP client is available
• From mobile apps (the Android, iOS and Windows Phone SDKs all provide HTTP clients)
• From desktop applications

Under the hood
• gLibrary 2.0 has been written in JavaScript on Node.js
• It’s based on the open source LoopBack framework from IBM:
  • http://loopback.io
• A MongoDB database is used to store its configuration settings for repositories, collections and replicas
• It uses Juggler (https://github.com/strongloop/loopback-datasource-juggler) as ORM. Juggler has a modular architecture that allows connecting alternative datasources (SQLite, Oracle, SQL Server, Redis, DynamoDB, CouchDB, Firebird, etc.)

History (gLibrary 1.0 vs. gLibrary 2.0)
• The initial goal of gLibrary 1.0 was to be a simple, easy-to-use platform to store, organize, browse and retrieve digital assets in repositories on grid infrastructure
  • the “g” stands for Grid
  • built with Python/PHP, with AMGA as metadata service
  • collections had a fixed schema; grid storage only
  • the APIs were not very “RESTy”
• gLibrary 2.0 is an evolution and has been rewritten from scratch
  • it’s a different product that can do anything gLibrary 1.0 can do, plus:
    • support for many storage back-ends
    • on-demand repository and collection creation

References
• Official documentation:
  • https://csgf.readthedocs.io/en/latest/glibrary/docs/glibrary2.html
• Source code and installation instructions:
  • https://github.com/csgf/glibrary
• Contacts:
  • [email protected] - [email protected]
  • [email protected] (Lead developer)

Live Demo

Summary and conclusions
• gLibrary 2.0 is an API platform
  • Provides REST APIs to create repositories, collections, items and replicas
  • Can expose datasets from local and remote databases
  • Can be easily integrated into any kind of application using HTTP requests
  • Supports relational and non-relational databases, and both Grid and Cloud storage servers

Thank you!
sci-gaia.eu
[email protected]
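
As an illustration of the URI scheme and the Authorization header described in the slides, the following Python sketch builds (but does not send) requests for repositories, collections, items and replicas. The helper names `resource_url` and `build_request` are hypothetical and not part of gLibrary; the base URL and the token are the ones shown in the curl example, and the JSON payload fields are invented for illustration, since the slides do not specify a request body format.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Base URL taken from the curl example in the slides.
BASE_URL = "http://glibrary.ct.infn.it:3500/v2/repos"


def resource_url(repo=None, collection=None, item_id=None, replica_id=None):
    """Build a path like <base>/<repo>/<collection>/<item>/_replicas/<replica>.

    Hypothetical helper: each level is optional, but a nested resource
    cannot be addressed without all of its parents.
    """
    parts = [BASE_URL]
    parent_missing = False
    levels = ((None, repo), (None, collection), (None, item_id),
              ("_replicas", replica_id))
    for prefix, value in levels:
        if value is None:
            parent_missing = True
            continue
        if parent_missing:
            raise ValueError("a nested resource needs all of its parent resources")
        if prefix:
            parts.append(prefix)
        parts.append(quote(str(value), safe=""))
    return "/".join(parts)


def build_request(method, url, token, payload=None):
    """Prepare an HTTP request carrying the token in the Authorization header.

    The JSON body is an assumption: the slides only say that items are
    key-value pairs in JSON format.
    """
    headers = {"Authorization": token}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(url, data=data, headers=headers, method=method)


if __name__ == "__main__":
    token = "Fsw6tUVzNwp4ftzK4cb3WxwKkvMZ"  # sample token from the slides
    # GET the list of repositories, as in the curl example.
    list_repos = build_request("GET", resource_url(), token)
    # POST a new item to a collection; the "title" field is made up.
    new_song = build_request("POST", resource_url("demo", "music"), token,
                             {"title": "example song"})
    # DELETE a replica of an item.
    drop_replica = build_request(
        "DELETE", resource_url("demo", "invoices", 1432, "i2jgi34jg34"), token)
    for req in (list_repos, new_song, drop_replica):
        print(req.get_method(), req.full_url)
```

To actually send one of these requests, it can be passed to `urllib.request.urlopen`; this requires a reachable gLibrary server and a valid token, so the sketch only prints the method and URL.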