Secure Grid Data Management Technologies in ATLAS
Miguel Branco (CERN)
D. Malon, A. Vaniachine (ANL)
CHEP 2004

Overview
Introduction
ATLAS Production System
ATLAS Databases
Conclusion

29/09/2004 M. Branco, D. Malon, A. Vaniachine - CHEP 2004

Introduction
Security “requirements” from a Data Challenges production manager:
  o A production manager cares about its data
    • Not so much about the underlying middleware security
  o we want our data to be available
  o we don’t want to lose our data
  o we don’t want it corrupted
  o we want to remain good friends with site managers
  o we don’t want to upset our physicists with security restrictions
  o we want to be in charge of the production and of the data
  o we want to audit data usage and data access
… from a physicist:
  o “don’t bother me”
  o No visible security
  o Especially no grid certificates
  o Especially not wanting to request or renew grid certificates

ATLAS Data Challenges
ATLAS decided to undertake a series of Data Challenges in order to validate its Computing Model, its software, its data model
Started summer 2004:
  o ATLAS DC-2
  o Unsupervised production across many sites spread over three different Grids (US Grid3, NorduGrid, LCG-2)
Introduced the new ATLAS Automatic Production System (see #501):
  o 4 major components:
    • Production Database
    • Windmill – ATLAS Production Supervisor
    • Job Executors – one executor per “grid-flavor”
    • Common Data Management system – Don Quijote (see #142)
ATLAS Production System
Brief overview:
  o Windmill (the production supervisor) connects to the Oracle Database at CERN
    • Several Windmill instances running world-wide
    • On-going work regarding the usage of GSI and Jabber
  o Job executors connect to Windmill using Jabber
    • Each grid Executor has the user certificate of the production manager
  o Job executors interact with grid middleware and with the data management service using grid certificates
  o Windmill interacts with the data management service without grid certificates

ATLAS Production System
[Diagram: Prod. DB, Windmill, Don Quijote, and one Executor per grid (“a grid…”, “Another grid…”)]

Don Quijote
Access service for grid file-based data
  o High-level interface for grid data management for the ATLAS Automatic Production System
  o Allows transparent registration and movement of replicas between all grid “flavors” used by ATLAS
    • Across different grid “islands” as well as within a given grid
    • US Grid3, NorduGrid and LCG-2

Don Quijote
Security model:
  o Client
    • secure version (GSI) and insecure version
  o Server
    • Either forwards user credentials (if the client is using the secure version)
    • Or acts on behalf of the user using a service certificate
When does the server do what?
  o Security needs depend on the action being taken, e.g.:
    • Search requests can be done with the service certificate if the end-user didn’t supply credentials
    • All other requests require a secure client
    • BUT: the service certificate can still be used
      – Terrible, but pragmatic decision

Problems encountered during ATLAS DC
ATLAS has some long jobs (over 24 hours)
  o Using MyProxy server on LCG
    • Had some downtime which affected LCG production
Access is all-or-nothing: either full access or none is granted
  o The new LCG File Catalog supports ACLs that are mapped to individual user accounts – does this scale?
Replica catalogs from all 3 grids:
  o Not supporting namespaces, ACLs, …
  o Any user can change the LCG catalog without any security requirements
  o Looking forward to new replica catalogs

Problems encountered during ATLAS DC
LCG:
  o ATLAS jobs being submitted as atlassgm
  o The problem was pointed out, but no one ever reported it as a possible security breach
  o Castor@CERN accessible from the “grid” and from outside the “grid”
    • Very useful but…
Grid3 and NorduGrid:
  o Entire ATLAS Data Challenges production ran on behalf of a few users
    • Only NorduGrid complained about the existence of the Don Quijote service certificate for data access
“Single” ATLAS “VO”
  o Not effectively used across 3 grids!!

Status on security for ATLAS Production System
ATLAS Production System will be reviewed soon:
  o Security will be taken into consideration for all components
  o So far, the need to have a working system as quickly as possible delayed the process
Security “gaps” derive mostly from the existing grid middleware
Overall there is still a bit to go, but the usage of client tools provided by the Production System protects end-users
  o Don Quijote client tools to access data files produced by the Data Challenges

What do we need?
ATLAS will need to develop grid services on top of the existing ones
Experiments need:
  o Middleware (or at least a set of guidelines) on how to develop and host secure services
  o Hoped to get this quickly from EGEE
  o And from ARDA as well
Not clear yet – no clear “standards”
  o WS-Security components not completed yet
  o GSI delegation not supported commercially
  o Grid AuthZ needs to be standards-based
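The Don Quijote security model described earlier in the talk (forward the user's credentials when the secure client is used, allow searches under the service certificate otherwise) can be sketched as a small decision function. This is only an illustration under stated assumptions: the names `Request`, `choose_credential` and `SERVICE_CERT`, and the action strings, are hypothetical and do not come from the Don Quijote code. The "pragmatic" use of the service certificate for non-search requests is deliberately not modelled.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical placeholder standing in for the server's own service certificate.
SERVICE_CERT = "service-certificate"


@dataclass
class Request:
    action: str                # hypothetical action names, e.g. "search", "register", "move"
    user_proxy: Optional[str]  # delegated user credential; None when the insecure client is used


def choose_credential(req: Request) -> str:
    """Return the credential the server would act with for this request."""
    if req.user_proxy is not None:
        # Secure (GSI) client: forward the user's own credentials.
        return req.user_proxy
    if req.action == "search":
        # Read-only lookup without user credentials: fall back to the service certificate.
        return SERVICE_CERT
    # Every other action requires a secure client.
    raise PermissionError(f"action {req.action!r} requires a secure (GSI) client")
```

For example, `choose_credential(Request("search", None))` yields the service certificate, while `Request("move", None)` is rejected.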
File-based data
Despite current limitations, the grid security model and data transport mechanisms are suited for handling file-based data
This is also true for database-resident file cataloguing and file-level metadata:
  o stored in grid-based Replica Location catalogs and the respective Metadata catalogs

Databases and the Grid
In addition to file-based event data, LHC data processing applications traditionally require access to large amounts of valuable non-event data stored in relational databases
  o detector conditions, calibrations, etc.
  o For that purpose the ATLAS Data Challenges exercise the Computing Model, processing and managing data on three different grid flavors
In contrast to the file-based data, this database-resident data flow has to be detailed further

ATLAS Data Challenge 2 Database Infrastructure
[Figure: ATLAS Data Challenge 2 database infrastructure]

Securing Database-resident data on the Grid
ATLAS is evaluating several technologies for securing database-resident data:
  o Secure grid query engine technologies federating heterogeneous databases on the grid
    • used at Fermilab Run II experiments
  o Methods utilizing the GSI data-transport channel for database services delivery to the grid clusters behind closed firewalls
  o Grid certificate authorization technologies for database access control, where the safety features are pushed into the database engine code

Database Grid Solution
A prototype of the Database Grid Solution is in use in Fermilab’s Run II Data Handling system, servicing millions of queries per day
Penetrating Firewalls
ATLAS applications require open TCP/IP channels to the database servers
  o To deliver database-resident data
To harness Grid computing resources that are not dedicated to ATLAS, one must address the problem of data delivery to the computing nodes on clusters behind closed firewalls
As partial solutions, in ATLAS we are implementing:
  o database server replica deployment on a dedicated node behind the firewall
  o Network address translation (NAT) techniques providing TCP/IP conduits to the listed database server ports/IP addresses
All of these require considerable involvement of the cluster support personnel
An alternative using GSI data-transfer channels, which requires no changes to the cluster configuration, is presented

Secure GSI Transport Channel
[Diagram: Extract-Transport-Install. Main Server: Extract & Transport. Replica Servers: Transport & Install]
MySQL simplified the delivery of the extract-transport-install components of the ATLAS database
This provides the database services needed for the Data Challenges for sites with Grid Compute Elements behind closed firewalls
  o some sites on Grid3 and NorduGrid

Database Access on the Grid
Two different security models (and schools)
  o Using a separate server:
    • Spitfire (EDG WP2) – SOAP/XML text-only data transport
    • DAI (IBM UK) – Spitfire technologies + XML binary extensions
    • Perl DBI database proxy (ALICE) – SQL data transport
    • Oracle 10g (separate authorization layer)
  o Integrated in the database server:
    • Instead of surrounding the database with external secure layers, the safety features are embedded inside the code
      – Open-source databases (MySQL, PostgreSQL)
      – IBM DB2 loadable security modules techniques
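The extract-transport-install flow from the Secure GSI Transport Channel slide can be sketched as three command lines. The slides do not say which tools were used, so this is an assumption-laden illustration: it supposes a `mysqldump` extract on the main server, a GridFTP copy with `globus-url-copy` as the GSI transport, and a `mysql` client load on the replica server. The function name, database name, paths and host are hypothetical, and the function only builds the commands without running them.

```python
from typing import List


def eti_commands(db: str, dump_path: str, remote_url: str,
                 replica_dump_path: str) -> List[List[str]]:
    """Build (but do not run) the three steps as argument lists.

    The first two commands would run on the main server, the third on the
    replica server behind the firewall. All tool choices are assumptions.
    """
    # Extract: dump the database to a file on the main server.
    extract = ["mysqldump", "--single-transaction",
               f"--result-file={dump_path}", db]
    # Transport: copy the dump over a GSI-authenticated GridFTP channel.
    # dump_path must be absolute so that the URL reads file:///...
    transport = ["globus-url-copy", f"file://{dump_path}", remote_url]
    # Install: load the dump into the replica database server.
    install = ["mysql", db, "-e", f"source {replica_dump_path}"]
    return [extract, transport, install]
```

A caller could pass each list to `subprocess.run(cmd, check=True)` on the appropriate host. Building the commands separately keeps the sketch testable without a grid environment.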
External Security
Providing security features in a separate layer
  o Advantages:
    • Proven traditional approach, used everywhere
  o Disadvantages:
    • Weak database authorization techniques behind the secure layer
    • Clear-text passwords embedded in the code
    • Limited control over the secure transport channel, cryptographic handshake overhead for every gSOAP message
    • Requires protocol extensions (XML with binary attachments)

Embedded Security
Instead of surrounding the database with external secure layers, the safety features are embedded inside the code
  o Advantages:
    • Elimination of the clear-text passwords
    • Integration of the same grid security model throughout all data flow channels
    • Inefficient data transfer bottlenecks are eliminated
  o Disadvantages:
    • Pushing secure authorization into the database engine results in a monolithic system, and such systems are known to be more fragile

Grid-enabling Databases
A grid-enabled MySQL server is deployed on the database development server tier in ATLAS
  o Certificate authorization is supported directly on the database server
  o Only authentication must be encrypted; the TCP/IP data-transfer channel can be used in un-encrypted mode (for data transfer efficiency)
The technology was used extensively in DC2 pre-production on Grid3
  o grid-proxy certificate authorization was used for processing of 7K jobs
In addition, the use of certificate credentials provided an efficient locking mechanism to support the chaotic mode of job submission on the Grid
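The slides do not describe how certificate credentials were used for locking, so the following is only one plausible sketch: each job row records the certificate subject (DN) of whoever claimed it, and a conditional `UPDATE` makes the claim atomic, so that uncoordinated ("chaotic") submitters cannot take the same job twice. It uses Python's built-in `sqlite3` in place of the grid-enabled MySQL server. The table layout, function names and DN strings are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the production job table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, owner_dn TEXT)")
    # Two unclaimed jobs: owner_dn is NULL until someone claims them.
    conn.executemany("INSERT INTO jobs (id, owner_dn) VALUES (?, NULL)",
                     [(1,), (2,)])
    conn.commit()
    return conn


def claim_job(conn: sqlite3.Connection, job_id: int, subject_dn: str) -> bool:
    """Try to lock a job for the given certificate subject.

    The WHERE clause only matches an unclaimed job, so exactly one
    claimant can succeed; everyone else sees zero rows updated.
    """
    cur = conn.execute(
        "UPDATE jobs SET owner_dn = ? WHERE id = ? AND owner_dn IS NULL",
        (subject_dn, job_id),
    )
    conn.commit()
    return cur.rowcount == 1
```

With this scheme the first call to `claim_job` for a given job returns `True` and any later call for the same job returns `False`, regardless of which DN is presented.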
Status on security for ATLAS Databases
To overcome database access limitations one must go beyond the existing grid infrastructure
We are evaluating the technologies laying the foundation of a new hyperinfrastructure:
  o Secure grid query engine technologies federating heterogeneous databases on the grid
  o Methods utilizing the Grid Security Infrastructure data-transport channel for database services delivery to the grid clusters behind closed firewalls
  o Grid certificate authorization technologies for database access control, where the safety features are pushed into the database engine code

Conclusion
ATLAS is taking security seriously into consideration
Two main areas where security is being addressed:
  o File-based data
  o Database-resident data
Security improvements for file-based data depend on new grid middleware; help is needed from grid middleware providers
Developments on secure database access are also on-going; looking into new projects such as LCG3D
Urgent need of grid certificates for all users
  o where it all begins…