Transcript
Distributed Databases for ATLAS
Stefan Stonjek
04-Jul-2006
Stefan Stonjek: LCG 3D

Outline
• LCG 3D Project: Distributed Deployment of Databases for LCG
• COOL: Conditions Database Project
• Other database applications

Client access patterns
• Main applications: Reconstruction, Simulation
• Access patterns
    • Read from file: experiment data, physics model
    • Read from database: file catalogue, geometry, conditions
    • Write to file: reconstructed data, simulated data
    • Write to database: file catalogue
• Some minor applications (calibration) write to the conditions database
• Geometry, conditions: high volume; file catalogue: low volume
• No need for instantaneous replication
• Model for conditions database: write only at Tier-0, replicate to the Tier-1s and from there to the Tier-2s
    • Geometry changes less often and can be deployed as files
• File catalogue is a more localized issue, not covered in this talk
    • File catalogue is local for two experiments but distributed for the two others

Tiers, Resources and Level of Service
• Different requirements and service capabilities for different tiers
• Tier1 (database backbone)
    • High volume, often complete replication of RDBMS data
    • Can expect good network connection to other T1 sites
    • Asynchronous, possibly multi-master replication
    • Large scale central database service, local DBA team
• Tier2
    • Medium volume, often only sliced extraction of data
    • Asymmetric, possibly only uni-directional replication
    • Part-time administration (shared with fabric administration)
• Tier3/4 (e.g. laptop extraction)
    • Support fully disconnected operation
    • Low volume, sliced extraction from T1/T2
• Need to deploy several replication/distribution technologies
    • Each addressing specific parts of the distribution problem
    • But all together forming a consistent distribution model

Possible Service Architecture
[Diagram: tiered database service. Labels in order of appearance: T0 (autonomous reliable service); T1 (db backbone, all data replicated, reliable service); T2 (local db cache); T3/4; subset data, only local service. Nodes are marked O or M. Legend: Oracle Streams, Cross vendor extract, MySQL Files, Proxy Cache.]

Possible distribution technologies
• Vendor native distribution: Oracle replication and related technologies
    • Table-to-table replication via asynchronous update streams
    • Transportable tablespaces
    • Little (but non-zero) impact on application design
    • Potentially extensible to other back-end databases through an API
    • Evaluations done at FNAL and CERN
• Combination of HTTP-based database access with web proxy caches close to the client
    • Performance gains
        • Reduced real database access for largely read-only data
        • Reduced transfer overhead compared to low-level SOAP RPC based approaches
    • Deployment gains
        • Web caches (e.g. squid) are much simpler to deploy than databases and could remove the need for a local database deployment on some tiers
        • No vendor-specific database libraries on the client side
        • "Firewall friendly" tunneling of requests through a single port

Multi Tier Computing for LHC
[Diagram: Tier-1 sites (GridKa, IN2P3, Brookhaven, TRIUMF, ASCC, Nordic, Fermilab, RAL, CNAF, PIC, SARA) surrounded by Tier-2 sites.]
• T2s and T1s are inter-connected by the general purpose research networks
• Any Tier-2 may access data at any Tier-1
• Dedicated 10 Gbit links

COOL
• COOL: conditions database toolkit
• Allows easy handling of conditions data on a relational database
• Is not relational itself
• Data divided into
    • Folders (read: subdetector)
    • Channels
    • IOVs (intervals of validity)
• Still some performance issues

COOL performance
• COOL performance testing is done in a way which is optimal for COOL but not close to a real-world scenario
• ATLAS should provide numbers about the planned COOL / conditions database usage
• Because of the internal COOL structure
    • Do not use many folders
    • Loop over Folder, Channel, IOV
    • Sounds strange

ATLAS
• ATLAS is using the COOL system
• Sometimes ATLAS has to keep using an old version of COOL
    • COOL had no backward compatibility
    • New versions are sometimes unstable
• ATLAS has to learn how to use COOL in an optimal way

Conclusion
• 3D, COOL and ATLAS are on a good path
• One has to ensure that they work well together
• Need COOL tests which reflect the ATLAS usage pattern
• Need information from ATLAS on how it wants to use COOL (with numbers)
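
Illustration (not part of the slides)
The COOL slide describes conditions data as divided into folders, channels and IOVs. The sketch below shows one way such a layout could be modelled in memory, to make the Folder, Channel, IOV lookup order concrete. It is not the COOL API: the class names, the folder name "/PIXEL/HV", the payload fields and the half-open [since, until) interval convention are all assumptions made for this example.

```python
from bisect import bisect_right


class Channel:
    """Payloads of one channel, ordered by IOV start (hypothetical model)."""

    def __init__(self):
        self.sinces = []    # IOV start values, ascending
        self.untils = []    # IOV end values (exclusive)
        self.payloads = []  # one payload per IOV

    def store(self, since, until, payload):
        # Assumption: IOVs arrive in increasing order and never overlap.
        if until <= since:
            raise ValueError("empty or inverted IOV")
        if self.untils and since < self.untils[-1]:
            raise ValueError("IOV overlaps the previous one")
        self.sinces.append(since)
        self.untils.append(until)
        self.payloads.append(payload)

    def find(self, t):
        # Binary search for the last IOV starting at or before t.
        i = bisect_right(self.sinces, t) - 1
        if i >= 0 and t < self.untils[i]:
            return self.payloads[i]
        return None


class ConditionsStore:
    """Folder name -> channel id -> Channel (hypothetical model)."""

    def __init__(self):
        self.folders = {}

    def store(self, folder, channel_id, since, until, payload):
        channels = self.folders.setdefault(folder, {})
        channels.setdefault(channel_id, Channel()).store(since, until, payload)

    def find(self, folder, channel_id, t):
        # Lookup order follows the slide: Folder, then Channel, then IOV.
        channel = self.folders.get(folder, {}).get(channel_id)
        return channel.find(t) if channel is not None else None


store = ConditionsStore()
store.store("/PIXEL/HV", 1, 0, 100, {"hv": 150.0})
store.store("/PIXEL/HV", 1, 100, 200, {"hv": 155.0})
print(store.find("/PIXEL/HV", 1, 42))
print(store.find("/PIXEL/HV", 1, 100))
```

In this toy model a lookup first resolves the folder, then the channel, and only then searches the IOVs, so a client reading many channels at one point in time repeats the outer steps for each of them. Whether that mirrors the cost profile of the real COOL schema is exactly the kind of question the slides ask ATLAS to answer with usage numbers.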