TAG Monitoring, Performance and Scalability
Florbela Viegas, CERN ADP
22/05/2017

Monitoring

The path to performance is aided by monitoring. There are two sources of information:
- Watch -> Log what the users are doing.
- Ask -> Understand what the usage patterns are and will be, by knowing the users and knowing the data.

If we can't see what users are doing, we don't know what to improve.

At this time, we watch the usage at several levels:
- Database SQL
- Web services statistics (awstats)
- Service usage queries and statistics (Logging WebService)

TAG Logging Service

Logging activity from: iELSSI, Extract, Event Lookup /GUID Counting, Histogramming

What's being logged: use cases (which action?), logical queries, DB connections, timing, users, etc.

Analysis jobs on raw logging data:
- Aggregate data per collection / per run -> which collections/runs/passes etc. are accessed?

Example statistics:
- Number of (distinct) users per time period
- Number of queries per service/deployment per time period
- Data popularity
- Usage of resources

What can be taken out of it:
- Optimization of data distribution
- Performance of sites -> improved site selection

An interface to the logging information is under development.

Performance

Database performance (see https://savannah.cern.ch/task/?19056):
- SQL has been regularly analyzed to check execution plans.
- Services have implemented improved queries.
- Optimizer_index_cost_adj is lowered in a logon trigger for better optimizer plans.
- 11g databases have produced better plans thanks to better default statistics gathering: better global partitioned statistics, good histogram buckets.

Service performance:
- Services have been streamlined for network traffic, with result caching in memcached and data pruning at database level instead of client level.
- Firefox 4.0 has improved JavaScript speed: huge gains for ELSSI Web.
- Further improvements depend on knowledge of usage patterns: which are the most used bits?
  Which are the most used attributes? What can we cache?

Scalability

TAG is a true distributed database. The data is currently spread across 5 sites.

Distributed storage capacities:

DB                          Used    Free    Total
ATLARC                      10 TB   10 TB   20 TB
PIC                          4 TB    1 TB    5 TB
DESY                         7 TB   10 TB   17 TB
RAL                          2 TB    8 TB   10 TB
TRIUMF                       7 TB   20 TB   27 TB
TAG Distributed Global DB   30 TB   49 TB   79 TB

TAG Data Distribution

[Diagram: sites CERN, DESY, PIC, TRIUMF, RAL and CNAF, with ELSSI Suite, COMA DB and TASK DB instances. Data labels: All Data except Monte Carlo; Monte Carlo and some recent data; Most Recent Data (no MC); Most Recent Data (prepare for MC); Future Site (June 2011).]

TAG Geo Coverage

- Europe & West Asia: CERN ELSSI Suite
- Americas, East Asia & Australia: TRIUMF ELSSI Suite

Scalability & Performance

- Catalog integration in the services has given them the capability to parallelize queries across remote sites.
- We can take advantage of this to query faster, even at the expense of more data redundancy across sites.
- The ELSSI Suite of services is now in a position to be "distributed aware" and to take full advantage of the data and service catalog.
- Queries can be broken down across servers and parallelized for optimization.
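
The idea of breaking a query down across sites and running the pieces in parallel could be sketched roughly as follows. This is only an illustration under stated assumptions, not the ELSSI implementation: the site names are taken from the capacity table, but the SITE_DATA catalog, the run numbers, the event counts and the functions plan, query_site and parallel_count are all hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the data catalog: run number -> event count per site.
# Site names come from the capacity table; runs and counts are invented.
SITE_DATA = {
    "ATLARC": {1001: 50, 1002: 70},
    "DESY": {1002: 70, 1003: 20},
    "TRIUMF": {1003: 20, 1004: 90},
}


def query_site(site, runs):
    """Hypothetical per-site query: event counts for the requested runs held there."""
    held = SITE_DATA[site]
    return {run: held[run] for run in runs if run in held}


def plan(runs):
    """Assign each requested run to the first site in the catalog that holds it."""
    assignment = {}
    for run in runs:
        for site, held in SITE_DATA.items():
            if run in held:
                assignment.setdefault(site, []).append(run)
                break  # redundant copies elsewhere are not queried
    return assignment


def parallel_count(runs):
    """Split the request per site, query the sites concurrently, merge the results."""
    assignment = plan(runs)
    merged = {}
    with ThreadPoolExecutor(max_workers=max(1, len(assignment))) as pool:
        futures = [
            pool.submit(query_site, site, site_runs)
            for site, site_runs in assignment.items()
        ]
        for future in futures:
            merged.update(future.result())
    return merged


if __name__ == "__main__":
    print(parallel_count([1001, 1003, 1004]))
```

In this sketch a run stored at more than one site is served by the first site that lists it. A real planner could instead use the redundancy to balance load or to prefer the best performing site, which is where the logged site performance statistics would come in. Runs that no site holds are silently dropped here.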