Migration of ATLAS PanDA to CERN
Graeme Stewart, Alexei Klimentov, Birger Koblitz, Massimo Lamanna, Tadashi Maeno, Pavel Nevski, Marcin Nowak, Pedro Salgado, Torre Wenaus, Mikhail Titov
Graeme Stewart: ATLAS Computing

Outline
- PanDA Review: PanDA History; PanDA Architecture
- First Steps of the Migration to CERN: Infrastructure Setup; PanDA Monitor; Task Request Database
- Second Phase of the Migration: PanDA Server and Bamboo; Database Bombshells; Migration, Tuning and Tweaks
- Conclusions

PanDA Recent History
- PanDA was developed by US ATLAS in 2005
- It became the executor of all ATLAS production in EGEE: 35k simultaneous running jobs, with 150k jobs per day finished during 2008
- March 2009: it executes production for ATLAS in NDGF as well, using the ARC Control Tower (aCT)
- As PanDA had become central to ATLAS operations, it was decided in late 2008 to relocate it to CERN

PanDA Server Architecture
- PanDA (Production and Distributed Analysis) is a pilot job system
- It executes jobs from the ATLAS production system and from users
- It brokers jobs to sites based on available compute resources and data
- Pilots get jobs and can move and stage data if necessary
- It triggers data movement back to Tier-1s for dataset aggregation
[Architecture diagram: ProdDB and Bamboo feed the Panda Server; the Panda Monitor, Panda Client and Panda Databases attach to the server; a Pilot Factory launches Pilots at the computing sites]

PanDA Monitor
- The PanDA Monitor is the web interface to the PanDA system
- It provides summaries of processing per cloud/site, with drill-down to individual job logs, direct viewing of logfiles, and task status
- It also provides a web interface to request actions from the system: task requests and dataset subscriptions

Task Request Database
- The task request interface is hosted as part of the PanDA monitor
- It allows physicists to define MC production tasks
- Its backend database exists separately from the rest of PanDA
- It was the prime candidate for migration from MySQL at BNL to Oracle at CERN
[Diagram, before migration: AKTR (MySQL) and PandaDB (MySQL) alongside ProdDB (Oracle)]
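The pilot "pull" model described in the architecture slide can be sketched as follows. This is a minimal toy illustration, assuming made-up names (SERVER_QUEUE, get_job, run_pilot, the job fields); it is not the real PanDA client or server API.

```python
# Toy sketch of the PanDA pilot pull model: the server brokers jobs to
# sites, and pilots running on worker nodes ask the server for work.
# All names here are illustrative assumptions, not real PanDA code.

# A toy job queue standing in for jobs already brokered server-side
# (the real brokerage considers available compute resources and data).
SERVER_QUEUE = [
    {"id": 1, "site": "CERN", "payload": "simulate"},
    {"id": 2, "site": "CERN", "payload": "reconstruct"},
]

def get_job(site):
    """A pilot asks the server for a job matching its site."""
    for job in SERVER_QUEUE:
        if job["site"] == site:
            SERVER_QUEUE.remove(job)
            return job
    return None  # nothing brokered to this site

def run_pilot(site):
    """A pilot starts on a worker node, pulls a job, runs it, reports back."""
    job = get_job(site)
    if job is None:
        return "no job"
    # ... stage input data if necessary, run the payload, stage out ...
    return f"job {job['id']} ({job['payload']}) done"

print(run_pilot("CERN"))  # -> job 1 (simulate) done
```

The key property sketched here is that work is pulled by the pilot rather than pushed to the site, so a pilot that finds no job simply exits without wasting a batch slot on a broken payload.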
[Diagram, after migration: AKTR (Oracle) and ProdDB (Oracle), with PandaDB still in MySQL]

Migration – Phase 1
- The target was migration of the task request database and the PanDA monitor
- The first step was to prepare infrastructure for the services: 3 server-class machines to host PanDA monitors
- These were set up as much as possible as standard machines supported by CERN FIO: dual quad-core Intel E5410 CPUs, 16GB RAM, 500GB HDD, Quattor templates, Lemon monitoring, and alarms for host problems
- We also migrated to the ATLAS standard Python environment (Python 2.5, 64-bit)
- We utilise CERN Arbitrating DNS to balance load across all machines: it picks the 2 'best' machines of the 3 using a configurable metric

Parallel Monitors
- PanDA was always architected to have multiple stateless monitors: each monitor queries the backend database to retrieve the user-requested information and display it
- Thus, setting up a parallel monitor infrastructure at CERN was relatively easy once the external dependencies were sorted: ATLAS Distributed Data Management (DDM) and the Grid User Interface tools
- This was deployed at the beginning of December 2008

Task Request Database
- The first real step was to migrate the TR DB from MySQL to Oracle
- This is not quite as trivial as one first imagines: each database supports some non-standard SQL features, and these are not entirely compatible; optimising databases is also quite specific to the database engine
- First attempts ran into trouble: a MySQL dump from BNL to CERN resulted in connections being dropped, so we had to dump the data at BNL and scp it to CERN
- The schema required some cleaning up: dropping unused tables, removing null constraints, converting CLOB to VARCHAR, and resizing some text fields
- However, after a couple of trial migrations we were confident that the data could be migrated in just a couple of hours

Migration
- The migration occurred on Monday December 8th; the database data was migrated in a couple of hours
- Two days were then used to iron out glitches in the Task Request interfaces and in the scripts which manage the Task Request to ProdDB interface
- Could this all have been prepared in advance? In theory yes, but we were migrating a live system, so only a limited amount of test data could be inserted into the system: real tasks trigger real jobs
- The system was live again and accepting task requests on Wednesday
- The latency of tasks in the production system is usually several days, even for short tasks, so this was acceptable to the community

A Tale of Two Infrastructures
- The new PanDA monitor setup required DB plugins to talk both to MySQL and to Oracle: the MySQLdb module is bog standard, but the cx_Oracle module is much less so
- In addition, Python 2.4 was the supported infrastructure at BNL, as opposed to Python 2.5 at CERN
- This meant that after the TR migration the BNL monitors started to have more limited functionality, which had definitely not been in the plan!

PanDA Servers
- Some preliminary work on the PanDA server had already been done in 2008
- However, much still remained to be done to migrate the full suite of PanDA server databases:
  - PandaDB – holds live job information and status (the 'fast buffer')
  - LogDB – holds pilot logfile extracts
  - MetaDB – holds PanDA scheduler information on sites and queues
  - ArchiveDB – the ultimate resting place of any PanDA job (big!)
- For most databases the data volume was minimal and the main work was in the schema details, including the setup of Oracle triggers
- On the infrastructure side we copied the BNL setup, with multiple PanDA servers running on the same machines as the monitors: we knew the load was low and the machines were capable, and the same machine template worked fine
- We also required one server component, Bamboo, which interfaces between the PanDA servers and ProdDB

ArchiveDB
- In MySQL, because of constraints on table performance vs. size, an explicit partitioning had been adopted: one ArchiveDB table for every two months of jobs (Jan_Feb_2007, Mar_Apr_2007, …, Jan_Feb_2009)
- In Oracle, internal partitioning is supported:

    CREATE TABLE jobs_archived (<list of columns>)
    PARTITION BY RANGE(MODIFICATIONTIME) (
      PARTITION jobs_archived_jan_2006 VALUES LESS THAN (TO_DATE('01-FEB-2006','DD-MON-YYYY')),
      PARTITION jobs_archived_feb_2006 VALUES LESS THAN (TO_DATE('01-MAR-2006','DD-MON-YYYY')),
      PARTITION jobs_archived_mar_2006 VALUES LESS THAN (TO_DATE('01-APR-2006','DD-MON-YYYY')),
      …

- This allows a considerable simplification of the client code in the PanDA monitor

Integrate, Integrate, …
- By late February, trial migrations of the databases to integration databases hosted at CERN (the INTR database) had been carried out, and trial jobs had been run through the PanDA server, proving basic functionality
- A decision now had to be made on the final migration strategy: 'big bang' (move the whole system at once) or 'inflation' (gradually migrate clouds one by one)
- Big bang would be easier for, e.g., the PanDA monitor, but would carry greater risks: suddenly loading the system with 35k running jobs was unwise, and if things went very wrong it might leave us with a big mess to recover from
- An external constraint was the ATLAS cosmics reprocessing campaign, due to start on 9th March
- We decided to migrate piecemeal

Final Preparations
- In fact PanDA already had two heads: the IT and CERN clouds had been run from a parallel MySQL setup since early 2008
- This was an expensive infrastructure to maintain, as it did not tap into CERN IT supported services
- It was obvious that migrating these two clouds would be a natural first step
- Plans were made to migrate to the ATLAS production database at CERN (aka ATLR)
- Things seemed to be under control a few days before…

DBAs
- On the Friday before we were due to migrate, the CERN DBAs asked us not to: they were worried that not enough testing of the Oracle setup in INTR had been done
- This triggered a somewhat frantic weekend of work, resulting in several thousand jobs being run through the CERN and IT clouds using the INTR databases; from our side this testing looked successful
- We then reached a compromise: we would migrate the CERN and IT clouds to PanDA running against INTR, and they would start backups on the INTR database, giving us the confidence to run production for ATLAS through this setup
- The subsequent migration from INTR to ATLR could then be achieved much more rapidly, as the data would already be in the correct Oracle formats

Tuning and Tweaking
- Migration of PandaDB, LogDB and MetaDB was very quick; there was one unexpected piece of client code which hung during the migration process (polling of the CERN MySQL servers)
- Migration and index building of ArchiveDB was far slower; however, we disabled access to ArchiveDB and could bring the system up live within half a day
- Since then, a number of small improvements have been made in the PanDA code to optimise the use of Oracle:
  - Connections are much more expensive in Oracle than in MySQL, so the code was restructured to use a connection pool
  - Common reader and writer accounts were created so that all database schemas can be accessed from a single connection
  - Triggers were replaced by the .nextval() sequence syntax
- Despite fears, the migration of the PanDA server to Oracle has been relatively painless and was achieved without significant loss of capacity

Cloud Migration
- The initial migration was for the CERN and IT clouds
- We added NG, the new NorduGrid cloud, from a standing start
- We added DE after a major intervention in which the cloud was taken offline; similarly, TW will come up in the CERN Oracle instance
- UK was the interesting case, where we migrated a cloud live:
  - Switch the Bamboo instance to send jobs to the CERN Oracle servers
  - Current jobs are left being handled by the old Bamboo and servers
  - Start sending pilots to the UK asking for jobs from the CERN Oracle servers
  - Force the failure of jobs not yet started in the old instance; these return to ProdDB and are then picked up again by PanDA via the new Bamboo
  - Old running jobs are handled correctly by the 'old' system
  - There will be a subsequent re-merge into the CERN ArchiveDB

Monitor Blues
- A number of problems did arise in the new monitor setup required for the migrated clouds
- Coincident with the migration there was a repository change from CVS to SVN; however, the MySQL monitor was deployed from CVS and the Oracle monitor from SVN
- This led to a number of accidents and minor confusions which took a while to recover from
- New security features caused some loss of functionality at times, as it was hard to check all the use cases, and the repository problems compounded this
- However, these issues are now mostly resolved, and ultimately the system will in fact become simpler

Conclusions
- The migration of the PanDA infrastructure from BNL to CERN has underlined how difficult the transition of a large-scale, live, distributed computing system is
- A very pragmatic approach was adopted in order to get the migration done in a reasonable time
- Although it always takes longer than you think (this is true even when you try to factor in knowledge of the above)
- Much has been achieved: the monitor and task request database are fully migrated, the CERN PanDA server infrastructure has moved to Oracle, and we are now running 5(6) of the 11 ATLAS clouds: CERN, DE, IT, NG, UK, (TW)
- The remaining migration steps are now a matter of scaling and simplifying
- We learned a lot: love your DBAs, of course, and if we have to do this again, now we know how
- But there is still considerable work to do, mainly in improving service stability, monitoring and support procedures
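The connection-pooling change described under Tuning and Tweaking can be sketched as below: because opening an Oracle connection is expensive, a small set of connections is opened once and reused across requests. ConnectionPool and fake_connect are illustrative stand-ins, not the actual panda server code or the cx_Oracle API.

```python
# Minimal sketch of a connection pool, assuming made-up names throughout:
# the expensive connect() cost is paid once per pooled connection, then
# every subsequent request borrows and returns an existing connection.

import queue

class ConnectionPool:
    def __init__(self, connect, size=2):
        self._pool = queue.Queue()
        for _ in range(size):          # pay the (expensive) connection
            self._pool.put(connect())  # cost once, up front
        self.size = size

    def acquire(self):
        return self._pool.get()        # block until a connection is free

    def release(self, conn):
        self._pool.put(conn)

connections_opened = 0

def fake_connect():
    """Stand-in for an expensive real database connect call."""
    global connections_opened
    connections_opened += 1
    return object()  # stand-in for a real DB connection handle

pool = ConnectionPool(fake_connect, size=2)
for _ in range(100):                   # 100 "requests" reuse 2 connections
    conn = pool.acquire()
    # ... run queries against the reader/writer schemas here ...
    pool.release(conn)

print(connections_opened)  # -> 2
```

Combined with common reader and writer accounts, this lets one pooled connection serve queries against all schemas instead of opening a fresh per-schema connection each time.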