Tier1A Status
Martin Bly
28 April 2003

CPU Farm
• Older hardware:
  – 108 dual processors (450, 600 and 1GHz)
  – 156 dual processor 1400MHz PIII
• Recent delivery:
  – 80 dual 2.66GHz P4 Xeon
  – 533MHz FSB, 2GB memory
• Next delivery expected in the summer

Operating Systems
• Operating Systems:
  – Redhat 6.2 service will close in May
  – Redhat 7.2 service has been in production for Babar for 6 months.
  – New Redhat 7.3 service now available for LHC/other experiments
• Increasing demands for security updates becoming problematic.

Disk Farm (last year)
• Last year: 26 servers, each with 2 external RAID arrays, 1.7TB disk per server:
  – Excellent performance, well balanced system
  – Problems with a bad batch of Maxtor drives: many failures and high error rate; all 620 drives now replaced by Maxtor.
  – Still outstanding problems with Accusys controller failing to eject bad drives from RAID set.

Disk Farm (this year)
• Recent upgrade to disk farm.
  – 11 dual P4 servers (with PCIx), each with 2 Infortrend IFT-6300 arrays
  – 12 Maxtor 200GB Diamondmax Plus 9 drives per array.
• Not yet in production, but a few snags:
  – Original tendered Maxtor: Maxline Plus II drive was found not to exist.
  – Infortrend array has 2TB limit per RAID set: some (10%) wasted space!
• Nick White ([email protected]) for more info

New Projects
• Basic fabric performance monitoring (ganglia)
• Resource CPU accounting (based on PBS accounts/mysql)
• New CA in production
• New batch scheduler (MAUI)
• Deploy new helpdesk (May)

Ganglia Monitoring
• Urgently needed live performance and utilisation monitoring
  – RAL Ganglia Monitoring (live)
  – RAL Ganglia Monitoring (Static)
• Scalable solution based on multicast
• Very rapidly deployable, reasonable support on all Tier1A Hardware
• See: http://ganglia.sourceforge.net/

PBS Accounting Software
• Need to keep track of system CPU and disk usage.
• Home grown PBS accounting package (Derek Ross):
  – Upload PBS and disk stats into MySQL
  – Process with Perl DBI script
  – Serve via Apache
• http://www.gridpp.rl.ac.uk/stats
• Contact Derek ([email protected]) for more info.

MAUI/PBS
• Maui scheduler has been in production for the last 3 months.
• Allows extremely flexible scheduling with many features. But ...
  – Not all of it works: we have done much work with the developers on fixes.
  – Major problem: Maui schedules on wall clock time, not CPU time. Had to bodge it!!

New Helpdesk Software
• Old helpdesk is mail-based and unfriendly.
• With additional staff, we urgently need to deploy a new solution.
• Expect the new system to be based on free software, probably Request Tracker.
• Hope that the deployed system will also meet the needs of the Testbed and may also satisfy Tier 2 sites.
• Expect deployment by end of May.
• http://requestracker.gridpp.rl.ac.uk/ (Static)

Outstanding Issues/Worries
• We have to run many distinct services, for example FERMI Linux, RH 6.2/7.2/7.3, EDG testbeds, LCG ...
• Farm management is getting very complex. We need better tools and automation.
• Security is becoming a big concern again.
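
Illustrative sketch (not the RAL package): the PBS Accounting Software slide describes a flow of uploading PBS statistics into a database and then processing them for display. The Python below shows roughly what such a flow could look like. It is an assumption-laden sketch: the real package uses MySQL and a Perl DBI script, whereas this uses Python's built-in sqlite3 as a stand-in, and the table layout, function names, users, groups and job figures are all hypothetical. The sample lines follow the usual PBS accounting-log layout (timestamp;record type;job id;key=value pairs), but a real site's logs may carry different attributes.

```python
import sqlite3

# Made-up sample records in the PBS accounting-log layout:
# timestamp;record type;job id;space-separated key=value attributes
SAMPLE_LOG = """\
04/28/2003 10:00:00;S;101.pbs;user=alice group=babar queue=prod
04/28/2003 12:00:00;E;101.pbs;user=alice group=babar queue=prod resources_used.cput=01:30:00 resources_used.walltime=02:00:00
04/28/2003 13:00:00;E;102.pbs;user=bob group=lhcb queue=prod resources_used.cput=00:15:00 resources_used.walltime=01:00:00
04/28/2003 14:00:00;E;103.pbs;user=alice group=babar queue=prod resources_used.cput=00:30:00 resources_used.walltime=00:45:00
"""


def hms_to_seconds(text):
    """Convert an HH:MM:SS duration (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_end_records(log_text):
    """Yield (job_id, owner, grp, cpu_s, wall_s) for each job-end ("E") record."""
    for line in log_text.splitlines():
        fields = line.split(";", 3)
        if len(fields) != 4 or fields[1] != "E":
            continue  # skip start records and malformed lines
        attrs = dict(item.split("=", 1) for item in fields[3].split() if "=" in item)
        yield (
            fields[2],
            attrs.get("user", "unknown"),
            attrs.get("group", "unknown"),
            hms_to_seconds(attrs.get("resources_used.cput", "00:00:00")),
            hms_to_seconds(attrs.get("resources_used.walltime", "00:00:00")),
        )


def load_jobs(conn, log_text):
    """Upload parsed job-end records into a (hypothetical) jobs table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "job_id TEXT PRIMARY KEY, owner TEXT, grp TEXT, "
        "cpu_s INTEGER, wall_s INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO jobs VALUES (?, ?, ?, ?, ?)",
        parse_end_records(log_text),
    )
    conn.commit()


def usage_by_group(conn):
    """Return (group, total CPU seconds, total wall-clock seconds) rows."""
    return conn.execute(
        "SELECT grp, SUM(cpu_s), SUM(wall_s) FROM jobs GROUP BY grp ORDER BY grp"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the MySQL server
    load_jobs(conn, SAMPLE_LOG)
    for grp, cpu_s, wall_s in usage_by_group(conn):
        print(f"{grp}: cpu={cpu_s}s wall={wall_s}s efficiency={cpu_s / wall_s:.0%}")
```

Keeping CPU time and wall-clock time side by side per group also makes visible the gap between the two that the MAUI/PBS slide mentions, since a scheduler that counts wall-clock time charges a job for time it spends waiting on I/O.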