HEPiX Storage Task Force
Roger Jones, Lancaster
CHEP06, Mumbai, February 2006

Mandate
– Examine the current LHC experiment computing models.
– Attempt to determine the data volumes, access patterns and required data security for the various classes of data, as a function of Tier and of time.
– Consider the current storage technologies, their prices in various geographical regions and their suitability for various classes of data storage.
– Attempt to map the required storage capacities to suitable technologies.
– Formulate a plan to implement the required storage in a timely fashion.

Membership
• Roger Jones, Lancaster, ATLAS, [email protected]
• Andrew Sansum, RAL, [email protected]
• Bernd Panzer / Helge Meinhard, CERN, [email protected]
• David Stickland (latterly), CMS
• Peter Malzacher, GSI Tier-2, ALICE, [email protected]
• Andrei Maslennikov, CASPUR, [email protected]
• Jos van Wezel, GridKA, HEPiX, [email protected]
• Shadow 1, [email protected]
• Shadow 2, [email protected]
• Vincenzo Vagnoni, Bologna, LHCb, [email protected]
• Luca dell'Agnello
• Kors Bos, NIKHEF, by invitation
Thanks to all members!

Degree of Success
• Assessment of Computing Model
  – RJ shoulders the blame for this area!
  – The Computing TDRs help; see the many talks at this conference
  – Estimates of contention etc. are rough; toy simulations are exactly that, and this area needs improvement beyond the lifetime of the task force (a sketch of such a toy model follows this list)
• Disk
  – Thorough discussion of disk issues
  – Recommendations, prices etc.
• Archival media
  – Less complete discussion
  – Final reporting at the April HEPiX/GDB meeting in Rome
• Procurement
  – Useful guidelines to help Tier-1 and Tier-2 procurement
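To make the toy-simulation caveat concrete, a minimal sketch in Python of the kind of contention model meant here; the job counts, stream rates and arrival pattern are invented for illustration and are not the task force's figures.

```python
import random

def simulate_day(n_jobs=200, server_bw_mbs=100.0, stream_bw_mbs=2.0):
    """Toy contention model: jobs arrive at random hours and each reads
    one stream; when total demand exceeds the server's bandwidth, every
    stream on the server is throttled equally."""
    demand = [0] * 24                      # concurrent streams per hour
    for _ in range(n_jobs):
        start = random.randrange(24)       # arrival hour, uniform
        length = random.randint(1, 4)      # job length in hours
        for hour in range(start, min(start + length, 24)):
            demand[hour] += 1
    # Effective per-stream rate each hour, capped by the server bandwidth
    rates = [min(stream_bw_mbs, server_bw_mbs / d) if d else stream_bw_mbs
             for d in demand]
    return demand, rates

demand, rates = simulate_day()
print(f"peak concurrent streams: {max(demand)}")
print(f"worst per-stream rate: {min(rates):.2f} MB/s")
```

Even a model this crude shows how quickly per-stream rates collapse at peak hours, which is why more detailed simulation work is recommended below.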
Outcome
• Interim document available through the GDB
• Current high-level recommendations
  – It is recommended that a better information exchange mechanism be established between (HEP) centres to mutually improve purchase procedures.
  – An annual review should be made of the storage technologies and prices, and a report made publicly available.
  – Particular further study of archival media is required, and tests should be made of the new technologies emerging.
  – A similar regular report is required for CPU purchases. This is motivated by the many Tier-2 centres now making large purchases.
  – Note that the lead time from announcement to effective deployment of new technologies is up to a year.
  – The computing models assume that archived data is available at the time of attempted processing. This implies that the software layer allows pre-staging and pinning of data.

Inputs
• Informed by the C-TDRs and computing model documents
  – Have tried to estimate contention etc., but this requires much more detailed simulation work
  – Have looked at data classes and associated storage/access requirements, but this could be taken further
    • E.g. models often provide redundancy on disk, but some sites assume they still need to back disk to tape in all cases
  – Have included bandwidths to MSS from the LHC4 exercise, but more detail would be good

Storage Classes
1) tape, archive, possibly offline (vault), access > 2 days, 100 MB/s
2) tape, online in library, access > 1 hour, 400 MB/s
3) disk, any type, in front of tape caches
4) disk, SATA type optimised for large files, sequential read-only I/O
5) disk, SCSI/FC type optimised for small files, random read/write I/O
6) disk, high speed and reliability, RAID 1 or 6 (catalogues, home directories etc.)

Disk
• Two common disk types
  – SCSI/FibreChannel
    • Higher speed and throughput
    • Somewhat longer lifetime (~4 years)
    • More expensive
  – SATA (II)
    • Cheaper
    • Available in storage arrays
    • Lifetime > 3 years (judging by warranties!)
    • RAID 5 gives fair data security, but could still leave 10 TB per PB unavailable on any given day
    • RAID 6 looks more secure
      – Some good initial experiences
      – Care needed with drive and other support
• Interconnects
  – Today
    • SATA (300 MB/s): good for disk to server, point to point
    • Fibre Channel (400 MB/s): high-speed I/O interconnect, fabric
  – Soon (2006)
    • Serial Attached SCSI (SAS, multiple 300 MB/s)
    • InfiniBand (IBA, 900 MB/s)

Architectures
• Direct Attached Storage
  – Disk is directly attached to the CPU
  – Cheap, but administration is costly
• Network Attached Storage
  [Slide diagram: client nodes 1..n reaching a storage controller over an IP network]
  – File servers on an Ethernet network
  – Access by file-based protocols
  – Slightly more expensive, but a smaller number of dedicated nodes
  – Storage in a box: servers have internal disks
  – Storage out of the box: fibre- or SCSI-connected
• Storage Area Networks
  – Block, not file, transport
  – Flexible and redundant paths, but expensive

Disk Data Access
• Access rates
  – 50 streams per RAID group, or 2 MB/s per stream on a 1 Gbit interface
  – Double this for SCSI
• Can be impaired by
  – The software interface/SRM
  – Non-optimal hardware configuration (CPU, kernel, network interfaces)
• Recommend 2x the nominal interfaces for read and 3x the nominal for write (a worked sizing example follows this slide)
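As a worked example of the rule of thumb above: a sketch assuming the figures quoted on this slide (roughly 2 MB/s per stream on a 1 Gbit interface, with 2x nominal provisioning for reads and 3x for writes); the server load and write fraction are invented for illustration.

```python
import math

def interfaces_needed(aggregate_mbs, write_fraction,
                      read_factor=2.0, write_factor=3.0,
                      iface_mbs=125.0):
    """Count the 1 Gbit interfaces (~125 MB/s nominal each) a disk
    server needs, provisioning 2x nominal capacity for reads and
    3x nominal for writes, as the slide recommends."""
    read_mbs = aggregate_mbs * (1.0 - write_fraction) * read_factor
    write_mbs = aggregate_mbs * write_fraction * write_factor
    return math.ceil((read_mbs + write_mbs) / iface_mbs)

# 50 streams at 2 MB/s gives 100 MB/s aggregate; assume 20% of it is writes
print(interfaces_needed(aggregate_mbs=100.0, write_fraction=0.2))  # -> 2
```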
Disk Recommendations
• Storage in a box (DAS/NAS disks together with server logic in a single enclosure)
  – Most storage for a fixed cost
  – More experience with large SATA + PCI RAID deployments is desirable
  – More expensive solutions may require less labour and/or be more reliable (experiences differ)
  – High-quality support may be the deciding factor
• Recommendation
  – Sites should declare the products they have in use
    • A possible central place would be the central repository set up at hepix.org
  – Where possible, experience with trial systems should be shared (the Tier-1s and CERN have a big role here)

Procurement Guidelines
• These come from H. Meinhard
• Many useful suggestions for procurement
• May need to be modified to fit local rules

Disk Prices
• DAS/NAS: storage in a box (disks together with server logic in a single enclosure)
  – 13,500-17,800 € per usable 10 TB
• SAN/S: SATA-based storage systems with a high-speed interconnect
  – 22,000-26,000 € per usable 10 TB
• SAN/F: FibreChannel/SCSI-based storage systems with a high-speed interconnect
  – ~55,000 € per usable 10 TB
• These numbers are reassuringly close to those from the PASTA reviews, but note that there is a spread arising from geography and other circumstances
• Evolution (raw disks)
  – Expect a Moore's-law density increase of 1.6x per year between 2006 and 2010
  – Also consider the effect of an increase of only 1.4x per year
  – Cost reduction of 30-40% per annum (a rough projection sketch is given at the end of this note)

Tape and Archival
• This area is ongoing and needs more work
  – Procurements are less frequent
• Disk system costs approach active tape system costs by ~2008
• Note that the computing models generally only assume archive copies at the production site
• Initial price indications are similar to the LCG planning projections
  – 40 CHF/TB for the medium
  – 25 MB/s effective scheduled bandwidth
  – Drive + server is 15 kCHF - 35 kCHF
  – Effective throughput is much lower for chaotic usage
  – A 6000-slot silo is ~500 kCHF
• New possibilities include spin-on-demand disk etc.
  – Needs study by the T0 and T1s; this should start now
  – It would be brave to change immediately

Plans
• The group is now giving more consideration to archival
  – Need to do more on archival media
  – General need for more discussion of storage classes
  – More detail to be added on computing model operational details
• Final report in April
• Further task forces needed every year or so
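Rough projection sketch referred to from the Disk Prices slide: a minimal illustration assuming a constant 30-40% annual cost reduction as quoted there. The 2006 starting price is taken as the midpoint of the DAS/NAS range; none of the outputs are task-force figures.

```python
def project_price(price_2006_eur, annual_reduction, last_year=2010):
    """Project the cost of a usable 10 TB unit forward, assuming a
    constant fractional price reduction per year."""
    prices = {}
    price = float(price_2006_eur)
    for year in range(2006, last_year + 1):
        prices[year] = round(price)
        price *= 1.0 - annual_reduction
    return prices

# Midpoint of the DAS/NAS range, 13,500-17,800 EUR per usable 10 TB
for cut in (0.30, 0.40):
    print(f"{int(cut * 100)}%/year:", project_price(15650, cut))
```

At a 30-40% annual reduction, the 2006 price falls by roughly a factor of 4 to 8 by 2010, which is the scale of change behind the disk-versus-tape crossover estimate above.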