HEPiX Storage Task Force
Roger Jones, Lancaster
CHEP06, Mumbai, February 2006

Mandate
– Examine the current LHC experiment computing models.
– Attempt to determine the data volumes, access patterns and required data security for the various classes of data, as a function of Tier and of time.
– Consider the current storage technologies, their prices in various geographical regions and their suitability for various classes of data storage.
– Attempt to map the required storage capacities to suitable technologies.
– Formulate a plan to implement the required storage in a timely fashion.

February 13th 2006

Membership
• Roger Jones, Lancaster, ATLAS – [email protected]
• Andrew Sansum, RAL – [email protected]
• Bernd Panzer / Helge Meinhard, CERN – [email protected]
• David Stickland (latterly), CMS
• Peter Malzacher, GSI Tier-2, ALICE – [email protected]
• Andrei Maslennikov, CASPUR – [email protected]
• Jos van Wezel, GridKA, HEPiX – [email protected]
• Shadow 1 – [email protected]
• Shadow 2 – [email protected]
• Vincenzo Vagnoni, Bologna, LHCb – [email protected]
• Luca dell’Agnello
• Kors Bos, NIKHEF (by invitation)
Thanks to all members!

Degree of Success
• Assessment of computing models
  – RJ shoulders the blame for this area!
  – Computing TDRs help – see the many talks at this conference
  – Estimates of contention etc. are rough; toy simulations are exactly that, and we need to improve this area beyond the lifetime of the task force
• Disk
  – Thorough discussion of disk issues
  – Recommendations, prices etc.
• Archival media
  – Less complete discussion
  – Final reporting in April at the HEPiX/GDB meeting in Rome
• Procurement
  – Useful guidelines to help Tier-1 and Tier-2 procurement

Outcome
• Interim document available through the GDB
• Current high-level recommendations
  – It is recommended that a better information exchange mechanism be established between (HEP) centres to mutually improve purchase procedures.
  – An annual review should be made of the storage technologies and prices, and a report made publicly available.
  – Particular further study of archival media is required, and tests should be made of the new technologies emerging.
  – A similar regular report is required for CPU purchases. This is motivated by the many Tier-2 centres now making large purchases.
  – People should note that the lead time from announcement to effective deployment of new technologies is up to a year.
  – It is noted that the computing models assume that archived data is available at the time of attempted processing. This implies that the software layer allows pre-staging and pinning of data.

Inputs
• Informed by C-TDRs and computing model documents
  – Have tried to estimate contention etc., but this requires much more detailed simulation work
  – Have looked at data classes and associated storage/access requirements, but this could be taken further
    • E.g. models often provide redundancy on disk, but some sites assume they still need to back disk to tape in all cases
  – Have included bandwidths to MSS from the LHC4 exercise, but more detail would be good

Storage Classes
1) Tape, archive, possibly offline (vault); access > 2 days; 100 MB/s
2) Tape, online in library; access > 1 hour; 400 MB/s
3) Disk, any type, in front of tape (caches)
4) Disk, SATA type, optimised for large files, sequential read-only IO
5) Disk, SCSI/FC type, optimised for small files, read/write random IO
6) Disk, high speed and reliability, RAID 1 or 6 (catalogues, home directories etc.)

Disk
• Two common disk types
  – SCSI/FibreChannel
    • Higher speed and throughput
    • Slightly longer lifetime (~4 years)
    • More expensive
  – SATA (II)
    • Cheaper
    • Available in storage arrays
    • Lifetime > 3 years (judging by warranties!)
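One rough way to compare the two disk families is to fold purchase price and expected lifetime into an amortised cost per usable TB-year. The sketch below uses the per-10-TB price bands quoted in the Disk Prices section of this report and the warranty-based lifetimes above; the midpoint price and exact lifetimes are illustrative assumptions, not vendor quotes.

```python
# Amortised cost comparison of SATA storage-in-a-box vs FC/SCSI SAN.
# Prices: per-usable-10-TB figures from the Disk Prices section.
# Lifetimes: ~3-4 years, per the warranty-based estimates (assumed values).

def cost_per_tb_year(price_eur_per_10tb, lifetime_years):
    """Amortised cost of one usable TB for one year of service."""
    return price_eur_per_10tb / 10.0 / lifetime_years

sata_das = cost_per_tb_year(15650, 3.5)   # midpoint of 13500-17800 EUR, ~3.5 y life
fc_san   = cost_per_tb_year(55000, 4.0)   # ~55000 EUR, ~4 y life

print(f"SATA storage-in-a-box: ~{sata_das:.0f} EUR per usable TB-year")
print(f"FC/SCSI SAN:           ~{fc_san:.0f} EUR per usable TB-year")
```

Even granting the longer SCSI/FC lifetime, the amortised gap stays large, which is consistent with the recommendation below to favour storage-in-a-box where the workload allows it.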
• RAID 5 gives fair data security
  – Could still have 10 TB per PB unavailable on any given day
• RAID 6 looks more secure
  – Some good initial experiences
  – Care needed with drive and other support
• Interconnects
  – Today
    • SATA (300 MB/s) – good for disk to server, point to point
    • Fibre Channel (400 MB/s) – high-speed IO interconnect, fabric
  – Soon (2006)
    • Serial Attached SCSI (SAS – multiple 300 MB/s)
    • InfiniBand (IBA, 900 MB/s)

Architectures
• Direct Attached Storage
  – Disk is directly attached to the CPU
  – Cheap, but administration is costly
• Network Attached Storage
  [diagram: clients on an IP network reaching storage nodes 1..n via a storage controller]
  – File servers on an Ethernet network
  – Access by file-based protocols
    • Slightly more expensive, but a smaller number of dedicated nodes
  – Storage in a box – servers have internal disks
  – Storage out of box – fibre or SCSI connected
• Storage Area Networks
  – Block, not file, transport
  – Flexible and redundant paths, but expensive

Disk Data Access
• Access rates
  – 50 streams per RAID group, or 2 MB/s per stream on a 1 Gbit interface
  – Double this for SCSI
• Can be impaired by
  – Software interface/SRM
  – Non-optimal hardware configuration (CPU, kernel, network interfaces)
• Recommend 2 × nominal interfaces for read and 3 × nominal for write

Disk Recommendations
• Storage in a box (DAS/NAS disks together with server logic in a single enclosure)
  – Most storage for a fixed cost
  – More experience with large SATA + PCI RAID deployments desirable
  – More expensive solutions may require less labour / be more reliable (experiences differ)
  – High-quality support may be the deciding factor
• Recommendation
  – Sites should declare the products they have in use
    • A possible central place would be the central repository set up at hepix.org
  – Where possible, experience with trial systems should be shared (Tier-1s and CERN have a big role here)

Procurement Guidelines
• These come from H. Meinhard
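The disk data access rules of thumb above (about 2 MB/s per stream on a 1 Gbit interface, with 2 × nominal capacity for reads and 3 × for writes) can be turned into a quick provisioning estimate. The helper below is a sketch under those stated assumptions; it is not a tool from the task force.

```python
# Estimate how many nominal 1 Gbit interfaces a disk server needs,
# given the per-stream rate and safety factors quoted in the slides.
import math

GBIT_MBPS = 125.0    # nominal 1 Gbit/s expressed in MB/s
STREAM_MBPS = 2.0    # ~2 MB/s per stream (rule of thumb from the slides)

def interfaces_needed(n_streams, write=False):
    """Interfaces required so nominal capacity is 2x (read) or 3x (write) demand."""
    demand = n_streams * STREAM_MBPS
    factor = 3 if write else 2
    return math.ceil(demand * factor / GBIT_MBPS)

print(interfaces_needed(50))              # 50 read streams  -> 2 interfaces
print(interfaces_needed(50, write=True))  # 50 write streams -> 3 interfaces
```

So a single RAID group serving its nominal 50 streams already wants multiple bonded or separate interfaces, which is why non-optimal network configuration shows up above as a common cause of impaired access rates.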
• Many useful suggestions for procurement
• May need to be modified to fit local rules

Disk Prices
• DAS/NAS: storage in a box (disks together with server logic in a single enclosure)
  – 13500-17800 € per usable 10 TB
• SAN/S: SATA-based storage systems with high-speed interconnect
  – 22000-26000 € per usable 10 TB
• SAN/F: FibreChannel/SCSI-based storage systems with high-speed interconnect
  – ~55000 € per usable 10 TB
• These numbers are reassuringly close to those from the PASTA reviews, but note that there is a spread due to geography and other circumstances
• Evolution (raw disks)
  – Expect a Moore's-Law density increase of 1.6×/year between 2006 and 2010
  – Also consider the effect of an increase of only 1.4×/year
  – Cost reduction of 30-40% per annum

Tape and Archival
• This area is ongoing and needs more work
  – Less frequent procurements
• Disk system costs approach active tape system costs by ~2008
• Note that the computing models generally only assume archive copies at the production site
• Initial price indications are similar to LCG planning projections
  – 40 CHF/TB for the medium
  – 25 MB/s effective scheduled bandwidth
  – Drive + server is 15-35 kCHF
  – Effective throughput is much lower for chaotic usage
  – A 6000-slot silo is ~500 kCHF
• New possibilities include spin-on-demand disk etc.
  – Needs study by the T0 and T1s; should start now
  – It would be brave to change immediately

Plans
• The group is now giving more consideration to archival
  – Need to do more on archival media
  – General need for more discussion of storage classes
  – More detail to be added on computing model operational details
• Final report in April
• Further task forces needed every year or so
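The disk price evolution figures quoted earlier (30-40% cost reduction per annum from the 2006 baseline) imply a simple compound projection. The sketch below applies them to the DAS/NAS price band; it is an illustration of the quoted rates only, not a market forecast, and real prices will vary with geography and procurement conditions.

```python
# Project the DAS/NAS price band (13500-17800 EUR per usable 10 TB in 2006)
# under the quoted 30-40% per-annum cost reduction, out to 2010.

def project(price_2006, annual_reduction, year):
    """Compound the annual cost reduction from the 2006 baseline price."""
    return price_2006 * (1 - annual_reduction) ** (year - 2006)

for year in range(2006, 2011):
    low  = project(13500, 0.40, year)   # optimistic end: 40%/year reduction
    high = project(17800, 0.30, year)   # conservative end: 30%/year reduction
    print(f"{year}: {low:7.0f} - {high:7.0f} EUR per usable 10 TB")
```

Under these assumptions the usable-10-TB price falls by roughly a factor of four to seven over the planning window, which is the effect that makes disk competitive with active tape systems by around 2008, as noted above.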