Grids: A Reality Check
Mark Hayes
GEFD Summer School 2003

It’s not just compute cycles...
An exponential growth in data from many areas of science.

4 types of Grid
• CPU intensive cycle scavenging (SETI@home)
• Data sharing
• Application provision
• Human-human interaction (e.g. Access Grid)

SETI@home
The world’s most powerful computer delivered 52 Tflop/s yesterday
(Earth Simulator is 35 Tflop/s, sum of top 2-10 is 60 Tflop/s)

Latest Stats
http://setiathome.ssl.berkeley.edu/totals.html
6th July 2003

                            Total                       Last 24 Hours
Users                       4,570,474                   1,226
Results received            944 M                       1.1 M
Total CPU time              1.5 M years                 1,226 years
Floating Point Operations   3 E+21 ops (3 zetta ops)    4.5 E+18 flops/day (52 Tflop/s)

The data explosion - some big numbers
• CFD turbulence simulations - 100TB
• BaBar particle physics experiment - 1TB/day
• CERN LHC will generate 1GB/s or 10PB/year
• VLBA radio telescope generates 1GB/s today
• NCBI/EMBL database is “only 0.5TB” but doubling each year
• brain imaging - 4TB/brain at full colour, 10µm resolution (4PB/brain at 1µm i.e. cellular resolution)
• Pixar - 100TB/movie
FTP and GREP are not adequate (Jim Gray)

Application provision
• Google - 10K cpus, 2PB database (2 years ago)
• free email services - HotMail, Yahoo! 2-10PB storage
• netsolve - numerical algorithms on demand with Matlab & Mathematica plugins
• renderfarm.net - graphics rendering on demand

The Access Grid
High end video conferencing and collaboration technology. O(100) nodes world wide.
[Figure: Access Grid node, labelled: presenter mic, presenter camera, ambient mic (tabletop), audience camera]
“...one of the most compelling glimpses into the future I’ve seen since I first saw NCSA Mosaic.” Larry Smarr

£1 buys...
• 1 day of cpu time
• 4 GB ram for a day
• 1 GB of network bandwidth
• 1 GB of disk storage
• 10 M database accesses
• 10 TB of disk access (sequential)
• 10 TB of LAN bandwidth (bulk)

How do you move a terabyte?
Source: TeraScale SneakerNet, Jim Gray et al.

Context      Speed (Mbps)   Rent ($/month)   $/Mbps   $/TB sent   Time/TB
Home phone   0.04           40               1,000    3,086       6 years
Home DSL     0.6            70               117      360         5 months
T1           1.5            1,200            800      2,469       2 months
T3           43             28,000           651      2,010       2 days
OC3          155            49,000           316      976         14 hours
OC192        9,600          1,920,000        200      617         14 minutes
100 Mbps     100                                                  1 day
Gbps         1,000                                                2.2 hours

Some consequences
• Compute cycles are (almost) free... by comparison with network costs.
• The cheapest and fastest way to move 1TB of data out from CERN is still by FedEx.
• This considers only bandwidth; low-latency networks are even more expensive! (MPI over WAN doesn’t work well.)

What makes a good Grid application?
• A distributed community of users.
• Tiny network input & output, huge compute requirement.
• Database access & storage is also expensive, therefore put the computation near the data.

The Grid in the UK
• Pilot projects in particle physics, astronomy, medicine, bioinformatics, environmental sciences...
• Contributing to international Grid software development efforts
• 10 regional “eScience Centres”

The European DataGrid
• Tiered structure: Tier0=CERN
• Lots of their own Grid software
• Applications: particle physics, earth observation, bioinformatics
http://www.eu-datagrid.org/

NASA Information PowerGrid
• First “production quality” Grid
• Linking NASA & academic supercomputing at 10 sites
• Applications: computational fluid dynamics, meteorological data mining, Grid benchmarking
http://www.ipg.nasa.gov/

TeraGrid
• Linking supercomputers through a high-speed network
• 4 x 10 Gbps between SDSC, Caltech, Argonne & NCSA
• Call for proposals out for applications & users
http://www.teragrid.org/

Asia-Pacific Grid
• No central source of funding
• Informal, bottom-up approach
• Lots of experiments on benchmarking & bio apps.
http://www.apgrid.org/

What does it take to build a Grid?
• Resources - CPU, network, storage
• People - sysadmins, application developers, Grid experts
• Grid Middleware - Globus, Condor, Unicore…
• Security - so you want to use my computer?
• Maintenance - ongoing monitoring, upgrades… and co-ordination of this between multiple sites
• Applications and users!

How you can get involved...
• NIEeS - http://www.niees.ac.uk/
• National eScience Centre (Edinburgh) http://www.nesc.ac.uk/
• Your local eScience Centre
• Adopt an application!

Questions?
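
Appendix: checking the arithmetic
The figures in "How do you move a terabyte?" and the SETI@home rate can be recomputed from first principles. The sketch below is not from the talk. It assumes 1 TB = 10^12 bytes and a 30-day billing month; both are guesses, although they do reproduce the $/TB column to the nearest dollar. The function and constant names are made up for illustration.

```python
# Back-of-envelope check of the deck's transfer-time and cost figures.
# Assumptions (not stated in the slides): 1 TB = 10**12 bytes, 1 month = 30 days.

SECONDS_PER_DAY = 86_400
SECONDS_PER_MONTH = 30 * SECONDS_PER_DAY
BITS_PER_TERABYTE = 8 * 10**12


def seconds_per_terabyte(speed_mbps):
    """Time to push 1 TB through a link of the given speed, ignoring latency."""
    return BITS_PER_TERABYTE / (speed_mbps * 1e6)


def dollars_per_terabyte(rent_per_month, speed_mbps):
    """Line rental paid while the link is busy sending 1 TB."""
    return rent_per_month * seconds_per_terabyte(speed_mbps) / SECONDS_PER_MONTH


def sustained_tflops(ops_per_day):
    """Convert a daily operation count into an average rate in Tflop/s."""
    return ops_per_day / SECONDS_PER_DAY / 1e12


# (context, speed in Mbps, rent in $/month) as listed in the table
LINKS = [
    ("Home phone", 0.04, 40),
    ("Home DSL", 0.6, 70),
    ("T1", 1.5, 1_200),
    ("T3", 43, 28_000),
    ("OC3", 155, 49_000),
    ("OC192", 9_600, 1_920_000),
]

for name, mbps, rent in LINKS:
    hours = seconds_per_terabyte(mbps) / 3600
    cost = dollars_per_terabyte(rent, mbps)
    print(f"{name:<11} {hours:>12.1f} hours/TB   ${cost:>8.0f}/TB")

print(f"SETI@home: {sustained_tflops(4.5e18):.0f} Tflop/s sustained")
```

The cost per terabyte changes far less than the time per terabyte across these links, which is the point behind "compute cycles are (almost) free by comparison with network costs".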