Parallel and Distributed Computing: Clusters and Grids
Information Session
Subject Code: 433-498

Rajkumar Buyya
Grid Computing and Distributed Systems (GRIDS) Lab.
The University of Melbourne
Melbourne, Australia
www.gridbus.org

Scalable HPC: Breaking Administrative Barriers & new challenges
[Figure: performance plotted against administrative barriers]
Administrative Barriers:
• Individual
• Group
• Department
• Campus
• State
• National
• Globe
• Inter Planet
• Galaxy
Platforms: Desktop; SMPs or SuperComputers; Local Cluster; Enterprise Cluster/Grid; Global Cluster/Grid; Inter Planetary Grid!

Why SC? Large Scale Explorations need them: Killer Applications.
Solving grand challenge applications using modeling, simulation and analysis
• Aerospace
• Internet & Ecommerce
• Life Sciences
• CAD/CAM
• Digital Biology
• Military Applications

PART 2: Cluster Architectures

The promise of supercomputing to the average PC User?
HPCC Books, 2 Volumes - Prentice Hall, 1999. Edited by R. Buyya with contributions from over 100 leading researchers (www.buyya.com/cluster/)

Agenda
• Cluster?
• Enabling Tech. & Motivations
• Cluster Architecture
• Cluster Components
• Single System Image
Next Section (after break)
• Case Studies
• Cluster Programming and Application
• Design Resources and Conclusions

Rise and Fall of Computer Architectures
• Vector Computers (VC) - proprietary system: provided the breakthrough needed for the emergence of computational science, but they were only a partial answer.
• Massively Parallel Processors (MPP) - proprietary systems: high cost and a low performance/price ratio.
• Symmetric Multiprocessors (SMP): suffer from limited scalability.
• Distributed Systems: difficult to use and hard to extract parallel performance.
• Clusters - gaining popularity:
  High Performance Computing - Commodity Supercomputing
  High Availability Computing - Mission Critical Applications

Cluster computing: Past, Present, Future
[Figure: timeline with markers 1960, 1980s, 1990, 1995+, 2000+; label: PDA Clusters]

Definition: What is a Cluster?
A cluster is a type of parallel or distributed processing system, which consists of a collection of interconnected stand-alone computers cooperatively working together as a single, integrated computing resource.
A "stand-alone" (whole) computer is one that can be used on its own (full hardware and OS).

So What's So Different about Clusters?
• Commodity Parts?
• Communications Packaging?
• Incremental Scalability?
• Independent Failure?
• Intelligent Network Interfaces?
• Complete System on every node: virtual memory, scheduler, files, …
• Nodes can be used individually or combined...

Cluster Computer Architecture
[Figure: layered cluster architecture, top to bottom]
• Sequential Applications and Parallel Applications
• Parallel Programming Environment
• Cluster Middleware (Single System Image and Availability Infrastructure)
• PC/Workstation nodes, each with Communications Software and Network Interface Hardware
• Cluster Interconnection Network/Switch

Major issues in Cluster design
• Enhanced Performance (performance @ low cost)
• Enhanced Availability (failure management)
• Single System Image (look-and-feel of one system)
• Size Scalability (physical & application)
• Fast Communication (networks & protocols)
• Load Balancing (CPU, Net, Memory, Disk)
• Security and Encryption (clusters of clusters)
• Distributed Environment (Social issues)
• Manageability (admin. and control)
• Programmability (simple API if required)
• Applicability (cluster-aware and non-aware app.)

Scalability Vs. Single System Image
[Figure: scalability plotted against single system image; axis label: UP]

Cluster Applications
• Numerous Scientific & engineering Apps.
• Business Applications: E-commerce Applications (Amazon, eBay ….); Database Applications (Oracle on clusters).
• Internet Applications: ASPs (Application Service Providers); Computing Portals; E-commerce and E-business.
• Mission Critical Applications: command control systems, banks, nuclear reactor control, star-wars, and handling life threatening situations.

Science Portals - e.g., Papia system
RWCP - http://www.rwcp.or.jp/papia/
Papia PC Cluster:
• Pentiums.
• Myrinet.
• NetBSD/Linux.
• PM.
• Score-D.
• MPC++.

Adoption of the Approach

Scalable HPC: Breaking Administrative Barriers & new challenges
[Figure repeated: performance against administrative barriers, from Individual to Galaxy, and from Desktop to Inter Planetary Grid]

Towards Grid Computing

What is Grid?
A paradigm/infrastructure that enables the sharing, selection, & aggregation of geographically distributed (wide-area) resources:
• Computers: PCs, workstations, clusters, supercomputers, laptops, notebooks, mobile devices, PDA, etc;
• Software: e.g., ASPs renting expensive special purpose applications on demand;
• Catalogued data and databases: e.g., transparent access to human genome database;
• Special devices/instruments: e.g., radio telescope (SETI@Home searching for life in galaxy);
• People/collaborators.
[depending on their availability, capability, cost, and user QoS requirements]
for solving large-scale problems/applications, thus enabling the creation of "virtual enterprises" (VEs).

P2P/Grid Applications-Drivers
• Distributed HPC (Supercomputing): Computational science.
• High-Capacity/Throughput Computing: Large scale simulation/chip design & parameter studies.
• Content Sharing (free or paid): Sharing digital contents among peers (e.g., Napster)
• Remote software access/renting services: Application service providers (ASPs) & Web services.
• Data-intensive computing: Drug Design, Particle Physics, Stock Prediction...
• On-demand, realtime computing: Medical instrumentation & Mission Critical.
• Collaborative Computing: Collaborative design, Data exploration, education.
• Service Oriented Computing (SOC): Computing as Competitive Utility: New paradigm, new industries, and new business.

A Typical Grid Computing Environment
[Figure: an Application and a Resource Broker connected to Grid Resource Brokers, Grid Information Services, a database, and resources R1, R2, R3, R4, R5, R6, ..., RN]

Need Grid tools for managing
• Security
• Computational Economy
• Uniform Access
• Resource Discovery
• Resource Allocation & Scheduling
• System Management
• Data locality
• Application Development Tools
• Network Management

[Figure labels: mix-and-match; Object-oriented; Internet/partial-P2P; Network enabled Solvers; Market/Computational Economy]

Many Grid Projects & Initiatives
• Australia: Nimrod-G, GridSim, Virtual Lab, Active Sheets, DISCWorld, ..new coming up
• Europe: UNICORE, MOL, UK eScience, Poland MC Broker, EU Data Grid, EuroGrid, MetaMPI, Dutch DAS, XW, JaWS
• Japan: Ninf, DataFarm
• Korea...: N*Grid
• USA: Globus, Legion, OGSA, Javelin, AppLeS, NASA IPG, Condor-G, Jxta, NetSolve, AccessGrid, and many more...
• Cycle Stealing & .com Initiatives: Distributed.net, SETI@Home, …., Entropia, UD, Parabon, ….
• Public Forums: Global Grid Forum, P2P Working Group, IEEE TFCC, Grid & CCGrid conferences
Grid Computing Projects: http://www.gridcomputing.com

GRIDS Lab @ Melbourne
The Gridbus Vision: To Enable Service Oriented Grid Computing & Business!
[Figure labels: WW Grid; Nimrod-G; World Wide Grid!]

GRIDS Lab @ the U.
of Melbourne, The Gridbus Project: www.gridbus.org
• Grid Economy & Distributed Scheduling (via Nimrod-G Broker): http://www.buyya.com/ecogrid
• GridSim Toolkit: Grid Modeling and Simulation (Java based): http://www.buyya.com/gridsim/
• Libra: Economic Cluster Scheduler: http://www.buyya.com/libra/
• Grid Bank: Accounting, Payment, Enforcement Mechanisms
• World Wide Grid (WWG) testbed: http://www.buyya.com/ecogrid/wwg/
• Application Enabler Projects:
  Virtual Laboratory Toolset for Drug Design
  High-Energy Physics and the Grid Network (HEPGrid)
  Brain Activity Analysis on the Grid
• Cluster and Grid Info Centres: www.buyya.com/cluster/ || www.gridcomputing.com

Nimrod/G: A Grid Resource Broker
A resource broker for managing, steering, and executing task farming (parameter sweep/SPMD model) applications on the Grid, based on deadline and computational economy.
Based on users' QoS requirements, our Broker dynamically leases services at runtime depending on their quality, cost, and availability.
Key Features
• A single window to manage & control experiment
• Persistent and Programmable Task Farming Engine
• Resource Discovery
• Resource Trading
• Scheduling & Predictions
• Generic Dispatcher & Grid Agents
• Transportation of data & results
• Steering & data management
• Accounting

Drug Design: Data Intensive Computing on Grid
[Figure labels: Molecules; Protein; Chemical Databases (legacy, in .MOL2 format)]
It involves screening millions of chemical compounds (molecules) in the Chemical DataBase (CDB) to identify those having potential to serve as drug candidates.
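Screening a compound database is a task-farming workload of the kind described for Nimrod/G above: many independent jobs that should finish within a deadline and a budget. The following is a minimal sketch of that idea only, not Nimrod/G's actual scheduling algorithm. The Resource fields, the resource names and prices, and the greedy cheapest-first rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str             # hypothetical resource label
    jobs_per_hour: float  # assumed throughput of the resource
    cost_per_job: float   # assumed price, in arbitrary grid currency units


def plan(n_jobs, deadline_hours, budget, resources):
    """Greedy cost-minimising plan (an illustrative assumption).

    Fill the cheapest resources first, giving each no more jobs than it
    can finish before the deadline. Return {resource name: job count},
    or None if the deadline or the budget cannot be met.
    """
    allocation = {}
    remaining = n_jobs
    spent = 0.0
    for r in sorted(resources, key=lambda res: res.cost_per_job):
        if remaining == 0:
            break
        # Jobs this resource can complete before the deadline.
        capacity = int(r.jobs_per_hour * deadline_hours)
        take = min(capacity, remaining)
        if take == 0:
            continue
        allocation[r.name] = take
        spent += take * r.cost_per_job
        remaining -= take
    if remaining > 0 or spent > budget:
        return None
    return allocation


if __name__ == "__main__":
    grid = [
        Resource("cheap-cluster", jobs_per_hour=10, cost_per_job=1.0),
        Resource("fast-smp", jobs_per_hour=50, cost_per_job=3.0),
    ]
    # 200 jobs, 5-hour deadline, budget of 1000 units.
    print(plan(200, 5, 1000, grid))
```

With these made-up numbers the cheap resource is filled to its 5-hour capacity (50 jobs) and the remaining 150 jobs go to the faster, more expensive one, for a total cost of 500 units. Lowering the budget to 400 or the deadline to 1 hour makes the plan infeasible, and the function returns None.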
MEG (MagnetoEncephaloGraphy) Data Analysis on the Grid: Brain Activity Analysis
[Collaboration with Osaka University, Japan]
[Figure: numbered workflow (steps 1 to 5) linking the Life-electronics laboratory, AIST (64-sensor MEG, Data Generation), Nimrod-G, and the World-Wide Grid (Data Analysis, Results)]
• Analysis of all pairs (64x64) of MEG data by shifting the temporal region of MEG data over time, 0 to 29750: 64x64x29750 jobs
• Provision of expertise in the analysis of brain function
• Provision of MEG analysis
• [deadline, budget, optimization preference]

A Glance at Nimrod-G Broker
[Figure: Nimrod-G broker architecture, top to bottom]
• Nimrod/G Clients
• Nimrod/G Engine, with Schedule Advisor, Trading Manager, Grid Store, Grid Dispatcher, Grid Explorer
• Grid Middleware: Globus, Legion, Condor, etc.
• Grid Information Server(s)
• Grid nodes, each with RM & TS
Figure abbreviations: TM, TS, GE, GIS. RM: Local Resource Manager, TS: Trade Server.
G: Globus enabled node. L: Legion enabled node. C: Condor enabled node.
See HPCAsia 2000 paper!

Active Sheet: Microsoft Excel Spreadsheet Processing on Grid
[Figure labels: Nimrod Proxy; Nimrod-G; World-Wide Grid]

GridSim Toolkit
A Java based tool for Grid Scheduling Simulations
[Figure: GridSim layered architecture, top to bottom]
• Application, User, Grid Scenario's Input and Results: Application Configuration, Resource Configuration, User Requirements, Grid Scenario, ..., Output
• Grid Resource Brokers or Schedulers
• GridSim Toolkit: Application Modeling, Resource Entities, Information Services, Job Management, Resource Allocation, Statistics
• Resource Modeling and Simulation (with Time and Space shared schedulers): Single CPU, SMPs, Clusters, Load Pattern, Network, Reservation
• Basic Discrete Event Simulation Infrastructure: SimJava, Distributed SimJava
• Virtual Machine (Java, cJVM, RMI)
• Distributed Resources: PCs, Workstations, SMPs, Clusters

Selected GridSim Users!

Alessandro Volta in Paris in 1801, inside the French National Institute, shows the battery in the presence of Napoleon I.
Fresco by N. Cianfanelli (1841) (Zoological Section "La Specola" of the Natural History Museum of Florence University)
What?!?! Oh, mon Dieu!
This is a mad man…
….and in the future, I imagine a Worldwide Power (Electrical) Grid …...
2002 - 1801 = 201 Years

Download Software & Information
• Nimrod & Parametric Computing: http://www.csse.monash.edu.au/~davida/nimrod/
• Economy Grid & Nimrod/G: http://www.buyya.com/ecogrid/
• Virtual Laboratory Toolset for Drug Design: http://www.buyya.com/vlab/
• Grid Simulation (GridSim) Toolkit (Java based): http://www.buyya.com/gridsim/
• World Wide Grid (WWG) testbed: http://www.buyya.com/ecogrid/wwg/
• Cluster and Grid Info Centres: www.buyya.com/cluster/ || www.gridcomputing.com

Further Information
• Books:
  High Performance Cluster Computing, V1, V2, R. Buyya (Ed), Prentice Hall, 1999.
  The GRID, I. Foster and C. Kesselman (Eds), Morgan-Kaufmann, 1999.
• IEEE Task Force on Cluster Computing: http://www.ieeetfcc.org
• Global Grid Forum: www.gridforum.org
• IEEE/ACM CCGrid'xy: www.ccgrid.org
• CCGrid 2002, Berlin: ccgrid2002.zib.de
• Grid workshop: www.gridcomputing.org

Further Information
• Cluster Computing Info Centre: http://www.buyya.com/cluster/
• Grid Computing Info Centre: http://www.gridcomputing.com
• IEEE DS Online - Grid Computing area: http://computer.org/dsonline/gc
• Compute Power Market Project: http://www.ComputePower.com

Final Word?

Backup Slides