* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Introduction to High Performance and Grid Computing
Survey
Document related concepts
Transcript
Enablingand Grids for Computing E-sciencE Introduction to High Performance Grid Faculty of Sciences, University of Novi Sad High Performance Cluster and Grid Computing io n a l Gr id I n a d e mi c a n d Ac S e o f er b ia EGEE-III INFSO-RI-222667 it t iv Feb. 06, 2009 www.euegee.org u t ca ia Ed Antun Balaz, [email protected] Scientific Computing Laboratory Institute of Physics Belgrade Serbia A E G I S Introduction to High Performance and Grid Computing Overview Enabling Grids for E-sciencE • • • • • Introduction to clusters High performance computing Grid computing paradigm Ingredients for Grid development Introduction to Grid middleware EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Parallel computing Enabling Grids for E-sciencE • Splitting problem in smaller tasks that are executed concurrently • Why? – Absolute physical limits of hardware components (speed of light, electron speed, …) – Economical reasons –more complex = more expensive – Performance limits –double frequency <> double performance – Large applications –demand too much memory & time • Advantages: Increasing speed & optimizing resources utilization • Disadvantages: Complex programming models – difficult development EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Parallelism levels Enabling Grids for E-sciencE • CPU – Multiple CPUs – Multiple CPU cores – Threads –time sharing • Memory – Shared – Distributed – Hybrid (virtual shared memory) EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Parallel architectures (1) Enabling Grids for E-sciencE • Vector machines – – – – – CPU processes multiple data sets shared memory advantages: performance, programming difficulties issues: scalability, price examples: Cray SV, NEC SX, Athlon3/d, PentiumIV/SSE/SSE2 • Massively parallel processors (MPP) – – – – – large number of CPUs distributed memory advantages: scalability, price issues: performance, programming difficulties examples: ConnectionSystemsCM1 i CM2, GAAP (GeometricArrayParallel Processor) EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Parallel architectures (2) Enabling Grids for E-sciencE • Symmetric Multiple Processing (SMP) – – – – – two or more processors shared memory advantages: price, performance, programming difficulties issues: scalability examples: UltraSparcII, Alpha ES, Generic Itanium, Opteron, Xeon, … • Non Uniform Memory Access (NUMA) – – – – – Solving SMP’sscalability issue hybrid memory model advantages: scalability issues: price, performance, programming difficulties examples: SGI Origin/Altix, Alpha GS, HP Superdome EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Clusters Enabling Grids for E-sciencE • Poor’s man supercomputer “…Collection of interconnected stand-alone computers working together as a single, integrated computing resource”– R. Buyya • Cluster consists of: – – – – Nodes Network OS Cluster middleware • Standard components – Avoiding expensive proprietary components EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Cluster classification Enabling Grids for E-sciencE • High performance clusters (HPC) – Parallel, tightly coupled applications • High throughput clusters (HTC) – Large number of independent tasks • High availability clusters (HA) – Mission critical applications • Load balancing clusters – Web servers, mail servers, … • Hybrid clusters – Example: HPC+HA EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Beowulf clusters Enabling Grids for E-sciencE • 1994 – T. Sterling & M. Baker – NASA Ames Centre • Frontend – Access machine – JMS & Monitoring server – Shared storage –NFS (directory /home) • Nodes – Multiple private networks – Local storage (/scratch) • Private networks – High speed / low latency EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing From clusters to Grids Enabling Grids for E-sciencE • Many distributed computing resources (clusters) exist, even in Serbia • Problem 1: they cannot be used by end users transparently • Problem 2: even when access is granted to users to several clusters, they tend to neglect smaller clusters • Problem 3: distribution of input/output data, sharing of data between clusters • To overcome such problems, Grid paradigm was introduced EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Unifying concept: Grid Enabling Grids for E-sciencE Resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations. EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Effective policy governing access within a collaboration Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing What problems Grid addresses Enabling Grids for E-sciencE • Too hard to keep track of authentication data (ID/password) across institutions • Too hard to monitor system and application status across institutions • Too many ways to submit jobs • Too many ways to store & access files/data • Too many ways to keep track of data • Too easy to leave “dangling” resources lying around (robustness) EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Requirements Enabling Grids for E-sciencE • • • • • • • Security Monitoring/Discovery Computing/Processing Power Moving and Managing Data Managing Systems System Packaging/Distribution Secure, reliable, on-demand access to data, software, people, and other resources (ideally all via a Web Browser!) EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Why Grid security is hard (1) Enabling Grids for E-sciencE • Resources being used may be valuable & the problems being solved sensitive – Both users and resources need to be careful • Dynamic formation and management of user groups – Large, dynamic, unpredictable… • Resources and users are often located in distinct administrative domains - Cannot assume cross-organizational trust agreements – Different mechanisms & credentials EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Why Grid security is hard (2) Enabling Grids for E-sciencE • Interactions are not just client/server, but service-to-service on behalf of user – Requires delegation of rights user service – Services may be dynamically instantiated • Standardization of interfaces to allow for discovery, negotiation and use • Implementation must be broadly available & applicable – Standard, well-tested, well-understood protocols; integrated with wide variety of tools • Policy from sites, user communities and users need to be combined – Varying formats • Want to hide as much as possible from applications! EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Grids and VOs (1) Enabling Grids for E-sciencE • Virtual organizations (VOs) are groups of Grid users (authenticated through digital certificates) • VO Management Service (VOMS) serves as a central repository for user authorization information, providing support for sorting users into a general group hierarchy, keeping track of their roles,etc. • VO Manager, according to VO policies and rules, authorizes authenticated users to become VO members EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Grids and VOs (2) Enabling Grids for E-sciencE • Resource centers (RCs) may support one or more VOs, and this is how users are authorized to use computing, storage and other Grid resources • VOMS allows flexible approach to A&A on the Grid EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing User view of the Grid Enabling Grids for E-sciencE •User Interface •User Interface •Grid services EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Ingredients for GRID development Enabling Grids for E-sciencE • Right balance of push and pull factors is needed • Supply side Technology – inexpensive HPC resources (linux clusters) Technology – network infrastructure Financing – domestic, regional, EU, donations from industry • Demand side Need for novel eScience applications Hunger for number crunching power and storage capacity EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Supply side - clusters Enabling Grids for E-sciencE • • • • • • The cheapest supercomputers – massively parallel PC clusters This is possible due to: Increase in PC processor speed (> Gflop/s) Increase in networking performance (1 Gbs) Availability of stable OS (e.g. Linux) Availability of standard parallel libraries (e.g. MPI) Widespread choice of components/vendors, low price (by factor ~5-10) Long warranty periods, easy servicing Simple upgrade path Good knowledge of parallel programming is required Hardware needs to be adjusted to the specific application (network topology) More complex administration Advantages: Disadvantages: Tradeoff: brain power purchasing power The next step is GRID: Distributed computing, computing on demand Should “do for computing the same as the Internet did for information” (UK PM, 2002) EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Supply side - network Enabling Grids for E-sciencE • Needed at all scales: World-wide Pan-European (GEANT2) Regional (SEEREN2, …) National (NREN) Campus-wide (WAN) Building-wide (LAN) • Remember – it is end user to end user connection that matters EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing GÉANT2 Pan-European IP R&E network Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing GÉANT2 Global Connectivity Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Future development: regional network Enabling Grids for E-sciencE Budapest Oradea Cluj-Napoca Szeged Targo-Mures Arad Subotica Timisoara Brasov Novi-Sad Resita Belgrade Brcko Derventa Turnu Severin Bjeljina Doboj Banja Luka Pitesti Bucharest Sabac Zvornik Vlasenica Sarajevo Ploiesti Slatina Craiova Ruse Kragujevac Nis Pirot Sevlievo Sofia Vranje Plovdiv Skopje Titov Veles Tirana Prilep Xanthi Bitola Ohrid Elbasan Gjirokastra Drama Serres Edessa Tepelene Korce Florina Beroia Ioannina Kardzali Komotini Thessaloniki Larissa Lamia Mytilini Preveza Agrinio Patra Veliko Tarnovo Livadia Chios Athens Samos Syros Rhodos Chania EGEE-III INFSO-RI-222667 Iraklio Introduction to High Performance and Grid Computing Supply side - financing Enabling Grids for E-sciencE • National funding (Ministries responsible for research) Lobby gvnmt. to commit to Lisbon targets Level of financing should be following an increasing trend (as a % of GDP) Seek financing for clusters and network costs Networking (HIPERB) Action Plan for R&D in SEE FP6 – IST priority, eInfrastructures & GRIDs FP7 CARDS • • • • • Bilateral projects and donations Regional initiatives EU funding Other international sources (NATO, …) Donations from industry (HP, SUN, …) EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Demand side - eScience Enabling Grids for E-sciencE • Usage of computers in science: Trivial: text editing, elementary visualization, elementary quadrature, special functions, ... Nontrivial: differential eq., large linear systems, searching combinatorial spaces, symbolic algebraic manipulations, statistical data analysis, visualization, ... Advanced: stochastic simulations, risk assessment in complex systems, dynamics of the systems with many degrees of freedom, PDE solving, calculation of partition functions/functional integrals, ... • Why is the use of computation in science growing? Computational resources are more and more powerful and available (Moore’s law) Standard approaches are having problems Experiments are more costly, theory more difficult Emergence of new fields/consumers – finance, economy, biology, sociology • Emergence of new problems with unprecedented storage and/or processor requirements EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Demand side - consumers Enabling Grids for E-sciencE • Those who study: Complex discrete time phenomena Nontrivial combinatorial spaces Classical many-body systems Stress/strain analysis, crack propagation Schrodinger eq; diffusion eq. Navier-Stokes eq. and its derivates functional integrals Decision making processes w. incomplete information … Adequate training in mathematics/informatics Stamina needed for complex problems solving • Who can deliver? Those with: • Answer: rocket scientists (natural sciences and engineering) EGEE-III INFSO-RI-222667 Introduction to High Performance and Grid Computing Scenario Enabling Grids for E-sciencE “User stderr.txt interface” stdout.txt Status / log query Job Submit Event Logging and bookkeeping EGEE-III INFSO-RI-222667 Submit Input “sandbox” Get output Output “sandbox” stderr.txt stdout.txt publish state Job status update • STD input stream is read from file • STD out and err. streams are redirected into files A worker node is allocated by the local jobmanager stderr.txt stdout.txt /bin/hostname Computing Element Introduction to High Performance and Grid Computing