Introduction to Grid Computing
© 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011
1-1.1
Grid Computing
• Using geographically distributed and interconnected computers together
for computing and for resource sharing.
“The grid virtualizes heterogeneous geographically disperse resources”
from "Introduction to Grid Computing with Globus," IBM Redbooks
1-1.2
Need to harness computers
Original driving force behind Grid
computing same as behind the early
development of networks that became the
Internet:
– Connecting computers at distributed
sites for high performance computing.
1-1.3
However, Grid computing is about collaborating and resource sharing
as much as it is about high performance computing.
1a-1.4
Virtual Organizations
Grid computing offers
potential of virtual organizations:
– groups of people, both geographically and
organizationally distributed, working
together on a problem, sharing computers
AND other resources such as databases
and experimental equipment.
1a-1.5
Different organizations can supply resources and
personnel.
Concept has many benefits, including:
•Problems that could not be solved previously for
humanity because of limited computing resources
can now be tackled.
Examples
• Understanding the human genome
• Searching for new drugs …
Continued.
1-1.6
• Users can have access to far greater computing resources
and expertise than available locally.
• Inter-disciplinary teams can be formed across different
institutions and organizations to tackle problems that
require expertise of multiple disciplines.
• Specialized localized experimental equipment can be
accessed remotely and collectively.
• Large collective databases can be created to hold vast
amounts of data.
• Unused compute cycles can be harnessed at remote sites,
achieving more efficient use of computers.
1-1.7
Crosses multiple
administrative domains.
• Another hallmark of larger Grid
computing projects.
• Resources being shared are either owned by
members of the virtual organization or
donated by others.
• Introduces technical and
socio-political challenges.
• Requires true collaboration.
1-1.8
• Some key features we regard as indicative
of Grid computing:
– Shared multi-owner computing resources
– Uses Grid computing software, with security and
cross-management mechanisms in place
– Tools to bring together geographically distributed
computers owned by others.
1-1.9
Shared Resources
Can share much more than just computers:
• Storage
• Sensors for experiments at particular sites
• Application Software
• Databases
• Network capacity, …
1-1.10
History of distributed computing
Certainly one can go back a long way to trace the
history of distributed computing.
Types of distributed computing existed in 1960s.
Many people interested in connecting computers
together for high performance computing.
From connecting processors/computers together
locally that began in earnest in the 1960s and 1970s,
distributed computing now extends to connecting
computers that are geographically distant - Grid
computing.
1-1.11
Distributed computing technologies that
underpin Grid computing developed
concurrently and rely upon each other.
Three concurrent interrelated paths:
• Networks
• Computing platforms
• Software techniques
1-1.12
Networks
1960s - Development of packet switched networks.
1969 - ARPANET network became operational.
4 nodes, Univ. of California at Los Angeles, Stanford
Research Institute, Univ. of California at Santa
Barbara, and Univ. of Utah. Design speed of 50
Kbits/sec.
1974 - TCP (Transmission Control Protocol)
1978 - TCP/IP (Transmission Control Protocol/Internet
Protocol).
TCP is a protocol for reliable communication.
IP is for network routing. IP addresses identify hosts on
the Internet. Ports identify end points (processes) for
communication purposes.
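The roles of IP addresses and ports can be seen in a small loopback sketch using Python's standard socket module. The address 127.0.0.1, the function names and the message are illustrative assumptions, not part of the original slides; a real Grid service would listen on a routable address.

```python
import socket
import threading


def serve_once(server_sock):
    """Accept one TCP connection and echo back whatever bytes arrive."""
    conn, _peer = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def echo_roundtrip(message):
    """Send a short message to a local echo server over TCP and return the reply."""
    # The IP address identifies the host (here the loopback interface);
    # port 0 asks the operating system to pick any free port number.
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))
    server_sock.listen(1)
    host, port = server_sock.getsockname()

    worker = threading.Thread(target=serve_once, args=(server_sock,))
    worker.start()

    # The (host, port) pair names the end point the client connects to.
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(message)
        client.shutdown(socket.SHUT_WR)  # signal that no more data follows
        chunks = []
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)

    worker.join()
    server_sock.close()
    return b"".join(chunks)
```

Calling `echo_roundtrip(b"hello grid")` returns the same bytes: TCP delivers them reliably, the IP address selects the host, and the port selects the process on that host.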
Early 1970s - Ethernet for interconnecting computers on
local networks.
Early 1980s - Internet. Uses the TCP/IP protocol.
1990s - World-Wide Web developed on top of the Internet.
Browser and HTML markup language introduced.
1-1.13
Computing Platforms
1960 onwards - Recognized that increased speed could
potentially be obtained by having more than one
processor inside a single computer system
Term "parallel computer" coined to describe such systems.
1970s and 1980s - many parallel computer projects
especially with advent of low cost microprocessors.
1990s - cluster computing, a group of computers
interconnected through a network switch to form a
computing platform.
Commodity computers (PCs) provided cost-effective
solution.
1-1.14
Typical cluster computing configuration
Programming clusters: Message passing programming -- Messages between
processes specified by programmer using message-passing routines:
Late 1980s - early 1990s - PVM (Parallel Virtual Machine)
Mid 1990s - MPI (Message Passing Interface)
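The message-passing style, in which the programmer writes every send and receive explicitly, can be sketched in plain Python. This is an illustration only: threads and `queue.Queue` objects stand in for the separate processes and message-passing routines that PVM or MPI would provide, and all names are hypothetical.

```python
import queue
import threading


def worker(inbox, outbox):
    """Receive one message (a list of numbers) and send back its sum."""
    numbers = inbox.get()        # blocking receive
    outbox.put(sum(numbers))     # send the partial result to the master


def parallel_sum(numbers, num_workers=4):
    """Split numbers across workers using explicit send/receive steps."""
    results = queue.Queue()
    inboxes = [queue.Queue() for _ in range(num_workers)]
    threads = [
        threading.Thread(target=worker, args=(inbox, results))
        for inbox in inboxes
    ]
    for t in threads:
        t.start()

    # Master sends each worker its slice of the data.
    for rank, inbox in enumerate(inboxes):
        inbox.put(numbers[rank::num_workers])

    # Master receives one partial sum per worker and combines them.
    total = sum(results.get() for _ in range(num_workers))
    for t in threads:
        t.join()
    return total
```

On a real cluster each worker would be a process on a different node, and the two queue operations would be replaced by the library's send and receive routines.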
1-1.15
Late 1980s onwards – Condor
To harness “unused” cycles of networked computers for high
performance computing.
A collection of computers could be given over to remote
access automatically when not being used locally.
Widely used as a job scheduler for clusters in addition to its
original purpose of using laboratory computers collectively.
We will consider Condor in the light of Grid computing.
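The idea of harnessing unused cycles can be illustrated with a toy scheduler. This is a minimal sketch under stated assumptions, not Condor's actual mechanism: the `Machine` class, its `in_use_locally` flag and the `assign_jobs` function are hypothetical names invented for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Machine:
    name: str
    in_use_locally: bool = False   # True while its owner is using it
    jobs: list = field(default_factory=list)


def assign_jobs(machines, job_names):
    """Hand queued jobs to idle machines, one job per idle machine.

    Returns the jobs that remain queued because no idle machine was left.
    """
    pending = deque(job_names)
    for machine in machines:
        if not pending:
            break
        if not machine.in_use_locally:
            machine.jobs.append(pending.popleft())
    return list(pending)
```

A machine whose owner is active is skipped, so remote jobs only land on computers that would otherwise sit idle; jobs that find no idle machine stay in the queue.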
1-1.16
Software Techniques
Mid 1980s - Remote procedure call (RPC) for invoking a
procedure on a remote computer.
Service registry - introduced with RPC to locate remote
services.
1990s - Object-oriented versions of RPC:
CORBA (Common Object Request Broker Architecture)
Java Remote Method Invocation (RMI).
2000 - Web services
Provide remote actions like RPC but invoked through
standard protocols and Internet addressing.
Use XML (eXtensible Markup Language), a W3C
Recommendation since 1998.
Web services and XML adopted into Grid computing
soon after their introduction
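A remote procedure invoked through a standard protocol (HTTP), addressed by URL, with XML-encoded arguments, can be sketched with Python's standard `xmlrpc` modules. XML-RPC is a simpler relative of the Web services used in Grid middleware, so this is an illustration of the idea only; the `add` procedure and the loopback address are assumptions, and no service registry is involved.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """The procedure that will be invoked remotely."""
    return a + b


def call_remote_add(a, b):
    """Start a local XML-RPC server, call add() through it, return the result."""
    # Port 0 lets the operating system choose a free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    host, port = server.server_address

    thread = threading.Thread(target=server.serve_forever)
    thread.start()
    try:
        # The client addresses the service by URL; arguments and the
        # return value travel as XML inside HTTP requests and responses.
        with ServerProxy(f"http://{host}:{port}/") as proxy:
            return proxy.add(a, b)
    finally:
        server.shutdown()
        server.server_close()
        thread.join()
```

From the caller's side `proxy.add(2, 3)` looks like a local call, which is the point of RPC: the marshalling and network transport are hidden behind the procedure interface.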
1-1.17
Grid Computing History
• Began in mid 1990s with experiments
using computers at geographically
dispersed sites.
• Seminal experiment – “I-WAY” experiment
at 1995 Supercomputing conference
(SC’95), using 17 sites across US running:
– 60+ applications.
– Existing networks (10 networks).
1-1.18
Globus Toolkit
Middleware software
Grid computing toolkit.
Led by Ian Foster, a co-developer of the I-WAY demo and
founder of the Grid computing concept.
Evolved through several versions although basic structural
components remained essentially the same.
We will describe Globus in detail later.
http://globus.org/
Other grid computing middleware software
Although Globus widely adopted and basis of our course, there are
other software infrastructure projects.
Europe:
1990s UNICORE (UNiform Interface to COmputing REsources)
European grid computing project.
Initially funded by German Ministry for Education and Research.
Continued with other European funding.
Basis of several European efforts and elsewhere.
Many similarities to Globus.
Still active. http://www.unicore.eu/index.php
2002-2010 EGEE, Enabling Grids for E-Science, continued as
EGI, European Grid Infrastructure
1-1.20
gLite
Lightweight Middleware for Grid Computing
A framework for building grid applications.
Part of EGEE project, continued in EMI.
http://glite.cern.ch/
http://www.ige-project.eu/
Fig. 1.2 Key concepts in the history of Grid computing
1-1.23
Applications
• Originally e-Science applications
– Computationally intensive
• Traditional high performance computing
addressing large problems
• Not necessarily one big problem but a
problem that has to be solved repeatedly
with different parameters.
– Data intensive
• Computational but emphasis on large
amounts of data to store and process
– Experimental collaborative projects
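The "same problem solved repeatedly with different parameters" pattern mentioned above is often called a parameter sweep. A minimal sketch follows, assuming a hypothetical `simulate` function as the problem and a local thread pool as a stand-in for remote Grid resources; on a real Grid each run would be submitted as a separate job.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(rate, steps=10, start=100.0):
    """Hypothetical model: compound growth of `start` at `rate` per step."""
    value = start
    for _ in range(steps):
        value *= 1.0 + rate
    return value


def parameter_sweep(rates, max_workers=4):
    """Run simulate() once per parameter value and collect results by rate."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with rates.
        results = list(pool.map(simulate, rates))
    return dict(zip(rates, results))
```

Because the runs are independent of one another, they need no communication while executing, which is why this kind of workload suits loosely coupled, geographically distributed resources.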
1-1.24
• Now also e-Business applications
– To improve business models and practices.
– Sharing corporate computing resources and databases
– On-demand Grid computing …
– Indirectly led to cloud computing.
1-1.25
Grid Computing versus Cluster Computing
• Important not to think of Grid computing
simply as large cluster because potential and
challenges different.
• Courses on Grid computing and on cluster
computing are quite different.
1-1.26
Cluster computing course
• One learns about:
– Message passing programming using tools such as MPI
– Shared memory programming using threads and
OpenMP, given that most computers in a cluster today
are multi-core shared memory systems
– Parallel algorithms (lots)
• Network security is not a big issue.
– Usually an ssh connection to front node of cluster
sufficient.
– User logging onto a single compute resource.
• Computers connected together locally under one
administrative domain
1-1.27
Grid computing course
• Learn about running jobs on remote machines,
scheduling jobs and distributed workflow
• Learn in detail underlying Grid infrastructure
• How Internet technologies applied to Grid
computing
• Grid computing software and standards
• Security is an issue.
1-1.28
Grid Computing versus Cluster Computing
• Of course, there are things in common
• Both courses hands-on with programming
experiences.
• Both use multiple computers
• Both require job scheduler to place jobs.
1-1.29
Cloud computing
• Lot of hype on Cloud computing at the moment.
• Business model in which services provided on
servers that can be accessed through Internet.
• Lineage of cloud computing can be traced back to
on-demand Grid computing in the early 2000s.
1-1.30
Fig. 1.3 Cloud computing using virtualized resources
1-1.31
• Common thread between Grid computing
and cloud computing is use of Internet to
access resources.
• Cloud computing driven by widespread
access that Internet provides and Internet
technologies.
• However, cloud computing is quite distinct
from the original purpose of Grid computing.
1-1.32
Grid Computing versus Cloud Computing
• Whereas Grid computing focuses on
collaborative and distributed shared resources,
Cloud computing concentrates upon placing
services for users to pay to use.
• Technology for cloud computing emphasizes:
– use of software as a service (SaaS)
– virtualization (process of separating particular
user’s software environment from underlying
hardware).
1-1.33