Download DCIs and Workflows

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
DCIs and Workflows
C. Vuerli
Contributions by G. Sipos, K. Varga, P. Kacsuk
Definitions
• Distributed computing: distributed systems to solve
computational problems
– Computer clusters: sets of loosely connected or tightly
connected computers that work together so that in many
respects they can be viewed as a single system
– Grids: distributed, highly loosely coupled, heterogeneous
and geographically dispersed systems with non-interactive
workloads that involve a large number of files
– HPC systems: enable users to run a single instance of
parallel software over many processors.
– Cloud computing: applications and services offered over
the Internet from data centers all over the world, which
collectively are referred to as the “clouds”. The main
enabling technology for cloud computing is virtualization
which abstracts the physical infrastructure – which is the
most rigid component – and makes it available as a soft
component, easy to use and manage.
DCI Projects
• EDGI (European Desktop Grid Initiative)
– http://edgi-project.eu/
• EGI – InSPIRE (Integrated Sustainable Pan-European
Infrastructure for Researchers in Europe)
– http://www.egi.eu/projects/egi-inspire/
• EMI (European Middleware Initiative)
– http://www.eu-emi.eu/
• IGE (Initiative for Globus in Europe)
– http://www.ige-project.eu/
• StratusLab (Enhancing Grid Infrastructures with
Virtualization and Cloud Technologies)
– http://www.stratuslab.eu/
• VENUS-C (Virtual multidisciplinary EnviroNments
USing Cloud Infrastructures)
– http://www.venus-c.eu/
EGI
• European
– Over 35 countries
• Grid
– Secure sharing
• Infrastructure
– Computers
– Data
– Instruments
– …. and beyond!!
EGI Resource Infrastructure Providers:
National Grid Initiatives
(Snapshot: April 2012)
EGI, EGI.eu, EGI-InSPIRE
• EGI – an open collaboration
– To support the digital European Research Area
through a pan-European research infrastructure based
on services federated from the NGIs
• EGI.eu – a Dutch foundation owned by the NGIs
– To coordinate the work of EGI (operations,
technology, user support, policy, community &
communications, administration)
– Sustainable small coordinating organisation
• EGI-InSPIRE – an FP7 project
– Supports EGI.eu and EGI to transition to sustainable
operational model
EGI’s Strategic Focus
• Community & Coordination
– Community building through events.
• Next event: EGI Technical Forum, Madrid, Sept. 2013
• http://tf2013.egi.eu/
– Community networking, marketing and outreach
through the NGIs
• Operational Infrastructure
– Operate a European wide infrastructure
– Offer its use to other research infrastructures
– Build a federated cloud environment
• Virtual Research Environments
– Enable 3rd party integration & operation of VREs
Guiding Principles
• Be a neutral resource provider
– Any application, any domain, any technology
– A platform for domain specific innovation & use
– Integration of any compliant resource
• End-user needs and technologies change
– Allow communities to deploy their own services
– Give communities the power to meet their own
needs
Virtual Organizations
• VOs – Virtual Organizations
– Groups of researchers with similar scientific interests
and requirements, who are able to work
collaboratively with other members and/or share
resources (e.g. data, software, expertise, CPU,
storage space), regardless of geographical location.
• Researchers must join a VO in order to use grid
computing resources provided by EGI.
• Each virtual organization manages its own
membership list, according to the VO’s
requirements and goals.
• EGI provides support, services and tools to allow
VOs to make the most of their resources
Virtual Reseach Communities
• VRCs – Virtual Research Communities
– Self-organized research communities which give individuals
within their community a clear mandate to represent the
interests of their research field within the EGI ecosystem.
– They can include one or more VOs and act as the main
communication channel between the researchers they
represent and EGI
• EGI establishes partnerships with individual VRCs
through a Memorandum of Understanding (MoU).
• VRCs can access the computing resources and data
storage provided by the EGI community through open
source software solutions. VRC members can store,
process and index large datasets and can interact with
partners using the secured services of EGI’s production
infrastructure
Virtual Reseach Environments
• VREs – Virtual Research Environments
– A combination of environments that provide the
researcher with easy access to the services
deployed in EGI to enable their data analysis
activities.
– Initially, the VRE consisted of a command line
interface for the researcher to access the
services deployed across the infrastructure.
– Over the years, higher-level generic tools and
domain specific environments have been
developed by many research communities to
simplify the data analysis process.
SSO Authentication
• SSO – Single Sign On
– A session/user authentication process that
permits a user to present his/her credentials
to access multiple applications and resources.
– The process authenticates the user for all the
applications they have been given rights to
and eliminates further prompts when users
switch from one application to another during
a particular session.
SSO Authentication Concepts
• Authentication
– The process of determining whether someone
or something is, in fact, who or what it is
declared to be on the basis of provided
credentials.
– Authentication is commonly done through the
use of username/password pairs.
– The use of digital certificates issued and
verified by a Certificate Authority (CA) as part
of a PKI (Public Key Infrastructure) is
considered likely to become the standard way
to perform authentication on the Internet.
SSO Authentication Concepts
• IdP – Identity Provider
– Provides Single Sign-On services. In addition
to a simple yes/no response to an
authentication request, the Identity Provider
can provide a rich set of user-related data to
the Service Provider
• IdF – Identity Federation
– Enables IdPs to exchange identity information
securely across domains, providing browserbased SSO
EGI Platforms
Virtual Research Communities
EGI Cloud
Infrastructure
Platform
VM
VM
VM
EGI Collaboration platform
Community
Platform
Community
Platform
(gLite Grid)
Grid services
EGI resources (cores & storage)
EGI Core
Infrastructure
Platform
Core Infrastructure Platform
Services
•
•
•
•
•
Service monitoring
Service usage accounting
Security (Authentication, Authorization)
Staged rollout of services
...
Unified Middleware Distribution
• External technology providers: EMI (ARC, gLite, UNICORE, dCACHE),
IGE (Globus), EDGI (desktop grids), SAGA, …
• UMD capabilities captured in UMD roadmap
• Staged Rollout: updates of the supported middleware are first released
to and tested by Early Adopter sites before being made available to all
sites through the production repositories.
Core infrastructure
platform
External Technology
Providers
Requirements
Software
Criteria
Definition
Operations
Criteria
Verification
Staged
Rollout
Production
Provisioning Infrastructure
Deployed
Software
Helpdesk unit
User communities of the UMD
VO
Virtual
Organisations
VO
Virtual
Research
Community
Research
communities
Virtual
Research
Community
VO
User
User
User
User
User
User
•
•
•
•
•
•
•
High Energy Physics
Astronomy & Astrophysics
Life Sciences
Earth Sciences
Computational Chemistry
Fusion
…
Members
User
Largest user communities:
~25.000 users in ~200 Registered VOs
http://operations-portal.egi.eu/vo
UMD use case 1:
workload management with gLite
User environment
Create job definition
Submit job
(batch executable + small inputs)Broker
Information System
service
query
create
Proxy
(~login)
Retrieve status
&
(small) output files
Retrieve
output
Job
status
publish
state
Submit job
Logging
Site X of YOUR VO
Computing service
VO Management
Service
(VOMS)
Job
status
Logging and
bookkeeping service
process
Storage Service
Read/write
files
20
UMD use case 2:
data management with gLite
Command line or Portal
Information System
Query
create
proxy
Register file
Lookup file
Upload file
Download file
File
content
publish
state
Site X of YOUR VO
Computing service
VO Management
Service
(VOMS)
Storage service
File Catalog and/or
Metadata catalog
File
references
21
Cloud infrastructure platform
Scientific portals
(SaaS environments)
Scientific
portal
Scientific
portal
Scientific
portal
Community specific, grid
enabled services
Community
specific, grid
enabled services
Grid
middleware
Grid
middleware
OS
OS
Hardware
Custom
services
Hypervisor
OS
OS
OS
OS
Hardware
OS
OS
Some of the cloud
infrastructure pilot use cases
• Structural biology – We-NMR project:
Gromacs training environments
• Ecology – BioVel project:
Remote hosting of OpenModeller service
• Linguistics – CLARIN project:
Scalable ‘British National Corpus’ service (BNCWeb)
• Software development – SCI-BUS project:
Simulated environments for portal testing
• Space science – ASTRA-GAIA project:
Data integration with scalable workflows
Join as pilot use case, or as provider of high level service on top of the EGI Cloud!
What does the EGI.eu User
Community Support Team do?
European level coordination and support to enable
research communities to evolve into routine EGI users
– Knowing and driving technical activities
(Virtual Team projects)
– Knowing and responding to technical needs
• Support the development and deployment of innovative,
collaborative VREs
–
–
–
–
Integrating applications with EGI platforms
Training and consultancy
Facilitate the reuse of services across communities
Provide services for the EGI Collaboration Platform
• Applications Database, Training Marketplace, Requirements tracker,
CRM
Services of the Collaborative platform 1:
Applications Database
Benefits:
• Gives recognition to reusable
scientific applications,
application developer tools,
portals, workflow systems, etc.
• Gives recognition to application
developers (people profiles)
• Access through web page AND
web gadget
• Community features such as
commenting, rating,
tag-based groupings
http://appdb.egi.eu
Services of the Collaborative platform 2:
Training Marketplace
Benefits:
• Register & share
• training events,
expertise, services,
materials,
resources, online
courses, university
courses
• Browse and search
items
• Community features
such as commenting,
rating
• Access through web
page and web gadget
http://training.egi.eu
Services of the Collaborative platform 3:
Customer Management System (CRM)
• To capture leads and needs of scientific communities
• EGI CRM – main capabilities (based on vTiger CRM):
– Register projects/communities
– Register personal leads
– Register details about ‘Potential for EGI use’ and
‘Interest in e-infrastructures’
• Available to NGI International Liaisons and their collaborator
at http://crm.egi.eu
Services of the Collaborative platform 4:
Requirements tracker
User
Community
Support
Team of
EGI.eu
VOs
VRCs
projects
NGIs
Input
channels for
community
requirements
EGI
Requirements
Tracker
events
User
Community
Board
Structured
scientific
communities
Technology
Coordination
Board
Technology
providers
Operations
Management
Board
Resource
centres
EGI Helpdesk
NGIs
&
projects
http://go.egi.eu/requirements
EGI Web Gadgets:
Embed EGI services into Websites!
• The simplest way to reuse EGI services: embed EGI gadgets
into your website
• Benefits
– Customisability
– Reusability
– Compatibility
• Existing gadgets:
–
–
–
–
Application Database
Training Marketplace
Requirement Tracker
Grid Relational Catalogue
• Use and develop gadgets & share them through EGI:
www.egi.eu/user-support/gadgets
EGI Virtual Team projects
• NGI International Liaisons
– Single point of contact in NGIs for various activities, including Technical
Outreach
• Virtual Team projects: https://wiki.egi.eu/wiki/Virtual_Team_Projects
– Short (1-6 month) project with multiple NGIs involved
– Setup through NILs
– Help EGI enlarge its user base
• Active VT projects with UCST’s involvement:
–
–
–
–
Science Gateway Primer (Final editing of the Primer document)
NGI-ELIXIR ESFRI collaboration (Since October)
Technology study for CTA ESFRI (Since January)
Towards a Chemistry, Molecular & Materials Science and Technology
Virtual Research Community (In setup)
Wokflows
• Progression of steps (tasks, events,
interactions) that
– comprise a work process
– involve two or more persons
– create or add value to the organization's
activities.
• Sequential workflow
– each step is dependent on occurrence of the
previous step
• Parallel workflow
– two or more steps can occur concurrently
Workflow Management
Systems
• A Workflow Management System (WMS) is a piece
of software that provides an infrastructure to setup,
execute, and monitor scientific workflows
– An important function of an WMS during the workflow
execution, or enactment, is the coordination of operation
of individual components that constitute the workflow;
this process is also often referred to as orchestration
– Scientific workflows and WMSs have emerged to
provide an easy-to-use way of specifying the tasks that
have to be performed during a specific in silico
experiment. The need to combine several tools into a
single research analysis still holds, but technical details
of workflow execution are now delegated to Workflow
Management Systems.
Workflow Editor
The skeleton of a workflow is represented by a
Graph.
Jobs denote the activities, which envelop
insulated computations
Channels are directed edges of a graph, directed
from the output ports towards the input ports.
Workflow Editor
We can rely on a local and a
public workflow repository .
Workflow services
• Graph creation
• Concrete workflow creation
• Concrete workflow configuration
– Job types and corresponding properties
– Port properties
• Certificate handling
• Submission
– Log examination
• Submitted instance management
• Result evaluation
• Repository handling (export/import)
CTA Gateway
Workflows instances
CTA Gateway
Workflows instances
SHIWA Motivation
Isolation by technology
– Many different workflow systems exists  We expect no change
– Technological choice isolates users and user communities
– Technological choice determines the adopted DCI/middleware
(WS, gLite, Globus, etc.)
Benefits of the SSP – SHIWA Simulation Platform
– Share your own workflow, re-use workflows of others
– Create and execute ‘meta-workflows’: built from smaller workflows
that use different workflow languages/technologies
– Combine workflows and DCIs
SHIWA  ER-flow
– SHIWA: FP7 R&D project. Created the SHIWA Simulation Platform (2010-2012)
– ER-flow: FP7 support action project. Disseminates the SHIWA technology (20122014).
Generic use-cases
SE
CE
CE
SE
DCI #2
DCI #1
DCI – distributed computing infrastructure; CE/SE – computing/storage element
Generic use-cases
find and re-use
combine WFs
with own data
SE
CE
on own CE?
CE
SE
DCI #2
DCI #1
DCI – distributed computing infrastructure; CE/SE – computing/storage element
SHIWA Simulation Platform
What does the developer of a
(meta)workflow need?
SHIWA
Repository
A portal/desktop to integrate,
parameterize and run these
applications
Access to various workflow
engines and DCIs where these
WF applications can be run
Access to a large
set of ready-to-run
scientific workflow
applications
SHIWA
Portal + Workflow engines
Supercomputer
grids
(PRACE, XSEDE)
Web services
Cluster grids
(EGI, OSG, …)
Local clusters
Clouds
Supercomputers
Desktop grids
(BOINC, Condor, etc.)
Grid systems
Distributed Computing Infrastructures (DCIs)
SHIWA
Repository
Facilitates publishing and
sharing workflows
Supports:
• Abstract workflows with
multiple implementations of
over 10 workflow systems
• Concrete workflows with
execution specific data
Available:
• From the SHIWA Portal
http://ssp.shiwa-workflow.eu
• Standalone interface:
http://repo.shiwa-workflow.eu
Example: Create meta-workflow from
Moteur and WS-PGRADE workflows
SHIWA
Repository
Access to Moteur WF
SHIWA
portal
DCI 1
gLite
Moteur WF engine
DCI 2
Cloud
Connected
workflow
engines
DCI n
ARC
Remarks
• Scientific workflow landscape is fragmented
– Many workflow system with specialised & small user
base
– Many workflows with specialised & small user base
• SHIWA Simulation Platform enables integration
– Within disciplines
– Across disciplines
• ER-flow project provides user support
– Consultancy & Workshops
– User forum
– Open for new communities