A user-centric vision for future eInfrastructure and services in Norway
eSOP seminar on eInfrastructure Use Roadmap, March 11, 2011
Hans A. Eide, PhD
Group leader
Research Computing Services
USIT, University of Oslo
University of Oslo and IT, research, HPC
• Two-tier IT organization:
– Local: (at institutes / faculties)
– Central: University Center for Information Technology (USIT)
• USIT
– 240+ FTE and growing
– Covers all aspects of University IT activities
• Section for Education and Research Support (SUF)
– Provides resources, tools, support, and competence for the primary production (education and research); 40 FTE
• Research Computing Services (VD – the HPC group)
– Research support, competence, operations
Research Computing Services group
• 14 people, 9 with a research background (Ph.D.)
– “buffer” between advanced resources and researchers
– Advanced user support (e.g. parallelization, grid enabling)
– Computation, storage, visualization, emerging tech.
– Not limited to “hard sciences” or HPC
• Multi-source funding
– RCN (Notur, NorStore, Norgrid, projects)
– Research projects (life sci., astro, physics, etc.)
– UiO
• Training, support, operations, help-desk
Tomorrow’s eInfrastructure and services
• Must support all fields of research and be accessible
• Help maximize science production, to the benefit of society (social, economic, ...), while
• minimizing TCO (i.e. be effective)
• Environmentally friendly
• Quickly adapt to technology changes and new demands to give a competitive edge
• Maintained at a sufficient and stable level relative to use/need
eInfrastructure really should mean the whole package, but it is usually divided into two aspects:
• eInfrastructure
– Hardware, i.e. computing resources, storage, network, …
• Services
– Software
– Brainware (support services)
eInfrastructure
The eInfrastructure pyramid (anno 2011)
[Figure: three-tier pyramid with axis labels Capacity and Capability]
• Tier-0: Multi-Petaflop (WLCG, PRACE); “Greenest”
• Tier-1 (regional/national): Petaflop (NGIs, Nordic?); “Greener”
• Tier-2 (local/institutional): Sub-Petaflop; “Green(?)”
• Surrounding labels: Development, Competence, Services, Training, Portals, Tools, Databases, Support, Data sources
Today’s situation (simplified) for computing and storage
[Diagram: UiO, UiB, NTNU, and UiT on top of the basic infrastructure (network)]
• UiO: 300 kW at the end of 2010 (maxed out); 900 kW from 2011 (sufficient to 2013+)
• Limited space (and cooling)
Infinite power, space, and cooling
Alternative 1: go alone (x MW in 2015)
[Diagram: green datacenter + UiO]
Alternative 2: together (y MW in 2015+)
[Diagram: green datacenters + UiO, UiB, NTNU, UiT]
Alternative 3 (2020!)
[Diagram: several green datacenters labelled “Language technology”, “Particle physics”, “Life science”, and “Climate”, each surrounded by UiO groups and other universities (U of X, U of Y)]
Services
Ideal eInfrastructure services:
• National core services together with local services
– Fully financed, permanent positions
– Close to local resources, users
– Pool of competence (advanced user support)
– Training, courses, outreach, marketing
– Technology watch, early adopters
– Partake in Nordic/EU/world-wide programs
– Members who are experienced with ICT in the research process (have background as researchers)
The four waves of extraordinary growth in use of ICT
[Figure: timeline of waves built on research and development]
• Waves: Mainframe computers; PC (affordable); Internet applications; Advanced services and infrastructures
• Years marked: 1820, 1946, 1968, 1991, 2010
• Stage labels: Mechanical calculator; Towards the computer; A tool for many; A tool for “all”; Data systems everywhere
The evolution of the HPC computing pyramid (William Gropp, UIUC)
[Figure: two pyramids, 1993 and 2029, with the label “Tera Flop Class”; source: www.zettaflops.org]
• 1993: Center Supercomputers; Mid-Range Parallel Processors and Networked Workstations; High Performance Workstations
• 2029: Center Exascale Supercomputers; Single Cabinet Petascale Systems (or attack of the killer GPU successors); Laptops, phones, wristwatches, eye glasses…
Users needed to be “inside the box”
Users “outside the box”
Tomorrow’s today’s (average) user
• Knows little (nothing) about HPC (and has no interest in it either)
• Mostly can’t program (at least not well)
• Doesn’t want to spend time learning something if it can be avoided
• Just wants results and to move on
• Doesn’t know what is available
• ...but expects to get services, resources, and support for free
SUIT 2010 – UiO user survey: research support
[Charts: responses on a 0–100% scale for storage services, statistics services, multimedia services, and qualitative analysis services]
• Question 12: “Do you use, or know of, the following services from USIT?”
• Response categories: Use / Have used (Bruker/Har brukt); Know of (Kjenner til); Do not know of (Kjenner ikke til); Not applicable (Ikke aktuelt)
Challenges
• Even HPC for dummies is too advanced (and why should users bother?)
• Knowledge of basic methodology seems to be declining in all fields, among students and researchers alike (e.g. statistics, mathematics)
• Hard to reach the “customers” with passive marketing (i.e. web pages)
• Late adopters of new technologies/capabilities (“don’t ask me what I need, you should tell me what I need”)
• Serial jobs (not necessarily embarrassingly parallel)
(Some) solutions
• Make it simple to use, e.g. with computing portals (these can mitigate the problem of serial jobs by e.g. using GPUs without the user even knowing!)
• Emphasis on using ICT methods and eInfrastructure in the education program, as part of the curriculum!
• Tailored courses and training for user groups
• Forward-leaning marketing of services (e.g. approach and ask “why are you not using our xyz service in your research?”)
• Advanced support (enter early in the problem formulation/design process), competence
Example: Bioportal
• 2659 registered users, 700+ active
• 40+ applications (MrBayes, RAxML, BLAST, PAUP, structure, R, BEAST, PhyML, …)
• Bio (life science), chemistry, statistics
• Tailored 454 sequencing work-flow
• Uses nearly 3 million CPU hours in 6 months
• Pre-compiled binaries allow advanced optimizations, e.g. use of GPUs and MPI, transparently to the users
ICT services for hum-soc
• Qualitative methods
– Used extensively in humanities and social sciences
– Rich media (audio, video)
– Typical applications: NVIVO, HyperResearch, Transana
• Quantitative methods
– Statistics
– Potentially huge datasets
– Sometimes sensitive data
– Typical applications: STATA, SPSS, R
• Storage services (data intensive)
• Big need for training
eInfrastructure and services for sensitive data
• Sensitive data enters in many fields
– Life Science
– Medicine
– Psychology
– Social studies
– Pedagogic studies
• Lack of eInfrastructure and services for sensitive research data impairs the ability to perform research
[Diagram: sources of sensitive research data: DNA-sequencing, industrial research, video/audio, patient/clinical, MRI, questionnaires, genetics]
eInfrastructure and services in the future
– This is the missing slide about clouds and virtualization
Thanks for your attention!
Questions