A user-centric vision for future eInfrastructure and services in Norway
eSOP seminar on eInfrastructure Use Roadmap, March 11, 2011
Hans A. Eide, PhD
Group leader, Research Computing Services
USIT, University of Oslo

University of Oslo and IT, research, HPC
• Two-tier IT organization:
  – Local: (at institutes / faculties)
  – Central: University Center for Information Technology (USIT)
• USIT
  – 240+ FTE and growing
  – Covers all aspects of University IT activities
• Section for Education and Research Support (SUF)
  – Provides resources, tools, support, competence for the primary production (education and research), 40 FTE
• Research Computing Services (VD – the HPC group)
  – Research support, competence, operations

Research Computing Services group
• 14 people, 9 with research background (Ph.D)
  – “Buffer” between advanced resources and researchers
  – Advanced user support (e.g. parallelization, grid enabling)
  – Computation, storage, visualization, emerging tech.
  – Not limited to “hard sciences” or HPC
• Multi-source funding
  – RCN (Notur, NorStore, Norgrid, projects)
  – Research projects (life sci., astro, physics, etc.)
  – UiO
• Training, support, operations, help-desk

Tomorrow’s eInfrastructure and services
• Must support all fields of research, be accessible
• Help maximize science production, to the benefit of society (social, economic, ..), while
• minimizing TCO (i.e. be effective)
• Environmentally friendly
• Quickly adapt to technology changes and new demands to give competitive edge
• Maintained at a sufficient and stable level relative to use/need

eInfrastructure really should mean the whole package, but is usually divided into two aspects:
• eInfrastructure
  – Hardware, i.e. computing resources, storage, network, …
• Services
  – Software
  – Brainware (support services)

eInfrastructure

The eInfrastructure pyramid (anno 2011)
[Figure: three-tier pyramid spanning capacity to capability]
• Tier-0: Multi-Petaflop (WLCG, PRACE), “Greenest”
• Tier-1 (regional/national): Petaflop (NGIs, Nordic?), “Greener”
• Tier-2 (local/institutional): Sub-Petaflop, “Green(?)”
• Surrounding labels: Development, Competence, Services, Training, Portals, Tools, Databases, Support, Data sources

Today’s situation (simplified) for computing and storage
[Figure: UiO, UiB, NTNU, and UiT as separate sites on top of the basic infrastructure (network)]
• UiO: 300 kW at the end of 2010 (maxed out); 900 kW from 2011 (sufficient to 2013+)
• Other labels in the figure: Limited space (and cooling); Infinite power, space, and cooling

Alternative 1: go alone (x MW in 2015)
[Figure: Green datacenter + UiO]

Alternative 2: together (y MW in 2015+)
[Figure: Green datacenter + UiO, UiB, NTNU, UiT]

Alternative 3 (2020!)
[Figure: several green datacenters, each dedicated to a field (“Language technology”, “Particle physics”, “Life science”, “Climate”) and shared by UiO, U of X, and U of Y]

Services

Ideal eInfrastructure services:
• National core services together with local services
  – Fully financed, permanent positions
  – Close to local resources, users
  – Pool of competence (advanced user support)
  – Training, courses, outreach, marketing
  – Technology watch, early adopters
  – Partake in Nordic/EU/world-wide programs
  – Members who are experienced with ICT in the research process (have background as researchers)

The four waves of extraordinary growth in use of ICT
[Figure: timeline of the four waves]
• Waves: Mainframe computers; PC (affordable); Internet applications; Advanced services and infrastructures
• Timeline marks: 1820 (mechanical calculator), 1946, 1968, 1991, 2010
• Phase labels: Towards the computer; Research and development; A tool for many; A tool for “all”; Data systems everywhere

The evolution of the HPC computing pyramid (William Gropp, UIUC)
[Figure: the computing pyramid in 1993 and in 2029; labels include “Tera Flop Class” and www.zettaflops.org]
• Top tier: Center Supercomputers (1993) to Center Exascale Supercomputers (2029)
• Middle tier: Mid-Range Parallel Processors and Networked Workstations (1993) to Single Cabinet Petascale Systems, or “attack of the killer GPU successors” (2029)
• Bottom tier: High Performance Workstations (1993) to Laptops, phones, wristwatches, eye glasses… (2029)

Users needed to be “inside the box”
Users “outside the box”

Tomorrow’s today’s (average) user
• Knows little (nothing) about HPC (and has no interest in it either)
• Most can’t program (at least not well)
• Doesn’t want to spend time learning something if it can be avoided
• Just wants results and to move on
• Doesn’t know what is available
• ..but expects to get services, resources, and support for free

SUIT 2010 – UiO user survey
[Figure: four stacked bar charts (0% to 100%): Storage services, Statistics services, Multimedia services, Qualitative analysis services. Legend: Not applicable / Not aware of / Aware of / Use or have used]

SUIT 2010 – Research support
12) Do you use, or are you aware of, the following services from USIT?
Use / Have used / Aware of / Not aware of / Not applicable

Challenges
• Even HPC for dummies is too advanced (and why should users bother?)
• Knowledge about basic methodology seems to be declining in all fields, among students and researchers alike (e.g. statistics, mathematics)
• Hard to reach the “customers” with passive marketing (i.e. web-pages)
• Late adopters of new technologies/capabilities (“don’t ask me what I need, you should tell me what I need”)
• Serial jobs (not necessarily embarrassingly parallel)

(Some) solutions
• Make it simple to use, e.g. computing portals (these can mitigate the problem of serial jobs by e.g. using GPUs without the user even knowing!)
• Emphasis on using ICT methods and eInfrastructure in the education program, as part of the curriculum!
• Tailored courses and training for user groups
• Forward-leaning marketing of services (e.g. approach and ask “why are you not using our xyz service in your research?”)
• Advanced support (enter early in the problem formulation/design process), competence

Example: Bioportal
• 2659 registered users, 700+ active
• 40+ applications (MrBayes, RaXML, BLAST, Paup, structure, R, BEAST and PhyML, …)
• Bio (life science), chemistry, statistics
• Tailored 454 sequencing work-flow
• Nearly 3 million CPU hours used in 6 months
• Pre-compiled binaries allow advanced optimizations, e.g. use of GPUs and MPI, transparently to the users

ICT services for hum-soc
• Qualitative methods
  – Used extensively in humanities and social sciences
  – Rich media (audio, video)
  – Typical applications: NVIVO, HyperResearch, Transana
• Quantitative methods
  – Statistics
  – Potentially huge datasets
  – Sometimes sensitive data
  – Typical applications: STATA, SPSS, R
• Storage services (data intensive)
• Big need for training

eInfrastructure and services for sensitive data
• Sensitive data enters in many fields
  – Life Science
  – Medicine
  – Psychology
  – Social studies
  – Pedagogic studies
• Lack of eInfrastructure and services for sensitive research data impairs the ability to perform research
[Figure: sources of sensitive research data: DNA-sequencing, Industrial research, Video/audio, Patient/clinical, MRI, Questionnaires, Genetics]

eInfrastructure and services in the future
  – This is the missing slide about clouds and virtualization

Thanks for your attention!
Questions?