“A wide-field astronomer in King Malcolm’s court”
or My year as a NeSC Research Leader

Bob Mann
Institute for Astronomy and NeSC
University of Edinburgh

Outline of talk
- My background
- Duties of a NeSC Research Leader
- Some of my highlights from the year
  - Astronomical testbed for edikt’s BinX project
  - Scientific Data Mining, Integration & Visualization
  - Sky Survey Database Design
  - Virtual Observatory as a Data Grid
- Conclusions from the year

My background
- “Generalist” astronomer
  - Theory and observations
  - X-ray, optical, infrared, submillimetre, radio
  - Formation/evolution of galaxies & clusters of galaxies
- Member of Wide Field Astronomy Unit
  - Home of UK’s largest sky survey databases
- Member of AstroGrid team
  - UK contribution to building an international Virtual Observatory

Duties of a NeSC Research Leader
- encouraging the uptake of Grid technologies in Astronomy and related fields
- encouraging visitors, with whom you have a research overlap, to visit Edinburgh and work with you and other local colleagues
- organising and running research workshops
- assisting with the development of new core Grid and scientific database technologies
- promoting NeSC within the Universities of Edinburgh and Glasgow through, for example, personal presentations and more widely at conferences and workshops
…and all that in 0.5 FTE!

BinX astronomy testbed
- What is BinX?
  - edikt project: see www.edikt.org/binx - download BinX v1.0!
  - XML language description of binary data files
  - library of tools for manipulating files
- Why BinX and astronomy?
  - Two main data formats for tabular data: VOTable (XML) and FITS binary tables
  - XML good for interoperability and transformation, but verbose & lots of legacy data in FITS files
  - Want FITS some of the time & VOTable the rest

BinX astronomy testbed (2)
- VOTable/FITS conversion with BinX
  - it works! - some performance improvements desirable (use SAX, not DOM)
  - workable solution for astronomy & proof of concept for edikt
- Possible extensions to BinX
  - data extraction from binary files with XPath 1.0?
  - delivering SAX events from binary files to apps?
  - closer integration with databases - ELDAS?
- RL time significantly improved interaction
- Science drivers for GGF DFDL WG

“Scientific Data Mining, Integration & Visualization”
- Two-day workshop in October 2002
- Focus for visit by Roy Williams (Caltech)
- Fifty attendees - astronomy, atmospheric science, bioinformatics, chemistry, digital libraries, engineering, environmental science, experimental physics, marine sciences, oceanography, and statistics…plus CS and software engineers
- Report [UKeS-2002-06] with 12 recommendations
  - R5. A mechanism should be sought whereby the peer-reviewed publication of datasets can be made part of the standard scientific process.
  - R8. A set of tutorials should be created and maintained, for introducing application scientists to new key concepts in e-science.
- Spawned e-science Data Mining SIG
  - now - want to discuss solutions, not just problems

“Sky Survey Database Design”
- One-day workshop in April 2003
- ~10 people: AstroGrid, UK wide field astronomy, IBM, Oracle
- Identified spatial indexing in large databases as a problem of interest beyond astronomy
- Spawned research programme on spatial indexing in sky survey databases - NeSC, WFAU, IBM, Oracle and Microsoft
  - future applications to other spatially-indexed domains

“The Virtual Observatory as a Data Grid”
- Three-day meeting in June/July 2003
- Focus for visits by Jim Gray (Microsoft), Alex Szalay (Johns Hopkins), Roy Williams (again!)
- 25 participants - Virtual Observatory, database and data grid communities in UK, US, Europe
- Report [UKeS-2003-03]
- Spawning “SkyQuery-G”
  - take SkyQuery.Net WWW service for matching astronomical sources, and make it a grid service, using OGSA-DAI and/or ELDAS

Conclusions
- RL positions valuable from applications side
  - long, steep learning curve for average scientist
- RL positions valuable from “infrastructure” side
  - sustained involvement with user community allows creation of realistic testbeds
- Visitor(s) ⇒ Workshop ⇒ Report model works
- Serious Concern
  - conceptual chasm between infrastructure builders and application scientists
  - RL-type positions help bridge it, but what else?
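
Appendix: illustrating the "use SAX, not DOM" note from the BinX testbed

The following is a minimal sketch of why event-based (SAX) parsing suits large XML tables better than building a full DOM tree. It uses Python's standard xml.sax module, not BinX itself. The tiny VOTable-like sample and the names RowCounter and scan_votable are hypothetical, invented here for illustration only. The handler sees one element at a time as the parser streams through the input, so memory use does not grow with the number of rows the way a DOM tree would.

```python
import io
import xml.sax


class RowCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts <TR> rows and keeps the first <TD> of each."""

    def __init__(self):
        super().__init__()
        self.rows = 0
        self.first_cells = []
        self._in_td = False
        self._cell_index = -1
        self._text = []

    def startElement(self, name, attrs):
        if name == "TR":
            # A new table row begins: reset the per-row cell counter.
            self.rows += 1
            self._cell_index = -1
        elif name == "TD":
            self._cell_index += 1
            self._in_td = True
            self._text = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_td:
            self._text.append(content)

    def endElement(self, name):
        if name == "TD":
            if self._cell_index == 0:
                self.first_cells.append("".join(self._text).strip())
            self._in_td = False


# Made-up, simplified VOTable-like document (assumption: no namespaces).
SAMPLE = b"""<?xml version="1.0"?>
<VOTABLE>
  <RESOURCE>
    <TABLE>
      <FIELD name="ra" datatype="double"/>
      <FIELD name="dec" datatype="double"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>10.5</TD><TD>-3.2</TD></TR>
          <TR><TD>11.0</TD><TD>4.1</TD></TR>
          <TR><TD>12.25</TD><TD>0.0</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""


def scan_votable(stream):
    """Stream through the XML once; return (row count, first-column values)."""
    handler = RowCounter()
    xml.sax.parse(stream, handler)
    return handler.rows, handler.first_cells


rows, ra_values = scan_votable(io.BytesIO(SAMPLE))
print(rows, ra_values)
```

A DOM-based version would first load every row into an in-memory tree before any value could be read, which is the overhead the slide's remark is presumably pointing at for large survey tables.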