Virtual organizations in astronomy and beyond
Tbilisi, March 28-30, 2007

Prof. Giuseppe Longo
Chair of Astrophysics - Department of Physical Sciences
University of Napoli Federico II – Italy
National Institute of Astrophysics – Napoli Unit
[email protected]
http://people.na.infn.it/~longo/

The Exponential Growth of Information in Astronomy
[Figure: total area of 3m+ telescopes in the world in m² ("Glass") and total number of CCD pixels in megapixels ("CCDs"), as a function of time, 1970-2000; logarithmic vertical axis from 0.1 to 1000.]
Growth over 25 years is a factor of 30 in glass, 3000 in pixels.
• Gigapixel arrays are a reality, hence optical and near-infrared surveys are becoming common
• Space mission archives are being federated
• Old datasets (from space and ground-based instruments) are being federated
• Estimated 1 TB per day in 2008

Astronomy, more than other sciences, is facing a Major Data Avalanche (… a true tsunami…)
Large survey projects from ground and from space
Distributed data repositories
Data are not where the users are
PetaBytes of data / week (past, ongoing, future)
Massive numerical simulations
Distributed computing (PB per simulation)
Data federation of MDS
Adoption of standards and common ontologies
Data analysis and interpretation
Need for a new generation of tools (A.I. based) capable of working in a distributed environment
International Virtual Observatory Alliance
GRID INFRASTRUCTURE

The distributed environment
• Once the VOs become operational, there will be no need for powerful local computing facilities
• Federation of existing and new databases through adoption of common standards
Network access to the databases
• To provide the user with user-friendly access to all federated data
• To allow the user to access distributed computing facilities and to exploit all available data by moving the code rather than the data (… data remain at the data centers, where the expertise is)
• To open entirely new paths to the discovery process in astronomy (but not only!)
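The idea of moving the code rather than the data can be illustrated with a toy sketch. Everything here is hypothetical and is not taken from the talk: the three "data centers", their magnitude lists, and the function names are invented for illustration. Each center runs the submitted task on its own data and returns only a small partial result (a count and a sum), which the user then combines.

```python
# Hypothetical "data centers": each one holds its own catalogue locally.
# Center names and magnitudes are invented for illustration only.
DATA_CENTERS = {
    "center_a": [18.2, 19.5, 21.0, 20.3],
    "center_b": [17.9, 22.1, 20.8],
    "center_c": [19.0, 19.4, 23.2, 21.7, 20.1],
}


def run_at_center(name, task):
    """Run `task` where the data live and return only its small result."""
    local_data = DATA_CENTERS[name]  # the raw data never leave the center
    return task(local_data)


def count_and_sum_bright(mags, limit=21.0):
    """Task shipped to each center: count and sum the magnitudes below `limit`."""
    bright = [m for m in mags if m < limit]
    return len(bright), sum(bright)


def federated_mean_bright():
    """Combine the per-center partial results into one global mean."""
    partials = [run_at_center(name, count_and_sum_bright) for name in DATA_CENTERS]
    n = sum(count for count, _ in partials)
    total = sum(subtotal for _, subtotal in partials)
    return n, total / n


if __name__ == "__main__":
    n, mean_mag = federated_mean_bright()
    print(f"{n} bright sources, mean magnitude {mean_mag:.2f}")
```

Only two numbers per center cross the network, however large the local catalogues are. In a real deployment the function call would be replaced by a job submitted to a remote service.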
What are some of the goals of VOs
• VOs are the most democratic tool ever implemented by any scientific community.
• Data repositories are mostly public (either immediately or after the observers' proprietary period)
• Data analysis and data mining tools are available to the international community through a distributed computing environment
• Everyone can contribute (either with new data or with new software tools)
• Once the VO is implemented, new top-level science will be at the fingertips of any competent scientist who has minimal computing facilities and good access to the WWW

What is being done in Napoli 1 – The surveys
VLT Survey Telescope (Napoli, ESO)
P.I.: Prof. M. Capaccioli
2.5 m diameter - optical
1x1 sq deg f.o.v.
16k x 16k CCD mosaic (optical)
New technology
Adaptive optics
0.2 arcsec PSF
Operational end 2007
100 GB raw data/night
[Photos: Nobel laureate R. Giacconi visiting the VST factory; VLT site, Cerro Paranal (Chile)]

What is being done in Napoli 2 – The detector
Omegacam
French – Netherlands – Italy consortium
16k x 16k CCD mosaic array
Ready
Data processing pipeline
European FP6 network ASTROWISE
Real-time storage and processing of the VST data

What is being done in Napoli 3 – The computing
Campus GRID
Department of Physical Sciences
Room 1G01, "Main GRID infrastructure room"
512 + 15 + 24 + 16 + 128 nodes
150 TB storage (IBM, DEC - Alpha, etc.)
Infrastructure telecom cabinet
CDS
GARR
16 GBaud optical fibre backbone, star-centre layout
Recently evolved into PON - SCOPE
Department of Chemistry
Department of Mathematics and Applications
3.6 M€ (8.2 M€ total) for hardware (512 boards with 4 CPUs)
Financed by the Italian Government
Operational end 2007

What is being done in Napoli 4 – The Data mining
Draco Project: building the GRID infrastructure for the Italian VO
400 k€ - MIUR
COST Action 283 - EU
Euro-VO, VO-Tech
European Virtual Observatory Technological Infrastructures
European Infrastructures for VO (UK, D, I, F, etc.)
6.6 M€ - EU
VO-Neural (Napoli lead)
Building Data Mining and Visualization for Massive Data Sets in a Distributed Environment

Complex parameter space
Parameter space of incredibly high dimensionality (N>>100)

Example 1: panchromatic view of the universe
[Images: X-ray, IR, optical, radio]
Crab Nebula: SN 1054 A.D.

Example 2: a new way to do conventional astronomy
Selection of quasar candidates from a 3-band photometric survey

Example: exploring a 3D Parameter Space
Given an arbitrary parameter space:
• Data Clusters
• Points between Data Clusters
• Isolated Data Clusters
• Isolated Data Groups
• Holes in Data Clusters
• Isolated Points
Nichol et al. 2001
Slide courtesy of Robert Brunner @ CalTech.

Example: 21-D parameter space
VO-Neural
Probabilistic Principal Surfaces
Negative Entropy Clustering + Dendrogram

• Multiwavelength, multi-epoch, multi-instrument data (federation of databases), hence there is a strong need for a new generation of data processing, data visualization and data-mining tools
• These tools must be largely based on Artificial Intelligence
• Interoperability is a must (PLASTIC is a standard)

THESE TOOLS ARE OF WIDE APPLICATION: bioinformatics, geophysics (environment, stratigraphy, etc.), business (stock market, marketing strategies, etc.). Therefore interdisciplinarity is a must!

Many problems to be solved:
• Missing data (new data models are needed)
• Parallelization of existing codes
• Raising awareness in the community through selected scientific cases (astrophysics, bioinformatics, marketing, etc.)
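One of the structures named in the 3D parameter-space example, isolated points, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the VO-Neural method: it swaps in a plain brute-force nearest-neighbour test, and the data points, the threshold, and the function names are all invented for the example.

```python
import math


def nearest_neighbour_distances(points):
    """Distance from each point to its closest other point (brute force, O(n^2))."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


def isolated_points(points, threshold):
    """Indices of points whose nearest neighbour lies farther away than `threshold`."""
    distances = nearest_neighbour_distances(points)
    return [i for i, d in enumerate(distances) if d > threshold]


# Invented 3D data: two tight clusters plus one isolated point.
POINTS = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.1), (0.0, 0.1, 0.0),  # cluster 1
    (5.0, 5.0, 5.0), (5.1, 5.0, 4.9), (4.9, 5.1, 5.0),  # cluster 2
    (2.5, 9.0, 0.5),                                     # isolated point
]

if __name__ == "__main__":
    print(isolated_points(POINTS, threshold=1.0))  # prints [6]
```

The brute-force search is quadratic in the number of points, so it only suits small samples. At survey scale, and in spaces of much higher dimensionality, a spatial index or a dedicated clustering method would be needed.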
We (UK, F, I, D, USA, India) intend to pursue the above tasks using the following instruments:
National funds and private companies
EU funds through a new COST Action and ITN
Possibly through RI
US funds through NSF
Conferences and Schools for young students (dissemination is CRUCIAL)

NEW POTENTIAL PARTNERS ARE ENCOURAGED TO CONTACT ME: [email protected]
Plate or digital archives of astronomical data
Other types of scientific data
Advanced programming and mathematical know-how