Workshop Garching, June 27 – July 1 2005
Statistical Cross-Matching Across Distributed Archives
H.-M. Adorf & GAVO Team
MPI f. extraterrestrische Physik

Statistical cross-matching
Cross-matching of astrometric and photometric catalogues
– core functionality of a virtual observatory
Operational modes
– on an area of the sky
– using an input catalogue (GAVO matcher)

Hans-Martin Adorf, GAVO Matcher Demo, Page 2

Philosophy
Build a cross-matcher application that
– should be usable by scientists and help produce science results
– uses what's there and what works now
– doesn't get stopped by a missing standard
Support the VO process by
– helping to generate appropriate VO standards
– adopting new VO standards whenever feasible

Querying remote archives
[Movie]
Using up to 10 servers
– distributed around the world
– operating in parallel
Sneak preview of grid computing
– Locally specify your tasks
– Execute them remotely at the data centers
– Receive results locally for final combination

Software demo (#1)
Input list
– 67 galaxies from FIRST radio catalogue
Query
– 2 remote archives: SDSS, VizieR
– 20 catalogues: radio, infrared, optical, X-ray
Task
– get counterparts for each input coordinate
– gather counterparts to form reasonable matches

The matching problem (#1)
[Figure labels: Catalogue #1, Catalogue #2, Catalogue #3]

The matching problem (#2)

Matcher workflow

Metadata
Querying and cross-matching require metadata about catalogues & archives
– astrometric fields and associated uncertainties
– photometric fields and associated uncertainties
– some metadata …
    … are locally generated and stored
    … are retrieved from archives in real-time

Software demo (#2)
Issue: false alarms
– matching is non-unique
– input: 67 sources
– output: almost 500 match candidates
– many of these match candidates are "false alarms"

Issue: false alarms (#3)
Two fundamental, independent probabilities
– Hit probability: p(c|C)
– False alarm probability: p(c|not C)
Goal
– keep the hit probability high (completeness)
– while keeping the false alarm probability low
– goodness depends on S/N ratio in the data

Issue: false alarms (#4)
Solution: use statistics ("fuzzy" matching)
– compute statistical (Mahalanobis) distance between counterparts and center position
– compute reliability measure for match candidate (reduced chi-squared)

Software demo (#3)
Lower reduced chi-squared from 10,000 to 3
Result
– Hit rate is still pretty high
– False-alarm rate is dramatically reduced

Issue: server reliability
An archive server
– may be down (easy to detect)
– may be slow today (more difficult to detect)
– may deliver wrong results (spoils the science)

VO Standards
Status
– Input
    CSV files for data
    XML files for query & match process description
– Sending
    plain HTTP/HTML to archive servers
– Receiving
    CSV file from SDSS SkyServer
    VOTable from VizieR (VO-Std)
– Output
    VOTable with complete match result (VO-Std) - VOPlot
    various CSV files

Software demo (#4)
VOPlot

Plans & Ideas
GUI for newcomers
– Facilitates selection of catalogues, astrometric & photometric columns, etc.
– Generates configuration file
    for query, including server selection
    for core cross-matcher, including chi-squared limit
Automatic monitoring of server response and reliability
Improved matching algorithm
GUI panel for match candidate visualization

Summary
Shown a working cross-matcher application
– Operates with distributed archives queried in parallel
Demonstrated that
– fuzzy matching is needed
– reduced chi-squared is a powerful statistical discriminator
    High hit probability, low false-alarm probability
GAVO cross-matcher currently being used in a first science application

Thanks
Particularly to the folks
– from SkyServer/SDSS, and
– from VizieR @ CDS and @ mirror sites,
who, with their services, have enabled the cross-matcher

The end

Issue: false alarms (#5)

Issue: false alarms (#6)

GAVO
GAVO I
– Funded by BMBF
– Started end of 2002
– Ended end of March 2005
GAVO interim
– Funded
    50% by Leibniz-prize money
    50% by BMBF

The matching problem (#3)
[Figure labels: Catalogue #1, Catalogue #2, Catalogue #3]
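
Illustration: fuzzy matching with reduced chi-squared
The statistical distance and reliability measure named under "Issue: false alarms (#4)" can be illustrated with a minimal sketch. This is not the GAVO matcher's code. It assumes one isotropic positional uncertainty per counterpart, a flat-sky approximation around the first counterpart, an inverse-variance weighted mean as the center position, and 2(N-1) degrees of freedom for N counterparts. The names reduced_chi_squared, is_plausible_match and CHI2_LIMIT are hypothetical; only the limit value 3 comes from the demo.

```python
import math

# Acceptance limit taken from the demo ("from 10,000 to 3"); the name is hypothetical.
CHI2_LIMIT = 3.0


def reduced_chi_squared(counterparts):
    """Reliability measure for one match candidate.

    counterparts: list of (ra_deg, dec_deg, sigma_arcsec) tuples, one per
    catalogue; sigma_arcsec is an assumed isotropic 1-sigma position error.
    Returns (center_ra_deg, center_dec_deg, reduced_chi2).
    """
    if len(counterparts) < 2:
        raise ValueError("need at least two counterparts to form a match")

    # Flat-sky approximation: offsets in arcsec relative to the first
    # counterpart. Not valid near the poles or across the RA 0/360 wrap.
    ra0, dec0, _ = counterparts[0]
    cos_dec = math.cos(math.radians(dec0))
    points = [
        ((ra - ra0) * cos_dec * 3600.0, (dec - dec0) * 3600.0, sigma)
        for ra, dec, sigma in counterparts
    ]

    # Center position: inverse-variance weighted mean of the counterparts.
    weight_sum = sum(1.0 / s**2 for _, _, s in points)
    xc = sum(x / s**2 for x, _, s in points) / weight_sum
    yc = sum(y / s**2 for _, y, s in points) / weight_sum

    # With isotropic errors the squared Mahalanobis distance of a counterpart
    # from the center reduces to (dx^2 + dy^2) / sigma^2.
    chi2 = sum(((x - xc) ** 2 + (y - yc) ** 2) / s**2 for x, y, s in points)

    # Two coordinates per counterpart, minus the two fitted center coordinates.
    dof = 2 * (len(points) - 1)

    center_ra = ra0 + xc / (3600.0 * cos_dec)
    center_dec = dec0 + yc / 3600.0
    return center_ra, center_dec, chi2 / dof


def is_plausible_match(counterparts, limit=CHI2_LIMIT):
    """Keep a match candidate only if its reduced chi-squared is below the limit."""
    return reduced_chi_squared(counterparts)[2] <= limit
```

Lowering the limit passed to is_plausible_match corresponds to the step shown in Software demo (#3): candidates whose counterparts scatter more than their stated uncertainties allow are rejected as likely false alarms.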