DATA ACCESS, QUERYING, ANALYSIS AND DATA MINING IN A DISTRIBUTED FRAMEWORK FOR EARTH SYSTEM SCIENCE SUPPORT

Menas Kafatos*
Center for Earth Observing and Space Research (CEOSR)
George Mason University
[email protected]
http://www.siesip.gmu.edu
*on behalf of the SIESIP Team
GeoComputation 99

SCIENCE

Seasonal to Interannual Earth Science Information Partner (SIESIP)
Science Driver: Seasonal-Interannual Climate Variations, Predictability and Prediction

Seasonal-Interannual Climate

Multidisciplinary/Interdisciplinary Research
• Coupled atmosphere/ocean
• Effects on Biosphere
• Connection to Hydrological Cycle (tropical rainfall, convection, etc.)
Multiple Phenomena
• ENSO
• Monsoons
• Teleconnections (effects at continental & sub-continental levels)
• Relation to Droughts, Event-driven Phenomena, etc.
Multiple Time Scales
• Spans short-scale weather and longer-term climate variability
Multi-Agency Data Sets (NASA, NOAA, …)
Communities of Scientists (Data Providers and Users)
• Input being provided by Advisory Board with representation from S-I, TRMM, NSIPP, SCSMEX & IDS communities

[Organization chart labels: SIESIP Management Committee; Science Advisory Board; Federation Management & Members; Interactive Operations; Batch Operations]

SIESIP Federation Architecture

[Diagram labels: User (Web), three instances; Internet; GMU; GDAAC; COLA; Data Orders; Exchange Protocols; Data Delivery; Other Data Sources (e.g. NOAA)]

[Diagram tiers, top to bottom]
• Users: Web Browsers
• Internet
• SIESIP Central User Interface Engine (Phase 1,2,3 queries): Metadata Storage, at GMU
• Inter/intranet
• SIESIP Nodes (Phase 2,3 queries): Analysis Tools and SIESIP data-mart, at GMU and at COLA
• Inter/intranet
• SIESIP Archive (Phase 3 queries): GDAAC
• Internet or other media
• Data Providers: DAACs, NOAA, Others...

VDADC ENGINE (Current GMU Prototype)
http://www.ceosr.gmu.edu/~vdadcp

[Diagram labels: USER; WEB BROWSER; User Interface (Java Applet); Result interface (Images, Time Series, etc.); WORLD WIDE WEB; VDADC ENGINE; SEARCH ENGINE; DATA RETRIEVAL; QUERY CONVERSION; DATA CONVERSION; SQL Query; RDBMS (COTS); LOCAL STORAGE; DISC; CD; GODDARD DAAC; Data Center 1, Data Center 2, ... Data Center N]

Current SIESIP Data Sets
• Seasonal to Interannual (S-I) Climate Data: Model & Observational Data Sets at COLA (multiple parameters, station data, precipitation, surface temperature, wind stress, ocean subsurface, etc.)
• Climate Data at GDAAC (atmospheric dynamics, hydrology & precipitation, ocean color, DAO, etc.)
• Tropical Rainfall Measuring Mission (TRMM) Data Subsets
• South China Sea Monsoon Experiment (SCSMEX)
• Climate Station Data from UDel: South America & Global (air temperature, precipitation)
• Climatology Interdisciplinary Data Collection (CIDC) (CD-ROM available)
• Pentad/Decade: The Climatology Interdisciplinary Five and Ten Day Data Collection
• Special products (e.g. animation of Sea Surface Temperature Anomalies and Winds in the Tropical Pacific)

1997-98 El Niño Effects on the U.S.

SIESIP Supports SCSMEX Data Analysis
• SIESIP provides TRMM gridded, satellite coincidence data subsets, and GMS data for Field Campaign, seasonal & inter-annual analyses
• Data available at http://daac.gsfc.nasa.gov/CAMPAIGN_DOCS/TRMM_FE/scsmex/scsmex.html
• SIESIP is producing a TRMM SCSMEX data CD for international distribution at the SCSMEX Science Team's request
[Figure: Tropical Cyclone Leo, 4/29/99 (TSDIS/GMU Orbit Viewer)]

Climatology Interdisciplinary Data Collection (CIDC)
http://daac.gsfc.nasa.gov/ (click on "Interdisciplinary" under DISCIPLINE SPECIFIC INFORMATION)
• Comes as a 4-CD-ROM set; in addition, all data is available free by electronic transfer.
• Over 70 Monthly Mean Global Climate Parameters: Land, Ocean, Sun, Cryosphere, Biosphere, Atmosphere.
• The CD-ROM set was produced in collaboration with the Center for Earth Observing and Space Research (CEOSR) at George Mason University with GrADS developed at the Center for Ocean Land Atmosphere Studies (COLA).

AVERAGE SEASONAL-CYCLE ESTIMATES FOR THE WORLD
• Archived are: climatologically averaged values of monthly and annual air temperature (T) and total precipitation (P) reinterpolated to a 0.5x0.5 degree grid, their associated cross-validation fields, and the climatic water balance computed at each grid point from T and P.
• Gridded datasets are archived on the SIESIP site, as well as on "climate.geog.udel.edu" under the userid "siesip" (password available on request)

AVERAGE SEASONAL-CYCLE ESTIMATES FOR SOUTH AMERICA
• Archived are: climatologically averaged values of monthly and annual air temperature (T) and total precipitation (P) interpolated to a 0.5x0.5 degree grid, and their associated cross-validation fields.

Genesis of Available Gridded Datasets
a) Average monthly station T and P drawn from station climatology archives, spatially interpolated to each grid point.
b) Average monthly station T drawn from station climatology archives, spatially interpolated to each grid point using DEM-aided interpolation

MONTHLY TIME-SERIES ESTIMATES FOR SOUTH AMERICA
• Archived are: monthly total precipitation (P) and average air temperature (T) interpolated to a 0.5x0.5 degree grid & associated cross-validation fields.
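
The gridded products above are described as station values spatially interpolated to a 0.5x0.5 degree grid, with associated cross-validation fields. The slides do not name the interpolation scheme, so the sketch below is only an illustration: it assumes plain inverse-distance weighting with distances measured in degrees on a flat latitude/longitude plane, and it computes a leave-one-out cross-validation error per station. The function names (`idw`, `leave_one_out_errors`, `half_degree_grid`) and the sample station values are hypothetical and are not taken from the SIESIP or UDel archives.

```python
import math


def idw(stations, lat, lon, power=2.0):
    """Inverse-distance-weighted estimate at (lat, lon).

    stations is a list of (lat, lon, value) tuples. Distances are plain
    Euclidean distances in degrees, an assumption made to keep this short.
    """
    num = 0.0
    den = 0.0
    for s_lat, s_lon, value in stations:
        d = math.hypot(lat - s_lat, lon - s_lon)
        if d == 0.0:
            return value  # the query point coincides with a station
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def leave_one_out_errors(stations, power=2.0):
    """For each station, estimate its value from all other stations and
    return (estimate - observed). Needs at least two stations."""
    errors = []
    for i, (lat, lon, value) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        errors.append(idw(others, lat, lon, power) - value)
    return errors


def half_degree_grid(lat_min, lat_max, lon_min, lon_max, step=0.5):
    """Grid points from (lat_min, lon_min) to (lat_max, lon_max) inclusive."""
    n_lat = int(round((lat_max - lat_min) / step)) + 1
    n_lon = int(round((lon_max - lon_min) / step)) + 1
    return [(lat_min + i * step, lon_min + j * step)
            for i in range(n_lat) for j in range(n_lon)]


if __name__ == "__main__":
    # Made-up station temperatures (lat, lon, degrees C), for illustration only.
    stations = [(-10.0, -60.0, 26.0), (-10.0, -58.0, 25.0), (-12.0, -59.0, 24.0)]
    for lat, lon in half_degree_grid(-11.0, -10.0, -60.0, -59.0):
        print(lat, lon, round(idw(stations, lat, lon), 2))
    print(leave_one_out_errors(stations))
```

A cross-validation field in this sense gives a rough idea of how well the interpolation reproduces a station when that station is withheld. A DEM-aided scheme, as mentioned in item b), would additionally adjust temperatures for elevation, which this sketch does not attempt.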

INFORMATION TECHNOLOGY STRATEGY
• Development of science scenarios to serve particular user communities
• Web accessibility
• Development of user queries
• Integration of tools accessibility with data set accessibility to allow meaningful, user-specified queries
• Integration of freely/easily accessible analysis tool (GrADS); on-line visualization; data mining (pyramid); with metadata searches (XML and relational database management systems)

Three-Phase Data Access Model
• Phase 1: A user browses and searches the "static" (or description) metadata and content-based metadata provided by the SIESIP system
• Phase 2: The user gets a quick look at the contents of the data through on-line data analysis
• Phase 3: The user has located the data of interest and then orders the data
It is an interactive and iterative process

COLA IT: GrADS
• Integrated User Interface Already in Place for
  – Selecting, Accessing, and Sampling Data Sets (grids, stations, future: images)
  – Computing and Deriving New Quantities
  – Quantitatively Visualizing Results
• Designed to Handle Geophysical Data Sets
• Thousands of Users Worldwide

El Niño
1982/83 El Niño Event in March 1983
Sea Surface Temperature Anomaly (SSTA) and Wind Field
• High values of SSTA are found near the west coast of S. America
• Trade winds have dissipated
Display using GrADS

SIESIP: Distributed Seasonal-Interannual Data System (Implementation Example)

[Diagram labels: GrADS Analysis Workbench; J-GrADS; Datamining Interface; MetaData Search; ContentBrowsing; Analysis; Data Order; Data Order GUI; Class Libraries; Applet/Plug-In; HTML/CGI; HTML; GrADS Server; Data Order Server; Internet; GrADS Server (NOAA Data); GrADS Server (NASA Data); InterOperability Wrapper; User Interface Driver 1; MetaData Server; Data Pyramid Server; NOAA Server; DODS; Local SIESIP Data Sets; SWIL; Data and Metadata Systems on the Internet Outside of SIESIP]

E-R Diagram for SIESIP

[Diagram labels: Metadata; Data Pyramid; Phenomenon Instance; Phenomenon; Predefined Region; Parameter; Specific Parameter; Platform; Instrument; Contact; Data Product; Data Format; Cell; Altitude Coverage; Data File; Temporal Coverage; Cell Value]

Pyramid Data Model
• Motivation: to support the interactive content-based browsing of large volumes of data
  – For example, queries on the statistical properties of the data can be used in a content-based browsing process
• Challenge: query processing performance for large data volumes
• Solution: speed up query evaluations by precomputing intermediate results which contribute to answering user queries.
• What kind of precomputations? How to apply them?

Precomputed Data Attributes
• Query evaluation performance can be improved through precomputation (i.e. precompute the predefined data attributes which contribute to query evaluations) and approximation (i.e. query answers could be derived approximately based on the precomputed data attributes)
• The choice of precomputed data attributes varies with the types of queries to be answered, which further depend on specific domain applications

SIESIP GUI Data Interoperability
• SIESIP is one of the DODS data server sites.
• GrADS has been added to the DODS suite of client software.
• DODS data access enabled through SIESIP GUI interface.
• COLA ftp data access enabled through SIESIP GUI interface
• GrADS as part of DODS server
  - To manipulate DODS data before transferring
  - To support more data types and data formats
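
The Pyramid Data Model and Precomputed Data Attributes slides above describe speeding up content-based queries by precomputing attributes and answering approximately from them. The slides do not say which attributes are stored or how the pyramid is laid out, so the following is a minimal sketch under stated assumptions: each cell stores count, sum, minimum and maximum, and each coarser level merges 2x2 blocks of the level below. The names `CellStats`, `build_pyramid` and `cells_possibly_above`, and the sample grid, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellStats:
    """Precomputed attributes for one pyramid cell (assumed set)."""
    count: int
    total: float
    minimum: float
    maximum: float

    @property
    def mean(self):
        return self.total / self.count


def leaf(value):
    """Stats for a single grid value at the finest level."""
    return CellStats(1, value, value, value)


def merge(cells):
    """Combine child cells into the stats of their parent cell."""
    cells = list(cells)
    return CellStats(
        sum(c.count for c in cells),
        sum(c.total for c in cells),
        min(c.minimum for c in cells),
        max(c.maximum for c in cells),
    )


def build_pyramid(grid):
    """Return pyramid levels, finest first.

    grid must be a square 2-D list whose side is a power of two; this is a
    simplification for the sketch.
    """
    level = [[leaf(v) for v in row] for row in grid]
    levels = [level]
    while len(level) > 1:
        half = len(level) // 2
        level = [
            [merge((level[2 * i][2 * j], level[2 * i][2 * j + 1],
                    level[2 * i + 1][2 * j], level[2 * i + 1][2 * j + 1]))
             for j in range(half)]
            for i in range(half)
        ]
        levels.append(level)
    return levels


def cells_possibly_above(levels, level_index, threshold):
    """Indices of cells at one level whose precomputed maximum exceeds the
    threshold. Cells not returned cannot contain a matching value, so a
    browser can skip them without reading the full-resolution data."""
    return [(i, j)
            for i, row in enumerate(levels[level_index])
            for j, cell in enumerate(row)
            if cell.maximum > threshold]


if __name__ == "__main__":
    # Made-up 4x4 field, for illustration only.
    grid = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    levels = build_pyramid(grid)
    print(levels[-1][0][0].mean)
    print(cells_possibly_above(levels, 1, 10))
```

In this layout a statistical query is first evaluated against a coarse level, and only the cells that survive the test need to be examined at finer resolution. Which attributes are worth storing depends on the queries to be answered, as the slides note.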