DATA ACCESS, QUERYING, ANALYSIS AND
DATA MINING IN A DISTRIBUTED
FRAMEWORK FOR EARTH SYSTEM
SCIENCE SUPPORT
Menas Kafatos*
Center for Earth Observing and Space Research (CEOSR)
George Mason University
[email protected]
http://www.siesip.gmu.edu
*on Behalf of the SIESIP Team
GeoComputation 99
SCIENCE
Seasonal to Interannual Earth Science
Information Partner (SIESIP) Science
Driver:
Seasonal-Interannual Climate
Variations, Predictability and
Prediction
Seasonal-Interannual Climate
Multidisciplinary/Interdisciplinary Research
•Coupled atmosphere/ocean
•Effects on Biosphere
•Connection to Hydrological Cycle (tropical rainfall, convection, etc.)
Multiple Phenomena
•ENSO
•Monsoons
•Teleconnections (effects at continental & sub-continental levels)
•Relation to Droughts, Event-driven Phenomena, etc.
Multiple Time Scales
•Spans short-scale weather and longer-term climate variability
Multi-Agency Data Sets (NASA, NOAA, …)
Communities of Scientists (Data Providers and Users)
• Input being provided by Advisory Board with representation from S-I,
TRMM, NSIPP, SCSMEX & IDS communities
[Organization chart. Labels: SIESIP Management Committee; Science Advisory Board; Federation Management & Members. Legend: Interactive Operations; Batch Operations]
SIESIP Federation Architecture
[Diagram. Labels: User (Web), three instances; Internet; GMU; GDAAC; COLA; Exchange Protocols; Data Orders; Data Delivery; Other Data Sources (e.g. NOAA)]
[Diagram: SIESIP system tiers, top to bottom]
Users: Web Browsers
  (Internet)
SIESIP Central (GMU): User Interface Engine (Phase 1,2,3 queries), Metadata Storage
  (Inter/intranet)
SIESIP Nodes (GMU, COLA): Analysis Tools, SIESIP data-mart (Phase 2,3 queries)
  (Inter/intranet)
SIESIP Archive (GDAAC) (Phase 3 queries)
  (Internet or other media)
Data Providers: DAACs, NOAA, Others...
VDADC ENGINE
(Current GMU Prototype)
http://www.ceosr.gmu.edu/~vdadcp
[Diagram. Labels: User; Web Browser; World Wide Web; User Interface (Java Applet); Result interface (Images, Time Series, etc.); VDADC Engine: Search Engine, Query Conversion, SQL Query, RDBMS (COTS), Data Retrieval, Data Conversion, Local Storage (Disc, CD); Goddard DAAC; Data Center 1, Data Center 2, ..., Data Center N]
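The query-conversion step in the engine (user interface request in, SQL query to the RDBMS out) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the `data_product` table, its columns, the demo rows, and the use of SQLite in place of the COTS RDBMS are hypothetical and are not taken from the VDADC prototype.

```python
import sqlite3

# Hypothetical schema: one row per data product with its coverage.
SCHEMA = """
CREATE TABLE data_product (
    name TEXT, parameter TEXT,
    south REAL, north REAL, west REAL, east REAL,
    start_date TEXT, end_date TEXT
)
"""

def convert_query(parameter, lat_range, lon_range, time_range):
    """Convert a user request into a parameterized SQL statement.

    Returns (sql, params). A product matches when its coverage overlaps
    the requested latitude, longitude, and time ranges.
    """
    sql = (
        "SELECT name FROM data_product "
        "WHERE parameter = ? "
        "AND south <= ? AND north >= ? "
        "AND west <= ? AND east >= ? "
        "AND start_date <= ? AND end_date >= ? "
        "ORDER BY name"
    )
    params = (
        parameter,
        lat_range[1], lat_range[0],    # product south <= req north, north >= req south
        lon_range[1], lon_range[0],    # product west <= req east, east >= req west
        time_range[1], time_range[0],  # ISO dates compare correctly as strings
    )
    return sql, params

def search(conn, parameter, lat_range, lon_range, time_range):
    """Run the converted query and return the matching product names."""
    sql, params = convert_query(parameter, lat_range, lon_range, time_range)
    return [row[0] for row in conn.execute(sql, params)]

def demo_database():
    """Build an in-memory database with made-up demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO data_product VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        [
            ("demo_precip_tropics", "precipitation",
             -40, 40, -180, 180, "1998-01-01", "1999-12-31"),
            ("demo_sst_global", "sea_surface_temperature",
             -90, 90, -180, 180, "1982-01-01", "1998-12-31"),
            ("demo_precip_south_america", "precipitation",
             -60, 15, -90, -30, "1960-01-01", "1990-12-31"),
        ],
    )
    return conn
```

Passing the user's values as bound parameters rather than pasting them into the SQL string keeps the conversion step safe against malformed input from the web interface.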
Current SIESIP Data Sets
Seasonal to Interannual (S-I) Climate Data: Model & Observational Data Sets at COLA
(Multiple parameters, station data, precipitation, surface temperature, wind stress, ocean
subsurface, etc.)
Climate Data at GDAAC
(Atmospheric dynamics, hydrology & precipitation, ocean color, DAO, etc.)
Tropical Rainfall Measuring Mission (TRMM) Data Subsets
South China Sea Monsoon Experiment (SCSMEX)
Climate Station Data from UDel: South America & Global
(Air temperature, precipitation)
Climatology Interdisciplinary Data Collection (CIDC) (CD ROM available)
Pentad/Decade: The Climatology Interdisciplinary Five and Ten Day Data Collection
Special products (e.g. animation of Sea Surface Temperature Anomalies and Winds in the
Tropical Pacific)
1997-98 El Niño Effects on the U.S.
SIESIP Supports SCSMEX Data Analysis
SIESIP provides TRMM gridded, satellite
coincidence data subsets, and GMS data for Field
Campaign, seasonal & inter-annual analyses
• Data available at
http://daac.gsfc.nasa.gov/CAMPAIGN_DOCS/TRMM_FE/scsmex/scsmex.html
• SIESIP is producing TRMM SCSMEX data CD
for international distribution at SCSMEX Science
Team’s request
Tropical Cyclone Leo, 4/29/99
(TSDIS/GMU Orbit Viewer)
Climatology Interdisciplinary Data Collection
(CIDC)
http://daac.gsfc.nasa.gov/
(click on "Interdisciplinary" under DISCIPLINE SPECIFIC INFORMATION)
Comes as a 4-CD-ROM set; in addition, all data is available free
by electronic transfer.
Over 70 Monthly Mean Global Climate Parameters - Land, Ocean,
Sun, Cryosphere, Biosphere, Atmosphere.
The CD-ROM set was produced in collaboration with the Center
for Earth Observing and Space Research (CEOSR) at George Mason
University with GrADS developed at the Center for Ocean Land
Atmosphere Studies (COLA).
AVERAGE SEASONAL-CYCLE ESTIMATES FOR THE
WORLD
Archived are: climatologically averaged values of monthly and annual air temperature (T) and
total precipitation (P) reinterpolated to a 0.5x0.5 degree grid, their associated cross-validation
fields, and the climatic water balance computed at each grid point from T and P.
Gridded datasets are archived on the SIESIP site, as well as on "climate.geog.udel.edu" under
the userid "siesip" (password available on request)
AVERAGE SEASONAL-CYCLE ESTIMATES FOR SOUTH
AMERICA
Archived are: climatologically averaged values of monthly and annual air temperature (T) and
total precipitation (P) interpolated to a 0.5x0.5 degree grid, and their associated cross-validation fields.
Genesis of Available Gridded Datasets
a) Average monthly station T and P drawn from station climatology archives, spatially interpolated to each grid point.
b) Average monthly station T drawn from station climatology archives, spatially interpolated to each grid point using
DEM-aided interpolation
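How station values become gridded values with an associated cross-validation field can be sketched as follows. The slides do not state the interpolation method, so this sketch assumes simple inverse-distance weighting on planar coordinates and leave-one-out cross-validation; both are stand-ins chosen for illustration, and the function names are hypothetical.

```python
def idw(stations, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    stations: list of (x, y, value) tuples. Planar distance is an
    assumption; a real global product would use distance on the sphere.
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in stations:
        d2 = (sx - x) ** 2 + (sy - y) ** 2
        if d2 == 0.0:
            return value  # exactly on a station: use its value
        weight = 1.0 / d2 ** (power / 2.0)
        num += weight * value
        den += weight
    return num / den

def cross_validate(stations):
    """Leave-one-out errors: estimate each station from all the others.

    Returns one (estimate - observed) error per station, in input order.
    """
    errors = []
    for i, (sx, sy, value) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        errors.append(idw(others, sx, sy) - value)
    return errors

def grid_field(stations, xs, ys):
    """Interpolate station values to every (x, y) grid point."""
    return [[idw(stations, x, y) for x in xs] for y in ys]
```

The per-station errors from `cross_validate` could themselves be interpolated to the grid, which is one plausible reading of a "cross-validation field".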
MONTHLY TIME-SERIES ESTIMATES FOR SOUTH AMERICA
Archived are: monthly total precipitation (P) and average air temperature (T) interpolated to a
0.5x0.5 degree grid & associated cross-validation fields.
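The relationship between a monthly time series and an average seasonal cycle at a single grid point can be shown with a short sketch. It assumes the series is a plain list of monthly values starting in a known calendar month, with `None` marking missing months; the function names are hypothetical, and this is not the procedure used to build the archived products.

```python
def seasonal_cycle(monthly_values, start_month=1):
    """Average each calendar month over all years.

    Returns 12 climatological means (January first); a month with no
    valid data yields None.
    """
    sums = [0.0] * 12
    counts = [0] * 12
    for i, value in enumerate(monthly_values):
        if value is None:
            continue
        month = (start_month - 1 + i) % 12
        sums[month] += value
        counts[month] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

def anomalies(monthly_values, start_month=1):
    """Subtract the climatological mean of the matching calendar month."""
    clim = seasonal_cycle(monthly_values, start_month)
    result = []
    for i, value in enumerate(monthly_values):
        if value is None:
            result.append(None)
        else:
            result.append(value - clim[(start_month - 1 + i) % 12])
    return result
```

The same subtraction underlies anomaly fields such as sea surface temperature anomalies: the observed value minus the long-term mean for that calendar month.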
INFORMATION TECHNOLOGY
STRATEGY
• Development of science scenarios to serve particular user communities
• Web accessibility
• Development of user queries
• Integration of tool accessibility with data set accessibility to allow meaningful, user-specified queries
• Integration of a freely/easily accessible analysis tool (GrADS), on-line visualization, and data mining (pyramid) with metadata searches (XML and relational database management systems)
Three-Phase Data Access Model
• Phase 1: The user browses and searches the “static” (or description) metadata and content-based metadata provided by the SIESIP system
• Phase 2: The user gets a quick look at the contents of the data through on-line data analysis
• Phase 3: The user has located the data of interest and then orders the data
• It is an interactive and iterative process
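The three phases can be sketched as three small functions that a client calls in sequence, looping back as needed. This is only an illustration of the access pattern: the `Dataset` record, its fields, and the in-memory catalog are hypothetical and do not describe the actual SIESIP interfaces.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Dataset:
    name: str
    description: str   # "static" (description) metadata
    value_range: tuple  # content-based metadata, e.g. (min, max)
    values: list        # stand-in for the data itself

def phase1_search(catalog, keyword, min_value=None):
    """Phase 1: search description and content-based metadata only."""
    hits = []
    for ds in catalog:
        if keyword.lower() not in ds.description.lower():
            continue
        if min_value is not None and ds.value_range[1] < min_value:
            continue  # content-based filter: data never reaches min_value
        hits.append(ds)
    return hits

def phase2_quick_look(dataset):
    """Phase 2: a quick on-line analysis of the contents."""
    return {
        "min": min(dataset.values),
        "max": max(dataset.values),
        "mean": mean(dataset.values),
    }

def phase3_order(dataset, orders):
    """Phase 3: record an order for the full data set; return an order id."""
    orders.append(dataset.name)
    return f"order-{len(orders)}"
```

A user would typically repeat phases 1 and 2 several times, narrowing the search, before placing an order in phase 3.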
COLA IT: GrADS
• Integrated User Interface Already in Place for
– Selecting, Accessing, and Sampling Data Sets (grids, stations, future - images)
– Computing and Deriving New Quantities
– Quantitatively Visualizing Results
• Designed to Handle Geophysical Data Sets
• Thousands of Users Worldwide
El Niño
1982/83 El Niño Event in
March 1983
Sea Surface Temperature
Anomaly (SSTA) and Wind Field
High values of SSTA are found
near the west coast of S. America
Trade winds have dissipated
Display using GrADS
SIESIP: Distributed Seasonal-Interannual Data System
(Implementation Example)
[Diagram. Labels: GrADS Analysis Workbench; J-GrADS; Datamining Interface; GUI (MetaData Search, Content Browsing, Analysis, Data Order); Class Libraries; Applet/Plug-In; HTML/CGI; HTML; Internet; GrADS Server (three instances, including NOAA Data and NASA Data); Data Order Server; InterOperability Wrapper; User Interface Driver 1; MetaData Server; Data Pyramid Server; NOAA Server; DODS; SWIL; Local SIESIP Data Sets (Metadata, Data, Pyramid); Data and Metadata Systems on the Internet Outside of SIESIP]
E-R Diagram for SIESIP
[Diagram. Entities: Phenomenon Instance; Phenomenon; Predefined Region; Parameter; Specific Parameter; Platform; Instrument; Contact; Data Product; Data Format; Cell; Altitude Coverage; Data File; Temporal Coverage; Cell Value]
Pyramid Data Model
• Motivation -- to support the interactive content-based browsing of large volumes of data
• For example, queries on the statistical properties of the data can be used in a content-based browsing process
• Challenge -- query processing performance for large data volumes
• Solution -- to speed up query evaluations by precomputing intermediate results which contribute to answering user queries
• What kind of precomputations? How to apply them?
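One way to read "precomputing intermediate results" is a multi-resolution hierarchy of summary statistics, sketched below. The slides do not define the pyramid's layout, so the 2x2 block merging, the chosen statistics (min, max, sum, count), and the function names are all assumptions made for illustration. The sketch also assumes a square grid whose side is a power of two.

```python
def summarize(values):
    """Summary statistics for a list of cell values."""
    return {
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "count": len(values),
    }

def merge(children):
    """Combine child summaries without revisiting the raw data."""
    return {
        "min": min(c["min"] for c in children),
        "max": max(c["max"] for c in children),
        "sum": sum(c["sum"] for c in children),
        "count": sum(c["count"] for c in children),
    }

def build_pyramid(grid):
    """Build summary levels from fine to coarse.

    levels[0] has one summary per grid cell; each later level merges
    2x2 blocks of the previous one; the last level is a single summary
    of the whole grid.
    """
    level = [[summarize([v]) for v in row] for row in grid]
    levels = [level]
    while len(level) > 1:
        n = len(level) // 2
        level = [
            [
                merge([
                    level[2 * i][2 * j], level[2 * i][2 * j + 1],
                    level[2 * i + 1][2 * j], level[2 * i + 1][2 * j + 1],
                ])
                for j in range(n)
            ]
            for i in range(n)
        ]
        levels.append(level)
    return levels
```

Because min, max, sum, and count all combine exactly from child summaries, a coarse level answers range or mean queries over its block without touching the underlying cells.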
Precomputed Data Attributes
• Query evaluation performance can be improved through precomputation (i.e. precomputing the predefined data attributes which contribute to query evaluations) and approximation (i.e. deriving query answers approximately from the precomputed data attributes)
• The choice of precomputed data attributes varies with the types of queries to be answered, which in turn depend on the specific domain application
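Approximate answering from precomputed attributes can be illustrated with a threshold query. The sketch assumes each data block carries a precomputed minimum, maximum, and cell count (hypothetical attribute names) and asks how many cells exceed a threshold. Without reading any cell, the answer can only be bracketed, which is the kind of approximation described above.

```python
def count_above(blocks, threshold):
    """Bound the number of cells with value > threshold.

    blocks: list of dicts with precomputed "min", "max", and "count".
    Returns (lower, upper) bounds on the true count.
    """
    lower = 0
    upper = 0
    for block in blocks:
        if block["min"] > threshold:
            # Every cell in the block exceeds the threshold.
            lower += block["count"]
            upper += block["count"]
        elif block["max"] > threshold:
            # Mixed block: the maximum cell exceeds the threshold and
            # the minimum cell does not, so between 1 and count - 1 do.
            lower += 1
            upper += block["count"] - 1
        # Otherwise no cell in the block exceeds the threshold.
    return lower, upper

def needs_refinement(blocks, threshold):
    """Blocks whose raw data must be read to get an exact answer."""
    return [b for b in blocks if b["min"] <= threshold < b["max"]]
```

If the bounds are too loose for the user's purpose, only the blocks returned by `needs_refinement` have to be fetched, which is where the performance gain comes from.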
SIESIP GUI
Data Interoperability
•SIESIP is one of the DODS data server sites.
•GrADS has been added to the DODS suite
of client software.
•DODS data access is enabled through the SIESIP
GUI interface.
•COLA ftp data access is enabled through the SIESIP
GUI interface.
•GrADS as part of DODS server
-To manipulate DODS data before transferring
-To support more data types and data formats
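Server-side subsetting before transfer is usually requested in DODS through a constraint expression appended to the dataset URL. The sketch below only builds such a URL; it follows the bracketed `[start:stride:stop]` hyperslab form as commonly documented for DODS/OPeNDAP, and should be checked against the server in use. The host, dataset path, and variable name in the usage note are hypothetical.

```python
def dods_url(dataset_url, variable, slices, suffix="dods"):
    """Build a DODS request URL with a hyperslab constraint.

    slices: one (start, stride, stop) triple per array dimension,
    using zero-based, inclusive indices.
    suffix: response type, e.g. "dods" for binary data or "dds" for
    the dataset structure.
    """
    parts = []
    for start, stride, stop in slices:
        if start < 0 or stride < 1 or stop < start:
            raise ValueError(f"invalid slice: {(start, stride, stop)}")
        parts.append(f"[{start}:{stride}:{stop}]")
    return f"{dataset_url}.{suffix}?{variable}{''.join(parts)}"
```

For example, `dods_url("http://example.org/dods/demo", "sst", [(0, 1, 11), (40, 1, 50)])` requests the first twelve steps of the first dimension and indices 40 through 50 of the second, so only that subset crosses the network.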