Integrated Grid workflow for mesoscale weather modeling and visualization

Zhizhin, M., A. Polyakov, D. Medvedev, A. Poyda, S. Berezin
Space Research Institute of the Russian Academy of Sciences

Abstract
• For the model input and output we use a scalable parallel storage and data mining system called ActiveStorage. It can store different types of weather data, provided they follow the same Common Data Model (Unidata CDM): NCEP reanalysis, NCDC station weather data, and MM5 model output.
• MM5 is a mesoscale weather forecast model. For the input boundary conditions the model takes basic parameters such as elevation, air pressure and temperature. It can ingest reanalysis and direct observation data. As output the model provides high-resolution regional weather grids.
• To make the MM5 input data and the modeling results accessible on the Grid to the Earth Science community, we have developed a set of grid services (resources and activities) inside the OGSA-DAI (both versions 2 and 3) grid service container.
• To visualize the weather data we have developed a special plugin for NASA World Wind which reads the data directly from the OGSA-DAI resources and plots it over the 3D globe in different ways, such as contour lines, filled areas and vector fields.

Active Storage, Modeling, Data Mining and Visualization Services
[Diagram, data sources: weather observations and reanalysis time series; geographical information: elevation, hydrology, ...]
[Diagram, component labels:
 Active Storage: Common Data Model; Microsoft SQL Server Cluster
 Numerical Modeling: raw data; input; model output
 OGSA-DAI and Matlab API
 Time series; grids; trajectories; derived products from satellite data
 Data Analysis: Environmental Scenario Search Engine (ESSE); trend and change detection algorithms
 Geographical information; numerical models; raw data
 Trends and relations
 Parallel mesoscale meteorological model MM5: Windows Compute Cluster + MPI parallelization
 Visualization: Microsoft Virtual Earth; NASA World Wind; EVL UIC Scalable Graphics Environment (SAGE+SAIL)]

ActiveStorage
• ActiveStorage is a generic storage for arrays of primitive data types.
• Its data model is based on Unidata's Common Data Model, used in netCDF, HDF5 and OpenDAP.
• Basically, ActiveStorage is a SQL Server database with CLR stored procedures and a client library.
• The stored procedures and the client library provide an abstraction layer for data access.
• Large arrays are split into chunks and can be spread across several parallel database servers for better performance.

ActiveStorage components
[Diagram labels: data and directory tables; metadata tables; SQL Server 2005/2008 DB; client library; stored procedures]

Common Data Model
[Class diagram:
 Dataset: name
 Group: name
 Dimension: name, length
 Attribute: name, value, dataType
 Variable: name, shape, dataType
 DataType: char, byte, short, int, long, float, double, String]
This is the Common Data Model (CDM) used in recent versions of OpenDAP, netCDF and HDF5. Its purpose is the representation of multidimensional scientific data.

How it works
1. The application passes a multi-dimensional data request to the client library.
2. The client library issues commands to the database server.
3. The SQL Server database selects the requested data from several chunks and returns the data parts to the client library.
4. The client library assembles the data parts into one multi-dimensional array.

Parallel query processing
[Diagram labels: Application; Client library; SQL Server DB 1; SQL Server DB 2]

Parallel query performance
[Chart: 1 database server vs. 4 parallel database servers]

NCEP/NCAR Weather Reanalysis
• Continually updated gridded data set
• Incorporates observations and global climate model output
• 74 weather parameters
• 5000 netCDF files, 30 – 500 MB each
Time coverage:
• 1948 – 2008
• 4-hourly values
Grids:
• Regular grid, 2.5 x 2.5 degrees
• T62 Gaussian grid, 192 x 94 points

NCDC Integrated Surface Database
Sources: fixed ground stations, ships, mobile stations, buoys
• 1901 – 2008 time coverage
• 470 000 ASCII files packed with gzip
• 30 million sensors
• 50 GB packed; 400 GB unpacked
• 1.7 billion observations

When you've downloaded and unpacked the data...
[Record layout: control data section; mandatory data section; section marker; additional data section]
0189010020999992007022817004+80050+016250FM-12+000899999V0202201N008019999999N0090001N1+00631+00541098651ADDGA1031+003009999KA1120N+99999...
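The date, time, latitude and longitude fields of the sample record can be pulled out with fixed-width slices. The sketch below is only an illustration: the character offsets and the scaling factor of 1000 for the coordinates are assumptions read off the sample record and the slide's field labels, and should be checked against the ISD format document before use. The names `ControlSection` and `parse_control_section` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ControlSection:
    date: str         # YYYYMMDD
    time: str         # HHMM
    latitude: float   # degrees, assumed stored as thousandths of a degree
    longitude: float  # degrees, assumed stored as thousandths of a degree


def parse_control_section(record: str) -> ControlSection:
    """Slice the leading fixed-width fields of one ISD record.

    Offsets are assumptions inferred from the sample record on the slide.
    """
    return ControlSection(
        date=record[15:23],
        time=record[23:27],
        latitude=int(record[28:34]) / 1000.0,
        longitude=int(record[34:41]) / 1000.0,
    )


# First 60 characters of the sample record shown above.
SAMPLE = "0189010020999992007022817004+80050+016250FM-12+000899999V020"

header = parse_control_section(SAMPLE)
print(header.date, header.time, header.latitude, header.longitude)
```

Only the leading part of the record is handled here; the mandatory and additional data sections use group markers and would need their own parsing rules.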
[Labels on the sample record: date; time; lat; lon; group marker; parameter group]

MATLAB script using ActiveStorage library

    % Java classes: the ActiveStorage connector and the SQL Server JDBC driver
    import ru.wdcb.mdb.NcConnector
    import com.microsoft.sqlserver.jdbc.SQLServerDriver

    % JDBC connection string for the NCEP database
    s = 'jdbc:sqlserver://localhost:1433;databaseName=NCEP_01;user=guest;password=guest';
    connector = NcConnector();
    ncid = connector.nc_open(s,0);
    varid = connector.nc_inq_varid(ncid,'air');

    % Time series: 80000 values along the first dimension at one grid point
    origin = [0 0 10 10];
    size = [80000 1 1 1];
    stride = [1 1 1 1];
    A = connector.nc_get_vars_short(ncid,varid,origin,size,stride);
    plot(A, 'DisplayName', 'A', 'YDataSource', 'A'); figure

    % Single grid: all 73 x 144 points at the first index of the other dimensions
    origin = [0 0 0 0];
    size = [1 1 73 144];
    stride = [1 1 1 1];
    B = connector.nc_get_vars_shortm(ncid,varid,origin,size,stride);
    B = reshape(B,[73 144]);
    imagesc(B); figure(gcf);

Environmental Data Service: OGSA-DAI plugin
[Diagram labels: Tomcat; NCEP database; Clients; getProperty: sources; DAI sources list; MM5 weather model; getMetadata; SPIDR databases; Metadata XML; MS Excel; getXMLData; Active Storage; NetCDF file serialisation; NWS database; User data XML; getNetCDFData; URL to NetCDF file; Data export; NetCDF file; Any client]

Activities for data export
• XML output stream
– We have a plugin for NASA World Wind to visualize XML-formatted data
– Can easily be transformed using XSLT into a web page or another XML document, e.g. MS Excel
– Can be used as input for the ESSE fuzzy logic search engine
• NetCDF binary data file
– Standard for scientific data storage in files
– There are several visualization programs for NetCDF
– Compatible with the Unidata Common Data Model standard

Data flow management by OGSA-DAI
[Diagrams: OGSA-DAI query from a single data source; OGSA-DAI query from distributed data sources]

Parallel mesoscale weather model MM5

Same Source Parallel MM5
• The source code for the parallel MPI and the single-process MM5 model is the same
• Automated parallel code generation from MM5 sources by ANL:
– FLIC compiler
– RSL library for model domain segmentation and message exchange
• We have ported the MM5 code to the MS Windows Server 2008 HPC platform

MM5 model as a grid client

Visualizing data from ActiveStorage with NASA WorldWind
A NASA WorldWind plugin, developed at Moscow State University, retrieves data from ActiveStorage via an OGSA-DAI service. Several kinds of visualization are available:
- isolines
- color map
- vector field
OGSA-DAI services can be used by other applications to retrieve data from ActiveStorage.

NASA World Wind as a grid client
Using OGSA-DAI services and a special API plugin, NASA World Wind can visualize both the MM5 input and output datasets.
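Supplementary sketch: chunked reads

As a closing illustration of the "How it works" steps, the following minimal sketch shows how a request given as origin, size and stride (the same three arguments used in the MATLAB script) might be answered from an array stored in chunks. It is not the ActiveStorage implementation: the row-block chunk layout, the in-memory dictionary standing in for the database servers, and the names `split_into_chunks` and `read_hyperslab` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Grid = List[List[int]]


def split_into_chunks(grid: Grid, chunk_rows: int) -> Dict[int, Grid]:
    """Split a 2-D grid into row blocks, keyed by the index of their first row."""
    return {r: grid[r:r + chunk_rows] for r in range(0, len(grid), chunk_rows)}


def read_hyperslab(
    chunks: Dict[int, Grid],
    chunk_rows: int,
    origin: Tuple[int, int],
    size: Tuple[int, int],
    stride: Tuple[int, int],
) -> Grid:
    """Select the requested values from the chunks and assemble one 2-D array."""
    result: Grid = []
    for i in range(size[0]):
        row = origin[0] + i * stride[0]
        chunk_start = (row // chunk_rows) * chunk_rows  # chunk holding this row
        source_row = chunks[chunk_start][row - chunk_start]
        result.append(
            [source_row[origin[1] + j * stride[1]] for j in range(size[1])]
        )
    return result


# An 8 x 6 grid whose values encode their position (row * 10 + column).
grid = [[r * 10 + c for c in range(6)] for r in range(8)]
chunks = split_into_chunks(grid, chunk_rows=3)

# Rows 1, 3, 5 and columns 0, 3: the rows come from two different chunks.
part = read_hyperslab(chunks, 3, origin=(1, 0), size=(3, 2), stride=(2, 3))
print(part)
```

In a deployment with several database servers, each chunk lookup would become a query to the server holding that chunk, so the per-chunk selections could run in parallel before the client assembles the result.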