The Unacceptable Time Series: Water Resources Artificial Intelligence

North Slope Decision Support System
Presented at the DataNet Federation Consortium’s DataGrid Sessions, April 2013
Presented by Jack Hampson/Stephen Bourne, Atkins
Bill Schnabel, Amy Tidwell, Univ. Alaska
Kelly Brumbelow, Texas A&M
Stephen Bourne, Leslie Gowdish, Atkins
Research sponsored by
National Energy Technologies Lab,
U.S. Dept. of Energy
NSDSS Overview
Research Topics addressed in NSDSS
Water Resources Artificial Intelligence:
The Unacceptable Time Series
NSDSS Overview
• Decision Support System
used in ice road planning on
the North Slope of Alaska
• Research Grant from U.S.
Dept. of Energy, 2008-2012.
• Includes Web App and
• Cyberinfrastructure includes
multiple ODM-based databases
• Cyberinfrastructure includes
catalog service, similar in
concept to HIS Central
• Includes Model Sharing of
Ice Road Plans, Water
Budget Models, Lake
Dissolved Oxygen Models.
• MS Silverlight App
• Workbench User Experience paradigm. Map is the work surface. Widgets provide
functionality and float over map.
• Home Bar provides access to search, data publishing, ice road planning, environmental
analysis widgets.
NSDSS Web App: Search and Data Exploration
Type in data you
want. Semantic
Mediation handles
not getting the
exact right name
(Rain = Precip = P =
R = Rainfall =…)
Search Area is
current map
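The semantic mediation idea above can be illustrated with a small synonym lookup. This is only a sketch: the `SYNONYMS` table, the `mediate` and `search` functions, and the catalog layout are hypothetical, since the slides do not describe the NSDSS vocabulary or its implementation.

```python
# Hypothetical synonym table; the real NSDSS vocabulary is not given in the slides.
SYNONYMS = {
    "precipitation": {"rain", "rainfall", "precip", "p", "r"},
    "temperature": {"temp", "t", "air temperature"},
}


def mediate(term):
    """Map a user-typed term to a canonical variable name, or None if unknown."""
    key = term.strip().lower()
    for canonical, aliases in SYNONYMS.items():
        if key == canonical or key in aliases:
            return canonical
    return None


def search(term, catalog):
    """Return catalog entries whose variable name mediates to the same concept."""
    concept = mediate(term)
    if concept is None:
        return []
    return [entry for entry in catalog if mediate(entry["variable"]) == concept]


# Illustrative catalog entries (made up for this example).
catalog = [
    {"site": "A", "variable": "Rainfall"},
    {"site": "B", "variable": "P"},
    {"site": "C", "variable": "Temp"},
]
matches = search("Rain", catalog)  # entries for sites A and B
```

A term the table does not know, such as "Polar Bear dens", returns no results here, which is the limitation the later research topic on broader search intent is asking about.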
Search results are presented in the
Data Explorer Widget.
Results are:
• Field data (from ODM databases)
• Gridded data (from NetCDF
Clicking on a search result item
highlights the sites on the map with
the data you are looking for.
Click the site to select, then click
Chart data in the widget. The data is
presented in the chart.
Data can be adjusted in terms of
time step and statistic.
Data can be downloaded.
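Adjusting a series by time step and statistic amounts to grouping observations and applying a summary function. The sketch below assumes the data arrives as `(date, value)` pairs; the `aggregate` function and the supported options are illustrative choices, not the NSDSS widget's actual behavior.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

STATISTICS = {"mean": mean, "sum": sum, "min": min, "max": max}


def aggregate(series, time_step="month", statistic="mean"):
    """Aggregate (date, value) pairs to a coarser time step with the chosen statistic."""
    if time_step == "month":
        key = lambda d: (d.year, d.month)
    elif time_step == "year":
        key = lambda d: (d.year,)
    else:
        raise ValueError(f"unsupported time step: {time_step}")
    groups = defaultdict(list)
    for day, value in series:
        groups[key(day)].append(value)
    stat = STATISTICS[statistic]
    return {k: stat(v) for k, v in sorted(groups.items())}


# Made-up daily values for illustration.
daily = [(date(2010, 1, 1), 2.0), (date(2010, 1, 2), 4.0), (date(2010, 2, 1), 1.0)]
monthly_totals = aggregate(daily, time_step="month", statistic="sum")
```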
NSDSS Web App: A Slope-wide Lakes Database
Click on any lake on the North Slope
and you get information on its name,
size, fish species, and models that
have been generated for it.
NSDSS Web App: Data Publishing
Site Panel:
• View Existing Sites.
• Select Site.
• Add Site.
Variable Panel:
• View Existing Variables.
• Which Variables the
selected site has.
• Add new variables
Data Panel:
• Paste in data from
• Chart data to verify
• Commit Edits to upload
data to ODM.
NSDSS Web App: Lake Water Budgets and DO Models
Select Lake of Interest
• Select any lake on the landscape
• Interested in Lake Water Budget to
understand impact of removing water for
ice road construction
Tool suggests data to use
(see discussion on the unacceptable time series)
Water Budget Model is built through an
interactive tree workflow.
• Four steps.
• As you click on items in the tree, the controls
for entering data are presented in the right
• Here, “Step 2. Enter Inputs” has been
clicked. The grid shows the GCM data that
must be collected to do the forecast of water
budget terms (Rain, Temp, etc.). You can chart
the GCM data when it is collected (see right).
• Needs Rain, Snow, Temp, Net Radiation
• Requires GCM data for forecast
• Historic data from Sites for GCM
• Uses cyberinfrastructure and catalog
service to search for data.
NSDSS Web App: Ice Road Planning
Top Routes
• Top ten routes
• Need to consider multiple routes because
there are multiple criteria. Best route for
one criterion might not be best for all.
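One way to see why several routes must be kept is a non-dominated (Pareto) filter: a route is dropped only if some other route is at least as good on every criterion and better on one. This is an illustration of the multi-criteria point, not the ranking method NSDSS uses; the route names and scores are invented, and lower is assumed better for every criterion.

```python
def dominates(a, b):
    """True if route a is no worse than b on every criterion and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(routes):
    """Keep routes that no other route dominates (lower is better for every criterion)."""
    return {
        name: scores
        for name, scores in routes.items()
        if not any(
            dominates(other, scores)
            for other_name, other in routes.items()
            if other_name != name
        )
    }


# Scores are (construction days, construction cost, travel hours); values are made up.
routes = {
    "north": (30, 1.2, 5.0),
    "south": (45, 0.9, 4.0),
    "detour": (50, 1.5, 6.0),
}
kept = non_dominated(routes)  # "detour" is worse than "north" on all three criteria
```

Here "north" is fastest to build while "south" is cheapest and quickest to drive, so neither can be discarded without knowing the planner's priorities.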
Ice Road Routing
• Specify Start and End Point and which
lakes to use.
• Algorithm based on behavior of ants in
finding food.
• Finds best routes from start to end points
based on duration of construction,
construction cost, and road travel time.
• Avoids sensitive vegetation, steep grade,
endangered and at-risk species habitats.
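The ant-based routing idea can be sketched as a basic ant colony optimization over a graph. Everything below is an assumption made for illustration: the slides do not give the NSDSS algorithm's details, so the function name, parameters, and the single combined edge cost (standing in for construction duration, cost, and travel time) are hypothetical. Areas to avoid, such as sensitive vegetation or steep grade, could be modeled by leaving those edges out of the graph.

```python
import random


def ant_colony_route(graph, start, end, n_ants=20, n_iters=50,
                     evaporation=0.5, alpha=1.0, beta=2.0, seed=0):
    """Return (best_path, best_cost) found by a basic ant colony search.

    graph maps node -> {neighbor: cost}; each cost is a single positive number.
    """
    rng = random.Random(seed)
    pheromone = {(u, v): 1.0 for u in graph for v in graph[u]}
    best_path, best_cost = None, float("inf")

    for _ in range(n_iters):
        completed = []
        for _ in range(n_ants):
            path, visited = [start], {start}
            while path[-1] != end:
                here = path[-1]
                options = [v for v in graph[here] if v not in visited]
                if not options:  # dead end: abandon this ant
                    break
                # Favor edges with more pheromone and lower cost.
                weights = [
                    pheromone[(here, v)] ** alpha * (1.0 / graph[here][v]) ** beta
                    for v in options
                ]
                nxt = rng.choices(options, weights=weights)[0]
                path.append(nxt)
                visited.add(nxt)
            if path[-1] == end:
                cost = sum(graph[u][v] for u, v in zip(path, path[1:]))
                completed.append((path, cost))
                if cost < best_cost:
                    best_path, best_cost = path, cost
        # Evaporate, then let successful ants reinforce the edges they used.
        for edge in pheromone:
            pheromone[edge] *= 1.0 - evaporation
        for path, cost in completed:
            for edge in zip(path, path[1:]):
                pheromone[edge] += 1.0 / cost
    return best_path, best_cost


# Toy network: a start pad, two lakes used as water sources, and a drill site.
graph = {
    "pad": {"lake1": 4.0, "lake2": 2.0},
    "lake1": {"pad": 4.0, "lake2": 1.0, "site": 1.0},
    "lake2": {"pad": 2.0, "lake1": 1.0, "site": 5.0},
    "site": {"lake1": 1.0, "lake2": 5.0},
}
path, cost = ant_colony_route(graph, "pad", "site")
```

In this toy network the cheapest route passes through both lakes; a real application would need a routing graph derived from terrain and habitat data.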
Ice Road Construction and Usage Schedule
• Estimates start and end of Tundra season
(i.e., when cold enough to build ice roads)
based on historic and GCM-based
forecast of temperature.
• Uses multiple historic years and GCM
forecasts to estimate uncertainty in the
Tundra season's start and close.
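A minimal sketch of the season-start estimate, under stated assumptions: the season is taken to open on the first day of a run of consecutive days at or below a temperature threshold, and uncertainty is summarized as the spread of that day across years. The threshold, run length, and function names are hypothetical; the slides only say the estimate is based on historic and GCM-forecast temperature.

```python
from statistics import mean


def first_cold_day(temps, threshold=-5.0, run=3):
    """Index of the first day that starts `run` consecutive days at or below threshold."""
    for i in range(len(temps) - run + 1):
        if all(t <= threshold for t in temps[i:i + run]):
            return i
    return None


def season_open_spread(temps_by_year, threshold=-5.0, run=3):
    """Summarize the season-open day across several years (historic or forecast)."""
    days = []
    for temps in temps_by_year.values():
        day = first_cold_day(temps, threshold, run)
        if day is not None:
            days.append(day)
    if not days:
        return None
    return {"earliest": min(days), "mean": mean(days), "latest": max(days)}


# Made-up daily temperatures (deg C), counted from an arbitrary autumn start date.
temps_by_year = {
    2009: [0, -6, -7, -8, -9],
    2010: [1, 0, -6, -6, -6],
    2011: [-5, -5, -5, 0, 0],
}
spread = season_open_spread(temps_by_year)
```

The season close could be estimated the same way by scanning for the first sustained run above the threshold in spring.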
NSDSS Technology Research Topics
• Subjects researched:
• Maturing Water Resources Search:
Semantic mediation allows us to type “Rain” and get all data related to rain in the search area, whether it’s called Rain or Rainfall, or
R, or P, or whatever.
What if we can type, “Polar Bear dens,” or “Lakes with Water Budget Models”? Can we extend search to understand a broader
range of intent?
• On-the-fly Unit Conversion
• Data Fuzzing
What if you don’t want to say exactly where endangered species are for fear of poaching, but you want to confirm they are present
in the general area?
Became an exemplar case in the NSF Data Conservancy Project.
• The Undiscovered Cyberinfrastructure:
What if the client doesn’t know what databases are out there?
What if the web service methods are different at each database, i.e., not standardized?
Can the central catalog not only communicate to the client which database to go to, but how to collect the data from the database?
Developed methods to generalize catalog communications, to inform the client of the name of web method and the parameters
that must be presented in the call to get the data required.
• NetCDF
Can a standardized NetCDF web service and related file-based database be created?
How shall it extract and provide data to the user?
What about publishing NetCDF data (model results, etc.)?
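The generalized catalog communication described under "The Undiscovered Cyberinfrastructure" can be pictured as a catalog that returns, for each database, the web method name and the parameter names that method expects. The sketch below is an assumption-laden illustration: the catalog layout, URLs, method names, and parameter names are all hypothetical and are not taken from NSDSS.

```python
from urllib.parse import urlencode

# Hypothetical catalog entries: each database advertises its own method name
# and the parameter names that method expects.
CATALOG = {
    "db_a": {
        "base_url": "https://example.org/a",
        "method": "GetValues",
        "params": {"site": "SiteCode", "variable": "VarCode"},
    },
    "db_b": {
        "base_url": "https://example.org/b",
        "method": "fetch_series",
        "params": {"site": "station", "variable": "parameter"},
    },
}


def build_request(database, site, variable):
    """Translate a generic (site, variable) request into the database's own call."""
    entry = CATALOG[database]
    names = entry["params"]
    query = urlencode({names["site"]: site, names["variable"]: variable})
    return f"{entry['base_url']}/{entry['method']}?{query}"


url_a = build_request("db_a", "S1", "Rain")
url_b = build_request("db_b", "S1", "Rain")
```

The client code stays the same for both databases; only the catalog entry differs, which is the point of having the catalog describe how to call each service.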
NSDSS Technology Research Topics (cont)
• Subjects researched:
• Flattening the Model-building Curve: The Model-Any-Lake workflow.
How can we make it much easier to create models for hydrologic features?
Is it possible to build a tool where all the lakes in the landscape are presented, and you can build a water budget, or water
quality model for any lake you choose without the need for data collection and processing?
Can all of the needed data be present? Can you quality control the data right within the tool before using it?
• Model Publishing and Sharing
Models saved as XML blobs in database
Models presented in Web App through map. Select a lake, see its models. Use models in later IRP analysis. Makes way for
hydrologists doing the modeling, and planners doing the planning.
All IRP, Water Budget, and DO models saved in the same cyberinfrastructure
Models carry instructions on which data to pull from CI as input
Models carry result data and inter-processing data within XML.
• Using GCMs for forecasting
Can GCM data be integrated into the modeling exercise and used in ensemble forecasting?
Introduced on-the-fly GCM downscaling to ensure GCM data contains local climate signature.
Works within the “model-any-lake” workflow.
• Water Resources Artificial Intelligence
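The model publishing idea above (models stored as XML that carries both input-data instructions and results) might look like the following sketch. The element and attribute names, the lake name, and the site identifiers are hypothetical; the slides do not show the actual NSDSS schema.

```python
import xml.etree.ElementTree as ET


def model_to_xml(lake, model_type, inputs, results):
    """Serialize a model, its input-data instructions, and its results to an XML string."""
    root = ET.Element("Model", {"lake": lake, "type": model_type})
    inputs_el = ET.SubElement(root, "Inputs")
    for variable, site in inputs.items():
        ET.SubElement(inputs_el, "Series", {"variable": variable, "site": site})
    results_el = ET.SubElement(root, "Results")
    for month, value in results.items():
        ET.SubElement(results_el, "Value", {"month": str(month)}).text = str(value)
    return ET.tostring(root, encoding="unicode")


def inputs_from_xml(blob):
    """Read back which series a stored model says to pull from the cyberinfrastructure."""
    root = ET.fromstring(blob)
    return {s.get("variable"): s.get("site") for s in root.find("Inputs")}


# Illustrative values only.
blob = model_to_xml(
    "ExampleLake",
    "WaterBudget",
    inputs={"Rain": "site-12", "Temp": "site-07"},
    results={1: 0.0, 7: 31.5},
)
needed = inputs_from_xml(blob)
```

Because the blob is a plain string, it can be stored in a single database column and reloaded later by a planner who did not build the model.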
NSDSS Technology Research Topics (cont)
• Subjects researched:
• Water Resources Artificial Intelligence
• Cyberinfrastructure brings data, lots of data!
• Need to start developing tools that “shake the cyber trees” and deliver the best data for the
modeling process we intend to do.
• See the next few slides for a presentation on the Unacceptable Time Series.
The Unacceptable Time Series
Water Resources Artificial Intelligence
Suggesting Input Data
Models of many kinds require time series input.
With time series standardization, tools can be developed to seek out time
series that match the intent of the modeling exercise from multiple databases
in the cyberinfrastructure – “Shaking the cyber trees”.
Challenge is to add in human quality control at the right times in the process.
Example – North Slope Lake Water Budgets
From the North Slope
Decision Support System
Project – Dept. of Energy
Tool allows you to select
any lake on the North
Slope and create a water
budget model for it
using field observations
data and data from
Tool suggests best sites
by searching an area
around the lake for
Precip, Temp, and Net Radiation.
Sites are sought
within a 100km
radius of the target
lake. Rainfall, Temp,
Net Radiation, and
Snow Depth
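The 100 km radius search can be sketched with a great-circle distance filter. The haversine formula and a mean Earth radius of 6371 km are standard; the site records, field names, and the `candidate_sites` function are assumptions made for this example, not the NSDSS implementation.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def candidate_sites(lake, sites, variable, radius_km=100.0):
    """Sites measuring `variable` within radius_km of the lake, nearest first."""
    found = []
    for site in sites:
        if variable not in site["variables"]:
            continue
        d = haversine_km(lake["lat"], lake["lon"], site["lat"], site["lon"])
        if d <= radius_km:
            found.append((d, site["name"]))
    return sorted(found)


# Made-up coordinates for illustration.
lake = {"lat": 70.0, "lon": -150.0}
sites = [
    {"name": "near", "lat": 70.2, "lon": -150.0, "variables": {"Rain", "Temp"}},
    {"name": "far", "lat": 72.0, "lon": -150.0, "variables": {"Rain"}},
    {"name": "no_rain", "lat": 70.1, "lon": -150.0, "variables": {"Temp"}},
]
rain_sites = candidate_sites(lake, sites, "Rain")
```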
Time Series Discrimination
Often, more than one
time series is found that
will provide data. Which
one is better?
Algorithm balances
distance from lake with
time span of available
data using an Index.
Index = a × Distance + b × TimeSpan
The time series with the
highest index is suggested.
Distance to lake and time
span of data are considered
in selecting a time series
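The index can be sketched as a weighted sum. The slide gives only the form of the equation, so the weights here are assumptions: because the highest index wins and a nearer site is presumably preferable, this example uses a negative weight `a` on distance and a positive weight `b` on time span. The candidate records are invented.

```python
def suitability_index(distance_km, span_years, a=-0.01, b=0.1):
    """Index = a * Distance + b * TimeSpan; a and b are assumed weights."""
    return a * distance_km + b * span_years


def suggest(candidates, a=-0.01, b=0.1):
    """Return the candidate time series with the highest index."""
    return max(
        candidates,
        key=lambda c: suitability_index(c["distance_km"], c["span_years"], a, b),
    )


# Made-up candidates: a close site with a short record, and two farther sites
# with long records.
candidates = [
    {"site": "A", "distance_km": 10, "span_years": 2},
    {"site": "B", "distance_km": 60, "span_years": 20},
    {"site": "C", "distance_km": 95, "span_years": 21},
]
best = suggest(candidates)                      # balances distance and record length
nearest = suggest(candidates, a=-1.0, b=0.0)    # distance only
```

With the default weights the tool would pick site B, trading some distance for a much longer record; changing the weights shifts that balance.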
Deeper Time Series Investigation
Once time series are specified, the next
step is to collect the data.
Once downloaded, data quality is assessed.
Often, there are gaps in the data, some of
which make the data impossible to use.
To assess the data, the tool provides a
“Check Completeness” option.
In this case, the net radiation data only has
observations from April to December each
year. The 12-month climatology needed for
the water budget can’t be made.
Note that this type of quality control can
only be done if the tool knows the intent of
the model. The model needs observations
in all 12 months of the year, and the QC
routine has to check for this.
Without knowledge of intent, QC has to be
more general, and modeling opportunities
may be lost.
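An intent-aware completeness check like the one described above can be sketched as follows. The function names and the `(date, value)` layout are assumptions; the key idea from the slide is that the set of required months comes from the model, not from the data.

```python
from datetime import date


def missing_months(observations, required_months=range(1, 13)):
    """Required months of the year that have no observation in the (date, value) series."""
    present = {day.month for day, value in observations if value is not None}
    return sorted(set(required_months) - present)


def is_complete(observations, required_months=range(1, 13)):
    """True if every month the model needs has at least one observation."""
    return not missing_months(observations, required_months)


# Made-up net radiation record with observations only from April to December.
net_radiation = [
    (date(year, month, 15), 100.0)
    for year in (2009, 2010)
    for month in range(4, 13)
]
gaps = missing_months(net_radiation)  # January to March are missing
```

The same record passes if the model only needs April to December, which is why a check that does not know the model's intent would have to be more general.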
The Do Not Use List
If a time series is deemed
unusable, users can elect to find
another site.
At this point, the time series is
added to the “Do Not Use List,”
which is specific to the model.
Then, on suggesting another
site, the tool knows to avoid the
unacceptable time series.
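The Do Not Use List behavior can be sketched as a per-model set of rejected series that the suggester skips. The class and the series identifiers are hypothetical; the ranking is assumed to come from an earlier step such as the distance and time span index.

```python
class ModelInputSuggester:
    """Suggests the best-ranked series for one model, skipping rejected ones."""

    def __init__(self, ranked_series):
        self.ranked_series = list(ranked_series)  # best candidate first
        self.do_not_use = set()                   # specific to this model

    def suggest(self):
        """Return the best series not on the Do Not Use List, or None."""
        for series_id in self.ranked_series:
            if series_id not in self.do_not_use:
                return series_id
        return None

    def reject(self, series_id):
        """Mark a series as unacceptable for this model and suggest the next one."""
        self.do_not_use.add(series_id)
        return self.suggest()


ranked = ["site-12/NetRad", "site-07/NetRad"]
budget_model = ModelInputSuggester(ranked)
first = budget_model.suggest()
second = budget_model.reject(first)

# A different model keeps its own list, so the rejected series is still offered.
other_model = ModelInputSuggester(ranked)
```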