NSF Medium ITR
Real-Time Mining of Integrated Weather
Information
Setup meeting (Aug. 30, 2002)
Goals
Develop dynamic data mining applications
(wherein information is extracted and provided to
forecasters in real-time).
Develop applications of radar data to identify
severe weather signatures in a probabilistic
manner.
Build a prototype system so that these
applications can be developed and tested on real-time and on archived data sets.
Tasks
Projects
Dual-polarization algorithms
Clustering and Prediction
Vortex Identification
Areas of IT research
SVMs (identification & prediction), multivariate feature
identification techniques, probabilistic feature extraction, high
performance issues
I will talk about the tasks: we can decide the
applicable areas as a group.
Funding
Funded at $300K for the first year.
May get $650K over the next two years.
We need to show results at the end of the year, so
it is good to know what the reviewers liked and
did not like about our proposal.
Negative Reviews (NSF)
Unfocused
No high-performance computing or numerical
simulations
Real-time not explicitly defined
Budget way too high
No human-factors expertise
No details of how the approach could solve the
problem.
Reviewers liked these
Develop sensor compensation techniques for
faulty sensors
Strong application focus on a complex domain
Experience with disseminating systems and
WSR-88D algorithms
We seem to have been funded based on what we
have done before, rather than on the merits of this
particular proposal.
From the 6th reviewer
Extend their previous working system (WDSS)
with the following features:
integrating multiple sources of data
learning in real-time, thus improving the prediction capabilities
using statistics-based instead of heuristics-based decisions.
Use of these methodologies for teaching purposes, as well as
the dissemination of this software to other research laboratories
and the creation of a common research tool
Also from the 6th reviewer
Could have been improved:
the proposal seems to be an enumeration of different
techniques, without any justification of why these methods
have been chosen instead of other ones.
detailed explanations are sometimes missing.
My recommendation is to fund this proposal, but at a lower
level than the one proposed by the investigators.
Tasks
Three tasks:
Vortex Detection
Clustering and prediction
Polarimetric Radar
Real-Time
Classical: data periodicity (keep up with data).
Hard to define for multi-sensor applications
If you have a 3-radar domain, with a new elevation scan every
30 seconds, you get a new updated virtual volume on average
every 10 seconds. Is periodicity 10 seconds?
Lightning strikes are essentially asynchronous.
Proposed: based on required lead-time
Example: average lead-time for a tornado warning is 11
minutes. We could set as a goal predicting tornadoes 20
minutes into the future. If we can do it with data from 30
minutes before the event, then we have 10 minutes to process the data.
Keep in mind that the forecasts have to be continuous. We have to
make runs once every 10 minutes.
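The two timing arguments on this slide reduce to simple arithmetic. The sketch below only restates the numbers given above; the function names are illustrative, and the periodicity formula assumes the radars are unsynchronized and all scan at the same rate.

```python
def virtual_volume_interval(n_radars: int, scan_period_s: float) -> float:
    """Average time between virtual-volume updates when each of n_radars
    delivers a new elevation scan every scan_period_s seconds."""
    return scan_period_s / n_radars


def processing_budget_min(data_lead_min: float, forecast_lead_min: float) -> float:
    """Time available for processing: how long before the event the data
    were collected, minus the lead-time the forecast must still provide."""
    return data_lead_min - forecast_lead_min


# Numbers from the slide: 3 radars, 30 s per elevation scan.
print(virtual_volume_interval(3, 30.0))
# Data from 30 minutes before the event, 20-minute forecast goal.
print(processing_budget_min(30.0, 20.0))
```

Under the lead-time definition, the processing budget rather than the sensor update rate sets the run interval.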
Task 1: Vortex Detection
At the end of this year, aim to have a vortex
identification and prediction technique that:
Uses data from multiple sensors
Uses some novel data (more on this follows)
Accommodates faulty information
Is capable of better skill than MDA/TDA
Is capable of providing more lead-time to a forecaster.
Decision Support System: provide forecaster with
rationale for all suggested decisions.
Current MDA/TDA
Mesocyclone detection technique
find 2D detections by analyzing azimuthal shear
associate them based on rank and time into 3D circulation
features if they meet some strength thresholds
3D circulations that meet depth, base and strength criteria are
classified as mesocyclones.
Problems with current vortex
algorithms
Defined on radial velocity field.
Single radar
Simple use of radar reflectivity (>0 dBZ)
Mesocyclone spatial extent based on radial
velocity values, which are noisy
How can we improve it?
Use of LLSD
One promising source of data is a linear least-squares fit of radial velocity in the neighborhood
of a gate.
The size of the neighborhood depends on the
range from the radar.
Fit to a linear combination of azimuth and range
Coefficient for azimuth is an estimate of
azimuthal shear
Coefficient for range is the divergence.
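As a minimal sketch of the fit described above, assume each gate in the neighborhood is given as an offset from the center gate in the azimuthal and range directions (in distance units) together with its radial velocity. The plane v ≈ v0 + shear·az + div·rng is then fitted by ordinary least squares. The function names, the sample layout, and the use of the normal equations are assumptions for illustration, not the WDSS-II implementation.

```python
def solve3(m, y):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, y)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        s = a[i][3] - sum(a[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = s / a[i][i]
    return x


def llsd_fit(samples):
    """samples: (az_offset, range_offset, radial_velocity) per gate.
    Returns (azimuthal_shear, divergence) from the least-squares plane."""
    samples = list(samples)
    rows = [(1.0, az, rg) for az, rg, _ in samples]
    vals = [v for _, _, v in samples]
    # Normal equations: (A^T A) x = A^T v
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atv = [sum(r[i] * v for r, v in zip(rows, vals)) for i in range(3)]
    _, shear, div = solve3(ata, atv)
    return shear, div


# Synthetic 3x3 neighborhood with known shear and divergence.
gates = [(az, rg, 2.0 + 0.01 * az - 0.005 * rg)
         for az in (-500.0, 0.0, 500.0) for rg in (-250.0, 0.0, 250.0)]
shear, div = llsd_fit(gates)
```

Because the fit averages over the neighborhood, the resulting shear estimate is less sensitive to gate-to-gate noise than a simple velocity difference.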
LLSD usage
Azimuthal shear field
Boundaries
Tornadoes frequently happen at the boundaries between air masses
Not necessary
Image shows dryline boundary
Image processing for boundaries to detect gust-fronts would be useful.
Input Sources
The LLSD has never been used in vortex
detection. Unlike the raw radial velocity, it can be
combined from multiple radars.
Also have satellite data from spatial domain
Have national/regional lightning data.
The Near Storm Environment (RUC model)
Still need to assimilate LLSD and reflectivity data
from multiple radars in a fault-tolerant manner.
(Can now do fault-tolerant time-based merges).
Learning
Add a learning component
Incorporate warnings issued by forecaster into the learning by
the algorithm.
Warnings can be faulty. Different forecasters have different
skills. Therefore, this has to be achieved by the algorithm
learning on the fly.
Validate the algorithm against storm reports. The verification
data is noisy. Have to come up with robust ways of doing this
verification.
Data: status
The WDSS-II system already ingests radar data
from multiple radars and national/regional
lightning data.
Work is underway to ingest satellite data in real-time (archived cases can be done already).
We have archived warnings and RUC data since
April of this year.
Currently testing a process to compute LLSD at
different scales.
RUC model data needs to be ingested.
Discussion
What kinds of techniques are appropriate for vortex
detection?
Multiple-sensor reflectivity, LLSD
RUC model data (in Lambert projection)
Multivariate analysis
Gust-front detection
Task 2: Clustering and Prediction
Currently there are two ways to identify storms:
Heuristic threshold-based technique that operates on radial
reflectivity field.
Texture segmentation method.
Once identified, the storms are predicted by:
Matching centroids of storms identified and linear
extrapolation
Find motion estimate by minimizing mean absolute error on
actual field. Then, forecast.
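The first prediction approach (centroid matching plus linear extrapolation) can be sketched as follows. This is a simplified stand-in, not SCIT itself: the nearest-neighbor matching rule, the `max_dist_km` cutoff, and the function name are assumptions made for illustration.

```python
import math


def match_and_extrapolate(prev, curr, dt_min, lead_min, max_dist_km=20.0):
    """Match each current centroid (x, y in km) to the nearest centroid from
    the previous scan, derive a velocity, and extrapolate it lead_min ahead.
    Centroids with no match within max_dist_km are treated as new storms
    and skipped."""
    forecasts = []
    for cx, cy in curr:
        if not prev:
            break
        px, py = min(prev, key=lambda p: math.hypot(cx - p[0], cy - p[1]))
        if math.hypot(cx - px, cy - py) > max_dist_km:
            continue  # no plausible predecessor, so no motion estimate
        u = (cx - px) / dt_min  # km per minute
        v = (cy - py) / dt_min
        forecasts.append((cx + u * lead_min, cy + v * lead_min))
    return forecasts


# Two storms observed 5 minutes apart, forecast 10 minutes ahead.
previous = [(0.0, 0.0), (50.0, 50.0)]
current = [(5.0, 0.0), (50.0, 56.0)]
print(match_and_extrapolate(previous, current, dt_min=5.0, lead_min=10.0))
```

This approach yields only future centroid positions, which is why a field forecast needs the second approach.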
SCIT / kmeans
The centroid and threshold-based technique
called SCIT (storm cell identification and
tracking) is used on the WSR-88D.
The texture segmentation and error-field
minimization technique is being worked on.
I will show the results from the second technique
because the first technique predicts only centroid
location. (We want to do field forecasts).
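The slide title suggests that the segmentation step relies on K-means clustering, but the slides give no details. As a rough illustration of the clustering idea only, the sketch below runs plain K-means (Lloyd's algorithm) on scalar reflectivity values. The actual method presumably clusters texture features rather than raw values, so treat the input and the evenly spaced initialization as assumptions.

```python
def kmeans_1d(values, k, iters=20):
    """Cluster scalar values into k groups with Lloyd's algorithm.
    Returns (labels, centers); centers start evenly spaced over the range."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each value goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: abs(v - centers[c]))
                      for v in values]
        if new_labels == labels:
            break  # converged: assignments no longer change
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers


# Reflectivity values (dBZ) from weak, moderate, and strong echoes.
dbz = [5, 7, 6, 45, 50, 48, 20, 22]
labels, centers = kmeans_1d(dbz, k=3)
```

In an image setting, connected pixels that share a label would then form the candidate storm regions.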
Kmeans
These clusters are actually found at different
scales.
The clusters are used as the domain within which
the error minimization is done (the kernel that is
moved around in the previous frame).
And using these, a motion estimate (“wind field”)
is obtained at different scales.
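A minimal sketch of the error-minimization step described above: the pixels of one cluster in the current frame act as the kernel, which is moved around the previous frame, and the displacement with the lowest mean absolute error is taken as that cluster's motion. The exhaustive search, the `max_shift` bound, and the grid layout (lists of rows) are assumptions for illustration.

```python
def estimate_motion(prev, curr, region, max_shift):
    """Return the (dx, dy) displacement, in pixels per frame, that minimizes
    the mean absolute error between curr over `region` and prev at the
    back-displaced positions. region is a list of (row, col) pixels
    belonging to one cluster in the current frame."""
    rows, cols = len(prev), len(prev[0])
    best, best_err = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total = 0.0
            inside = True
            for r, c in region:
                pr, pc = r - dy, c - dx  # where this pixel came from
                if not (0 <= pr < rows and 0 <= pc < cols):
                    inside = False  # skip shifts that leave the grid
                    break
                total += abs(curr[r][c] - prev[pr][pc])
            if inside and total / len(region) < best_err:
                best_err, best = total / len(region), (dx, dy)
    return best


# A two-pixel storm that moves 2 columns right and 1 row down.
prev = [[0.0] * 6 for _ in range(6)]
curr = [[0.0] * 6 for _ in range(6)]
prev[2][1], prev[2][2] = 40.0, 50.0
curr[3][3], curr[3][4] = 40.0, 50.0
print(estimate_motion(prev, curr, [(3, 3), (3, 4)], max_shift=2))
```

Repeating this for the clusters found at each scale gives one motion vector per cluster, which together form the motion field.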
Motion, Prediction
Performance
Compared to a persistence forecast.
Skill at predicting the location of 30 dBZ or higher values.
Clutter at the end of sequence. (Random data are assigned motion estimates)
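The slide does not say which skill measure was used. One common choice for scoring the location of values at or above a threshold is the critical success index (CSI), used here purely as an assumed example. The persistence forecast is simply the last observed field carried forward unchanged.

```python
def csi(forecast, observed, threshold=30.0):
    """Critical success index, hits / (hits + misses + false alarms), for
    grid points at or above the threshold (dBZ)."""
    hits = misses = false_alarms = 0
    for frow, orow in zip(forecast, observed):
        for f, o in zip(frow, orow):
            f_yes, o_yes = f >= threshold, o >= threshold
            if f_yes and o_yes:
                hits += 1
            elif o_yes:
                misses += 1
            elif f_yes:
                false_alarms += 1
    total = hits + misses + false_alarms
    return hits / total if total else float("nan")


# Toy example: the storm has moved one column to the right.
observed = [[0, 35, 40], [0, 0, 10]]
persistence = [[35, 40, 0], [0, 0, 10]]   # previous frame, unchanged
advected = [[0, 35, 32], [0, 0, 10]]      # hypothetical motion-based forecast
print(csi(advected, observed), csi(persistence, observed))
```

A forecast method is useful only to the extent that its score exceeds the persistence score on the same cases.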
Ideas for future work
Drawbacks with current approach:
Operates on radar or on satellite, not on both.
Cannot handle faulty data (as with clutter)
Use multiple inputs in deriving motion estimates:
Storm core movement (as the technique does)
Dual-Doppler wind field retrieval (?)
Wind-field estimates from mesoscale model (RUC)
Discussion
Why go through wind-field estimate (and not
directly to a forecast)?
To allow forecast of fields other than the input.
Physically reasonable assimilation.
Better ways of identifying storms.
Better ways of predicting location and values
(field forecast).
Task 3: Polarimetric Radar
Algorithms
Essentially open field for research.
Currently only one AI algorithm: a hydrometeor
classification algorithm.
Low-hanging fruit: a hail-size estimation
technique.
Hail Size Estimation
Currently done on Doppler radar (algorithm to
compute field of hail size estimates in WDSS-II
already).
High reflectivity data aloft are assumed to
produce hail.
Polarimetric radar provides way of identifying
hail near the surface (via aspect ratio).
Come up with way to estimate hail size.
Learning
Train the technique on actual hail reports (which
are noisy).
Problems with polarimetric radar include
calibration errors. Techniques have to account for
this.
Use the polarimetric hail-size estimation
technique to improve the predicted hail-size from
the Doppler-based method.
Contacts
People at CIMMS/NSSL who can advise on each
of these tasks:
Vortex Detection: Greg Stumpf, Travis Smith
Clustering/Prediction: V Lakshmanan, Bob Rabin
Polarimetric Radar: Terry Schuur