Spatial Sequential Pattern Mining
for Seismic Data
Riccardo Campisano, Fabio Porto,
Esther Pacitti, Florent Masseglia, Eduardo Ogasawara
Oct 05th, 2016
CEFET/RJ
National Laboratory for Scientific Computing
Spatial-time series
● Many different applications collect observed-event data
in the form of time series.
● In many of them, the observed events are also related in space,
resulting in even larger spatial-time series:
– Astronomical data
– Seismic data
– Climatic data (e.g. temperature changes are related to both
time and latitude)
Spatial-time series
● Analyzing such data is challenging due to:
– the large volume of data
– the difficulty of matching continuous (non-discrete)
observed values [Fu, 2011]
– the complex relationship between the spatial and time dimensions
[Han et al., 2007]
Real world seismic use-case
● Seismic dataset of the F3 block in the Dutch sector of the North Sea.
Real world seismic use-case
● Data is obtained by producing shots at the terrain surface,
generating waves that are reflected by the different subsoil materials
and recorded by receivers placed at specific locations.
Real world seismic use-case
● The result is a 3D data cube of observations collected at
different positions on the terrain surface. It is composed of 2D
vertical sections called inlines and crosslines, which correspond to
the E,Z plane and the N,Z plane in the figure, respectively.
[Figure: 3D data cube with inlines and crosslines labeled]
Real world seismic use-case
● Each inline and each crossline is composed of vertical time series.
[Figure: Inline 100, with time series marked at specific spatial positions]
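The cube layout described in these slides can be illustrated with a minimal sketch. The axis order (inline, crossline, time sample), the synthetic values, and all function names are assumptions made for illustration; they are not taken from the F3 dataset or from the authors' code.

```python
def make_cube(n_inlines, n_crosslines, n_samples):
    """Build a small synthetic cube indexed as cube[inline][crossline][sample]."""
    return [[[float(i * 100 + j * 10 + t) for t in range(n_samples)]
             for j in range(n_crosslines)]
            for i in range(n_inlines)]


def inline_section(cube, i):
    """2D vertical section at a fixed inline: one time series per crossline."""
    return cube[i]


def crossline_section(cube, j):
    """2D vertical section at a fixed crossline: one time series per inline."""
    return [inline[j] for inline in cube]


def trace(cube, i, j):
    """Vertical time series at spatial position (inline i, crossline j)."""
    return cube[i][j]


cube = make_cube(n_inlines=2, n_crosslines=3, n_samples=4)
print(trace(cube, 1, 2))
```

Each spatial position holds one vertical time series, and the same series is reachable through either the inline section or the crossline section that passes through that position.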
Real world seismic use-case
One interesting analysis goal is the detection of faults and horizons,
i.e., zones of major unconformities that correspond to specific
geologic boundaries [Zhou, 2014].
[Figure: Inline 100]
Methodology
● Sequential pattern mining has been used successfully to obtain
insight from large transactional databases.
● The scope of this work is the use of this technique to discover
sequential patterns in seismic spatial-time series:
– an indexing technique is used to discretize the input
– an adapted algorithm is implemented to retrieve the positions
of the discovered patterns
– results are presented over the original seismic trace images
to better evaluate their quality
Steps: 1) Discretization → 2) Sequential pattern mining → 3) Visualization
1) Discretization
● Time series contain continuous (non-discrete) values.
● It is not possible to find patterns by performing exact matches
between items of such sequences.
● SAX indexing [Lin et al., 2003] was applied to convert
continuous values to a discrete symbolic representation.
1) Discretization
[Figure: translated view of Inline 100, showing a portion of the original
seismic dataset and the corresponding SAX-converted data]
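The discretization step can be sketched as follows. This is a minimal illustration of the usual SAX recipe (z-normalization, piecewise aggregate approximation, then symbol assignment using equiprobable breakpoints of the standard normal distribution), not the authors' implementation. The function name `sax` and the parameter `n_segments` are assumptions; only the alphabet size appears in the slides.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev
import string


def sax(series, n_segments, alphabet_size):
    """Convert a numeric series into a SAX word of n_segments symbols."""
    if not 2 <= alphabet_size <= 26:
        raise ValueError("alphabet_size must be between 2 and 26")
    if not 1 <= n_segments <= len(series):
        raise ValueError("n_segments must be between 1 and len(series)")

    # 1) z-normalize (a constant series maps to all zeros)
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]

    # 2) piecewise aggregate approximation: mean of each segment
    n = len(z)
    paa = [mean(z[i * n // n_segments:(i + 1) * n // n_segments])
           for i in range(n_segments)]

    # 3) breakpoints that split the standard normal into equiprobable bins
    standard_normal = NormalDist()
    breakpoints = [standard_normal.inv_cdf(k / alphabet_size)
                   for k in range(1, alphabet_size)]

    # 4) map each segment mean to the letter of the bin it falls into
    letters = string.ascii_lowercase[:alphabet_size]
    return "".join(letters[bisect_right(breakpoints, v)] for v in paa)


print(sax([0, 0, 0, 0, 10, 10, 10, 10], n_segments=2, alphabet_size=4))
```

With an alphabet size of 25, the symbols range from `a` (lowest values) to `y` (highest values).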
2) Sequential pattern mining
● A sequential pattern mining algorithm was adapted to perform
sequence mining on spatial-time series datasets.
● It uses the Apriori principle: if a set of items is frequent, all of
its subsets are frequent too.
● frequent itemsets of size k-1 → candidate itemsets of size k
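The level-wise idea above can be sketched as follows. This is a simplified illustration under stated assumptions, not the adapted algorithm from the talk: patterns are matched as contiguous runs of symbols, the support of a pattern is the fraction of series that contain it at least once, and the max-stretch parameter is not modeled. The names `mine_sequences`, `min_support`, and `max_length` are hypothetical.

```python
from collections import defaultdict


def mine_sequences(series, min_support, max_length=5):
    """Return {pattern: [(series_index, offset), ...]} for frequent patterns.

    series: list of symbol strings, one per spatial position.
    min_support: minimum fraction of series that must contain the pattern.
    """
    n = len(series)

    def occurrences(candidates, k):
        # Scan every window of length k and record where candidates occur.
        found = defaultdict(list)
        for idx, s in enumerate(series):
            for off in range(len(s) - k + 1):
                window = s[off:off + k]
                if window in candidates:
                    found[window].append((idx, off))
        return found

    def frequent(found):
        # Keep patterns present in at least min_support of the series.
        return {p: pos for p, pos in found.items()
                if len({i for i, _ in pos}) / n >= min_support}

    alphabet = sorted(set("".join(series)))
    level = frequent(occurrences(set(alphabet), 1))
    result = dict(level)
    k = 2
    while level and k <= max_length:
        # Apriori pruning: a size-k candidate is kept only if both its
        # (k-1)-prefix and its (k-1)-suffix are already frequent.
        candidates = {p + c for p in level for c in alphabet
                      if (p + c)[1:] in level}
        level = frequent(occurrences(candidates, k))
        result.update(level)
        k += 1
    return result


patterns = mine_sequences(["aayy", "baay", "aayb"], min_support=1.0)
print(sorted(patterns))
```

Recording the positions while counting support is what later allows each match to be drawn over the seismic section.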
3) Visualization
● For each detected frequent sequence, the algorithm provides all
the positions where the sequence was found.
● With these positions, it is possible to visually represent the
matches of each sequence, which allows a supervised evaluation of
the quality of the results.
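As a rough illustration of this step, the sketch below turns a list of match positions into a 2D mask with one column per spatial position and one row per time sample, the same orientation as an inline section. The position format `(series_index, offset)` and the function names are assumptions; the talk overlays matches on the seismic images, whereas this sketch only prints a text grid.

```python
def match_mask(positions, pattern_length, n_series, series_length):
    """Mark every cell covered by a match; rows are time, columns are space."""
    mask = [[False] * n_series for _ in range(series_length)]
    for series_index, offset in positions:
        for t in range(offset, offset + pattern_length):
            mask[t][series_index] = True
    return mask


def render(mask):
    """Text rendering of the mask: '#' for matched cells, '.' elsewhere."""
    return "\n".join("".join("#" if cell else "." for cell in row)
                     for row in mask)


# Three matches of a length-3 pattern across three neighbouring series.
mask = match_mask([(0, 0), (1, 1), (2, 0)], pattern_length=3,
                  n_series=3, series_length=4)
print(render(mask))
```

A mask like this could be drawn semi-transparently over the original section so that matches can be compared by eye with the visible horizons.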
Results
alphabet-size: 25
min-support: 70%
max-stretch: 2
Sequence: <a,a,y,y>
Several horizon segments detected
[Figure: Inline 100]
Results
alphabet-size: 25
min-support: 80%
max-stretch: 10
Sequence: <y,y,y,a,a>
Continuous horizons detected
[Figure: Inline 401]
Conclusions
● The algorithm was able to detect sequential patterns
in a spatial-time series database.
● The positions of the detected frequent sequences
follow some of the geologic boundaries of the subsoil.
● However, the large number of results produced by the algorithm
makes it difficult to select interesting patterns.
● Moreover, finding a sequence that is contained in all the
spatial-time series is not very surprising.
Future work
● Ranking can be used to prioritize interesting results.
● Tight patterns can be used to detect faults.
[Figure: <a,a,y,y> sequence positions at inline 100]