Ontology-Driven Data Mining
Satish Tadepalli
Dept. of Computer Science
Virginia Tech
A.K. Sinha
Dept. of Geosciences
Virginia Tech
Ontology-Driven Data Mining

Data Mining:
– Analysis of observational data sets to find unsuspected relationships and to summarize the data in novel ways

Ontology
– Represents domain knowledge
– Relationships between concepts in a domain

Ontology-driven data mining
– Use the knowledge represented by ontologies to create a hierarchical structure in the data
– Apply data mining techniques on the structured data sets
GeoROC Database
(http://georoc.mpch-mainz.gwdg.de/)
GeoROC Data and Present Tectonic Setting
Broad tectonic classification of the GeoROC data set for applying data mining techniques
Classes
· Convergent Margins
· Continental Flood Basalts
· Ocean Basin Flood Basalts
· Ocean Island Groups
· Ocean Island Plateaus
· Others
Subclasses (Location-based)
· Tonga
· New Zealand
· Papua New Guinea
· Central America
· Others
Attributes (Chemical/Isotope)
· SiO2
· Al2O3
· MnO
· Sr87/Sr86
· Others
Structuring the data sets based on ontology
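The class, subclass, and attribute levels listed above suggest a simple nested grouping. The following is a minimal sketch of that idea, not the authors' implementation: the record layout, the function names `group_by_ontology` and `select`, and all numeric values are hypothetical placeholders, not GeoROC measurements.

```python
# Hypothetical records: tectonic class, location subclass, attribute values.
# The numbers are placeholders for illustration, NOT GeoROC measurements.
samples = [
    {"class": "Convergent Margins", "subclass": "Tonga", "SiO2": 52.0, "Al2O3": 15.0},
    {"class": "Convergent Margins", "subclass": "Tonga", "SiO2": 55.0, "Al2O3": 16.0},
    {"class": "Convergent Margins", "subclass": "New Zealand", "SiO2": 60.0, "Al2O3": 17.0},
    {"class": "Continental Flood Basalts", "subclass": "Others", "SiO2": 49.0, "Al2O3": 14.0},
]


def group_by_ontology(records, levels=("class", "subclass")):
    """Nest records under the given ontology levels: class -> subclass -> [records]."""
    tree = {}
    for rec in records:
        node = tree
        for level in levels[:-1]:
            node = node.setdefault(rec[level], {})
        node.setdefault(rec[levels[-1]], []).append(rec)
    return tree


def select(tree, *path):
    """Return every record below a path such as (class,) or (class, subclass)."""
    node = tree
    for key in path:
        node = node[key]
    # Flatten whatever remains below the chosen node.
    stack, found = [node], []
    while stack:
        current = stack.pop()
        if isinstance(current, list):
            found.extend(current)
        else:
            stack.extend(current.values())
    return found


tree = group_by_ontology(samples)
convergent = select(tree, "Convergent Margins")           # whole class
tonga = select(tree, "Convergent Margins", "Tonga")       # one subclass
```

Selecting at the class level or at the subclass level gives two views of the same data set, which is the kind of multi-level comparison the slides describe.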
Correlation Analysis
[Figure: two chart panels of correlation values on a vertical axis from -1 to 1, each with series Si-K, Si-Na2O, and Si-Fe]
Correlations in Continental Convergent Margins: Cascades, Andean, Both
Correlations in Oceanic Convergent Margins: Tonga, Mariana, Both
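A per-region correlation of this kind could be computed along the following lines. This is a sketch under stated assumptions: the slides do not say which correlation coefficient was used, so Pearson's r is assumed here, and "Both" is assumed to mean the two regions pooled. The function names and all numeric values are hypothetical placeholders, not GeoROC measurements.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlations_by_group(records, group_key, pairs):
    """Correlate each attribute pair within every group and in the pooled data."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[group_key], []).append(rec)
    groups["Both"] = list(records)  # assumption: "Both" pools all regions
    return {
        name: {
            (a, b): pearson([r[a] for r in recs], [r[b] for r in recs])
            for a, b in pairs
        }
        for name, recs in groups.items()
    }


# Placeholder values for illustration only, NOT GeoROC measurements.
records = [
    {"region": "Tonga", "Si": 50.0, "K": 0.3, "Fe": 11.0},
    {"region": "Tonga", "Si": 55.0, "K": 0.5, "Fe": 9.5},
    {"region": "Tonga", "Si": 60.0, "K": 0.8, "Fe": 7.0},
    {"region": "Mariana", "Si": 51.0, "K": 0.6, "Fe": 10.0},
    {"region": "Mariana", "Si": 57.0, "K": 0.9, "Fe": 8.0},
    {"region": "Mariana", "Si": 63.0, "K": 1.4, "Fe": 6.5},
]

result = correlations_by_group(records, "region", [("Si", "K"), ("Si", "Fe")])
```

Each entry of `result` holds one group of bars from a panel: for example `result["Tonga"][("Si", "K")]` would correspond to the Si-K value for Tonga.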
Classification Using Neural Networks
Present-day plate tectonic settings and associated data are the key to recognizing paleo-tectonic settings of rocks.
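The slides do not describe the network architecture or training data, so the following only illustrates the general idea: train on samples whose present-day setting is known, then predict the setting of an unlabeled sample. It is a sketch using a single sigmoid neuron, the smallest possible network, rather than whatever model the authors used. The two features, the class labels, and all numbers are hypothetical placeholders.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, epochs=2000, lr=0.5):
    """Fit one sigmoid neuron by full-batch gradient descent on cross-entropy loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            activation = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = activation - y  # derivative of the loss w.r.t. the pre-activation
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the neuron's output is at least 0.5, else 0."""
    return 1 if sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5 else 0


# Placeholder data: two chemical attributes rescaled to [0, 1], and a label for
# two hypothetical present-day tectonic settings (0 and 1). NOT real measurements.
features = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.3), (0.8, 0.9), (0.9, 0.7), (0.7, 0.8)]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train(features, labels)
unknown_sample = (0.9, 0.9)  # stands in for a rock of unknown paleo-tectonic setting
predicted_setting = predict(weights, bias, unknown_sample)
```

A realistic classifier would need more attributes, more than two classes, hidden layers, and held-out data to estimate accuracy; the sketch only shows the train-then-predict pattern.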
Ongoing Research
– Data mining of spatial data sets using Gaussian processes
– Sparse data mining

Conclusion

Ontology-driven data mining
– Meaningful patterns at multiple levels of abstraction
– Multiple views of the same data set
– Ease in choosing the relevant data sets for comparison