Data Mining with Oracle using Classification and Clustering Algorithms
Presented by Nhamo Mdzingwa
Supervisor: John Ebden
Overview of Presentation
- Recap of Proposal
- Classification of Data Mining & DM Algorithms
- Oracle Data Mining
- Data Mining Process
- Evaluation of Results
- Progress so far
- Updated Timeline
- Plans
Objective
- Investigate two types of algorithms available in Oracle10g for data mining (ODM).
- Apply the two algorithms to actual data.
- Analyse & evaluate results in terms of performance.
Classification of Data Mining
- Directed data mining (supervised learning) builds a model that describes one particular attribute in terms of the rest of the data.
- Undirected data mining (unsupervised learning) builds a model that establishes the relationships amongst all the input attributes by grouping.
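The difference between the two styles can be illustrated with a small sketch. The records, attribute names, and the two helper functions below are hypothetical and chosen only for illustration; they are not part of Oracle Data Mining. The directed function describes one target attribute in terms of the others, while the undirected one has no target and simply groups records on all attributes.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (age_group, income, buys)
records = [
    ("young", "low", "no"),
    ("young", "high", "yes"),
    ("old", "low", "no"),
    ("old", "high", "yes"),
    ("old", "high", "yes"),
]

def fit_directed(rows, target_index):
    """Directed: map each combination of input values to its most common target value."""
    votes = defaultdict(Counter)
    for row in rows:
        inputs = tuple(v for i, v in enumerate(row) if i != target_index)
        votes[inputs][row[target_index]] += 1
    return {inputs: counts.most_common(1)[0][0] for inputs, counts in votes.items()}

def fit_undirected(rows):
    """Undirected: no target attribute; group records that agree on every attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row].append(row)
    return dict(groups)

model = fit_directed(records, target_index=2)   # predicts "buys" from the other two attributes
groups = fit_undirected(records)                # treats all three attributes alike
```

Grouping identical records is a deliberately trivial stand-in for clustering; a real clustering algorithm groups records that are similar rather than equal.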
Classification of Data Mining algorithms
[Diagram: DM strategies]
- Unsupervised learning: input attributes but no output attributes
- Supervised learning: input attributes and one or more output attributes
- Other labels in the diagram: Classification, Clustering, k-Means, O-Cluster, Naive Bayes, Model Seeker, Adaptive Bayes, Estimation, Association Discovery, Prediction, Predictive variance, Visualization
Algorithms offered in Oracle10g
classification
1. Adaptive Bayes Network
2. Naive Bayes
3. Model Seeker
clustering
1. k-Means
2. O-Cluster
3. Predictive variance
association rules
1. Apriori (association rules)
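As an illustration of one of the clustering algorithms listed above, here is a minimal sketch of textbook k-means (Lloyd's algorithm) in plain Python. It is not Oracle's implementation, and the sample points and the choice of the first k points as starting centroids are assumptions made to keep the example short and deterministic.

```python
import math

def k_means(points, k, iterations=10):
    """Cluster 2-D points with Lloyd's algorithm; returns (centroids, labels)."""
    # Assumption: the first k points seed the centroids (real tools choose seeds more carefully).
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels

# Hypothetical sample data: two visibly separate groups of points.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
```

No output attribute is supplied anywhere: the grouping comes only from distances between the input attributes, which is what makes this an unsupervised method.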
Evaluation of Results
- Evaluation of supervised learning models involves determining the level of predictive accuracy.
- Models are evaluated using test data sets.
- Compare the confidence and support levels of models created from the same training data to determine accuracy.
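The measures named above can be sketched in a few lines of Python. The function names and sample data are hypothetical; the definitions used are the standard ones (accuracy as the fraction of test cases predicted correctly, support as the fraction of transactions containing the whole rule, confidence as the fraction of antecedent-containing transactions that also contain the consequent), which may differ in detail from how ODM reports them.

```python
def accuracy(actual, predicted):
    """Fraction of test cases whose predicted class matches the actual class."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def support_and_confidence(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Count transactions containing the antecedent, and those containing the whole rule.
    n_antecedent = sum(antecedent <= t for t in transactions)
    n_both = sum((antecedent | consequent) <= t for t in transactions)
    support = n_both / len(transactions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence

# Hypothetical test-set labels and model predictions.
acc = accuracy(["yes", "no", "yes", "no"], ["yes", "no", "no", "no"])

# Hypothetical market-basket transactions for the rule {bread} -> {milk}.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
support, confidence = support_and_confidence(transactions, {"bread"}, {"milk"})
```

Computing these figures for two models built from the same training data gives a like-for-like basis for comparing them.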

Progress
- Literature Survey
- Oracle10g installed on Athena in Hons Lab
- Exploring the Oracle9i and 10g Suite including JDeveloper
- Member of MetaLink (Oracle's online support service)

Updated Timeline
Task | Status
Continuation from literature and tutorials | done
Investigate Clustering & Classification algorithms (theory) | done
Find suitable computerised case studies of the use of above algorithms, with or without Oracle | done
Search datasets for testing (possibilities: AIDS data & faculty data) | In progress
Apply algorithms to data found, then critically analyse & assess results | Second semester
Write up paper | September vacation and 3rd term
Final project write up | Due 7/11