INTERNATIONAL JOURNAL OF TECHNOLOGICAL EXPLORATION AND LEARNING (IJTEL)
www.ijtel.org
A New Approach for the Dynamic Association Rule Mining Algorithm
Niharika Dhakad
Computer Science & Engineering
AISECT University, Bhopal, India
Dr Pratima Gautam
Computer Science & Engineering
AISECT University, Bhopal, India
Abstract— Business information received from advanced
data analysis and data mining is a critical success factor
for companies wishing to maximize competitive advantage.
The use of traditional tools and techniques to discover
knowledge is ineffective and does not give the right
information at the right time. Data mining should provide
tactical insights to support strategic directions. In this
paper, we introduce a dynamic approach that uses
knowledge discovered in previous episodes. The proposed
approach is intended to be effective for solving problems
related to the efficiency of handling database updates, the
accuracy of data mining results, gaining more knowledge
and interpreting the results, and performance. Our results
do not depend on the approach used to generate itemsets.
In our analysis, we have used an FP-like approach as a
local procedure to generate large itemsets. We prove that
the Dynamic Data Mining algorithm is correct and
complete.
Keywords: FP-Growth; Dynamic Association Rule Mining; Data Mining.
I. INTRODUCTION
"Data mining refers to extracting or 'mining' knowledge
from large amounts of data." It might more appropriately have
been named knowledge mining from data. Many other terms
carry a similar or slightly different meaning, such as knowledge
mining from databases, knowledge extraction, data/pattern
analysis, data archaeology, and data dredging. Data mining
involves the use of sophisticated data analysis tools to discover
previously unknown, valid patterns and relationships in large
data sets. These tools can include statistical models,
mathematical algorithms, and machine learning methods
(algorithms that improve their performance automatically
through experience, such as neural networks or decision trees).
Consequently, data mining consists of more than collecting and
managing data; it also includes analysis and prediction.
IJTEL, ISSN: 2319-2135, VOL.2, NO.5, OCTOBER 2013
II. DATA MINING TECHNIQUES
Data mining is the task of discovering interesting patterns
from large amounts of data, where the data can be stored in
databases, data warehouses, or other information repositories.
It is also popularly referred to as knowledge discovery in
databases (KDD). A data mining system includes a data mining
engine, which consists of a set of functional modules for data
mining tasks; a pattern evaluation module, which interacts with
the data mining modules so as to focus the search towards
interesting patterns; and a graphical user interface, which
communicates between users and the data mining system,
allowing the user to interact with the system [6]. Data mining
tasks fall into the following categories:
A. Class description
It can be useful to describe individual classes and concepts
in summarized, concise, and yet precise terms.
B. Association analysis
It is the discovery of association rules showing attribute-value
conditions that occur frequently together in a given set of
data.
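As an illustration of what "occur frequently together" means, the following minimal sketch computes the two standard rule measures, support and confidence, over a small set of transactions. The basket data and the function names are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def support(itemset: Iterable[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[Transaction]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0.0:
        return 0.0
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, transactions) / base


# Hypothetical toy basket data (an assumption for illustration only).
transactions = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
)]

rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
```

Here the rule bread -> milk holds in 2 of 4 transactions (support 0.5) and in 2 of the 3 transactions that contain bread (confidence 2/3).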
C. Classification
It is the process of finding a set of models that describe and
distinguish data classes or concepts, for the purpose of being
able to use the model to predict the class of objects whose class
label is unknown. The derived model is based on the analysis
of a set of training data, and can be represented in forms such as
classification rules or decision trees.
D. Cluster analysis
Clustering analyzes data objects without consulting a
known class label. In general, the class labels are not present in
the training data simply because they are not known to begin
with. The objects are clustered or grouped based on the
principle of maximizing the intra-class similarity and
minimizing the inter-class similarity.
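The grouping principle can be made concrete with a small sketch. It assumes one-dimensional data and uses absolute distance as the inverse of similarity, so a good grouping shows a small mean distance within each cluster and a large mean distance between clusters. The data and helper names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def mean_pairwise_distance(points):
    """Mean absolute distance between all pairs inside one cluster."""
    return mean(abs(a - b) for a, b in combinations(points, 2))


def mean_cross_distance(group_a, group_b):
    """Mean absolute distance between points of two different clusters."""
    return mean(abs(a - b) for a in group_a for b in group_b)


# Hypothetical one-dimensional data already split into two clusters.
cluster_a = [1.0, 1.5, 2.0]
cluster_b = [8.0, 8.5, 9.0]

intra_a = mean_pairwise_distance(cluster_a)
intra_b = mean_pairwise_distance(cluster_b)
inter = mean_cross_distance(cluster_a, cluster_b)
```

For this data the within-cluster distances are far smaller than the between-cluster distance, which is the situation a clustering algorithm tries to reach.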
E. Outlier analysis
Outliers are data objects that do not comply with the
general behavior or model of the data. Outliers may be detected
using statistical tests or using distance measures.
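One simple statistical test is the z-score: a value is flagged when it lies more than a chosen number of standard deviations from the mean. The sketch below assumes a threshold of 2.0 and a hypothetical list of readings; neither comes from the paper, and other tests or distance measures could be used instead.

```python
from statistics import mean, stdev


def zscore_outliers(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds `threshold`."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, nothing can be an outlier
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical readings with one value far from the rest.
readings = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 50.0]
outliers = zscore_outliers(readings)
```

With these readings only the value 50.0 is flagged.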
F. Evolution analysis
It describes and models trends for objects whose behavior
changes over time. It normally includes time-series data
analysis, sequence or periodicity pattern matching, and
similarity-based data analysis [7].
III. STATIC DATA MINING PROCESS
The data mining process is a step in the Knowledge Discovery
Process, consisting of methods that produce useful patterns or
models from the data [2]. Problems can arise from duplicate,
missing, or incorrect values and from outliers, and statistical
methods may sometimes be needed even when the problem is
well understood and correct data is available.
The KDD procedure is outlined below in a way that helps us
focus on the data mining process. It includes five steps: 1)
defining the data mining problem, 2) collecting the data
mining data, 3) detecting and correcting the data, 4) estimating
and building the model, and 5) model description and
validation, as seen in Figure 1 [3].
Figure 2. Data mining process
IV. DYNAMIC DATA MINING PROCESS
As mentioned earlier, many researchers and developers have
specified a process model designed to guide the user through a
sequence of steps that will lead to good results. Many such
models have been reported for the data mining process, and
some of them assume that the same is possible for the dynamic
data mining process. To date, most data mining projects have
dealt with verifying the actual data mining concepts. Since
these have now been established, most researchers will move
on to solving some of the problems that stand in the way of
data mining. This research deals with one such problem: using
data mining on dynamic databases.
Figure 1. Dynamic data mining procedure
Dynamic Data Mining (DDM) technology leads to high
forecasting accuracy, as shown in multiple business cases.
Additionally, an important benefit of DDM technology is
provided by its analysis capabilities. These consist of methods
to analyze the patterns in the data and the strengths and
weaknesses of the current forecasts. They allow the user to
"look inside the black box" to learn more about the data and the
forecasting difficulties that a customer faces. It is important to
note that DDM does not consist of one single algorithm or one
single step of data processing; rather, it consists of several
components, each of which is important in obtaining good
prediction results, and it is the combination of multiple
processing components that gives DDM its power.
V. RELATED WORK
On dynamic content: association rule mining aims to
explore large transaction databases for association rules. The
classical Association Rule Mining (ARM) model assumes that
all items have the same significance, without taking their
weight into account. It also ignores the differences between
transactions and the importance of each itemset. Weighted
Association Rule Mining (WARM), in contrast, does not
work on databases with only binary attributes. It makes use of
the importance of each itemset and transaction. WARM
requires each item to be given a weight that reflects its
importance to the user. The weights may correspond to special
promotions on some products, or to the profitability of different
items.
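To show how item weights can change which itemsets look important, the sketch below scales the plain support of an itemset by the mean weight of its items. This is one simple formulation chosen for illustration; it is an assumption and not necessarily the definition used in the WARM literature. The products and weights are hypothetical.

```python
def weighted_support(itemset, transactions, weights):
    """Plain support scaled by the mean weight of the items (assumed formula)."""
    items = frozenset(itemset)
    if not items or not transactions:
        return 0.0
    plain = sum(1 for t in transactions if items <= t) / len(transactions)
    mean_weight = sum(weights[i] for i in items) / len(items)
    return mean_weight * plain


# Hypothetical transactions and profitability weights in [0, 1].
transactions = [
    frozenset({"tv", "cable"}),
    frozenset({"cable"}),
    frozenset({"tv", "cable"}),
    frozenset({"tv"}),
]
weights = {"tv": 0.9, "cable": 0.1}

ws_tv = weighted_support({"tv"}, transactions, weights)
ws_cable = weighted_support({"cable"}, transactions, weights)
ws_both = weighted_support({"tv", "cable"}, transactions, weights)
```

Both single items appear in 3 of 4 transactions, so classical ARM treats them alike, while the weighted measure ranks the high-profit item well above the low-profit one.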
VI. PROPOSED WORK
We propose a new solution to this problem, called Dynamic
Data Mining (DDM).
1. The proposed method aims to be effective with respect to
handling large databases, the accuracy of data mining results,
gaining more knowledge and interpreting the results, and
performance.
2. Tn is the large and emerged itemset.
3. An item whose support is less than the minimum support is
a declined item.
4. Find the minimum support.
5. Find the count value of the large itemset.
6. Then calculate the support value.
7. Using this support value, we determine the emerged
itemset, the large itemset, and the declined itemset.
8. Now apply the Apriori algorithm.
9. For all transactions t belonging to Tn
10. Our results do not depend on the approach used to
generate itemsets. In our analysis, we have used a
frequent-pattern-like (FP-like) approach as a local procedure to
generate large itemsets.
11. We aim to show that our approach is efficient
for finding frequent itemsets.
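The steps above can be sketched under stated assumptions. The paper does not define the three categories formally, so the sketch assumes that an itemset is "large" if it was frequent in the previous episode and still is, "emerged" if it is newly frequent, and "declined" if it was frequent before but now falls below the minimum support. A brute-force counter limited to itemsets of size two stands in for the Apriori or FP-like local procedure, and all names and data are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force stand-in for Apriori/FP-growth: count itemsets up to max_size."""
    counts = {}
    for t in transactions:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(t), size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    n = len(transactions)
    # Keep only itemsets whose support reaches the minimum support.
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def classify_itemsets(previous_large, current_transactions, min_support):
    """Compare the previous episode's large itemsets with the current episode."""
    current = set(frequent_itemsets(current_transactions, min_support))
    previous = set(previous_large)
    return {
        "large": previous & current,     # frequent before and still frequent
        "emerged": current - previous,   # newly frequent in this episode
        "declined": previous - current,  # dropped below the minimum support
    }


# Two hypothetical episodes of transactions.
episode_1 = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"b"}]
episode_2 = [{"a", "c"}, {"a", "c"}, {"c"}, {"a", "b"}]

previous_large = frequent_itemsets(episode_1, min_support=0.5)
result = classify_itemsets(previous_large, episode_2, min_support=0.5)
```

In this example {a} remains large, {c} and {a, c} emerge, and {b} and {a, b} decline. Only the second episode is scanned when the update is made; the first episode contributes just its stored set of large itemsets.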
VII. CONCLUSION
In our approach, we dynamically update the knowledge
obtained from the previous data mining process. The
transaction domain is treated as a set of consecutive episodes.
The information gained during the current episode depends
on the current set of transactions and on the information
discovered during the previous episode. Finally, we have
proved that the Dynamic Data Mining algorithm is correct. As
future work, the dynamic approach will be tested with
different datasets that cover a large spectrum of data
mining applications, such as web site access analysis for
improvements in e-commerce advertising, fraud detection,
screening and investigation, retail site or product analysis, and
customer segmentation.
REFERENCES
[1] P. Velvadivu and Dr. K. Duraisamy, Lecturer, Department of Computer Technology and Applications, Coimbatore Institute of Technology, Coimbatore, 2010.
[2] Dynamic Data Mining: Exploring Large Rule Spaces by Sampling: Sergey Brin and Lawrence, 2008.
[3] Fast Online Dynamic Association Rule Mining: Yew-Kwong Woon, Wee-Keong Ng, Amitabha Das, Nanyang Technological University.
[4] Integrating Dynamic Data Mining with Simulation Optimization: M Better, F Glover, M Laguna, IBM journal of research and, 2007, ieeexplore.ieee.org
[5] A Weighted Association Rule Mining on Dynamic Content: P Velvadivu, IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 2, No 5, March 2010.
[6] KDD and more, presented by Susan Imberman, 2010.
[7] Approximation Algorithms for Classification, www.cs.cornell.edu/HOME/KLEINBER.
[8] Algorithm for clustering data, homepages.inf.ed.ac.uk/rbf/BOOKS/JAIN/Clustering_Jain_Dubes