International Journal of Emerging Trends & Technology in Computer Science (IJETTCS)
Web Site: www.ijettcs.org Email: [email protected], [email protected]
Volume 1, Issue 2, July – August 2012
ISSN 2278-6856
Improved Data mining approach to find
Frequent Itemset Using Support count table
Ramratan Ahirwal1, Neelesh Kumar Kori2 and Dr. Y. K. Jain3
1 Samrat Ashok Technological Institute, Vidisha (M.P.) 464001, India
2 Samrat Ashok Technological Institute, Vidisha (M.P.) 464001, India
3 Samrat Ashok Technological Institute, Vidisha (M.P.) 464001, India
Abstract: Mining frequent itemsets has been widely studied over the last decade. Past research focuses on mining frequent itemsets from static databases, but in many new applications mining time series and data streams is now an important task. Over the last decade there have been mainly two kinds of frequent pattern mining algorithms: Apriori, based on candidate generation and testing, and FP-growth, based on divide and conquer, both widely used in static data mining. With the new requirements of data mining, however, frequent pattern mining is no longer restricted to this scenario. In this paper we focus on a new mining algorithm that finds frequent patterns in a single scan of the database and requires no candidate generation. To achieve this goal, our algorithm employs one table that retains the support count information of the itemsets. For a static database the table is virtual, generated whenever frequent itemsets need to be derived, and it may also be useful for time series databases. Our algorithm is therefore suitable for static as well as dynamic data mining. Results show that the algorithm is useful in today's data mining
environment.
Keywords: Apriori, Association Rule, Frequent Pattern,
Data Mining
1. INTRODUCTION
Mining data streams is a very important research topic
and has recently attracted a lot of attention, because in
many cases data is generated by external sources so
rapidly that it may become impossible to store it and
analyze it offline. Moreover, in some cases streams of
data must be analyzed in real time to provide information
about trends, outlier values or regularities that must be
signaled as soon as possible. The need for online
computation is a notable challenge with respect to
classical data mining algorithms [1], [2]. Important
application fields for stream mining are as diverse as
financial applications, network monitoring, security
problems,
telecommunication
networks,
Web
applications, sensor networks, analysis of atmospheric
data, etc. The innovation in computer science have made
it possible to acquire and store enormous amounts of data
digitally in databases, currently giga or terabytes in a
single database and even more in the future. Many fields
and systems of human activity have become increasingly
dependent on collected, stored, and processed information. However, the abundance of the collected data makes it laborious to find the essential information in it for a specific purpose. Data mining is the analysis of (often large) observational datasets drawn from a database, data warehouse, or other large repository of incomplete, noisy, and ambiguous data, with the practical aim of finding unsuspected relationships and of summarizing the data in ways that are both understandable and useful to the data owner. It comprises data extraction, cleaning, transformation, analysis, and other processing steps, and it automatically discovers the patterns and interesting knowledge hidden in large amounts of data; this helps us make decisions based on a wealth of data. In software development, making use of communicated information comes down to how to collect, analyze, and mine the useful information hidden in the various data arising from communication between developers and from staff interaction with managers, and then to use that knowledge to make decisions.
Oustead College currently uses database technology to manage its library. Its main purpose is to facilitate the procurement of books, cataloging, and circulation management. In order to better satisfy the needs of readers, we must explore those needs and proactively provide the information they require. Most current library evaluation techniques focus on frequencies and aggregate measures; these statistics hide underlying patterns, and discovering these patterns is the key to understanding how library services are used [3]. Data mining has been applied to library operations [4]. With the fast development of technology and the growing requirements of users, the dynamic elements in data mining are becoming more important, including dynamic databases and knowledge bases, users' interestingness, and data varying with time and space. In order to solve problems such as low effectiveness, high randomness, and hard implementation in dynamic mining, more research on dynamic data mining has been done. In [5], [6], an evolutionary immune mechanism was proposed based on the fact that the elements involved in these domains can be modeled as the ones in immune models. It focused on how to utilize the relationship between antigens and antibodies in dynamic data mining such as
incremental mining. However, the immune mechanism and the associated algorithm alone run effectively only in incremental situations; their performance and functionality have to be improved when used in more complex and dynamic environments such as the Web.
We provide here an overview of executing data mining services and association rules. The rest of this paper is arranged as follows: Section 2 introduces data mining and KDD; Section 3 gives a literature review; Section 4 describes the proposed work; Section 5 presents the result analysis of the proposed algorithm; Section 6 gives the conclusion and outlook.
2. DATA MINING AND KDD
Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information - information that can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Several algorithms have been devised for this [5]. The process is shown in Figure 1.
Although data mining is a relatively new term, the
technology is not. Companies have used powerful
computers to sift through volumes of supermarket scanner
data and analyze market research reports for years.
However, continuous innovations in computer processing
power, disk storage, and statistical software are
dramatically increasing the accuracy of analysis while
driving down the cost.
At an abstract level, the KDD field is concerned with the
development of methods and techniques for making sense
of data. The basic problem addressed by the KDD process
is one of mapping low-level data (which are typically too
voluminous to understand and digest easily) into other
forms that might be more compact (for example, a short
report), more abstract approximation or model of the
process that generated the data), or more useful (for
example, a predictive model for estimating the value of
future cases). At the core of the process is the application
of specific data-mining methods for pattern discovery and
extraction.
The traditional method of turning data into knowledge
relies on manual analysis and interpretation. For
example, in the health-care industry, it is common for
specialists to periodically analyze current trends and
changes in health-care data, say, on a quarterly basis. The
specialists then provide a report detailing the analysis to
the sponsoring health-care organization; this report
becomes the basis for future decision making and
planning for health-care management. In a totally
different type of application, planetary geologists sift
through remotely sensed images of planets and asteroids,
carefully locating and cataloging such geologic objects of
interest as impact craters. Be it science, marketing,
finance, health care, retail, or any other field, the classical
approach to data analysis relies fundamentally on one or
more analysts becoming intimately familiar with the data
and serving as an interface between the data and the users
and products. For these (and many other) applications,
this form of manual probing of a data set is slow,
expensive, and highly subjective. In fact, as data volumes
grow dramatically, this type of manual data analysis is
completely impractical in many domains.
Databases are increasing in size in two ways: (1) the number N of records or objects in the database, and (2) the number d of fields or attributes per object.
Figure 1: Data Mining Algorithm
Databases containing on the order of N = 10^9 objects are becoming increasingly common, for example, in the astronomical sciences. Similarly, the number of fields d can easily be on the order of 10^2 or even 10^3, for example, in medical diagnostic applications. Who could be expected to digest millions of records, each having tens or hundreds of fields? We believe that this job is certainly not one for humans; hence, analysis work needs to be automated, at least partially. The need to scale up human analysis capabilities to handle the large number of bytes that we can collect is both economic and scientific.
Businesses use data to gain competitive advantage,
increase efficiency, and provide more valuable services.
Data we capture about our environment are the basic
evidence we use to build theories and models of the
universe we live in. Because computers have enabled
humans to gather more data than we can digest, it is only
natural to turn to computational techniques to help us
unearth meaningful patterns and structure from the massive volumes of data. Hence, KDD is an attempt to address a problem that the digital information era has made a fact of life for all of us: data overload.
3. LITERATURE REVIEW
In 2011, Jinwei Wang et al. [12] proposed an interpolation technique for missing context data based on the Time-Space Relationship and Association Rule Mining (TSRARM) to overcome the shortcomings and deficiencies of existing interpolation techniques for missing data. TSRARM performs spatial and time series analysis on sensor data and generates strong association rules with which to interpolate the missing values. A simulation experiment on acquired temperature sensor data verifies the rationality and efficiency of TSRARM.
In 2011, M. Chaudhary et al. [13] proposed a new and more optimized algorithm for online rule generation. The advantage of this algorithm is that the graph it generates has fewer edges than the lattice used in the existing algorithm. The proposed algorithm generates all the essential rules, and no rule is missed. The use of non-redundant association rules helps significantly in reducing irrelevant noise in the data mining process. This graph-theoretic approach, called the adjacency lattice, is crucial for online mining of data. The adjacency lattice can be stored either in main memory or in secondary memory; the idea is to pre-store a number of large itemsets in a special format that reduces the disk I/O required to answer queries.
In 2011, Fu et al. [14] analyzed real-time monitoring data mining, which has become a necessary means of improving the operational efficiency, economic safety, and fault detection of power plants. Based on the data mining arithmetic of interactive association rules, and taking full advantage of the association characteristics of real-time test-spot data collected while a power steam turbine runs, they put forward the principle of mining quantitative association rules among the parameters of the real-time monitoring data of a steam turbine. Analysis of the practical run results of a certain steam turbine with this interactive-rule data mining method shows that it can supervise steam turbine operation and condition monitoring and provide model reference and decision-making support for fault diagnosis and condition-based maintenance.
In 2011, Xin et al. [15] analyzed the use of association rule learning to process statistical data on the private economy and examined the results in order to improve the quality of private economy statistics. Finally, the article provides some exploratory comments and suggestions about the application of association rule mining in private economy statistics.
4. PROPOSED WORK AND ALGORITHM
Frequent itemset mining was introduced by Agrawal and Srikant [2]. To facilitate our discussion, we give the formal definitions as follows.
Let I = {i1, i2, i3, ..., im} be a set of items. An itemset X is a subset of I; X is called a k-itemset if |X| = k, where k is the size (or length) of the itemset. A transaction T is a pair (tid, X), where tid is a unique identifier of the transaction and X is an itemset. A transaction (tid, X) is said to contain an itemset Y iff Y ⊆ X. A dataset D is a set of transactions.
Given a dataset D, the support of an itemset X, denoted Supp(X), is the fraction of transactions in D that contain X. An itemset X is frequent if Supp(X) is no less than a given threshold S0. An important property of frequent itemsets, called the Apriori property, is that every nonempty subset of a frequent itemset must also be frequent.
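To make these definitions concrete, the following small Java sketch (ours, not the paper's; the class and variable names are illustrative) counts how many transactions contain a given itemset X and derives Supp(X) as the corresponding fraction:

    import java.util.List;
    import java.util.Set;

    class SupportDemo {
        // Number of transactions in D whose itemset contains every item of X.
        static long supportCount(List<Set<String>> D, Set<String> X) {
            return D.stream().filter(t -> t.containsAll(X)).count();
        }

        public static void main(String[] args) {
            List<Set<String>> D = List.of(          // a tiny dataset of three transactions
                    Set.of("i1", "i2"),
                    Set.of("i1", "i2", "i3"),
                    Set.of("i2", "i4"));
            Set<String> X = Set.of("i1", "i2");
            long c = supportCount(D, X);
            // Supp(X) as defined above is the fraction of transactions containing X.
            System.out.println(c + " of " + D.size() + " transactions, Supp(X) = " + (double) c / D.size());
        }
    }

Note that in the worked example of Section 4.3 the threshold S0 is compared against the absolute support count rather than the fraction.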
The problem of finding frequent itemsets can be specified as follows: given a dataset D and a support threshold S0, find every itemset whose support in D is no less than S0. It is clear that the Apriori algorithm needs at most l + 1 scans of database D if the maximum size of a frequent itemset is l. In the context of data streams, to avoid disk access, previous studies focus on finding approximations of the frequent itemsets within a bound on space complexity.
When mining frequent itemsets in static databases, all the frequent itemsets and their support counts derived from the original database are retained. When transactions are added or expired, the support counts of the frequent itemsets contained in them are recomputed. By reusing the retained frequent itemsets and their counts, the number of candidate itemsets generated during the mining process can be reduced. However, the original database must later be rescanned, because non-frequent itemsets can become frequent after the database is updated. Therefore such methods cannot work without seeing the entire database and cannot be applied to data streams.
In our approach we introduce a new method that requires only a single scan of database D to count the support of each itemset; no candidate generation and pruning is required to find the frequent itemsets. Our algorithm therefore reduces disk access time and finds the frequent itemsets directly using the support count table. The method is applicable to static databases as well as to dynamic databases, provided the table is created at the initial stage.
4.1 Support Count Table:
As stated previously, every itemset X of a transaction T is a subset of I (X ⊆ I), and a set of such transactions is the database D. So every transaction itemset X in database D is one of the 2^|I| - 1 non-empty subsets of I, i.e., an element of the power set of I excluding the empty set; the power set of I contains all the subsets of I that may occur as transaction itemsets in the transaction database D, except the empty set. Hence our algorithm employs one table, called the support count table.
The table is assumed to be virtual and is created when required for finding frequent itemsets. The table has 2^|I| - 1 rows and two columns; its two attributes are the itemset and its support count. In this table we record the frequency count of each itemset observed in the transaction database. The frequency count of an itemset is the number of occurrences of exactly that itemset in the transactional database D. The table is generated and may be kept in cache memory until the frequent itemsets have been found. The generated table may be used for a stationary database as well as for a time series database. The table layout is given below (Table 1).
4.2 Entries in the Support Count Table:
The support count table may be used to find frequent itemsets from static datasets as well as from streaming datasets, where we use a windowing concept. For a static database, the table may be created when we want to analyze the database: we perform a single scan of the database and make an entry in the table for every transaction. Initially, all support count entries in the table are set to zero. If we are using a static, fixed database D, we update the table by a single scan of D and record each transaction itemset in the table: for each transaction itemset X in D, we find the corresponding itemset in the table and increment its count. In this way an entry is made for every transaction T. The table may then be retained in memory until the observation is complete, so that added or expired transactions only require updating the table.
If we consider the database D as a random or streaming database, the table is even more useful, because every incoming or expired transaction only requires updating the table by incrementing or decrementing the count of the corresponding itemset, and the table can be stored efficiently so that we can use it to find the frequent itemsets or association rules. In this approach we are not required to save the database on disk; it is only necessary to save the table and use it whenever we need to find the frequent itemsets.
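As a minimal sketch of this update rule (our illustration, not the authors' implementation), the table can be kept as a map from the exact transaction itemset to its frequency count; unobserved itemsets are simply treated as rows with count zero instead of pre-allocating all 2^|I| - 1 entries:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;

    class SupportCountTable {
        // Scount: frequency count of each exact transaction itemset observed so far.
        private final Map<Set<Integer>, Integer> scount = new HashMap<>();

        // A transaction arrives (static scan or stream window entry): increment its itemset's count.
        void add(Set<Integer> transactionItemset) {
            scount.merge(Set.copyOf(transactionItemset), 1, Integer::sum);
        }

        // A transaction expires from the sliding window: decrement its itemset's count.
        void expire(Set<Integer> transactionItemset) {
            scount.merge(Set.copyOf(transactionItemset), -1, Integer::sum);
        }

        // Frequency count of an exact itemset (zero if never observed).
        int count(Set<Integer> itemset) {
            return scount.getOrDefault(Set.copyOf(itemset), 0);
        }
    }

Keeping the table keyed by the itemset means that maintaining it costs one increment (or decrement) per added (or expired) transaction, which is the property this section relies on.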
Table 1: Support count table ST.

No.        | Itemset (A) | Support count (Scount)
1          |             |
...        |             |
2^|I| - 1  |             |
For example, let I = {i1, i2, i3, i4} be the set of items. The different itemsets that may be generated from I are {i1}, {i2}, {i3}, ..., {i1,i2,i3,i4}. Every transaction itemset X that may occur in database D is one of these subsets of I. The table as initially created is given below.
Table 2: Initial support count table for I = {i1, i2, i3, i4}.

No. | Itemset (A)   | Support count (Scount)
1   | {i1}          | 0
2   | {i2}          | 0
3   | {i3}          | 0
4   | {i4}          | 0
5   | {i1,i2}       | 0
6   | {i1,i3}       | 0
7   | {i1,i4}       | 0
8   | {i2,i3}       | 0
9   | {i2,i4}       | 0
10  | {i3,i4}       | 0
11  | {i1,i2,i3}    | 0
12  | {i1,i2,i4}    | 0
13  | {i1,i3,i4}    | 0
14  | {i2,i3,i4}    | 0
15  | {i1,i2,i3,i4} | 0
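The initial table above can be generated by enumerating the 2^|I| - 1 non-empty subsets of I. One way to do this (a sketch of ours, representing each itemset by a bitmask over the item indices) is:

    import java.util.ArrayList;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    class InitialSupportCountTable {
        public static void main(String[] args) {
            List<String> items = List.of("i1", "i2", "i3", "i4");   // the item set I
            Map<List<String>, Integer> table = new LinkedHashMap<>();
            int n = items.size();
            // Bitmasks 1 .. 2^n - 1 enumerate every non-empty subset of I.
            for (int mask = 1; mask < (1 << n); mask++) {
                List<String> itemset = new ArrayList<>();
                for (int b = 0; b < n; b++) {
                    if ((mask & (1 << b)) != 0) itemset.add(items.get(b));
                }
                table.put(itemset, 0);                               // Scount initialised to zero
            }
            // Prints 15 rows with count 0, matching Table 2 (row order may differ).
            table.forEach((itemset, count) -> System.out.println(itemset + " -> " + count));
        }
    }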
4.3 Proposed Method to Find Frequent Itemsets
In our proposed work we give a method that may be used for static as well as streaming databases to find frequent itemsets. We employ the support count table, which requires only a single scan of the database to make an entry for each transaction; the table retains this information until the observation is complete or the frequent itemsets have been found. When transactions are added to the dataset or expire from it, the table is updated simultaneously. The updated support count table holds the frequency count of each itemset, so to find the frequent itemsets for any threshold value we scan the table, not the database. Whereas Apriori requires l + 1 scans of the dataset and generates candidates to find the frequent set, our approach needs only a single scan of the database and no candidate generation.
The table stores the frequency count of every itemset, not its total support count. The frequency count of an itemset is the number of occurrences of exactly that itemset in the transactional database D. To decide whether an itemset is frequent we therefore need its total support count: the total support count of an itemset is the number of transactions in D whose items include all the items of that itemset. In our scheme this total count is calculated by scanning the table; the resulting total support count is compared with the threshold S0, and if it is at least the threshold the itemset is included in the frequent set. This procedure is repeated for every itemset.
Algorithm: To find frequent itemsets
Input: A database D and the support threshold S0.
Output: The frequent itemsets Fitemset.
Method:
Step 1: Scan the transaction database D and update the support count table ST as described in Section 4.2. Initialize Fitemset = { }.
Step 2: for (i = 1; i < 2^|I|; i++)        // for each itemset Ai in ST; ST has 2^|I| - 1 entries, one per non-empty subset of I
            TCount = 0;                    // total support count of Ai
Step 3:     for (j = 1; j < 2^|I|; j++)    // accumulate counts over all table entries
Step 3.1:       if Ai ⊆ Aj then TCount = TCount + Scount(j)
Step 4:     if (TCount ≥ S0) then Fitemset = Fitemset ∪ {Ai}
Step 5:     go to Step 2 (next i)
Step 6: End
To find the frequent itemsets we make use of the support count table obtained in Step 1; for the example described in this section it is shown in Table 3.
Table 3: Frequency count table for the example database given below
No. | Itemset (A)   | Support count (Scount)
1   | {10}          | 2
2   | {20}          | 0
3   | {30}          | 0
4   | {40}          | 1
5   | {10,20}       | 1
6   | {10,30}       | 2
7   | {10,40}       | 0
8   | {20,30}       | 2
9   | {20,40}       | 0
10  | {30,40}       | 2
11  | {10,20,30}    | 2
12  | {10,20,40}    | 0
13  | {10,30,40}    | 0
14  | {20,30,40}    | 2
15  | {10,20,30,40} | 1
To better explain our algorithm, we now consider an example. Let I = {10, 20, 30, 40} be the set of four items, and assume the threshold value is 2. The total number of transactions in D is 15. The table of transactions of D is given below:
tid | transaction
1   | {10}
2   | {10,20}
3   | {30,40}
4   | {10,20,30,40}
5   | {10,30}
6   | {10,30}
7   | {30,40}
8   | {20,30,40}
9   | {20,30,40}
10  | {10,20,30}
11  | {20,30}
12  | {40}
13  | {20,30}
14  | {10,20,30}
15  | {10}
Step 1: By scanning the database we obtain the support count table shown in Table 3.
To check whether itemset {10} is frequent, we obtain its total support count by scanning the support count table: summing the Scount entries of every itemset that contains {10} gives 2 ({10}) + 1 ({10,20}) + 2 ({10,30}) + 0 ({10,40}) + 2 ({10,20,30}) + 0 ({10,20,40}) + 0 ({10,30,40}) + 1 ({10,20,30,40}) = 8. This total support count is compared with the threshold value 2; since the threshold 2 is less than the total count, the itemset {10} is frequent and is included in Fitemset. This process is repeated for every itemset.
In this way we obtain every frequent itemset using the support count table. The frequent itemsets for the given dataset are:
Fitemset = {{10}, {20}, {30}, {40}, {10,20}, {10,30}, {20,30}, {20,40}, {30,40}, {10,20,30}, {20,30,40}}
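The worked example can be reproduced in code. The sketch below is our illustration of the algorithm of Section 4.3 (not the authors' Java implementation): it builds the support count table in a single pass over the fifteen example transactions and then, for every candidate itemset A, sums the counts of all table itemsets that contain A before comparing the total with S0 = 2:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import java.util.TreeSet;

    class FrequentItemsetDemo {
        public static void main(String[] args) {
            int s0 = 2;                                        // support threshold S0
            List<Set<Integer>> D = List.of(                    // the 15 example transactions
                    Set.of(10), Set.of(10, 20), Set.of(30, 40), Set.of(10, 20, 30, 40),
                    Set.of(10, 30), Set.of(10, 30), Set.of(30, 40), Set.of(20, 30, 40),
                    Set.of(20, 30, 40), Set.of(10, 20, 30), Set.of(20, 30), Set.of(40),
                    Set.of(20, 30), Set.of(10, 20, 30), Set.of(10));

            // Step 1: single scan of D, counting occurrences of each exact transaction itemset.
            Map<Set<Integer>, Integer> scount = new HashMap<>();
            for (Set<Integer> t : D) scount.merge(t, 1, Integer::sum);

            // Steps 2-5: for every non-empty subset A of I, TCount(A) is the sum of Scount(B)
            // over all table itemsets B that contain A (unobserved itemsets contribute zero).
            List<Integer> items = List.of(10, 20, 30, 40);     // the item set I
            List<Set<Integer>> fitemset = new ArrayList<>();
            for (int mask = 1; mask < (1 << items.size()); mask++) {
                Set<Integer> a = new TreeSet<>();
                for (int b = 0; b < items.size(); b++) {
                    if ((mask & (1 << b)) != 0) a.add(items.get(b));
                }
                int tcount = 0;
                for (Map.Entry<Set<Integer>, Integer> e : scount.entrySet()) {
                    if (e.getKey().containsAll(a)) tcount += e.getValue();
                }
                if (tcount >= s0) fitemset.add(a);             // Step 4
            }
            System.out.println(fitemset);
        }
    }

Running this sketch should print the same eleven frequent itemsets as the Fitemset listed above, though in a different order.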
5. RESULT ANALYSIS
To study the performance of our proposed algorithm, we have carried out several experiments. The experimental environment is an Intel Core processor running Windows XP, and the algorithm is implemented in Java with NetBeans 7.1. The parameters used are as follows: D denotes the transaction database, I the number of items in the transactions, and S0 the minimum support (MINsupport). Table 4 shows the execution time in seconds when I = 5, the transactional database D is scaled up from 50 to 1000 transactions, and MINsupport S is scaled up from 2 to 8. We see from the table
that as we scale up MINsupport along a row the execution time decreases approximately linearly, while scaling up the database D increases the execution time, though not strictly linearly.
Table 4: Execution time (s) when D is scaled up from 50 to 1000 transactions and S is scaled up from 2 to 8.

No. of transactions | S=2  | S=3  | S=4  | S=5 | S=6 | S=8
50                  | 2    | 1.7  | 1.4  | 1.1 | 0.9 | 0.6
100                 | 4    | 3.4  | 2.8  | 2.2 | 1.8 | 1.2
200                 | 8    | 6.8  | 4.2  | 3.2 | 2.6 | 2
250                 | 9.5  | 8.5  | 6.7  | 4.6 | 4   | 3
300                 | 12   | 10.2 | 8.4  | 6.8 | 5.2 | 3.5
400                 | 14   | 12   | 10   | 8   | 6   | 5
500                 | 16.5 | 14   | 12.5 | 8.5 | 7   | 6
1000                | 30   | 25   | 20   | 18  | 14  | 10
Figure 4: Comparison of execution time (s) for MINsupport S0 = 2 with the algorithm given in reference [16].
Figure 4 shows the comparison of our proposed algorithm's execution time with S0 = 2 as the database D is scaled up from 50 to 175 transactions. The comparison shows that our approach gives somewhat better performance than the method proposed in reference [16].
Figure 2: Execution time (s), MINsupport S0 = 2.
Figure 2 shows that the algorithm's execution time (for MINsupport S0 = 2, I = 5) increases almost linearly with the size of the dataset, from which it can be concluded that our algorithm scales well. To examine the scalability of our algorithm further, we increased the dataset D from 1000 to 6000 transactions with the same parameters (MINsupport S0 = 2, I = 5); the result is given in Figure 5.
Figure 3: Execution time (s), transaction database D = 200.
[Figure 5 plot: execution time in seconds (y-axis, 0 to 120) versus number of transactions (x-axis, 1000 to 6000).]
Figure 5: Scale-up: Number of transactions.
6. CONCLUSION AND OUTLOOK
Data mining is the exploration of knowledge from the large sets of data generated as a result of various data processing activities, and frequent pattern mining is a very important task within it. The previous approaches applied to generate the frequent set generally adopt candidate generation and pruning techniques to satisfy the desired objective. In this paper we present an algorithm that is useful for data mining and knowledge discovery without candidate generation; our approach reduces disk access time and finds the frequent itemsets directly by using the support count table. The proposed method works well on static datasets using the support count table, and it also suits mining streams, which requires fast, real-time processing in order to keep up with the high data arrival rate, with mining results expected to be available within a short response time. We also demonstrate the algorithm on static datasets through the accompanying graphs of results.
In this paper we improve performance by avoiding candidate generation. The experiments indicate that the algorithm is faster and somewhat more efficient than previously presented itemset mining algorithms.
REFERENCES
[1] M. M. Gaber, A. Zaslavsky, and S. Krishnaswamy, "Mining data streams: A review," ACM SIGMOD Record, Vol. 34, No. 1, 2005.
[2] C. C. Aggarwal, Data Streams: Models and Algorithms. Springer, 2007.
[3] Nicholson, S., "The Bibliomining Process: Data Warehousing and Data Mining for Library Decision-Making," Information Technology and Libraries, 22(4):146-151, 2003.
[4] Jiann-Cherng Shieh and Yung-Shun Lin, "Bibliomining User Behaviors in the Library," Journal of Educational Media & Library Sciences, 44(1):36-60, 2006.
[5] Yiqing Qin, Bingru Yang, Guangmei Xu, et al., "Research on Evolutionary Immune Mechanism in KDD," in Proceedings of Intelligent Systems and Knowledge Engineering 2007 (ISKE 2007), Cheng Du, China, October 2007, pp. 94-99.
[6] B. R. Yang, Knowledge Discovery Based on Inner Mechanism: Construction, Realization and Application. USA: Elliott & Fitzpatrick Inc., 2004.
[7] Binesh Nair and Amiya Kumar Tripathy, "Accelerating Closed Frequent Itemset Mining by Elimination of Null Transactions," Journal of Emerging Trends in Computing and Information Sciences, Vol. 2, No. 7, July 2011, pp. 317-324.
[8] E. Ramaraj and N. Venkatesan, "Bit Stream Mask-Search Algorithm in Frequent Itemset Mining," European Journal of Scientific Research, ISSN 1450-216X, Vol. 27, No. 2, 2009, pp. 286-297.
[9] Shilpa and Sunita Parashar, "Performance Analysis of Apriori Algorithm with Progressive Approach for Mining Data," International Journal of Computer Applications (0975-8887), Vol. 31, No. 1, October 2011, pp. 13-18.
[10] G. Cormode and M. Hadjieleftheriou, "Finding frequent items in data streams," in Proceedings of the 34th International Conference on Very Large Data Bases (VLDB), pp. 1530-1541, Auckland, New Zealand, 2008.
[11] D. Y. Chiu, Y. H. Wu, and A. L. Chen, "Efficient frequent sequence mining by a dynamic strategy switching algorithm," The International Journal on Very Large Data Bases (VLDB Journal), 18(1):303-327, 2009.
[12] Jinwei Wang and Haitao Li, "An Interpolation Approach for Missing Context Data Based on the Time-Space Relationship and Association Rule Mining," Multimedia Information Networking and Security (MINES), IEEE, 2011.
[13] Chaudhary, M., Rana, A., and Dubey, G., "Online Mining of data to generate association rule mining in large databases," Recent Trends in Information Systems (ReTIS), 2011 International Conference on, IEEE, Dec. 2011.
[14] Fu Jun, Yuan Wen-hua, Tang Wei-xin, and Peng Yu, "Study on Monitoring Data Mining of Steam Turbine Based on Interactive Association Rules," Computer Distributed Control and Intelligent Environmental Monitoring (CDCIEM), IEEE, 2011.
[15] Jinguo, Xin; Tingting, Wei, "The application of association rules mining in data processing of private economy statistics," E-Business and E-Government (ICEE), IEEE, 2011.
[16] Weimin Ouyang and Qinhua Huang, "Discovery Algorithm for mining both Direct and Indirect weighted Association Rules," International Conference on Artificial Intelligence and Computational Intelligence, pp. 322-325, IEEE, 2009.
AUTHORS
Mr. Ram Ratan Ahirwal received his B.E. (First) degree in Computer Science & Engineering from GEC Bhopal, RGPV University, Bhopal, in 2002. In August 2003 he joined Samrat Ashok Technological Institute, Vidisha (M.P.), as a lecturer in the Computer Science & Engineering Department, and he completed his M.Tech degree (with honours) as a sponsored candidate in CSE from SATI (Engg. College), Vidisha, RGPV University, Bhopal (M.P.), India, in 2009. Currently he is working as an assistant professor in the CSE Department, SATI Vidisha. He has more than 12 publications in various refereed international journals and international conferences to his credit. His areas of interest are data mining, image processing, computer networks, network security, and natural language processing.
Neelesh Kumar Kori received his B.E. (First) degree in Information Technology from UIT, BU Bhopal (M.P.), India, in 2008, and he is currently pursuing an M.Tech in Computer Science & Engineering at SATI Vidisha (M.P.), India.
Dr. Y. K. Jain is Head of the CSE Department, SATI (Degree) Engg. College, Vidisha (M.P.), India. He has more than 30-40 publications in various refereed international journals and international conferences to his credit. His areas of interest are image processing and computer networks.