International Journal of Emerging Trends & Technology in Computer Science (IJETTCS)
Web Site: www.ijettcs.org Email: [email protected], [email protected]
Volume 1, Issue 2, July – August 2012 ISSN 2278-6856

Improved Data Mining Approach to Find Frequent Itemsets Using a Support Count Table

Ramratan Ahirwal(1), Neelesh Kumar Kori(2) and Dr. Y. K. Jain(3)
(1,2,3) Samrat Ashok Technological Institute, Vidisha (M.P.) 464001, India

Abstract: Mining frequent itemsets has been widely studied over the last decade. Past research focuses on mining frequent itemsets from static databases, but in many new applications, mining time series and data streams is now an important task. Over the last decade there have been mainly two kinds of algorithms for frequent pattern mining: Apriori, based on generate-and-test, and FP-growth, based on divide-and-conquer, both widely used in static data mining. With the new requirements of data mining, however, mining frequent patterns is no longer restricted to that scenario. In this paper we focus on a new mining algorithm that finds frequent patterns in a single scan of the database, with no candidate generation required. To achieve this goal, our algorithm employs a table that retains the support count information of the itemsets. For a static database the table is virtual, generated whenever required to find frequent itemsets, and it may also be useful for time series databases. Our algorithm is therefore suitable for both static and dynamic data mining. Results show that the algorithm is useful in today's data mining environment.

Keywords: Apriori, Association Rule, Frequent Pattern, Data Mining

1.
INTRODUCTION
Mining data streams is a very important research topic that has recently attracted a lot of attention, because in many cases data is generated by external sources so rapidly that it may become impossible to store and analyze it offline. Moreover, in some cases streams of data must be analyzed in real time to provide information about trends, outlier values or regularities that must be signaled as soon as possible. The need for online computation is a notable challenge with respect to classical data mining algorithms [1], [2]. Important application fields for stream mining are as diverse as financial applications, network monitoring, security, telecommunication networks, Web applications, sensor networks, analysis of atmospheric data, etc.

Innovations in computer science have made it possible to acquire and store enormous amounts of data digitally in databases, currently gigabytes or terabytes in a single database and even more in the future. Many fields and systems of human activity have become increasingly dependent on collected, stored, and processed information. However, the abundance of the collected data makes it laborious to find the essential information in it for a specific purpose. Data mining is the analysis of (often large) observational datasets, drawn from databases, data warehouses, or other large repositories of incomplete, noisy, ambiguous, and random data, to find unsuspected relationships and to summarize the data in ways that are both understandable and useful to the data owner. It is a means of data extraction, cleaning, transformation, analysis, and other treatments that automatically discovers the patterns and interesting knowledge hidden in large amounts of data; this helps us make decisions based on a wealth of data.
The information communication mode of software development lies in how to collect, analyze, and mine out the hidden useful information in the various data arising from communication between developers and from the staff's interaction with managers, and then use that knowledge to make decisions. Oustead College currently uses database technology to manage its library. Its main purpose is to facilitate the procurement of books, cataloging, and circulation management. In order to better satisfy the needs of readers, we must explore those needs and proactively provide the information readers require. Most current library evaluation techniques focus on frequencies and aggregate measures; these statistics hide underlying patterns. Discovering these patterns is the key to understanding how library services are used [3]. Data mining has been applied to library operations [4]. With the fast development of technology and the growing requirements of users, the dynamic elements in data mining are becoming more important, including dynamic databases and knowledge bases, users' interests, and data varying with time and space. In order to solve problems such as low effectiveness, high randomness and hard implementation in dynamic mining, more research on dynamic data mining has been done. In [5], [6], an evolutionary immune mechanism was proposed, based on the fact that the elements involved in these domains can be modeled as the ones in immune models. It focused on how to utilize the relationship between antigens and antibodies in dynamic data mining such as incremental mining. However, the sole immune mechanism and its algorithm run effectively only in incremental situations rather than in others.
Its performance and function have to be improved when used in more complex and dynamic environments like the Web. We provide here an overview of executing data mining services and association rules. The rest of this paper is arranged as follows: Section 2 introduces data mining and KDD; Section 3 gives the literature review; Section 4 describes the proposed work; Section 5 presents the result analysis of the algorithm and proposed work; Section 6 gives the conclusion and outlook.

2. DATA MINING AND KDD
Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information - information that can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Several algorithms have been devised for this [5]. The process is shown in Figure 1. Although data mining is a relatively new term, the technology is not. Companies have used powerful computers to sift through volumes of supermarket scanner data and analyze market research reports for years. However, continuous innovations in computer processing power, disk storage, and statistical software are dramatically increasing the accuracy of analysis while driving down the cost. At an abstract level, the KDD field is concerned with the development of methods and techniques for making sense of data.
The basic problem addressed by the KDD process is one of mapping low-level data (which are typically too voluminous to understand and digest easily) into other forms that might be more compact (for example, a short report), more abstract (for example, an approximation or model of the process that generated the data), or more useful (for example, a predictive model for estimating the value of future cases). At the core of the process is the application of specific data-mining methods for pattern discovery and extraction. The traditional method of turning data into knowledge relies on manual analysis and interpretation. For example, in the health-care industry, it is common for specialists to periodically analyze current trends and changes in health-care data, say, on a quarterly basis. The specialists then provide a report detailing the analysis to the sponsoring health-care organization; this report becomes the basis for future decision making and planning for health-care management. In a totally different type of application, planetary geologists sift through remotely sensed images of planets and asteroids, carefully locating and cataloging such geologic objects of interest as impact craters. Be it science, marketing, finance, health care, retail, or any other field, the classical approach to data analysis relies fundamentally on one or more analysts becoming intimately familiar with the data and serving as an interface between the data and the users and products. For these (and many other) applications, this form of manual probing of a data set is slow, expensive, and highly subjective. In fact, as data volumes grow dramatically, this type of manual data analysis is completely impractical in many domains. Databases are increasing in size in two ways: (1) the number N of records or objects in the database, and (2) the number d of fields or attributes per object.
Figure 1: Data Mining Algorithm

Databases containing on the order of N = 10^9 objects are becoming increasingly common, for example, in the astronomical sciences. Similarly, the number of fields d can easily be on the order of 10^2 or even 10^3, for example, in medical diagnostic applications. Who could be expected to digest millions of records, each having tens or hundreds of fields? We believe that this job is certainly not one for humans; hence, analysis work needs to be automated, at least partially. The need to scale up human analysis capabilities to handle the large number of bytes that we can collect is both economic and scientific. Businesses use data to gain competitive advantage, increase efficiency, and provide more valuable services. The data we capture about our environment are the basic evidence we use to build theories and models of the universe we live in. Because computers have enabled humans to gather more data than we can digest, it is only natural to turn to computational techniques to help us unearth meaningful patterns and structure from the massive volumes of data. Hence, KDD is an attempt to address a problem that the digital information era made a fact of life for all of us: data overload.

3. LITERATURE REVIEW
In 2011, Jinwei Wang et al. [12] proposed, to overcome the shortcomings and deficiencies of existing interpolation techniques for missing data, an interpolation technique for missing context data based on Time-Space Relationship and Association Rule Mining (TSRARM), which performs spatial and time series analysis on sensor data and generates strong association rules to interpolate the missing data.
Finally, a simulation experiment verifies the rationality and efficiency of TSRARM through the acquisition of temperature sensor data. In 2011, M. Chaudhary et al. [13] proposed a new, more optimized algorithm for online rule generation. The advantage of this algorithm is that the graph it generates has fewer edges compared to the lattice used in the existing algorithm. The proposed algorithm also generates all the essential rules, and no rule is missing. The use of non-redundant association rules helps significantly in reducing irrelevant noise in the data mining process. This graph-theoretic approach, called the adjacency lattice, is crucial for online mining of data. The adjacency lattice can be stored either in main memory or in secondary memory; the idea is to pre-store a number of large itemsets in a special format, which reduces the disk I/O required in answering a query. In 2011, Fu et al. [14] observed that mining real-time monitoring data has become a necessary means of improving the operational efficiency, economic safety and fault detection of power plants. Based on the data mining arithmetic of interactive association rules, and taking full advantage of the association characteristics of real-time test-spot data during steam turbine operation, they put forward a principle for mining quantitative association rules among the parameters of real-time steam turbine monitoring data. Analysis of the practical run results of a steam turbine with this method shows that it can supervise steam turbine operation and condition monitoring, and provide model references and decision-making support for fault diagnosis and condition-based maintenance. In 2011, Xin et al. [15] used association rule learning to process statistical data of the private economy and analyzed the results to improve the quality of private economy statistics.
Finally, the article provides some exploratory comments and suggestions about the application of association rule mining in private economy statistics.

4. PROPOSED WORK AND ALGORITHM
Frequent itemset mining was introduced in [2] by Agrawal and Srikant. To facilitate our discussion, we give the formal definitions as follows. Let I = {i1, i2, i3, ..., im} be a set of items. An itemset X is a subset of I. X is called a k-itemset if |X| = k, where k is the size (or length) of the itemset. A transaction T is a pair (tid, X), where tid is a unique identifier of the transaction and X is an itemset. A transaction (tid, X) is said to contain an itemset Y iff Y ⊆ X. A dataset D is a set of transactions. Given a dataset D, the support of an itemset X, denoted Supp(X), is the fraction of transactions in D that contain X. An itemset X is frequent if Supp(X) is no less than a given threshold S0. An important property of frequent itemsets, called the Apriori property, is that every nonempty subset of a frequent itemset must also be frequent. The problem of finding frequent itemsets can be specified as: given a dataset D and a support threshold S0, find every itemset whose support in D is no less than S0. It is clear that the Apriori algorithm needs at most l + 1 scans of database D if the maximum size of a frequent itemset is l. In the context of data streams, to avoid disk access, previous studies focus on finding an approximation of the frequent itemsets within a bound on space complexity. When mining frequent itemsets in static databases, all the frequent itemsets and their support counts derived from the original database are retained. When transactions are added or expire, the support counts of the frequent itemsets contained in them are recomputed. By reusing the retained frequent itemsets and their counts, the number of candidate itemsets generated during the mining process can be reduced.
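These definitions can be made concrete with a short sketch; the function names and the toy dataset below are ours for illustration, not from the paper:

```python
# Sketch of Supp(X) and the frequent-itemset test from the definitions above.
# The toy dataset D is illustrative only.

def supp(X, D):
    """Supp(X): fraction of transactions in D whose itemset contains X."""
    X = frozenset(X)
    return sum(1 for T in D if X <= T) / len(D)

def is_frequent(X, D, s0):
    """An itemset X is frequent iff Supp(X) is no less than the threshold S0."""
    return supp(X, D) >= s0

D = [frozenset(t) for t in ({1, 2}, {1, 3}, {1, 2, 3}, {2})]
print(supp({1, 2}, D))           # {1,2} is contained in 2 of 4 transactions -> 0.5
print(is_frequent({1}, D, 0.5))  # Supp({1}) = 0.75 >= 0.5 -> True
```

The Apriori property is visible here: any superset of {1, 2} can only be contained in a subset of the transactions that contain {1, 2}, so its support can never exceed Supp({1, 2}).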
However, a rescan of the original database is later required, because non-frequent itemsets can become frequent after the database is updated. Therefore these methods cannot work without seeing the entire database and cannot be applied to data streams. In our approach we introduce a new method that requires only a single scan of database D to count the support of each itemset, with no candidate generation or pruning needed to find the frequent itemsets. Our algorithm thus reduces disk access time and finds the frequent itemsets directly by using a support count table. The method is applicable to static databases, and to dynamic databases if the table is created at the initial stage.

4.1 Support Count Table
As stated previously, every itemset X of a transaction T is a subset of I (X ⊆ I), and a set of such transactions is the database D. So in database D every transaction itemset X is an element of the power set of I excluding the empty set, giving 2^|I| - 1 possible itemsets. Hence our algorithm employs a table called the support count table. This table is treated as virtual and is created when required for finding frequent itemsets. The table has 2^|I| - 1 rows and two fields: itemset and support count. In this table we record the frequency count of each itemset observed in the transaction database, that is, the number of occurrences of exactly that itemset in the transactional database D. The table is generated and may be kept in cache memory until the frequent itemsets are found. The generated table may be used for a stationary database as well as for a time series database. The table layout is given below.
4.2 Entries in the Support Count Table
The support count table may be used to find frequent itemsets from static datasets as well as from streaming datasets, where we use a windowing concept. For a static database, the table may be created when we want to analyze the database: we scan the database once and make an entry for every transaction. Initially, all support count entries in the table are set to zero. If we are using a static, fixed database D, we update the table in a single scan of D: for each transaction itemset X in D, find the corresponding itemset in the table and increment its count. In this way we make entries for each transaction T. The table may then be retained in memory until the observation is complete, so that added or expired transactions only require updating the table. If we consider the database D as a random or streaming database, the table is even more useful, because every incoming or expiring transaction only requires updating the table by incrementing or decrementing the count of the corresponding itemset, and the table may be stored efficiently so that it can be used to find the frequent itemsets or association rules. In this approach we are not required to keep the database on disk; it is only necessary to save the table and use it whenever needed to find frequent itemsets.

Table 1: Support count table ST

No.        Itemset (A)    Support count (Scount)
1          .              .
.          .              .
2^|I|-1    .              .

For example, let I = (i1, i2, i3, i4) be the set of items. The itemsets that may be generated from I are {i1}, {i2}, {i3}, ..., {i1,i2,i3,i4}, and every transaction itemset X that may occur in database D is one of these subsets of I. The table is initially created as given below.

Table 2: Initial support count table for I = (i1, i2, i3, i4)

No.   Itemset (A)      Support count (Scount)
1     {i1}             0
2     {i2}             0
3     {i3}             0
4     {i4}             0
5     {i1,i2}          0
6     {i1,i3}          0
7     {i1,i4}          0
8     {i2,i3}          0
9     {i2,i4}          0
10    {i3,i4}          0
11    {i1,i2,i3}       0
12    {i1,i2,i4}       0
13    {i1,i3,i4}       0
14    {i2,i3,i4}       0
15    {i1,i2,i3,i4}    0

4.3 Proposed Method to Find Frequent Itemsets
In our proposed work we give a method that may be useful for static as well as streaming databases to find frequent itemsets. We employ the support count table, which requires scanning the database only once to make the entries for each transaction; the table retains this information until the observation is complete or the frequent itemsets are found. When transactions are added to or expire from the dataset, the table is updated simultaneously. The updated support count table holds the frequency count of each itemset. To find the frequent itemsets for any threshold value, we scan the table, not the database. Apriori requires l + 1 scans of the dataset and generates candidates to find the frequent set; our approach needs only a single scan of the database and no candidate generation. Note that the table holds the frequency count of every itemset, not its total support count. The frequency count of an itemset is the number of occurrences of exactly that itemset in the transactional database D, so to find the frequent itemsets we must compute the total support count of each itemset: the number of transactions in D that contain all the items of that itemset. In our scheme this total count is calculated by scanning the table, and the resulting total support count is compared with the threshold S0; if the count is no less than the threshold, the itemset is included in the frequent set. This procedure is repeated for every itemset.

Algorithm: To find frequent itemsets
Input: A database D and the support threshold S0.
Output: Frequent itemsets Fitemset.
Method
Step 1: Scan the transaction database D once and update the support count table ST as given in Sec. 4.2; set Fitemset = { }.
Step 2: for (i = 1; i < 2^|I|; i++)    // for each itemset Ai in ST, repeat the following steps;
                                       // 2^|I| - 1 is the number of nonempty subsets of I
          TCount = 0;                  // total support count
Step 3:   for (j = 1; j < 2^|I|; j++)  // repeat Step 3.1 to find the total count
Step 3.1:   if (Ai ⊆ Aj) then TCount = TCount + Scount(j);
Step 4:   if (TCount ≥ S0) then Fitemset = Fitemset ∪ {Ai};
Step 5: Go to Step 2 for the next itemset.
Step 6: End.

To better explain our algorithm, we now consider an example. Let I = (10, 20, 30, 40) be the set of four items, and let the assumed threshold be 2. The total number of transactions in D is 15. The transactions of D are given below:

tid   Transaction
1     {10}
2     {10,20}
3     {30,40}
4     {10,20,30,40}
5     {10,30}
6     {10,30}
7     {30,40}
8     {20,30,40}
9     {20,30,40}
10    {10,20,30}
11    {20,30}
12    {40}
13    {20,30}
14    {10,20,30}
15    {10}

Step 1: By scanning the database once, the support count table is filled in as given in Table 3.

Table 3: Frequency counts for the above example

No.   Itemset (A)      Support count (Scount)
1     {10}             2
2     {20}             0
3     {30}             0
4     {40}             1
5     {10,20}          1
6     {10,30}          2
7     {10,40}          0
8     {20,30}          2
9     {20,40}          0
10    {30,40}          2
11    {10,20,30}       2
12    {10,20,40}       0
13    {10,30,40}       0
14    {20,30,40}       2
15    {10,20,30,40}    1

To check whether itemset {10} is frequent, we obtain its total support count by scanning the support count table and summing the counts of every superset of {10}; from the table, the total support of {10} is 8. This value is compared with the threshold 2; since 8 is no less than the threshold, the itemset {10} is frequent and is included in Fitemset. This process is repeated for every itemset.
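As a cross-check of the worked example, the whole method (single-scan table construction plus table-only querying) can be sketched in a few lines of Python; the function names are ours, not the paper's:

```python
from itertools import combinations

def build_support_count_table(items, transactions):
    """Single scan of D: count exact occurrences of each nonempty subset of I."""
    table = {frozenset(s): 0
             for r in range(1, len(items) + 1)
             for s in combinations(items, r)}
    for t in transactions:                 # the one and only pass over the database
        table[frozenset(t)] += 1
    return table

def frequent_itemsets(table, s0):
    """Scan only the table: total support of A is the sum of counts of A's supersets."""
    result = []
    for a in table:
        tcount = sum(c for b, c in table.items() if a <= b)
        if tcount >= s0:
            result.append(a)
    return result

# Worked example from the paper: I = (10, 20, 30, 40), S0 = 2, 15 transactions.
D = [{10}, {10, 20}, {30, 40}, {10, 20, 30, 40}, {10, 30}, {10, 30},
     {30, 40}, {20, 30, 40}, {20, 30, 40}, {10, 20, 30}, {20, 30},
     {40}, {20, 30}, {10, 20, 30}, {10}]
table = build_support_count_table([10, 20, 30, 40], D)
fitemset = frequent_itemsets(table, 2)
print(sorted(sorted(x) for x in fitemset))
```

Run on the fifteen transactions above with S0 = 2, this reproduces the frequency counts of Table 3 (for example, Scount({10}) = 2), a total support of 8 for {10}, and eleven frequent itemsets in all.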
In this way we obtain every frequent itemset using the support count table. The frequent itemsets for the given dataset are:
Fitemset = {{10}, {20}, {30}, {40}, {10,20}, {10,30}, {20,30}, {20,40}, {30,40}, {10,20,30}, {20,30,40}}

5. RESULT ANALYSIS
To study the performance of our proposed algorithm, we have done several experiments. The experimental environment is an Intel Core processor running Windows XP, and the algorithm is implemented in Java with NetBeans 7.1. The parameters used are as follows: D for the transaction database, I for the number of items in transactions, and S0 for the minimum support (MINsupport). Table 4 shows the execution times in seconds when I = 5, the transactional database D scales up from 50 to 1000 transactions, and MINsupport S scales up from 2 to 8. We see from the table that, along a row, execution time decreases roughly linearly as MINsupport scales up, while scaling up the database D increases the time, though not in a strictly linear way.

Table 4: Execution time (s) when D scales up from 50 to 1000 and S scales up from 2 to 8

No. of          Minimum support (S)
transactions    2      3      4      5      6      8
50              2      1.7    1.4    1.1    0.9    0.6
100             4      3.4    2.8    2.2    1.8    1.2
200             8      6.8    4.2    3.2    2.6    2
250             9.5    8.5    6.7    4.6    4      3
300             12     10.2   8.4    6.8    5.2    3.5
400             14     12     10     8      6      5
500             16.5   14     12.5   8.5    7      6
1000            30     25     20     18     14     10

Figure 2: Execution time (s), MINsupport (S0 = 2)

Figure 2 shows that the algorithm's execution time (for MINsupport S0 = 2, I = 5) increases almost linearly with increasing dataset size. It can be concluded that our algorithm has good scalable performance.

Figure 3: Execution time (s), transaction database (D = 200)

Figure 4: Comparison of execution time (s) for MINsupport (S0 = 2) with the algorithm given in reference [16]

Figure 4 shows the comparison of our proposed algorithm's execution time, with S0 = 2 and database D scaled up from 50 to 175. The comparison results show that our approach gives somewhat better performance than the method proposed in reference [16]. To further examine the scalability of our algorithm, we increased the dataset D from 1000 to 6000 with the same parameters, MINsupport (S0 = 2) and I = 5; the result is given in Figure 5.

Figure 5: Scale-up: number of transactions

6. CONCLUSION AND OUTLOOK
Data mining is the exploration of knowledge from the large sets of data generated as a result of various data processing activities. Frequent pattern mining is a very important task in data mining. Previous approaches to generating the frequent set generally adopt candidate generation and pruning techniques to meet the desired objective. In this paper we present an algorithm that is useful for data mining and knowledge discovery without candidate generation; our approach reduces disk access time and finds the frequent itemsets directly by using a support count table. The proposed method works well with static datasets using the support count table, and also suits mining streams, which requires fast, real-time processing in order to keep up with the high data arrival rate, with mining results expected within a short response time. We also validate the algorithm on static datasets through the graphed results. In this paper we improve performance by avoiding candidate generation.
The experiments indicate that the algorithm is faster and more efficient than the previously presented itemset mining algorithm.

REFERENCES
[1] M. M. Gaber, A. Zaslavsky, and S. Krishnaswamy, "Mining data streams: a review," ACM SIGMOD Record, vol. 34, no. 1, 2005.
[2] C. C. Aggarwal, Data Streams: Models and Algorithms. Springer, 2007.
[3] S. Nicholson, "The bibliomining process: data warehousing and data mining for library decision-making," Information Technology and Libraries, 22(4):146-151, 2003.
[4] Jiann-Cherng Shieh and Yung-Shun Lin, "Bibliomining user behaviors in the library," Journal of Educational Media & Library Sciences, 44(1):36-60, 2006.
[5] Yiqing Qin, Bingru Yang, Guangmei Xu, et al., "Research on evolutionary immune mechanism in KDD," in Proceedings of Intelligent Systems and Knowledge Engineering 2007 (ISKE 2007), Cheng Du, China, October 2007, pp. 94-99.
[6] B. R. Yang, Knowledge Discovery Based on Inner Mechanism: Construction, Realization and Application. USA: Elliott & Fitzpatrick Inc., 2004.
[7] Binesh Nair and Amiya Kumar Tripathy, "Accelerating closed frequent itemset mining by elimination of null transactions," Journal of Emerging Trends in Computing and Information Sciences, vol. 2, no. 7, July 2011, pp. 317-324.
[8] E. Ramaraj and N. Venkatesan, "Bit stream mask-search algorithm in frequent itemset mining," European Journal of Scientific Research, ISSN 1450-216X, vol. 27, no. 2, 2009, pp. 286-297.
[9] Shilpa and Sunita Parashar, "Performance analysis of Apriori algorithm with progressive approach for mining data," International Journal of Computer Applications (0975-8887), vol. 31, no. 1, October 2011, pp. 13-18.
[10] G. Cormode and M. Hadjieleftheriou, "Finding frequent items in data streams," in Proceedings of the 34th International Conference on Very Large Data Bases (VLDB), pp. 1530-1541, Auckland, New Zealand, 2008.
[11] D. Y. Chiu, Y. H. Wu, and A. L. Chen, "Efficient frequent sequence mining by a dynamic strategy switching algorithm," The VLDB Journal, 18(1):303-327, 2009.
[12] Jinwei Wang and Haitao Li, "An interpolation approach for missing context data based on the time-space relationship and association rule mining," in Multimedia Information Networking and Security (MINES), IEEE, 2011.
[13] M. Chaudhary, A. Rana, and G. Dubey, "Online mining of data to generate association rule mining in large databases," in Recent Trends in Information Systems (ReTIS), 2011 International Conference on, IEEE, December 2011.
[14] Fu Jun, Yuan Wen-hua, Tang Wei-xin, and Peng Yu, "Study on monitoring data mining of steam turbine based on interactive association rules," in Computer Distributed Control and Intelligent Environmental Monitoring (CDCIEM), IEEE, 2011.
[15] Xin Jinguo and Wei Tingting, "The application of association rules mining in data processing of private economy statistics," in E-Business and E-Government (ICEE), IEEE, 2011.
[16] Weimin Ouyang and Qinhua Huang, "Discovery algorithm for mining both direct and indirect weighted association rules," in International Conference on Artificial Intelligence and Computational Intelligence, pp. 322-325, IEEE, 2009.

AUTHORS
Mr. Ram Ratan Ahirwal received his B.E. (First) degree in Computer Science & Engineering from GEC Bhopal, University RGPV Bhopal, in 2002. In August 2003 he joined Samrat Ashok Technological Institute, Vidisha (M.P.), as a lecturer in the Computer Science & Engineering Department, and completed his M.Tech degree (with honors) as a sponsored candidate in CSE from SATI (Engg. College), Vidisha, University RGPV Bhopal (M.P.), India, in 2009. Currently he is working as an assistant professor in the CSE Department, SATI Vidisha. He has more than 12 publications in various refereed international journals and international conferences to his credit.
His areas of interest are data mining, image processing, computer networks, network security and natural language processing.

Neelesh Kumar Kori received his B.E. (First) degree in Information Technology from UIT, BU Bhopal (M.P.), India, in 2008, and is currently pursuing an M.Tech in Computer Science & Engineering at SATI Vidisha (M.P.), India.

Dr. Y. K. Jain is Head of the CSE Department, SATI (Degree) Engineering College, Vidisha (M.P.), India. He has 30-40 publications in various refereed international journals and international conferences to his credit. His areas of interest are image processing and computer networks.