International Journal of Software Engineering and Its Applications
Vol.8, No.1 (2014), pp.21-32
http://dx.doi.org/10.14257/ijseia.2014.8.1.02
Fast Determination of Items Support Technique from Enhanced
Tree Data Structure
Zailani Abdullah¹, Tutut Herawan², A. Noraziah³ and Mustafa Mat Deris⁴
¹ Department of Computer Science, Universiti Malaysia Terengganu, 21030 Kuala Terengganu, Terengganu, Malaysia
² Department of Mathematics Education, Universitas Ahmad Dahlan, Jalan Prof Dr Soepomo 55166, Yogyakarta, Indonesia
³ Faculty of Computer Systems and Software Engineering, Universiti Malaysia Pahang, Lebuhraya Tun Razak, 26300 Kuantan Pahang, Malaysia
⁴ Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Parit Raja, Batu Pahat 86400, Johor, Malaysia
[email protected], [email protected], [email protected],
[email protected]
Abstract
The Frequent Pattern Tree (FP-Tree) is one of the best-known data structures for storing
frequent itemsets. However, when the content of the transactional database is modified,
the FP-Tree must be reconstructed because the patterns and item supports change. Until
recently, most frequent pattern mining techniques have determined item supports from the
original database rather than from their own tree data structures. In this paper, we
therefore propose the Fast Determination of Item Support Technique (F-DIST), which
captures item supports from our suggested Disorder Support Trie Itemset (DOSTrieIT) data
structure. Experiments with the UCI datasets show that determining item supports using
F-DIST from DOSTrieIT takes less processing time than the classical FP-Tree technique.
Furthermore, constructing a complete DOSTrieIT tree takes less processing time than
constructing the benchmarked CanTree data structure.
Keywords: Association Rules; Frequent Pattern; Tree Structure; Fast Technique
1. Introduction
Over the past decades, frequent pattern mining has received a great deal of attention
[1, 2, 3, 20, 22-27]. It was first introduced by Agrawal et al. [4] and remains an
active research topic in the data mining community. Hundreds of research papers have
been published, proposing new algorithms or modifying existing ones. Generally, the
main problem in frequent pattern mining is how to efficiently manage huge amounts of
data in the computer's memory.
As a result, the frequent pattern tree (FP-Tree) [1] was proposed and became one of the
alternative data structures for storing vast transactional data in a compressed manner.
Since then, several variations of constructing or updating the FP-Tree have been
proposed and discussed in the literature [1, 5-11]. However, two major drawbacks remain
from past studies. First, when the existing transactions in the database are updated,
the current FP-Tree must be rebuilt from scratch. Second, in order to reconstruct the
FP-Tree, all item supports must be recounted from the original database because the set
of frequent items changes. Therefore, to reduce the processing time of capturing item
supports, an enhanced tree data structure called Disorder Support Trie Itemset
(DOSTrieIT) and the Fast Determination of Item Support Technique (F-DIST) are proposed
and evaluated with four datasets from the Frequent Itemset Mining Dataset Repository
[21]. The performance of F-DIST in determining item supports was compared against the
benchmarked FP-Tree technique.
ISSN: 1738-9984 IJSEIA
Copyright ⓒ 2014 SERSC
In summary, this work makes three main contributions. First, we propose a novel,
complete and incremental pattern tree data structure, DOSTrieIT, that can keep the
entire transactional database. Second, we embed a feature called Single Item without
Extension (SIWE) in DOSTrieIT to speed up the process of capturing item supports.
Lastly, we suggest F-DIST as a technique to efficiently determine item supports from
DOSTrieIT.
The paper is organized as follows. Section 2 reviews related work. Section 3 discusses
the basic concepts and terminology of association rules. Section 4 elaborates the
proposed methods. The experiments are reported and discussed in detail in Section 5.
Finally, Section 6 concludes the paper.
2. Related Works
Since its introduction by Agrawal et al. in 1993 [4], frequent pattern mining has
received a great deal of attention from data mining researchers [1, 3]. Hundreds of
papers have been published in an attempt to increase its efficiency and scalability. In
general, algorithms for mining frequent itemsets can be classified into three groups:
Apriori-like algorithms, frequent-pattern-based algorithms and algorithms that use the
vertical data format.
Because of two nontrivial costs in Apriori [12], namely the cost of generating candidate
itemsets and the cost of repeatedly scanning the database, frequent-pattern-based
algorithms that avoid candidate generation have been proposed. This approach constructs
a compact data structure known as the FP-Tree [1] from the original transaction database.
Typically, before a prefix path is inserted into the FP-Tree, the items in the
transaction must be sorted in support-descending order and must satisfy the minimum
support threshold. In addition, the FP-Tree is constructed offline.
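The preprocessing step described above can be sketched in a few lines of Python. This is only an illustration of the general idea (count supports, drop infrequent items, reorder each transaction by descending support); the function name `prepare_for_fp_tree`, the toy transactions and the tie-breaking rule are assumptions made for this example and are not taken from [1].

```python
from collections import Counter


def prepare_for_fp_tree(transactions, min_support):
    """Return (item supports, transactions filtered and reordered for insertion)."""
    # First scan: count in how many transactions each item occurs.
    support = Counter(item for t in transactions for item in set(t))
    # Keep only the items that satisfy the minimum support threshold.
    frequent = {item for item, count in support.items() if count >= min_support}
    # Second scan: drop infrequent items and sort the rest by descending support
    # (ties are broken by item name so that the order is deterministic).
    ordered = [
        sorted((i for i in set(t) if i in frequent), key=lambda i: (-support[i], i))
        for t in transactions
    ]
    return support, ordered


# Hypothetical toy database with five transactions.
transactions = [["a", "b", "e"], ["b", "d"], ["b", "c"], ["a", "b", "d"], ["a", "c"]]
support, ordered = prepare_for_fp_tree(transactions, min_support=2)
print(ordered)
```

Both scans read the original database, which is the cost that the rest of the paper aims to avoid when the database changes.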
Since the FP-Tree was introduced, many related approaches have been put forward, such as
H-Mine [13], PatriciaMine [14], FPgrowth* [15], SOTrieIT [16], AFOPT [5], AFPIM [6],
EFPIM [7], CATS-Tree [8], CanTree [17], FUFP-Tree [9], CP-Tree [18], BSM [19] and
BIT [11]. H-Mine [13], PatriciaMine [14] and FPgrowth* [15] were proposed to address
the limitations of the FP-Growth algorithm. H-Mine and FPgrowth* use array-based
techniques to speed up the mining process. The AFPIM [6], EFPIM [7] and FUFP-Tree [9]
algorithms use a compact data structure to perform incremental mining: the FP-Tree is
adjusted according to the latest changes in the transactions. However, these approaches
still require two database scans to construct the FP-Tree and to update it. The
limitations of AFPIM and EFPIM are addressed by CATS-Tree [8], which requires only one
database scan. Indeed, CATS-Tree scans only the updated portion of the database rather
than the whole updated database. However, its tree construction process is very
complicated and it is only suitable for static databases. The drawbacks of CATS-Tree are
addressed by CanTree [17], which requires only a simple tree construction and mines the
frequent patterns by divide-and-conquer. However, CanTree is not as compact as the
FP-Tree because the items in the tree are not stored in support-descending order.
CP-Tree [18] and BSM [19] were proposed in an attempt to mitigate this limitation of
CanTree. Their compact prefix-tree structure is constructed with one database scan, and
the items in the tree are arranged as in the FP-Tree. However, the construction process
of CP-Tree is quite complicated, and its benefit remains questionable because of the
overhead of adjusting the tree structure whenever a portion of the database is updated.
The BIT [11] algorithm merges two small FP-Trees built for consecutive durations to
obtain the final FP-Tree before continuing the mining process. However, the process of
merging two sets of FP-Trees is very complex, and its memory consumption is unclear.
Furthermore, it is only suitable for batch processing and not suitable for incremental
mining.
3. Association Rules


Throughout this section, the set $I = \{i_1, i_2, \ldots, i_A\}$, for $A > 0$, refers to
the set of literals called the set of items, and the set $D = \{t_1, t_2, \ldots, t_U\}$,
for $U > 0$, refers to the data set of transactions, where each transaction $t \in D$ is
a list of distinct items $t = \{i_1, i_2, \ldots, i_M\}$, $1 \le M \le A$, and each
transaction can be identified by a distinct identifier TID.
Definition 1. A set $X \subseteq I$ is called an itemset. An itemset with k items is
called a k-itemset.
Definition 2. The support of an itemset $X \subseteq I$, denoted $\mathrm{supp}(X)$, is
defined as the number of transactions that contain X.
Definition 3. Let $X, Y \subseteq I$ be itemsets. An association rule between sets X and
Y is an implication of the form $X \Rightarrow Y$, where $X \cap Y = \emptyset$. The sets
X and Y are called the antecedent and the consequent, respectively.
Definition 4. The support of an association rule $X \Rightarrow Y$, denoted
$\mathrm{supp}(X \Rightarrow Y)$, is defined as the number of transactions in D that
contain $X \cup Y$.
Definition 5. The confidence of an association rule $X \Rightarrow Y$, denoted
$\mathrm{conf}(X \Rightarrow Y)$, is defined as the ratio of the number of transactions
in D that contain $X \cup Y$ to the number of transactions in D that contain X. Thus
$$\mathrm{conf}(X \Rightarrow Y) = \frac{\mathrm{supp}(X \Rightarrow Y)}{\mathrm{supp}(X)}$$
Definition 6. An itemset X is called a frequent item if $\mathrm{supp}(X) \ge \alpha$,
where $\alpha$ is the minimum support.
The set of frequent items is denoted Frequent Items, where
$$\text{Frequent Items} = \{X \subset I \mid \mathrm{supp}(X) \ge \alpha\}$$
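As a small worked illustration of Definitions 2, 4 and 5, the following Python sketch counts supports and computes a confidence on a toy data set. The function names `supp` and `conf` mirror the notation above; the five transactions in `D` are hypothetical and chosen only for this example.

```python
def supp(itemset, transactions):
    """Number of transactions that contain every item of `itemset` (Definition 2)."""
    target = set(itemset)
    return sum(1 for t in transactions if target <= set(t))


def conf(antecedent, consequent, transactions):
    """supp(X => Y) / supp(X), where supp(X => Y) counts X union Y (Definitions 4, 5)."""
    x, y = set(antecedent), set(consequent)
    if x & y:
        raise ValueError("antecedent and consequent must be disjoint")
    supp_x = supp(x, transactions)
    if supp_x == 0:
        raise ValueError("antecedent never occurs, so the confidence is undefined")
    return supp(x | y, transactions) / supp_x


# Hypothetical toy data set D with five transactions.
D = [{1, 2, 5}, {2, 4}, {2, 3}, {1, 2, 4}, {1, 3}]

print(supp({2}, D))       # item 2 occurs in four transactions
print(supp({1, 2}, D))    # items 1 and 2 occur together in two transactions
print(conf({1}, {2}, D))  # two of the three transactions containing 1 also contain 2
```

With a minimum support of 2, the single items 1, 2, 3 and 4 would be frequent in this toy data set, while item 5 (support 1) would not.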
4. Proposed Model
4.1. Definition
To make the whole process in DOSTrieIT easier to follow, the required definitions are
presented together with sample transactional data.
Definition 7. Disorder Support Trie Itemset (DOSTrieIT) is a complete tree data structure
that stores itemsets in canonical order. The order of the items is not based on
support-descending order. DOSTrieIT contains n levels of tree nodes (items) together with
their supports. Moreover, DOSTrieIT is constructed online, for the purpose of incremental
pattern mining.
Example 1. Let
$T = \{\{1,2,5\}, \{2,4\}, \{2,3\}, \{1,2,4\}, \{1,3\}, \{2,3,6\}, \{1,3\}, \{1,2,3,5\}, \{1,2,3\}\}$.
The step-by-step construction of DOSTrieIT is explained in the next section. Graphically,
an item is represented as a node and its support appears next to the node. The complete
structure of DOSTrieIT is shown in Figure 1.
Figure 1. DOSTrieIT and SIWE Path Arranged in Support Descending Order
Definition 8. Single Item without Extension (SIWE) is a prefix path in the tree that
contains only one item or node. A SIWE is constructed upon receiving a new transaction
and serves as a mechanism for quickly looking up the support of a single item. It is
employed during the tree transformation process but is not physically transferred into
the other tree.
Example 2. From Example 1, the transactions contain 6 unique items, which are not sorted
in any order. In Figure 1, the SIWE for DOSTrieIT is
$\mathrm{SIWE} = \{2, 1, 3, 4, 5, 6\}$.
Proposition 1. (Instant Support of Single Items Property). For any item $a_i$, the item
support is instantly obtained from the first level of DOSTrieIT. All of these items or
nodes have no extension and are therefore SIWE.
Justification. Let $a_1, a_2, \ldots, a_n$ be single items. From Definition 8, each SIWE
$a_1, a_2, \ldots, a_n$ is a prefix path in the tree that contains only one item or node,
and it is constructed upon receiving a new transaction. This accelerates the process of
updating and/or searching for the support of $a_1, a_2, \ldots, a_n$. The trie traversal
for examining the support of $a_1, a_2, \ldots, a_n$ is truncated once it reaches the
last single item without extension. The SIWE is employed during the tree transformation
process but is not physically transferred into the FP-Tree.
Example 3. Let us examine the sample transactions from Example 1. The supports of items
2, 1, 3, 4, 5 and 6 are 7, 6, 6, 2, 2 and 1, respectively. For CanTree and CATS-Tree,
this support information can only be captured after scanning all 9 transactions.
However, the same information can easily be determined from DOSTrieIT via the SIWE, as
shown in Figure 1. The item supports are obtained by traversing the trie and stopping
immediately once no more single items without extension are found.
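The contrast drawn in Example 3 can be illustrated with a short Python sketch. It compares a full scan of the nine transactions of Example 1 with counters that are updated once per arriving transaction. The `siwe` counter is only a hypothetical stand-in for the level-1 SIWE nodes of Figure 1, not the authors' data structure.

```python
from collections import Counter

# The nine transactions of Example 1.
T = [{1, 2, 5}, {2, 4}, {2, 3}, {1, 2, 4}, {1, 3},
     {2, 3, 6}, {1, 3}, {1, 2, 3, 5}, {1, 2, 3}]


def full_scan(transactions):
    """Recount every item support by reading the whole database again."""
    return Counter(item for t in transactions for item in t)


# Assumed stand-in for the SIWE nodes: one counter per item, updated on arrival.
siwe = Counter()
for t in T:
    siwe.update(t)

# Support-descending order, ties broken by item number.
ranked = sorted(siwe.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranked)
```

Counted directly over the transactions as listed, item 3 occurs in six of the nine transactions. Once the counters exist, reading them costs one pass over six entries instead of one pass over nine transactions, and the gap grows with the size of the database.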
4.2. Activity Diagrams
An activity diagram is used to visualize the detailed processes of F-DIST and of
constructing DOSTrieIT. It is one of the prominent diagrams in the Unified Modeling
Language (UML) for graphically representing stepwise workflows, with support for choice
(condition), iteration (loop) and concurrency. Figure 2 and Figure 3 show the activity
diagrams for F-DIST and for constructing the DOSTrieIT data structure, respectively.
Figure 2. An Activity Diagram for F-DIST
Figure 3. An Activity Diagram for DOSTrieIT
4.3. Pseudocode Development
Pseudocode is an informal high-level description of the operating principle of a
computer program or other algorithm. Here, its main purpose is to convey the detailed
processes involved in coding F-DIST and DOSTrieIT, which were implemented in the C#
programming language. Figures 4 and 5 depict the pseudocode for F-DIST and for
constructing DOSTrieIT, respectively.
F-DIST Pseudocode
Read DOSTrieIT
Dowhile (prefixPath DOSTrieIT != eof)
    If SIWE != eof Then
        Get SIWE
        Display SIWE
    Endif
EndDo
Figure 4. Pseudocode for F-DIST
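The loop of Figure 4 could be rendered in Python roughly as follows. This is a minimal sketch under stated assumptions: the `Node` class, the `is_siwe` flag and the rule that SIWE nodes are stored ahead of the ordinary prefix paths on the first level are all hypothetical, and the hand-built `first_level` list only mimics what a tree for Example 1 might contain.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    item: int
    support: int
    is_siwe: bool = False  # True for a level-1 node without extension
    children: list = field(default_factory=list)


def f_dist(first_level):
    """Collect (item, support) pairs from the leading SIWE nodes of the first level."""
    supports = []
    for node in first_level:
        if not node.is_siwe:
            break  # assumption: SIWE nodes precede ordinary prefix paths
        supports.append((node.item, node.support))
    return supports


# Hand-built first level for the transactions of Example 1 (illustrative only).
first_level = [
    Node(2, 7, is_siwe=True), Node(1, 6, is_siwe=True), Node(3, 6, is_siwe=True),
    Node(4, 2, is_siwe=True), Node(5, 2, is_siwe=True), Node(6, 1, is_siwe=True),
    Node(1, 6, children=[Node(2, 4), Node(3, 2)]),  # ordinary prefix paths,
    Node(2, 3, children=[Node(3, 2), Node(4, 1)]),  # never descended into
]
print(f_dist(first_level))
```

The traversal never descends below the first level, which is the property that makes reading item supports cheap compared with rescanning the transactions.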
DOSTrieIT Pseudocode
Read Transaction
If DOSTrieIT != null Then
    Read DOSTrieIT
Else
    Initialize DOSTrieIT
Endif
Dowhile (line Transaction != eof)
    Dowhile (prefixPath DOSTrieIT != eof)
        If line != prefixPath Then
            Insert new SIWE
            Insert new prefixPath in DOSTrieIT
        Else
            If new prefixPath  current prefixPath Then
                Update current SIWE
            Else
                If new prefixPath  current prefixPath Then
                    Update support in current prefixPath
                    Update current SIWE
                Else
                    If new prefixPath  current prefixPath Then
                        Update support in current prefixPath
                        Update current SIWE
                        Insert new SIWE
                        Insert new prefixPath in DOSTrieIT
                    Endif
                Endif
            Endif
        Endif
    Enddo
Enddo
Figure 5. Pseudocode for DOSTrieIT
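One possible reading of Figure 5 is sketched below in Python. It is not the authors' C# implementation: the class names `TrieNode` and `DOSTrieITSketch`, the use of ascending item order as the canonical order, and the representation of the SIWE nodes as a dictionary are assumptions made for illustration. The sketch keeps the two effects named in the pseudocode, inserting or updating a SIWE for every item and inserting or updating the prefix path of every transaction.

```python
class TrieNode:
    def __init__(self, item=None):
        self.item = item
        self.support = 0
        self.children = {}  # item -> TrieNode


class DOSTrieITSketch:
    """Illustrative canonical-order trie with level-1 single-item counters."""

    def __init__(self):
        self.root = TrieNode()
        self.siwe = {}  # assumed SIWE nodes: item -> single-item support

    def insert(self, transaction):
        items = sorted(set(transaction))  # canonical order, independent of support
        # "Insert new SIWE" or "Update current SIWE"
        for item in items:
            self.siwe[item] = self.siwe.get(item, 0) + 1
        # "Insert new prefixPath" or "Update support in current prefixPath"
        node = self.root
        for item in items:
            if item not in node.children:
                node.children[item] = TrieNode(item)
            node = node.children[item]
            node.support += 1

    def node_count(self):
        """Number of prefix-path nodes, excluding the root and the SIWE counters."""
        total, stack = 0, list(self.root.children.values())
        while stack:
            node = stack.pop()
            total += 1
            stack.extend(node.children.values())
        return total


# The nine transactions of Example 1, inserted one at a time.
T = [[1, 2, 5], [2, 4], [2, 3], [1, 2, 4], [1, 3],
     [2, 3, 6], [1, 3], [1, 2, 3, 5], [1, 2, 3]]
tree = DOSTrieITSketch()
for t in T:
    tree.insert(t)
print(tree.siwe)          # single-item supports, read without rescanning T
print(tree.node_count())  # size of the canonical-order trie
```

Because the item order does not depend on the supports, a later transaction such as `[2, 6]` can be added with another `insert` call, with no reordering of existing paths and no recount of the database.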
5. Experimental Setup
In this section, we compare F-DIST with the benchmarked FP-Tree technique. The
performance analysis compares the computational time required to read and extract the
item supports. We conducted our experiments on four benchmark datasets. The experiments
were performed on an Intel® Core™ 2 Quad CPU at 2.33 GHz with 4 GB of main memory,
running Microsoft Windows Vista. All code was developed in C#.
Four benchmark datasets from the Frequent Itemset Mining Dataset Repository [21] were
employed in the experiments. The first dataset was Retails, which contains retail market
basket data from an anonymous Belgian retail store. The second was the synthetic dataset
T10I4D100K. It is a sparse dataset in which the frequent itemsets are short but not
abundant. The third benchmark dataset was Mushroom, a dense dataset that covers 23
species of gilled mushrooms in the Agaricus and Lepiota family. The fourth and last
benchmark dataset was Chess. The Chess dataset contains different game configurations in
which a pawn on a7 is one square away from queening. The task is to determine whether
White can win or not. The fundamental characteristics of the datasets are given in
Table 1.
Table 1. Fundamental Characteristics of Datasets
Data sets      Size        #Trans     #Items    Average length
Retails        4.153 MB    88,136     16,471    10
T10I4D100K     3.83 MB     100,000    1,000     10
Mushrooms      0.54 MB     8,124      119       23
Chess          0.33 MB     3,196      76        37
Figure 6 compares F-DIST and the FP-Tree technique in terms of the processing time
needed to capture the item supports. Overall, F-DIST determined the item supports in
less time than the FP-Tree technique. For the Retails dataset, F-DIST was 6.24 times
(83.97%) faster than the FP-Tree technique. For the T10I4D100K dataset, F-DIST was
258.37 times (99.61%) faster. For the Mushroom dataset, F-DIST was 5.75 times (82.60%)
faster, and for the last dataset (Chess), it was 13.88 times (92.79%) faster. On average
over the four datasets, F-DIST was 71.06 times (89.74%) faster than the FP-Tree
technique in determining the item supports.
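The paired figures above (a speedup factor and a percentage) appear to be related by percentage = (1 - 1/speedup) × 100, that is, the share of the FP-Tree processing time that is saved. The short Python sketch below recomputes the percentages from the rounded speedups under that assumption; the function name `percent_time_saved` is introduced here for illustration.

```python
def percent_time_saved(speedup):
    """Share of the baseline time saved when a method is `speedup` times faster."""
    return (1.0 - 1.0 / speedup) * 100.0


# Speedups reported in the text for F-DIST over the FP-Tree technique.
speedups = {"Retails": 6.24, "T10I4D100K": 258.37, "Mushroom": 5.75, "Chess": 13.88}

for name, s in speedups.items():
    print(f"{name}: {s} times faster, about {percent_time_saved(s):.2f}% less time")

average_speedup = sum(speedups.values()) / len(speedups)
print(f"average speedup: {average_speedup:.2f}")
```

Values recomputed this way can differ from the reported ones in the last digit (for example for Mushroom and Chess), presumably because the reported percentages were derived from unrounded timings. The reported average of 89.74% matches the arithmetic mean of the four per-dataset percentages rather than the percentage that corresponds to the average speedup of 71.06.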
Figure 6. Performance Analysis for Determining the Items Support using
Different Datasets
In the second experiment, the performance of constructing DOSTrieIT and CanTree on the
different datasets was measured and is presented collectively. Durations are reported in
milliseconds on a logarithmic scale. Figure 7 depicts the performance of both data
structures on the four benchmark datasets.
Figure 7. Performance Analysis in Constructing Different Tree Data
Structures Against Different Datasets
Overall, DOSTrieIT construction required less time than CanTree construction. For the
Retails dataset, constructing DOSTrieIT was 1.63 times (38.74%) faster than constructing
CanTree. For T10I4D100K, the construction times of the two trees differed little, but
DOSTrieIT was still 1.01 times (1.25%) faster than CanTree. For the Mushroom dataset,
constructing DOSTrieIT was 4.13 times (75.78%) faster than CanTree. Finally, for the
Chess dataset, constructing DOSTrieIT was 6.35 times (84.25%) faster than CanTree. In
summary, averaged over the four datasets, DOSTrieIT construction was 3.28 times (50.01%)
faster than CanTree construction.
6. Conclusion
The FP-Tree is a crucial and compact data structure for generating frequent itemsets.
However, for incremental pattern mining, the latest item supports must be recalculated
before the FP-Tree is reconstructed, because the supports of items and patterns change.
At present, most tree-based techniques still depend on the original dataset rather than
on their own tree data structure. Thus, it is necessary to incorporate single-item
supports into the tree itself. In this paper, we therefore proposed a technique called
F-DIST to determine the item supports from our suggested DOSTrieIT data structure. We
experimented with several datasets from the Frequent Itemset Mining Dataset Repository
[21] and found that our proposed technique is on average 71.06 times (89.74%) faster
than the benchmarked FP-Tree technique. Moreover, constructing a complete DOSTrieIT data
structure is on average 3.28 times (50.01%) faster than constructing a CanTree.
References
[1] J. Han, H. Pei and Y. Yin, “Mining Frequent Patterns without Candidate Generation”, Proceeding of the 2000 ACM SIGMOD, (2000), pp. 1-12.
[2] Z. Zheng, R. Kohavi and L. Mason, “Real World Performance of Association Rule Algorithms”, Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining, ACM Press, (2001), pp. 401-06.
[3] J. Han and J. Pei, “Mining Frequent Pattern without Candidate Itemset Generation: A Frequent Pattern Tree Approach”, Data Mining and Knowledge Discovery, vol. 8, (2004), pp. 53-87.
[4] R. Agrawal, T. Imielinski and A. Swami, “Database Mining: A Performance Perspective”, IEEE Transactions on Knowledge and Data Engineering, vol. 5, no. 6, (1993), pp. 914-925.
[5] G. Liu, H. Lu, W. Lou, Xu and J. X. Yu, “Efficient Mining of Frequent Patterns using Ascending Frequency Ordered Prefix-Tree”, Data Mining and Knowledge Discovery, vol. 9, (2004), pp. 249-274.
[6] J-L. Koh and S-F. Shieh, “An Efficient Approach for Maintenance Association Rules Based on Adjusting FP-Tree Structure”, Proceeding of the 2004 International Conference on Database Systems for Advanced Applications, (2004), pp. 417-424.
[7] X. Li, X. Deng and S. Tang, “A Fast Algorithm for Maintenance of Association Rules in Incremental Databases”, Proceeding of International Conference on Advance Data Mining and Applications, (2006), pp. 56-63.
[8] W. Cheung and O. R. Zaïane, “Incremental Mining of Frequent Patterns without Candidate Generation of Support Constraint”, Proceeding of the 7th International Database Engineering and Applications Symposium (IDEAS’2003), (2003).
[9] T.-P. Hong, J.-W. Lin and Y-L. We, “Incrementally Fast Updated Frequent Pattern Trees”, Expert Systems with Applications, vol. 34, no. 4, (2008), pp. 2424-2435.
[10] S. K. Tanbeer, C. F. Ahmed, B. S. Jeong and Y. K. Lee, “Efficient Single-Pass Frequent Pattern Mining Using a Prefix-Tree”, Information Science, vol. 279, pp. 559-583.
[11] S. G. Totad, R. B. Geeta and P. P. Reddy, “Batch Processing for Incremental FP-Tree Construction”, International Journal of Computer Applications, vol. 5, no. 5, (2010), pp. 28-32.
[12] R. Agrawal and J. Shafer, “Parallel Mining of Association Rules: Design, Implementation, and Experience”, IEEE Transaction Knowledge and Data Engineering, vol. 8, (1996), pp. 962-969.
[13] J. Pei, J. Han, H. Lu, S. Nishio, S. Tang and D. Yang, “Hmine: Hyper-Structure Mining of Frequent Patterns in Large Databases”, Proceedings of IEEE International Conference on Data Mining, (2001), pp. 441-448.
[14] P. and D. Zandolin, “Mining Frequent Item sets Using Patricia Tries”, Proceedings of the ICDM’03, (2003).
[15] G. Grahne and J. Zhu, “Efficiently using prefix-trees in mining frequent itemsets”, Proceeding of FIMI’03, (2003).
[16] Y. K. Woon, W. K. Ng and E. P. Lim, “A Support Order Trie for Fast Frequent Itemset Discovery”, IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 7, (2004), pp. 875-879.
[17] C. K.-S. Leung, Q. I. Khan, Z. Li and T. Hoque, “CanTree: A Canonical-Order Tree for Incremental Frequent-Pattern Mining”, Knowledge Information System, vol. 11, no. 3, (2007), pp. 287-311.
[18] S. K. Tanbeer, F. A. Chowdhury, B. S. Jeong and Y.-K. Lee, “CP-Tree: A Tree Structure for Single-Pass Frequent Pattern Mining”, T. Washio et al. (Eds.): PAKDD’08, Lecture Notes in Artificial Intelligence, Springer, Heidelberg, vol. 5012, (2008), pp. 1022-1027.
[19] S. K. Tanbeer, C. F. Ahmed, B.-S. Jeong and Y.-K. Lee, “Sliding Window-based Frequent Pattern Mining Over Data Streams”, Information Sciences, vol. 179, (2009), pp. 3843-3865.
[20] R. Ivancsy and I. Vajk, “Fast Discovery of Frequent Itemsets: a Cubic Structure-Based Approach”, Informatica (Slovenia), vol. 29, no. 1, (2005), pp. 71-78.
[21] Frequent Itemset Mining Dataset Repository, http://fimi.ua.ac.be/data/.
[22] T. Herawan and M. M. Deris, “A soft set approach for association rules mining”, Knowledge Based Systems, vol. 24, no. 1, (2011), pp. 186-195.
[23] T. Herawan, Z. Abdullah, A. Noraziah, M. M. Deris and J. H. Abawajy, “EFP-M2: Efficient Model for Mining Frequent Patterns in Transactional Database”, N.T. Nguyen et al. (Eds.): ICCCI 2012, Lecture Notes in Computer Science, Springer-Verlag, vol. 7654, (2012), pp. 29-38.
[24] T. Herawan, Z. Abdullah, A. Noraziah, M. M. Deris and J. H. Abawajy, “IPMA: Indirect Patterns Mining Algorithm”, N.T. Nguyen et al. (Eds.): ICCCI 2012, Advanced Methods for Computational Collective Intelligence Studies in Computational Intelligence, Springer-Verlag, vol. 457, (2013), pp. 187-196.
[25] T. Herawan and Z. Abdullah, “CNAR-M: A Model for Mining Critical Negative Association Rules”, Zhihua Cai et al. (Eds): ISICA 2012, Communications in Computer and Information Science, Springer-Verlag, vol. 316, (2012), pp. 170-179.
[26] T. Herawan, P. Vitasari and Z. Abdullah, “Mining critical least association rules of student suffering language and social anxieties”, International Journal of Continuing Engineering Education and Life Long Learning, vol. 23, no. 2, (2013), pp. 128-146.
[27] Z. Abdullah, T. Herawan and M. M. Deris, “Tracing Significant Information using Critical Least Association Rules Model”, International Journal of Innovative Computing and Applications, vol. 5, no. 1, (2013), pp. 3-17.
Authors
Zailani Abdullah received his Ph.D from Universiti Tun Hussein
Onn Malaysia (UTHM) in 2012. He has published more than 40
research papers in journals and conference proceedings. He has served
as a Co-Chair for the 4th Malaysian Software Engineering Conference
2008 (MySEC 2008), Committee Member for The 16th Asia-Pacific
Software Engineering Conference (APSEC 2009), Committee Member
for The 2nd KnowledgeGrid Malaysia Forum 2009 and Program
Committee Member for The 3rd International Conference on Computer
Systems and Software Engineering (ICSECS 2013). He is also
Microsoft® Certified Technology Specialist (MCTS), .NET
Framework 3.5, ASP.NET Application and Oracle Database 11g
Administrator Certified Associate. He has been appointed as an editorial board
member for Journal of Computational Intelligence and Electronic
Systems (JCIES) and reviewer for World Applied Science Journal,
Aceh International Journal of Science and Technology (AIJST), Global
Perspective on Engineering Management (GPEM) and Mosharaka for
Researches and Studies, respectively. His research interests include
database, data mining and web-based applications.
Tutut Herawan received a B.Ed degree in 2002 and an M.Sc degree in
2006, both in Mathematics, from Universitas Ahmad Dahlan and
Universitas Gadjah Mada, Yogyakarta, Indonesia, respectively. He
obtained a PhD in Theoretical Data Mining from Universiti Tun Hussein
Onn Malaysia in 2010. Currently, he is a lecturer with the Department
of Mathematics Education, Universitas Ahmad Dahlan, Indonesia. He
currently supervises four PhD students, has successfully co-supervised
two PhD students and has published more than 120 papers in various
international journals and conference proceedings. He has been
appointed as an editorial board member for IJDTA, TELKOMNIKA,
IJNCAA, IJDCA and IJDIWC. He has also been appointed as a reviewer
for several international journals such as
Knowledge-Based Systems, Information Sciences, European Journal of
Operational Research, Applied Mathematics Letters, and guest editor
for several special issues of international journals. He has served as a
program committee member and co-organizer for numerous
international conferences/workshops including Soft Computing and
Data Engineering (SCDE 2010-2011 at Korea, SCDE 2012 at Brazil),
ADMTA 2012 Vietnam, DTA 2011-2012 at Korea, DICTAP 2012 at
Thailand, ICDIPC 2012 at Lithuania, DEIS 2012 at Czech Republic,
NDT 2012 at Bahrain, ICoCSIM 2012 at Indonesia, ICSDE’2013 at
Malaysia, ICSECS 2013 at Malaysia, SCKDD 2013 at Vietnam and
many more. His research areas include Knowledge Discovery in
Databases, Educational Data Mining, Decision Support in Information
Systems, and Rough and Soft Set Theory.
Noraziah Ahmad received her Ph.D in Distributed Databases from
Universiti Malaysia Terengganu (UMT) in 2007. She has published
more than 150 papers in journals and conference proceedings. She is
currently an associate professor at the Faculty of Computer Systems &
Software Engineering, Universiti Malaysia Pahang. In addition to
serving as an international program committee member and reviewer
for many conferences, she is currently an editorial board member of
the International Journal of Engineering
and Technology (IJET), International Journal of Web Application
(IJWA) and Journal of Emerging Technologies in Web Intelligence
(JETWI); a member of IEEE Computer Society, International
Association of Engineers (IAENG), World Academy of Science,
Engineering and Technology (WASET), Malaysian National Computer
Confederation (MNCC) and Senior member of International
Association of Computer Science Information Technology (IACSIT).
Mustafa Mat Deris received his PhD from Universiti Putra Malaysia
in 2002. He is a professor of computer science in the Faculty of
Computer Science and Information Technology, UTHM, Malaysia. He
has successfully supervised ten PhD students, is currently
supervising six PhD students and has published more than 170 papers in
journals and conference proceedings. He has been appointed as an
editorial board member for the Journal of Next Generation Information
Technology (JNIT), Korea, and the Encyclopedia on Mobile Computing and
Commerce, Idea Group, USA; as guest editor of the International Journal
of BioMedical Soft Computing and Human Science for the Special Issue on
“Soft Computing Methodologies and Its Applications”; and as a reviewer
for several international journals such as IEEE Transaction on Parallel and
Distributed Computing, Journal of Parallel and Distributed Databases,
Journal of Future Generation on Computer Systems, Journal of
Information Sciences, Elsevier, Journal of Cluster Computing, Kluwer,
and Journal of Computer Mathematics, Taylor & Francis, UK. He has
served as a program committee member and co-organizer for numerous
international conferences/workshops including Grid and Peer-to-Peer
Computing, (GP2P 2005, 2006), Autonomic Distributed Data and
Storage Systems Management (ADSM 2005, 2006, 2007), and Grid
Pervasive Computing Security, organizer for workshops on Rough and
Soft Sets Theories and Applications (RSAA 2010), Fukuoka, Japan,
and Soft Computing and Data Engineering (SCDE) (2010, 2011,
Korea), (2012, Brazil). His research interests include distributed
databases, data grid, data mining and soft computing.