A Multi-clustering Fusion Algorithm
Dimitrios Frossyniotis¹, Minas Pertselakis¹, and Andreas Stafylopatis²
National Technical University of Athens
Department of Electrical and Computer Engineering
Zographou 157 73, Athens, Greece
¹ {dfros, mper}@cslab.ntua.gr
² [email protected]
Abstract. A multi-clustering fusion method is presented that combines several runs of a clustering algorithm into a common partition. More specifically, the results of several independent runs of the same clustering algorithm are appropriately combined to obtain a partition of the data which is not affected by initialization and overcomes the instabilities of clustering methods. A fusion procedure then starts with the clusters produced by the combining part and finds the optimal number of clusters in the data set according to some predefined criteria. The unsupervised multi-clustering method implemented in this work is quite general: it can be implemented and tested with any existing clustering algorithm that has unstable results. Experiments using both simulated and real data sets indicate that the multi-clustering fusion algorithm is able to partition a set of data points into the optimal number of clusters, which are not constrained to be hyper-spherically shaped.
1 Introduction
Unsupervised classification, also known as data clustering, is a generic label for
a variety of procedures designed to find natural groupings or clusters in multidimensional data, based on measured similarities among the patterns [1]. Clustering is a very difficult problem because data can reveal clusters with different
shapes and sizes. Additionally, the number of clusters in the data often depends
on the resolution with which the data are viewed. As a consequence, different
clustering algorithms have been proposed in the literature and new clustering
algorithms continue to appear.
Moreover, the majority of these algorithms are based on the following four
most popular clustering methods: iterative square-error partitional clustering,
hierarchical clustering, grid-based clustering and density-based clustering [2,3].
Partitional methods can be further classified into two groups. In the first
group, each sample is assigned to one and only one cluster, contrary to the
second group of methods where each sample can be associated (in some sense)
with several clusters. The most commonly used partitional clustering algorithm
is K-means, which is based on the square-error criterion. This algorithm is computationally efficient and yields good results if the clusters are compact, hyper-spherical in shape and well separated in the feature space. Numerous attempts
have been made to improve the performance of the simple K-means by using the
Mahalanobis distance to detect hyper-ellipsoidal shaped clusters [4] or by incorporating a fuzzy criterion function resulting in a fuzzy C-means algorithm [5]. A
different partitional clustering approach is based on probability density function
(pdf) estimation using Gaussian mixtures. The specification of the parameters
of the mixture is based on the expectation-maximization (EM) algorithm [6]. A
recently proposed greedy-EM algorithm [7] is an incremental scheme that has
been found to provide better results than the conventional EM algorithm.
Hierarchical clustering methods organize data in a nested sequence of groups
which can be displayed in the form of a dendrogram or a tree [8]. These methods
can be either agglomerative or divisive. An agglomerative hierarchical method
places each sample in its own cluster and gradually merges these clusters into
larger clusters until all samples are ultimately in a single cluster (the root node).
A divisive hierarchical method starts with a single cluster containing all the data
and recursively splits parent clusters into daughters.
Grid-based clustering algorithms are mainly proposed for spatial data mining.
Their main characteristic is that they quantise the space into a finite number of
cells and then perform all operations on the quantised space. On the other hand,
density-based clustering algorithms adopt the key idea of grouping neighbouring
objects of a data set into clusters based on density conditions.
However, many of the above clustering methods require additional user-specified parameters, such as the optimal number and shapes of clusters, similarity thresholds and stopping criteria. Moreover, different clustering algorithms
and even multiple replications of the same algorithm result in different solutions
due to random initializations, so there is no clear indication for the best partition
result. Consequently, two of the main challenges in cluster analysis are, first, to select an appropriate measure of similarity to define clusters, which in general is cluster-shape dependent, and, second, to specify the optimal number of clusters in the data set. In this direction, clustering strategies have been developed
which prove to perform very satisfactorily in clustering and finding the number
of clusters [9,10,11,12,13]. The present work, following an analogous approach,
proposes a clustering algorithm which tackles these two important problems and
is able to partition a data set in a shape independent manner and to find the
optimal number of clusters existing in the data set.
The paper is organized as follows: Section 2 describes the multi-clustering
fusion method, while experimental results for the evaluation of the proposed
method are presented in Section 3 and, finally, conclusions are presented in
Section 4.
2 Description of the Algorithm
The multi-clustering fusion algorithm consists of two procedures that take place
sequentially: the Partitioning procedure, which is used to partition the data points of a set into clusters, and the Fusion procedure, which determines the true structure of the data.
In the primary stage, the initial number of clusters and the number of iterations are defined for the Partitioning procedure, wherein a clustering algorithm
and a voting scheme are implemented, in order to produce a distinct partition
of the data set. During the Fusion procedure, this partition is processed and
neighbour clusters are merged, resulting in an optimal number of clusters for
the given data set, according to some specified criteria.
2.1 Partitioning Procedure
The Partitioning procedure applies the same basic clustering algorithm for a number of iterations, Iter, so as to accomplish a distinct partitioning of N data points into a predefined number C of clusters. The experimental study of our work is based on two implementations of the proposed multi-clustering fusion method using different basic clustering algorithms: the K-means and the greedy-EM algorithm.
More specifically, the K-means clustering aims to optimise an objective function that is described by the equation
J = \sum_{i=1}^{C} \sum_{x \in \mu_i} d(x, v_i)    (1)
where v_i is the center of cluster µ_i and d(x, v_i) is the Euclidean distance between a point x and v_i. Thus, the criterion function J attempts to minimize the distance
of every point from the center of the cluster to which the point belongs. Starting
from arbitrary initial positions for cluster centers and by iteratively updating
cluster centers, the algorithm moves the cluster centers to sensible locations
within the data set.
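As a concrete illustration of Eq. (1), the following minimal Python sketch evaluates J for a given hard partitioning; the array names X, centers and labels are ours, and any K-means implementation could supply them.

```python
import numpy as np

def kmeans_objective(X, centers, labels):
    """Criterion J of Eq. (1): total Euclidean distance of every point
    to the center of the cluster it belongs to."""
    # X: (N, d) data, centers: (C, d) cluster centers v_i, labels: (N,) cluster index per point
    return sum(np.linalg.norm(X[labels == i] - v, axis=1).sum()
               for i, v in enumerate(centers))
```

Evaluating this quantity after each update of the centers would show J decreasing as standard K-means iterates.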
As far as the greedy-EM algorithm [7] is concerned, the data are assumed
to be generated by several parameterized Gaussian distributions, so the data
points are assigned to different clusters based on their posterior probabilities
of having been generated by a specific Gaussian distribution. A multivariate
Gaussian mixture is defined as the weighted sum:
p(x) = \sum_{j=1}^{C} \pi_j f(x; \phi_j)    (2)

where \pi_j are the mixing weights satisfying \sum_{j} \pi_j = 1, \pi_j \ge 0, and f(x; \phi_j) is the l-dimensional Gaussian density

f(x; \phi_j) = (2\pi)^{-l/2} |S_j|^{-1/2} \exp[-0.5 (x - m_j)^{\top} S_j^{-1} (x - m_j)]    (3)
parameterized by the mean m_j and the covariance matrix S_j, collectively denoted by the parameter vector \phi_j. Usually, for a given number C of kernels, the specification of the parameters of the mixture is based on the expectation-maximization (EM) algorithm [6] for maximization of the data log-likelihood:

L = \frac{1}{N} \sum_{i=1}^{N} \log p(x_i)    (4)
The algorithm starts with one kernel and adds kernels dynamically one at a time
so as to estimate the true number of components of the mixture (therefore the
true number of clusters, if we consider that each kernel corresponds to a group
of patterns) as follows. The algorithm is run for a large value of C, and, for the
solution obtained for each intermediate value of C, a model selection criterion is
applied, e.g., cross-validation using a set of test points, a coding scheme based
on minimum description length, etc. Finally, the value of C that corresponds to the best value of the model selection criterion is selected. In this work, we have used, as the criterion for the specification of C, the log-likelihood value on a validation set of points that have not been used for training.
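As a sketch of this model-selection criterion (not the authors' code), the average log-likelihood of Eq. (4) can be evaluated on held-out points once the mixture parameters have been fitted by any EM variant; weights, means and covs are our names for the π_j, m_j and S_j.

```python
import numpy as np
from scipy.stats import multivariate_normal

def validation_log_likelihood(X_val, weights, means, covs):
    """Average log-likelihood (Eq. 4) of validation points under the
    Gaussian mixture of Eqs. (2)-(3)."""
    # component_densities[j, n] = pi_j * f(x_n; phi_j)
    component_densities = np.array([
        w * multivariate_normal(mean=m, cov=S).pdf(X_val)
        for w, m, S in zip(weights, means, covs)
    ])
    return np.mean(np.log(component_densities.sum(axis=0)))
```

Among the intermediate mixtures produced while kernels are added, the one with the largest validation value would be retained.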
The above procedure is carried out when applying the greedy-EM algorithm
as a stand-alone clustering method. When using the greedy-EM as a basic clustering algorithm within the multi-clustering fusion approach we consider only the
predefined value of C and no intermediate values, so as to obtain a partitioning
to C clusters at each iteration step.
As concerns the Partitioning procedure, the basic clustering algorithm partitions the data set in a different way at each iteration, creating the problem of deciding which cluster of one run corresponds to which cluster of another run. The algorithm tackles this problem using the similarity between the clusters produced during successive runs. By determining the percentage of points of a cluster in the t-th run belonging to each cluster of the (t − 1)-th run, every cluster of the new run is assigned to a cluster of the previous run, resulting in a cluster renumbering
process.
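A minimal sketch of one possible renumbering step, assuming each run returns a hard label vector: each cluster of the current run is mapped to the previous-run cluster holding the largest share of its points. This greedy mapping follows our reading of the description above and is not guaranteed to be one-to-one, which a full implementation would have to enforce.

```python
import numpy as np

def renumber_clusters(prev_labels, new_labels, C):
    """Relabel the clusters of the new run so that each one takes the index of
    the previous-run cluster containing most of its points."""
    relabelled = np.empty_like(new_labels)
    for q in range(C):
        members = (new_labels == q)
        if members.any():
            # share of cluster q's points falling into each previous-run cluster
            overlap = np.bincount(prev_labels[members], minlength=C)
            relabelled[members] = overlap.argmax()
    return relabelled
```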
After renumbering, if pattern i is assigned to cluster q, then a positive vote is
given to cluster q and a negative one to all other clusters. This process defines a
voting scheme, during which a voting table VT (of dimension N × C) is updated, so that VT(i, j) denotes the membership degree of pattern i to cluster j, where i = 1, . . . , N and j = 1, . . . , C.
At the end of the runs, each pattern i is considered to belong to the cluster C^{i}_{max}, where

C^{i}_{max} = \arg\max_{j} VT(i, j), \quad j = 1, \ldots, C    (5)
The procedure thus results in a distinct partitioning of the data set, assigning
each data point to one cluster.
Using the VT table and the relation between the data points of one cluster
with all the remaining clusters, a table NRT (of dimension C × C) can be produced, so that NRT(i, j) represents the neighbourhood relation between clusters
i and j:
NRT(i, j) = \sum_{p=1}^{N} VT(p, j)\, I(C^{p}_{max} = i), \quad i = 1, \ldots, C, \; j = 1, \ldots, C, \; j \neq i    (6)
where I(z) is an indicator function, i.e. I(z) = 1 if z is true and I(z) = 0 otherwise.
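Equation (6) translates directly into a few lines of numpy; VT is the N × C voting table and the winning cluster of Eq. (5) is recovered with argmax (the variable names are ours).

```python
import numpy as np

def neighbourhood_relation(VT):
    """Build the C x C neighbourhood relation table of Eq. (6)."""
    N, C = VT.shape
    c_max = VT.argmax(axis=1)              # Eq. (5): winning cluster of each pattern
    NRT = np.zeros((C, C))
    for i in range(C):
        members = (c_max == i)             # patterns assigned to cluster i
        NRT[i] = VT[members].sum(axis=0)   # how strongly they also voted for every cluster j
    np.fill_diagonal(NRT, 0.0)             # the j = i entries are excluded in Eq. (6)
    return NRT
```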
2.2 Fusion Procedure
Given the neighbourhood relation among clusters, a Fusion procedure is developed. This procedure starts with the predefined number C of clusters and (after
removing the clusters with zero data points) merges the ones which are closest
to each other.
More specifically, the procedure searches the neighbourhood relation table
(C × C table) for the two clusters (with indexes C1 and C2) that fulfill the
following conditions: first, both clusters are the closest to each other and, second,
these two clusters are the closest of all clusters. The next step is to merge these
clusters into one and to reconfigure the voting table accordingly, by adding the
votes of the second cluster to the first one as follows:
VT(i, C_1) = VT(i, C_1) + VT(i, C_2), \quad i = 1, \ldots, N    (7)

where C_1 = min(C1, C2) and C_2 = max(C1, C2). The new neighbourhood relation table is created with one cluster less, by removing cluster C_2, and the
procedure starts again until some stopping criterion is met.
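A sketch of a single fusion step under our reading of these conditions: the pair with the strongest mutual relation in NRT is taken as the closest, the votes of the higher-indexed cluster are added to the lower-indexed one (Eq. 7), and the merged column is dropped. The names are ours, and NRT would afterwards be rebuilt from the reduced voting table (e.g. with the neighbourhood_relation sketch of Sect. 2.1).

```python
import numpy as np

def fuse_once(VT, NRT):
    """Merge the two closest clusters according to NRT and update VT (Eq. 7)."""
    closeness = NRT + NRT.T                      # symmetric closeness of every cluster pair
    c1, c2 = np.unravel_index(closeness.argmax(), closeness.shape)
    c1, c2 = min(c1, c2), max(c1, c2)
    VT = VT.copy()
    VT[:, c1] += VT[:, c2]                       # Eq. (7): add the votes of C2 to C1
    return np.delete(VT, c2, axis=1)             # one cluster less; NRT is then recomputed
```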
The criterion that derives directly from this procedure is that merging will stop when all clusters end up having an average ‘sureness’ of 100% (the average ‘sureness’ of a cluster is defined as the sum of the membership degrees of the points assigned to it divided by their total number). This means that, in the voting table, every data point is assigned to exactly one cluster with degree 100%. Since in practice this condition cannot always be realized, due for example to overlapping clusters, we decided to use methods suitable for quantitative evaluation of the clustering results, which determine the number of clusters that best fits a data set.
The cluster validity methods used in our study are the Root-mean-square
standard deviation (RMSSTD) and the R-squared (RS) described in [3]. More
specifically, RMSSTD and RS have to be taken into account simultaneously in
order to find the correct number of clusters. The optimal number of clusters is the one at which a significant local change in the values of RS and RMSSTD occurs. It should be noted, however, that since these methods only give an indication of the quality of the resulting partitioning, they should be considered as a tool at the disposal of the experts for evaluating the clustering results.
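For reference, a common formulation of these two indices (following the definitions in [3]; the exact normalisation used by the authors is not spelled out here, so this is a hedged sketch): RMSSTD is the pooled within-cluster standard deviation over all attributes, and RS is the proportion of the total sum of squares explained by the partitioning.

```python
import numpy as np

def rmsstd_and_rs(X, labels):
    """Root-mean-square standard deviation and R-squared of a hard partitioning."""
    ss_total = ((X - X.mean(axis=0)) ** 2).sum()     # total sum of squares
    ss_within = 0.0
    degrees = 0
    for k in np.unique(labels):
        Xk = X[labels == k]
        ss_within += ((Xk - Xk.mean(axis=0)) ** 2).sum()
        degrees += (len(Xk) - 1) * X.shape[1]
    rmsstd = np.sqrt(ss_within / max(degrees, 1))    # pooled within-cluster std
    rs = (ss_total - ss_within) / ss_total           # fraction of variance explained
    return rmsstd, rs
```

In the spirit of the text above, one would track both values while the Fusion procedure merges clusters and look for the number of clusters at which they change sharply.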
2.3 Pseudo-Algorithm
- Define number of clusters, C
- Define number of iterations, Iter
Procedure 1: Partitioning
– i = 1:
  Run the basic clustering algorithm to partition the data set into C clusters
  If sample p (p = 1, . . . , N) belongs to cluster q then
    VT(p, q) = 1
    VT(p, j) = 0, j = 1, . . . , C, j ≠ q
– For i = 2 to Iter
  - Run the basic clustering algorithm to partition the data set into C clusters
  - Renumber clusters
  - Voting scheme (see the sketch after this pseudo-algorithm):
    If sample p (p = 1, . . . , N) belongs to cluster q then
      VT(p, q) = ((i − 1)/i) VT(p, q) + 1/i
      VT(p, j) = ((i − 1)/i) VT(p, j), j = 1, . . . , C, j ≠ q
– Create neighbourhood relation table NRT (C × C)
Procedure 2: Fusion
– Remove clusters with zero data points
– Repeat until stopping criterion is met
- From neighbourhood relation table find the two closest clusters
- Merge pairs, sum the votes
- Recompute NRT with C = C − 1 clusters
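The voting update of Procedure 1 can be written compactly; the sketch below folds the i-th run (i ≥ 2) into the running table, assuming the labels have already been renumbered (e.g. with the renumbering sketch of Sect. 2.1). After Iter runs, VT(p, j) is the fraction of runs in which pattern p fell into cluster j.

```python
import numpy as np

def update_voting_table(VT, labels, i):
    """Fold the (renumbered) result of run i into the voting table:
    VT(p, q) <- ((i-1)/i) VT(p, q) + 1/i for the winning cluster q of pattern p,
    VT(p, j) <- ((i-1)/i) VT(p, j) for every other cluster j."""
    N, _ = VT.shape
    VT = VT * (i - 1) / i                  # scale down all previous votes
    VT[np.arange(N), labels] += 1.0 / i    # positive vote for the cluster of each pattern
    return VT
```

Starting from the 0/1 table of the first run and applying this update for i = 2, …, Iter reproduces the scheme in the pseudo-algorithm.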
The proposed algorithm consists of two procedures that take place sequentially; thus, the total complexity is the sum of the respective complexities. The Partitioning procedure has the complexity of the basic clustering algorithm, i.e., if the basic clustering algorithm is K-means then the time complexity is O(n), where n is the number of points in the data set. The time complexity of the Fusion procedure is O(C³), where C is the number of clusters produced by the Partitioning procedure.
3 Experimental Results
In this section we present a comparative experimental evaluation of the proposed
methodology using different basic clustering algorithms, namely the K-means
and the greedy-EM algorithm. The resulting multi-clustering fusion method with
K-means as the basic clustering algorithm will be hereafter referred to as multi-fusion-k-means. Similarly, using the greedy-EM as the basic clustering algorithm
will be referred to as multi-fusion-greedy-EM.
The proposed multi-clustering fusion method has been tested on several data
sets. The basic idea for choosing the initial number of clusters is to set C to a large value, say √N, N being the number of patterns in the data set. We
used this formula, because partitioning a small data set into a large number of
clusters (compared to the actual number of clusters) usually produces clusters of
few points or empty clusters. The experiments presented here consist of Iter =
100 runs of the basic clustering algorithm in the Partitioning procedure with a
number C of clusters. The voting table VT and the neighbourhood relation table
NRT are computed between successive runs and the Fusion procedure follows
according to the final results of the partition. The optimal values of the number
of clusters are those for which a significant local change in values of RS and
RMSSTD occurs.
Finally, for comparison purposes, we also present clustering results from running the greedy-EM algorithm as a stand-alone clustering method. In this case,
we have applied the procedure described in the previous section for selecting the
optimal number of clusters, using a validation set of points that have not been used
for training.
[Fig. 1. Lith data set after the Partitioning procedure (multi-fusion-k-means).]
[Fig. 2. Lith data set after the Fusion procedure (multi-fusion-k-means).]

3.1 The Lith Data
This is a 2-dimensional data set consisting of 2000 data points. The data are uniformly distributed along two sausage-shaped regions and superimposed with normal noise of standard deviation 1 in all directions. We have considered
C = 45 clusters in the Partitioning procedure. The multi-fusion-k-means partitioned the data points correctly into two clusters (Fig. 1 and 2). The validity
indices (RMSSTD and RS) select the clustering scheme of two clusters while
we reached an average ‘sureness’ of the clusters greater than 99%. Similarly, the
multi-fusion-greedy-EM method partitioned the data points into two well separated clusters reaching an average ‘sureness’ of the clusters greater than 99%.
For the stand-alone greedy-EM algorithm, we used 1000 data points for training and 1000 for validation. We ran the algorithm for C = 45 clusters and the
optimal solution obtained was 6 clusters (Fig. 7) with average ‘sureness’ of the
clusters 91.8%.
[Fig. 3. Banana data set after the Partitioning procedure (multi-fusion-k-means).]
[Fig. 4. Banana data set after the Fusion procedure (multi-fusion-k-means).]

3.2 The Banana Data
The Banana data set is also a 2-dimensional one consisting of 2000 data points
that belong to two banana shaped clusters. We have considered C = 45 clusters
in the Partitioning procedure. The multi-fusion-k-means partitioned the data
points correctly into two clusters (Fig. 3 and 4). The validity indices (RMSSTD
and RS) select the clustering scheme of two clusters while we reached an average
‘sureness’ of the clusters greater than 99%. Similarly, the multi-fusion-greedy-EM
method partitioned the data points into two well separated clusters reaching an
average ‘sureness’ of the clusters greater than 99%. For the stand-alone greedy-EM, we used 1000 data points for training and 1000 for validation. We ran the
algorithm for C = 45 clusters and the optimal solution obtained was 10 clusters
(Fig. 8) with average ‘sureness’ of the clusters 86.8%.
[Fig. 5. Clouds data set after the Partitioning procedure (multi-fusion-k-means).]
[Fig. 6. Clouds data set after the Fusion procedure (multi-fusion-k-means).]

3.3 The Clouds Data
The Clouds artificial data from the ELENA project [14] are two-dimensional, produced by three different Gaussian distributions. There are 5000 samples in the data set, belonging to three clusters which overlap considerably. We have considered C = 70 clusters in the Partitioning procedure. The multi-fusion-k-means correctly identified the true number of clusters (three) (Fig. 5
and 6). The validity indices (RMSSTD and RS) select the clustering scheme
of three clusters while we reached an average ‘sureness’ of the clusters greater
than 98.5%. Similarly, the multi-fusion-greedy-EM method partitioned the data
points into three clusters reaching an average ‘sureness’ of the clusters greater
than 95.5%. For the stand-alone greedy-EM algorithm, we used 3000 data points
for training and 2000 for validation. We ran the algorithm for C = 70 clusters
and the optimal solution obtained was 4 clusters (Fig. 9) with average ‘sureness’
of the clusters 94.1%. The average ‘sureness’ of the clusters is less than that of
the previous examples for the proposed method. Indeed, the Lith and Banana
data sets have a simple and clear structure, but, unfortunately, in the case of
overlapping clusters (especially in real-world data sets) it is very difficult to find
a ‘very sure’ partitioning.
3.4 The Pima Indians Data
The Diabetes set from the UCI data set repository [15] contains 8-dimensional
data. It is based on personal data from 768 Pima Indians obtained by the National Institute of Diabetes and Digestive and Kidney Diseases. We have considered C = 28 clusters in the Partitioning procedure. The multi-fusion-k-means
yielded four clusters. The validity indices (RMSSTD and RS) select the clustering scheme of four clusters, while we reached an average ‘sureness’ of the
clusters greater than 99%. Similarly, the multi-fusion-greedy-EM method partitioned the data points into four clusters reaching an average ‘sureness’ of the
clusters greater than 96.5%. For the stand-alone greedy-EM algorithm, we used
500 data points for training and 268 for validation. We ran the algorithm for
C = 28 clusters and the optimal solution obtained was 5 clusters with average
‘sureness’ of the clusters 95%.
3.5 Discussion
An important conclusion that can be drawn from the experimental evaluation
is that the proposed multi-clustering fusion method results in a partitioning
scheme that optimally fits the specific data set according to criteria such as ‘sureness’, RMSSTD and RS. We used two different basic clustering algorithms and came up with similar clustering results. It can be claimed that the multi-clustering fusion methodology, independently of the basic clustering algorithm
used, finds the ‘optimal’ number and shape of clusters that fit the data, thus dealing with the problem of initialization dependency and selection of the number and shape of clusters.

[Fig. 7. Means and variances of the kernels using the stand-alone Greedy-EM for the Lith data set.]
[Fig. 8. Means and variances of the kernels using the stand-alone Greedy-EM for the Banana data set.]
[Fig. 9. Means and variances of the kernels using the stand-alone Greedy-EM for the Clouds data set.]
Another interesting observation is that the proposed multi-clustering fusion
method almost always exhibits better clustering performance than the greedy-EM algorithm, according to the adopted cluster validity methods and the ‘sureness’ measure. However, this comparison should be considered as rather indicative.
4 Conclusions
This paper proposed a general unsupervised learning scheme for combining clustering results produced by several iterations of a basic clustering algorithm. A
fusion procedure takes the resulting partition and finds the optimal number of
clusters in the data set according to some cluster validity methods. Although the
general scheme has been explored here within the framework of K-means and
greedy-EM clustering, the data points are typically not uniquely assigned by the
fusion procedure to one cluster, so we can also consider ‘fuzzy’ partitioning.
We have shown that the clustering algorithm implemented in this work can
handle the problem of initialization dependency and selection of the number of
clusters. Moreover, as illustrated by the experimental results, the algorithm can
partition a data set into clusters which are shape independent.
Concluding, the proposed multi-clustering fusion algorithm does not require
additional user-specified parameters, since the only parameter that needs to be defined is the initial number of clusters. It must be noted, however, that a good
value for this parameter was found experimentally depending on the size of the
problem. Ongoing work includes the adoption of other basic clustering algorithms
and experimentation with different fusion techniques, as well as comparison of
the proposed method with other AI clustering methods for selecting the optimal
number of clusters. Finally, this multi-clustering methodology can be used for
improving the performance of a multi-net classification system, which is based
on supervised and unsupervised learning [16].
References
1. A.K. Jain and R.C. Dubes. Algorithms for Clustering Data. Englewood Cliffs, N.
J.: Prentice Hall, 1988.
2. A.K. Jain, R.P.W. Duin, and J. Mao. Statistical pattern recognition: A review.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(1), 2000.
3. M. Halkidi, Y. Batistakis, and M. Vazirgiannis. Clustering algorithms and validity measures. In Proceedings of the SSDBM Conference, Virginia, USA, July 2001.
4. J.C. Bezdek and S.K. Pal. Fuzzy Models for Pattern Recognition: Methods that
Search for Structures in Data. IEEE CS Press, 1992.
5. J.C. Bezdek. Pattern Recognition with Fuzzy Objective Function Algorithms.
Plenum Press, New York, 1981.
6. A.P. Dempster, N.M. Laird, and D.B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. B, 39:1–38, 1977.
7. N. Vlassis and A. Likas. A greedy-EM algorithm for Gaussian mixture learning. Technical report, Computer Science Institute, University of Amsterdam, The Netherlands, May 2000.
8. E. Boundaillier and G. Hebrail. Interactive interpretation of hierarchical clustering.
Intell. Data Anal., 2(3), 1998.
9. A. Fred. Finding Consistent Clusters in Data Partitions. In Proceedings of the
Second International Workshop on Multiple Classifier Systems (MCS 2001), LNCS
2096, pages 309–318, Cambridge, UK, July 2-4 2001. Springer.
10. E. Dimitriadou, A. Weingessel, and K. Hornik. A voting-merging clustering algorithm. Working Paper 31, SFB ‘Adaptive Information Systems and Modeling in
Economics and Management Science’, April 1999.
11. P. Smyth. Clustering Using Monte Carlo Cross-Validation. In Proceedings Knowledge Discovery and Data Mining, pages 126–133, 1996.
12. P. Cheeseman and J. Stutz. Bayesian Classification (AutoClass): Theory and Results. In Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, and Ramasamy Uthurusamy, editors, Advances in Knowledge Discovery and Data Mining.
AAAI Press/MIT Press, 1996.
13. D.H. Fisher. Knowledge acquisition via incremental conceptual clustering. Machine
Learning, 2:139–172, 1987.
14. ESPRIT Basic Research Project ELENA (no. 6891).
[ftp://ftp.dice.ucl.ac.be/pub/neural-nets/ELENA/databases], 1995.
15. UCI Machine Learning Databases Repository, University of California-Irvine, Department of Information and Computer Science. [ftp://ftp.ics.edu/pub/machinelearning-databases].
16. D.S. Frossyniotis and A. Stafylopatis. A Multi-SVM Classification System. In
Proceedings of the Second International Workshop on Multiple Classifier Systems
(MCS 2001), LNCS 2096, pages 198–207, Cambridge, UK, July 2-4 2001. Springer.