Analyzing Empirical Data in Software Engineering
Li Jiang
Armin Eberlein
Aneesh Krishna
School of Computer Science
The University of Adelaide, SA, 5000,
Australia
Computer Engineering Department
American University of Sharjah,
UAE
Curtin University of Technology
Perth, WA 6102, Australia
Abstract: Getting meaningful information from empirical
data is a challenging task in software engineering (SE)
research. It requires an in-depth analysis of the problem and
of the data structure in order to select the most suitable data
analysis methods, as well as an evaluation of the validity of
the analysis results. This paper reports experiences with three
data analysis methods that were used to analyze a set of
empirical data. One of the major findings is that although each
method has its own value, none of them is sufficient to address
all challenges on its own. The research reveals that it is only
possible to get meaningful analysis results if several data
analysis methods are combined.
Keywords: Requirements Engineering, Software Engineering, Requirements Engineering Techniques, Data Analysis Methods, Clustering.
I.
INTRODUCTION
The development of large and medium-sized software
systems usually involves complex processes that make use of
several development techniques. Since the term “software
engineering (SE)” was first coined in 1968 at the first SE
conference, numerous SE techniques have been proposed.
However, early experience has shown that there is no silver
bullet to deal with software engineering problems [1].
Therefore, most software development processes employ a
combination of techniques [2]. Furthermore, several
researchers have emphasized that it is important to select and
use suitable software engineering techniques to tackle
problems in software development [3-8]. Nevertheless, it is
not trivial to assess the suitability of an SE technique within
the context of a software project as many techniques are
available and numerous factors influence decision making.
We therefore started a research project focusing on the
analysis of the suitability of requirements engineering (RE)
techniques for a software project based on its characteristics
[9]. This resulted in methodologies¹ and a framework that
can help select the most suitable RE techniques for a
software project. The aim now is to broaden the framework
to the entire software development process, i.e., to develop
¹ We acknowledge the differences between the two terms
“method” and “technique” as used in the SE research community,
and the disparities in the definitions given for these two terms in
academia. The term “method” is deliberately used in this paper to
refer to any algorithms and/or methods created for data clustering
and data analysis. The purpose of adopting this terminology (in
this paper only) is to differentiate the two terms “method” and
“technique”, with the latter referring to SE techniques or methods.
methods for the project-specific selection of SE techniques.
This is possible as RE techniques are a subset of SE
techniques, i.e., they possess similar knowledge elements
and structure [9]. However, extending our previous
framework from the selection of RE techniques to the
selection of SE techniques is very challenging, as the number
of SE techniques is much larger than that of RE techniques.
Thus, finding good methods to help analyze information
about SE techniques is the first problem we need to solve in
order to extend the methodologies that we developed and
used before to the selection of SE techniques.
In our previous research, we have:
• analyzed 46 RE techniques in depth. These RE techniques
are among the most often used, well-documented and mature
techniques [2]. They are listed in Appendix 1.
• developed a set of attributes that help characterize RE
techniques [9]. The list of technique attributes is given in
Appendix 2. The attributes are classified into two categories:
Category 1 includes 13 essential attributes that are generally
applicable to all RE techniques; the attributes in Category 2
are supplementary to those in Category 1 and provide
additional information about the suitability of RE techniques.
• conducted a survey among 8 RE experts from both industry
and academia to elicit a set of empirical data about the
abilities of the 46 RE techniques in dealing with practical
problems [9]. The empirical data was sanitized and validated
against the research results published by others. The obtained
data set is shown in Appendices 3A and 3B. As shown in the
tables, each technique is characterized by 31 attributes
(31-dimensional information), i.e., each technique is a
multi-dimensional data point (a concrete sketch of this
representation is given after this list).
• analyzed RE techniques using the Fuzzy C-means (FCM)
method [10-12]. The basic idea of the method is illustrated in
Figure 1. Partial clustering results are given in Table 1, which
lists the number of clusters and the values of the cost function
of the algorithm obtained in the research (see [13] for more
information about the research results). Moreover, several
new concepts and relationships between the RE techniques,
such as comparable techniques and complementary
techniques, have also been identified in this research [13].
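As a concrete illustration of this representation, each technique can be held as one row of a ratings matrix. The values below are taken from Appendix 3A; holding them in a NumPy array is our illustrative choice, not part of the original implementation:

```python
import numpy as np

# T1 (Brain Storming and Idea Reduction) as a 31-dimensional data point:
# the ratings for attributes A1..A31, copied from Appendix 3A.
T1 = np.array([0.8, 0.4, 1, 0.2, 1, 1, 0.8, 0, 0, 0.6,
               0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
               0, 0, 0, 0, 0, 0, 0, 0, 0.2, 0.6, 0.2])
assert T1.shape == (31,)  # one technique = one multi-dimensional data point
```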
Our initial data analysis of RE techniques using FCM
provided information about the similarities and differences of
RE techniques.
Table 1. Clustering Results with the FCM Algorithm

Number of clusters   Value of the cost function
                     All weights are 1 (Wi=1)   Various weights
4                    27.12                      12.96
5                    25.43                      10.65
6                    21.78                       6.18
7                    18.81                       5.11
8                    17.55                       3.60
9                    17.92                       3.64
10                   18.12                       5.50
11                   19.55                       5.87
12                   19.38                       6.81

Notes: The “weight” refers to the weight of each attribute of the techniques; “various weights” indicates that the attributes were assigned different weights based on the characteristics of the project.
Initialization: choose the stopping value $\varepsilon$ and the fuzzification coefficient $a$ ($a = 2$ is used in this research), and an initial partition matrix which satisfies

$0 < \sum_{k=1}^{N} u_{ik} < N$, $i = 1, 2, \ldots, c$, and $\sum_{i=1}^{c} u_{ik} = 1$, $k = 1, 2, \ldots, N$

For n = 2 to c
  Repeat
    Update
      $m_i = \frac{\sum_{j=1}^{n} u_{ij}^a X_j}{\sum_{j=1}^{n} u_{ij}^a}$,  $u_{ij} = \frac{1}{\sum_{k=1}^{p} \left( d_{ij} / d_{kj} \right)^{2/(a-1)}}$
  Until $\max_{ik} \left| u_{ik}(\text{iteration}+1) - u_{ik}(\text{iteration}) \right| < \varepsilon$
  Calculate $Cost = \sum_{i=1}^{p} \sum_{j=1}^{n} u_{ij}^a \, d_{ij}^2$
Endfor

where j = 1, …, n and n is the number of objects; i = 1, …, p and p is the number of clusters; $m_i$ is a vector representing the centroid of cluster i; $d_{ij} = \|X_j - m_i\|$ is the distance between each object $X_j$ and the cluster centroid $m_i$; $u_{ij}$ is the degree of membership of object j in cluster i; Cost is the cost function calculated for each clustering trial.

Fig. 1 Modified FCM Algorithm
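To make the loop of Figure 1 concrete, the following is a minimal sketch of the FCM updates in Python/NumPy. It is illustrative only: our implementation used C++ and Matlab™, and the random initialization and variable names here are our own assumptions.

```python
import numpy as np

def fcm(X, p, a=2.0, eps=1e-5, max_iter=300, seed=None):
    """Sketch of FCM: X is an (N, dims) data matrix, p the number of clusters,
    a the fuzzification coefficient (a=2 in this research), eps the stopping value."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    U = rng.random((p, N))
    U /= U.sum(axis=0)                      # columns sum to 1, as required
    for _ in range(max_iter):
        Um = U ** a
        M = (Um @ X) / Um.sum(axis=1, keepdims=True)       # centroids m_i
        D = np.linalg.norm(X[None, :, :] - M[:, None, :], axis=2) + 1e-12
        e = 2.0 / (a - 1.0)
        U_new = 1.0 / (D ** e * np.sum(D ** (-e), axis=0, keepdims=True))
        if np.max(np.abs(U_new - U)) < eps:  # max_ik |u_ik(t+1) - u_ik(t)| < eps
            U = U_new
            break
        U = U_new
    cost = float(np.sum((U ** a) * D ** 2))  # cost function of Figure 1
    return U, M, cost
```

Attribute weights can be applied by scaling the columns of X before clustering, which is one way to read the “various weights” setting of Table 1.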
However, the detailed analysis of the clustering results
revealed that a number of issues remained unanswered. For
instance, we do not know how good the clustering result is
after using FCM, and the fitness of some techniques to a
cluster is questionable based on an in-depth analysis of the
techniques within that cluster. Additionally, finding the right
number of clusters is a tedious process, as the clustering has
to be repeated many times. Furthermore, the traditional FCM
is a local search algorithm that looks for local minimum
values of membership with regard to a set of selected
centroids of all data elements. However, support for RE
technique selection requires dynamically generated
information about the similarities between RE techniques.
The difficulty of finding the optimum number of clusters of
techniques prevents us from providing dynamic support for
technique analysis and selection in the methodology that we
developed for RE technique selection, because clustering
analysis is one of the early essential steps for identifying the
relationships between the RE techniques in the methodology.
Thus, one of the challenging questions in supporting SE
technique selection is how to find effective methods to
analyze the data about SE techniques, so as to facilitate a
better understanding of the techniques and of the
relationships between them, and how to use the best method
to help cluster and analyze SE techniques. To tackle this
problem, we systematically investigated the existing research
on data analysis and mining of software engineering data.
Our investigation has shown that the existing research on
clustering SE data does not provide comparative or empirical
analysis information, or heuristics, on which clustering
techniques can help generate the number of clusters
automatically or with limited human intervention. To deal
with this problem, we explored and used three methods to
help cluster and analyze RE techniques in this research:
clustering based on statistical tests [14], a genetic algorithm
[21], and dimension reduction (Principal Component
Analysis) [23] in combination with the FCM method. The
objective of this research is to understand whether existing
clustering methods or other data analysis methods can be
used together to offer meaningful help for data clustering and
analysis. This paper reports the results and experiences
obtained in this research.
The rest of the paper is organized as follows: Section 2
discusses related research; Section 3 presents the clustering
method that is based on statistical tests; Section 4 presents
our experiences of using a genetic algorithm to cluster RE
techniques; the clustering of RE techniques using a
combination of a dimension reduction technique and FCM is
presented in Section 5; Section 6 presents our conclusions
and future research.
II.
RELATED RESEARCH
The work related to analysing software engineering data can
be traced back to the 1950s, when information about the
“lines of code” was analysed [29]. However, the more recent
research in formal classification and analysis of software
engineering data can be attributed to the seminal work done
by Khoshgoftaar and Allen [30], and Mendonca and
Sunderhaft [31], in 1999. Khoshgoftaar and Allen used the
classification and regression trees (CART) algorithm to
model various software quality attributes, while Mendonca
and Sunderhaft conducted a survey of the existing approaches
that can be used in mining software engineering data. Since
then, much research has been done on mining software
engineering data and on using various data mining and data
analysis techniques to analyse software engineering data. For
example, Zhong et al. investigated two clustering algorithms,
k-means and Neural-Gas, to conduct clustering-based
analysis for software quality-estimation problems [32]. Jiang
et al. investigated multiple centroid-based unsupervised
clustering algorithms for network intrusion detection [33].
Dickinson et al. used a clustering approach to analyze
execution profiles to help find failure data [34]. The research
most closely related to our analysis of clustering algorithms
is that done by Baraldi and Blonda [35], where five fuzzy
clustering algorithms for pattern recognition are discussed
and compared. However, their comparison and analysis are
fundamentally based on theoretical models, and the data
models are largely limited to homogeneous datasets.
Even though many clustering methods have been used in
analysing software engineering data during the last ten years,
most of the existing research focuses on presenting the
clustering results obtained by designing or using a set of
clustering algorithms. The advantages and disadvantages of
these algorithms, and how to use them to analyze
heterogeneous datasets, have not been explicitly researched
or discussed. It appears that little research has systematically
examined the merit of a clustering technique utilised in a
specific application where the number of attributes is large
and the data is fuzzy in nature. This research explores the
merits and issues of clustering techniques applied in such a
specific application.
III.
CLUSTERING BASED ON STATISTICAL TEST
As discussed in Section 1, one of the major problems with
FCM is that the optimum number of clusters cannot be
determined before the actual clustering begins; it has to be
determined by trial and error using repeated clustering. To
solve this problem, a Statistical Test-based Clustering (STC)
method was proposed by Gao et al. [14]. The idea of the
approach is to conduct statistical tests in each cluster formed
in a trial so that a unimodal distribution is achieved in all
trial clusters. According to the experimental results of Gao
et al. [14], M, the number of randomly selected origins,
needs to exceed 10 to ensure fast convergence and
generality of the algorithm. However, there are only 46
techniques (data points) in the current dataset, which means
that we get fewer than 5 (46/10 = 4.6) data points per
cluster. Five data points per cluster are clearly not enough to
generate a normal distribution, which is a fundamental
requirement for applying the algorithm. To apply the
algorithm, we therefore need to extend our original data set.
According to Lehmann and Casella [15], it is possible to
construct an extended data set that is statistically completely
equivalent to the original data set X. We therefore injected
184 new data points (4 times the original number) into the
sampled space, which increases the total number of data
points to 230. These new data points are called “unlabelled
data points”, while the original data points are called
“labelled data points”, which allows us to differentiate the
original points from the inserted ones. The unlabelled data
points are inserted in a semi-random manner: the data are
generated randomly with the same type and within the same
range as the original data (as shown in Appendices 3A and
3B). The inserted data can be considered a kind of perturbed
data. Unlike the usual objective of using perturbed data to
protect data confidentiality [16], the objective of inserting
data in this research is to find data patterns and structures
within the original data set. According to Burridge [17], the
property of sufficiency of the perturbed data set with respect
to the given statistical model must be the same as that of the
original data set. With the given data and the generated
perturbed data in this research, it is possible to infer that the
extended data set and the original data set X have the same
sufficient statistic² [36], because the inserted data have the
same types, range and distribution as the original ones.
Thus, we can conclude that the result of the statistical
analysis of the original data set will be the same as that of
the extended data set obtained as described above. By
adding perturbed data, we are able to use the modified
Statistical Test-based Clustering method and obtain the
proper number of clusters for the original dataset of RE
techniques. The modified algorithm is shown in Figure 2.
The algorithm was implemented using C++ and the
Matlab™ software package. Using the algorithm and the
extended data set approach discussed above, we found that
each cluster reached its single-peak distribution when NL
(the number of clusters) = 8. The obtained NL is similar to
the values we got in [13].
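A sketch of the semi-random injection of unlabelled data points is given below in Python/NumPy. The 0.2-step rating grid and the column-wise sampling scheme are assumptions we make for illustration; the original extension was implemented alongside the C++/Matlab™ STC code.

```python
import numpy as np

def extend_dataset(X, factor=4, seed=None):
    """Inject factor*n 'unlabelled' points with the same type and range as the
    original 'labelled' ratings (semi-random, column by column)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    new_points = np.empty((factor * n, m))
    for j in range(m):
        lo, hi = X[:, j].min(), X[:, j].max()
        levels = np.arange(lo, hi + 1e-9, 0.2)  # same 0.2 rating grid as Appendix 3
        new_points[:, j] = rng.choice(levels, size=factor * n)
    return np.vstack([X, new_points])           # 46 + 184 = 230 points for our data
```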
Our initial analysis with the STC algorithm is promising, as
the algorithm has a major advantage over FCM: it provides
a way to find the exact number of clusters that we are
looking for when analyzing RE techniques. However, the
current result is still subject to further experiments in which
larger and heterogeneous data sets of SE techniques will be
used. As our knowledge of SE techniques increases, the
number of available data sets of SE techniques will likely
increase; consequently, further application and improvement
of the Statistical Test-based Clustering method are likely,
which can help to produce better analysis results.
² A sufficient statistic is a statistic that has the property of
sufficiency with respect to a statistical model and its associated
unknown parameter θ used in statistical calculation and reasoning
[36], i.e., no other statistic that can be calculated from the same
dataset provides any additional information as to the value of the
parameter θ.
Set the initial values used in the algorithm:
(1) Set M sampling origins, represented by the vector C, randomly in the data space X = {x_1, …, x_n}, where $C = \frac{1}{n}\sum_{i=1}^{n} x_i$ and n is the number of elements in X.
(2) Set cn = 2, where cn is the initial number of clusters.
(3) Calculate the distance $D_i$ between each sample $x_i$ and the vector C, and the distance $U_i$ between $x_i$ and $x_{i+1}$, where $x_i$ and $x_{i+1}$ are perpendicular.
(4) k is the number of neighbours of $x_i$ that will be searched; p is the dimension of the data points.
(5) Select $\alpha = 0.05$; the corresponding $T(\alpha) = 1.64485$ is easily found from the statistical table.
(6) Set s = 0.
1. For each cluster, compute the following normalized statistic $T_K$, where K is the index of the statistical test:
   $T_K = \left( \frac{1}{M} \sum_{i=1}^{M} \frac{D_i^p(k)}{D_i^p(k) + U_i^p(k)} - \frac{1}{2} \right) \sqrt{12M}$
2. If $|T_K| > T(\alpha)$, then s = s + 1.
3. Repeat steps 1 and 2 N times (N ≥ 100), then calculate the size of the test $\bar{s} = s / N$.
4. If $\bar{s} > \alpha$, the data set X is declared to follow a multimodal distribution and is separable; otherwise X is declared to follow a unimodal distribution.
5. For all i: if every data set $X_i \subset X$ is unimodal, then stop; cn is the number of clusters we are looking for. Otherwise set cn = cn + 1 and go to step 1.

Fig. 2 Algorithm for Computing the Number of Clusters
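The core of Figure 2 is the unimodality test. A minimal sketch of the statistic and the test decision in Python/NumPy follows; the array arguments are assumed to hold the distances $D_i^p(k)$ and $U_i^p(k)$ for the M sampling origins.

```python
import numpy as np

def t_statistic(D, U):
    """Normalized statistic T_K of Figure 2 for one test: D and U are length-M
    arrays of the distances D_i^p(k) and U_i^p(k)."""
    M = len(D)
    ratio = D / (D + U)
    return (ratio.mean() - 0.5) * np.sqrt(12 * M)

def is_multimodal(t_values, alpha=0.05, t_alpha=1.64485):
    """Size-of-test decision over N repetitions (N >= 100): the data set is
    declared multimodal (separable) when s_bar = s/N exceeds alpha."""
    s_bar = np.mean(np.abs(np.asarray(t_values)) > t_alpha)
    return s_bar > alpha
```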
IV.
USING GENETIC ALGORITHM IN FUZZY CLUSTERING

Based on the mechanism of natural selection and genetics,
genetic algorithms (GAs) have been designed and widely used
for many optimization problems [18, 19]. GAs have also been
used in clustering algorithms [20]. In this research, we want to
explore whether GAs can be used to help generate better
clustering results or provide more information about the SE
techniques data. After comparing different GAs used in
clustering methods, the GA for Fuzzy Clustering (GAFC)
algorithm proposed by Zhao [21] was used to help cluster the
RE techniques in this research, as the computational complexity
of this algorithm is not very high and it can reach better
convergence than other existing similar algorithms. According
to Lee and Takagi, the number of generations and the mutation
probability can be set within the ranges of 10 to 160 and 0.0001
to 1.0, respectively [27]. In our experiments, we found that if
the number of generations is set above 15 and the mutation
probability above 0.1, the algorithm achieves somewhat faster
convergence than we expected. Finally, we set the number of
generations to 16 and the mutation probability to 0.10. C++
and several Matlab™ packages were used during the
implementation of the GAFC algorithm. This algorithm proved
to be very expensive when we tried to use all 31 attributes
(dimensions) to conduct the clustering. After running the
calculation for 2 weeks on a PC with a 1.8 GHz CPU, we were
only able to get clustering results when the number of clusters
was set to 3 or 4. To improve the efficiency of the algorithm, we
decided to reduce the number of attributes used in the
clustering. Instead of using all 31 attributes, the 13 attributes in
Category 1 (see the Category 1 attributes in Appendix 2) were
selected for clustering; these attributes were rated as highly
important RE technique attributes by RE experts in our
previous research. With the reduced number of attributes, the
number of clusters and the values of the cost function
calculated using the GAFC algorithm are shown in Table 2,
together with the values of the cost function calculated with the
FCM algorithm. As can be seen from the table, the GAFC
algorithm converged quickly in the calculation of the cost
function, compared to FCM, while the number of clusters was
below 6. However, the performance of the GAFC algorithm is
worse than that of FCM when the number of clusters is greater
than 6. Moreover, it is hard to reach a reasonable decision with
GAFC about the exact number of clusters that should be used
in analysing RE techniques, as the values of the cost function
decrease continuously even after the number of clusters passes 9.
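For illustration, a compact GA-based fuzzy clustering loop in the spirit of GAFC [21] is sketched below in Python/NumPy, with 16 generations and mutation probability 0.10 as used in our experiments. The chromosome encoding (centroid matrices), truncation selection, uniform crossover and population size are our own assumptions, not Zhao's exact operators.

```python
import numpy as np

def gafc(X, p, a=2.0, pop_size=20, generations=16, p_mut=0.10, seed=None):
    """GA sketch for fuzzy clustering: each chromosome is a (p, dims) centroid
    matrix; fitness is the FCM cost induced by those centroids."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    lo, hi = X.min(axis=0), X.max(axis=0)
    pop = rng.uniform(lo, hi, size=(pop_size, p, d))

    def cost(M):
        D = np.linalg.norm(X[None, :, :] - M[:, None, :], axis=2) + 1e-12
        e = 2.0 / (a - 1.0)
        U = 1.0 / (D ** e * np.sum(D ** (-e), axis=0, keepdims=True))
        return float(np.sum((U ** a) * D ** 2))

    for _ in range(generations):
        pop = pop[np.argsort([cost(M) for M in pop])]  # lower cost = fitter
        elite = pop[: pop_size // 2]                   # truncation selection
        children = []
        while len(children) < pop_size - len(elite):
            i, j = rng.integers(0, len(elite), size=2)
            mask = rng.random((p, d)) < 0.5            # uniform crossover
            child = np.where(mask, elite[i], elite[j])
            mut = rng.random((p, d)) < p_mut           # mutation prob. 0.10
            child[mut] = rng.uniform(lo, hi, size=(p, d))[mut]
            children.append(child)
        pop = np.concatenate([elite, np.array(children)], axis=0)
    best = min(pop, key=cost)
    return best, cost(best)
```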
Our initial investigation has shown that the two likely major
reasons for this phenomenon are:
• the limitations of the algorithm itself, and
• the characteristics of the data points, which have 13
attributes, i.e., the algorithm might not be suitable for
higher-dimensional (more than 13-dimensional) data.
Our further investigation on 6 attributes, randomly selected
from the Category 1 data set, shows that the value of the cost
function with the GAFC algorithm also converges quickly,
reaching its minimum when the number of clusters reaches 12.
This generated number of clusters (12) differs from the number
of clusters (8 or 9) generated with the FCM algorithm in our
earlier research. The major reason for the difference is that
essential information about the RE techniques is lost when the
attributes used in the clustering (each attribute can be
considered one dimension of a technique) are randomly
removed. Thus, we conclude that:
• the performance of GAs is best when they are applied to data
with a low number of dimensions, i.e., fewer than 6; and
• it is not appropriate to use the GAFC algorithm after randomly
removing a number of dimensions of the given dataset.
Table 2. GA Clustering Results Compared with FCM Clustering Results

Number of    Cost value using the GA     Cost value using the Fuzzy
clusters     Fuzzy Clustering algorithm  C-Means clustering algorithm
2            12.76                       18.89
3             9.32                       16.23
4             7.86                       12.96
5             6.72                       10.65
6             5.91                        6.18
7             5.01                        5.11
8             4.05                        3.60
9             3.29                        3.64
As a result of this observation, the immediate question is how
to reduce the dimensions of the data points whilst keeping their
essential information, so that the full potential of GAs can be
achieved. To tackle this issue, we utilize dimension reduction
methods, which are discussed in the next section. Issues related
to the improvement of the GAFC algorithm itself are part of
our future research.
V.
CLUSTERING BASED ON DIMENSION REDUCTION
As discussed above, one of the major challenges in conducting
effective data analysis is that the data points contain too many
attributes (dimensions); many data analysis and clustering
algorithms cannot deal with multi-dimensional data effectively.
Thus, one solution to this problem is to reduce the dimensions
of the data points while keeping the essential information of the
original data. Dimension reduction methods have been widely
used in computer vision and pattern recognition research and
have proved effective in analysing data with many dimensions
[22-25]. The major objective of dimension reduction is to
search for a property-preserving low-dimensional representation
of the higher-dimensional data, i.e., to map the high-dimensional
space to a lower-dimensional space in such a way that the
required properties are preserved. For example, we can map a
data set $\{D_1(a_1, a_2, \ldots, a_i, a_{i+1}, \ldots, a_n), \ldots, D_n(a_1, a_2, \ldots, a_i, a_{i+1}, \ldots, a_n)\}$,
in which each point has n attributes, to a data set
$\{D'_1(a_1, a_2, \ldots, a_k), \ldots, D'_n(a_1, a_2, \ldots, a_k)\}$ by using a certain
dimension reduction algorithm, where $D'_1(a_1, a_2, \ldots, a_k)$
contains the essential information of $D_1(a_1, a_2, \ldots, a_i, a_{i+1}, \ldots, a_n)$.
Most often, the dimension reduction problem is formulated as
an optimization problem in which the required properties are
quantified by an objective function. Applying dimension
reduction techniques to software engineering data makes
perfect sense, as software engineering data usually have high
dimensionality [28].
There are many dimension reduction methods available, such
as Principal Component Analysis [23], Projection Pursuit [24],
and Principal Curves [25]. In our research, we use the Principal
Component Analysis (PCA) method, as it is the most widely
used method in practice and is suitable for the size and type of
data that we want to process. Moreover, PCA has been
implemented in one of the Matlab™ software packages
(princomp), which can be used directly. The fundamental idea
of PCA is to project high-dimensional data along the dimensions
with maximal variances, so that the reconstruction error of the
low-dimensional data points is minimized and the properties of
the data points are maximally preserved.
In this research, we also used a modified PCA algorithm in
which a weight $W_j \in [0, 1]$, j = 1, …, m, is applied to each
attribute based on the importance of that attribute with respect
to RE techniques, as judged by requirements engineers. This
modified algorithm was implemented in C++ and is illustrated
in Figure 3 below. We used both princomp in Matlab™ and our
implementation of the modified algorithm in the experiments of
this research. Using these algorithms, we successfully reduced
the 31-dimensional data points to 6-dimensional data points.
Some examples of the 6-dimensional data points obtained with
princomp are presented in Table 3. The entire list of the
generated data with reduced dimensions is given in Appendix 4.
The data generated by the dimension reduction operation (see
Appendix 4) was clustered using the FCM algorithm. The
results of the clustering are shown in Table 4. In the table, a is
the fuzzification coefficient; normally a is set to 2. We set a to
1.5 and 2, respectively, in this research to compare the
convergence of the FCM algorithm before and after using
dimension reduction techniques. As can be seen, there are two
choices for the number of clusters: 8 and 9. The obtained
number of clusters is essentially the same as the number of
clusters obtained in our previous research, where only FCM
was used. The major gain of using PCA is that it helps to
reduce the complexity of using FCM in the later stage of
clustering. The major problem with PCA is that it is still hard
to tell the exact number of clusters that should be chosen, i.e.,
this number has to be determined by humans: as can be seen
from Table 4, both NL=8 and NL=9 are valid options.
Improving the algorithm further, or combining PCA with
GAFC, is the subject of future research.
1. Prepare the initial data set:
(1) Initialize the data set $X_{i,j}$ and the weights $W_j$, i = 1, …, n; j = 1, …, m, and construct the original matrix $U_{i,j}$.
(2) For j = 1 to m, calculate $\bar{X}_j = \frac{1}{n} \sum_{i=1}^{n} X_{i,j} W_j$.
(3) Generate $U^{new}_{i,j}$, the new adjusted matrix: $U^{new}_{i,j} = U_{i,j} - \{\bar{X}_j\}$.
2. Calculate:
(1) the covariance matrix of $U^{new}_{i,j}$;
(2) the eigenvectors and eigenvalues of the covariance matrix from 2(1): $(eig_1, eig_2, \ldots, eig_m)$;
(3) the feature vector formed by the chosen components: FeatureVector = $(eig_1, eig_2, \ldots, eig_k)$, where k (k ≤ m) is the number of components selected for the projection of the original data in $U_{i,j}$;
(4) the new data set: FinalDataSet = FeatureVector × $U^{new}_{i,j}$.
Note: $W_j \in [0, 1]$, j = 1, …, m, is the weight given to each attribute.

Fig. 3 A Modified PCA Algorithm
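A minimal NumPy sketch of the modified (weighted) PCA of Figure 3 is given below, followed by the way it feeds the FCM stage. The fcm helper is the sketch given after Figure 1, and load_ratings / load_attribute_weights are hypothetical loaders for the Appendix 3 data and the expert-judged weights; they are not part of the original implementation.

```python
import numpy as np

def weighted_pca(X, w, k):
    """Weight each attribute by its judged importance w_j in [0, 1], mean-centre,
    then project onto the top-k eigenvectors of the covariance matrix
    (princomp in Matlab performs the unweighted equivalent)."""
    Xw = X * w                              # step 1: apply attribute weights
    U = Xw - Xw.mean(axis=0)                # adjusted matrix U_new
    C = np.cov(U, rowvar=False)             # covariance of the attributes
    eigvals, eigvecs = np.linalg.eigh(C)    # eigh, since C is symmetric
    idx = np.argsort(eigvals)[::-1][:k]     # keep the k largest components
    return U @ eigvecs[:, idx]              # final (n, k) data set

# Sketch of the pipeline of this section: 31 attributes -> 6 dimensions -> FCM.
X = load_ratings()                # hypothetical: 46 x 31 matrix from Appendix 3A/3B
w = load_attribute_weights()      # hypothetical: expert weights W_j in [0, 1]
X6 = weighted_pca(X, w, k=6)      # reduced data, as in Appendix 4
for NL in range(2, 11):           # compare the cost over candidate cluster numbers
    _, _, cost = fcm(X6, p=NL, a=2.0)
    print(NL, cost)
```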
Table 3. An Example of the Generated Dataset after Dimension Reduction

Techniques  D1         D2         D3         D4         D5         D6
T1         -1.892500   0.079340  -0.206770   0.030650  -0.419870  -0.069110
T2         -1.630000   0.103430  -0.414710   0.304470  -0.663370  -0.106870
T3         -1.168700   0.379490  -0.075140  -0.112600  -0.558350   0.117600
T4         -1.914800  -0.322170  -0.503340   0.045320  -0.617720   0.004040
T5         -1.672900  -0.824850  -0.532970   0.094020   0.177100  -0.117170
T6         -1.915200  -0.281250  -0.304210   0.018030  -0.334170   0.214430
T7         -1.658000  -0.170510  -0.246680   0.195570  -0.345710   0.303790
T8         -0.779230  -0.704200   0.290550  -0.309960  -0.197310   0.635260

Notes: (1) Di represents dimension i. (2) See Appendix 1 for the name of Ti.

Table 4. Clustering Result

Number of   Value of Cost      Value of Cost
Clusters    Function (a=1.5)   Function (a=2)
NL=2         0.049800           0.074970
NL=3        -0.004067           0.023239
NL=4        -0.011247           0.000535
NL=5        -0.009762           0.002730
NL=6        -0.001504           0.030802
NL=7        -0.000941           0.020111
NL=8        -0.013643          -0.010694
NL=9        -0.013643          -0.012545
NL=10       -0.009591           0.002380

Note: a is the fuzzification coefficient.
VI.
CONCLUSION AND FUTURE WORK
Analysis of empirical data is a difficult yet important task in
SE [26]. This paper reported our experiments with three data
analysis methods and algorithms applied to the empirical data
obtained in our previous research: clustering based on
statistical tests, a genetic algorithm method, and clustering
based on dimension reduction. We presented the issues and
demonstrated possible ways to deal with the determination of
the number of clusters and the reduction of dimensions for
effective clustering. The research has shown that the best
solution for analysing empirical data is to combine different
data analysis methods. This combined approach might be a
time-consuming and daunting process; however, it is the only
way to discover meaningful information and the underlying
structure of the data. Moreover, with further research and the
combination of more clustering methods, it is possible to
reduce the effort spent in analysing SE data. At this stage, it is
safe to say that the STC algorithm is promising, as it provides
a way to find the exact number of clusters we are looking for,
even though the validity of this conclusion is still subject to
further experiments in which larger and heterogeneous data
sets of SE techniques are used. Additionally, a combination of
PCA and FCM can provide better data analysis results, based
on the experience obtained in this research. Finally, it is still
difficult to say whether the combination of PCA, GA and FCM
will lead to better results for clustering analysis; this question
is subject to our future research. The design and development
of a tool that can facilitate using different data analysis
methods to analyse SE data is another topic for our future
research.

REFERENCES
[1] Brooks F.: No Silver Bullet: Essence and Accident in Software Engineering, IEEE Computer, 20(4), 10-19 (1987)
[2] Jiang L., Eberlein A., Far B.H., Mousavi M.: A Methodology for the Selection of Requirements Engineering Techniques, Journal of Software and Systems Modeling, 7(3), 303-328 (2008)
[3] Glass R.L.: Matching Methodology to Problem Domain, Comm. of the ACM, 47(5), 19-21 (2004)
[4] Basili V.R.: The Role of Experimentation in Software Engineering: Past, Current, and Future, Proc. 18th Int. Conference on Software Engineering, Berlin, Germany, pp. 442-449 (1996)
[5] Emam K.E., Birk A.: Validating the ISO/IEC 15504 Measure of Software Requirements Analysis Process Capability, IEEE Trans. on Software Engineering, 26(6), 119-149 (2000)
[6] Zowghi D., Damian D., Offen R.: Field Studies of Requirements Engineering in a Multi-Site Software Development Organization, Proc. Australian Workshop on Requirements Engineering, Univ. of New South Wales (2001)
[7] Neill C.J., Laplante P.A.: Requirements Engineering: the State of the Practice, IEEE Software, 20(6), 40-45 (2003)
[8] Antón A.I.: Successful Software Projects Need Requirements Planning, IEEE Software, 20(3), 44-46 (2003)
[9] Jiang L.: A Framework For Requirements Engineering Process Development, PhD Thesis, University of Calgary, Canada, Sep. (2005)
[10] Dunn J.: A Fuzzy Relative of the ISODATA Process and its Use in Detecting Compact, Well Separated Clusters, Journal of Cybernetics, 3(3), 32-57 (1974)
[11] Bezdek J.C.: Cluster Validity with Fuzzy Sets, Journal of Cybernetics, 3(3), 58-71 (1974)
[12] Bezdek J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press (1981)
[13] Jiang L., Eberlein A.: Clustering Requirements Engineering Techniques, 10th IASTED International Conference on Software Engineering and Applications, Nov. 13-15, Dallas, Texas, USA (2006)
[14] Gao X.B., Ji H.B., Li J.: An Advanced Cluster Analysis Method Based on Statistical Test, IEEE ICSP, pp. 1100-1103 (2002)
[15] Lehmann E.L., Casella G.: Theory of Point Estimation, Springer-Verlag, New York (1998)
[16] Liu K., Kargupta H., Ryan J.: Random Projection-Based Multiplicative Data Perturbation for Privacy Preserving Distributed Data Mining, IEEE Transactions on Knowledge and Data Engineering, 18(1), 92-106 (2006)
[17] Burridge J.: Information Preserving Statistical Obfuscation, Statistics and Computing, 13(4), 321-327 (2003)
[18] Gen M., Cheng R.: Genetic Algorithms and Engineering Design, Wiley, New York (1997)
[19] Chambers L.D.: The Practical Handbook of Genetic Algorithms Applications, Chapman & Hall/CRC (2001)
[20] Cordon O.: Ten Years of Genetic Fuzzy Systems: Current Framework and New Trends, Proc. Joint 9th IFSA World Congress and 20th NAFIPS International Conference, 1241 (2001)
[21] Zhao L., Tsujimura Y., Gen M.: Genetic Algorithm for Fuzzy Clustering, Proc. IEEE International Conference on Evolutionary Computation, 716 (1996)
[22] Carreira-Perpinan M.A.: A Review of Dimension Reduction Techniques, Technical Report CS-96-09, Department of Computer Science, University of Sheffield (1997)
[23] Jolliffe I.T.: Principal Component Analysis, Springer Series in Statistics, Springer-Verlag, Berlin (1986)
[24] Jones M.C.: The Projection Pursuit Algorithm for Exploratory Data Analysis, PhD Thesis, University of Bath (1983)
[25] Hastie T.J., Stuetzle W.: Principal Curves, Journal of the American Statistical Association, 84, 502-516 (1989)
[26] Shin M., Goel A.L.: Empirical Data Modeling in Software Engineering Using Radial Basis Functions, IEEE Trans. on Software Engineering, 26(6), 567 (2000)
[27] Lee M.A., Takagi H.: Dynamic Control of Genetic Algorithms Using Fuzzy Logic Techniques, Proc. Int. Conf. on Genetic Algorithms, Urbana-Champaign, IL, pp. 76-83 (1993)
[28] Goel A.L., Shin M.: Software Engineering Data Analysis Techniques (tutorial), Proc. 19th International Conference on Software Engineering, Boston, Massachusetts, USA, pp. 667-668 (1997)
[29] Jones C.: Applied Software Measurement: Global Analysis of Productivity and Quality, Third Edition, McGraw-Hill (2008)
[30] Khoshgoftaar T.M., Allen E.B.: Modeling Software Quality with Classification Trees, in Recent Advances in Reliability and Quality Engineering (Hoang Pham, ed.), World Scientific, Singapore (1999)
[31] Mendonca M., Sunderhaft N.L.: Mining Software Engineering Data: A Survey, DACS State-of-the-Art Report, Data & Analysis Center for Software, Rome, NY (1999)
[32] Zhong S., Khoshgoftaar T.M., Seliya N.: Analyzing Software Measurement Data with Clustering Techniques, IEEE Intelligent Systems, 19(2), 20-27 (2004)
[33] Jiang S.Y., Song X.Y., Wang H., et al.: A Clustering-Based Method for Unsupervised Intrusion Detections, Pattern Recognition Letters, 27(7), 802-810 (2006)
[34] Dickinson W., Leon D., Podgurski A.: Finding Failures by Cluster Analysis of Execution Profiles, Proc. Int. Conf. on Software Engineering (2001)
[35] Baraldi A., Blonda P.: A Survey of Fuzzy Clustering Algorithms for Pattern Recognition, Part I, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 29(6), 778-785 (1999)
[36] Hogg R.V., Craig A.T.: Introduction to Mathematical Statistics, Macmillan Publishing Co. Inc. (1978)
Appendix 1. The List of RE Techniques and Notations

T1   Brain Storming and Idea Reduction
T2   Designer As Apprentice
T3   Document Mining
T4   Ethnography
T5   Focus Group
T6   Interview
T7   Contextual Inquiry
T8   Laddering
T9   Viewpoints-Oriented Elicitation
T10  Exploratory Prototype
T11  Evolutionary Prototypes
T12  Viewpoints-Oriented Analysis
T13  Repertory Grids
T14  Scenario Approach
T15  JAD
T16  The Soft Systems Methodology (SSM)
T17  Goal-Oriented Analysis
T18  Viewpoints-Based Definition
T19  Future Workshops
T20  Representation Modeling
T21  Functional Decomposition
T22  Decision Tables
T23  State Machine
T24  State Charts
T25  Petri-Nets
T26  Structured Analysis (SA)
T27  Real-Time Structured Analysis
T28  Object-Oriented Analysis
T29  Problem Frame Oriented Analysis
T30  Goal-Oriented Verification and Validation
T31  Entity Relationship Diagrams
T32  AHP
T33  Card Sorting
T34  Software QFD
T35  Fault Tree Analysis
T36  Structured Natural Language Specification
T37  Viewpoints-Oriented Verification and Validation
T38  Unified Modeling Language (UML)
T39  Z
T40  LOTOS
T41  SDL
T42  XP
T43  Formal Requirements Inspection
T44  Requirements Testing
T45  Requirements Checklists
T46  Utility Test
Appendix 2. RE Techniques Attributes & Notations

Notation  Technique Attribute                                                         Category
A1        Ability to facilitate the communication                                     2
A2        Ability to understand social issues                                         2
A3        Ability to get domain knowledge                                             1
A4        Ability to get implicit knowledge                                           1
A5        Ability to identify stakeholders                                            1
A6        Ability to identify non-functional requirements                             1
A7        Ability to identify various viewpoints                                      2
A8        Ability to model and understand requirements                                1
A9        Understanding ability for the notations used in analysis                    2
A10       Ability to analyze non-functional requirements                              1
A11       Ability to facilitate the negotiation with customer                         2
A12       Ability to prioritize the requirements                                      2
A13       Ability to identify the accessibility of the system                         2
A14       Ability to model interface requirements                                     1
A15       Ability to identify and support requirements reuse                          2
A16       Ability to represent requirements (expressibility)                          2
A17       Capability for requirements verification                                    1
A18       Completeness of the semantics of the notation                               2
A19       Ability to write unambiguous and precise requirements by using the notation 1
A20       Ability to write complete requirements                                      1
A21       Capability for requirements management                                      1
A22       Modularity                                                                  2
A23       Implementability (Executability)                                            2
A24       Ability to identify the unambiguous requirements                            2
A25       Ability to identify the interaction between requirements (ambiguous, inconsistency, conflict)  1
A26       Ability to identify the incomplete requirements                             2
A27       Ability to support COTS-based RE process                                    2
A28       Maturity of the supporting tool                                             1
A29       Learning curve (Introduction cost)                                          2
A30       Application cost                                                            2
A31       Complexity of the techniques                                                2
Appendix 3A. RE Techniques Assessment (Empirical) Data (1)

(Rows: techniques T1-T23; columns: attributes A1-A31, in order.)

T1   0.8 0.4 1 0.2 1 1 0.8 0 0 0.6  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0.2 0.6 0.2
T2   0.8 1 1 1 0.2 1 0.2 0 0 0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0.2 0.6 0.2
T3   0 0.8 1 0.2 0.2 0.8 0.4 0 0 0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0.2 0.4 0.2
T4   0.6 0.8 1 1 0.6 0.4 0.4 0 0 0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0.4 0.6 0.4
T5   1 1 0.6 0.4 1 1 0.8 0 0 0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0.6 0.6 0.6
T6   0.8 0.8 0.6 0.2 1 1 0.8 0 0 0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0.2 0.4 0.2
T7   1 1 0.6 0.2 1 0.6 0.6 0 0 0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0.2 0.4 0.2
T8   0.6 0.6 0.6 0.2 1 0.6 0.6 0 0 0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.4  0.2 0.4 0.2
T9   0.8 1 0.8 0.6 1 0.8 1 0.6 0 0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.8 0 0.8  0.4 0.4 0.2
T10  0.8 0.2 0.4 0.2 0 0 0 0.8 0 0  0.8 0 0.8 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0.8  0.2 1 0.4
T11  0.4 0 0 0 0 0 0 1 0.8 0  0.2 0 0.8 1 0 1 0.8 0 0 0.4 0 0 0 0 0.4 0.2 0 0.6  0.4 0.4 0.4
T12  0 0 0 0 0 0 0 0.8 0.8 0.6  0.8 0.6 0.4 0.4 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0.2 0.2 0.2
T13  1 0.6 0.6 0.6 0.6 0.2 0.4 0 0 0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1  0.4 0.4 0.4
T14  0.8 0.6 0.4 0.2 0.4 0.2 0.8 1 1 0.2  0.4 0.4 0.8 0.6 0 0 0.6 0.6 0.8 0.6 0.6 0 0.4 0.4 0.4 0.2 0.2 0.6  0.4 0.4 0.4
T15  1 1 0.6 0.2 1 0.8 1 0 0 0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.4  0.6 0.6 0.2
T16  1 1 0.6 0.2 1 0.4 0.6 0 0 0.6  0 0 0 0 0 0 0 0 0 0.4 0 0 0 0 0 0.2 0 0  0.6 0.6 0.6
T17  0 0 0 0 0 0 0 0.8 0.8 0.6  0.8 0.4 0.2 0.4 0.2 1 1 0.8 1 0.8 0.8 0.8 1 1 0.8 0.6 0 0.6  0.8 0.6 0.8
T18  0 0 0 0 0 0 0 0 0 0  0 0 0 0 0 0.8 0.6 1 0.8 0.8 0.8 0.2 0.2 0 0 0 0 0.8  0.4 0.4 0.2
T19  1 1 0.6 0.2 1 0.8 1 0.8 0 0.6  0.4 0.6 0.6 0.4 0 0 0.6 0 0 0.6 0 0 0 0 0 0.8 0 0  0.4 0.6 0.4
T20  0.8 0 0 0.2 0 0 0.2 1 1 0  0.4 0.2 0.6 1 0 1 0.6 0.6 1 0.6 0.8 0 0 0 0.2 0 0 0.8  0.2 0.2 0.2
T21  0 0 0 0 0 0 0 0.8 1 0.2  0.4 0.2 0.6 0.2 0 0.8 0.2 0.6 0.6 0.6 0.6 0.6 0 0 0 0 0 0.8  0.4 0.2 0.2
T22  0 0 0 0 0 0 0 1 1 0  0.4 0 0.6 0 0 1 0.4 0.2 1 0.6 0.8 0 0 0.2 0.4 0 0 0.8  0.4 0.2 0.4
T23  0 0 0 0 0 0 0 1 0.6 0  0.4 0 0.8 0 0 1 0.8 0.6 1 0.6 1 0 0 0.6 0.6 0 0 0.6  0.6 0.6 0.6

Legend: 1. See Appendix 1 for the technique name that each Tj represents in the table.
2. See Appendix 2 for the attribute name that each Ai represents in the table.
3. The number in each cell represents the degree to which each technique satisfies each attribute.
Appendix 3B. RE Techniques Assessment (Empirical) Data (2)

(Rows: techniques T24-T46; columns: attributes A1-A31, in order.)

T24  0 0 0 0 0 0 0 1 0.8 0  0.4 0 0.6 0.6 0 1 0.8 0.6 0.8 0.6 0 0 0 0.6 0.2 0 0 0.6  0.6 0.6 0.4
T25  0 0 0 0 0 0 0 1 0.2 0  0 0 0.8 0 0 1 0.8 0.8 1 0.6 1 0 0 0.6 0.6 0 0 0.8  0.8 0.8 0.8
T26  0 0 0 0 0 0 0 1 1 0  0.4 0.2 0.8 0 0 1 0.4 0.6 1 0.6 1 0 0 0 0.4 0 0 0.8  0.4 0.6 0.4
T27  0 0 0 0 0 0 0 1 0.8 0  0.4 0 0.8 0 0 1 0.4 0.6 1 0.6 1 0 0 0 0.4 0 0 0.6  0.6 0.6 0.6
T28  0 0 0 0 0 0 0 1 0.8 0  0.4 0 0.8 0.6 1 1 0.4 0.6 0.8 0.6 0.8 0 0 0 0.4 0 0 0.8  0.6 0.4 0.2
T29  0 0 0 0 0 0 0 1 1 0  0.2 0 0.6 0.6 1 1 0.4 0 0.6 1 0.8 0 0 0 0.2 0.8 0 0  0.4 0.4 0.4
T30  0 0 0 0 0 0 0 0 0 0  0 0 0 0 0 0 1 0 0 0 0 0 0 0.8 0.8 0.6 0 0.6  0.8 0.6 0.8
T31  0 0 0 0 0 0 0 1 1 0  0.2 0 0.8 0 1 1 0.6 0.6 0.8 0.8 0.8 0 0 0 0.4 0.2 0 0.8  0.4 0.4 0.4
T32  0 0 0 0 0 0 0 0 0 0  0.6 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1  0.6 0.4 0.4
T33  0 0 0 0 0 0 0 0 0 0  0.2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0  0.2 0.4 0.2
T34  1 1 0.4 0.2 1 1 0.8 0 0 1  0.8 1 0.6 0.6 0 0 0.4 0 0 0.8 0 0 0 0 0 0.8 1 0.8  0.8 0.4 0.8
T35  0 0 0 0 0 0 0 0.8 1 1  0.2 0 0.6 0 0 1 0 0.6 0.6 0.6 0.6 0 0 0 0 0.2 0 0.8  0.2 0.2 0.2
T36  0 0 0 0 0 0 0 0.6 1 1  0.2 0 0.6 0 0 1 0 0.6 0.6 0.6 0.6 0 0 0 0.2 0.2 0 0.8  0.2 0.2 0.2
T37  0 0 0 0 0 0 0 0 0 0  0 0 0 0 0 0 0.4 0.4 0 0 0 0 0 0.8 0.4 0.8 0.4 0.4  0.8 0.8 0.8
T38  0.4 0 0 0 0 0 0 1 0.8 0  0.8 0 0.6 1 1 1 0.6 0.8 0.8 0.8 0.8 0.8 0.6 0.6 0.8 0.4 0 1  0.6 0.4 0.6
T39  0 0 0 0 0 0 0 1 0.4 0  0 0 1 0 0 1 1 1 1 0.6 1 0.8 1 1 1 0 0 0.8  1 0.8 1
T40  0 0 0 0 0 0 0 1 0.4 0  0 0 1 0 0 1 1 1 1 0.6 1 0.8 1 1 1 0 0 0.8  1 0.8 1
T41  0 0 0 0 0 0 0 1 0.4 0  0 0 1 0 0 1 1 1 1 0.6 0.4 0.8 1 1 1 0 0 0.8  1 0.8 1
T42  1 0.6 0.4 0 0.4 0.2 0 1 0.8 0  1 0.8 0.2 0.8 0 0.8 0.4 0 0.4 0.4 0.8 0 0 0.4 0.4 0 0 0.6  0.4 0.4 0.4
T43  0 0 0 0 0 0 0 0 0 0  0 0 0 0 0 0 0.4 0 0 0.6 0.8 0 0 0.4 0.4 0.8 0 0  0.6 0.4 0.6
T44  0 0 0 0 0 0 0 0 0 0  0 0 1 0 0 0 1 0 0 0.6 0.8 0 0 0 1 0.8 0 0  0.2 0.6 0.2
T45  0 0 0 0 0 0 0 0 0 0  0 0 1 0 0 0 0.8 0 0 0.6 0.8 0 0 0 1 0.8 0 0  0.2 0.6 0.2
T46  0 0 0 0 0 0 0 0 0 0  0 0 1 1 0 0 0.8 0 0 0.6 0.8 0 0 0 1 0.8 0 0  0.2 0.6 0.2

Legend: 1. See Appendix 1 for the technique name that each Tj represents in the table.
2. See Appendix 2 for the attribute name that each Ai represents in the table.
Appendix 4. The Generated Dataset after Dimension Reduction

Techniques  D1         D2         D3         D4         D5         D6
T1         -1.892500   0.079340  -0.206770   0.030650  -0.419870  -0.069110
T2         -1.630000   0.103430  -0.414710   0.304470  -0.663370  -0.106870
T3         -1.168700   0.379490  -0.075140  -0.112600  -0.558350   0.117600
T4         -1.914800  -0.322170  -0.503340   0.045320  -0.617720   0.004040
T5         -1.672900  -0.824850  -0.532970   0.094020   0.177100  -0.117170
T6         -1.915200  -0.281250  -0.304210   0.018030  -0.334170   0.214430
T7         -1.658000  -0.170510  -0.246680   0.195570  -0.345710   0.303790
T8         -0.779230  -0.704200   0.290550  -0.309960  -0.197310   0.635260
T9         -1.739400  -0.267080  -0.079758  -0.332310  -0.234040  -0.823150
T10        -0.237480  -0.742650   0.577910   0.380390   0.135690  -0.899700
T11        -0.016696  -0.653690   0.191440   0.623020   0.262950  -0.710950
T12        -0.245810   0.289370   1.181900   0.237940   0.801180  -0.039679
T13        -0.881600  -0.435800   0.344940  -0.266430  -0.320780  -0.319430
T14         0.418380  -1.243800  -0.055321   0.169370   0.050422   0.072067
T15        -1.210400  -1.325600  -0.576650   0.388760   0.434210  -0.037322
T16        -1.706900   0.066620  -0.191640  -0.372020  -0.319680  -0.354170
T17         1.574500  -0.358890  -0.900520  -0.407790   0.725390   0.279240
T18         0.795750   0.607680   0.112670  -0.698890  -0.738230   0.387110
T19        -1.737100  -0.693110  -0.552530   0.363110   0.218520   0.005525
T20         0.929870  -0.824860   0.653110   0.347560  -0.026437  -0.033600
T21         0.839770  -0.197750   1.017000  -0.078768  -0.242140   0.009505
T22         1.185200  -0.240130   0.573550   0.090900  -0.178770  -0.047044
T23         1.420900  -0.122170  -0.191520   0.079606  -0.142130   0.055687
T24         1.366700  -0.307620  -0.257750  -0.019063  -0.084030  -0.012172
T25         1.447800   0.015373   0.507080  -0.169040  -0.276290  -0.027596
T26         1.304900  -0.332190   0.460200   0.100120  -0.276920   0.227060
T27         1.290600  -0.232910   0.270880   0.095809  -0.313150   0.118010
T28         1.245400  -0.420970   0.596130   0.435600   0.019795   0.014297
T29         0.867110  -0.200102   0.585730   1.129800   0.120070  -0.117510
T30        -0.007875   1.392200  -0.579310  -0.263990   0.640890  -0.586240
T31         1.306100  -0.344460   0.462860   0.425450  -0.253730   0.097974
T32        -0.517170   0.985160   0.920430  -0.943720   0.940190   0.480050
T33        -0.686850   1.276400   0.891440  -0.498060   0.653350   0.420860
T34        -1.888500  -0.749610  -0.445390  -0.356510   0.868440   0.959770
T35         0.720930  -0.257850   0.940860  -0.086485  -0.430780   0.234140
T36         0.708600  -0.152110   0.864600  -0.064430  -0.422720   0.296120
T37        -0.194780   1.315200  -0.117810  -0.552300   0.629180  -0.749110
T38         1.630800  -0.573940  -0.111530   0.254390   0.628220  -0.013977
T39         1.946700  -0.116060  -1.360400  -0.488250  -0.004545  -0.075070
T40         1.946700  -0.116060  -1.360400  -0.488250   0.004545  -0.075070
T41         1.785700  -0.104740  -1.313500  -0.617980   0.089288  -0.277230
T42         0.054956  -0.089951   0.375560   0.136670   0.790240  -0.087457
T43         0.047944   1.332400  -0.530030   1.088900  -0.096070   0.323940
T44         0.047944   1.332400  -0.520030   1.088900  -0.059607   0.323940
T45         0.022047   1.330100  -0.450630   1.552000  -0.058295   0.293840
T46         0.025468   1.145700  -0.397900   1.338500   0.273890   0.176250

Notes: (1) Di represents the generated new dimension i. (2) See Appendix 1 for the name of Ti.