Supporting Web Application Evolution by Dynamic Analysis
Giuseppe Antonio Di Lucca*, Massimiliano Di Penta*,
Anna Rita Fasolino°, Porfirio Tramontana°
{dilucca, dipenta}@unisannio.it, {fasolino, ptramont}@unina.it
*RCOST - Research Centre on Software Technology, University of Sannio
Palazzo ex Poste, via Traiano, 82100 Benevento, Italy
° Dipartimento di Informatica e Sistemistica, Università di Napoli Federico II
Via Claudio, 21, 80125 Napoli, Italy
appropriate notations (see, for example, the Conallen’s
UML extension [4]) have been proposed.
However, in most cases the only source of
documentation available is the source code of the
WA itself. This is because the fast pace of development
often does not permit the production of adequate
documentation, which would reduce the effort of
maintenance and evolution operations. In this case, the only
possibility is to recover the missing information by
reverse engineering the WA [8].
Abstract
The evolution of Web Applications needs to be supported
by the availability of proper analysis and design
documents. UML use case diagrams are certainly useful
to identify features to evolve, as well as to study the Web
Application evolution in terms of features added/removed
or changed. Unfortunately, very often the only source of
documentation available is constituted by the Web
Application source code.
This paper proposes an approach to abstract use case
diagrams from execution traces of a Web Application.
The approach is mainly based on the analysis of a graph
modelling the transitions between the pages navigated
along user sessions and the clustering of the navigated
pages. A case study carried out to validate the proposed
approach and showing its feasibility is reported in the
paper.
In this paper we will focus on the reverse engineering
of WA use case diagrams. These diagrams can support
the WA evolution in different possible ways, such as:
- identifying the features to be evolved, mapping
the modification requests to use cases;
- identifying the features impacted by a
modification;
- analyzing how the WA evolves in terms of
features added, removed or changed: this can be
made analyzing use case diagram snapshots
taken at different releases; or
- supporting the regression testing of the evolved
WA.
Keywords: web application reverse engineering,
dynamic analysis, UML diagram abstraction
1. Introduction
The rapid diffusion of Web Applications (WAs), and the
growth of their complexity, have raised the need for
supporting their evolution with a disciplined life cycle
and proper methodologies. Due to the market pressure,
Web applications are characterized by a fast development
and by a high rate of maintenance and evolution
operations, to continuously adapt the application to the
new needs.
When performing maintenance/evolution activities, it
is necessary to have the WA analysis and design
documentation available to effectively and correctly
perform the required intervention. To this aim,
Many WA reverse engineering tasks, including the one
described in this paper, are quite difficult to perform
relying only on static analysis of the code. This is already
well-known for traditional software: tasks such as the
recovery of design patterns [10], scenarios and sequence
diagrams [16, 3] are just some examples.
WAs tend to be more and more highly interactive and
dynamic than traditional applications: HTML pages can
be dynamically built by server pages, thus, according to
the user inputs or requests, the WA user interface may
Proceedings of the 2005 Eighth International Workshop on Principles of Software Evolution (IWPSE’05)
1550-4077/05 $20.00 © 2005
IEEE
adoption of the Conallen notation [4] for WA
documentation introduced the need for reverse
engineering UML documentation relying on that
extension. To this aim, Di Lucca et al. [8] proposed an
approach and a tool, named WARE, to recover WA's
documentation represented by UML diagrams.
In particular, Di Lucca et al. [6, 7] presented an
approach to abstract use case diagrams, sequence
diagrams and business object models from WAs. The
approach proposed relies on static information, which
may not suffice for an effective and complete abstraction
of UML diagrams, due to the dynamic nature of some
WA components.
change at run-time. Moreover, even pieces of code (e.g.,
client-side scripts) can be dynamically generated.
Thus, for highly dynamic WAs, static analysis is
likely to give only an imprecise and approximate picture,
and only dynamic analysis allows a proper understanding
of complex and dynamic application behaviour (such as
the client-side logic). Dynamic analysis also makes it possible to
track other information, such as session and
cookie data, the DBMS tables and queried entities, the
frequency with which a particular link is exercised [17], and the type
of link actually exercised (e.g., hyperlinks, or form submission with
GET or POST).
This paper proposes an approach for abstracting UML
documentation from information collected dynamically
by the execution of instrumented WAs. In particular, the
execution traces of user sessions are exploited to
recognise the use cases the users executed and to
highlight their relationships in use case diagrams.
The approach is based on the production of a graph,
named Transition Graph, depicting the Web pages the
users navigated along a set of user sessions. Some criteria
are defined and applied to analyse this graph and deduce
use cases.
In the past, approaches for the abstraction of sequence
diagrams and interaction scenarios have been proposed
for traditional applications. Kollmann et al. [11] compared
the (static) reverse-engineering capabilities of the existing
commercial UML tools. T. Systä [16] presented a tool for
abstracting scenarios from traces obtained from
debugging Java bytecode. Similarly, Richner and Ducasse
[15] used dynamic information for recovering
collaboration diagrams and roles. Tonella and Potrich
[18] presented an approach, based on static flow analysis,
to extract interaction diagrams. Briand et al. [3] presented
an approach for reverse engineering sequence diagrams
from execution traces. They also present a survey on the
existing techniques, highlighting their pros and cons. El
Ramly et al. [9] presented an approach to recover use
cases from execution traces for the purpose of
reengineering legacy systems. Use case extraction was
performed by detecting patterns over sequences of
screens. Similarities can be found here when detecting
sequences of pages, even if, as shown in Section 3,
peculiarities such as the need for clustering similar
generated pages emerge when analyzing WAs.
The paper is organized as follows. After a review of
the related work in Section 2, Section 3 describes the
proposed approach. Section 4 demonstrates the feasibility of the approach
on a case study. Finally, Section 5 concludes
and outlines the directions for future work.
2. Related Work
The lack of a disciplined development process for WAs
has introduced the need for suitable reverse engineering
approaches. Antoniol et al. [1] proposed an approach,
based on the Relational Management Methodology
(RMM), to recover web site architectures. Ricca and
Tonella developed the ReWeb tool to analyze web sites
[12, 13, 14]. In particular, they extended to WAs
traditional static flow analyses such as reachability,
dominance, and data flow analysis. Ricca and Tonella
also proposed to enhance the analyses considering
dynamic information [17]. We agree with their statement:
the abstraction of use case diagrams relies, in the present
paper, on dynamic information extracted from WA
execution. However, while ReWeb obtains dynamic
information from web server logs, we obtain dynamic
information by instrumenting the WA in order to capture
data that are not available from server logs,
such as data stored into or read from a database. The first
approach does not require page instrumentation; however,
its fact extraction capability is limited. To obtain
information such as variables passed between pages, or
database or file accesses, instrumentation is necessary.
Antoniol et al. [2] proposed a tool, named WANDA,
for WA dynamic analysis. The tool enables a fine-grained
level dynamic analysis of WAs under execution. This
permits the extraction of extended UML diagrams, using
stereotypes and tagged values, with information such as
the frequency of traversing a link, the percentage of read
or write operations performed on a file, the type of
operations performed on databases, the interaction with
Web Services. Also, by analyzing session variables,
cookies and variables passed by the GET or POST
method, the tool permits the identification of the data
flow between pages. The information extracted by
WANDA relies on a metamodel of a WA, and is stored
into a database designed according to such metamodel. A
similar metamodel is used in the present paper as a
baseline to extract UML documentation from WAs.
3. The Abstraction Process
Abstracting the behaviour model of a WA based on static
analysis of its source code may produce incomplete and
imprecise results, due to the technologies currently
adopted to implement WAs (such as JSP, PHP, and so on).
Dynamic analysis represents a viable solution to
overcome static analysis limitations and to support an
effective abstraction of diagrams describing the WA
behaviour.
In this section, a reverse engineering approach based
on dynamic analysis will be presented to abstract use case
diagrams of an existing WA. The approach exploits the
information that is recorded during the executions of an
instrumented WA.
To this aim a WA is modelled as a set of web pages
that a user can access along a working session. An
accessed page may be a Server Page or a Client Page, by
which the user interacts with the WA. A Client Page may
be a Static Client Page (its content is fixed and stored
permanently) or a dynamically Built Client Page (its
content may vary over time, and it is generated on the fly as the output of an execution of a server page). A
Transition is a pair of sequentially visited pages, obtained by
navigating a link from a Starting Page to a Target Page.
A Transition is due to different types of relationships
between pages (Hyperlink, Form Submission, Build,
Redirection, Inclusion). A Transition may be
characterised by a set of Parameters passed from the
Starting page to the Target one. It is worthwhile to note
that we consider just the transitions corresponding to links
between two pages actually implemented in the WA, i.e.
we do not consider the transitions a user makes by using
the forward/back browser buttons or by directly typing
the URL of a page in the browser command line.
However, this does not affect the effectiveness of the
proposed analysis because, when a user goes back or forward
to a page already accessed before, the WA will show the
same behaviour as before.
Thus, each WA execution is modelled by a User
Session Trace, i.e. a sequence of Web Pages accessed by
a user along her/his working session. All the User
Session Traces may be collected in an Execution Trace,
representing all the executions of the WA.
The class diagram in Figure 1 models this view of a
WA; this model is based on the one proposed in [8].
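The model described above can be sketched as a handful of data types. This is only an illustration of the concepts named in the text (page kinds, transition kinds, parameters, user session traces, execution trace); the class and field names are assumptions made for the example and are not taken from WANDA or from [8].

```python
from dataclasses import dataclass, field
from enum import Enum


class PageKind(Enum):
    SERVER = "Server Page"
    STATIC_CLIENT = "Static Client Page"
    BUILT_CLIENT = "Built Client Page"


class TransitionKind(Enum):
    HYPERLINK = "Hyperlink"
    FORM_SUBMISSION = "Form Submission"
    BUILD = "Build"
    REDIRECTION = "Redirection"
    INCLUSION = "Inclusion"


@dataclass(frozen=True)
class WebPage:
    page_id: str      # hypothetical identifier, e.g. "SP1"
    kind: PageKind


@dataclass(frozen=True)
class Transition:
    start: WebPage            # Starting Page
    target: WebPage           # Target Page
    kind: TransitionKind
    parameters: tuple = ()    # names of the parameters passed to the target


@dataclass
class UserSessionTrace:
    transitions: list = field(default_factory=list)

    def pages(self):
        """Return the pages in navigation order, derived from the transitions."""
        if not self.transitions:
            return []
        return [self.transitions[0].start] + [t.target for t in self.transitions]


@dataclass
class ExecutionTrace:
    # All the user session traces recorded for the WA.
    sessions: list = field(default_factory=list)
```

Storing a session as a list of transitions (rather than a bare list of pages) keeps the link type and the passed parameters available to later analysis steps.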
The complete set of the execution traces can be
represented by a directed graph, named Transition Graph,
where each node represents a web page and a directed
edge between two pages represents a transition from the
starting page to the target one a user made along a
session.
[Class diagram: Web Page (Server Page; Client Page, specialised into Static Client Page and Built Client Page) and Transition (Hyperlink, Form Submission, Build, Redirection, Inclusion), linked by starting page and target page roles]
Figure 1: The WA model
Appropriate criteria have been defined and have been
applied to the Transition Graph for identifying subsets of
Web Pages potentially implementing different Use Cases
of the system and suggesting possible relationships
between use cases.
The abstraction process proposed in this paper
includes the following steps:
1) WA Instrumentation
2) Execution of the Instrumented WA
3) Identification and Grouping of Equivalent Built
Client Pages
4) Generation of the Transition Graph
5) Use Case Diagram abstraction
a. Clustering of the Transition Graph
b. Abstracting Use Cases and their
relationships
Figure 2 depicts the proposed process, whose steps
will be described in the remainder of this section.
3.1 Web Application Instrumentation
The instrumentation of the WA is obtained by using the
tool WANDA [2] that automatically instruments the code
of a WA by inserting probes able to identify relevant
information, such as which pages a user accessed, which
transitions he/she activated, which parameters were
involved in the activated transition, which database or file
was accessed. This information is stored in a repository.
Moreover, WANDA also stores in the repository the
HTML source code of each Built Client Page that is
generated by the server pages of the application as a
response to a user request.
[Process diagram. Activities: WA Instrumentation, WA Execution, Cloned Built Pages Identifier, Transition Graph Generation and Analysis (Pruning Backward Transitions, Clustering), Use Case Diagram Abstraction (Use Case Identification, Relationship Identification), Validation by a Human Expert. Artifacts: WA, Instrumented WA, Trace Repository (Execution Trace, User Session Traces, Built Client Pages), Classes of equivalent Built Pages, Clustered Transition Graph, Use Case Diagram, Validated Diagrams]
Figure 2: The Abstraction Process
control and data component. Pages with the same control
component, but different data component, can be
considered as equivalent pages, belonging to a same
equivalence class. Of course, all the pages included in the
same class exhibit the same behaviour; thus we can
reduce the comprehension effort because we shall analyse
just a page for each class and not all of them.
The set of BCPs generated from a server page is
analysed to identify groups of equivalent pages. An
equivalence class will be defined for each group and a
single equivalent page will be used to represent each
group of equivalent Built Client Pages of a given Server
Page.
The identification of clusters of equivalent BCPs will
be obtained by exploiting the clone detection techniques
proposed in [5]. These approaches identify as clones
groups of similar pages according to a Levenshtein
distance over structural information.
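To make the idea of a Levenshtein distance over structural information concrete, the following sketch compares two pages by the sequence of their HTML tags, ignoring the textual data. It is not the technique of [5], whose details are not reproduced here: representing a page by its tag sequence, and the function names `tag_sequence` and `levenshtein`, are assumptions made for illustration.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect opening and closing tag names, discarding text and attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_endtag(self, tag):
        self.tags.append("/" + tag)


def tag_sequence(html):
    """Return the list of tags of a page (assumed structural representation)."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]
```

Two Built Client Pages that differ only in their data component yield identical tag sequences and therefore a distance of zero; under this assumed representation they would fall into the same equivalence class. Any threshold for "similar enough" pages would have to be chosen by the analyst.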
It is not possible to know a-priori the complete set of
client pages that a server page will be able to build at runtime. However, the classification of observed Built Client
Pages in different groups of equivalent pages is possible.
A method for grouping these pages will be defined in
section 3.3. Therefore, each server page will build client
pages belonging to a finite set of Built Client Page
Equivalence Classes.
3.2 Execution of the Instrumented Web Application
To collect execution traces useful for an effective
extraction of use case models, the instrumented WA
needs to be executed in a real usage environment. This
allows the collection of information about the interaction
of users with the WA, by storing into a repository the
information 'captured' by the probes.
3.3 Identification and Grouping of Equivalent
Built Client Pages
3.4 Generation of the Transition Graph
Once a significant set of user session traces has been
obtained by executing the instrumented WA, and after the
clone detection has been able to detect equivalent BCPs
from these traces, the next step of the abstraction process
requires that a graph representing all web pages reached
during the navigations and all transitions from a page to a
successive one is produced. Such a graph is called
Transition Graph, TG(N, E), where N is a sub-set of the
WA pages, and E is the set of edges associated to the
transitions between consecutive pages.
In general, the layout of Built Client Pages (BCP)
resulting from different executions of a server page will
be different, depending on the input data provided by the
users. In particular, Built Client Pages will differ either
for their control component (i.e., the set of items - such as
the HTML code and scripts - determining the page layout,
business rule processing, and event management) or for
the data component (i.e., the set of items - such as text,
images, multimedia objects - determining the information
to be read/displayed from/to a user), or for both the
The TG can be obtained by analysing the available
execution traces, and collecting all pages and transitions
into a graph. The TG is generated by the three-step
process described in the following sub-sections.
3.4.2 Unification of the Trace Graphs
The Transition Graph is produced by merging the Trace
Graphs. Supposing that m Trace Graphs have been
generated, where Ti(Ni, Ei) is the generic i-th graph and
BT is the set of identified Backward Transitions, the
Transition Graph (N, E) will be defined as follows:
3.4.1 Analysis of the User Session Traces for
Identifying and Pruning ‘Backward Transitions’
Each user session trace can be represented by an oriented
graph, called Trace Graph, Ti(Ni, Ei) where Ni is the set
of pages included in this trace and Ei is the set of edges
representing the transitions between a pair of
consecutively navigated pages.
This graph may contain cycles, since a trace may
include ‘Backward Transitions’, i.e. transitions
representing the user navigation from a page to another
one that she/he had already accessed during the
navigation (see footnote 1). These transitions are not meaningful for our
purposes, since they do not indicate the activation of any
new WA behaviour. Therefore, for each Trace Graph,
edges associated with backward transitions will be
detected and pruned.
A possible process for detecting Backward Transitions
in a trace analyses the Trace Graph edges in navigation order
and stores the visited nodes in a list.
If an edge reaches a node already included in the list, then
this edge is classified as a Backward Transition and
removed from the trace, which
is thus divided into two separate sub-sequences. The
remaining part of the trace is then analysed independently
of the previous part.
As an example, let’s consider the following trace
(each letter represents a page, each arrow represents a
transition):
a→b→c→d→a→c→g→c
Figure 3 (a) shows the corresponding Trace
Graph.
Analysing this trace, the transition d→a will be
identified as a backward transition because it reaches the
already visited page a. After this identification, the
transition d→a is removed and the trace is split into
the two separate sub-sequences a→b→c→d and
a→c→g→c. The remaining sub-sequence (a→c→g→c)
is analysed independently of a→b→c→d, and
therefore only the transition g→c will be identified as a
backward transition. Figure 3 (b) shows the final sub-sequences pruned of the backward transitions.
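The detection process just described can be sketched as follows. This is a minimal illustration, assuming a trace is given as a sequence of page identifiers in navigation order; the function name `prune_backward_transitions` is hypothetical and the code is not part of the authors' tooling.

```python
def prune_backward_transitions(trace):
    """Split the edges of a trace into kept edges and Backward Transitions.

    `trace` is a sequence of page identifiers in navigation order.
    Returns (kept_edges, backward_edges), both as lists of (start, target).
    """
    kept_edges = []
    backward_edges = []
    visited = []      # pages seen in the sub-sequence currently analysed
    previous = None
    for page in trace:
        if previous is None:
            visited = [page]
        elif page in visited:
            # The edge reaches an already visited page: prune it and
            # analyse the rest of the trace independently, from `page`.
            backward_edges.append((previous, page))
            visited = [page]
        else:
            kept_edges.append((previous, page))
            visited.append(page)
        previous = page
    return kept_edges, backward_edges


kept, backward = prune_backward_transitions("abcdacgc")
print(kept)      # edges of the sub-sequences a→b→c→d and a→c→g
print(backward)  # the pruned transitions d→a and g→c
```

Resetting the visited list at each pruned edge is what makes the remaining part of the trace independent of the previous part, so that a→c is kept in the example even though both pages appeared earlier.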
N = N1 ∪ N2 ∪ ... ∪ Nm
E = (E1 ∪ E2 ∪ ... ∪ Em) − BT
Figure 3 (c) shows the Transition Graph resulting
from the unification of the Trace Graph sub-sequences in
figure 3 (b).
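The unification of the Trace Graphs into the Transition Graph is a union of node sets and edge sets, minus the Backward Transitions. A minimal sketch, assuming each Trace Graph is a pair of Python sets and `backward_transitions` is the set BT (the function name is hypothetical):

```python
def unify_trace_graphs(trace_graphs, backward_transitions):
    """Merge Trace Graphs (Ni, Ei) into the Transition Graph (N, E).

    N is the union of all Ni; E is the union of all Ei minus BT.
    """
    nodes, edges = set(), set()
    for trace_nodes, trace_edges in trace_graphs:
        nodes |= set(trace_nodes)
        edges |= set(trace_edges)
    return nodes, edges - set(backward_transitions)


# Trace Graph of the example trace a→b→c→d→a→c→g→c
t1 = ({"a", "b", "c", "d", "g"},
      {("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"),
       ("a", "c"), ("c", "g"), ("g", "c")})
bt = {("d", "a"), ("g", "c")}
nodes, edges = unify_trace_graphs([t1], bt)
```

Because the definition subtracts BT from the whole union, an edge classified as backward in one trace is removed from E even if another trace contains the same edge.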
[Figure 3: Examples of graphs: a) Trace Graph, b) Pruned Trace Graph, c) Transition Graph]
3.5 Use Case Diagram abstraction
The final step of the abstraction process consists of the
identification of use cases and their relationships. This
step will be carried out on the basis of the following
assumptions.
Usually, in a WA a use case is implemented by a set
of pages that interact through the links existing among
them. In an execution trace the execution of a use case
will correspond to a sequence of linked pages. In the
Transition Graph (TG) such a sequence will correspond to
a TG sub-path made up by nodes, each of which is
characterised just by one entering edge and just one
leaving edge. Moreover, web pages associated with TG
nodes having more than one leaving edge usually
correspond either to client pages allowing a user to
choose among several actions/functions, or to server
Footnote 1: These backward transitions can be associated with
backward connections between web pages
that implement shortcuts among pages, such as
anchors towards the home page, menu pages, or pages navigated
previously.
generated, with clusters including a growing number of
nodes. Depending on the desired granularity level of the
clusters included in the CTG, the software engineer
carrying out the analysis will decide when the clustering
process will have to be stopped. Therefore, the final CTG
will be submitted to the next step of the abstraction
process.
pages activating different actions/functions according to a
user input. Finally, TG nodes with more than one entering
edge may be associated to pages (client or server)
implementing a common behaviour, included by all the
pages belonging to TG sub-paths reaching those nodes.
On the basis of these considerations, the TG is
analysed in order to identify groups of linked nodes
composing notable sub-graphs, and group these nodes
into clusters. After the clustering activity, a number of
heuristics are used to define the WA use cases and
possible relationships between them.
3.5.2 Abstracting Use Cases and their relationships
The use cases of the WA will be deduced from the CTG,
according to the rule that associates each cluster of the
CTG to a use case.
Moreover, analysing the composition of the clusters,
the existence of alternative use case scenarios, or possible
‘extend’ or ‘include’ relationships between use cases will
be suggested.
In particular, the following rules will be applied for
proposing the existence of relationships between clusters:
- any CTG cluster obtained by the b) rule is a
candidate to implement a use case that is likely
to be extended by other use cases, associated
with clusters whose nodes are reached from the
Fork node;
- any CTG cluster obtained by the c) rule is a
candidate to implement a use case that is
included in other use cases: the including use
cases are those associated with clusters whose
nodes reach the Join node;
- any CTG cluster obtained by the e) heuristic
rule is a candidate to implement a use case
showing more than one interaction scenario.
3.5.1 Clustering of the Transition Graph
Clustering of the TG requires that all TG nodes be
classified (and labelled) according to the number of edges
entering and leaving them:
- Groupable (G) nodes: each node with just one
entering edge and just one edge leaving it.
- Fork (F) nodes: each node with just one entering
edge and with more than one edge leaving it.
- Join (J) nodes: each node with more than one
entering edge and with just one edge leaving it.
- Join/Fork (N) nodes: each node with more than
one entering edge and with more than one
leaving edge.
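The classification above depends only on the in-degree and out-degree of each node, as the following sketch shows. One point is an assumption of the sketch, not of the paper: the definitions speak of "just one" entering or leaving edge and do not cover nodes with no entering edge (the first page of a session) or no leaving edge (the last page); here "at most one" is treated like "just one". The function name `classify_nodes` is hypothetical.

```python
from collections import Counter


def classify_nodes(nodes, edges):
    """Label each TG node as G, F, J or N from its in/out degree.

    `edges` is a set of distinct (start, target) pairs.
    Assumption: degree 0 is treated like degree 1.
    """
    in_degree = Counter(target for _, target in edges)
    out_degree = Counter(start for start, _ in edges)
    labels = {}
    for node in nodes:
        many_in = in_degree[node] > 1
        many_out = out_degree[node] > 1
        if many_in and many_out:
            labels[node] = "N"   # Join/Fork
        elif many_in:
            labels[node] = "J"   # Join
        elif many_out:
            labels[node] = "F"   # Fork
        else:
            labels[node] = "G"   # Groupable
    return labels


# Transition Graph of the example trace after pruning
tg_nodes = {"a", "b", "c", "d", "g"}
tg_edges = {("a", "b"), ("b", "c"), ("c", "d"), ("a", "c"), ("c", "g")}
labels = classify_nodes(tg_nodes, tg_edges)
```

On this small graph, page a has two leaving edges and page c has two entering and two leaving edges, so they are labelled F and N respectively, while b, d and g are labelled G.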
The following heuristics are used to carry out the TG
clustering:
a) two or more consecutively linked G nodes (i.e.
a sequence of two or more G nodes) will be
clustered together;
b) an F node will be clustered with the G node (or
sequence of G nodes) reaching it;
c) a J node will be clustered with the G node (or
sequence of G nodes) it reaches;
d) a J node will be clustered with an F node it
reaches; and
e) an F node will be clustered with the G nodes it
reaches if all the edges leaving the F node
reach G nodes.
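As an illustration of how such heuristics can be applied, the sketch below implements only rule a): it groups G nodes that are consecutively linked. It assumes the nodes have already been labelled (a dictionary from node to "G", "F", "J" or "N") and uses a small union-find structure; the function name `cluster_g_chains` is hypothetical and rules b) to e) are not covered.

```python
def cluster_g_chains(edges, labels):
    """Rule a): cluster sequences of two or more consecutively linked G nodes.

    `edges` is an iterable of (start, target) pairs; `labels` maps each
    node to its class. Returns a list of sets, one per cluster.
    """
    parent = {node: node for node, label in labels.items() if label == "G"}

    def find(node):
        # Follow parent links to the representative, compressing the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for start, target in edges:
        if labels.get(start) == "G" and labels.get(target) == "G":
            parent[find(start)] = find(target)

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    # A lone G node is not a sequence of two or more nodes.
    clusters = [group for group in groups.values() if len(group) >= 2]
    return sorted(clusters, key=sorted)
```

For example, with edges f→x, x→y, y→z and f→w, where f is an F node and the others are G nodes, the only cluster produced is {x, y, z}; w stays on its own because no other G node is linked to it.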
A validation of the proposed use cases and their
relationships will have to be carried out by analysing the
semantics of each web page included in the involved
clusters.
3.5.3 Associating Actors to Use Cases
Actors will be associated with each use case corresponding
to a cluster that includes at least one client page. However, to
keep the resulting use case diagram readable, only the actors
associated with base use cases are drawn in the diagram. The
type of each actor has to be defined by the software
engineer.
Once these clustering rules have been applied to
the TG, each group of clustered nodes is replaced by
a single node representing that cluster. This new node
inherits the edges that reach or leave the cluster nodes and
that are connected to at least one node not included
in the cluster.
Consequently, the set of Transition Graph nodes and
edges will change and a new graph, called Clustered
Transition Graph (CTG), will be obtained.
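Replacing a cluster with a single node can be sketched as an edge-rewriting step. The function below is an illustration under stated assumptions: edges are (start, target) pairs, `cluster` is the set of nodes to collapse, and `name` is a label chosen by the analyst for the new node; none of these names come from the paper.

```python
def collapse_cluster(edges, cluster, name):
    """Replace the nodes in `cluster` with the single node `name`.

    Edges internal to the cluster disappear; edges crossing the cluster
    boundary are inherited by the new node.
    """
    new_edges = set()
    for start, target in edges:
        if start in cluster and target in cluster:
            continue  # internal edge: not visible in the CTG
        new_start = name if start in cluster else start
        new_target = name if target in cluster else target
        new_edges.add((new_start, new_target))
    return new_edges


tg_edges = {("f", "x"), ("x", "y"), ("y", "z"), ("z", "j")}
ctg_edges = collapse_cluster(tg_edges, {"x", "y", "z"}, "C1")
```

Applying the node classification and the clustering heuristics again to the resulting edge set is what gives the iterative process described next.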
This new graph needs to be analysed in order to detect
if there are new groups of nodes that can be clustered
together, and collapsed into new single nodes.
The rules can be applied iteratively on the Clustered
Transition Graph while they are able to group nodes on
this graph. In this way, a hierarchy of CTGs can be
4. Case Study
To validate the proposed approach, a case study aiming at
assessing its effectiveness has been carried out on some
small/medium sized WAs. In the following, the results
obtained by applying the approach on a small WA will be
reported.
The Web Application under analysis allows users to
make predictions about some sport events (such as
football matches); the Player who made the greatest
number of right predictions wins the game. User
registration is required to participate in the game. An
Administrator inputs the results of the considered sporting
events; according to the inputted results and the
predictions made, each player is assigned a score, and a
ranking of the players is computed.
The WA consists of 11 server pages and 2 static client
pages; a database is used to record predictions and
results. It is implemented using the ASP and JavaScript
scripting languages.
The Web Application was instrumented by the
WANDA tool and 1587 user session traces of the
instrumented version of the Web Application were
recorded and stored in the repository.
Each of the 11 server pages generated several Built
Client Pages (BCPs), that were stored in the repository,
too. These BCPs were analysed using clone detection
techniques and for each server page the groups of
equivalent BCPs were collected in equivalence classes. In
total 20 equivalence classes were identified.
Table 1 reports the identifiers and the filenames of the
static pages of the application and, in the third column,
the identifiers of the equivalence classes of BCPs each
server page generates. Conventionally, Server Pages are
labelled as SPxx, Client Pages as CPxx and Built Client
Pages as BCPxx.
The TG nodes were classified as:
- 20 groupable nodes (G);
- 11 fork nodes (F);
- 2 join nodes (J).
Figure 4 reports the Transition Graph. In this figure, F
nodes are depicted with diamonds, J nodes with ellipses,
N with circles and G nodes with boxes. In the figure,
boxes are drawn around the nodes to show the clusters
generated at each step of the iterative application of the
clustering rules; each box has a tag Cx, where x is a
number indicating the step where the cluster was
generated. Thus, the inner boxes show the clusters generated in
the first steps of the clustering process. The final
Clustered Transition Graph presented 8 singleton clusters
(i.e. clusters made up of just one page) and 7 clusters with
more than one node.
Each cluster in the Clustered Transition Graph was
associated to a candidate use case.
The use cases were submitted to a validation process
carried out by a software engineer who had no knowledge
of the application. He was able to assign a concept to each
of the candidate use cases, i.e. all the clusters corresponded to
valid use cases. He also defined two types of Actors: the
Player and the Administrator.
The user session traces were analysed and nine
Backward Transitions were identified. The analysis of the
WA executions confirmed that they were actual
Backward Transitions.
The generated Transition Graph included 33 pages
and 35 transitions among them.
Table 2 reports the list of clusters (the cluster IDs
correspond to the ones reported in Figure 4 near the larger
box delimiting the clusters, or the page identifier for
singleton clusters) and the concepts assigned to each
corresponding use case.
Table 1: WA pages
Page ID | Filename | Equivalence Classes of Built Client Pages
CP13 | /login.htm |
CP21 | /nuovo.htm |
SP9 | /class.asp |
SP11 | /insscomm.asp | BCP12.1, BCP12.2, BCP12.3
SP4 | /menu.asp | BCP15.1, BCP15.2
SP3 | /menuadm.asp | BCP18.1, BCP18.2
SP1 | /accept.asp | BCP2.1, BCP2.2, BCP2.3
SP19 | /nuovo.asp | BCP20.1, BCP20.2
SP17 | /risult.asp | BCP22
SP16 | /scomm.asp | BCP23.1, BCP23.2
SP5 | /adminsr.asp | BCP6.1, BCP6.2
SP7 | /admris.asp | BCP8.1, BCP8.2
SP14 | /logout.asp | BCP10

Table 2: Abstracted Use Cases
Cluster ID | Use case Description
C4 | View Ranking
C5 | Registration
C8 | Insert Result
C9 | Input the Predictions
C10 | View Results
C11 | Validate Player
C12 | Validate Admin
SP1 | Check Login
SP14 | Logout
BCP2.3 | Access Denied
BCP15.1 | Player Menu
BCP15.2 | Access Denied
BCP18.1 | Admin Menu
BCP18.2 | Access Denied
CP13 | Home Page
The relationships among the validated use cases were
defined according to the guidelines described in section 3.5.
In two cases, the relationships proposed by the
heuristic were refused; in both cases an include
relationship was proposed while an extend one was
considered more suitable by the software engineer. In
another case, both include and extend relationships were
proposed, and the extend one was chosen in this case as
well.
Figure 5 shows the resulting Use Case diagram. It can
be observed that only extend relationships among the use
cases exist. This is because the fork nodes correspond to
pages whose behaviour is conditioned by the selections
the users may make in the client pages. To preserve the
readability of the diagram, the two actors
Player and Administrator are not reported in Figure 5.
Figure 5 also highlights that there is more than one
use case named 'Access Denied'. These use cases
correspond to the generation of a BCP when a user is not
recognised as a registered one by the different server pages
performing this check. In this case, a reengineering
intervention could be suggested to encapsulate all the user
checking operations in just one page.
In future work, a wider experimentation involving
larger WAs will be carried out with the aim of
assessing the scalability of the approach. It will also be
interesting to investigate how developers benefit from
these diagrams when maintaining/evolving their WAs.
References
[1] G. Antoniol, G. Canfora, G. Casazza, and A. De Lucia,
“Web site reengineering using RMM,” in Proceedings of
International Workshop on Web Site Evolution, Zurich,
Switzerland, March 2000, pp. 9-16.
[2] G. Antoniol, M. Di Penta, and M. Zazzara, “Understanding
Web Applications through Dynamic Analysis,” in
Proceedings of the 12th International Workshop on Program
Comprehension, 24-26 June 2004, Bari, Italy, pp. 120-129.
[3] L. Briand, Y. Labiche, and Y. Miao, “Towards the reverse
engineering of UML sequence diagrams,” in Proceedings of
10th IEEE Working Conference on Reverse Engineering,
WCRE 2003, 13-16 November 2003, Victoria, British
Columbia, Canada, pp. 57-66.
[4] J. Conallen, Building Web Applications with UML (2nd
Edition). Addison-Wesley Publishing Company, 2002.
5. Conclusions
In this paper dynamic analysis has been proposed for
abstracting UML Use Case Diagrams from WAs. These
diagrams, together with other extracted documentation,
constitute an important support for evolving WAs.
Indeed, the knowledge of WA pages implementing use
cases boundaries makes WA maintenance and evolution
easier.
The approach first models the recorded WA
executions by a graph called Transition Graph; then this
graph is analysed and clusters of nodes are defined. Each
cluster is associated to a candidate use case. The use cases
are arranged in a use case diagram where the relationships
among the use cases are defined according to some
heuristics. Actors are defined according to the semantic of
each use case and associated to base use case.
Results obtained abstracting use case diagrams from
medium size WAs showed that the dynamic information
collected via WAs instrumentation allows to precisely
identify the set of WA use cases. As expected, the
analysis of a higher number of user session traces
improved the meaningfulness of the abstracted diagrams.
The proposed approach allows the comprehension
effort needed to evolve a web application to be reduced
sensibly. Indeed it provides an automated support to
identify the groups of web pages responsible of the use
cases the application implements, and that will easy the
identification of the pages impacted by the changes an
evolutionary operation requires.
[5] G. A. Di Lucca, M. Di Penta, and A. R. Fasolino, “An
approach to identify duplicated web pages”, in Proceedings
of the 26th Annual International Computer Software and
Applications Conference, COMPSAC 2002, 26-29 August
2002, Oxford, England, UK, pp. 481-486
[6] G. A. Di Lucca, A. Fasolino, P. Tramontana, and U. De
Carlini, “Abstracting business level UML diagrams from
web applications”, in Proceedings of 5th IEEE International
Workshop on Web Site Evolution, WSE 2003, 22
September 2003, Amsterdam, The Netherlands, pp. 12-19
[7] G. A. Di Lucca, A. Fasolino, P. Tramontana, and U. De
Carlini, “Recovering a business object model from web
applications”, in Proceedings of 26th Annual Conference on
Computer and Software Applications, COMPSAC 2003, 3-6
November 2003, Dallas, Texas, USA, pp. 348-353
[8] G. A. Di Lucca, A. R. Fasolino, and P. Tramontana,
“Reverse Engineering Web Application: the WARE
approach”, Journal of Software Maintenance and Evolution:
Research and Practice (Wiley), Volume 16, Issue 1-2, 2004,
pp. 71-101
[9] M. El-Ramly, E. Stroulia, and P. Sorenson, “Mining
System-User Interaction Traces for Use Case Models”, in
Proceedings of the 10th IEEE International Workshop on
Program Comprehension, IWPC 2002, 26-29 June 2002,
Paris, France, pp. 21-29
Proceedings of the 2005 Eighth International Workshop on Principles of Software Evolution (IWPSE’05)
1550-4077/05 $20.00 © 2005
IEEE
Figure 4: The Clustered Transition Graph of the WA from the case study
Figure 5: The Use Case Diagram of the WA from the case study
[10] D. Heuzeroth, T. Holl, G. Högström, and W. Löwe,
“Automatic design pattern detection”, in Proceedings of
the 11th IEEE International Workshop on Program
Comprehension, IWPC 2003, 10-11 May 2003,
Portland, Oregon, USA, pp. 94-103
[11] R. Kollmann, P. Selonen, E. Stroulia, T. Systä, and A.
Zundorf, “A Study on the Current State of the Art in
Tool-Supported UML-Based Static Reverse
Engineering”, in Proceedings of the 9th Working
Conference on Reverse Engineering, WCRE 2002, 29
October - 1 November 2002, Richmond, Virginia, USA,
pp. 22-32
[12] F. Ricca and P. Tonella, “Web site analysis: Structure
and evolution”, in Proceedings of IEEE International
Conference on Software Maintenance, ICSM 2000,
11-14 October 2000, San Jose, California, USA, pp. 76-85
[13] F. Ricca and P. Tonella, “Understanding and
restructuring web sites with ReWeb”, IEEE
Multimedia, vol. 8, pp. 40-51, Apr-Jun 2001.
[14] F. Ricca and P. Tonella, “Analysis and testing of web
applications”, in Proceedings of the International
Conference on Software Engineering, ICSE 2001,
12-19 May 2001, Toronto, Ontario, Canada, pp. 25-34
[15] T. Richner and S. Ducasse, “Using dynamic
information for the iterative recovery of collaborations
and roles”, in Proceedings of IEEE International
Conference on Software Maintenance, ICSM 2002, 3-6
October 2002, Montréal, Canada, pp. 34-43
[16] T. Systä, “On the relationships between static and
dynamic models in reverse engineering Java software”,
in Proceedings of 6th IEEE Working Conference on
Reverse Engineering, WCRE 1999, 6-8 October 1999,
Atlanta, Georgia, USA, pp. 304-313
[17] P. Tonella and F. Ricca, “Dynamic model extraction
and statistical analysis of web applications”, in
Proceedings of 4th IEEE International Workshop on
Web Site Evolution, WSE 2002, 2 October 2002,
Montréal, Canada, pp. 43-52
[18] P. Tonella and A. Potrich, “Reverse Engineering of
the Interaction Diagrams from C++ Code”, in
Proceedings of the International Conference on
Software Maintenance, ICSM 2003, 22-26 September
2003, Amsterdam, The Netherlands, pp. 159-168