OOSE 01/17
Institute of Computer Science and Information Engineering,
National Cheng Kung University
Members: Q76001074 薛弘志
P76014020 蔡文豪
F74982155 周詩御
Reference
Prasad, A.V.K. and Ramakrishna, S. (2010b), ‘Data
Mining for Secure Software Engineering – Source
Code Management Tool Case Study’, International
Journal of Engineering Science and Technology,
vol. 2(7), pp. 2667-2677.
Introduction
To improve software productivity and quality, software
engineers are increasingly applying data mining algorithms to
various software engineering tasks.
However, mining software engineering data poses several
challenges, requiring various algorithms to effectively mine
sequences, graphs, and text from such data.
Using well-established data mining techniques, practitioners
and researchers can explore the potential of this valuable
data to better manage their projects and to produce
higher-quality software systems that are delivered on time
and within budget.
Introduction (cont.)
Mining algorithms for software engineering fall into four
main categories:
1. Frequent pattern mining – finding commonly occurring
patterns.
2. Pattern matching – finding data instances for given
patterns.
3. Clustering – grouping data into clusters.
4. Classification – predicting labels of data based on
already labeled data.
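As a rough illustration of the first category, the sketch below counts which pairs of files are changed together often enough to be called a frequent pattern. It is not taken from the referenced paper: the function name `frequent_pairs`, the `min_support` threshold, and the commit data are all assumptions made for this example, and real frequent pattern miners handle itemsets of any size.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs that occur together in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        # Deduplicate and sort so each unordered pair is counted once per transaction.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical commit history: each entry lists the files changed together.
commits = [
    ["scanner.cpp", "token.h"],
    ["scanner.cpp", "token.h", "parser.cpp"],
    ["parser.cpp", "format.cpp"],
    ["scanner.cpp", "token.h"],
]

print(frequent_pairs(commits, min_support=3))
```

With this made-up history, only the pair of `scanner.cpp` and `token.h` reaches the support threshold, which is the kind of co-change pattern the slide refers to.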
Introduction (cont.)
Software engineering data can be broadly categorized into:
1. Sequences, such as execution traces collected at runtime,
static traces extracted from source code, and co-changed
code locations.
2. Graphs, such as dynamic call graphs collected at runtime and
static call graphs extracted from source code.
3. Text, such as bug reports, e-mails, code comments, and
documentation.
Objectives
The objective of the research work is to propose strategic
data mining tools for program source code debugging,
which improve software reliability and quality.
Software engineers can start with either a problem-driven
approach (knowing which SE task to assist) or a data-driven
approach (knowing which SE data to mine), but in practice
they commonly adopt a mixture of the first two steps:
collecting data to mine and determining the SE tasks to assist.
The three remaining steps are, in order, preprocessing the
data, adopting a mining algorithm, and postprocessing and
applying the mining results.
Objectives (cont.)
Preprocessing data involves first extracting relevant data
from the raw SE data. This data is further processed by
cleaning and properly formatting it for the mining
algorithm.
The next step produces a mining algorithm and its
supporting tool, based on the mining requirements
derived in the first two steps.
The final step transforms the mining algorithm results
into an appropriate format required to assist the SE task.
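The three steps described above (preprocessing, mining, postprocessing) can be sketched as a small pipeline. This is only an illustration under stated assumptions: the record layout `<id>|<component>|<summary>`, the function names, and the bug report data are invented here, and a simple frequency count stands in for a real mining algorithm.

```python
from collections import Counter


def preprocess(raw_lines):
    """Extract the relevant field and clean it for the mining step."""
    cleaned = []
    for line in raw_lines:
        line = line.strip()
        if not line or line.startswith("#"):  # drop blank lines and comments
            continue
        # Assumed record layout: "<id>|<component>|<summary>"
        parts = line.split("|")
        if len(parts) != 3:
            continue  # skip malformed records
        cleaned.append(parts[1].strip().lower())
    return cleaned


def mine(components):
    """Stand-in mining algorithm: count reports per component."""
    return Counter(components)


def postprocess(counts, top=2):
    """Turn raw counts into a ranked list a maintainer can act on."""
    return [f"{name}: {n} reports" for name, n in counts.most_common(top)]


# Hypothetical raw SE data: exported bug reports.
raw_reports = [
    "# exported bug reports",
    "101|Parser|crash on nested templates",
    "102|scanner |wrong token after a shift operator",
    "",
    "103|parser|hang on unterminated comment",
    "malformed line",
    "104|Formatter|tabs not expanded",
    "105|parser|lost trailing brace",
    "106|Scanner|string literal split in two",
]

print(postprocess(mine(preprocess(raw_reports))))
```

Keeping the three stages as separate functions mirrors the separation in the slide: the mining step can be swapped for a different algorithm without touching extraction or reporting.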
Objectives (cont.)
Further, many such tools are general purpose and should
be adapted to assist the particular task at hand.
However, software engineering researchers may lack the
expertise to adapt mining algorithms, while data mining
researchers may lack the background to understand
mining requirements in the software engineering domain.
One promising way to reduce this gap is to foster close
collaboration between the software engineering
community (requirement providers) and the data mining
community (solution providers).
Implementations
The management of source code is one of the greatest
challenges facing programmers today. As programs
become larger and more complex, the need to organize
and manage source code increases.
The author’s motivation is to implement source code
maintenance routines that parse tokens from an ANSI
C++ file, format the file, extract header files, and colorize
the file.
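To make the "extract header files" routine concrete, here is a minimal sketch that lists the headers a C++ source file includes. It is not the author's implementation: the regular expression, the function name `extract_headers`, and the sample source are assumptions for illustration, and the sketch does not handle includes hidden inside block comments or produced by macros.

```python
import re

# Matches #include <name> and #include "name", allowing spaces around '#'.
INCLUDE_RE = re.compile(
    r'^[ \t]*#[ \t]*include[ \t]*([<"])([^>"\n]+)[>"]', re.MULTILINE
)


def extract_headers(source):
    """Return (header, is_system) pairs in the order they appear."""
    return [(m.group(2), m.group(1) == "<") for m in INCLUDE_RE.finditer(source)]


# Hypothetical C++ source text.
sample = '''#include <iostream>
#  include "token.h"
// #include "ignored.h"
int main() { return 0; }
'''

print(extract_headers(sample))
```

The boolean in each pair distinguishes angle-bracket includes from quoted ones, which is usually enough to separate library headers from project headers when tracking file dependencies.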
Implementations (cont.)
When files are shared among objects, it is difficult to
track which files depend on others.
A source code maintenance program can parse the
source code and produce documentation that describes
each class, its member variables, and its functions.
Maintaining structured code among team members is
extremely difficult and time consuming because
programmers must modify their individual styles.
A source code formatter offers a convenient solution to
this problem.
Implementations (cont.)
Code maintenance modules receive source code as
input, break the code down into tokens, and then output
them in a new format.
The utility is based on three class groups:
A scanner reads the code, breaks it down into
tokens, and returns them to the parser. It also
identifies the type of each token it returns.
The parser requests successive tokens from the scanner
and takes appropriate action before requesting the next
token. The action of the parser is to write out the token.
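The scanner and parser interaction described above can be sketched as follows. This is a simplified illustration, not the tool from the referenced paper: the `Token`, `Scanner`, and `Parser` names, the tiny keyword set, and the token kinds are assumptions, and the scanner ignores strings, comments, and multi-character operators.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    kind: str  # "keyword", "identifier", "number", or "symbol"
    text: str


KEYWORDS = {"int", "return", "if", "else", "while", "class"}  # illustrative subset

TOKEN_RE = re.compile(
    r"\s*(?:(?P<word>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<symbol>[^\s\w]))"
)


class Scanner:
    """Reads source text and returns one classified token per call."""

    def __init__(self, source):
        self.source = source
        self.pos = 0

    def next_token(self):
        match = TOKEN_RE.match(self.source, self.pos)
        if match is None:  # only whitespace (or nothing) is left
            return None
        self.pos = match.end()
        if match.group("word"):
            text = match.group("word")
            kind = "keyword" if text in KEYWORDS else "identifier"
            return Token(kind, text)
        if match.group("number"):
            return Token("number", match.group("number"))
        return Token("symbol", match.group("symbol"))


class Parser:
    """Requests successive tokens and writes each one out in a new format."""

    def __init__(self, scanner):
        self.scanner = scanner

    def run(self):
        out = []
        while (token := self.scanner.next_token()) is not None:
            out.append(token.text)  # the action here is simply to emit the token
        return " ".join(out)  # new format: exactly one space between tokens


print(Parser(Scanner("int main(){return 0;}")).run())
```

Because the scanner already labels each token with a kind, a formatter or colorizer could replace the single `out.append` line with per-kind handling without changing the scanning code.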
[Figure: Sequence diagram of the overall code maintenance process]
[Figure: Sample class diagram of the CToken hierarchy]
[Slide: Token classes derived from CToken]
[Slide: Valid formatting flags]
[Slide: Format strings for C++]
[Slide: Result]
Conclusion
Mining algorithms work on software engineering data such as
text, sequences, and graphs, and can improve software
engineering tasks such as programming, maintenance, bug
detection, and debugging.
The author implemented only the tool for source code
management, which is useful for code maintenance and
programming.
For a programmer, bug detection and debugging are more
important, so how to use mining algorithms to assist the
programmer remains future work.
Thank you for listening!