ISSN : 2454-9924
NOVEL AUTOMATIC TEST PACKET GENERATION
K. Kolapaneni Omsri Harika (1), S. Suresh (2)
(1) M.Tech Student, Sree Rama Institute of Technology and Science, Kuppenakuntla, Penuballi, Khammam, TS, India
(2) Asst. Prof., CSE Dept., Sree Rama Institute of Technology and Science, Kuppenakuntla, Penuballi, Khammam, TS, India
ABSTRACT:
Networks are getting larger and more complex, yet administrators rely on rudimentary tools to debug problems. We propose an automated and systematic approach for testing and debugging networks called "Automatic Test Packet Generation" (ATPG). ATPG reads router configurations and generates a device-independent model. The model is used to generate a minimum set of test packets to (minimally) exercise every link in the network or (maximally) exercise every rule in the network. Test packets are sent periodically, and detected failures trigger a separate mechanism to localize the fault. ATPG can detect both functional problems (e.g., an incorrect firewall rule) and performance problems (e.g., a congested queue). ATPG complements but goes beyond earlier work in static checking (which cannot detect liveness or performance faults) or fault localization (which only localizes faults given liveness results). We describe our prototype ATPG implementation and results on two real-world data sets: Stanford University's backbone network and Internet2. We find that a small number of test packets suffices to test all rules in these networks: for example, 4,000 packets can cover all rules in the Stanford backbone network, while 54 are enough to cover all links. Sending 4,000 test packets 10 times per second consumes less than 1% of link capacity. ATPG code and the data sets are publicly available.

Index Terms—Data plane analysis, network troubleshooting, test packet generation.

INTRODUCTION:
It is notoriously hard to debug networks. Every day, network engineers wrestle with router misconfigurations, fiber cuts, faulty interfaces, mislabeled cables, software bugs, intermittent links, and a myriad of other reasons that cause networks to misbehave or fail completely. Network engineers hunt down bugs using the most rudimentary tools (e.g., SNMP) and track down root causes using a combination of accrued wisdom and intuition. Debugging is only becoming harder as networks are getting bigger: modern data centers may contain 10,000 switches, a campus network may serve 50,000 users, and long-haul links now run at 100 Gb/s.

Feature subset selection algorithms for machine learning applications can be divided into four broad categories: the Embedded, Wrapper, Filter, and Hybrid approaches. The embedded methods incorporate feature selection as a part of the training process and are usually specific to given learning algorithms, and therefore may be more efficient than the other three categories. Traditional machine learning algorithms like decision trees or artificial neural networks are examples of embedded approaches. The wrapper methods use the predictive accuracy of a predetermined learning algorithm to determine the goodness of the selected subsets, so the accuracy of the learning algorithms is usually high.
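The covering step described in the abstract — choosing a minimum set of test packets that exercises every link — is an instance of Min-Set-Cover, which is NP-hard but well approximated by a greedy heuristic. The following is a minimal sketch; the topology, packet names, and link sets are invented for illustration and are not data from the paper:

```python
# Greedy Min-Set-Cover sketch for choosing test packets.
# Each candidate packet is mapped to the set of links its path traverses;
# we repeatedly pick the packet covering the most still-uncovered links.

def greedy_test_packet_cover(candidates, links):
    """candidates: dict packet_id -> set of links traversed.
    links: set of all links that must be exercised.
    Returns a small (not necessarily optimal) packet list plus any
    links no candidate can reach."""
    uncovered = set(links)
    chosen = []
    while uncovered:
        best = max(candidates, key=lambda p: len(candidates[p] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:  # remaining links unreachable by any candidate
            break
        chosen.append(best)
        uncovered -= gain
    return chosen, uncovered

# toy topology: packets and the links they exercise (illustrative only)
candidates = {
    "p1": {"A-B", "B-C"},
    "p2": {"B-C", "C-D"},
    "p3": {"A-B", "C-D", "D-E"},
}
links = {"A-B", "B-C", "C-D", "D-E"}
chosen, missed = greedy_test_packet_cover(candidates, links)
print(chosen, missed)
```

The greedy rule gives the classical ln(n) approximation guarantee, which is why a small packet set (4,000 for all Stanford rules) is achievable in practice.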
However, the generality of the selected features is limited and the computational complexity is large. The filter methods are independent of learning algorithms, with good generality. Their computational complexity is low, but the accuracy of the learning algorithms is not guaranteed. The hybrid methods are a combination of filter and wrapper methods, using a filter method to reduce the search space that will be considered by the subsequent wrapper. They mainly focus on combining filter and wrapper methods to achieve the best possible performance with a particular learning algorithm at a time complexity similar to that of the filter methods. The wrapper methods are computationally expensive and tend to overfit on small training sets. The filter methods, in addition to their generality, are usually a good choice when the number of features is very large. Thus, we will focus on the filter method in this paper.

Existing System:
Redundant features do not contribute to getting a better predictor, for they mostly provide information which is already present in other feature(s). Of the many feature subset selection algorithms, some can effectively eliminate irrelevant features but fail to handle redundant features, yet some others can eliminate the irrelevant while taking care of the redundant features. Our proposed FAST algorithm falls into the second group. Traditionally, feature subset selection research has focused on searching for relevant features. A well-known example is Relief, which weighs each feature according to its ability to discriminate instances under different targets based on a distance-based criterion function. However, Relief is ineffective at removing redundant features, as two predictive but highly correlated features are likely both to be highly weighted. Relief-F extends Relief, enabling the method to work with noisy and incomplete data sets and to deal with multiclass problems, but it still cannot identify redundant features. Butterworth et al. proposed to cluster features using a special metric of Barthelemy-Montjardet distance, and then make use of the dendrogram of the resulting cluster hierarchy to choose the most relevant attributes. Unfortunately, the cluster evaluation measure based on the Barthelemy-Montjardet distance does not identify a feature subset that allows the classifiers to improve their original performance accuracy. Furthermore, even compared with other feature selection methods, the obtained accuracy is lower.

Proposed System:
Recently, hierarchical clustering has been adopted in word selection in the context of text classification. Distributional clustering has been used to cluster words into groups based either on their participation in particular grammatical relations with other words (by Pereira et al.) or on the distribution of class labels associated with each word (by Baker and McCallum). As the distributional clustering of words is agglomerative in nature, and results in suboptimal word clusters and high computational cost, Dhillon et al. proposed a new information-theoretic divisive algorithm for word clustering and applied it to text classification.

FEATURE SUBSET SELECTION ALGORITHM
3.1 Framework and Definitions
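Relief's weighting idea described above — reward a feature for differing on the nearest instance of the other class and penalize it for differing on the nearest instance of the same class — can be sketched as follows. This is a deterministic variant that visits every instance instead of random sampling, on made-up toy data; it illustrates the idea rather than reproducing any published implementation:

```python
# Minimal Relief sketch (binary classes, numeric features scaled to [0,1]).
# Deterministic variant: iterates over every instance instead of sampling.

def relief_weights(X, y):
    n, d = len(X), len(X[0])
    w = [0.0] * d

    def dist(a, b):
        # Manhattan distance over all features
        return sum(abs(ai - bi) for ai, bi in zip(a, b))

    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        h = min(hits, key=lambda j: dist(X[i], X[j]))    # nearest hit
        m = min(misses, key=lambda j: dist(X[i], X[j]))  # nearest miss
        for f in range(d):
            # good features differ on the miss, agree on the hit
            w[f] += (abs(X[i][f] - X[m][f]) - abs(X[i][f] - X[h][f])) / n
    return w

# toy data: feature 0 separates the classes; feature 1 is noise
X = [[0.0, 0.3], [0.1, 0.9], [0.9, 0.2], [1.0, 0.8]]
y = [0, 0, 1, 1]
w = relief_weights(X, y)
print(w)
```

Note the weakness the text points out: a duplicated copy of feature 0 would receive the same high weight as the original, so Relief keeps both — it cannot recognize redundancy, only irrelevance.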
Irrelevant features, along with redundant features, severely affect the accuracy of the learning machines. Thus, feature subset selection should be able to identify and remove as much of the irrelevant and redundant information as possible. Moreover, we develop a novel algorithm which can efficiently and effectively deal with both irrelevant and redundant features, and obtain a good feature subset. We achieve this through a new feature selection framework (shown in Fig. 1) which is composed of the two connected components of irrelevant feature removal and redundant feature elimination. The former obtains features relevant to the target concept by eliminating irrelevant ones, and the latter removes redundant features from the relevant ones by choosing representatives from different feature clusters, and thus produces the final subset.

Results and Analysis
In this section, we present the experimental results in terms of the proportion of selected features, the time to obtain the feature subset, the classification accuracy, and the Win/Draw/Loss record. For the purpose of exploring the statistical significance of the results, we performed a nonparametric Friedman test followed by the Nemenyi post-hoc test, as advised by Demsar and by Garcia and Herrera, to statistically compare algorithms on multiple data sets. Thus, the Friedman and the Nemenyi test results are reported as well.
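The Friedman procedure used above can be sketched in pure Python. Following Demsar's formulation, the k algorithms are ranked on each of the N data sets (rank 1 = best, ties get average ranks), and the statistic compares their mean ranks. The accuracy table below is invented purely for illustration; it is not data from this paper:

```python
# Sketch of the nonparametric Friedman test for comparing k algorithms
# over N data sets.

def friedman_statistic(scores):
    """scores: N rows (data sets) x k columns (algorithms), higher = better.
    Returns (mean ranks per algorithm, Friedman chi-square statistic)."""
    N, k = len(scores), len(scores[0])
    mean_ranks = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: -row[j])  # best first
        pos = 0
        while pos < k:
            # group tied scores and assign them their average rank
            grp = [j for j in order[pos:] if row[j] == row[order[pos]]]
            avg = pos + (len(grp) + 1) / 2  # ranks pos+1 .. pos+len(grp)
            for j in grp:
                mean_ranks[j] += avg / N
            pos += len(grp)
    chi2 = 12 * N / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4)
    return mean_ranks, chi2

# illustrative accuracies: 4 data sets x 3 algorithms
scores = [
    [0.90, 0.85, 0.80],
    [0.88, 0.86, 0.81],
    [0.92, 0.90, 0.85],
    [0.79, 0.78, 0.75],
]
ranks, chi2 = friedman_statistic(scores)
print(ranks, chi2)
```

Here algorithm 0 wins on every data set, so the statistic reaches its maximum N(k-1) = 8, exceeding the chi-square critical value of 5.99 (df = k-1 = 2, alpha = 0.05). Once the Friedman test rejects, the Nemenyi post-hoc test declares two algorithms different when their mean ranks differ by more than a critical difference CD = q_a * sqrt(k(k+1)/(6N)).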
CONCLUSION
Testing liveness of a network is a fundamental problem for ISPs and large data center operators. Sending probes between every pair of edge ports is neither exhaustive nor scalable [30]. It suffices to find a minimal set of end-to-end packets that traverse each link. However, doing this requires a way of abstracting across device-specific configuration files (e.g., header space), generating headers and the links they reach (e.g., all-pairs reachability), and finally determining a minimum set of test packets (Min-Set-Cover). Even the fundamental problem of automatically generating test packets for efficient liveness testing requires techniques akin to ATPG. ATPG, however, goes much further than liveness testing with the same framework. ATPG can test for reachability policy (by testing all rules, including drop rules) and performance health (by associating performance measures such as latency and loss with test packets). Our implementation also augments testing with a simple fault localization scheme, also constructed using the header space framework. As in software testing, the formal model helps maximize test coverage while minimizing test packets. Our results show that all forwarding rules in the Stanford backbone or Internet2 can be exercised by a surprisingly small number of test packets. Network managers today use primitive tools; our survey results indicate that they are eager for more sophisticated ones. Other fields of engineering indicate that these desires are not unreasonable: for example, both the ASIC and software design industries are buttressed by billion-dollar tool businesses that supply techniques for both static (e.g., design rule) and dynamic (e.g., timing) verification. In fact, many months after we built and named our system, we discovered to our surprise that ATPG was a well-known acronym in hardware chip testing, where it stands for Automatic Test Pattern Generation [2]. We hope network ATPG will be equally useful for automated dynamic testing of production networks.

ACKNOWLEDGMENTS
The authors would like to thank the editors and the anonymous reviewers for their insightful and helpful comments and suggestions, which resulted in substantial improvements to this work. This work is supported by the National Natural Science Foundation of China under grant 61070006.
REFERENCES
[1] H. Almuallim and T.G. Dietterich, "Algorithms for Identifying Relevant Features," Proc. Ninth Canadian Conf. Artificial Intelligence, pp. 38-45, 1992.
[2] H. Almuallim and T.G. Dietterich, "Learning Boolean Concepts in the Presence of Many Irrelevant Features," Artificial Intelligence, vol. 69, nos. 1/2, pp. 279-305, 1994.
[3] A. Arauzo-Azofra, J.M. Benitez, and J.L. Castro, "A Feature Set Measure Based on Relief," Proc. Fifth Int'l Conf. Recent Advances in Soft Computing, pp. 104-109, 2004.
[4] L.D. Baker and A.K. McCallum, "Distributional Clustering of Words for Text Classification," Proc. 21st Ann. Int'l ACM SIGIR Conf. Research and Development in Information Retrieval, pp. 96-103, 1998.
[5] R. Battiti, "Using Mutual Information for Selecting Features in Supervised Neural Net Learning," IEEE Trans. Neural Networks, vol. 5, no. 4, pp. 537-550, July 1994.
[6] D.A. Bell and H. Wang, "A Formalism for Relevance and Its Application in Feature Subset Selection," Machine Learning, vol. 41, no. 2, pp. 175-195, 2000.
[7] J. Biesiada and W. Duch, "Feature Selection for High-Dimensional Data: A Pearson Redundancy Based Filter," Advances in Soft Computing, vol. 45, pp. 242-249, 2008.
[8] R. Butterworth, G. Piatetsky-Shapiro, and D.A. Simovici, "On Feature Selection through Clustering," Proc. IEEE Fifth Int'l Conf. Data Mining, pp. 581-584, 2005.
[9] C. Cardie, "Using Decision Trees to Improve Case-Based Learning," Proc. 10th Int'l Conf. Machine Learning, pp. 25-32, 1993.
[10] P. Chanda, Y. Cho, A. Zhang, and M. Ramanathan, "Mining of Attribute Interactions Using Information Theoretic Metrics," Proc. IEEE Int'l Conf. Data Mining Workshops, pp. 350-355, 2009.
[11] S. Chikhi and S. Benhammada, "ReliefMSS: A Variation on a Feature Ranking ReliefF Algorithm," Int'l J. Business Intelligence and Data Mining, vol. 4, nos. 3/4, pp. 375-390, 2009.
[12] W. Cohen, "Fast Effective Rule Induction," Proc. 12th Int'l Conf. Machine Learning (ICML '95), pp. 115-123, 1995.
[13] M. Dash and H. Liu, "Feature Selection for Classification," Intelligent Data Analysis, vol. 1, no. 3, pp. 131-156, 1997.
[14] M. Dash, H. Liu, and H. Motoda, "Consistency Based Feature Selection," Proc. Fourth Pacific Asia Conf. Knowledge Discovery and Data Mining, pp. 98-109, 2000.
[15] S. Das, "Filters, Wrappers and a Boosting-Based Hybrid for Feature Selection," Proc. 18th Int'l Conf. Machine Learning, pp. 74-81, 2001.
[16] M. Dash and H. Liu, "Consistency-Based Search in Feature Selection," Artificial Intelligence, vol. 151, nos. 1/2, pp. 155-176, 2003.
[17] J. Demsar, "Statistical Comparison of Classifiers over Multiple Data Sets," J. Machine Learning Res., vol. 7, pp. 1-30, 2006.
[18] I.S. Dhillon, S. Mallela, and R. Kumar, "A Divisive Information Theoretic Feature Clustering Algorithm for Text Classification," J. Machine Learning Research, vol. 3, pp. 1265-1287, 2003.
K. Kolapaneni Omsri Harika is an M.Tech student in the Department of Computer Science & Engineering, Sree Rama Institute of Technology & Science, Kotha Kuppenkuntla, Penuballi Mandal, Khammam.

S. Suresh is a well-known author and an excellent teacher. He is working as the H.O.D. of the M.Tech. Department of Computer Science & Engineering, Sree Rama Institute of Technology & Science, Kuppenakuntla, Penuballi, Khammam. He has vast teaching experience in various engineering colleges and has to his credit several publications in both national and international conferences and journals. His areas of interest include Data Warehousing and Data Mining, Information Security, Data Communications & Networks, Software Engineering, and other advances in Computer Applications. He has guided many projects for engineering students.