ISSN: 2454-9924

NOVEL AUTOMATIC TEST PACKET GENERATION

K. Kolapaneni Omsri Harika 1, S. Suresh 2
1 M.Tech Student, Sree Rama Institute of Technology and Science, Kuppenakuntla, Penuballi, Khammam, TS, India
2 Asst. Prof., CSE Dept., Sree Rama Institute of Technology and Science, Kuppenakuntla, Penuballi, Khammam, TS, India

ABSTRACT:
Networks are getting larger and more complex, yet administrators rely on rudimentary tools such as ping and traceroute to debug problems. We propose an automated and systematic approach for testing and debugging networks called "Automatic Test Packet Generation" (ATPG). ATPG reads router configurations and generates a device-independent model. The model is used to generate a minimum set of test packets to (minimally) exercise every link in the network or (maximally) exercise every rule in the network. Test packets are sent periodically, and detected failures trigger a separate mechanism to localize the fault. ATPG can detect both functional problems (e.g., an incorrect firewall rule) and performance problems (e.g., a congested queue). ATPG complements but goes beyond earlier work in static checking (which cannot detect liveness or performance faults) and fault localization (which can only localize faults given liveness results). We describe our prototype ATPG implementation and results on two real-world data sets: Stanford University's backbone network and Internet2. We find that a small number of test packets suffices to test all rules in these networks: for example, 4,000 packets can cover all rules in the Stanford backbone network, while 54 are enough to cover all links. Sending 4,000 test packets 10 times per second consumes less than 1% of link capacity. Debugging is only becoming harder as networks are getting bigger (modern data centers may contain 10,000 switches, a campus network may serve 50,000 users, and long-haul links now run at 100 Gb/s). ATPG code and the data sets are publicly available.
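The covering step described in the abstract (choosing a minimum set of test packets so that every link is exercised) is an instance of Min-Set-Cover, which is commonly approximated greedily. The sketch below is a minimal illustration under assumed inputs: the packet names and per-packet link sets are invented, whereas real ATPG derives them from a header-space model of the router configurations.

```python
# Greedy approximation of the Min-Set-Cover step: repeatedly pick the
# candidate test packet that exercises the most still-uncovered links.
# `candidates` maps an (invented) packet name to the set of links its
# path exercises; `universe` is the set of links we want covered.
def greedy_test_packets(candidates, universe):
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(candidates, key=lambda p: len(candidates[p] & uncovered))
        if not candidates[best] & uncovered:
            break  # remaining links are unreachable by any candidate packet
        chosen.append(best)
        uncovered -= candidates[best]
    return chosen
```

For link coverage the universe is the set of links; for rule coverage it would instead be the set of forwarding rules, with each candidate packet mapped to the rules it matches.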
Index Terms: Data plane analysis, network troubleshooting, test packet generation.

INTRODUCTION:
It is notoriously hard to debug networks. Every day, network engineers wrestle with router misconfigurations, fiber cuts, faulty interfaces, mislabeled cables, software bugs, intermittent links, and a myriad of other reasons that cause networks to misbehave or fail completely. Network engineers hunt down bugs using the most rudimentary tools (e.g., ping, traceroute, SNMP, and tcpdump) and track down root causes using a combination of accrued wisdom and intuition.

Feature subset selection algorithms can be divided into four broad categories: the Embedded, Wrapper, Filter, and Hybrid approaches. The embedded methods incorporate feature selection as a part of the training process and are usually specific to given learning algorithms, and therefore may be more efficient than the other three categories. Traditional machine learning algorithms like decision trees or artificial neural networks are examples of embedded approaches. The wrapper methods use the predictive accuracy of a predetermined learning algorithm to determine the goodness of the selected subsets; the accuracy of the learning algorithms is usually high, but the generality of the selected features is limited and the computational complexity is large. The filter methods are independent of learning algorithms, with good generality; their computational complexity is low, but the accuracy of the learning algorithms is not guaranteed. The hybrid methods are a combination of filter and wrapper methods, using a filter method to reduce the search space that will be considered by the subsequent wrapper.

Existing System:
In order to get a better predictor, one should remove features that carry information which is already present in other feature(s). Of the many feature subset selection algorithms, some can effectively eliminate irrelevant features but fail to handle redundant features, yet some others can eliminate the irrelevant features while taking care of the redundant features.
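As a concrete illustration of the filter approach described above, the sketch below scores each feature independently of any learning algorithm and keeps the top k. The absolute Pearson correlation with the class label is an assumed stand-in for the relevance measures surveyed; the function and variable names are ours, not the paper's.

```python
import math

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length sequences.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def filter_select(samples, labels, k):
    # Rank features by |correlation with the label| and keep the top k.
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        col = [row[j] for row in samples]
        scores.append((abs(pearson(col, labels)), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]
```

Because each feature is scored in isolation, this inherits the filter family's weakness: two highly correlated (redundant) features both receive high scores.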
They mainly focus on combining filter and wrapper methods to achieve the best possible performance with a particular learning algorithm, with time complexity similar to that of the filter methods. The wrapper methods are computationally expensive and tend to overfit on small training sets. The filter methods, in addition to their generality, are usually a good choice when the number of features is very large. Thus, we will focus on the filter method in this paper.

Our proposed FAST algorithm falls into the second group. Traditionally, feature subset selection research has focused on searching for relevant features. A well-known example is Relief, which weighs each feature according to its ability to discriminate instances under different targets based on a distance-based criterion function. However, Relief is ineffective at removing redundant features, as two predictive but highly correlated features are likely both to be highly weighted. Relief-F extends Relief, enabling this method to work with noisy and incomplete data sets and to deal with multiclass problems, but it still cannot identify redundant features.

Butterworth et al. proposed to cluster features using a special metric of Barthelemy-Montjardet distance, and then make use of the dendrogram of the resulting cluster hierarchy to choose the most relevant attributes. Unfortunately, the cluster evaluation measure based on the Barthelemy-Montjardet distance does not identify a feature subset that allows the classifiers to improve their original performance accuracy. Furthermore, even compared with other feature selection methods, the obtained accuracy is lower.

Proposed System:
Recently, hierarchical clustering has been adopted in word selection in the context of text classification. Distributional clustering has been used to cluster words into groups based either on their participation in particular grammatical relations with other words, by Pereira et al., or on the distribution of class labels associated with each word, by Baker and McCallum [4].
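The Relief scheme criticized above can be sketched as follows. This is a simplified, deterministic variant for numeric features (it visits every instance instead of sampling, and omits normalization), so read it as an illustration of the hit/miss weighting idea rather than the canonical algorithm; all names are ours.

```python
def relief(samples, labels):
    # For each instance, find its nearest neighbor of the same class
    # (nearest hit) and of a different class (nearest miss); features
    # that separate misses more than hits accumulate positive weight.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    n_features = len(samples[0])
    weights = [0.0] * n_features
    for i, (x, y) in enumerate(zip(samples, labels)):
        hit = min((s for j, s in enumerate(samples) if j != i and labels[j] == y),
                  key=lambda s: dist(x, s))
        miss = min((s for j, s in enumerate(samples) if labels[j] != y),
                   key=lambda s: dist(x, s))
        for f in range(n_features):
            weights[f] += abs(x[f] - miss[f]) - abs(x[f] - hit[f])
    return weights
```

Note that two redundant copies of a discriminative feature would both receive high weight here, which is exactly the failure mode the text attributes to Relief.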
As distributional clustering of words is agglomerative in nature, and results in suboptimal word clusters and a high computational cost, Dhillon et al. proposed a new information-theoretic divisive algorithm for word clustering and applied it to text classification.

FEATURE SUBSET SELECTION ALGORITHM
3.1 Framework and Definitions
Irrelevant features, along with redundant features, severely affect the accuracy of the learning machines. Thus, feature subset selection should be able to identify and remove as much of the irrelevant and redundant information as possible. Moreover, we develop a novel algorithm which can efficiently and effectively deal with both irrelevant and redundant features, and obtain a good feature subset. We achieve this through a new feature selection framework (shown in Fig. 1) which is composed of the two connected components of irrelevant feature removal and redundant feature elimination. The former obtains features relevant to the target concept by eliminating irrelevant ones, and the latter removes redundant features from relevant ones by choosing representatives from different feature clusters, and thus produces the final subset.

Results and Analysis
In this section, we present the experimental results in terms of the proportion of selected features, the time to obtain the feature subset, the classification accuracy, and the Win/Draw/Loss record. For the purpose of exploring the statistical significance of the results, we performed a nonparametric Friedman test followed by the Nemenyi post-hoc test, as advised by Demsar [17] and by Garcia and Herrera, to statistically compare algorithms on multiple data sets. Thus, the Friedman and the Nemenyi test results are reported as well.

Testing liveness of a network is a fundamental problem for ISPs and large data center operators.
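Returning to the two-component framework described above (Fig. 1), the toy sketch below caricatures it: a relevance filter removes irrelevant features, then a greedy pass keeps one representative from each group of mutually redundant features. The correlation-based measures, thresholds, and names are our own illustrative assumptions; the actual algorithm uses information-theoretic measures and feature clustering.

```python
import math

def corr(xs, ys):
    # Pearson correlation, a stand-in for the paper's relevance/redundancy measures.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def two_stage_select(samples, labels, rel_thresh=0.5, red_thresh=0.9):
    cols = list(zip(*samples))
    # Component 1: irrelevant feature removal.
    relevant = [j for j, c in enumerate(cols)
                if abs(corr(c, labels)) >= rel_thresh]
    # Component 2: redundant feature elimination -- keep a feature only
    # if it is not nearly a copy of an already-kept representative.
    relevant.sort(key=lambda j: -abs(corr(cols[j], labels)))
    kept = []
    for j in relevant:
        if all(abs(corr(cols[j], cols[k])) < red_thresh for k in kept):
            kept.append(j)
    return sorted(kept)
```

In the toy test below, a redundant copy of a relevant feature and an irrelevant feature are both discarded, leaving a single representative.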
Sending probes between every pair of edge ports is neither exhaustive nor scalable [30]. It suffices to find a minimal set of end-to-end packets that traverse each link. However, doing this requires a way of abstracting across device-specific configuration files (e.g., header space), generating headers and the links they reach (e.g., all-pairs reachability), and finally determining a minimum set of test packets (Min-Set-Cover). Even the fundamental problem of automatically generating test packets for efficient liveness testing requires techniques akin to ATPG. ATPG, however, goes much further than liveness testing with the same framework. ATPG can test for reachability policy (by testing all rules, including drop rules) and performance health (by associating performance measures such as latency and loss with test packets). Our implementation also augments testing with a simple fault localization scheme, also constructed using the header space framework. As in software testing, the formal model helps maximize test coverage while minimizing test packets.

CONCLUSION
Our results show that all forwarding rules in the Stanford backbone or Internet2 can be exercised by a surprisingly small number of test packets (4,000 for Stanford, and a similarly small number for Internet2). Network managers today use primitive tools such as ping and traceroute, and our survey results indicate that they are eager for more sophisticated tools. Other fields of engineering indicate that these desires are not unreasonable: for example, both the ASIC and software design industries are buttressed by billion-dollar tool businesses that supply techniques for both static (e.g., design rule) and dynamic (e.g., timing) verification. In fact, many months after we built and named our system, we discovered to our surprise that ATPG was a well-known acronym in hardware chip testing, where it stands for Automatic Test Pattern Generation [2]. We hope network ATPG will be equally useful for automated dynamic testing of production networks.

ACKNOWLEDGMENTS
The authors would like to thank the editors and the anonymous reviewers for their insightful and helpful comments and suggestions, which resulted in substantial improvements to this work. This work is supported by the National Natural Science Foundation of China under grant 61070006.
REFERENCES
[1] H. Almuallim and T.G. Dietterich, "Algorithms for Identifying Relevant Features," Proc. Ninth Canadian Conf. Artificial Intelligence, pp. 38-45, 1992.
[2] H. Almuallim and T.G. Dietterich, "Learning Boolean Concepts in the Presence of Many Irrelevant Features," Artificial Intelligence, vol. 69, nos. 1/2, pp. 279-305, 1994.
[3] A. Arauzo-Azofra, J.M. Benitez, and J.L. Castro, "A Feature Set Measure Based on Relief," Proc. Fifth Int'l Conf. Recent Advances in Soft Computing, pp. 104-109, 2004.
[4] L.D. Baker and A.K. McCallum, "Distributional Clustering of Words for Text Classification," Proc. 21st Ann. Int'l ACM SIGIR Conf. Research and Development in Information Retrieval, pp. 96-103, 1998.
[5] R. Battiti, "Using Mutual Information for Selecting Features in Supervised Neural Net Learning," IEEE Trans. Neural Networks, vol. 5, no. 4, pp. 537-550, July 1994.
[6] D.A. Bell and H. Wang, "A Formalism for Relevance and Its Application in Feature Subset Selection," Machine Learning, vol. 41, no. 2, pp. 175-195, 2000.
[7] J. Biesiada and W. Duch, "Feature Selection for High-Dimensional Data: A Pearson Redundancy Based Filter," Advances in Soft Computing, vol. 45, pp. 242-249, 2008.
[8] R. Butterworth, G. Piatetsky-Shapiro, and D.A. Simovici, "On Feature Selection through Clustering," Proc. IEEE Fifth Int'l Conf. Data Mining, pp. 581-584, 2005.
[9] C. Cardie, "Using Decision Trees to Improve Case-Based Learning," Proc. 10th Int'l Conf. Machine Learning, pp. 25-32, 1993.
[10] P. Chanda, Y. Cho, A. Zhang, and M. Ramanathan, "Mining of Attribute Interactions Using Information Theoretic Metrics," Proc. IEEE Int'l Conf. Data Mining Workshops, pp. 350-355, 2009.
[11] S. Chikhi and S. Benhammada, "ReliefMSS: A Variation on a Feature Ranking ReliefF Algorithm," Int'l J. Business Intelligence and Data Mining, vol. 4, nos. 3/4, pp. 375-390, 2009.
[12] W. Cohen, "Fast Effective Rule Induction," Proc. 12th Int'l Conf. Machine Learning (ICML '95), pp. 115-123, 1995.
[13] M. Dash and H. Liu, "Feature Selection for Classification," Intelligent Data Analysis, vol. 1, no. 3, pp. 131-156, 1997.
[14] M. Dash, H. Liu, and H. Motoda, "Consistency Based Feature Selection," Proc. Fourth Pacific-Asia Conf. Knowledge Discovery and Data Mining, pp. 98-109, 2000.
[15] S. Das, "Filters, Wrappers and a Boosting-Based Hybrid for Feature Selection," Proc. 18th Int'l Conf. Machine Learning, pp. 74-81, 2001.
[16] M. Dash and H. Liu, "Consistency-Based Search in Feature Selection," Artificial Intelligence, vol. 151, nos. 1/2, pp. 155-176, 2003.
[17] J. Demsar, "Statistical Comparison of Classifiers over Multiple Data Sets," J. Machine Learning Research, vol. 7, pp. 1-30, 2006.
[18] I.S. Dhillon, S. Mallela, and R. Kumar, "A Divisive Information Theoretic Feature Clustering Algorithm for Text Classification," J. Machine Learning Research, vol. 3, pp. 1265-1287, 2003.

K. Kolapaneni Omsri Harika is an M.Tech student in the Department of Computer Science & Engineering, Sree Rama Institute of Technology & Science, Kotha Kuppenkuntla, Penuballi Mandal, Khammam.

S. Suresh is a well-known author and excellent teacher. He is working as the H.O.D. of the M.Tech Department of Computer Science & Engineering, Sree Rama Institute of Technology & Science, Kuppenakuntla, Penuballi, Khammam. He has vast teaching experience in various engineering colleges. To his credit he has a couple of publications in both national and international conferences and journals.
His areas of interest include Data Warehousing and Data Mining, Information Security, Data Communications & Networks, Software Engineering, and other advances in Computer Applications. He has guided many projects for engineering students.