Developing Network Testbed Data Sets
Visualisation Network-of-Experts Working Group
Supporting NATO Research Task Group IST-059/RTG-025
November 6-8, 2007
Amy Vanderbilt, Marcus Lem, Cristin Hall, Joanne Treurniet, Rob Young
NATO IST-059 Network of Experts

BACKGROUND
- Q6 – How can “clean” data sets be produced for various types of networks to provide testbeds with realistic traffic and other elements?
- The DARPA Intrusion Detection Experiment data is a good, although outdated, example (closed computer network traffic)
- How can we develop similar testbed data sets for other network types?
- One way may be to accept real world networks (social and otherwise) where a certain sub-network is modeled in detail based on historical (and hopefully unbiased) data
- Such testbed data sets may be the first step towards answering many questions

WHY
- To provide initial validation of algorithms for various purposes (prediction, detection, etc.)
- For ease of use – to allow the larger research community to test and advance their research towards needed solutions without having to hand out classified, current real world data
- To determine the independent variables, minimal models
- Research hypothesis testing – honing in on applications for transferring work from theory
- Stages of validation:
    - Test on simulated data
    - Test on historical data
    - Test on current data

WHAT SHOULD A DATA SET BE?
- Can we create a generic network data set that would be applicable across applications?
- We could create a generic motif (subnet) component for each network type and property set based on the framework categorization
- Then we can build larger networks from these motifs depending on the application
- We will need a mapping from the applications to the network types and properties
- Each node and link must have parameters and/or constraints
    - Battery life, bandwidth, distance capabilities, latency, restrictions on traffic, etc.
- Need to be able to test on the margins (extreme cases, catastrophic scenarios)
- Need to allow the users of these data sets to change attributes on links and nodes
- We need to have a probability distribution based on historical data
- Need to tag submissions as coming from open or closed systems

VITA SEARCH
- VITA search on Network Data Test Social shows no actual data repositories but a few sporadic lists compiled by individuals

HISTORICAL DATA COLLECTION
- Current data sets collected are sporadic, have little relevancy (too specific) and are often not accessible
- We need a way to collect historical real world data on which to base the probability distribution and other aspects of the nodes and links
    - Social – massive multiplayer online games
    - Computer – need to collect the traffic, connections, computer nodes and services running on each node
    - Sensors – a simpler version of computer networks.
      We could place sensors in an appropriate environment and collect the network properties (ARL has open source data available)
- Might be able to extract properties from a real network and generate the motif based on that
- Need to be able to handle embedding fields and networks
- IDEA: Set up a data bank where people can contribute datasets coded in a preferred way and containing required parameters, etc. (they can get data if they contribute data) AND develop a way to auto-generate data sets

STEPS TAKEN
- First we listed out attributes needed for:
    - Motifs – structure of a subnet
    - Nodes
    - Links
    - Traffic
- Then we considered how to “scrub” such a data set to allow open source use

ATTRIBUTES COMPUTER NETWORKS

Category  Attributes                      Generic            Multi?  Scrub                 Notes
Motifs    Structure:                      Same as generic
          Centrality                                                 None
          Node degree distribution                                   None
          Moments of degree distribution                             None
          Node betweenness distribution                              None
          Connectedness                                              None
          Reachability                                               None
          Shortest path length                                       None
          Diameter                                                   None
          Size                                                       None
          Average clustering coefficient                             None

ATTRIBUTES COMPUTER NETWORKS

Category  Attributes                      Generic            Multi?  Scrub                 Notes
Motifs    Type:                           Same as generic
          Regular lattice
          Small world
          Random homogeneous
          Scale-free
          Etc…
Nodes     Hostname or IP                  Unique ID – who            Anonymous code label
          Services offered                Purpose – why              Anonymous code label
          Access control                  Conditions – when          Anonymous code label
          Hardware and OS                 Traits – what              Anonymous code label
          Asset value                     Criticality                Anonymous code label

Notes: Varies by app.

ATTRIBUTES COMPUTER NETWORKS

Category  Attributes                      Generic            Multi?  Scrub                 Notes
Links     Bandwidth                       Capacity                   Anonymous code label
          Direction                       Direction                  Anonymous code label
          Physical path                   GIS embedding              Anonymous code label
          Transmission medium             Link traits                Anonymous code label
          Range & attenuation             Conditions                 Anonymous code label
Traffic   Interaction                     Traffic type               Anonymous code label
          Transmission rate & encryption  Traits                     Anonymous code label

NEXT STEPS
- Complete and finalize attribute lists
- Write a paper
- Seek out who might have data
- Look for funding!

1 Data Bank Development
    1.1 Determine formats for submission
    1.2 Develop archive and web-services architecture
    1.3 Set Up Website
    1.4 Advertise to the research community
2 Develop Automated Data Set Generation
    2.1 Historical data collection planning
    2.2 Historical data collection
    2.3 Develop initial motif sets
    2.4 Develop user interface
    2.5 Testing
    2.6 Deploy to Website
3 Data Bank Maintenance & Improvement (ph 2)
    3.1 Develop additional motif sets
    3.2 Maintain web services
    3.3 Evangelize the data bank at conferences, etc.

WORKSHOP TOPICS
- Bootstrapping for creation of more substantial data sets
- Amelioration of uncertainty
- Prediction via hypothetical network models
- Self generating networks
- Dynamic uncertainty – eruption and propagation of uncertainty in the evolution of networks
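
ILLUSTRATIVE SKETCH: SCRUBBING A MOTIF

The attribute tables list "Anonymous code label" as the scrub for node, link and traffic attributes, and "None" for structural attributes. One possible reading of that idea is sketched below in Python. Everything in it is an assumption made for illustration and not a format defined by the working group: the function names (make_labeler, scrub_motif, degree_distribution), the dictionary layout of a motif, the label prefixes and the sample addresses are all hypothetical. The sketch replaces each distinct attribute value with a stable code label and then checks that a structural attribute (the node degree distribution) is unchanged by the scrub.

```python
from collections import Counter
from itertools import count


def make_labeler(prefix):
    """Return a function mapping each distinct (hashable) value to a stable code label."""
    seen = {}
    counter = count(1)

    def label(value):
        if value not in seen:
            seen[value] = f"{prefix}{next(counter)}"
        return seen[value]

    return label


def scrub_motif(nodes, links):
    """Replace node IDs and attribute values with anonymous code labels.

    nodes: {node_id: {attribute: value}}
    links: [(source_id, target_id, {attribute: value})]
    The topology is kept, so structural attributes need no scrubbing.
    """
    id_label = make_labeler("N")
    labelers = {}  # one labeler per attribute name (hypothetical convention)

    def scrub_attrs(attrs):
        out = {}
        for name, value in attrs.items():
            labeler = labelers.setdefault(name, make_labeler(f"{name.upper()}-"))
            out[name] = labeler(value)
        return out

    clean_nodes = {id_label(nid): scrub_attrs(attrs) for nid, attrs in nodes.items()}
    clean_links = [(id_label(s), id_label(d), scrub_attrs(a)) for s, d, a in links]
    return clean_nodes, clean_links


def degree_distribution(nodes, links):
    """Count how many nodes have each degree, treating links as undirected."""
    degree = Counter({nid: 0 for nid in nodes})
    for src, dst, _ in links:
        degree[src] += 1
        degree[dst] += 1
    return Counter(degree.values())


# Hypothetical three-node motif; the addresses and attribute values are made up.
nodes = {
    "10.0.0.1": {"services": "http", "os": "linux"},
    "10.0.0.2": {"services": "ssh", "os": "linux"},
    "10.0.0.3": {"services": "http", "os": "bsd"},
}
links = [
    ("10.0.0.1", "10.0.0.2", {"medium": "fiber"}),
    ("10.0.0.1", "10.0.0.3", {"medium": "wireless"}),
]

clean_nodes, clean_links = scrub_motif(nodes, links)

# The scrub hides identities but leaves the structure intact.
same_structure = degree_distribution(nodes, links) == degree_distribution(clean_nodes, clean_links)
```

Keeping one labeler per attribute means equal values still map to equal labels after scrubbing, so attribute-level statistics (for example, how many nodes share a service) survive. Whether that much detail is acceptable for a given closed system is a policy decision this sketch does not address.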