Data Fusion and Data Mining
Julia I. Couto Ph.D.
James Madison University

A future battlefield scenario: distributed system of sensors for intelligence gathering, targeting and surveillance
• Platforms hosting operators, sensors and weapons
  – tank gunners hosting operators, IR, video cameras
  – soldiers equipped with IR, video camera, thermal sights
• Remote sensors
  – airborne sensors
  – satellite
  – unmanned aircraft: IR, radar, video camera
  – interconnected electronically to enhance target identification, surveillance, assessment and understanding of the battlefield situation

Data Fusion
• Data fusion is a process dealing with
  • association, correlation and combination of data and information from multiple sensors or sources to achieve a more accurate assessment of observed entities in an observed scenario
  • continuous refinement of estimates and assessments
  • evaluation of the need for additional sources, or modification of the process itself, to enhance situation awareness

Data fusion
• Threat Assessment
• Situation Assessment
• Aggregation of Entities
• Behavior of Entities
• Identity of Entities
• Position/Velocity
• Existence of Entities
D. L. Hall, “Mathematical Techniques for Data Fusion”

JDL Information Fusion Process & Functional Model
[Figure: JDL Data Fusion Process]
• Sources
  – Intel Sources
  – Air Surveillance
  – Surface Surveillance
  – Space Surveillance
• DATA FUSION DOMAIN
  – Level 0 Processing, Sub-object Data Association & Estimation: pixel/signal level data association and characterization
  – Level 1 Processing, Object Refinement: fuses data from multiple sensors to a joint estimate of identity for detected entities
  – Level 2 Processing, Situation Refinement: assessment of relationships between entities; aggregation of entities into a higher level of abstraction
  – Level 3 Processing, Impact Assessment: threat assessment; estimate aggregation of force capabilities; prediction of enemy intent; implications of future actions
  – Level 4 Processing, Process Refinement
• Human Computer Interaction
• Data Base Management System (Support Database, Fusion Database): monitors data collections; data retrieval and storage to support the automated fusion functions of levels 1 - 4

Data Fusion And Data Mining
Data mining techniques attempt to extract knowledge from a large amount of heterogeneous and multidimensional data
– Classification of objects/patterns
– Discovery of useful hidden patterns and structures
– Discovery of rules that link two or several objects
– Outlier and deviation detection
– Identification of spatial and time trends
– Prediction of events

Classification
Assign an object to a category according to its features
• Linear Discriminant Analysis
• Quadratic Discriminant Analysis
• K-nearest Neighbor
• Support Vector Machines
• Decision Trees
• ANN
Limitations:
– requires labeled training data
– the majority of classification techniques do not deal with uncertain or missing values and noisy data
Application:
– Object Identification, Identity Fusion

Clustering
Analyzes and groups objects according to their similarity
No training data is required
– Hard Clustering:
  – An object is assigned to one and only one cluster
  • Partitioning
  • Hierarchical
  • Density-Based
  • Graph-Based
– Soft Clustering:
  – An object is assigned to a cluster with a certain probability or degree of membership
  • Fuzzy-Based Clustering
Applications:
– Data Preprocessing
  • Image segmentation
– Object tracking
  • A preprocessing step before track assignment: clustering similar tracks in a scenario
– Entity Aggregation
  • Aggregate similar entities into a higher level of abstraction and explore new relationships in data
Limitations:
• Heuristic algorithms
• Definition of an appropriate similarity metric
• Unable to discover groups of arbitrary-shaped clusters
• Do not handle a mixture of numerical and categorical attributes
• Scalability problems
• Do not consider physical constraints that may occur in real spatial scenarios, such as obstacles and crossings

Bayesian Networks
[Figure: Bayesian network with nodes RSW, EW, IFF, Identity, Comm, Pos, Class, Kin]
K.C. Chang, K. Blackmond Laskey, “Partially Dynamic Bayesian Networks for Tracking and Object Identification”

Dynamic Bayesian Networks

Partially Dynamic Bayesian Networks

Bayesian Networks: IFF Identification
G. Laskey, K. Laskey, “Combat Identification with Bayesian Networks”

Bayesian networks
Applications
– Identification of Objects:
  • Object Identity (BN)
  • Identification of Occluded Objects (BN)
  • Tracking objects (BN, PDBN)
– Threat Assessment/Event Prediction:
  • Identification of IFF
  • Assessment of attack/threats
  • Prediction of consequences of actions (DBN)
  • Assessment of enemy intentions (DBN)

Bayesian networks
Advantages:
– Effective in dealing with noisy and uncertain evidence and missing data
– Decision theory of risk analysis can be used to choose the action that maximizes the expected utility
– Learning algorithms exist to learn the structure and parameters of a BN
– Approximate inference can be made with partial/uncertain information about the scenario
Limitations:
– Learning the structure and potentials for each node in the net is slow
– Requires training data to learn the structure and parameters of the BN
– Exact inference for PDBN and BN is intractable; recent research has proposed approximate inference algorithms

Outlier Detection
Outlier: Objects/events that differ considerably from the majority of the objects
– Objects whose non-spatial attributes differ considerably from the non-spatial attributes of their surrounding objects
– Objects that are far from other objects
Outlier Detection Techniques:
• Discordance tests
• Distance-based
• Density-based
• Deviation-based
• Donoho-Stahel Estimators (DSE)
• Cluster-Based
• Graph-based Method
Limitations:
• Most of them are not robust to space transformations

Trend Analysis
Spatial Trend Analysis:
– Detection of changes and trends in a spatial dimension
  • Distribution of enemy troops in relation to the geographical area
  • Techniques
    – Spatial regression
    – Breadth-first search for detection of global trends
    – Depth-first search for detection of local trends
Spatial-Temporal Trend Analysis:
– Detection of changes and trends in both space and time in a scenario
– New area of research

Conclusions
• In the future, a BS will be a distributed system of sensors.
  – Sensor data from several sources must be fused and correlated to allow target identification, order of battle, identification of potential threats, enemy intention and prediction of future enemy action, helping commanders and soldiers gain a clear understanding of the battlefield situation
• A combination of data mining techniques can be applied throughout the hierarchy of inference in the data fusion process
• Current data mining techniques have limitations in dealing with uncertain and missing data, which may hamper inference in a data fusion process
  – Bayesian networks are among the most promising techniques for modeling situations that involve a degree of uncertainty and missing data
• Many data mining techniques applied in the data fusion process are supervised methods; more research should be conducted on the development of unsupervised data mining techniques
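
Illustrative sketch: the claim that Bayesian networks cope with uncertain and missing data can be made concrete with a very small identity-fusion example. The Python below is a minimal sketch, not the method of any cited paper. It assumes the simplest possible network, in which an Identity node is the only parent of two sensor-report nodes (named IFF and EW after nodes in the Bayesian network figure), so the reports are conditionally independent given the identity. The identity classes, report values, prior and conditional probability tables are all hypothetical numbers chosen for illustration, and the function name fuse_identity is likewise an assumption.

```python
from typing import Dict, Optional

# Hypothetical identity classes for a detected entity.
IDENTITIES = ("friend", "hostile", "neutral")

# Hypothetical prior P(identity); illustrative numbers only.
PRIOR: Dict[str, float] = {"friend": 0.5, "hostile": 0.3, "neutral": 0.2}

# Hypothetical conditional probability tables P(report | identity).
# Node names follow the figure; the report values are assumptions.
CPT: Dict[str, Dict[str, Dict[str, float]]] = {
    "IFF": {
        "friend": {"response": 0.90, "no_response": 0.10},
        "hostile": {"response": 0.05, "no_response": 0.95},
        "neutral": {"response": 0.10, "no_response": 0.90},
    },
    "EW": {
        "friend": {"friendly_emitter": 0.70, "hostile_emitter": 0.10, "none": 0.20},
        "hostile": {"friendly_emitter": 0.10, "hostile_emitter": 0.70, "none": 0.20},
        "neutral": {"friendly_emitter": 0.10, "hostile_emitter": 0.10, "none": 0.80},
    },
}


def fuse_identity(evidence: Dict[str, Optional[str]]) -> Dict[str, float]:
    """Return P(identity | available reports) by exact inference.

    A report of None means the sensor delivered nothing. Summing an
    unobserved child node out of this network contributes a factor of 1,
    so a missing report is simply skipped.
    """
    posterior = dict(PRIOR)
    for sensor, report in evidence.items():
        if report is None:
            continue  # missing data: marginalized out
        for identity in IDENTITIES:
            posterior[identity] *= CPT[sensor][identity][report]
    total = sum(posterior.values())
    return {identity: p / total for identity, p in posterior.items()}


if __name__ == "__main__":
    # Only the IFF report is available; the EW report is missing.
    print(fuse_identity({"IFF": "no_response", "EW": None}))
    # Both reports are available and point the same way.
    print(fuse_identity({"IFF": "no_response", "EW": "hostile_emitter"}))
```

With only the IFF report the posterior still forms a valid distribution, and adding a consistent EW report raises the probability of the hostile class. The networks described in the slides (DBN, PDBN) add temporal links and more nodes, where exact inference of this kind becomes expensive and approximate algorithms are used instead.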