Topic 11: Level 2 Processing
David L. Hall

Topic Objectives
• Introduce the concept of Level 2 processing
• Survey and introduce methods for approximate reasoning
– Introduce concepts in probabilistic reasoning and fusion (e.g., Bayes, Dempster-Shafer, etc.)
• Describe challenges and issues in automated reasoning
• Note: this topic focuses on report-level fusion; Topic 12 will introduce reasoning methods such as rule-based systems and intelligent agents

Level 2 Processing (Situation Refinement)

Level Two Processing: Situation Assessment
• Object aggregation: time relationships, geometrical proximity, communications, functional dependence
• Contextual interpretation/fusion: environment, weather, doctrine, socio-political factors
• Event/activity aggregation
• Multi-perspective assessment: Red/Blue/White

Shortfalls in L2/L3 Fusion Research
• From Valet, L., et al., "A Statistical Overview of Recent Literature in Information Fusion", FUSION 2000, Paris, France, July 2000: roughly 85% of the publications reviewed address Level 1 only
• From Nichols, M., "A Survey of the Current State of Data Fusion Systems", OSD Decision Support Center presentation at SPAWAR, San Diego, CA, May 2000: only 17 of 100 surveyed systems had any L2/L3 capability at all, and those used mostly basic/simple techniques

Hierarchy of Inference Techniques
• Types of inference, from high to low inference level:
– Threat analysis
– Situation assessment
– Behavior/relationships of entities
– Identity, attributes, and location of an entity
– Existence and measurable features of an entity
• Applicable techniques, from high-level to low-level inference:
– Knowledge-based techniques: expert systems; scripts, frames, and templating; case-based reasoning; genetic algorithms
– Decision-level techniques: neural nets, cluster algorithms, fuzzy logic
– Estimation techniques: Bayesian nets; maximum a posteriori probability (e.g., Kalman filters, Bayesian estimation); evidential reasoning
– Signal processing techniques

Examples of Data Fusion Inferences
• Tactical situation assessment (Level 2)
– Basic inferences: location and identity of low-level entities and objects; identification of weapons
– Higher-level inferences: identity of complex entities (aggregates of objects/entities); relationships among objects (temporal, functional, geographical, etc.); contextual interpretation
– Types of reasoning: estimation; spatial/temporal reasoning; establishing functional relations; hierarchical/contextual reasoning
• Threat assessment (Level 3)
– Basic inferences: prediction of positions; identification of threats
– Higher-level inferences: prediction of intent; analysis of threat implications; estimation of capability; development of scenarios
– Types of reasoning: pattern matching; prediction; interpretation; cause-effect reasoning
• Complex equipment diagnosis
– Basic inferences: estimation of equipment state parameters; location/identification of faults; identification of abnormal conditions
– Higher-level inferences: establishment of cause-effect relationships; development of hierarchical relationships; analysis of process; recommendations of diagnostic tests; recommendations of maintenance
– Types of reasoning: analysis of hierarchies; cause-effect analysis; deduction/induction
• Medical diagnosis (advisory)
– Basic inferences: determination of key biological parameters; assessment of symptoms; identification of abnormal parameters; location of injuries
– Higher-level inferences: analysis of relationships among symptoms; linking symptoms to potential causes; recommendation of diagnostic tests; identification of disease
– Types of reasoning: pattern recognition; deduction/induction; cause-effect analysis
• Remote sensing
– Basic inferences: location of low-level entities/objects; identification of objects; location and identification of crops, minerals, and geographical features
– Higher-level inferences: location of geographically constrained entities; identification of features of objects; identity of unusual phenomena (e.g., diseased crops); determination of relationships among geographical features; interpretation of data
– Types of reasoning: pattern recognition; spatial/temporal reasoning; contextual analysis

Comments on L2 and L3 Techniques
• Reasoning for Level 2 and Level 3 processing involves context-based reasoning and high-level inferences
• Techniques are generally probabilistic and entail representation of uncertainty in both the data and the inferential relationships
• Methods represent knowledge at the semantic level
– Rule-based methods
– Graphical representations
– Logical templates, cases, plan hierarchies, agents, and others

Elements of Artificial Intelligence
• Symbolic processing techniques: pattern matching, inference, search, knowledge representation
• Application areas: knowledge acquisition, planning, text understanding, computer vision, automatic programming, learning, natural language processing, machine translation, robotics, speech understanding, expert systems, intelligent assistance

Challenges in Symbolic Reasoning
• Human inferencing capabilities
– Continual access to multi-sensory information
– Complex pattern recognition (visual, aural)
– Semantic-level reasoning
– Knowledge of "real-world" facts, relationships, and interactions
– Use of heuristics for rapid assessment and decision-making
– Context-based processing
• Computer inferencing challenges
– Lack of real-world knowledge
– Inability to deal with the perversity of English and other natural languages
– Requirement for explicit knowledge representation and reasoning methods
• Computer advantages
– Processing speed and power (use of physics-based models)
– Unaffected by fatigue, emotion, or bias
– Machine learning from large data sets
Categories of Representational/Decomposition Techniques
Each category is defined by the type of relationship represented, the question it answers, and example techniques:
• Physical constituency ("... is composed of ..."): system block diagrams, specification trees
• Functional constituency ("... involves/requires/provides ..."): functional block diagrams, functional decomposition trees, interpretive structural modeling
• Process constituency ("... performs the process of ..."): mathematical functions, logical operations, rules, procedures
• Sequential dependency ("... occurs conditional upon ..."): PERT charts, Petri nets, scripts, operational sequence diagrams
• Temporal dependency ("... occurs when ..."): event sequences, time lines, scripts, operational sequence diagrams
• All of the above (all of the above questions): computer simulations, real-world systems

Major Reasoning Approaches
• Knowledge representation: rules, frames, scripts, semantic nets, parametric templates, analogical methods
• Uncertainty representation: confidence factors, probability, Dempster-Shafer evidential intervals, fuzzy membership functions, etc.
• Reasoning methods and architectures
– Implicit methods: neural nets, cluster algorithms
– Pattern templates: templating methods, case-based reasoning
– Process reasoning: script interpreters, plan-based reasoning
– Deductive methods: decision trees, Bayesian belief nets, Dempster-Shafer belief nets
– Hybrid architectures: agent-based methods, blackboard systems, hybrid symbolic/numerical systems

Decision-Level Identity Fusion
• Each sensor (A, B, ..., N) observes the same entity, target, or activity and produces its own declaration of identity; decision-level fusion combines these declarations into a fused declaration of identity via voting methods, Bayes' method, or Dempster-Shafer's method
• Upstream of the fusion step, the processing chain runs from the target and models of a priori data through the propagation media and sensors (signal space), feature extraction (feature space), and a classifier that declares the target class (decision space); classifiers may use cluster methods, neural networks, templating, etc.
• In the last lecture we addressed the "magic" of pattern recognition; this topic represents a transition from Level 1 identity fusion to Level 2 fusion concerning complex entities, activities, and events, where the reasoning is performed at the semantic (report) level

Classical Statistical Inference
• Based on empirical probabilities
• Definitions
– Statistical hypothesis: a statement about a population which, based on information from a sample, one seeks to support or refute
– Statistical test: a set of rules whereby a decision on the hypothesis H is reached
• Measure of test accuracy: a probability statement about the decision when various conditions in the population are true
• Test logic
– Assume the null hypothesis H0 is true
– Examine the consequences of H0 being true in the sampling distribution for the test statistic
– If the observations have a high probability of occurring under H0, the data do not contradict H0
– Otherwise, the data tend to contradict H0
• Level of significance: define a probability level α that is considered too low to warrant support of H0; if P(observed data | H0 true) < α, reject H0

Emitter Identification: Example
• Scenario: an ELINT collector observes emitters E1 and E2 beyond the Forward Edge of the Battle Area (FEBA)
• Type 1 and Type 2 radars exist on the battlefield, and these radar types are known to have different pulse repetition interval (PRI) capabilities
• Problem: given an observed PRI, have we seen a Type 1 or a Type 2 radar?
• Note: throughout this presentation we use the example of emitter identification, i.e., situation assessment for a DoD problem; however, this translates directly into other applications such as medical tests, environmental monitoring, or monitoring of complex machines

[Figure: probability density functions of PRI for E1 (radar class 1) and E2 (radar class 2); the area under the E2 density between PRIN and PRIN+1 measures the probability that a class-2 radar will use a PRI in the interval PRIN ≤ PRI ≤ PRIN+1]

Classical Inference for Identity Declaration: Example
[Figure: overlapping PRI density functions for E1 (radar class 1) and E2 (radar class 2), with a decision threshold PRIC separating the two declaration regions]
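To make the test logic above concrete, here is a minimal Python sketch of a classical significance test for the emitter problem. The Gaussian PRI density, its parameters, and the significance level are illustrative assumptions, not values from the lecture; note that the classical test uses only the null-hypothesis (Type 1) density, a limitation taken up on the next slide.

```python
# Minimal sketch: classical significance test for PRI-based emitter ID.
# The class-1 PRI density (Gaussian, mean 100 us, sigma 5 us) and
# alpha are hypothetical values chosen for illustration.
from statistics import NormalDist

PRI_E1 = NormalDist(mu=100.0, sigma=5.0)  # assumed Type 1 PRI density

def classify_pri(pri_us, alpha=0.05):
    """H0: 'the emitter is a Type 1 radar'.

    Compute the two-sided tail probability of the observed PRI under
    the Type 1 density; if it falls below alpha, the data contradict
    H0 and we declare Type 2.
    """
    cdf = PRI_E1.cdf(pri_us)
    p_value = 2.0 * min(cdf, 1.0 - cdf)
    return "Type 2 (reject H0)" if p_value < alpha else "Type 1 (retain H0)"

print(classify_pri(103.0))  # near the Type 1 mean -> retain H0
print(classify_pri(118.0))  # deep in the Type 1 tail -> reject H0
```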
Issues with Classical Probability
• Requires knowledge of a priori probability density functions
• Is (usually) applied to a hypothesis and its alternate
• Does not account for a priori information about the "likelihood in nature" of a hypothesis being true
• Strictly speaking, classical probability can only be used with repeatable experiments

Bayesian Inference
• Can be used with subjective probabilities; does not necessarily require sampling, etc.
• Statement: if H1, H2, ..., Hi represent mutually exclusive and exhaustive hypotheses that can explain an event E that has just occurred, then
P(Hi | E) = P(E | Hi) P(Hi) / Σj P(E | Hj) P(Hj), with Σi P(Hi) = 1 (exhaustivity)
• Nomenclature
– P(Hi | E) = a posteriori probability that Hi is true given E
– P(Hi) = a priori probability of Hi
– P(E | Hi) = probability of E given that Hi is true

Bayes' Form: Impact on the Emitter Identification Example
• For a single PRI measurement PRI0:
P(EMITTER X | PRI0) = P(PRI0 | EMITTER X) P(EMITTER X) / Σi P(PRI0 | EMITTERi) P(EMITTERi)
• In the case of multiple measurements (e.g., PRI0 and an observed frequency F0):
P(EMITTER X | PRI0 ∩ F0) ∝ P(EMITTER X) P(PRI0 | EMITTER X) P(F0 | EMITTER X ∩ PRI0)
and likewise for each candidate, yielding P(EMITTER X | ·), P(EMITTER Y | ·), P(EMITTER Z | ·)
• Does not require sampling distributions
– Analysts can estimate P(PRI | E) directly
– Analysts can include any knowledge they have pertaining to the relative numbers of emitters

Concept of Identity Declaration by a Sensor
• A sensor's declaration behavior is characterized by the conditional probabilities P(D | T) of each declaration D given each truth condition T:

TRUTH \ DECLARATION   Friend (DF)   Neutral (DN)   Enemy (DE)   Unknown (DU)
Friend (TF)           P(DF|TF)      P(DN|TF)       P(DE|TF)     P(DU|TF)
Neutral (TN)          P(DF|TN)      P(DN|TN)       P(DE|TN)     P(DU|TN)
Enemy (TE)            P(DF|TE)      P(DN|TE)       P(DE|TE)     P(DU|TE)

• Major development issue: the ability to model/establish P(D | T) as a function of range, SNR, etc.
• Note: the columns of the declaration matrix are mutually exclusive, exhaustive hypotheses that explain an observation

Identification Friend, Foe, or Neutral (IFFN) Bayesian Example
• Based on empirical probabilities derived from tests, each sensor is characterized by a declaration matrix [P(Di | Oj)], with declarations Di, i = 1, ..., n, and object hypotheses Oj, j = 1, ..., m, where Σi P(Di | Oj) = 1 for each fixed j
• Notes: (1) each column sums to 1; (2) row sums do not necessarily equal 1

IFFN Bayesian Example (continued)
• Individual sensors provide, via their declaration matrices, P(Di | Oj)
– Bayes' rule allows conversion of P(Di | Oj) to P(Oj | Di)
– Multiple sensors provide {P(O1 | D1), P(O2 | D1), ..., P(Oj | D1), ...} from sensor 1, {P(O1 | D2), P(O2 | D2), ..., P(Oj | D2), ...} from sensor 2, and so on
• Given multiple pieces of evidence (observations), Bayes' rule allows fusion of the declarations for each object hypothesis (the product form assumes the sensor declarations are conditionally independent given the object):
P(Oj | D1 ∩ D2 ∩ ... ∩ Dn) = P(Oj) [P(D1 | Oj) P(D2 | Oj) ... P(Dn | Oj)] / Σi P(Oi) [P(D1 | Oi) P(D2 | Oi) ... P(Dn | Oi)]
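The fused-declaration formula reduces to multiplying each hypothesis's prior by the per-sensor likelihoods of the declarations actually made, then normalizing. Here is a minimal Python sketch under the same conditional-independence assumption; the hypotheses, priors, and declaration-matrix values are hypothetical, not from the lecture.

```python
# Minimal sketch: Bayesian fusion of sensor identity declarations.
# Priors and likelihood values below are hypothetical examples.
PRIOR = {"friend": 0.3, "neutral": 0.5, "enemy": 0.2}  # P(Oj)

# P(Dk | Oj) for the declaration each sensor actually made
# (both sensors happened to declare "enemy" in this example):
LIKELIHOODS = [
    {"friend": 0.10, "neutral": 0.20, "enemy": 0.80},  # sensor 1
    {"friend": 0.15, "neutral": 0.25, "enemy": 0.70},  # sensor 2
]

def bayes_fuse(prior, likelihoods):
    """P(Oj | D1...Dn) proportional to P(Oj) * prod_k P(Dk | Oj)."""
    unnorm = {}
    for obj, p in prior.items():
        for lk in likelihoods:
            p *= lk[obj]
        unnorm[obj] = p
    total = sum(unnorm.values())
    return {obj: p / total for obj, p in unnorm.items()}

posterior = bayes_fuse(PRIOR, LIKELIHOODS)
print(posterior)                          # fused P(Oj | D1, D2)
print(max(posterior, key=posterior.get))  # MAP decision -> "enemy"
```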
Summary of Bayesian Fusion for Identity
• Each sensor's observables pass through a classifier to produce a declaration Dk; this is a transformation from observation space to declaration space, with the uncertainty in each declaration expressed in the sensor's declaration matrix P(Dk | Oj)
• The Bayesian combination formula then yields the fused probability of each object hypothesis given all declarations: P(Oj | D1 ∩ D2 ∩ ... ∩ Dn), j = 1, ..., M
• Decision logic (MAP, thresholded MAP, etc.) selects the hypothesis with the highest value of P(Oj | D1, ..., Dn), producing the fused identity declaration

Bayesian Inference
• The good news
– Allows incorporation of a priori information about P(Hi)
– Allows utilization of subjective probability
– Allows iterative updating
– Intuitive formulation
• The bad news
– Requires specification of "priors"
– Requires identification of an exhaustive set of alternative hypotheses H
– Can become complex for dependent evidence
– May produce "idiot Bayes" results

Dempster-Shafer Theory
• Arthur Dempster (1968): generalization of the Bayesian approach
• Glenn Shafer (1976): a mathematical theory of evidence
• Basic issue: the manner in which belief, derived from evidence, is distributed over propositions (hypotheses) about the exhaustive possibilities in a domain: evidence → belief → distribution of belief over propositions

Distribution of Belief
• Operational mode: humans seemingly distribute belief (based on evidence) in a fragmentary way; thus, in general, for evidence E and propositions A, B, C, we will have:
– M(A) = measure of belief that E supports A exactly
– M(A ∨ B) = measure of belief assigned to the disjunction, which includes A
– etc.

Probability and Belief
• Probabilities for propositions are induced by the mass distribution:
PR(A) = Σ M(θ), summed over all θ ∈ 2^Θ with θ ⊆ A
• Bayesian mass distributions assign mass only to the singleton, mutually exclusive and exhaustive propositions (i.e., in Θ only), so that PR(A) + PR(~A) = 1
• With the D-S approach, belief can be assigned to sets of propositions that need not be mutually exclusive; this leads to the notion of the evidential interval [SPT(A), PLS(A)]

Example of Probability Mass Assignment
• A single die can show one of six observable faces; these are the mutually exclusive and exhaustive hypotheses "the number showing on the die is 1", ..., "the number showing on the die is 6"
• Propositions can include:
– The number showing on the die is even
– The number showing on the die is 1 or 3
– ...
– The number showing on the die is 1 or 2 or 3 or 4 or 5 or 6 (the "I don't know" proposition)
• The set of hypotheses is Θ = {1, 2, 3, 4, 5, 6}
• The set of propositions is 2^Θ = {1, 2, 3, 4, 5, 6, 1 or 2, 1 or 3, 2 or 3, 3 or 4, ..., 1 or 2 or 3 or 4 or 5 or 6}

Probability and Belief Formulae
• Support: SPT(A) = Σ M(θ), summed over all θ ∈ 2^Θ with θ ⊆ A
• Plausibility: PLS(A) = 1 - SPT(~A) = 1 - Σ M(θ), where the sum runs over all θ ∈ 2^Θ with θ ⊆ ~A
• In general, SPT(A) + SPT(~A) ≤ 1
• Uncertainty: U(A) = PLS(A) - SPT(A); if U(A) = 0 for all A, the mass distribution is Bayesian
Adapted from Greer, Thomas H., "Artificial Intelligence: A New Dimension in EW", Defense Electronics, October 1985, pp. 108-128.

Support and Plausibility
• Support: the degree to which the evidence supports a proposition; the sum of the probability masses for the proposition and its subsets
• Plausibility: the extent to which the evidence fails to refute a proposition; PLS(A) = 1 - SPT(~A)
• Examples
– A = (0, 0): SPT(A) = 0, so there is no supporting evidence; PLS(A) = 0 implies SPT(~A) = 1, so the evidence refutes A
– A = (0.25, 0.85): SPT(A) = 0.25 of supporting evidence and SPT(~A) = 0.15 of refuting evidence, giving the evidential interval [0.25, 0.85]
Adapted from Greer, op. cit.
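The support and plausibility definitions translate directly into subset and intersection tests over the mass distribution. A minimal Python sketch follows, encoding propositions as frozensets of hypotheses; the frame of discernment, the extra "OTHER" element (needed so that ~SAM-X is non-empty), and the labels are encoding assumptions made here, while the mass values anticipate the threat-warning example on the next slide.

```python
# Minimal sketch: evidential intervals [SPT(A), PLS(A)] from a D-S
# mass distribution. Propositions are frozensets of hypotheses; the
# frame (including "OTHER") is an encoding assumption.
THETA = frozenset({"SAM-X TTR", "SAM-X ACQ", "OTHER"})

MASS = {
    frozenset({"SAM-X TTR", "SAM-X ACQ"}): 0.3,  # "SAM-X" (either mode)
    frozenset({"SAM-X TTR"}): 0.4,
    frozenset({"SAM-X ACQ"}): 0.2,
    THETA: 0.1,                                  # "UNKNOWN" mass on Theta
}

def support(a, mass):
    """SPT(A): total mass on propositions contained in A."""
    return sum(m for prop, m in mass.items() if prop <= a)

def plausibility(a, mass):
    """PLS(A) = 1 - SPT(~A): total mass consistent with A."""
    return sum(m for prop, m in mass.items() if prop & a)

ttr = frozenset({"SAM-X TTR"})
print(round(support(ttr, MASS), 2), round(plausibility(ttr, MASS), 2))
# -> 0.4 0.8, i.e., the evidential interval [0.4, 0.8]
```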
Dempster-Shafer Example
• A D-S threat warning sensor (TWS) observes RF and PRF and produces the belief distribution:
– SAM-X: 0.3
– SAM-X, TTR: 0.4
– SAM-X, ACQ: 0.2
– UNKNOWN: 0.1
• The evidential intervals are then:
– SPT(SAM-X, TTR) = 0.4
– PLS(SAM-X, TTR) = 1 - SPT(~(SAM-X, TTR)) = 1 - SPT(SAM-X, ACQ) = 1 - 0.2 = 0.8
– so (SAM-X, TTR) = [0.4, 0.8]
• Similarly, (SAM-X) = [0.9, 1.0] and (SAM-X, ACQ) = [0.2, 0.6]
Adapted from Greer, op. cit.

Composite Uncertainty: Two-Source Example
• Source 1 reports the belief distribution:
– SAM-Y: 0.3
– SAM-Y TT ∨ SAM-Y ML: 0.4
– SAM-Y ACQ ∨ SAM-Y TT: 0.2
– No emitter at all: 0.1
• The resulting credibility intervals for Source 1:
– SAM-Y = [0.9, 1.0]
– SAM-Y TT = [0.0, 1.0]
– SAM-Y ML = [0.0, 0.8]
– SAM-Y ACQ = [0.0, 0.6]
– SAM-Y TT ∨ SAM-Y ML = [0.4, 1.0]
– SAM-Y ACQ ∨ SAM-Y TT = [0.2, 1.0]
– SAM-Y ACQ ∨ SAM-Y TT ∨ SAM-Y ML = [0.6, 1.0]
– No emitter at all = [0.0, 1.0]
• Source 2 reports the belief distribution:
– SAM-Y: 0.2
– SAM-Y TT: 0.4
– SAM-X: 0.2
– No emitter at all: 0.2
• The resulting credibility intervals for Source 2:
– SAM-Y ACQ = [0.0, 0.4]
– SAM-Y ACQ ∨ SAM-Y TT ∨ SAM-Y ML = [0.4, 0.8]
– SAM-Y TT ∨ SAM-Y ML = [0.4, 0.8]
– SAM-Y ACQ ∨ SAM-Y TT = [0.4, 0.8]
– SAM-Y TT = [0.4, 0.8]
– SAM-Y ML = [0.0, 0.4]
– SAM-X = [0.2, 0.4]

Dempster's Rules of Combination
1. The product of the mass assignments to two consistent propositions contributes to the proposition contained within both (e.g., m1(a1) m2(a1) contributes to m(a1))
2. Multiplying the mass assigned to uncertainty (Θ) by the mass assigned to any other proposition contributes to that proposition (e.g., m1(Θ) m2(a2) contributes to m(a2))
3. Multiplying uncertainty by uncertainty leads to a new assignment to uncertainty (e.g., m1(Θ) m2(Θ) contributes to m(Θ))
4. When inconsistency occurs between knowledge sources, the products of the masses of the inconsistent propositions are assigned to a measure of inconsistency denoted k (e.g., m1(a1) m2(a2) contributes to k when a1 and a2 conflict)

Composite Uncertainty: Computing Belief Distributions for Pooled Evidence
1. Compute all credibility intervals for Source 1 and Source 2
2. Map the masses of the two belief distributions against each other: each product of masses contributes to the intersection of the two propositions, or to the inconsistency k if they conflict
3. Compute composite beliefs
– K = the measure of all mass associated with conflicting reports; here K = (0.2 × 0.2) + (0.4 × 0.2) + (0.3 × 0.2) = 0.18, so 1 - K = 0.82
– For each proposition, sum all of the product masses that support the proposition and divide by 1 - K; e.g., for SAM-Y TT the mapped products A, B, C, D give (A + B + C + D)/(1 - K) = 0.49
4. Compute the credibility intervals for the pooled evidence and apply a decision rule (see the sketch following this section)
• Pooled-evidence belief distribution:
– SAM-Y: 0.17
– SAM-Y TT: 0.49
– SAM-Y TT ∨ SAM-Y ML: 0.20
– SAM-Y ACQ ∨ SAM-Y TT: 0.10
– SAM-X: 0.02
– UNKNOWN: 0.02
• Pooled-evidence credibility intervals:
– SAM-Y = [0.96, 0.98]
– SAM-Y TT = [0.49, 0.98]
– SAM-Y ML = [0.00, 0.39]
– SAM-Y ACQ = [0.00, 0.29]
– SAM-X = [0.02, 0.04]
– SAM-Y TT ∨ SAM-Y ML = [0.69, 0.98]
– SAM-Y ACQ ∨ SAM-Y TT = [0.59, 0.98]
– SAM-Y ACQ ∨ SAM-Y TT ∨ SAM-Y ML = [0.79, 0.98]
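A minimal Python sketch of Dempster's rule of combination, which reproduces the pooled-evidence numbers above (K = 0.18, fused SAM-Y TT = 0.49). Propositions are encoded as frozensets, the lecture's "no emitter at all" mass is treated as total uncertainty on Θ, and an extra "Y" element makes the proposition SAM-Y strictly broader than the union of its listed modes; all three are encoding assumptions made here to match the lecture's intervals.

```python
# Minimal sketch: Dempster's rule of combination for two mass functions.
# Frame encoding (extra "Y" element, "no emitter" as mass on Theta) is
# an assumption chosen to reproduce the lecture's pooled numbers.
from itertools import product

THETA = frozenset({"Y", "Y-TT", "Y-ML", "Y-ACQ", "X"})
SAM_Y = frozenset({"Y", "Y-TT", "Y-ML", "Y-ACQ"})  # SAM-Y, any mode

m1 = {SAM_Y: 0.3,
      frozenset({"Y-TT", "Y-ML"}): 0.4,
      frozenset({"Y-ACQ", "Y-TT"}): 0.2,
      THETA: 0.1}                                  # Source 1
m2 = {SAM_Y: 0.2,
      frozenset({"Y-TT"}): 0.4,
      frozenset({"X"}): 0.2,
      THETA: 0.2}                                  # Source 2

def dempster_combine(m1, m2):
    fused, k = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:       # consistent: product mass goes to the intersection
            fused[inter] = fused.get(inter, 0.0) + ma * mb
        else:           # conflict: product mass accumulates in k
            k += ma * mb
    return {p: m / (1.0 - k) for p, m in fused.items()}, k

fused, k = dempster_combine(m1, m2)
print(round(k, 2))                           # 0.18
print(round(fused[frozenset({"Y-TT"})], 2))  # 0.49
```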
Summary of Dempster-Shafer Fusion for Identity
• Each sensor's observables pass through a classifier to produce a declaration, and a mass distribution mi(Oj) is computed or enumerated for that declaration; this is a transformation from observation space to mass distributions
• The mass distributions are combined/fused via Dempster's rules of combination, M(Oj) = F(mi(Oj)), yielding a fused probability mass for each object hypothesis Oj and a general level of uncertainty, leading to combined evidential intervals
• Decision logic selects the best combined evidential interval, producing the fused identity declaration

Dempster-Shafer Inference
• The good news
– Allows incorporation of a priori information about hypotheses and propositions
– Allows utilization of subjective evidence
– Allows assignment of a general level of uncertainty
– Allows iterative updating
• The bad news
– Requires specification of "priors"
– Not "intuitive"
– May result in "weak" decisions
– More computationally demanding than Bayes
– Can become complex for dependent evidence
– May produce "idiot" D-S results

Summary of Voting for Identity Estimation
• Each sensor's observables pass through a classifier to produce a declaration Di; this is a transformation from observation space to declaration space, with the uncertainty in each declaration expressed as a weight (confidence) wi for that sensor
• The voting combination formula V(Oj) = Σi wi δi(Oj) produces a fused decision via weighted voting, where δi(Oj) = 1 if sensor i declared Oj and 0 otherwise
• Decision logic selects the highest vote (majority, plurality, etc.), producing the fused identity declaration

Some Decision Fusion Techniques
• Voting: y(j) = Σ(i=1 to N) X(i, j), where X(i, j) indicates whether sensor i declares class j; decision d = arg maxj y(j)
– Assumptions: all sensors make equal contributions; the simplest form of decision fusion
• Weighted decision: y(j) = Σ(i=1 to N) wi X(i, j); decision d = arg maxj y(j)
– Weights derived from sensor performance, expert knowledge of the system, or adaptive processing; equal weighting is equivalent to voting
• Bayesian decision: P(Oj | D1, ..., DK) = P(Oj) Π(k=1 to K) P(Dk | Oj) / Σ(i=1 to N) P(Oi) Π(k=1 to K) P(Dk | Oi); decision d = arg maxj P(Oj | D1, ..., DK)
– Uses the a priori probability that a hypothesis exists; addresses sensor reliability in the form of likelihoods; allows recursive updating of the prior probabilities

Topic 11 Assignments
• Preview the on-line Topic 11 materials
• Read Chapter 7 of Hall and McMullen (2004)
• Writing assignment 8: a one-page description of how the Level 2 process applies to your selected application
• Discussion 5: discuss the challenges of designing and implementing a data fusion system, both in general and for your selected application

Data Fusion Tip of the Week
Young researchers in automated reasoning and artificial intelligence wax enthusiastic about the power of computers and their potential to automate human-like reasoning processes (saying, in effect, "aren't computers wonderful!"). Later in their careers, these same researchers admit that it is a very difficult problem, but believe they could make significant progress with increased computer speed and memory. Still later, these researchers come to appreciate the complexities of the problem and praise human reasoning (saying, in effect, "aren't humans wonderful!").