IEEE Journal on Selected Topics in Signal Processing, Vol. 7, Issue 3, June 2013, pp. 376-389.

"Statistics 102" for Multisource-Multitarget Detection and Tracking

Ronald Mahler

Abstract—This tutorial paper summarizes the motivations, concepts, and techniques of finite-set statistics (FISST), a system-level, "top-down," direct generalization of ordinary single-sensor, single-target engineering statistics to the realm of multisensor, multitarget detection and tracking. Finite-set statistics provides powerful new conceptual and computational methods for dealing with multisensor-multitarget detection and tracking problems. The paper describes how "multitarget integro-differential calculus" is used to extend conventional single-sensor, single-target formal Bayesian motion and measurement modeling to general tracking problems. Given such models, the paper describes the Bayes-optimal approach to multisensor-multitarget detection and tracking: the multisensor-multitarget recursive Bayes filter. Finally, it describes how multitarget calculus is used to derive principled statistical approximations of this optimal filter, such as PHD filters, CPHD filters, and multi-Bernoulli filters.

Index Terms—multitarget tracking, multitarget detection, data fusion, finite-set statistics, FISST, random sets.

I. INTRODUCTION

This paper is a sequel to, and update of, a tutorial published in 2004 in the IEEE Aerospace and Electronic Systems Magazine [21]. That tutorial described some central ideas of finite-set statistics (FISST). Finite-set statistics is a systematic, unified approach to multisensor-multitarget detection, tracking, and information fusion. It has been the subject of considerable worldwide research interest during the last decade, including more than 600 research publications by researchers in more than a dozen nations. I attribute this interest to the fact that finite-set statistics:
• is based on explicit, comprehensive, unified statistical models of multisensor-multitarget systems;
• unifies two disparate goals of multitarget tracking—target detection and state estimation—into a single, seamless, Bayes-optimal procedure;
• results in new multitarget tracking algorithms—PHD filters, CPHD filters, multi-Bernoulli filters, etc.—that do not require measurement-to-track association, while still achieving tracking performance (localization accuracy, speed) comparable to or better than conventional multitarget tracking algorithms;
• results in promising generalized CPHD and multi-Bernoulli filters that can operate in unknown, dynamically changing clutter and detection backgrounds; and
• more generally, has been a fertile source of fundamentally new approaches in multisource-multitarget tracking and information fusion.

Copyright (c) 2013 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected]. R. Mahler is with Lockheed Martin Advanced Technology Laboratories, Eagan, MN. E-mail: [email protected].

The emphasis of the earlier tutorial was on the answers to three questions:¹
• How does one Bayes-optimally detect and track multiple noncooperative targets using multiple, imperfect sensors?
• How does one correctly model multisensor-multitarget systems so that Bayes-optimality is possible?
• How does one accomplish this using a "Statistics 101"-like formalism that is specifically designed for solving multitarget tracking and data fusion problems?
The answer to the first question—the multisourcemultitarget Bayes recursive filter—is computationally intractable in all but the simplest problems. The answers to the second and third questions—multitarget formal Bayes modeling and multitarget integro-differential calculus, respectively— were addressed only at a very high level. Thus this paper begins where the previous one left off, with emphasis on answers to the following, consequent questions: • How does one actually construct faithful Bayesian models of multisensor-multitarget systems? • How does one approximate the optimal multisourcemultitarget Bayes recursive filter in a principled statistical manner—meaning that the underlying models and their relationships are preserved as faithfully as possible? • What mathematical machinery—what specific “multitarget Statistics 101” methodology—makes this possible? The earlier tutorial paper was written at a very elementary level (it was presumed that even Bayes’ rule might be an unfamiliar concept). This paper continues along the same path, but does presume some basic knowledge. This includes undergraduate probability and calculus, motion and measurement models, probability density functions, likelihood functions, Markov transition densities, and so on. It should also be emphasized that the paper is a tutorial introduction to, not a survey of, finite-set statistics. It includes pointers to the some of the most significant developments, but these are by no means exhaustive. The paper is organized as follows. Section II describes the engineering philosophy that motivates finite set statistics. Section III presents a review of “Statistics 101” for single-sensor, single-target systems, focusing on the single-sensor, singletarget recursive Bayes filter. Section IV summarizes its generalization to “multisensor-multitarget Statistics 101,” focusing • 1 By “Bayes-optimal,” I mean that target state(s) are determined by a state estimator, applied to the posterior distribution of a Bayes filter, that minimizes the Bayes risk corresponding to some cost function (see [20], p. 63). IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 7, ISSUE 3, JUNE 2013, PP 376-389. on motion and measurement modeling and the multisensormultitarget recursive Bayes filter. Section V provides an overview of the primary approximations of this filter, the PHD, CPHD, and multi-Bernoulli filters. Section VI summarizes the principled statistical approximation methodology that leads to these filters. Conclusions can be found in Section VII. II. T HE P HILOSOPHY OF F INITE -S ET S TATISTICS Multisensor, multitarget systems introduce a major complication that is absent from single-sensor, single-target problems: they are comprised of randomly varying numbers of randomly varying objects of various kinds. These include varying numbers of targets; varying numbers of sensors with varying number of sensor measurements collected by each sensor; and varying numbers of sensor-carrying platforms. A rigorous mathematical foundation for stochastic multiobject problems—point process theory [4], [37]—has been in existence for a half-century. However, this theory has traditionally been formulated with the requirements of mathematicians rather than tracking and data fusion engineers in mind. The formulation usually preferred by mathematicians—random counting-measures—is inherently abstract and complex (especially in regard to probabilistic foundations) and not easily assimilable with engineering physical intuition. A. 
Motivations and Objectives The fundamental motivation for finite-set statistics is: • tracking and information fusion R&D engineers should not have to be virtuoso experts in point process theory in order to produce meaningful engineering innovations. As was emphasized in [21], engineering statistics is a tool and not an end in itself. It must have two qualities: • Trustworthiness: Constructed upon a systematic, reliable mathematical foundation, to which we can appeal when the going gets rough. • Fire and forget: This foundation can be safely neglected in most situations, leaving a serviceable mathematical machinery in its place. These two qualities are inherently in tension. If foundations are so mathematically complex that they cannot be taken for granted in most engineering situations, then they are shackles and not foundations. But if they are so simple that they repeatedly result in engineering blunders, then they are simplistic rather than simple. This inherent gap between trustworthiness and engineering pragmatism is what finite-set statistics attempts to bridge. Four objectives are paramount: • Directly generalize familiar single-sensor, single-target Bayesian “Statistics 101” concepts to the multisourcemultitarget realm. • Avoid all avoidable abstractions. • As much as possible, replace theorem-proving with “mechanical,” “turn-the-crank,” purely algebraic procedures. • Nevertheless retain all mathematical power necessary for effective engineering problem-solving. 101 B. Overview It is worthwhile to begin by first comparing the FISST “random finite set” (RFS) paradigm with the ubiquitous conventional paradigm: report-to-track association (MTA). 1) The “Standard” Measurement Model : The most familiar tracking algorithms presume the following measurement model, one that has its origins in radar tracking. A radar amplitude-signature for a given range-bin , azimuth , and elevation is subjected to a thresholding procedure such as CFAR (constant false alarm rate). If the amplitude exceeds the threshold, there are two possible reasons for the existence of this “blip.” First, it was caused by an actual target—in which case a “target detection” has occurred at z = ( ). Second, it was caused by a momentary surge of background noise—in which case a “false detection” or “false alarm” has occurred at z. A third possibility—that a target was present but was not detected—is referred to as a “missed detection.” For target-generated detections, the “small target” model is presumed. Targets are distant enough (relative to the radar’s resolution capability) that a single target generates a single detection. But they are also near enough that only a single target is responsible for any detection. 2) Measurement-to-Track Association (MTA): Because of the small-target assumption, a bottom-up, “divide and conquer” strategy can be applied to the multitarget detection and tracking problem ([20]. pp. 321-335). Suppose that, at time , we are in posession of “tracks” | | | | (1 x1 1 ) ( x )—i.e., possible targets where, for the track, x is its state (position, velocity), its error covariance matrix, and its “track label.” The | Gaussian distribution (x) = | (x − x ) is the “track density” of the track. Next, suppose that at time +1 we collect detections = {z1 z }. Typically, because of false alarms. The prediction step of an extended Kalman filter (EKF) is used to construct predicted tracks +1| +1| +1| +1| 1 ) ( x ). 
We can then (1 x1 construct the following hypothesis : for each , the target +1| +1| ) generated the detection z () ; or, alter( x natively, generated no detection. The excess measurements {z1 z }−{z (1) z () } are interpreted as false alarms or as having been generated by previously undetected targets. The hypothesis is a MTA. Taking all possibilities +1|+1 +1|+1 into account, we end up with a list 1 +1|+1 of MTAs. For each , we can apply the update step of an EKF to use z () to construct a revised track +1|+1 +1|+1 ( () x () () ). Multi-hypothesis trackers (MHTs) are currently the dominant tracking algorithms based on MTA. 3) Association-Free Multitarget Detection and Tracking : In contrast to MTA, FISST employs a top-down paradigm grounded in point process theory—specifically, in the theory of random finite sets (RFS’s). In place of the hypothesis-list +1|+1 +1|+1 , one has a probability distribution 1 () | (| ) on the finite-set variable = {x1 x } with ≥ 0, where () : 1 is the time-history of measurement-sets at time . Instead of the standard IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 7, ISSUE 3, JUNE 2013, PP 376-389. measurement model just described, one constructs from it a multitarget likelihood function () = +1 (|). The value () is the likelihood that a measurement-set will be generated, if targets with state-set are present. Given this, a multitarget version of the recursive Bayes filter (Section IV-C) is applied instead of the MTA procedure. Since this Bayes filter will in general be computationally intractable, it must be approximated, resulting in the PHD, CPHD, multiBernoulli and other filters (Section V). The following point should be emphasized: RFS algorithms are capable of “true tracking.” It is often, to the contrary, asserted that RFS algorithms are inherently incapable of constucting time-sequences of labeled tracks, because finite sets are order-independent. This is not the case. As was explained in [20], pp. 505-508 target states have, in general, the form x = ( x), where is a identifying label unique to each track. Given this, the multitarget Bayes filter—as well as any RFS approximation of it, including PHD and CPHD filters— can maintain temporally-connected tracks. In particular, Vo and Vo have used this approach to devise an exact, closed-form, computationally tractable solution to the multitarget recursive Bayes filter [42], [43]. Because the solution is exact, this filter’s track-management scheme is provably Bayes-optimal. 4) “Nonstandard” Measurement Models: However ubiquitous the standard model may be, it is actually an approximation—the result of applying a detection process to a sensor signature. RFS models and filters are being developed for “nonstandard” sensor sources that supply “raw” signature information. Perhaps the two most interesting instances are: • • RFS models and multi-Bernoulli track-before-detect (TBD) filters for pixelized image data [11], [12]. These filters have been shown to outperform the previously-best TBD filter, the histogram-PMHT filter. RFS models and CPHD filters for superpositional sensors [24], [35], [47]. These filters have been shown to significantly outperform a conventional MCMC (Markov chain Monte Carlo) approach, while also being much faster. III. S INGLE -S ENSOR , - TARGET S YSTEMS The purpose of this section is to summarize the basic elements of the conventional “Statistics 101” toolbox. 
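Before turning to the single-target toolbox, a brief aside on the MTA paradigm of Section II-B-2 may be helpful. Under the small-target assumption, each track is either assigned to a distinct measurement or declared a missed detection, with leftover measurements treated as clutter or new targets; the number of association hypotheses therefore grows combinatorially with the numbers of tracks and measurements. The following sketch (illustrative counts only, not any particular MHT implementation) makes this concrete.

```python
from itertools import combinations, permutations
from math import comb, perm

def count_mtas(n_tracks: int, n_meas: int) -> int:
    """Number of association hypotheses when each track is either missed or
    assigned a distinct measurement (leftover measurements are clutter/new targets)."""
    return sum(comb(n_tracks, k) * perm(n_meas, k)
               for k in range(min(n_tracks, n_meas) + 1))

def enumerate_mtas(tracks, measurements):
    """Explicit enumeration; practical only for small problems."""
    hyps = []
    for k in range(min(len(tracks), len(measurements)) + 1):
        for detected in combinations(tracks, k):            # which tracks are detected
            for assigned in permutations(measurements, k):   # which measurement each gets
                hyps.append(dict(zip(detected, assigned)))
    return hyps

print(len(enumerate_mtas(["T1", "T2"], ["z1", "z2", "z3"])))   # 13 hypotheses
for n, m in [(2, 3), (5, 10), (10, 20)]:
    print(n, m, count_mtas(n, m))    # hypothesis count grows combinatorially
```

The RFS approach described above sidesteps this enumeration by propagating a single multitarget distribution rather than a list of association hypotheses.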
102 through time: −→ | (x| ) corrector −→ predictor −→ +1| (x| ) +1|+1 (x| +1 ) −→ ↓estimator x̂+1|+1 Here, | (x| ) is the probability (density) that the target has state x, given the accumulated information . The predictor (time-update) step accounts for the increase in uncertainty in the target state between measurement collections. The corrector (measurement-update) step permits fusion of the newest measurement z+1 with previous measurements . These steps are defined by the time-prediction integral Z +1| (x| ) = +1| (x|x0 ) · +1| (x0 | )x0 (1) and Bayes’ rule +1 (z+1 |x) · | (x| ) +1 (z+1 | ) (2) +1 (z+1 |x) · | (x| )x (3) +1|+1 (x| +1 ) = where +1 (z+1 | ) = Z is the Bayes normalization factor. The estimator step consists of a Bayes-optimal state estimator, such as the maximum a posteriori (MAP) estimator: +1 ) xMAP +1|+1 = arg sup +1|+1 (x| x (4) (“Bayes-optimal” means that the estimator minimizes the Bayes risk corresponding to some cost function [20], p. 63.) The Bayes filter formulas require knowledge of two a priori density functions: the target Markov transition density +1| (x|x0 ) and the sensor likelihood function +1 (z|x). The former, +1| (x|x0 ), is the probability (density) that the target will have state x at time +1 if it had state x0 at time . The latter, +1 (z|x), is the probability (density) that the sensor will collect measurement z at time +1 if a target with state x is present. By “true” formulas for +1| (x|x0 ) and +1 (z|x) is meant: 0 • +1| (x|x ) and +1 (z|x) faithfully incorporate the motion and measurement models; and • no extraneous information has inadvertently been introduced into them. B. Moment Approximations of the Bayes Filter A. Single-Sensor, Single-Target Recursive Bayes Filter The primary tool is the recursive Bayes filter—the foundation for optimal single-sensor, single-target tracking. At various times 1 , a single sensor with unity probability of detection and no clutter, interrogates a single noncooperative target. The time-sequence of collected measurements is : z1 z and the state of the target—the information about it that we wish to know (position, velocity, type, etc.)—is x. The Bayes filter propagates a Bayes posterior distribution Historically, the Bayes filter has typically been implemented using moment approximations. Let (x) = 0 (x − x0 ) denote a Gaussian distribution with mean x0 (first-order moment of (x)) and covariance matrix 0 (a second-order moment of (x)). Assume that signal-to-noise ratio (SNR) is large enough that the track distributions can be approximately characterized by their first- and second-order moments: | (x| ) ∼ = | (x − x| ) +1| (x| ) ∼ = +1| (x − x+1| ) (5) (6) IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 7, ISSUE 3, JUNE 2013, PP 376-389. Then the Bayes filter can be replaced by a filter—the extended Kalman filter (EKF)—that propagates the first-and secondorder moments: x+1| x+1|+1 x| → → → → | +1| +1|+1 Similarly, assume that SNR is large enough that the track distributions can be approximately characterized by their firstorder moments: (8) for some fixed covariance . Then the Bayes filter can be replaced by a filter—for example, an - filter—that propagates only the first-order moment: → x| → x+1| → x+1|+1 → C. Single-Target Motion Modeling Target modeling is schematically summarized in Figure 1. At the top, interim target motion is mathematized as a statistical motion model. The function x = (x0 ) states that the target will have state x at time +1 if it had state x0 at time . 
Since this equation is typically just a guess, it is randomly perturbed by the motion noise (“plant noise”) W with probability distribution W (x). The information contained in this model is equivalent to that contained in the next line, the probability mass function (p.m.f.) +1| (|x0 ) = Pr(X+1| ∈ |x0 ) (9) +1| (x|x0 ) = W (x − (x0 )) (10) The p.m.f. is equivalent to the probability density function (p.d.f.) +1| (x|x0 ): This formula is a standard result easily found in standard textbooks. It is a consequence of the following equation: Z 0 +1| (x|x0 )x (11) +1| (|x ) = The validity of Eq. (11) ensures that Eq. (10) is “true” because it means that +1| (|x0 ) and +1| (x|x0 ) are entirely equivalent statistical descriptors of X+1| . motion model X+1| = (x0 ) + W |{z} | {z } | {z } predicted target deterministic plant noise probabilitymass function “true” Markov density Bayes filter predictor D. Single-Sensor, Single-Target Measurement Modeling Sensor modeling is schematically summarized in Figure 2. We begin with a statistical measurement model. The function z = +1 (x) states that the sensor will collect measurement z at time +1 if a target with state x is present. Because of sensor noise, this formula must be randomly perturbed by a noise-vector V with distribution V+1 (z). The information in this model is equivalent to that in the p.m.f. (7) | (x| ) ∼ = (x − x| ) +1| (x| ) ∼ = (x − x+1| ) ⇓⇑ +1| (|x0 ) = Pr(X+1| ∈ |x0 ) R +1| x ⇓⇑ x This p.m.f.—and thus the original measurement model—is equivalent to the p.d.f. +1 (z|x): The fact that this formula is “true” is assured by the equation Z +1 ( |x) = +1 (z|x)z (14) measurement model Z+1 = +1 (x) + V+1 | {z } | {z } | {z } measurement deterministic sensor noise ⇓⇑ probabilitymass function +1 ( |x) = Pr(Z+1 ∈ |x) R +1 z ⇓⇑ z single-object calculus “true” likelihood function z (x) = +1 (z|x) ⇓ single-target Bayes’ rule Bayes filter corrector +1| (x| ) → +1|+1 (x| +1 ) Figure 2: Single-Sensor, Single-Target Measurement Modeling E. Single-Target, Multisensor Data Fusion Suppose that we have two sensors, with—as in the singlesensor case—unity probability of detection and no clutter. Measurement-collection times are identical (synchronous), so 1 2 that the sensors collect measurement-streams and 1 2 with z and z collected simultaneously at time for any = 1 . Let the respective likelihood functions be 1 1 2 2 z1 (x) abbr. = (z|x), (15) z2 (x) abbr. = (z|x) 1 2 If the sensors are independent, then their joint likelihood function has the form 12 1 2 (16) z 1 2 (x) = 1 (x) · 2 (x) z z z Measurements are optimally fused using Bayes’ rule: ⇓ Figure 1: Single-Sensor, Single-Target Motion Modeling (13) +1 (z|x) = V+1 (z − +1 (x)) 1 2 (17) | (x| ) single-target prediction integral | (x| ) → +1| (x| ) (12) +1 ( |x) = Pr(Z+1 ∈ |x) single-object calculus x (x0 ) = +1| (x|x0 ) 103 12 = z1 1 2 z (x) · |−1 (x| 1 2 1 −1 2 2 (z z | −1 −1 ) −1 ) IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 7, ISSUE 3, JUNE 2013, PP 376-389. where 1 2 (18) (z z | −1 −1 ) Z 12 1 2 = z1 z2 (x) · |−1 (x| −1 −1 )x 1 2 The Bayes-optimal fusion approach, Eq. (16), employs a product likelihood. 
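To make Eqs. (16)-(18) concrete, here is a minimal numerical sketch of product-likelihood fusion for two conditionally independent sensors. It assumes a scalar state, Gaussian likelihoods, and a broad prior; all numerical values are illustrative.

```python
import numpy as np

# Grid over a scalar state x; broad (nearly uninformative) prior.
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
prior = np.ones_like(x) / (x[-1] - x[0])

def gauss(u, mean, sigma):
    return np.exp(-0.5 * ((u - mean) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

sigma = 1.0
z1, z2 = 0.3, -0.1             # measurements from the two sensors
L1 = gauss(x, z1, sigma)       # sensor-1 likelihood, Eq. (15)
L2 = gauss(x, z2, sigma)       # sensor-2 likelihood, Eq. (15)

def bayes_posterior(likelihood):
    unnorm = likelihood * prior
    return unnorm / (np.sum(unnorm) * dx)    # Bayes' rule with normalization, Eqs. (17)-(18)

def std(post):
    mean = np.sum(x * post) * dx
    return np.sqrt(np.sum((x - mean) ** 2 * post) * dx)

post1 = bayes_posterior(L1)            # single-sensor update
post12 = bayes_posterior(L1 * L2)      # product likelihood, Eq. (16)
print(std(post1), std(post12))         # fused std is about sigma / sqrt(2)
```

As expected, multiplying independent likelihoods tightens the posterior: the fused standard deviation is close to sigma divided by the square root of two, i.e., localization improves with each additional independent sensor.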
Consider the counterproposal in [38] that one should, instead, use an average: 2 1 1 (z1 (x) + z2 (x)) z 1 2 (x) = z 2 12 (19) One reviewer of this paper objected that this method hardly merits mention, because “data fusion averaging is not a sensible approach.” Readers’ patience is nevertheless requested, because it is being explicitly or implicitly promoted by rather powerful individuals. Eq. (19) is problematic because whereas product-likelihoods inherently improve target localization, average-likelihoods inherently worsen it. Consider the following simple example: two bearing-only sensors in the plane with respective Gaussian likelihood functions 1 ( ) = 2 ( − ), 2 (21) +1| ( ) = 20 ( − 0 ) · 20 ( − 0 ) where 20 is arbitrarily large—so that +1| ( ) is effectively uniform. Let 1 2 be the measurements collected by the sensors. Then Bayes’ rule yields Bayes ( ) ∼ +1|+1 = 2 ( − 1 ) · 2 ( − 2 ) (22) This results in a triangulated localization at (1 2 ) with variance ∼ = 2 2 . But with the average likelihood, 1 ∼ = 2 ∙ 2 ( − 1 ) · 20 ( − 0 ) +20 ( − 0 ) · 2 ( − 2 ) x ∼ = +1| (x|x0 ) · |x | (25) +1| (x |x0 ) |x | |x |&0 (26) +1| (x|x0 ) = lim This, the Lebesgue differentiation theorem, provides one (but not the only) way of deriving +1| (x|x0 ) from +1| (|x0 ) using a constructive Radon-Nikodým derivative ([9], pp. 144-150). For the model in Figure 1 we have +1| (x |x0 ) = Pr( (x0 ) + W ∈ x ) ∼ = W (x − (x0 )) · |x | (27) (28) from which Eq. (10) follows. IV. M ULTISENSOR -M ULTITARGET S YSTEMS The purpose of this section is to summarize the basic elements of “multisensor-multitarget Statistics 101.” (20) ( ) = 2 ( − ) That is, the sensors are oriented so as to triangulate the position of a target located at ( ). For conceptual clarity, let the prior distribution be av ( ) +1|+1 This can be accomplished as follows. Let x be an arbitrarily small region around x of size |x |. Then Z 0 +1| (x |x ) = +1| (x|x0 )x (24) and therefore F. Data Fusion Via Averaging? 104 ¸ (23) This distribution has four “tails” whose lengths increase with the size of its variance, which is ∼ = 20 → ∞. Now apply additional bearing-only sensors, all with orientations different from the first two and each other. The variance increases with the number of averaged sensors—whereas it greatly decreases if Bayes’ rule is used instead. As we shall see in Section V-C, a generalization of Eq. (19) is what leads to the very poor performance of the averagebased multisensor PHD filter proposed in [38]. G. Constructing Markov Densities and Likelihoods Eqs. (10,13) do not tell us how to construct explicit formulas for +1| (x|x0 ) and +1 (z|x) from explicit formulas for +1| (|x0 ) and +1 ( |x). A. Random Finite Sets (RFS’s) This section introduces the concept of an RFS as the multitarget analog of a random vector. 1) Random Single-Target States : The state x of a singletarget system may (as an example) have the form x = ( ) where are position variables, are velocity variables, and ∈ is a discrete identity variable (which could be, for instance, a track label). In a Bayesian approach, the state at time must be a random state X| . The precise mathematical definition of a random state X| requires that it actually be a “measurable mapping” from a “probability space” to the state space. In turn, the state space must be equipped with a “topology,” typically (but not always) a Euclidean topology. While such details are mathematically necessary, for engineering purposes they can usually be taken for granted. 
2) Random Multitarget States : The state of a multitarget system, on the other hand, is most accurately represented as a finite set of the form = {x1 x }. Here, not only the individual target states x1 x are random but also their number (cardinality) . This includes the possibility = 0 (no targets are present), in which case we write = ∅ (the null set). The finite-set representation is most natural because—given that each target already has its own unique identity, as indicated by a variable such as —from a physical point of view the targets have no inherent order. Thus in a Bayesian formulation, a state-set is actually a random state-set Ξ| —it is a random finite set (RFS). Similar comments apply to the measurements collected from the targets by a sensor. These also usually have no inherent physical ordering. They thus have the form = {z1 z }, where not only the individual measurements z1 z are random, but also their number ≥ 0. Thus IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 7, ISSUE 3, JUNE 2013, PP 376-389. in a Bayesian development a measurement-set is actually a random measurement-set Σ+1 —an RFS. The RFS representation is more “engineering friendly” than the random-measure representation of standard point process theory. A finite set {x1 x } is easily visualizable as a point pattern—for example, in the plane or in three dimensions. Similarly, an RFS is easily visualizable as a random point pattern. An everyday example of an RFS: the stars in a night sky, with many stars winking in and out and/or slightly varying in their apparent position ([20], pp. 349-356). 3) “Fire-and-Forget” Foundations of RFS’s : If we are to have a trustworthy mathematical foundation for multisensormultitarget systems, we must precisely define RFS’s. This forces us to define topologies on so-called hyperspaces—that is, spaces X∞ whose “points” are subsets (in our case finite subsets) of some other space X. Two hyperspaces are of engineering interest: the hyperspace of finite subsets of the state space, and the hyperspace of finite subsets of the measurement space. If we employed the random-measure formulation of point process theory, we would be forced to work with abstract probability measures defined on measurable subsets of an abstract space whose “points” are counting measures. In an arbitrary RFS formulation, this would be replaced by equally abstract probability measures ∞ | () = Pr(Ξ| ∈ ) defined on measurable subsets of X∞ , with any “point” of being a finite subset of X. Luckily, a simpler “stochastic geometry” formulation [37] is available. Its hyperspace topology—the to be equivaFell-Matheron topology—allows ∞ | () lently replaced by the multitarget analog of a conventional probability-mass function | () = Pr(X| ∈ ).2 This is the belief-mass function (b.m.f.) | () = Pr(Ξ| ⊆ ) (29) which is defined on (closed) subsets of X. 4) “Fire-and-Forget” Foundations of Multitarget Tracking : The upshot of Eq. (29) is that, in finite-set statistics, it is usually possible to entirely avoid abstractions such as topologies, measurable mappings, and the “randomness” of finite sets in the formal sense. More generally, finite-set statistics is intentionally formulated as a stripped-down version of point process theory—one in which we attempt to avoid all avoidable abstractions. As an illustration, concepts such as “thinning” and “marking” are basic to purely mathematical treatments of the subject. 
But in multitarget detection and tracking, these concepts appear only in a few, concrete contexts that can be adequately addressed at a purely engineering level. Missed detections and disappearing targets can both be described as forms of thinning; and target identity as a form of marking. But from an engineering perspective, does the imposition of such concepts represent an increase of content—or of pedantry? 2 The reason is as follows. Consider the set function | () = 1 − | ( ) = Pr(Ξ| ∩ 6= ∅). Then the Choquet-Matheron capacity theorem states that the additive measure | () is equivalent to the nonadditive measure | (), in the sense that both completely characterize the probabilistic behavior of Ξ| (see [9], p. 6 or [20], p. 713). 105 As a second illustration, in FISST density functions are systematically used in place of measures, except when this is not possible. Thus the Dirac delta function is employed even though it produces engineering-heuristic abbreviations of rigorous expressions (as in Eq. (81), for example). B. Probability Distributions of RFS’s Just as a random state-vector X| has a probability density | (x) = X| (x), so an RFS has a multitarget probability density function (m.p.d.f.) (30) | () = Ξ| () Its form varies with the number of targets: ⎧ | (∅) if =∅ ⎪ ⎪ ⎪ ⎨ | ({x1 }) if = {x1 } | () = | ({x1 x2 }) if || = 2, = {x1 x2 } ⎪ ⎪ ⎪ .. .. ⎩ . . (31) where || denotes the number of elements in . Also, its units of measurement vary with target number: if are the units of x, then the units of | () are −|| . In general, a function () that satisfies the same property with respect to units is a multitarget density function. Its set integral in the region is defined to be Z () (32) = (∅) Z ∞ X 1 | ({x1 x })x1 · · · x + ! × × ≥1 | {z } times where, as a convention, define | ({x1 x }) = 0 whenever |{x1 x }| 6= . The probability that there are elements in Ξ| is Z | () = () (33) ||= Z 1 | ({x1 x })x1 · · · x (34) = ! Thus | () for ≥ 0 is a probability distribution on the number of targets—the cardinality distribution of Ξ| . C. Multisensor-Multitarget Recursive Bayes Filter This filter is the theoretical foundation for optimal singlesensor, single-target tracking. At times 1 , one or more of sensors interrogate an unknown number of unknown, noncooperative target. The time-sequence of collected measurement-sets is () : 1 . The multisensormultitarget Bayes filter propagates a multitarget Bayes posterior distribution through time: −→ | (| () ) corrector −→ predictor −→ +1| (| () ) +1|+1 (| (+1) ) −→ ↓multitarget estimator ̂+1|+1 IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 7, ISSUE 3, JUNE 2013, PP 376-389. These steps are defined by multitarget analogs of the timeprediction integral Z +1| (| () ) = +1| (| 0 ) · +1| ( 0 | () ) 0 (35) and of Bayes’ rule +1|+1 (| (+1) ) = where +1 (+1 | () )= Z +1 (+1 |) · | (| () ) (36) +1 (+1 | () ) +1 (+1 |) · | (| () • • x0 will persist into time +1 and transition to some other (random) state X . Then x0 will transition to ½ ∅ if disappears (prob. 1 − (x0 )) persists {X } if (38) Suppose that is the set of targets that newly appear at time +1 . Then the RFS motion model has the form (x0 ) = Ξ+1| = (x01 ) ∪ ∪ (x00 ) ∪ (39) ) (37) The estimator step consists of a multitarget Bayes-optimal state estimator. As was explained in [21], multitarget versions of the maximum a posteriori (MAP) and expected a posteriori (EAP) estimators do not exist. Rather, one must use alternative estimators ([20], pp. 494-508). 
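To make the set integral of Eq. (32) and the cardinality distribution of Eq. (34) concrete, the following sketch evaluates them by quadrature for a Poisson multitarget density (the form appearing later in Eq. (69)), on an assumed one-dimensional state space with a Gaussian spatial density and an illustrative expected target number. The recovered cardinality distribution is Poisson, as it should be.

```python
import numpy as np
from math import exp, factorial

# Discretized 1-D single-target state space for quadrature.
x = np.linspace(-8.0, 8.0, 801)
dx = x[1] - x[0]

mu = 1.7                                          # assumed expected number of targets
s = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)    # assumed spatial density s(x)

def f_mpdf(points):
    """Poisson m.p.d.f. (the form of Eq. (69)): f(X) = exp(-mu) * prod_{x in X} mu*s(x)."""
    val = exp(-mu)
    for xi in points:
        val *= mu * np.interp(xi, x, s)
    return val

# Cardinality distribution via Eq. (34): p(n) = (1/n!) * n-fold integral of f({x1,...,xn}).
p0 = f_mpdf([])
p1 = sum(f_mpdf([xi]) for xi in x) * dx
# n = 2: same integrand as f_mpdf, vectorized over the double integral.
p2 = exp(-mu) * np.sum((mu * s)[:, None] * (mu * s)[None, :]) * dx**2 / factorial(2)

for n, p in enumerate([p0, p1, p2]):
    print(n, p, exp(-mu) * mu**n / factorial(n))   # agrees with Poisson(mu)
```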
As in the single-sensor, single-target case, the multisensormultitarget Bayes filter requires two a priori density functions: the multitarget Markov transition density +1| (| 0 ) and the multisensor-multitarget likelihood function +1 (|). Here, +1| (| 0 ) is the probability (density) that the targets will have state-set at time +1 if they had stateset 0 at time . Also, +1 (|) is the probability (density) that the sensors will jointly collect measurement-set at time +1 if targets with state-set are present. In the single-sensor, single-target case, the formulas for the Markov density and likelihood function—Eqs. (10,13)—are never derived but, rather, simply looked up. In multisensormultitarget problems, this is not possible because no standard references exist. Thus one must ask: • 106 How does one construct statistical multitarget motion models for any given application? In particular, how does one model phenomena such as target disappearance and target appearance? How does one construct statistical multisensormultitarget measurement models for any particular set of sensors? In particular, how does one model phenomena such as sensor fields of view and clutter? Given such models, how does one construct formulas for the “true” multitarget Markov density and the “true” multisensor-multitarget likelihood function? That is, how does one know that they are not heuristic contrivances, or that no extraneous information has inadvertently been introduced? D. Multitarget Motion Modeling This is summarized in Figure 3. At the top, interim target motions are represented as a RFS motion model. As an example, consider the most commonly assumed multitarget motion model—the “standard” such model. At time suppose that the target state-set is 0 = {x01 x00 } with | 0 | = 0 . At time +1 , either each of the targets persists or disappears. Let (x0 ) be the probability that The set-theoretic union symbol ‘∪’ indicates that at time +1 , targets will be either persisting targets or new targets. It is assumed that (x01 ) (x00 ) are independent. The information contained in this model is equivalent to that contained in the next line of Figure 3, the b.m.f. +1| (| 0 ). Because of independence, it is +1| (| 0 ) = (x01 ) () · · · (x0 0 ) () · () (40) where the b.m.f. of (x0 ) is (41) (x0 ) () Pr( (x0 ) Pr( (x0 ) ⊆ ) = ∅) + Pr( (x0 ) 6= ∅ (x0 ) ⊆ )(42) Z = 1 − (x0 ) + (x0 ) +1| (x|x0 )x (43) = = Also, the b.m.f. of is normally chosen to be Poisson: µ ¶ Z () = exp −+1| + +1| (x)x +1| (44) is the expected number of appearing targets; and Here +1| (x), the “spatial distribution,” is the probabilty (density) that an appearing target will have state x. +1| A central aspect of finite-set statistics is a set of procedures for deriving the formula for the multitarget Markov density +1| (| 0 ) from the formula for +1| (| 0 ). These two statistical descriptors are related by the equation +1| (| 0 ) = Z +1| (| 0 ) (45) for all . The validity of this equation ensures that the formula for +1| (| 0 ) is “true.” This is because Eq. (45) states that +1| (| 0 ) and +1| (| 0 ) are equivalent. The explicit formula for +1| (| 0 ) is too complicated to state here—see [20], pp. 466-473. IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 7, ISSUE 3, JUNE 2013, PP 376-389. 
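Although the closed-form multitarget Markov density is complicated, the "standard" multitarget motion model itself is easy to simulate, which is often the best way to build intuition for Eqs. (38)-(44). The sketch below draws one sample of the predicted state-set: each prior target survives with probability pS and transitions through an assumed nearly-constant-velocity Gaussian model, and new targets appear according to a Poisson birth RFS. All numerical values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed single-target model: x = (position, velocity), nearly-constant-velocity motion.
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])
Q = 0.05 * np.eye(2)               # plant-noise covariance (the W_k in the motion model)
p_survive = 0.95                   # pS(x'), taken state-independent here
birth_rate = 0.2                   # expected number of appearing targets, Eq. (44)
birth_mean = np.zeros(2)
birth_cov = np.diag([25.0, 1.0])   # spatial distribution of appearing targets

def predict_state_set(X_prev):
    """One draw of Xi_{k+1|k} = T(x'_1) U ... U T(x'_n) U B_{k+1|k}, Eq. (39)."""
    X_next = []
    for x_prev in X_prev:                       # survival/transition RFS T(x'), Eq. (38)
        if rng.random() < p_survive:
            X_next.append(F @ x_prev + rng.multivariate_normal(np.zeros(2), Q))
    for _ in range(rng.poisson(birth_rate)):    # Poisson birth RFS, Eq. (44)
        X_next.append(rng.multivariate_normal(birth_mean, birth_cov))
    return X_next

X = [np.array([0.0, 1.0]), np.array([5.0, -0.5])]   # current state-set (two targets)
for k in range(3):
    X = predict_state_set(X)
    print(f"time step {k+1}: {len(X)} targets")
```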
multitarget motion model belief-mass function “true” multitarget Markov density multitarget Bayes filter predictor The multisensor-multitarget likelihood function +1 (|) is related to the belief-mass function by the equation Z +1| (|) (51) +1 ( |) = Ξ+1| = ( 0 ) ∪ |{z} | {z } | {z } predicted targets surviving 107 appearing ⇓⇑ The validity of this equation ensures that the formula for +1 (|) is “true,” because it shows that +1 ( |) and +1| (|) are equivalent. The explicit formula for +1| (|) is too complicated to reproduce here—see [20], pp. 408-421. +1| (| 0 ) = Pr(Ξ+1| ⊆ | 0 ) R +1| ⇓⇑ multiobject calculus multitarget meas’t model ( 0 ) = +1| (| 0 ) Σ+1 = +1 (x) ∪ +1 | {z } | {z } | {z } measurements ⇓ belief-mass function +1 ( |x) = Pr(Σ+1 ⊆ |x) R +1 ⇓⇑ Figure 3: Multisensor-Multitarget Motion Modeling E. Multisensor-Multitarget Measurement Modeling This is summarized in Figure 4. We begin with a RFS multisensor-multitarget measurement model. Consider the “standard” such model. Suppose that the state-set for the targets at time +1 is = {x1 x } with || = . It is assumed that each target generates at most a single measurement, and that any measurement is generated by at most a single target. Let (x ) be the probability that the target x generates a measurement. Then the set of measurements Υ+1 (x ) generated by the target with state x will have at most a single element: ½ ∅ if undetected (prob. 1 − (x )) Υ (x ) = detected {Z } if (46) Suppose that +1 is the set of measurements that are generated by no target—i.e., the clutter measurements. Then the RFS measurement model has the form Σ+1 = Υ (x1 ) ∪ ∪ Υ (x ) ∪ +1 (47) The symbol ‘∪’ indicates that measurements consist of target-generated measurements or clutter measurements. It is assumed that Υ (x1 ) Υ (x ) +1 are statistically independent. Because of independence, the b.m.f. of the RFS model is +1 ( |) = Υ+1 (x1 ) ( ) · · · Υ+1 (x ) ( ) · +1 ( ) (48) where the b.m.f. of Υ+1 (x0 ) is Z +1 (z|x )z (49) Υ+1 (x ) ( ) = 1 − (x ) + (x ) and the b.m.f. of +1 is usually chosen to be Poisson: µ ¶ Z +1 ( ) = exp −+1 + +1 +1 (z)z (50) Here +1 (z) is the spatial distribution of clutter measurements, and +1 is the expected number of clutter measurements at time +1 (the “clutter rate”). multiobject calculus “true” multitarget likelihood function multitarget Bayes filter corrector clutter ⇓⇑ multitarget prediction integral | (| () ) → +1| (| () ) targets () = +1 (|) ⇓ multitarget Bayes’ rule +1| (| () ) → +1|+1 (| (+1) ) Figure 8: Multisensor-Multitarget Measurement Modeling F. Multitarget Calculus for Modeling Eqs. (45,51) do not tell us how to construct formulas for +1| (| 0 ) and +1| (|) from formulas for +1| (| 0 ) and +1 ( |). This is the purpose of multitarget calculus, which generalizes the reasoning used in Section III-G. Let x and x0 be arbitrarily small regions surrounding = − x0 , where x and, for any closed subset , let x0 def. ‘−’ indicates set-theoretic difference. Then x0 and x are disjoint and so +1| (x0 ∪ x |x0 ) − +1| (x0 |x0 ) = +1| (x |x0 ) (52) and so from Eq. (26), (53) +1|+1 (x|x0 ) +1| (x0 ∪ x |x0 ) − +1| (x0 |x0 ) lim = lim 0 |&0 | |&0 |x | |x x 1) Set Derivatives : For any real-valued set function () define the generalized Radon-Nikodým derivative ([9], pp. 144-150): () = lim 0 |&0 x |x lim |x |&0 (x0 ∪ x ) − (x0 ) |x | (54) Extend this definition as follows. For any = {x1 x } with || = ≥ 1, define () = () x1 · · · x (55) IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 7, ISSUE 3, JUNE 2013, PP 376-389. 
and, if = ∅, define () = () (56) Then Eqs. (54-56) define the set derivative of () with respect to ([9], p. 150-151). The set derivative is the inverse operation of the set integral: Z (∅) (57) () = µZ ¶ ( ) = () (58) 2) Construction of “True” Markov Densities and Likelihood Functions : It is possible to derive “turn-the-crank” rules of differentiation for set derivatives: sum rules, power rules, product rules, chain rules, etc. ([20], pp. 383-395). These rules permit the explicit construction of formulas for multitarget Markov densities and multitarget likelihood functions. This is because of the following two formulas, which are direct consequences of Eqs. (57,58): ∙ ¸ +1| (| 0 ) (59) +1| (| 0 ) = =∅ ∙ ¸ +1 ( |) (60) +1 (|) = =∅ V. P RINCIPLED A PPROXIMATE M ULTITARGET F ILTERS The multisensor-multitarget recursive Bayes filter of Section IV-C is usually computationally intractable. How can we approximate it in a manner that preserves, as faithfully as possible, the underlying models and their interrelationships? This question is answered by assuming that the m.p.d.f.’s | (| () ) and/or +1| (| () ) have a particular simplified form—one that permits approximate closed-form solution of the multitarget Bayes filter. Three types of approximation have been extensively investigated in the literature thus far: Poisson, independent identically distributed cluster (i.i.d.c.), and multi-Bernoulli. They result in, respectively, PHD filters, CPHD filters, and multi-Bernoulli filters. The purpose of this section is to summarize these filters, as well as their extensions to unknown clutter and detection profiles. A. Probability Hypothesis Density (PHD) Filters Recall from Section III-B that - filters are based on first-moment approximation of the single-sensor, single-target Bayes filter as in Eqs. (7,8). By analogy, we assume that SNR is large enough that the multitarget Bayes filter can be approximated by its first-order statistical moments. But first one must ask: What is the first-order moment of a multitarget probability distribution? 1) Probability Hypothesis Density Functions : The naïve definition of the first-order moment (expected value) of a multitarget distribution would be Z (61) | = · | (| () ) However, it is mathematically undefined for the simple reason that addition and subtraction ± of finite sets 108 is undefined. Thus instead we must employ an alternative strategy: replace by some doppelgänger for which addition and subtraction is definable. In point process theory, one (intuitively speaking) chooses to be X y (x) (62) (x) = y∈ where y (x) is the Dirac delta function concentrated at y. In this case Eq. (61) is replaced by ([20], pp. 581-582) Z (x) · | (| () ) (63) | (x| () ) = Z (64) = | ({x} ∪ | () ) This function is called a probability hypothesis density (PHD) or first-moment density function.3 It is completely characterized by the following property: Z | (x| () )x = expected no. of targets in (65) Thus the number | (x| () ) can be understood as the track density at x. A target with state x is more likely to be present in the scene when | (x| () ) is large than when it is small. Consequently, it is possible to use | (x| () ) to estimate the number and states of the targets. Let Z (66) | = | (x| () )x be the total expected number of targets in the scene. Round | off to the nearest integer , and then determine those values x1 x of x that correspond to the highest “peaks” of | (x| () ). Then ̂| = {x1 x } is an estimate of the number of targets and their states. 
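To illustrate the estimation procedure just described, here is a minimal sketch for a PHD represented as a Gaussian mixture (a common implementation choice; see Section V-B2). The component weights and means are illustrative assumptions. The expected target number is the integral of the PHD, i.e., the sum of the weights; rounding it and taking the means of the strongest components yields the multitarget state estimate.

```python
import numpy as np

# Assumed Gaussian-mixture representation of the PHD D_{k|k}(x) = sum_i w_i N(x; m_i, P_i).
weights = np.array([0.93, 0.88, 0.12, 0.05])          # illustrative
means = [np.array([ 1.0, 0.5]),
         np.array([-4.0, 0.1]),
         np.array([10.0, 0.0]),
         np.array([ 3.0, 2.0])]

def extract_states(weights, means):
    """Estimate target number via Eq. (66) (sum of weights, rounded), then take the
    means of the highest-weight components as the 'peaks' of the PHD."""
    N_hat = int(round(float(np.sum(weights))))
    order = np.argsort(weights)[::-1]         # strongest peaks first
    return N_hat, [means[i] for i in order[:N_hat]]

N_hat, X_hat = extract_states(weights, means)
print("expected number of targets:", float(np.sum(weights)))   # ~1.98
print("estimated number:", N_hat)                               # 2
print("estimated state-set:", X_hat)
```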
2) Example of a PHD : Suppose that, in a one-dimensional scenario, the multitarget distribution (m.p.d.f.) corresponds to two targets located at = 1 and = 2 : | ({ }) = 2| ( − 1 ) · 2| ( − 2 ) (67) +2| ( − 2 ) · 2| ( − 1 ) Then the corresponding PHD can be shown to be: | () = 2| ( − 1 ) + 2| ( − 2 ) (68) 3) PHD Filters in the General Sense : The m.p.d.f. of a Poisson process has the form Y (x) (69) () = − · x∈ where (x) is a density function with integral = R (x)x. In analogy with constant-gain Kalman filters— Eqs. (7,8)—assume that the multitarget densities in the multitarget Bayes filter are all approximately Poisson. Then this 3 PHDs are also known as intensity functions. I avoid this terminology because of the potential for confusion with the many alternative meanings of “intensity” in tracking and information fusion applications. “PHD” is a historical usage [18]. IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 7, ISSUE 3, JUNE 2013, PP 376-389. filter can be approximately replaced by a first-order moment filter of the form −→ | (x| () ) corrector −→ predictor −→ +1| (x| () ) +1|+1 (x| (+1) ) −→ Such a filter is called a PHD filter in the general sense. 4) The “Classical” PHD Filter : The “classical” PHD filter is a PHD filter with these specific modeling assumptions [18], [20]: (1) a single sensor; (2) all target motions are independent; (3) measurements are conditionally independent of the target states; (4) the clutter process is Poisson in the sense of Eq. (50) and is independent of other measurements; (5) target-generated measurements are Bernoulli in the sense of Eq. (46); (6) the surviving-target process is Bernoulli in the sense of Eq. (38), and independent of persisting targets; and (7) the appearingtarget process is Poisson in the sense of Eq. (44). Using the methodology in Section VI, it can be shown that the exact4 time-update equation for the classical PHD filter is ([18], [20], Chapter 16):5 +1| (x| () ) (70) = +1| (x) Z + (x0 ) · x (x0 ) · +1| (x| () )x0 and, if +1 is the currently-collected measurement-set, that the approximate measurement-update equation is +1|+1 (x| (+1) ) ∼ = (x) · +1| (x| () ) (71) +1 Here the PHD “pseudolikelihood” is X (x) · z (x) +1 (x) = 1 − (x) + +1 +1 (z) + +1 (z) z∈+1 (72) and +1 (z) is the clutter spatial distribution and +1 is the clutter rate. Also, Z (73) +1 (z) = (x) · z (x) · +1| (x| () )x The classical PHD filter has attractive computational characteristics: its order of complexity is () where is the current number of measurements and is the current number of tracks. The major limitation of the PHD filter is that | is not a stable instantaneous estimate of target number— its variance is typically large. Thus in practice, | must be averaged over a time-window in order to get stable targetnumber estimates. B. Cardinalized PHD (CPHD) Filters The CPHD filter is a generalization of the PHD filter. It has low-variance instantaneous estimates of target number and better tracking performance, but at the cost of greater computational complexity. Its computational order is (3 ), although this can be reduced to (2 ) using numerical balancing techniques [10]. 4 There seems to be a misconception in some quarters that Eq. (70) is approximate—i.e., requires the assumption that the prior multitarget distribution | (| () ) be Poisson. This is not true. 5 For clarity, the “target-spawning” model is neglected. 
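Since the iterated-corrector approach discussed next simply applies the single-sensor PHD corrector once per sensor, a minimal Gaussian-mixture sketch of the measurement update of Eqs. (71)-(73) may be useful for orientation. It assumes a linear-Gaussian measurement model z = Hx + v, a constant probability of detection, and constant clutter intensity, in the spirit of the widely used Gaussian-mixture implementation; it is not any particular multisensor variant, and all numerical values in the usage example are illustrative.

```python
import numpy as np

def gm_phd_update(weights, means, covs, Z, H, R, p_D, clutter_intensity):
    """One GM-PHD measurement update (Eqs. (71)-(73)), linear-Gaussian case.
    weights/means/covs describe the predicted PHD D_{k+1|k}; Z is the measurement set;
    clutter_intensity is the clutter PHD kappa(z), assumed constant here."""
    new_w, new_m, new_P = [], [], []
    # Missed-detection terms: (1 - pD) * D_{k+1|k}(x).
    for w, m, P in zip(weights, means, covs):
        new_w.append((1.0 - p_D) * w); new_m.append(m); new_P.append(P)
    # Detection terms, one group per measurement z.
    for z in Z:
        w_z, m_z, P_z = [], [], []
        for w, m, P in zip(weights, means, covs):
            S = H @ P @ H.T + R                           # innovation covariance
            K = P @ H.T @ np.linalg.inv(S)                # Kalman gain
            q = np.exp(-0.5 * (z - H @ m) @ np.linalg.solve(S, z - H @ m)) \
                / np.sqrt(np.linalg.det(2 * np.pi * S))   # likelihood L_z evaluated in GM form
            w_z.append(p_D * w * q)
            m_z.append(m + K @ (z - H @ m))
            P_z.append((np.eye(len(m)) - K @ H) @ P)
        denom = clutter_intensity + sum(w_z)              # kappa(z) + tau(z), Eqs. (72)-(73)
        new_w += [wz / denom for wz in w_z]; new_m += m_z; new_P += P_z
    return new_w, new_m, new_P

# Illustrative usage: one predicted component, one measurement.
H = np.array([[1.0, 0.0]]); R = np.array([[0.25]])
w, m, P = gm_phd_update([0.9], [np.array([0.0, 1.0])], [np.eye(2)],
                        [np.array([0.2])], H, R, p_D=0.95, clutter_intensity=0.01)
print(sum(w))   # updated expected target number
```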
109 1) CPHD Filters in the General Sense: The cardinality distribution | (| () ) of a multitarget track density | (| () ) was defined in Eq. (33). For each , | (| () ) is the probability that there are targets in the scene. The multitarget density of an independently identically distributed cluster (i.i.d.c.) process has the form Y (x) (74) () = ||! · (||) · x∈ where () is a cardinality distribution and (x) is a probability density (a “spatial distribution”). Assume that the multitarget distribuions in the multitarget Bayes filter are approximately i.i.d.c. Then this filter can be approximately replaced by a higher-order moment filter of the form ½ ½ +1| (x| () ) | (x| () ) predictor −→ −→ () | (| ) +1| (| () ) corrector −→ ½ +1|+1 (x| (+1) ) +1|+1 (| (+1) ) −→ Any such filter is a CPHD filter in the general sense. 2) The “Classical” CPHD Filter : The “classical” CPHD filter is a CPHD filter with these specific modeling assumptions [19], [20]: (1) single sensor; (2) independent target motions; (3) conditionally independent measurements; (4) the clutter process is i.i.d.c. and independent of other measurements; (5) target-generated measurements are Bernoulli in the sense of Eq. (46); (6) surviving targets are Bernoulli in the sense of Eq. (38), and independent of persisting targets; and (7) the appearing-target process is i.i.d.c. Given these assumptions and the methodology in Section VI, one can derive the time- and measurement-update equations for the classical CPHD filter. These equations are beyond the scope of this paper (see [20], Chapter 16). The classical PHD and CPHD filters are most commonly implemented using either Gaussian mixture techniques (assuming moderate motion and/or measurement nonlinearities) or particle methods (for stronger nonlinearities). See [20], Chapter 16, for more details. The classical PHD and CPHD filters have also been employed in hundreds of research papers addressing dozens of applications, far too many to address here. A few diverse examples: ground-target tracking using GMTI (ground moving target indicator) radar [41]; passive-RF air-target racking [40]; satellite-borne optical satellite tracking [6]; audio speaker tracking [13]; underwater monostatic-active sonar [1], [2]; and underwater multistatic active sonar [8]. C. Multisensor Classical PHD/CPHD Filters The classical PHD and CPHD filters can be extended to the multisensor case. However, these extensions are nontrivial and require special theoretical analysis. The exact approach [3], [17] is combinatorial, and therefore is appropriate only for a small number of sensors. Moratuwage, Vo, and Danwei Wang have successfully employed the exact multisensor PHD filter in a robotics SLAM (simultaneous localization and mapping) application [30]. IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 7, ISSUE 3, JUNE 2013, PP 376-389. The most common approach, the iterated-corrector method, is heuristic. The PHD filter or CPHD filter corrector equation is applied successively for each sensor. This approach depends on sensor order, but performs well when the sensors’ probabilities of detection are not too dissimilar. Otherwise, larger- sensors should be applied before smaller- sensors [34]. The parallel combination approximate multisensor (PCAM) approach [14] does not depend on sensor order, is computationally tractable, and results in good tracking performance. Another approach has been proposed [38], in which the PHD pseudolikelihoods of Eq. (72) are averaged. 
Given the discussion in Section III-F, this is an obviously problematic approach that should result in poor target localization—which indeed it does. Nagappa and Clark have conducted simulations comparing the approaches just summarized. In a first set of three-sensor simulations, two sensors had = 095 and the third = 09. In decreasing order of performance: PCAM-CPHD, PCAM-PHD, iterated-corrector CPHD, iterated-corrector PHD, averaged-pseudolikelihood PHD. The performance of the averaged-pseudolikelihood PHD filter was particularly bad, with the iterated-corrector PHD filter having intermediate performance. Similar results were observed when the probability of detection of the third sensor was decreased to = 085 and again to = 07. D. Multi-Bernoulli Filters Suppose that we have target tracks, and that each track has a track distribution (x) and a probability of existence , for = 1 . Then the multitarget density of a multiBernoulli process has the form, for = {x1 x } with || = , X () = 1≤1 6=6= ≤ where = Y Y · (x ) 1 − =1 (75) (76) (1 − ) =1 Assume that the multitarget distributions in the multitarget Bayes filter are approximately multi-Bernoulli. Then it can be approximately replaced by a multi-Bernoulli filter in the general sense: | | +1| → { (x)} +1|+1 → { → { +1|+1 (x)} +1| → (x)} The first such filter was proposed in [20], Chapter 17 but, because of an ill-conceived linearization step, exhibited a pronounced bias in the target-number estimate. Vo, Vo, and Cantoni corrected this bias with their cardinality-balanced multitarget multi-Bernoulli (CBMeMBer) filter [44]. The CBMeMBer filter appears to be well-suited for applications in which motion and/or measurement nonlinearities are strong and therefore sequential Monte Carlo (a.k.a. particle) implementation techniques must be used [44]. Dunne and Kirubarajan have devised a jump-Markov version of this filter 110 [5], and Shanhung Wong, Vo, Vo, and Hoseinnezhad have applied it to road-constrained ground-target tracking [36]. Vo, Vo, Pham, and Suter subsequently devised a multiBernoulli track-before-detect (TBD) filter for tracking in pixelized image data, assuming that targets have physical extent and thus cannot overlap [47]. As already noted, this filter outperforms the previously best-known TBD filter, the histogramPMHT filter. It has been successfully applied to challenging real videos, e.g., hockey and soccer matches [12], [11]. E. “Background-Agnostic” PHD and CPHD Filters The PHD, CPHD, and multi-Bernoulli filters all require a priori models of both the clutter (in the form of a clutter spatial distribution +1 (z) and clutter rate +1 ) and the detection profile (in the form of a state-dependent probability of detection (x)). A series of “second-generation” PHD and CPHD filters do not require a priori models. Any PHD, CPHD, or multi-Bernoulli filter can be transformed into a filter that can operate when the probability of detection is unknown and/or dynamic [25]. The basic idea is simple: replace the target state x by an augmented target state x̊ = ( x), where 0 ≤ ≤ 1 is the unknown probability of detection associated with a target with state x. The resulting “probability of detection agnostic” (PDAG) PHD, CPHD, and multi-Bernoulli filters have the same form as before, except that the PHDs and spatial distributions have the form ̊| ( x) and ̊| ( x). The PDAG-CPHD filter has been successfully implemented in [28]. 
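To convey the state-augmentation idea behind the PDAG filters, here is a minimal particle-style sketch: each particle carries an augmented state (a, x), where a is its own hypothesized probability of detection, and a is used in place of pD(x) in the PHD pseudolikelihood of Eq. (72). This is only an illustration of the augmentation, not the full PDAG-CPHD filter of [25], [28]; the beta-distributed initialization of a and all numerical values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Particle representation of the augmented PHD D(a, x): each particle is (a_i, x_i, w_i),
# where a_i in [0, 1] is that particle's hypothesized probability of detection.
n_part = 1000
a = rng.beta(8.0, 2.0, size=n_part)        # assumed prior guess: detection prob. near 0.8
xs = rng.normal(0.0, 2.0, size=n_part)     # 1-D target positions
w = np.full(n_part, 2.0 / n_part)          # predicted PHD integrates to about 2 targets

sigma_z = 0.5
clutter_intensity = 0.05                   # clutter PHD kappa(z), assumed constant
Z = [0.4, -1.3]                            # current measurement set

def likelihood(z, x):
    return np.exp(-0.5 * ((z - x) / sigma_z) ** 2) / (sigma_z * np.sqrt(2 * np.pi))

# PHD pseudolikelihood of Eq. (72), with the particle's own a_i in place of pD(x).
pseudo = 1.0 - a
for z in Z:
    gz = a * likelihood(z, xs)
    pseudo += gz / (clutter_intensity + np.sum(w * gz))
w_updated = pseudo * w                     # measurement update, Eq. (71)
print("updated expected target number:", w_updated.sum())
```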
Any PHD, CPHD, or multi-Bernoulli filter can be transformed into one that can operate when the clutter background is unknown and/or dynamic [26]. One simply replaces the target state space by a state space that includes both targets and “clutter generators.” The clutter generators are assumed to be target-like in that their measurement-generation process is Bernoulli in the sense of Eq. (46). In this case, any state x̊ of the joint target-clutter system can have two forms: x̊ = x or x̊ = c where c is the state of a clutter generator. One defines suitable extensions ̊z (x̊) and ˚+1| (x̊|x̊0 ) of the likelihood function and Markov density. Targets must not transition to target generators, and vice-versa: (77) ˚+1| (x|c0 ) = 0, ˚+1| (c|x0 ) = 0 Otherwise, target statistics and clutter statistics would be inherently intermixed, thus making it more difficult to distinguish targets from clutter. The filtering equations for “clutter agnostic” (CAG) versions of the PHD, CPHD, and multiBernoulli filters can then be derived using simple algebra. A version of the CAG-CPHD filter was successfully implemented in [28]. The PDAG and CAG approaches can be combined, thus allowing any PHD, CPHD, or multi-Bernoulli filter to operate under “general background-agnostic” (GBAG) conditions. Various versions of the GBAG-CPHD filter have been successfully implemented in [26], and likewise for a GBAG-CMeMBer filter in [45]. A related development is the “multitarget intensity filter” (MIF) or “iFilter” [39]. It is actually a CAG-PHD filter, except that Eqs. (77) are violated: clutter generators are (problematically) allowed to become targets, and vice-versa. IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 7, ISSUE 3, JUNE 2013, PP 376-389. In [39], the authors claimed that: (1) the PHD filter can be derived as a special case of the MIF using only “elementary” PPP (Poisson point process) theory; and (2) the MIF can simultaneously estimate the clutter process and target-appearance process. However, simple counterexamples demonstrate that—because Eqs. (77) are violated—these claims are false. First, the MIF cannot always estimate the target-appearance process, because when there is no clutter (and thus no clutter generators), the target-birth rate is always estimated to be 0 [15]. Second, the MIF cannot always estimate the clutter process, since when the probability of birth target and probability of target death are “conjugate,” its estimate of the clutter rate is always a fixed multiple of the current number of measurements [15]. Third, the PHD filter is not a special case of the MIF because, when there is no clutter, the MIF predictor has no target-appearance term—unlike the PHD filter predictor [15]. Fourth, the derivation of the MIF (and along with it, the claimed elementary derivation of the PHD filter) has serious mathematical errors and problematic assumptions.6 VI. FISST A PPROXIMATION M ETHODOLOGY [] = [] x1 · · · x (80) Intuitively speaking, Eq. (78) is a “functional derivative” [48], which in physics is defined as a Gâteaux derivative in the direction of the Dirac delta function x (y) [7]: [ + · x ] − [] [] = lim →0 x (81) (Note: While Eq. (81) is conceptually equivalent to Eq. (78) and is a useful engineering heuristic, it is not—like Eq. (78)— mathematically rigorous.) Set derivatives of functionals can be derived using a toolbox of “turn-the-crank” differentiation rules ([20], pp. 383-395), including a powerful general chain rule due to Clark [3]. 
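As a purely numerical illustration of the Gâteaux heuristic of Eq. (81), the sketch below takes the Poisson p.g.fl. G[h] = exp(<D, h - 1>) that appears later in Eq. (100), replaces the Dirac delta by a narrow Gaussian bump on an assumed one-dimensional state space, and forms the difference quotient. The result recovers D(x) times G[h], matching what the "turn-the-crank" rules give analytically; the PHD D and the test function h are illustrative choices.

```python
import numpy as np

# Discretized 1-D state space.
x = np.linspace(-6.0, 6.0, 2401)
dx = x[1] - x[0]

D = 2.0 * np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)   # assumed PHD; integrates to 2 targets

def G(h):
    """Poisson p.g.fl. (the form of Eq. (100)): G[h] = exp( integral of (h - 1) * D )."""
    return np.exp(np.sum((h - 1.0) * D) * dx)

h = np.full_like(x, 0.5)       # an arbitrary test function with 0 <= h <= 1
x0 = 0.8                       # point at which to take the functional derivative

# Gateaux heuristic of Eq. (81): replace delta_{x0} by a narrow Gaussian bump.
eps, width = 1e-6, 0.02
bump = np.exp(-0.5 * ((x - x0) / width) ** 2) / (width * np.sqrt(2 * np.pi))
numeric = (G(h + eps * bump) - G(h)) / eps

analytic = np.interp(x0, x, D) * G(h)    # turn-the-crank result: D(x0) * G[h]
print(numeric, analytic)                 # agree to within discretization error
```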
Let | (| () ) be the multitarget probability distribution. Then its p.g.fl. is the functional defined by [4], [48] Z = | [| () ] = · | (| () ) (82) | [] abbr. where the power functional of the function 0 ≤ (x) ≤ 1 is defined by = 1 if = ∅ and, otherwise, by = Q x∈ (x). The advantage of the p.g.fl. representation is that mathematical formulas that are complicated at the m.p.d.f. level often greatly simplify at the p.g.fl. level, thus facilitating the derivation of approximate filters—see Section VI-D below. The cardinality distribution of of | (| () ) was defined in Eq. (33). It can be directly derived from the probability generating function £ ¤ (83) | () = | [] = using the formula A. Functional Calculus A functional is a real-valued function [] whose argument is a conventional function: (x). The generalized Radon-Nikodým derivative of [] with respect to x is ∙ ¸ [] = () (78) x x =∅ where the derivative on the right was defined in Eq. (54) and the set function () is defined by [ + · 1 ] − [] →0 where 1 (x) is the indicator function of the set .7 If = {x1 x } with || = then the set derivative of [] with respect to is B. Probability Generating Functionals (p.g.fl.’s) The approximation methodology used for PHD, CPHD, and multi-Bernoulli filters consists of the following steps: 1) Construct RFS motion and measurement models for the targets and sensor. 2) Use multitarget calculus to convert these models into multitarget Markov densities and likelihood functions. 3) From these, construct the optimal approach for the application: a multitarget Bayes filter. 4) Convert the multitarget Bayes filter into probability generating functional (p.g.fl.) form. 5) Use simplifying approximations (Poisson, i.i.d.c., multiBernoulli) and multitarget calculus to derive PHD, CPHD, and/or multi-Bernoulli filters for the application. The first, second, and third steps have already been summarized in Section IV. The purpose of this section is to summarize the fourth and fifth steps. () = lim 111 (79) 6 The authors implicitly assume that detected targets are well-separated (which obviates their claim to have a “multitarget” filter). Also, because of an erroneous assumption, the MIF (and along with it the claimed elementary derivation of the PHD filter) is invalid for arbitrary sample paths. For details, see [16], Appendix A. | (| () ) = ∙ ¸ 1 | () ! =0 (84) The PHD of | (| () ) was defined in Eq. (64). It also can be derived from the p.g.fl.: ¸ ∙ | () [] (85) | (x| ) = x =1 The belief-mass function of Eq. (29) can also be directly derived from the p.g.fl.: | () = | [1 ] (86) 7 Eq. (54) exists for all x if () is a countably additive measure that is absolutely continuous with respect to the base measure. IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 7, ISSUE 3, JUNE 2013, PP 376-389. C. p.g.fl. Form of the Multitarget Bayes Filter The multitarget Bayes filter can be equivalently expressed as a filter on p.g.fl.’s: −→ | [| corrector −→ () ] predictor −→ +1|+1 [| +1| [| (+1) () ] ] −→ The predictor step is Z () +1| [| ] = +1| [| 0 ] · | ( 0 | () ) 0 (87) where 0 +1| [| ] = Z 0 · +1| (| ) +1|+1 [| (+1) ] = +1 [0 ] +1 [0 1] (88) (89) where the bivariate functional [ ] is defined by Z [ ] = · +1 [|] · +1 (| () ) +1 [|] = Z (90) (91) · +1 (|) is the p.g.fl. of the multitarget likelihood +1 (|). D. Deriving Approximate Filters PHD, CPHD, and multi-Bernoulli filters are derived from the p.g.fl. Bayes filter as follows. For illustrative purposes, consider the PHD filter. Given the standard multitarget models—Eqs. 
VII. CONCLUSIONS

Finite-set statistics is a conceptually parsimonious, "Statistics 101"-style foundation for multisensor-multitarget detection, tracking, and data fusion. This tutorial article has summarized the basic tools necessary for reliably deriving useful new multitarget tracking and data fusion algorithms without virtuoso-level expertise in point process theory. Because of space limitations, many other significant RFS tracking topics have been neglected, including:
• SLAM (simultaneous localization and mapping) for robotics applications [31], [32]. In this work, an RFS-based SLAM filter has been shown to significantly outperform standard methods such as MHT-FastSLAM.
• Multitarget smoothing [29], [46].
• Bayes-optimal processing of nontraditional data such as attributes, features, natural-language statements, and inference rules (see [27] and [20], Chapters 3-6).
Additional developments include unified approaches for sensor management [23] and track-to-track fusion [22].

REFERENCES
[1] D. Clark and J. Bell, "Bayesian multiple target tracking in forward scan sonar images using the PHD filter," IEE Proc. Radar, Sonar and Navigation, Vol. 152, No. 5, pp. 327-334, 2005.
[2] D. Clark, I. Ruiz, Y. Petillot, and J. Bell, "Particle PHD filter multiple target tracking in sonar image," IEEE Trans. Aerospace and Electronic Systems, Vol. 43, No. 1, pp. 409-416, 2007.
[3] D. Clark and R. Mahler, "Generalized PHD filters via a general chain rule," Proc. 15th Int'l Conf. on Information Fusion, Singapore, July 9-12, 2012.
[4] D. Daley and D. Vere-Jones, An Introduction to the Theory of Point Processes, First Edition, Springer-Verlag, New York, 1988.
[5] D. Dunne and T. Kirubarajan, "Multiple model tracking for multitarget multi-Bernoulli filters," in I. Kadar (ed.), Signal Processing, Sensor Fusion, and Target Recognition XXI, SPIE Proc. Vol. 8392, Baltimore, MD, April 23-27, 2012.
[6] A. El-Fallah, A. Zatezalo, R. Mehra, R. Mahler, and K. Pham, "Joint search and sensor management of space-based EO/IR sensors for LEO threat estimation," in J. Cox and P. Motaghedi (eds.), Sensors and Systems for Space Applications III, SPIE Proc. Vol. 7330, 2009.
[7] E. Engel and R. Dreizler, Density Functional Theory, Springer, 2011.
[8] R. Georgescu and P. Willett, "The GM-CPHD applied to real and realistic multistatic sonar data," in O. Drummond (ed.), Signal and Data Processing of Small Targets 2010, SPIE Proc. Vol. 7698, 2010.
[9] I. R. Goodman, R. P. S. Mahler, and H. T. Nguyen, Mathematics of Data Fusion, Kluwer Academic Publishers, New York, 1997.
[10] J. Guern, "Method and System for Calculating Elementary Symmetric Functions of Subsets of a Set," U.S. Patent No. 20110040525, Feb. 2, 2011, http://www.faqs.org/patents/app/20110040525.
[11] R. Hoseinnezhad, B.-N. Vo, and B.-T. Vo, "Visual tracking in background subtracted image sequence via multi-Bernoulli filtering," IEEE Trans. Signal Processing, Vol. 61, No. 2, pp. 392-397, 2012.
[12] R. Hoseinnezhad, B.-N. Vo, B.-T. Vo, and D. Suter, "Visual tracking of numerous targets via multi-Bernoulli filtering of image data," Pattern Recognition, Vol. 45, No. 10, pp. 3625-3635, 2012.
[13] W.-K. Ma, B.-N. Vo, S. Singh, and A. Baddeley, "Tracking an unknown time-varying number of speakers using TDOA measurements: A random finite set approach," IEEE Trans. Signal Processing, Vol. 54, No. 9, pp. 3291-3304, 2006.
[14] R. Mahler, "Approximate multisensor CPHD and PHD filters," Proc. 13th Int'l Conf. on Information Fusion, Edinburgh, Scotland, July 26-29, 2010.
[15] R. Mahler, "A comparison of 'clutter-agnostic' PHD/CPHD filters," in I. Kadar (ed.), Signal Processing, Sensor Fusion, and Target Recognition XXI, SPIE Proc. Vol. 8392, Baltimore, MD, April 23-27, 2012.
[16] R. Mahler, "Linear-complexity CPHD filters," Proc. 13th Int'l Conf. on Information Fusion, Edinburgh, Scotland, July 26-29, 2010.
[17] R. Mahler, "The multisensor PHD filter, I: General solution via multitarget calculus," in I. Kadar (ed.), Signal Processing, Sensor Fusion, and Target Recognition XVIII, SPIE Proc. Vol. 7336, 2009.
[18] R. Mahler, "Multitarget filtering via first-order multitarget moments," IEEE Trans. Aerospace and Electronic Systems, Vol. 39, No. 4, pp. 1152-1178, 2003.
[19] R. Mahler, "PHD filters of higher order in target number," IEEE Trans. Aerospace and Electronic Systems, Vol. 43, No. 4, pp. 1523-1543, 2007.
[20] R. Mahler, Statistical Multisource-Multitarget Information Fusion, Artech House, Norwood, MA, 2007.
[21] R. Mahler, "'Statistics 101' for Multisensor, Multitarget Data Fusion," IEEE Aerospace & Electronic Systems Magazine, Part 2: Tutorials, Vol. 19, No. 1, pp. 53-64, 2004.
[22] R. Mahler, "Toward a theoretical foundation for distributed fusion," Chapter 8 in D. Hall, M. Liggins II, C.-Y. Chong, and J. Llinas (eds.), Distributed Data Fusion for Network-Centric Operations, CRC Press, Boca Raton, 2012.
[23] R. Mahler, "A unified approach to sensor and platform management," Proc. 2011 Nat'l Symp. on Sensor and Data Fusion, Washington, D.C., October 24-26, 2011.
[24] R. Mahler and A. El-Fallah, "An approximate CPHD filter for superpositional sensors," in I. Kadar (ed.), Signal Processing, Sensor Fusion, and Target Recognition XXI, SPIE Proc. Vol. 8392, Baltimore, MD, April 23-27, 2012.
[25] R. Mahler and A. El-Fallah, "CPHD filtering with unknown probability of detection," in I. Kadar (ed.), Signal Processing, Sensor Fusion, and Target Recognition XIX, SPIE Proc. Vol. 7697, 2010.
[26] R. Mahler and A. El-Fallah, "CPHD and PHD filters for unknown backgrounds, III: Tractable multitarget filtering in dynamic clutter," in O. Drummond (ed.), Signal and Data Processing of Small Targets 2010, SPIE Proc. Vol. 7698, 2010.
[27] R. Mahler and A. El-Fallah, "The random set approach to nontraditional measurements is rigorously Bayesian," in I. Kadar (ed.), Signal Processing, Sensor Fusion, and Target Recognition XXI, SPIE Proc. Vol. 8392, Baltimore, MD, April 23-27, 2012.
[28] R. Mahler, B.-T. Vo, and B.-N. Vo, "CPHD filtering with unknown clutter rate and detection profile," IEEE Trans. Signal Processing, Vol. 59, No. 6, pp. 3497-3513, 2011.
[29] R. Mahler, B.-T. Vo, and B.-N. Vo, "Forward-backward probability hypothesis density smoothing," IEEE Trans. Aerospace and Electronic Systems, Vol. 48, No. 1, pp. 707-728, 2012.
[30] D. Moratuwage, B.-N. Vo, and D. Wang, "A hierarchical approach to the multi-vehicle SLAM problem," Proc. 15th Int'l Conf. on Information Fusion, Singapore, July 9-12, 2012.
[31] J. Mullane, B.-N. Vo, M. Adams, and B.-T. Vo, "A random-finite-set approach to Bayesian SLAM," IEEE Trans. Robotics, Vol. 27, No. 2, pp. 268-282, 2011.
[32] J. Mullane, B.-N. Vo, M. Adams, and B.-T. Vo, Random Finite Sets in Robotic Map Building and SLAM, Springer, 2011.
[33] S. Nagappa, D. Clark, and R. Mahler, "Incorporating track uncertainty into the OSPA metric," Proc. 14th Int'l Conf. on Information Fusion, Chicago, IL, July 5-8, 2011.
[34] S. Nagappa and D. Clark, "On the ordering of the sensors in the iterated-corrector probability hypothesis density (PHD) filter," in I. Kadar (ed.), Signal Processing, Sensor Fusion, and Target Recognition XX, SPIE Proc. Vol. 8050, Orlando, FL, April 26-28, 2011.
[35] S. Nannuru, M. Coates, and R. Mahler, "Computationally-tractable approximate PHD and CPHD filters for superpositional sensors," IEEE J. on Selected Topics in Signal Processing, (???)???: ???-???, 2013.
[36] S. Wong, B.-T. Vo, B.-N. Vo, and R. Hoseinnezhad, "Multi-Bernoulli based track-before-detect with road constraints," Proc. 15th Int'l Conf. on Information Fusion, Singapore, July 9-12, 2012.
[37] D. Stoyan, W. S. Kendall, and J. Mecke, Stochastic Geometry and Its Applications, Second Edition, John Wiley & Sons, 1995.
[38] R. Streit, "Multisensor multitarget intensity filter," Proc. 11th Int'l Conf. on Information Fusion, pp. 1694-1701, Cologne, Germany, June 30-July 3, 2008.
[39] R. Streit and L. Stone, "Bayes derivation of multitarget intensity filters," Proc. 11th Int'l Conf. on Information Fusion, pp. 1686-1693, Cologne, Germany, June 30-July 3, 2008.
[40] R. Tharmarasa, M. McDonald, and T. Kirubarajan, "Passive tracking with sensors of opportunity using passive coherent location," in O. E. Drummond (ed.), Signal and Data Processing of Small Targets 2008, SPIE Proc. Vol. 6969, 2008.
[41] M. Ulmke, O. Erdinc, and P. Willett, "Gaussian mixture cardinalized PHD filter for ground moving target tracking," Proc. 10th Int'l Conf. on Information Fusion, Quebec City, Canada, July 9-12, 2007.
[42] B.-T. Vo and B.-N. Vo, "Labeled random finite sets and multi-object conjugate priors," submitted to IEEE Trans. Signal Processing.
[43] B.-T. Vo and B.-N. Vo, "A random finite set conjugate prior and application to multi-target tracking," Proc. 2011 Int'l Conf. on Intelligent Sensors, Sensor Networks, and Information Processing (ISSNIP 2011), Adelaide, Australia, 2011.
[44] B.-T. Vo, B.-N. Vo, and A. Cantoni, "The cardinality balanced multi-target multi-Bernoulli filter and its implementations," IEEE Trans. Signal Processing, Vol. 57, No. 2, pp. 409-423, 2009.
[45] B.-T. Vo, B.-N. Vo, R. Hoseinnezhad, and R. Mahler, "Multi-Bernoulli filtering with unknown clutter intensity and sensor field-of-view," Proc. 45th Conf. on Information Sciences and Systems (CISS 2011), Baltimore, MD, Mar. 23-25, 2011.
[46] B.-N. Vo, B.-T. Vo, and R. Mahler, "Closed-form solutions to forward-backward smoothing," IEEE Trans. Signal Processing, Vol. 60, No. 1, pp. 2-17, 2012.
[47] B.-N. Vo, B.-T. Vo, N.-T. Pham, and D. Suter, "Joint detection and estimation of multiple objects from image observations," IEEE Trans. Signal Processing, Vol. 58, No. 10, pp. 5129-5141, 2010.
[48] V. Volterra, Theory of Functionals and of Integral and Integro-Differential Equations (trans. M. Long), Blackie and Son, Ltd., London and Glasgow, 1930.

Ronald Mahler was born in Great Falls, MT, in 1948. He received the B.A. degree in mathematics from the University of Chicago, Chicago, IL, in 1970, the Ph.D. in mathematics from Brandeis University, Waltham, MA, in 1974, and the B.E.E. in electrical engineering from the University of Minnesota, Minneapolis, in 1980. He was an Assistant Professor of Mathematics at the University of Minnesota from 1974 to 1979. Since 1980, he has been employed at Lockheed Martin, Eagan, MN, where he is currently a Senior Staff Research Scientist at Lockheed Martin Advanced Technology Laboratories. His research interests include information fusion, expert systems theory, multitarget tracking, and sensor management. He is the author, coauthor, or coeditor of over 70 publications, including 12 articles in refereed journals, two books, and a hardcover conference proceedings. He received the 2004 and 2008 Author of the Year Awards from Lockheed Martin MS2, the 2007 Mignogna Data Fusion Award, the 2005 IEEE AESS Harry Rowe Mimno Award, and the 2007 IEEE AESS Barry Carlton Award.