Sensor Scheduling for Observation of a Markov Process

Mohammad Rezaeian, Sofia Suvorova, Bill Moran
Department of Electrical and Electronic Eng., University of Melbourne, Victoria 3010, Australia
Email: rezaeian,s.suvorova,[email protected]

Abstract

We study the optimal scheduling of a set of sensors for observation of a Markov process. This Markov process, called the state process, together with the measurement processes read out from the sensors defines a generalized hidden Markov process. The dynamics of the state process are characterized by a transition probability matrix P, while the sensors observing the state are characterized by a set of observation probability matrices T^k (k = 1, 2, ..., K). The criterion for optimality is to minimize the conditional entropy of the state given past measurements, for a sufficiently large number of measurements; the optimal schedule therefore yields the minimum ambiguity about the state given all past readouts of the sensors selected under that schedule. A schedule is characterized by a particular partitioning of the probability simplex associated with the state estimate: at each epoch, the sensor is selected according to the partition cell that contains the current state estimate. We seek a stationary solution to this scheduling problem.

I. INTRODUCTION

For the purpose of this paper, a hidden Markov process {S_n}_{n=0}^∞ is generalized to a process defined by [S, P, T^k (k = 1, 2, ..., K), Z], where S and Z are the sets of possible states and measurement outcomes, respectively, P is a transition probability matrix, and T^k (k = 1, 2, ..., K) are observation probability matrices. In contrast to the usual hidden Markov process [1]-[2], which has only one observation probability matrix, here the measurement Z_n at time n is related to the state S_n through the observation probability matrix T^{(k_n)}, which varies with the time index n.
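As a concrete illustration of the generalized model [S, P, T^k, Z], the following sketch draws a state trajectory and the corresponding sensor readouts for a fixed sensor sequence. All matrix values here are hypothetical two-state placeholders chosen for illustration, not numbers from this paper.

```python
import random

# Hypothetical two-state example: S = {0, 1}, Z = {0, 1}, K = 2 sensors.
P = [[0.9, 0.1],          # transition probabilities p(S_{n+1} | S_n)
     [0.2, 0.8]]
T = {1: [[0.8, 0.2],      # sensor 1: observation probabilities T^1[s, z]
         [0.3, 0.7]],
     2: [[0.6, 0.4],      # sensor 2: a noisier view of the same state
         [0.4, 0.6]]}

def sample(dist, rng):
    """Draw an index according to the probability vector `dist`."""
    u, acc = rng.random(), 0.0
    for i, p in enumerate(dist):
        acc += p
        if u < acc:
            return i
    return len(dist) - 1

def simulate(P, T, sensors, s0=0, seed=0):
    """Generate states S_n and measurements Z_n for a given sensor sequence k_n."""
    rng = random.Random(seed)
    s, states, meas = s0, [], []
    for k in sensors:
        states.append(s)
        meas.append(sample(T[k][s], rng))   # Z_n ~ T^{k_n}[S_n, .]
        s = sample(P[s], rng)               # S_{n+1} ~ P[S_n, .]
    return states, meas

states, meas = simulate(P, T, sensors=[1, 2, 1, 2, 1])
```

The only coupling between the sensor schedule and the process is through which row set T^{k_n} generates Z_n; the hidden state evolves under P regardless of the schedule.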
The purpose of this paper is to find an optimal policy for choosing the measurement sensor k_n based on the current state estimate, with the goal of achieving minimum entropy for the state. This problem arises in applications where a set of sensors is used to observe a Markov process but the system or its resource management can afford only one sensor at a time. For example, in a radar system only one waveform out of a set can be used at each pulse transmission [3].

[This work was supported in part by the Defense Advanced Research Projects Agency of the US Department of Defense and was monitored by the Office of Naval Research under Contract No. N00014-04-C-0437.]

For a hidden Markov process defined as above, let ∆ be the space of probability measures on the state space S, i.e. the set of vectors π of positive real numbers with sum equal to 1, and let P(∆) be the space of probability distributions on ∆. In this paper the probability Pr(X = x) is written p(x) (similarly for conditional probabilities), whereas p(X) denotes a row vector giving the distribution of X, i.e. the k-th element of p(X) is Pr(X = k). We also denote the history of all measurements up to and including time n-1 by Z^{n-1}.

We adopt the concept of the information state from partially observed Markov decision processes (POMDP) [4],[5]. We denote the information state by π_n, a random variable on ∆ given by

    π_n(Z^{n-1}) = p(S_n | Z^{n-1}),    (1)

and µ_n ∈ P(∆) is the distribution of π_n. For any z ∈ Z, the information state can be obtained recursively by the Baum equation

    π_{n+1} = f^k(z, π_n) = π_n D^k(z) P / (π_n D^k(z) 1),    (2)

where D^k(z) is the diagonal matrix with entries d_ii(z) = T^k[i, z], i = 1, ..., |S|, and 1 is the all-ones column vector. Accordingly, the information process {π_n}_{n=0}^∞ is a Markov process with state space ∆.

For a hidden Markov process [S, P, T^k (k = 1, 2, ..., K), Z], we define a stationary policy to be a partition τ = {B_1, B_2, ..., B_M} of the state space ∆ by Borel sets, so that ∪_{i=1}^M B_i = ∆.
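The recursion (2) is a short computation once D^k(z) is formed: a Bayes update by the observation likelihoods followed by prediction through P. A minimal sketch, with hypothetical placeholder matrices rather than values from this paper:

```python
# Baum update (2): pi_{n+1} = pi_n D^k(z) P / (pi_n D^k(z) 1).
def baum_update(pi, P, T_k, z):
    """One step of the information-state recursion for sensor k and readout z."""
    # Elementwise product pi_i * T^k[i, z] implements pi D^k(z).
    w = [pi_i * row[z] for pi_i, row in zip(pi, T_k)]
    norm = sum(w)                  # pi_n D^k(z) 1, the probability of readout z
    n = len(pi)
    # Predict through the transition matrix: (w / norm) P.
    return [sum(w[i] / norm * P[i][j] for i in range(n)) for j in range(n)]

# Hypothetical two-state example.
P  = [[0.9, 0.1], [0.2, 0.8]]
T1 = [[0.8, 0.2], [0.3, 0.7]]
pi = baum_update([0.5, 0.5], P, T1, z=0)   # posterior-then-predict after z = 0
```

Starting from the uniform π = [0.5, 0.5] and observing z = 0 under T^1, the update yields π_{n+1} = [7.8/11, 3.2/11] ≈ [0.709, 0.291], which still sums to 1 as required.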
Accordingly, a stationary policy can also be considered as a mapping τ : ∆ → {1, 2, ..., K}.

A key property of the information state is that π_n is a sufficient statistic for all the past observations Z^{n-1} [4]; it therefore encapsulates all the information that can be obtained from Z^{n-1} for estimating the state S_n. As a result, at each epoch we can update our probabilistic estimate of the state using the recursive formula (2) and the observation of Z_n at that epoch. The choice of the sensor for the next measurement can then depend only on the updated information state π_{n+1}. Since π lives on ∆, a policy, as a particular partitioning of ∆, defines a rule for selecting sensors based on all the past observations on hand, under the restriction that we can use only one sensor at a time.

The question now is how to find the best policy, one that gains the most benefit from the set of sensors in acquiring information about the state process. We wish to find a policy that in the long run provides the best observability of the state process, or in other words the least ambiguity about the state process as the observations Z_n reveal it. To this end we consider the following limiting conditional entropy as a measure of ambiguity about the state given past observations:

    Ĥ = lim_{n→∞} H(S_n | Z^{n-1}).    (3)

We call Ĥ the entropy rate of the state process. The conditional entropy H(S_n | Z^{n-1}) depends on the policy for selecting sensors, so Ĥ is a function of the policy τ. Our goal is to solve the following minimization problem:

    τ* = arg min_τ Ĥ(τ).    (4)

A characterization of the entropy rate of the state process by integral expressions, similar to the results of [6] and [7], is discussed in Section II. Based on this characterization, an expression for finding the optimal policy is obtained in Section III, followed by simulation results for two example hidden Markov processes in Section IV.

REFERENCES
[1] Y. Ephraim and N. Merhav, "Hidden Markov processes," IEEE Trans. Inform. Theory, vol. 48, no. 6, pp. 1518-1569, June 2002.
[2] L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, February 1989.
[3] B. La Scala, M. Rezaeian, and B. Moran, "Optimal adaptive waveform scheduling for target tracking," in Proceedings of the International Conference on Information Fusion, Philadelphia, PA, USA, July 2005.
[4] R. D. Smallwood and E. J. Sondik, "The optimal control of partially observable Markov processes over a finite horizon," Operations Research, vol. 21, pp. 1071-1088, 1973.
[5] W. S. Lovejoy, "A survey of algorithmic methods for partially observed Markov decision processes," Operations Research, vol. 39, no. 1, pp. 162-175, January 1991.
[6] D. Blackwell, "The entropy of functions of finite-state Markov chains," in Trans. First Prague Conf. on Information Theory, Statistical Decision Functions, Random Processes, pp. 13-20, 1957.
[7] M. Rezaeian, "The entropy rate of the hidden Markov process," submitted to IEEE Trans. Inform. Theory, May 2004.
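As an appendix-style illustration, the criterion in (3)-(4) can be approximated numerically: since H(S_n | Z^{n-1}) = E[H(π_n)], simulating the information process under a candidate stationary policy τ and averaging the entropy of π_n over time gives a Monte Carlo estimate of Ĥ(τ). The sketch below uses a hypothetical two-state, two-sensor model (values invented for illustration, not from this paper) and a simple threshold partition of the simplex as the policy.

```python
import math, random

# Hypothetical two-state, two-sensor model.
P = [[0.9, 0.1], [0.2, 0.8]]
T = {1: [[0.8, 0.2], [0.3, 0.7]],
     2: [[0.6, 0.4], [0.4, 0.6]]}

def entropy(pi):
    """Shannon entropy of a probability vector, in bits."""
    return -sum(p * math.log2(p) for p in pi if p > 0.0)

def baum_update(pi, P, T_k, z):
    """Recursion (2): Bayes update by T^k[., z], then prediction through P."""
    w = [pi_i * row[z] for pi_i, row in zip(pi, T_k)]
    norm = sum(w)
    n = len(pi)
    return [sum(w[i] / norm * P[i][j] for i in range(n)) for j in range(n)]

def policy(pi):
    """A stationary policy tau: a threshold partition {B1, B2} of the simplex."""
    return 1 if pi[0] >= 0.5 else 2

def estimate_entropy_rate(P, T, policy, steps=2000, seed=0):
    """Monte Carlo estimate of H-hat: time average of H(pi_n) along one run."""
    rng = random.Random(seed)
    s, pi, total = 0, [0.5, 0.5], 0.0
    for _ in range(steps):
        total += entropy(pi)                  # H(S_n | Z^{n-1}) = E[H(pi_n)]
        k = policy(pi)                        # sensor from the cell containing pi_n
        z = 0 if rng.random() < T[k][s][0] else 1   # Z_n ~ T^k[S_n, .]
        s = 0 if rng.random() < P[s][0] else 1      # S_{n+1} ~ P[S_n, .]
        pi = baum_update(pi, P, T[k], z)
    return total / steps

H_hat = estimate_entropy_rate(P, T, policy)
```

Comparing such estimates across candidate partitions gives a brute-force baseline against which the optimization of Sections II-III can be checked on small examples.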