Sensor Scheduling for Observation of a Markov Process
Mohammad Rezaeian, Sofia Suvorova, Bill Moran
Department of Electrical and Electronic Eng.
University of Melbourne
Victoria, 3010, Australia
Email: rezaeian,s.suvorova,[email protected]
This work was supported in part by the Defense Advanced Research Projects Agency of the US Department of Defense and was monitored by the Office of Naval Research under Contract No. N00014-04-C-0437.
Abstract
We study the optimal scheduling of a set of sensors for the observation of a Markov process. This Markov process, called the state process, defines in conjunction with the measurement processes given by the sensor read-outs a generalized hidden Markov process. The dynamics of the state process are characterized by a transition probability matrix $P$, while the sensors available for observing the state are characterized by a set of observation probability matrices $T^k$ $(k = 1, 2, \ldots, K)$. The criterion for optimality is to minimize the conditional entropy of the state given past measurements, for a sufficiently large number of measurements; the optimal schedule thus provides the minimum ambiguity about the state given all the past read-outs of the sensors selected under that schedule. A schedule is characterized by a particular partitioning of the probability simplex associated with the state estimation. At each epoch, a sensor is selected according to which element of the partition contains the current state estimate. We seek a stationary solution to this scheduling problem.
I. INTRODUCTION
For the purpose of this paper, a hidden Markov process $\{S_n\}_{n=0}^{\infty}$ is generalized to be a process defined by $[S, P, T^k\ (k = 1, 2, \ldots, K), Z]$, where $S$ and $Z$ are the sets of possible states and measurement outcomes, respectively, $P$ is a transition probability matrix, and the $T^k$ $(k = 1, 2, \ldots, K)$ are observation probability matrices. In contrast to the usual hidden Markov process [1], [2], which has only one observation probability matrix, here the measurement $Z_n$ at time $n$ is related to the state $S_n$ through the observation probability matrix $T^{(k_n)}$, which varies with the time index $n$. The purpose of this paper is to find an optimal policy for choosing the measurement sensor $k_n$ based on the current state estimate, with the goal of achieving minimum entropy for the state. This problem arises in applications where a set of sensors is used to observe a Markov process but the system or its resource management can afford only one sensor at a time. For example, in a radar system only one waveform out of a set can be used at each pulse transmission [3].
For a hidden Markov process defined as above, let $\Delta$ be the space of probability measures on the state space $S$, i.e., the set of vectors $\pi$ with nonnegative real entries summing to 1, and let $\mathcal{P}(\Delta)$ be the space of probability distributions on $\Delta$. In this paper the probability $\Pr(X = x)$ is denoted by $p(x)$ (similarly for conditional probabilities), whereas $p(X)$ denotes a row vector giving the distribution of $X$, i.e., the $k$-th element of $p(X)$ is $\Pr(X = k)$. We also denote the history of all measurements up to and including time $n-1$ by $Z^{n-1}$.
We adopt the concept of the information state from Partially Observed Markov Decision Processes (POMDPs) [4], [5]. We denote the information state by $\pi_n$, a random variable on $\Delta$ given by
$$\pi_n(Z^{n-1}) = p(S_n \mid Z^{n-1}), \qquad (1)$$
and $\mu_n \in \mathcal{P}(\Delta)$ is the distribution of $\pi_n$. For any $z \in Z$, the information state can be obtained recursively by the Baum equation,
$$\pi_{n+1} = f^k(z, \pi_n) = \frac{\pi_n D^k(z) P}{\pi_n D^k(z) \mathbf{1}}, \qquad (2)$$
where $D^k(z)$ is a diagonal matrix with $d_{ii}(z) = T^k[i, z]$, $i = 1, \ldots, |S|$. Accordingly, the information process $\{\pi_n\}_{n=0}^{\infty}$ is a Markov process with state space $\Delta$.
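As an illustration, the following is a minimal numerical sketch of the update (2) in Python; the function name and the example matrices are hypothetical, not taken from the paper.

```python
import numpy as np

def baum_update(pi, P, T_k, z):
    """One step of the information-state recursion (2).

    pi  : current information state pi_n, a vector on the simplex Delta
    P   : transition matrix, P[i, j] = Pr(S_{n+1} = j | S_n = i)
    T_k : observation matrix of the selected sensor, T_k[i, z] = Pr(Z_n = z | S_n = i)
    z   : index of the observed measurement outcome
    """
    numerator = (pi * T_k[:, z]) @ P    # pi_n D^k(z) P, with D^k(z) = diag(T^k[:, z])
    return numerator / numerator.sum()  # equals dividing by pi_n D^k(z) 1, since rows of P sum to 1

# Hypothetical two-state, binary-output example.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
T1 = np.array([[0.8, 0.2],
               [0.3, 0.7]])
pi = np.array([0.5, 0.5])            # uniform prior pi_0
pi = baum_update(pi, P, T1, z=0)
```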
For a hidden Markov process $[S, P, T^k\ (k = 1, 2, \ldots, K), Z]$, we define a stationary policy to be a partition $\tau = \{B_1, B_2, \ldots, B_M\}$ of the state space $\Delta$ into Borel sets, so that $\bigcup_{i=1}^{M} B_i = \Delta$. Since each element of the partition is assigned a sensor, a stationary policy can also be considered as a mapping $\tau: \Delta \to \{1, 2, \ldots, K\}$.
A key property of the information state is that $\pi_n$ is a sufficient statistic for all the past observations $Z^{n-1}$ [4]; it therefore encapsulates all the information that can be obtained from $Z^{n-1}$ for estimating the state $S_n$. As a result, at each epoch we can update our probabilistic estimate of the state using the recursive formula (2) and the observation $Z_n$ at that epoch. The choice of the sensor for the next measurement then depends only on the updated information state $\pi_{n+1}$. Since $\pi$ lives on $\Delta$, a policy, as a particular partitioning of $\Delta$, defines a rule for selecting sensors based on all the past observations on hand, under the restriction that we can only use one sensor at a time; a sketch of this per-epoch loop follows.
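The sketch below represents a policy directly as a mapping $\tau: \Delta \to \{1, \ldots, K\}$ and alternates sensor selection with the update (2). It reuses `baum_update` from the previous sketch; `threshold_policy` is an illustrative one-threshold partition of the two-state simplex, not a policy derived in the paper.

```python
def run_schedule(pi0, P, T, policy, observe, n_steps):
    """Alternate sensor selection and information-state updates.

    T       : list of K observation matrices T^k
    policy  : maps an information state on Delta to a sensor index in {0, ..., K-1}
    observe : callback returning the measurement z read out by sensor k
    """
    pi = pi0
    for _ in range(n_steps):
        k = policy(pi)                    # select the sensor from the current information state
        z = observe(k)                    # read out the selected sensor
        pi = baum_update(pi, P, T[k], z)  # update via recursion (2)
    return pi

# Hypothetical two-sensor policy: a partition of the 2-state simplex at one threshold.
def threshold_policy(pi, threshold=0.5):
    return 0 if pi[0] >= threshold else 1
```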
The question now is how to find the best policy to derive the most benefit from the set of sensors for acquiring information about the state process. We wish to find a policy that, in the long run, provides the best observability of the state process, or in other words the least ambiguity about the state process as the observations $Z_n$ are revealed. To this end we consider the following limiting conditional entropy as a measure of ambiguity about the state given past observations:
$$\hat{H} = \lim_{n \to \infty} H(S_n \mid Z^{n-1}). \qquad (3)$$
We call $\hat{H}$ the entropy rate of the state process. The conditional entropy $H(S_n \mid Z^{n-1})$ depends on the policy for selecting sensors, so $\hat{H}$ is a function of the policy $\tau$. Our goal is to solve the following minimization problem,
$$\tau^* = \arg\min_{\tau} \hat{H}(\tau). \qquad (4)$$
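Although the paper characterizes $\hat{H}(\tau)$ analytically in Section II, the quantity can also be estimated numerically: $H(S_n \mid Z^{n-1})$ is the expected entropy of the information state $\pi_n$, so under suitable ergodicity assumptions a time average of $H(\pi_n)$ along one long simulated trajectory approximates $\hat{H}(\tau)$. The following rough Monte Carlo sketch illustrates this idea (it is not the paper's method); the step counts and the reuse of `threshold_policy` are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def entropy(p):
    """Shannon entropy (bits) of a probability vector."""
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def estimate_entropy_rate(P, T, policy, n_steps=20000, burn_in=2000):
    """Monte Carlo estimate of H-hat(tau): time-average of H(pi_n) along one trajectory."""
    n_states = P.shape[0]
    s = rng.integers(n_states)                    # hidden state S_n
    pi = np.full(n_states, 1.0 / n_states)        # information state pi_n (uniform pi_0)
    total = 0.0
    for n in range(n_steps):
        if n >= burn_in:
            total += entropy(pi)                  # H(pi_n) samples H(S_n | Z^{n-1})
        k = policy(pi)                            # sensor k_n chosen from pi_n
        z = rng.choice(T[k].shape[1], p=T[k][s])  # Z_n ~ T^k[S_n, .]
        num = (pi * T[k][:, z]) @ P
        pi = num / num.sum()                      # recursion (2)
        s = rng.choice(n_states, p=P[s])          # S_{n+1} ~ P[S_n, .]
    return total / (n_steps - burn_in)
```

Sweeping the threshold of `threshold_policy` and keeping the value with the smallest estimate gives a brute-force baseline against which a candidate solution of (4) can be sanity-checked.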
A characterization of the entropy rate of the state process by integral expressions, similar to the results of [6] and [7], is discussed in Section II. Based on this characterization, an expression for finding the optimal policy is obtained in Section III, followed by simulation results for two example hidden Markov processes in Section IV.
REFERENCES
[1] Y. Ephraim and N. Merhav, "Hidden Markov processes," IEEE Trans. Inform. Theory, vol. 48, no. 6, pp. 1518-1569, June 2002.
[2] L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, February 1989.
[3] B. La Scala, M. Rezaeian, and B. Moran, "Optimal adaptive waveform scheduling for target tracking," in Proceedings of the International Conference on Information Fusion, Philadelphia, PA, USA, July 2005.
[4] R. D. Smallwood and E. J. Sondik, "The optimal control of partially observable Markov processes over a finite horizon," Operations Research, vol. 21, pp. 1071-1088, 1973.
[5] W. S. Lovejoy, "A survey of algorithmic methods for partially observed Markov decision processes," Operations Research, vol. 39, no. 1, pp. 162-175, January 1991.
[6] D. Blackwell, "The entropy of functions of finite-state Markov chains," in Trans. First Prague Conf. Information Theory, Statistical Decision Functions, Random Processes, pp. 13-20, 1957.
[7] M. Rezaeian, "The entropy rate of the hidden Markov process," submitted to IEEE Trans. Inform. Theory, May 2004.