Decentralised Coordination of Information Gathering Agents
Ruben Stranders
[email protected]
http://users.ecs.soton.ac.uk/rs06r/
This is an offline version of http://users.ecs.soton.ac.uk/rs06r/thesis.html without videos.
1 Introduction
Unmanned sensors are rapidly becoming the de facto means of achieving situational awareness—the
ability to make sense of, and predict what is happening in an environment—in disaster management,
military reconnaissance, space exploration, and climate research. In these domains, and many
others besides, their use reduces the need for exposing humans to hostile, impassable or polluted
environments. Currently, these sensors are often pre-programmed or remotely controlled by human
operators. However, there is a clear trend toward making these sensors fully autonomous, thus
enabling them to make decisions without human intervention.
Full autonomy has two clear benefits over pre-programming and human remote control. First, in
contrast to sensors with pre-programmed motion paths, autonomous sensors are better able to adapt
to their environment, and react to a priori unknown external events or hardware failure. Second,
autonomous sensors can operate in large teams that would otherwise be too complex to control
by human operators. The key benefit of this is that a team of cheap, small sensors can achieve
through cooperation the same results as individual large, expensive sensors—with more flexibility
and robustness.
2 Information Gathering Agents
In light of the importance of autonomy and cooperation, the multi-agent paradigm is a suitable
framework for modelling the operation of these sensors and for controlling them in a decentralised
fashion. Within this paradigm, each sensor becomes an information gathering agent. As a team,
these agents direct their collective activity towards collecting information from their environment
with the aim of providing accurate and up-to-date situational awareness.
Against this background, the central problem I addressed in my thesis is that of achieving
accurate situational awareness through the coordination of multiple information gathering agents.
To achieve general and principled solutions to this problem, I formulate a generic problem definition,
which captures the essential properties of dynamic environments, the capabilities of the agents, and
the interactions between them. Specific instantiations of this generic problem span a broad spectrum
of concrete application domains, of which I consider four canonical examples:
• Monitoring dynamic and uncertain environmental conditions, such as radiation, temperature
and gas concentrations.
• Pursuit-evasion: finding and capturing a moving target, be it cooperative (e.g. a wounded
civilian in a disaster scenario), or uncooperative (e.g. an attacker in a military scenario).
• Patrolling a building or perimeter to prevent or detect intrusions.
• Performing surveillance in a large area by detecting, classifying and tracking events.
In all these application domains, it is imperative that the agents exhibit behaviour that satisfies
the following design requirements:
• Accuracy: maximise the quality of situational awareness.
• Adaptiveness: respond to a priori unknown events.
• Robustness: degrade gracefully in the face of component failures. The hostility of the agents’
environment can cause individual agents to fail, which should not lead to a detrimental decrease
in the performance of the remaining agents.
• Autonomy: make decisions without the intervention of a centralised controller. In practical
terms, this means that agents coordinate using local computation and communication with
their direct neighbours.
• Modularity: support heterogeneous agents. For instance, agents equipped with different types
of sensors, whether ground-based or airborne, should be able to interoperate in a single
team.
• Performance guarantees: provide a lower bound on the quality of the achieved situational
awareness. This is particularly important in safety-critical scenarios, where pathological
behaviour needs to be ruled out.
Figure 1: Some examples of fixed and mobile sensors: (a) a Floodnet sensor, (b) the Mars Rover,
(c) the Predator UAV, (d) the Talisman UUV.
2.1 Thesis Contributions
To this end, my main contribution consists of decentralised coordination algorithms for teams of
information gathering agents that satisfy these requirements, subject to additional constraints
(such as motion, communication, and energy constraints).
These algorithms can be grouped into two categories, which in turn can be divided into two main
contributions each:
• Decentralised Coordination of Fixed Agents
A Decentralised Coordination of Fixed Agents during Deployment. A coordination algorithm
that allows fixed agents (for example, those that are part of a fixed wireless sensor network)
to establish a reliable communication network while simultaneously maximising the quality of
situational awareness. BAE Systems, one of the sponsors of the ALADDIN¹ project, has applied
for a patent on this algorithm (British Patent application reference GB1001732.5) [6].
B Decentralised Coordination for Fixed Agents during Operation. Two novel coordination
algorithms for solving Distributed Constraint Optimisation Problems (DCOPs) with continuous
variables. These algorithms are applied to coordinate the actions (for example, the viewing
angles of cameras or radars) of a team of fixed agents. Part of this work was shortlisted for
a best paper award at ECAI 2010 [7].
• Decentralised Coordination of Mobile Agents
C Decentralised Receding Horizon Control of Mobile Agents. A decentralised coordination
algorithm that allows mobile agents to provide accurate situational awareness in environments
that constrain their motion (e.g. constraints imposed by the turning radius of a UAV or the
layout of a building). Using this algorithm, agents attempt to maximise the accuracy of
situational awareness over a receding planning horizon (i.e. every m timesteps, the agents
compute a path of length l > m). It is the first online agent-based algorithm for the domains
of pursuit-evasion, patrolling and monitoring uncertain environmental phenomena.
D Non-Myopic Control of Mobile Sensors with Performance Guarantees. A novel algorithm for
multi-agent patrolling in continuously changing environments. This algorithm uses sequential
decision making techniques to compute patrols for each agent, which are subsequently improved
upon using decentralised coordination. As a result, this algorithm is capable of providing
strong performance guarantees.
Table 1: The key properties of the four main contributions.
Table 1 summarises the main properties of these four contributions. In what follows, each of
these is discussed in more detail.
3 A: Decentralised Coordination of Fixed Agents during Deployment
We derive a decentralised algorithm [6] that maximises the accuracy of situational awareness, while
simultaneously constructing a reliable communication network between the agents. Specifically,
we present a novel solution to the frequency allocation problem. Instead of solving this NP-hard
problem (it is equivalent to graph colouring) for the communication network that exists among
all agents directly, the idea is to deactivate certain agents such that the problem can be provably
solved in polynomial time. We show that selecting the agents that should be deactivated while
simultaneously maximising accuracy is NP-hard, and develop an efficient approximation algorithm
that carefully selects which agents should be deactivated in order to maximise the accuracy achieved
by the remaining (active) agents.
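To make the link between frequency allocation and graph colouring concrete, the following minimal sketch treats agents as vertices, communication links as edges, and frequencies as colours. It is not the algorithm of [6]: the greedy heuristic, the function names and the example network are illustrative assumptions. Greedy colouring runs in polynomial time but does not, in general, use the fewest possible frequencies, which is where the hardness of the original problem lies.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]


def build_graph(links: Iterable[Tuple[Hashable, Hashable]]) -> Graph:
    """Build an undirected communication graph from pairs of linked agents."""
    graph: Graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def greedy_frequencies(graph: Graph) -> Dict[Hashable, int]:
    """Assign each agent the lowest frequency index not used by a neighbour.

    Illustrative heuristic only (highest-degree agents first); it is an
    assumption of this sketch, not the method described in the text.
    """
    assignment: Dict[Hashable, int] = {}
    order = sorted(graph, key=lambda agent: (-len(graph[agent]), str(agent)))
    for agent in order:
        used = {assignment[n] for n in graph[agent] if n in assignment}
        frequency = 0
        while frequency in used:
            frequency += 1
        assignment[agent] = frequency
    return assignment


def is_interference_free(graph: Graph, assignment: Dict[Hashable, int]) -> bool:
    """A valid allocation gives different frequencies to every linked pair."""
    return all(assignment[a] != assignment[b] for a in graph for b in graph[a])


# Hypothetical four-agent network: a triangle (a, b, c) plus agent d linked to c.
links = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
graph = build_graph(links)
frequencies = greedy_frequencies(graph)
print(frequencies, is_interference_free(graph, frequencies))
```

In this example the triangle forces three distinct frequencies, while agent d can reuse one of them.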
We empirically show that this algorithm is no more than 10% away from the optimal solution,
and thus provides high quality situational awareness. Moreover, it is robust (it is capable of
replacing failed agents with deactivated ones), maintains the autonomy of the agents, and is very
scalable (the computational overhead of an agent grows polynomially with the number of neighbours,
not with the total number of agents). We also prove that a centralised version of this algorithm
provides theoretical performance guarantees.
¹ See http://www.aladdinproject.org/
Video 1 on the website shows the operation of the algorithm on an example random deployment
of fixed agents with limited communication and sensing range.
4 B: Decentralised Coordination for Fixed Agents during Operation
We develop the first algorithms for distributed constraint optimisation problems (DCOPs) with
continuous variables. In these DCOPs, the agents’ collective goal is to maximise a global objective
function that can be factorised into a sum of local functions that represent the interactions between
agents. While existing algorithms are only capable of solving DCOPs with discrete variables, our
algorithms, called CPLF-MS [2] (for linear local functions) and HCMS [7] (for non-linear local functions), significantly increase the expressiveness and applicability of the DCOP formalism by allowing
for continuous variables. This is a more natural assumption, since many multi-agent coordination
problems take place over continuous values. Figure 1 shows two examples of domains with fixed information gathering agents. Two additional examples include the problem of coordinating multiple
UAVs (whose control variables include pitch, roll and thrust), and controlling flows in a network
(Internet traffic, electricity grids, etc.), both of which are governed by variables with continuous
domains.
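The factorised objective described above can be sketched as follows. This is a centralised, brute-force illustration and not CPLF-MS or HCMS: the three-agent chain, the quadratic local function and the grid resolutions are all assumptions chosen for the example. It shows a global objective built as a sum of pairwise local functions, and how a coarse discretisation of a continuous variable (such as a viewing angle scaled to [0, 1]) can miss the optimum that a finer treatment of the same variable finds.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Tuple

LocalFunction = Callable[[float, float], float]


def offset_utility(x_i: float, x_j: float) -> float:
    """Hypothetical local function: best when x_i exceeds x_j by exactly 0.3."""
    return -((x_i - x_j - 0.3) ** 2)


# Assumed interaction structure: a chain of three agents 0 - 1 - 2.
local_functions: Dict[Tuple[int, int], LocalFunction] = {
    (0, 1): offset_utility,
    (1, 2): offset_utility,
}


def global_objective(assignment: Sequence[float]) -> float:
    """Global objective = sum of the local functions between interacting agents."""
    return sum(f(assignment[i], assignment[j]) for (i, j), f in local_functions.items())


def best_on_grid(levels: int, n_agents: int = 3) -> Tuple[float, Tuple[float, ...]]:
    """Exhaustively search a uniform discretisation of [0, 1] for every agent."""
    grid = [k / (levels - 1) for k in range(levels)]
    best = max(product(grid, repeat=n_agents), key=global_objective)
    return global_objective(best), best


coarse_value, coarse_assignment = best_on_grid(levels=3)   # values 0, 0.5, 1
fine_value, fine_assignment = best_on_grid(levels=11)      # values 0, 0.1, ..., 1
print(coarse_value, coarse_assignment)
print(fine_value, fine_assignment)
```

The exhaustive search is exponential in the number of agents and is used here only to keep the example short; a decentralised algorithm would exploit the factorisation instead.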
Both CPLF-MS and HCMS are based on the state-of-the-art max-sum algorithm, whose applicability was thus far limited to domains with discrete action variables. We study the application of
these algorithms on two information gathering settings with fixed agents, and empirically demonstrate their effectiveness.
Specifically, we show that these algorithms improve the solution quality by up to
40% compared to the standard (discrete) max-sum algorithm, and therefore improve the accuracy
of situational awareness in these domains. Moreover, they scale well: their computational and
communication overhead grows exponentially in the number of neighbours of an agent, but only
linearly in the total number of agents. They are also adaptive, since they can be run
continuously, effectively and efficiently, to respond to changes in the agents' environments.
5 C: Decentralised Receding Horizon Control of Mobile Agents
We develop a decentralised control algorithm for mobile agents [1, 3, 4, 5] whose motion is subject
to constraints. These motion constraints can be used to model the physical layout of the environment (such as the floor map of a building), as well as the intrinsic movement constraints of the
agent itself (e.g. the minimum turning radius of a UAV). This algorithm establishes decentralised
adaptive receding horizon control, meaning that agents periodically coordinate as a team to maximise the accuracy of situational awareness for a fixed number of time steps in the future. Thus,
agents coordinate their plans (i.e. finitely long paths in their environment), which they (partially)
implement, after which coordination takes place again.
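The plan, partially execute, replan cycle can be sketched for a single agent as follows. This is only an illustration of receding horizon control under stated assumptions: the thesis algorithm coordinates a team using max-sum, whereas this sketch uses exhaustive search for one agent, and the graph, the per-location values, the discount factor and all function names are hypothetical. Motion constraints are modelled by the adjacency lists, l is the planning horizon, and m is the number of steps executed before replanning.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, List[str]]
DISCOUNT = 0.9  # assumption: earlier observations are preferred to later ones


def best_path(graph: Graph, values: Dict[str, float], start: str,
              horizon: int) -> Tuple[float, List[str]]:
    """Exhaustively find the most valuable sequence of `horizon` moves from `start`.

    Visiting a location collects its value once; afterwards it is worth nothing.
    Assumes every location has at least one neighbour.
    """
    if horizon == 0:
        return 0.0, []
    best_value, best_moves = float("-inf"), []
    for nxt in graph[start]:
        remaining = dict(values)
        reward = remaining[nxt]
        remaining[nxt] = 0.0
        future_value, future_moves = best_path(graph, remaining, nxt, horizon - 1)
        value = reward + DISCOUNT * future_value
        if value > best_value:
            best_value, best_moves = value, [nxt] + future_moves
    return best_value, best_moves


def receding_horizon_control(graph: Graph, values: Dict[str, float], start: str,
                             l: int, m: int, total_steps: int) -> List[str]:
    """Plan l steps ahead, execute the first m, then replan from the new position."""
    if not l >= m >= 1:
        raise ValueError("need planning horizon l >= executed steps m >= 1")
    values = dict(values)
    position, trajectory = start, [start]
    while len(trajectory) - 1 < total_steps:
        _, plan = best_path(graph, values, position, l)
        if not plan:
            break
        steps_left = total_steps - (len(trajectory) - 1)
        for position in plan[:min(m, steps_left)]:
            values[position] = 0.0  # new observations would be folded in here
            trajectory.append(position)
    return trajectory


# Hypothetical corridor a - b - c - d with the most valuable location at the far end.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
values = {"a": 2.0, "b": 0.0, "c": 1.0, "d": 5.0}
print(receding_horizon_control(graph, values, "b", l=3, m=1, total_steps=2))
print(receding_horizon_control(graph, values, "b", l=1, m=1, total_steps=2))
```

With a horizon of three the agent heads towards the valuable far end of the corridor, whereas with a horizon of one it is drawn to the nearer but less valuable location first.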
We benchmark this algorithm against the state-of-the-art and demonstrate that it significantly
increases the accuracy of situational awareness in three highly dynamic domains: monitoring spatial
phenomena (see Video 2 on the website), pursuit-evasion (see Video 3 on the website), and patrolling.
Furthermore, it is scalable, since it is based on the max-sum algorithm (for which we develop several
techniques to reduce computational overhead [3]), so that the computational overhead of a single
agent grows with the number of neighbours, not the size of the team; it is adaptive, since the receding
horizon technique revises the paths of the agents frequently based on newly observed events; and
it is modular, since it supports agents of different types and corresponding movement constraints.
Most importantly, this algorithm is the first online agent-based algorithm for the domains of
pursuit-evasion, patrolling, and monitoring environmental phenomena.
6 D: Non-Myopic Control of Mobile Agents with Performance Guarantees
We present a novel algorithm for patrolling (continuously monitoring a dynamic environment)
using a team of mobile agents. It is the first algorithm that takes the property of temporality of
the environment into account, which models a continuous rate of change. As a result, the patrols
computed by the algorithm are designed to monitor continuously changing environments, and
thus periodically (and infinitely often) return to the same location to provide up-to-date situational
awareness. The algorithm operates in the same type of environments as the receding horizon control algorithm of contribution C. However, this algorithm is non-myopic, and as such has an infinite
planning horizon. Because of this, the algorithm is able to provide strong performance guarantees.
Moreover, the algorithm is a hybrid between offline preprocessing and online decentralised coordination, where the latter is used to improve the solution quality of the former, and provide a more
adaptive and robust solution.
In more detail, the algorithm follows a three-step computation: decompose the environment into
connected clusters, compute accurate subpatrols within each cluster, and concatenate these subpatrols to form the desired patrol. The third step uses a Markov Decision Process to optimally solve
the simplified problem of finding a sequence in which to visit the clusters. Video 4 on the website demonstrates these three steps in operation in an environment with three agents tasked with minimising
the intra-visit time of the locations of the map.
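The concatenation step and the quantity being minimised can be illustrated with a small sketch. It assumes that the clusters and their subpatrols are already given and that the visiting order is fixed by hand; it does not implement the cluster decomposition or the Markov Decision Process, and all names and the example map are hypothetical. Each entry of a patrol is taken to cost one time step, and the patrol is assumed to repeat forever.

```python
from typing import Dict, List


def concatenate(subpatrols: Dict[str, List[str]], order: List[str]) -> List[str]:
    """Join the subpatrols of the clusters in the given visiting order."""
    patrol: List[str] = []
    for cluster in order:
        patrol.extend(subpatrols[cluster])
    return patrol


def max_visit_gap(patrol: List[str]) -> Dict[str, int]:
    """Longest time between consecutive visits to each location.

    The patrol is treated as a cycle that is repeated indefinitely, so the
    gap between the last visit and the first visit of the next cycle counts.
    """
    period = len(patrol)
    visits: Dict[str, List[int]] = {}
    for t, location in enumerate(patrol):
        visits.setdefault(location, []).append(t)
    gaps: Dict[str, int] = {}
    for location, times in visits.items():
        wrapped = times + [times[0] + period]
        gaps[location] = max(later - earlier for earlier, later in zip(wrapped, wrapped[1:]))
    return gaps


# Hypothetical map with two clusters and hand-written subpatrols.
subpatrols = {"north": ["n1", "n2", "n1"], "south": ["s1", "s2"]}
patrol = concatenate(subpatrols, order=["north", "south"])
gaps = max_visit_gap(patrol)
print(patrol)
print(gaps, max(gaps.values()))
```

A planner for the visiting order would compare candidate sequences by a measure such as the largest gap, preferring patrols that return to every location sooner.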
7 Summary
The four contributions discussed above represent an advance in the state of the art of decentralised
coordination in general, and decentralised coordination of information gathering agents in particular.
Taken together, they constitute a step towards achieving situational awareness through the use of
teams of autonomous unmanned sensors.
7.1 References
1. R. Stranders, A. Rogers, and N. R. Jennings. A Decentralised, Online Coordination Mechanism for
Monitoring Spatial Phenomena with Mobile Sensors. In Proceedings of the Second International
Workshop on Agent Technology for Sensor Networks (ATSN), Estoril, Portugal, 2008, pp. 9–15.
2. R. Stranders, A. Farinelli, A. Rogers, and N. R. Jennings. Decentralised Coordination of Continuously
Valued Control Parameters using the Max-Sum Algorithm. In Proceedings of the Eighth International
Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Budapest, Hungary, 2009,
pp. 601–608.
3. R. Stranders, A. Farinelli, A. Rogers, and N. R. Jennings. Decentralised Coordination of Mobile
Sensors using the Max-Sum Algorithm. In Proceedings of the 21st International Joint Conference on
AI (IJCAI), Pasadena, USA, 2009, pp. 299–304.
4. A. Rogers, A. Farinelli, R. Stranders, and N. R. Jennings. Decentralised Coordination for Embedded
Agents using the Max-Sum Algorithm. Artificial Intelligence Journal (AIJ) 172(2), 2011.
5. R. Stranders, F. M. Delle Fave, A. Rogers, and N. R. Jennings. A Decentralised Coordination Algorithm for Mobile Sensors. In Proceedings of the Twenty-Fourth AAAI Conference on Artificial
Intelligence (AAAI), Atlanta, USA, 2010, pp. 874–880.
6. R. Stranders, A. Rogers, and N. R. Jennings. A Decentralised Coordination Algorithm for Maximising
Sensor Coverage in Large Sensor Networks. In Proceedings of the Ninth International Conference on
Autonomous Agents and Multi-Agent Systems (AAMAS), Toronto, Canada, 2010, pp. 1165–1172.
BAE Systems, one of the sponsors of the ALADDIN project, has applied for a patent on the algorithms
in this paper (British Patent application reference GB1001732.5).
7. T. Voice, R. Stranders, A. Rogers, and N. R. Jennings. A Hybrid Continuous Max-Sum Algorithm
for Decentralised Coordination. In Proceedings of the Nineteenth European Conference on Artificial
Intelligence (ECAI), Lisbon, Portugal, 2010, pp. 61–66. Shortlisted for best paper award.
8. R. Stranders, E. Munoz de Cote, A. Rogers, and N. R. Jennings. Non-Myopic Bounded Approximation for Infinite Horizon Patrolling with Mobile Sensors. Artificial Intelligence Journal (AIJ). In
preparation. (Chapter 7 of my thesis)
9. F. M. Delle Fave, R. Stranders, A. Rogers, and N. R. Jennings. Bounded Decentralised Coordination
over Multiple Objectives. In Proceedings of the Tenth International Conference on Autonomous
Agents and Multiagent Systems (AAMAS), Taipei, Taiwan, 2011. (In Press)
10. R. Stranders, F. M. Delle Fave, A. Rogers, and N. R. Jennings. UGDL: A Decentralised Algorithm
for DCOPs with Uncertainty. Submitted to the Twenty-Fifth Conference on Artificial Intelligence
(AAAI), 2011.