Decentralised Coordination of Information Gathering Agents

Ruben Stranders
[email protected]
http://users.ecs.soton.ac.uk/rs06r/

This is an offline version of http://users.ecs.soton.ac.uk/rs06r/thesis.html without videos.

1 Introduction

Unmanned sensors are rapidly becoming the de facto means of achieving situational awareness (the ability to make sense of, and predict, what is happening in an environment) in disaster management, military reconnaissance, space exploration, and climate research. In these domains, and many others besides, their use reduces the need to expose humans to hostile, impassable or polluted environments. Currently, these sensors are often pre-programmed or remotely controlled by human operators. However, there is a clear trend toward making these sensors fully autonomous, thus enabling them to make decisions without human intervention.

Full autonomy has two clear benefits over pre-programming and human remote control. First, in contrast to sensors with pre-programmed motion paths, autonomous sensors are better able to adapt to their environment, and to react to a priori unknown external events or hardware failure. Second, autonomous sensors can operate in large teams that would otherwise be too complex for human operators to control. The key benefit of this is that a team of cheap, small sensors can achieve through cooperation the same results as individual large, expensive sensors, with more flexibility and robustness.

2 Information Gathering Agents

In light of the importance of autonomy and cooperation, the multi-agent paradigm is a suitable framework for modelling the operation of these sensors and controlling them in a decentralised fashion. Within this paradigm, each sensor becomes an information gathering agent. As a team, these agents direct their collective activity towards collecting information from their environment with the aim of providing accurate and up-to-date situational awareness.

Against this background, the central problem I address in my thesis is that of achieving accurate situational awareness through the coordination of multiple information gathering agents. To achieve general and principled solutions to this problem, I formulate a generic problem definition, which captures the essential properties of dynamic environments, the capabilities of the agents, and the interactions between them. Specific instantiations of this generic problem span a broad spectrum of concrete application domains, of which I consider four canonical examples:

• Monitoring dynamic and uncertain environmental conditions, such as radiation, temperature and gas concentrations.
• Pursuit-evasion: finding and capturing a moving target, be it cooperative (e.g. a wounded civilian in a disaster scenario) or uncooperative (e.g. an attacker in a military scenario).
• Patrolling a building or perimeter to prevent or detect intrusions.
• Performing surveillance in a large area by detecting, classifying and tracking events.

Figure 1: Some examples of fixed and mobile sensors: (a) a Floodnet sensor, (b) the Mars Rover, (c) the Predator UAV, (d) the Talisman UUV.

In all these application domains, it is imperative that the agents exhibit behaviour that satisfies the following design requirements:

• Accuracy: maximise the quality of situational awareness.
• Adaptiveness: respond to a priori unknown events.
• Robustness: degrade gracefully in the face of component failures. The hostility of the agents' environment can cause individual agents to fail, and such failures should not lead to a detrimental decrease in the performance of the remaining agents.
• Autonomy: make decisions without the intervention of a centralised controller. In practical terms, this means that agents coordinate using local computation and communication with their direct neighbours.
• Modularity: support heterogeneous agents.
For instance, agents equipped with different types of sensors, whether ground-based or airborne, should be able to interoperate in a single team.
• Performance guarantees: provide a lower bound on the quality of the achieved situational awareness. This is particularly important in safety-critical scenarios, where pathological behaviour needs to be ruled out.

2.1 Thesis Contributions

To this end, my main contribution consists of decentralised algorithms that satisfy these requirements while coordinating teams of information gathering agents subject to additional constraints (such as motion, communication, and energy constraints).

Table 1: The key properties of the four main contributions.

These algorithms can be grouped into two categories, each of which can in turn be divided into two main contributions:

• Decentralised Coordination of Fixed Agents
A: Decentralised Coordination of Fixed Agents during Deployment. A coordination algorithm that allows fixed agents (for example, those that are part of a fixed wireless sensor network) to establish a reliable communication network while simultaneously maximising the quality of situational awareness. BAE Systems, one of the sponsors of the ALADDIN project (see footnote 1), has applied for a patent on this algorithm (British Patent application reference GB1001732.5) [6].
B: Decentralised Coordination for Fixed Agents during Operation. Two novel coordination algorithms for solving Distributed Constraint Optimisation Problems (DCOPs) with continuous variables. These algorithms are applied to coordinate the actions (for example, the viewing angles of cameras or radars) of a team of fixed agents. Part of this work was shortlisted for a best paper award at ECAI 2010 [7].
• Decentralised Coordination of Mobile Agents
C: Decentralised Receding Horizon Control of Mobile Agents.
A decentralised coordination algorithm that allows mobile agents to provide accurate situational awareness in environments that constrain their motion (e.g. through the turning radius of a UAV or the layout of a building). Using this algorithm, agents attempt to maximise the accuracy of situational awareness over a receding planning horizon (i.e. every m timesteps, the agents compute a path of length l > m). It is the first online agent-based algorithm for the domains of pursuit-evasion, patrolling and monitoring uncertain environmental phenomena.
D: Non-Myopic Control of Mobile Sensors with Performance Guarantees. A novel algorithm for multi-agent patrolling in continuously changing environments. This algorithm uses sequential decision making techniques to compute patrols for each agent, which are subsequently improved upon using decentralised coordination. As a result, this algorithm is capable of providing strong performance guarantees.

Table 1 summarises the main properties of these four contributions. In what follows, each of these is discussed in more detail.

3 A: Decentralised Coordination of Fixed Agents during Deployment

We derive a decentralised algorithm [6] that maximises the accuracy of situational awareness, while simultaneously constructing a reliable communication network between the agents. Specifically, we present a novel solution to the frequency allocation problem. Instead of directly solving this NP-hard problem (it is equivalent to graph colouring) for the communication network that exists among all agents, the idea is to deactivate certain agents such that the problem can provably be solved in polynomial time. We show that selecting the agents to deactivate while simultaneously maximising accuracy is itself NP-hard, and develop an efficient approximation algorithm that carefully selects which agents should be deactivated in order to maximise the accuracy achieved by the remaining (active) agents.
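
To make the link between frequency allocation and graph colouring concrete, the following is a minimal sketch and not the algorithm of [6]. It assumes that agents are vertices, that an edge joins two agents within communication range, and that a valid allocation simply gives neighbouring agents different frequencies. The greedy heuristic is a standard textbook technique; the function names and the example network are hypothetical.

```python
def greedy_frequency_allocation(neighbours):
    """Give each agent the lowest frequency index not used by its neighbours.

    `neighbours` maps an agent to the set of agents within communication range.
    This is plain greedy colouring (an assumption for illustration only).
    """
    allocation = {}
    # Handle high-degree agents first: a common heuristic, not an optimality guarantee.
    for agent in sorted(neighbours, key=lambda a: -len(neighbours[a])):
        used = {allocation[n] for n in neighbours[agent] if n in allocation}
        freq = 0
        while freq in used:
            freq += 1
        allocation[agent] = freq
    return allocation


def is_valid(neighbours, allocation):
    """A valid allocation never gives two neighbouring agents the same frequency."""
    return all(allocation[a] != allocation[b]
               for a in neighbours for b in neighbours[a])


# Hypothetical four-agent network: a, b and c are mutually in range, d only reaches c.
network = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
allocation = greedy_frequency_allocation(network)
```

Greedy colouring always yields a valid allocation but in general gives no guarantee on the number of frequencies used; the approach of [6] instead deactivates agents until the allocation problem becomes solvable in polynomial time.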
We empirically show that our approximation algorithm is no more than 10% away from the optimal solution, and thus provides high quality situational awareness. Moreover, it is robust (it is capable of replacing failed agents with deactivated ones), maintains the autonomy of the agents, and is very scalable (the computational overhead of an agent grows polynomially with the number of neighbours, not with the total number of agents). We also prove that a centralised version of this algorithm provides theoretical performance guarantees.

Video 1 on the website shows the operation of the algorithm on an example random deployment of fixed agents with limited communication and sensing range.

Footnote 1: See http://www.aladdinproject.org/

4 B: Decentralised Coordination for Fixed Agents during Operation

We develop the first algorithms for distributed constraint optimisation problems (DCOPs) with continuous variables. In these DCOPs, the agents' collective goal is to maximise a global objective function that can be factorised into a sum of local functions that represent the interactions between agents. While existing algorithms are only capable of solving DCOPs with discrete variables, our algorithms, called CPLF-MS [2] (for linear local functions) and HCMS [7] (for non-linear local functions), significantly increase the expressiveness and applicability of the DCOP formalism by allowing for continuous variables. This is a more natural assumption, since many multi-agent coordination problems are defined over continuous values. Figure 1 shows two examples of domains with fixed information gathering agents. Two additional examples include the problem of coordinating multiple UAVs (whose control variables include pitch, roll and thrust), and controlling flows in a network (Internet traffic, electricity grids, etc.), both of which are governed by variables with continuous domains.
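
The structure of such a DCOP, and the cost of discretising it, can be illustrated with a small sketch. It is neither CPLF-MS nor HCMS: it uses centralised brute-force search purely to show that a global objective built as a sum of local functions over continuous variables can lose quality when each variable is restricted to a coarse discrete set. The three-agent chain, the preferred offsets and the cosine-shaped local function are all made-up assumptions.

```python
import itertools
import math

N_AGENTS = 3
# Hypothetical chain of fixed agents; each controls one viewing angle in [0, pi].
EDGES = [(0, 1), (1, 2)]
OFFSETS = {(0, 1): 0.7, (1, 2): 1.9}  # made-up preferred angle differences


def local_utility(edge, x_i, x_j):
    """Non-linear local function, highest when x_j - x_i equals the edge's offset."""
    return math.cos((x_j - x_i) - OFFSETS[edge])


def global_utility(x):
    """DCOP objective: a sum of local functions, one per pair of interacting agents."""
    return sum(local_utility((i, j), x[i], x[j]) for i, j in EDGES)


def best_on_grid(points_per_agent):
    """Exhaustively search a uniform discretisation of every agent's domain."""
    step = math.pi / (points_per_agent - 1)
    grid = [k * step for k in range(points_per_agent)]
    return max(itertools.product(grid, repeat=N_AGENTS), key=global_utility)


coarse = best_on_grid(3)   # three candidate angles per agent
fine = best_on_grid(41)    # a much finer approximation of the continuous domain
print(global_utility(coarse), global_utility(fine))
```

The finer grid recovers most of the utility that the coarse one leaves behind, but exhaustive search grows exponentially with the number of agents, which is one motivation for algorithms that reason about continuous variables directly and exploit the factorisation.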

Both CPLF-MS and HCMS are based on the state-of-the-art max-sum algorithm, whose applicability was thus far limited to domains with discrete action variables. We study the application of these algorithms in two information gathering settings with fixed agents, and empirically demonstrate their effectiveness. Specifically, we show that these algorithms improve the solution quality by up to 40% compared to the standard (discrete) max-sum algorithm, and therefore improve the accuracy of situational awareness in these domains. Moreover, they scale well: their computational and communication overhead grows exponentially in the number of neighbours of an agent, but linearly in the total number of agents. They are adaptive, since they can be run continuously, effectively and efficiently, to respond to changes in the agents' environments.

5 C: Decentralised Receding Horizon Control of Mobile Agents

We develop a decentralised control algorithm for mobile agents [1, 3, 4, 5] whose motion is subject to constraints. These motion constraints can be used to model the physical layout of the environment (such as the floor map of a building), as well as the intrinsic movement constraints of the agent itself (e.g. the minimum turning radius of a UAV). This algorithm establishes decentralised adaptive receding horizon control, meaning that agents periodically coordinate as a team to maximise the accuracy of situational awareness for a fixed number of time steps into the future. Thus, agents coordinate their plans (i.e. finitely long paths in their environment), which they (partially) implement, after which coordination takes place again. We benchmark this algorithm against the state of the art and demonstrate that it significantly increases the accuracy of situational awareness in three highly dynamic domains: monitoring spatial phenomena (see Video 2 on the website), pursuit-evasion (see Video 3 on the website), and patrolling.
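
The plan, partially execute, replan cycle described above can be sketched for a single agent as follows. This is an illustration under strong assumptions rather than the thesis algorithm: there is one agent, so no coordination takes place; the environment is a hypothetical grid whose cells hold a made-up amount of information that is consumed on the first visit; and planning is by exhaustive enumeration. The symbols l and m follow the text (a path of length l > m is computed every m timesteps).

```python
import itertools

MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # unit moves on a 4-connected grid


def path_value(start, moves, info):
    """Information collected along a path; each cell counts only once per plan."""
    pos, seen, total = start, set(), 0.0
    for dx, dy in moves:
        pos = (pos[0] + dx, pos[1] + dy)
        if pos not in info:  # leaving the grid violates the motion constraint
            return float("-inf")
        if pos not in seen:
            seen.add(pos)
            total += info[pos]
    return total


def plan(start, info, horizon):
    """Exhaustively pick the best feasible move sequence of length `horizon`."""
    return max(itertools.product(MOVES, repeat=horizon),
               key=lambda moves: path_value(start, moves, info))


def receding_horizon(start, info, l=4, m=2, rounds=3):
    """Every m steps, plan l > m steps ahead, execute the first m, then replan."""
    assert l > m
    info, pos, collected, trace = dict(info), start, 0.0, [start]
    for _ in range(rounds):
        for dx, dy in plan(pos, info, l)[:m]:
            pos = (pos[0] + dx, pos[1] + dy)
            collected += info[pos]
            info[pos] = 0.0  # an observed cell yields no further information
            trace.append(pos)
    return trace, collected


# Hypothetical 3x3 environment in which information increases towards one corner.
info = {(x, y): float(x + y) for x in range(3) for y in range(3)}
trace, collected = receding_horizon((0, 0), info)
print(trace, collected)
```

Planning further ahead than is executed lets each plan account for consequences beyond the next replanning point, while frequent replanning keeps the path responsive to newly observed changes. The thesis algorithm additionally coordinates the plans of all agents using max-sum, which this single-agent sketch omits.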
Furthermore, the algorithm is scalable, since it is based on the max-sum algorithm (for which we develop several techniques to reduce computational overhead [3]), so that the computational overhead of a single agent grows with the number of neighbours, not the size of the team; it is adaptive, since the receding horizon technique revises the paths of the agents frequently based on newly observed events; and it is modular, since it supports agents of different types and corresponding movement constraints. Most importantly, this algorithm is the first online agent-based algorithm for the domains of pursuit-evasion, patrolling, and monitoring environmental phenomena.

6 D: Non-Myopic Control of Mobile Agents with Performance Guarantees

We present a novel algorithm for patrolling (continuously monitoring a dynamic environment) using a team of mobile agents. It is the first algorithm that takes into account the temporality of the environment, a property that models a continuous rate of change. As a result, the patrols computed by the algorithm are designed to monitor continuously changing environments, and thus periodically (and infinitely often) return to the same location to provide up-to-date situational awareness. The algorithm operates in the same type of environments as the receding horizon control algorithm of contribution C. However, this algorithm is non-myopic, and as such has an infinite planning horizon. Because of this, the algorithm is able to provide strong performance guarantees. Moreover, the algorithm is a hybrid between offline preprocessing and online decentralised coordination, where the latter is used to improve the solution quality of the former, and to provide a more adaptive and robust solution. In more detail, the algorithm follows a three-step computation: decompose the environment into connected clusters, compute accurate subpatrols within each cluster, and concatenate these subpatrols to form the desired patrol.
The third phase uses a Markov Decision Process to optimally solve the simplified problem of finding a sequence in which to visit the clusters. Video 4 on the website demonstrates these three steps in operation in an environment with three agents tasked with minimising the intra-visit time of the locations of the map.

7 Summary

The four contributions discussed above represent an advance in the state of the art of decentralised coordination in general, and decentralised coordination of information gathering agents in particular. Taken together, they constitute a step towards achieving situational awareness through the use of teams of autonomous unmanned sensors.

7.1 References

1. R. Stranders, A. Rogers, and N. R. Jennings. A Decentralised, Online Coordination Mechanism for Monitoring Spatial Phenomena with Mobile Sensors. In Proceedings of the Second International Workshop on Agent Technology for Sensor Networks (ATSN), Estoril, Portugal, 2008, pp. 9–15.
2. R. Stranders, A. Farinelli, A. Rogers, and N. R. Jennings. Decentralised Coordination of Continuously Valued Control Parameters using the Max-Sum Algorithm. In Proceedings of the Eighth International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Budapest, Hungary, 2009, pp. 601–608.
3. R. Stranders, A. Farinelli, A. Rogers, and N. R. Jennings. Decentralised Coordination of Mobile Sensors using the Max-Sum Algorithm. In Proceedings of the 21st International Joint Conference on AI (IJCAI), Pasadena, USA, 2009, pp. 299–304.
4. A. Rogers, A. Farinelli, R. Stranders, and N. R. Jennings. Decentralised Coordination for Embedded Agents using the Max-Sum Algorithm. Artificial Intelligence Journal (AIJ) 172(2), 2011.
5. R. Stranders, F. M. Delle Fave, A. Rogers, and N. R. Jennings. A Decentralised Coordination Algorithm for Mobile Sensors. In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI), Atlanta, USA, 2010, pp. 874–880.
6. R. Stranders, A. Rogers, and N. R. Jennings. A Decentralised Coordination Algorithm for Maximising Sensor Coverage in Large Sensor Networks. In Proceedings of the Ninth International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Toronto, Canada, 2010, pp. 1165–1172. BAE Systems, one of the sponsors of the ALADDIN project, has applied for a patent on the algorithms in this paper (British Patent application reference GB1001732.5).
7. T. Voice, R. Stranders, A. Rogers, and N. R. Jennings. A Hybrid Continuous Max-Sum Algorithm for Decentralised Coordination. In Proceedings of the Nineteenth European Conference on Artificial Intelligence (ECAI), Lisbon, Portugal, 2010, pp. 61–66. Shortlisted for best paper award.
8. R. Stranders, E. Munoz de Cote, A. Rogers, and N. R. Jennings. Non-Myopic Bounded Approximation for Infinite Horizon Patrolling with Mobile Sensors. Artificial Intelligence Journal (AIJ). In preparation. (Chapter 7 of my thesis)
9. F. M. Delle Fave, R. Stranders, A. Rogers, and N. R. Jennings. Bounded Decentralised Coordination over Multiple Objectives. In Proceedings of the Tenth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Taipei, Taiwan, 2011. (In press)
10. R. Stranders, F. M. Delle Fave, A. Rogers, and N. R. Jennings. UGDL: A Decentralised Algorithm for DCOPs with Uncertainty. Submitted to the Twenty-Fifth Conference on Artificial Intelligence (AAAI), 2011.