Survey
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
An Information Sharing Algorithm For Large Dynamic Mobile Multi-agent Teams (Extended Abstract) † Linglong Zhu† ,Yang Xu† , Paul Scerri‡ , Han Liang† University of Electronic Science and Technology of China, ‡ Robotics Institute of Carnegie Mellon University [email protected], [email protected] ABSTRACT In large-scale multi-agent systems, communicating effectively is necessary for agents to cooperatively achieve joint goals. Despite significant progress on the multi-agent information sharing problem, existing research has not adequately dealt with the case of very large teams coordinating using a wireless network with changing team structure and density, where messages are broadcast to multiple members of the team. In this paper, we developed a compact and effective information sharing approach for teams with a dynamically changing, broadcast communication medium. By using a matrix representation of information status, the network structure and information needs, the model allows efficient reasoning about communication in a single computation. Empirical simulation results show that the approach performs well in large team, and effectively balances sharing key information with minimizing communication costs. model the information sharing problem on an ad hoc wireless network with dynamically changing network structure and density. A large number of distributed agents {a1 , .., ai , ...} in a team A are required to move around to observe or gather information in environment to act towards their common goal. I = {I1 , .., Ij , ...} represents the available discrete pieces of information. Agents communicate via a wireless network N (t) = ∪ n(a, t), where n(a, t) is defined as all a∈A(t) I.2.11 [ARTIFICIAL INTELLIGENCE]: Distributed Artificial Intelligence agents b who have positive probability P r(a, b) of getting a broadcast message from agent a at time t which depends on the communication medium, signal strength and physical distance between agents but is independent of the information being communicated. When a set of related information gi = {Ii1 , Ii2 , ..., Iik } comes to a single agent, a rational joint activities can be carried out in the team toward a reward R(gi ). Information sharing is when an agent gets some information, how team members decide whether to broadcast or rebroadcast it on the network to make the best tradeoff between sharing information to get team reward and minimizing the communication cost. General Terms 2. Algorithms Appears in: Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2012), Conitzer, Winikoff, Padgham, and van der Hoek (eds.), To share information over broadcast media in large multiagent teams, agents independently make decision on broadcasting information they have so that the team reward can be maximized. We use a simple matrix-based calculation, called State-Communication-Reward (SCR) that can be done distributedly, but approximates the complex decision calculation. In this matrix model, one matrix encodes the state of the team (S), one encodes the communication network (C) and one encodes the rewards for agents receiving specific information (R). A single multiplication of these matrices and a comparison to the current communication cost is all an agent needs to do to decide what to communicate. Instead of the traditional decisions which have to decide whether to broadcast the information in the sending queue piece by piece, this is a lightweight way of making the complex communication calculation for each agent. Agent a’s local model of deciding whether to broadcast I a piece of information Ih is written as < S, Σah , T, R >. State matrix S : A × Ih models information distribution over the team. Specifically, the global team state consists of all the local states of each agent, S = ∪ La where La 4-8 June 2012, Valencia, Spain. c 2012, International Foundation for Autonomous Agents and Copyright ⃝ Multiagent Systems (www.ifaamas.org). All rights reserved. represents the local state of information agent a has received I or sensed. Σah : Ih → {1, 0} denotes the action that agent Categories and Subject Descriptors Keywords Communication, Broadcast, Teamwork, Decision-making. 1. MOTIVATION Large teams of mobile robots are an attractive, emerging approach to a range of interesting applications. Typically, communication is required for best performance especially in complex environments. The exact medium that robots use to communicate varies from domain to domain, but will typically consist of some sorts of wireless broadcast within a local area, with robots required to rebroadcast messages to have the information reach its consumers. In many robot teams, the available bandwidth will be dramatically less than the volume of potentially useful messages. In this paper, we SHARING MODEL WITH BROADCAST a∈A I a broadcasts Ih . The value of Σah is 1 if the information is I broadcasted by a, otherwise, the value is 0. T : S×Σah ×S → ′ [0, 1] models the transition function to S when a executes ΣIah on S. The transition probability is purely based on how agents are connected, whatever the information content I I is, so T (Lai , Σah , L′ai ) = P r(L′ai |Lai , Σah ) = P r(a, ai ). To capture agents’ view of the network, we define the matrix C : A × A → [0, 1], where each element C[ai ][aj ] represents agent a’s estimate of whether a link exists between ai and aj , C[ai ][aj ] = P r(ai , aj ). Agent a’s decision is to take an optimal policy to the next team states that can maximize the team utility, π ∗ = argmaxΣIh (EU (S ′ )−EU (S)). When a a broadcasts Ih , only agents in a’s coverage can potentially get it, and the expected utility depends on the needs of all of potential receivers ai which is based on what information ai has. EU (ΣIah ) = ∑ casting within a circle. The receive probability was inversely proportional to the distance between two robots, Prec = 1 − d/120 where d ∈ [0, 120]. The relationship between pieces of information and their corresponding reward values were predefined according to power law distribution1 . Moreover, we set sendcost = 10, rececost = 10. P r(a, ai ) · (EU (Lai ∪ Ih ) − EU (Lai )) ai ∈n(a,t) For example, g5 = {I3 , I6 , I7 , I9 }, R(g5 ) = 100, Lai = {I3 , I6 , I9 } and the expected utility of a given information set is a value iteration of R(gi ), say, EU (Lai ) = 60 is a value iteration of R(g5 ) = 100. EU (Lai ∪ I7 ) − EU (Lai ) = 40 denotes that the expected utility credited to the team is 40 when ai receives I7 . Therefore, we setup a reward matrix R : I × S → R where each element R[Ih ][La ] defines the expected reward of receiving Ih when agent a’s local information set is La . By using C to determine the state transition function T , the compact decision model SCR is written as: < S, C, R, Σ >. The complex information sharing decisions can be substituted with a simple matrix computation of S, C and R. The expected utilities of all possible broadcast decisions for agents in team A can be calculated as: U = C · S · R where each element U [a][Ih ] denotes the expected utility for agent a to send Ih . On broadcast media, agents must balance better between providing useful information and causing network congestion. The actual cost of communication is very small, written as sendcost, and the real cost is in overloading the network and preventing other information getting through. To address this problem, we consider message collision caused by heavy traffic and model the cost as: commcostt = reccost · ptcoll + sendcost where sendcost and reccost are constants predefined according to the actual medium. According to the research in literature [1], we assume that agent can locally estimate the collision probability ptcoll based on the number of messages it currently receives. A piece of information will be broadcasted if its expected utility is higher than the cost it occurs. When the information utilities are fixed, our dynamic model of communication cost makes the sender keep silent if the network is busy but broadcast the information when the network is relatively idle. In this way, SCR model can be adaptive to the dynamically changing network traffic and make the most use of limited network bandwidth. 3. EXPERIMENTAL RESULT Our experiment simulates a group of 100 decentralized robots executing tasks. These robots expanded from a 1502 to a 4002 units area of region. 500 pieces of randomly distributed information were available to be sensed and communicated. Robots communicated with each other by broad- Figure 1: Sharing information with 100 robots. (a) The efficiencies of different algorithms. (b) The efficiencies of SCR algorithm when robots have one, five, ten and twenty different types. The ratio of accumulative reward and corresponding number of messages sent by the team, which is ef f iciency = Rget /numsend , measures the algorithm’s performance of balancing between sharing information and minimizing communication costs. Figure 1a describes the sharing performances of 100 robots with 20 different types by using the Flooding [2], SBA [3] and our SCR algorithm. The reward matrixes were constructed differently for robots with different capabilities according to P (k2 )1 . As shown in Figure 1a, the efficiency of SCR algorithm is far better than Flooding and SBA. The reason is that robots’ local SCR models effectively constrain broadcast of useless messages. Figure 1b shows that the efficiency of SCR algorithm changes little with more and more heterogeneous robot teams. 4. ACKNOWLEDGEMENTS This research has been sponsored in part by National Natural Science Foundation of China 60905042 and 60950110354, and Research Funds for Central Universities ZYGX2011X013. 5. REFERENCES [1] Y. Tay and Chua C. A capacity analysis for the IEEE 802.11 MAC protocol. Wireless networks, 7, 2001. [2] C. Ho and et al. Flooding for reliable multicast in multi-hop ad hoc networks. In Discrete algorithms for mobile computing and communications, 1999. [3] W. Peng and X.C. Lu. On the reduction of broadcast redundancy in mobile ad hoc networks. In Mobile and Ad Hoc Networking and Computing, 2002. Distribution P (k1 ) = 0.3868k1 −1.3 describes the percent of information which has relation with k1 pieces of other information and k1 ∈ [1, 20]. The information sets can be constructed according to this distribution when the number of information is known. The reward value of information sets k2 is deferred to distribution P (k2 ) = 0.3769k2 −1.3 where k2 ∈ [1, 200]. 1