Download An Information Sharing Algorithm For Large Dynamic

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
An Information Sharing Algorithm For Large Dynamic
Mobile Multi-agent Teams
(Extended Abstract)
†
Linglong Zhu† ,Yang Xu† , Paul Scerri‡ , Han Liang†
University of Electronic Science and Technology of China,
‡
Robotics Institute of Carnegie Mellon University
[email protected], [email protected]
ABSTRACT
In large-scale multi-agent systems, communicating effectively
is necessary for agents to cooperatively achieve joint goals.
Despite significant progress on the multi-agent information
sharing problem, existing research has not adequately dealt
with the case of very large teams coordinating using a wireless network with changing team structure and density, where
messages are broadcast to multiple members of the team. In
this paper, we developed a compact and effective information sharing approach for teams with a dynamically changing, broadcast communication medium. By using a matrix
representation of information status, the network structure
and information needs, the model allows efficient reasoning
about communication in a single computation. Empirical
simulation results show that the approach performs well in
large team, and effectively balances sharing key information
with minimizing communication costs.
model the information sharing problem on an ad hoc wireless network with dynamically changing network structure
and density.
A large number of distributed agents {a1 , .., ai , ...} in a
team A are required to move around to observe or gather
information in environment to act towards their common
goal. I = {I1 , .., Ij , ...} represents the available discrete
pieces of information. Agents communicate via a wireless
network N (t) = ∪ n(a, t), where n(a, t) is defined as all
a∈A(t)
I.2.11 [ARTIFICIAL INTELLIGENCE]: Distributed Artificial Intelligence
agents b who have positive probability P r(a, b) of getting
a broadcast message from agent a at time t which depends
on the communication medium, signal strength and physical
distance between agents but is independent of the information being communicated. When a set of related information
gi = {Ii1 , Ii2 , ..., Iik } comes to a single agent, a rational joint
activities can be carried out in the team toward a reward
R(gi ). Information sharing is when an agent gets some information, how team members decide whether to broadcast
or rebroadcast it on the network to make the best tradeoff
between sharing information to get team reward and minimizing the communication cost.
General Terms
2.
Algorithms
Appears in: Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems
(AAMAS 2012), Conitzer, Winikoff, Padgham, and van der Hoek (eds.),
To share information over broadcast media in large multiagent teams, agents independently make decision on broadcasting information they have so that the team reward can
be maximized. We use a simple matrix-based calculation,
called State-Communication-Reward (SCR) that can be done
distributedly, but approximates the complex decision calculation. In this matrix model, one matrix encodes the state of
the team (S), one encodes the communication network (C)
and one encodes the rewards for agents receiving specific information (R). A single multiplication of these matrices and
a comparison to the current communication cost is all an
agent needs to do to decide what to communicate. Instead
of the traditional decisions which have to decide whether
to broadcast the information in the sending queue piece by
piece, this is a lightweight way of making the complex communication calculation for each agent.
Agent a’s local model of deciding whether to broadcast
I
a piece of information Ih is written as < S, Σah , T, R >.
State matrix S : A × Ih models information distribution
over the team. Specifically, the global team state consists
of all the local states of each agent, S = ∪ La where La
4-8 June 2012, Valencia, Spain.
c 2012, International Foundation for Autonomous Agents and
Copyright ⃝
Multiagent Systems (www.ifaamas.org). All rights reserved.
represents the local state of information agent a has received
I
or sensed. Σah : Ih → {1, 0} denotes the action that agent
Categories and Subject Descriptors
Keywords
Communication, Broadcast, Teamwork, Decision-making.
1. MOTIVATION
Large teams of mobile robots are an attractive, emerging
approach to a range of interesting applications. Typically,
communication is required for best performance especially in
complex environments. The exact medium that robots use
to communicate varies from domain to domain, but will typically consist of some sorts of wireless broadcast within a local
area, with robots required to rebroadcast messages to have
the information reach its consumers. In many robot teams,
the available bandwidth will be dramatically less than the
volume of potentially useful messages. In this paper, we
SHARING MODEL WITH BROADCAST
a∈A
I
a broadcasts Ih . The value of Σah is 1 if the information is
I
broadcasted by a, otherwise, the value is 0. T : S×Σah ×S →
′
[0, 1] models the transition function to S when a executes
ΣIah on S. The transition probability is purely based on
how agents are connected, whatever the information content
I
I
is, so T (Lai , Σah , L′ai ) = P r(L′ai |Lai , Σah ) = P r(a, ai ). To
capture agents’ view of the network, we define the matrix
C : A × A → [0, 1], where each element C[ai ][aj ] represents
agent a’s estimate of whether a link exists between ai and
aj , C[ai ][aj ] = P r(ai , aj ). Agent a’s decision is to take an
optimal policy to the next team states that can maximize
the team utility, π ∗ = argmaxΣIh (EU (S ′ )−EU (S)). When
a
a broadcasts Ih , only agents in a’s coverage can potentially
get it, and the expected utility depends on the needs of all
of potential receivers ai which is based on what information
ai has.
EU (ΣIah ) =
∑
casting within a circle. The receive probability was inversely proportional to the distance between two robots,
Prec = 1 − d/120 where d ∈ [0, 120]. The relationship between pieces of information and their corresponding reward
values were predefined according to power law distribution1 .
Moreover, we set sendcost = 10, rececost = 10.
P r(a, ai ) · (EU (Lai ∪ Ih ) − EU (Lai ))
ai ∈n(a,t)
For example, g5 = {I3 , I6 , I7 , I9 }, R(g5 ) = 100, Lai =
{I3 , I6 , I9 } and the expected utility of a given information
set is a value iteration of R(gi ), say, EU (Lai ) = 60 is a value
iteration of R(g5 ) = 100. EU (Lai ∪ I7 ) − EU (Lai ) = 40 denotes that the expected utility credited to the team is 40
when ai receives I7 . Therefore, we setup a reward matrix
R : I × S → R where each element R[Ih ][La ] defines the
expected reward of receiving Ih when agent a’s local information set is La .
By using C to determine the state transition function T ,
the compact decision model SCR is written as: < S, C, R, Σ >.
The complex information sharing decisions can be substituted with a simple matrix computation of S, C and R.
The expected utilities of all possible broadcast decisions
for agents in team A can be calculated as: U = C · S · R
where each element U [a][Ih ] denotes the expected utility for
agent a to send Ih . On broadcast media, agents must balance better between providing useful information and causing network congestion. The actual cost of communication
is very small, written as sendcost, and the real cost is in
overloading the network and preventing other information
getting through. To address this problem, we consider message collision caused by heavy traffic and model the cost
as: commcostt = reccost · ptcoll + sendcost where sendcost
and reccost are constants predefined according to the actual medium. According to the research in literature [1], we
assume that agent can locally estimate the collision probability ptcoll based on the number of messages it currently
receives. A piece of information will be broadcasted if its
expected utility is higher than the cost it occurs. When the
information utilities are fixed, our dynamic model of communication cost makes the sender keep silent if the network
is busy but broadcast the information when the network is
relatively idle. In this way, SCR model can be adaptive to
the dynamically changing network traffic and make the most
use of limited network bandwidth.
3. EXPERIMENTAL RESULT
Our experiment simulates a group of 100 decentralized
robots executing tasks. These robots expanded from a 1502
to a 4002 units area of region. 500 pieces of randomly distributed information were available to be sensed and communicated. Robots communicated with each other by broad-
Figure 1: Sharing information with 100 robots. (a) The
efficiencies of different algorithms. (b) The efficiencies
of SCR algorithm when robots have one, five, ten and
twenty different types.
The ratio of accumulative reward and corresponding number of messages sent by the team, which is ef f iciency =
Rget /numsend , measures the algorithm’s performance of balancing between sharing information and minimizing communication costs. Figure 1a describes the sharing performances of 100 robots with 20 different types by using the
Flooding [2], SBA [3] and our SCR algorithm. The reward
matrixes were constructed differently for robots with different capabilities according to P (k2 )1 . As shown in Figure
1a, the efficiency of SCR algorithm is far better than Flooding and SBA. The reason is that robots’ local SCR models
effectively constrain broadcast of useless messages. Figure
1b shows that the efficiency of SCR algorithm changes little
with more and more heterogeneous robot teams.
4.
ACKNOWLEDGEMENTS
This research has been sponsored in part by National Natural Science Foundation of China 60905042 and 60950110354,
and Research Funds for Central Universities ZYGX2011X013.
5.
REFERENCES
[1] Y. Tay and Chua C. A capacity analysis for the IEEE
802.11 MAC protocol. Wireless networks, 7, 2001.
[2] C. Ho and et al. Flooding for reliable multicast in
multi-hop ad hoc networks. In Discrete algorithms for
mobile computing and communications, 1999.
[3] W. Peng and X.C. Lu. On the reduction of broadcast
redundancy in mobile ad hoc networks. In Mobile and
Ad Hoc Networking and Computing, 2002.
Distribution P (k1 ) = 0.3868k1 −1.3 describes the percent of
information which has relation with k1 pieces of other information and k1 ∈ [1, 20]. The information sets can be constructed according to this distribution when the number of
information is known. The reward value of information sets
k2 is deferred to distribution P (k2 ) = 0.3769k2 −1.3 where
k2 ∈ [1, 200].
1