Erlang routing mesh overview and implementation details.
Industrial project 234313, CS, Technion
Sergey Semenko & Ivan Nesmeyanov
Under the supervision of Eliezer Levy
Node structure. User processes
[Diagram: a virtual machine hosting one NM and three Workers (W)]
• NM – Node Manager: the process in charge of communicating with the mesh and dispatching jobs to Workers. An NM can play two different roles in the mesh, Root or Leaf; more on this in the mesh topology description.
• W – Worker: a process that executes jobs, one job at a time. When done, it notifies the NM and the client.
Node structure. Supervision tree
[Diagram: a Supervisor above the NM and three Workers]
• S – Supervisor: a system process in charge of restarting crashed processes. It doesn’t implement user logic.
• NM and W are user processes.
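A minimal sketch of such a supervision tree, in the spirit of the diagram above (module names, the number of workers, and the restart intensity are illustrative, not taken from the project):

```erlang
%% Illustrative per-node supervisor: restarts the NM or a Worker if it crashes.
-module(node_sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% one_for_one: a crashed child is restarted without touching its siblings.
    SupFlags = #{strategy => one_for_one, intensity => 5, period => 10},
    NM = #{id => node_manager, start => {node_manager, start_link, []}},
    Workers = [#{id => {worker, N}, start => {worker, start_link, [N]}}
               || N <- lists:seq(1, 3)],
    {ok, {SupFlags, [NM | Workers]}}.
```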
Node Manager’s roles. Leaf
[Diagram: a Leaf NM dispatching a job to three Workers, with “done” notifications and the result flowing back]
• A Leaf node manager dispatches received jobs to its local workers, gets “done” notifications, and notifies its root.
• A worker sends the job result directly to the client and notifies the node manager.
Node manager’s roles. Root
[Diagram: a Root (R) with three Leaves (L)]
• Roots are responsible for getting jobs from the web server and forwarding them to the least occupied Leaf they have.
• Roots are also responsible for accepting join-mesh requests from node managers and assigning roles to them.
• Roots do not execute jobs on their local workers.
Mesh topology overview
[Diagram: HTTP requests arrive at the Web Server, which forwards mesh job requests to several Roots (R)]
• Each HTTP request is forwarded by the web server to a randomly chosen Root.
• Each Root forwards jobs to the least occupied Leaves registered with it.
• Results are sent directly to the web server from the workers.
Mesh topology. Join protocol
• Each root has at most MaxChildNum leaves. A root that has the maximum number of children is called saturated; a root with fewer children is called hungry.
• There are always at least MinRootNum roots in an operating mesh. If there are fewer, each new node is assigned the root role.
• If all roots are saturated, a new node is also assigned to be a Root. Otherwise it becomes a hungry root’s leaf.
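The role-assignment decision above can be sketched as a single function. The group names follow the slides; `assign_role/1` itself is illustrative and assumes both pg2 groups already exist:

```erlang
%% Sketch of the join protocol’s role assignment for a new node.
assign_role(MinRootNum) ->
    Roots  = pg2:get_members(root_group),
    Hungry = pg2:get_members(hungry_group),
    if
        length(Roots) < MinRootNum -> root;  % too few roots in the mesh
        Hungry =:= []              -> root;  % every root is saturated
        true -> {leaf, hd(Hungry)}           % attach to a hungry root
    end.
```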
Recovery protocol
• When a leaf crashes, its root is notified and the leaf’s jobs are reassigned.
• When a root crashes, all its leaves perform a join request. If such a leaf gets the root role, it reassigns its pending jobs.
• When both a leaf and its root crash and no information about the job is left in the mesh, the web server resends the job after a timeout.
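A leaf’s side of this recovery could look like the following gen_server clause. Here `join_mesh/0` and `dispatch_to_least_occupied/1` are assumed helpers, and `#state{}` is an assumed node-manager state record; the slides do not show the actual code:

```erlang
%% Sketch: a leaf reacting to its root’s crash (a 'DOWN' message delivered
%% because the leaf monitors its root).
-record(state, {root, pending = []}).

handle_info({'DOWN', _Ref, process, Root, _Reason},
            #state{root = Root, pending = Pending} = State) ->
    case join_mesh() of
        {leaf, NewRoot} ->
            %% rejoined the mesh as a leaf under a new root
            {noreply, State#state{root = NewRoot}};
        root ->
            %% promoted to root: roots do not execute jobs themselves,
            %% so reassign our pending jobs to leaves
            [dispatch_to_least_occupied(J) || J <- Pending],
            {noreply, State#state{root = undefined, pending = []}}
    end.
```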
Sending job protocol
• The web server randomly chooses one of the roots and sends the request to it. If that root has no leaves registered, the web server is notified and the request is resent. Otherwise the job is forwarded to the least occupied leaf of that root.
• If all of a leaf’s workers are occupied, the job is stored in its pending-jobs list. When a worker becomes available, it is assigned one of the pending jobs.
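The web-server side of this protocol might be sketched as below; `gen_server:call` stands in for whatever request the project actually sends to a root, and the `no_leaves`/`accepted` replies are illustrative:

```erlang
%% Sketch: pick a random root, retry if it reports having no leaves.
send_job(Job) ->
    Roots = pg2:get_members(root_group),
    Root  = lists:nth(rand:uniform(length(Roots)), Roots),
    case gen_server:call(Root, {job, Job}) of
        no_leaves -> send_job(Job);  % root had no leaves: pick again
        accepted  -> ok
    end.
```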
Implementation details. Process groups
• There are two registered pg2 groups: root_group and hungry_group.
• When a node manager becomes a root, it joins root_group and hungry_group.
• When a root has the maximum number of children, it leaves hungry_group.
• When a saturated root loses a leaf, it rejoins hungry_group.
• Access to the groups is synchronized by a global lock to avoid race conditions.
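The pg2 bookkeeping above amounts to a few calls (note that pg2 shipped with older OTP releases and was replaced by the `pg` module in OTP 24; function names below are real pg2 API, the wrapper names are ours):

```erlang
%% Sketch of the group membership transitions described above.
become_root() ->
    ok = pg2:create(root_group),          % create/1 is a no-op if it exists
    ok = pg2:create(hungry_group),
    ok = pg2:join(root_group, self()),
    ok = pg2:join(hungry_group, self()).

on_saturated() ->                         % reached MaxChildNum children
    ok = pg2:leave(hungry_group, self()).

on_leaf_lost() ->                         % a saturated root lost a leaf
    ok = pg2:join(hungry_group, self()).
```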
Locking clarification
• The Erlang way to synchronize access to a shared resource is to implement a “resource manager” process that receives access requests and executes them.
• Due to the requirement to have no single point of failure, we decided not to implement such a process to synchronize access to the root groups.
• Hence we were forced to implement a locking primitive, which is not the Erlang way to solve the problem.
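The slides do not say which primitive was implemented. One standard option that avoids a single manager process is `global:trans/2`, which runs a function while holding a lock visible on all connected nodes:

```erlang
%% Sketch: serialize access to the root groups with a mesh-wide lock.
%% The lock id {root_groups, self()} is an illustrative choice.
with_group_lock(Fun) ->
    global:trans({root_groups, self()}, Fun).
```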
Implementation details. Monitoring
• Each leaf monitors its root (via erlang:monitor/2).
• Each root monitors its leaves.
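For example, a root might start monitoring a newly joined leaf as follows; `#state{}` and `accept_leaf/2` are assumed names, while the `erlang:monitor/2` call and the shape of the resulting `'DOWN'` message are standard:

```erlang
%% Sketch: monitor a leaf that has just joined. If the leaf dies, this
%% root receives {'DOWN', Ref, process, LeafPid, Reason}.
-record(state, {leaves = []}).

accept_leaf(LeafPid, #state{leaves = Leaves} = State) ->
    Ref = erlang:monitor(process, LeafPid),
    State#state{leaves = [{LeafPid, Ref} | Leaves]}.
```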
Implementation details. Node manager module
• Implements the gen_server behaviour.
• The role is kept in the server state; when the role changes, only the corresponding field of the state changes. The node manager remains capable of processing the other role’s messages (this is useful when a leaf turns into a root and might still get job_done messages from its workers).
Implementation details. Worker module
• Implements the gen_server behaviour.
• Its only function is to execute jobs.
• Job execution itself is not of interest here; it is simulated by sleeping for a certain amount of time.
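A minimal sketch of such a worker, with the job simulated by `timer:sleep/1` as the slides describe (the message shapes and the cast-based protocol are our assumptions):

```erlang
%% Illustrative worker: executes one job at a time, sends the result
%% directly to the client, and tells the node manager it is free again.
-module(worker).
-behaviour(gen_server).
-export([start_link/1, init/1, handle_call/3, handle_cast/2]).

start_link(NM) -> gen_server:start_link(?MODULE, NM, []).

init(NM) -> {ok, NM}.

handle_cast({job, Client, Millis}, NM) ->
    timer:sleep(Millis),                     % simulated job execution
    Client ! {result, self(), done},         % result goes straight to client
    gen_server:cast(NM, {job_done, self()}), % NM learns the worker is free
    {noreply, NM}.

handle_call(_Req, _From, NM) -> {reply, ok, NM}.
```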
Implementation details. Mesh module
• Provides the interface for the job-sending protocol.
• Provides the interface for the mesh-join protocol.