Real World Case Study
Geo-distributed Messaging with
RabbitMQ
by Andriy Shapochka
System Architect @ SoftServe
Context
Security Management SaaS Solution with availability and scalability issues
[Component diagram: Java, Python, web, and mobile clients; the «REST» Security Services / Security Management application on a Linux instance; several Managed Servers, each running an Agent.]
Goal
To build a highly available, geo-distributed, clustered solution based on the
original implementation.
Clients can have their servers in the US, Europe, and elsewhere.
Both Amazon (AWS) and conventional data centers are to be supported.
The data should always stay consistent in the sense of the CAP theorem.
Architecture Drivers
Nodes communicate over WAN and must account for high latencies and
possible connectivity interruptions.
The main quality attributes to achieve are:
High Availability
Eventual state consistency on each active node (primary and replicas)
Inter-node communication security (transport protocol encryption
and authentication)
The secondary quality attributes to achieve are:
Performance
Thin maintenance windows
CAP Theorem by Brewer
– Consistency – roughly meaning that all clients of a data store get responses to
requests that ‘make sense’. For example, if Client A writes 1 then 2 to location
X, Client B cannot read 2 followed by 1.
– Availability – all operations on a data store eventually return successfully. We
say that a data store is ‘available’ for, e.g. write operations.
– Partition tolerance – if the network stops delivering messages between two
sets of servers, will the system continue to work correctly?
Consistency, Availability, Partition Tolerance: Pick two only.
In a distributed system that may drop messages we cannot have both consistency and availability of the data – only one of them!
Partition tolerance means choosing a data-copying strategy that favors either consistency or availability.
Partition is a network property – not our choice.
Primary Decision
To use RabbitMQ as a platform for the data bus between the nodes in the
cluster.
An AMQP broker with extensions, implemented in Erlang.
Clients in Java, Python, .NET, REST, etc.
Performance, HA, Federation, Clustering, Flexible Routing, Security
Messaging in RabbitMQ
Exchanges: fanout, direct, topic; exchanges can also be bound to other exchanges
Queues bound to exchanges
Routing keys
Acknowledgements
RPC with correlation ids
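A minimal sketch of these primitives with the Python pika client; the exchange, queue, and routing-key names are illustrative, not taken from the case study.

```python
# Minimal illustration of RabbitMQ messaging primitives with the pika client.
# Exchange/queue/routing-key names are made up for the example.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# A topic exchange and a queue bound to it with a routing-key pattern.
channel.exchange_declare(exchange="security.events", exchange_type="topic", durable=True)
channel.queue_declare(queue="replica1.updates", durable=True)
channel.queue_bind(queue="replica1.updates", exchange="security.events",
                   routing_key="update.#")

# Publish a persistent message routed by key.
channel.basic_publish(exchange="security.events",
                      routing_key="update.policy",
                      body=b'{"txn": 42}',
                      properties=pika.BasicProperties(delivery_mode=2))

# Consume with explicit acknowledgements.
def on_message(ch, method, properties, body):
    print("received", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="replica1.updates", on_message_callback=on_message)
channel.start_consuming()
```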
Geo-Distribution in RabbitMQ
Federation plugin to the rescue
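A hedged sketch of how such a federation link might be configured, assuming the RabbitMQ management HTTP API is enabled on the downstream broker; host names, credentials, the upstream name, and the exchange-name pattern are placeholders, not values from the case study.

```python
# Sketch: declare a federation upstream and a policy via the RabbitMQ
# management HTTP API on the downstream (federated) broker.
# Hosts, credentials, and the name pattern are illustrative placeholders.
import requests

MGMT = "https://replica1.example.com:15671/api"
AUTH = ("admin", "secret")
VHOST = "%2F"  # URL-encoded default vhost "/"

# Upstream pointing at the primary node's broker over TLS (amqps).
requests.put(
    f"{MGMT}/parameters/federation-upstream/{VHOST}/primary-upstream",
    auth=AUTH,
    json={"value": {"uri": "amqps://primary.example.com:5671",
                    "expires": 3600000}},
).raise_for_status()

# Policy: federate every exchange whose name starts with "federated.".
requests.put(
    f"{MGMT}/policies/{VHOST}/federate-updates",
    auth=AUTH,
    json={"pattern": "^federated\\.",
          "definition": {"federation-upstream-set": "all"},
          "apply-to": "exchanges"},
).raise_for_status()
```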
Design Strategy: Components
Application Node – the nodes are geo-distributed and each can play the role of
primary – the single node serving user requests in the entire cluster, or
replica – any other node, kept updated with the changes made on the primary node.
Cluster Controller – a single node controlling the cluster state and assigning
the primary and replica roles to the application nodes.
Design Strategy: Decisions - 1
All the application nodes are equivalent in the sense that each of them can
become the primary or a replica at runtime.
The application node status (active, primary, replica, etc.) is controlled by
the cluster controller.
The replicas receive updates from the current primary node by means of
exchange federation.
Each replica monitors its transaction flow state and validates it against
every new incoming primary update.
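One way to implement this validation, shown here purely as an assumption since the talk does not give the mechanism, is to tag every primary update with a monotonically increasing transaction sequence number and have the replica check for gaps:

```python
# Hypothetical sketch: a replica validates its transaction flow state by
# checking that primary updates arrive with contiguous sequence numbers.
import json

class TransactionFlowState:
    def __init__(self):
        self.last_seq = 0          # highest transaction seen so far
        self.missing = set()       # gaps detected in the flow

    def validate(self, body: bytes) -> bool:
        """Return True if the incoming update is consistent with local state."""
        update = json.loads(body)
        seq = update["seq"]        # assumed field carried by every update
        if seq <= self.last_seq:
            return True            # duplicate or already-applied update
        if seq == self.last_seq + 1:
            self.last_seq = seq
            return True
        # A gap means the replica is inconsistent and must catch up.
        self.missing.update(range(self.last_seq + 1, seq))
        self.last_seq = seq
        return False
```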
Design Strategy: Decisions - 2
When a replica finds that its transaction flow state has become inconsistent, it
switches to catch-up mode, which involves two steps in order to offload
additional processing and communication from the primary node (see the sketch
after this slide):
Request the missing transactions from the other replicas.
If no replica succeeds in replying with the requested transactions, fall back
to a catch-up request toward the primary node.
The bus construction is based on the official RabbitMQ Federation plugin, which
works by propagating the messages published to the local upstream exchanges to
the federated exchanges owned by the remote brokers.
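A rough, hypothetical sketch of this two-step control flow; request_catch_up stands in for the actual RPC call (its exchange wiring is sketched after the RPC catch-up slide) and all names are illustrative.

```python
# Hypothetical control flow for catch-up mode: try the other replicas first,
# then fall back to the primary node if none of them can supply the gap.
def catch_up(missing_seqs, replica_ids, primary_id, request_catch_up):
    """request_catch_up(node_id, seqs) -> list of recovered transactions or []"""
    # Step 1: ask the other replicas so the primary stays offloaded.
    for replica_id in replica_ids:
        transactions = request_catch_up(replica_id, missing_seqs)
        if transactions:
            return transactions
    # Step 2: no replica could help, fall back to the primary node.
    return request_catch_up(primary_id, missing_seqs)
```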
[Node diagram: primary :Node and replica1–replica4 :Node. «catchup» requests flow between the replicas; «fallback» requests flow from each replica to the primary node. Arrow direction indicates the source and destination of the catch-up/fallback request.]
Design Strategy: RPC Catch-up
[Sequence sketch: catch-up requestor :Node and catch-up sender :Node.]
The requestor picks the senders to ask for a catch-up response based on some
logic or configuration parameters.
Request message path: the requestor publishes to its local upstream exchange
(request-upstream) named catch-up-request-exchange-<sender-id>, which is
propagated to the federated exchange (request-federated) of the same name on
the sender.
Response message path: the sender publishes to its local upstream exchange
(response-upstream) named catch-up-response-exchange-<requestor-id>, which is
propagated to the federated exchange (response-federated) of the same name on
the requestor; the sender picks this exchange based on the value in the
reply-to property of the request message.
It is not strictly necessary to declare separate response exchanges for all the
other replicas on the requestor side: we can bind a single catch-all
catch-up-response-exchange to all the corresponding <requestor-id> exchanges on
the senders. In that case the upstream and federated exchanges will have
different names.
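A hedged sketch of the requestor side of this RPC scheme using the Python pika client: the request carries a correlation id and names the requestor's response exchange in reply-to, so the sender can publish its reply there. Exchange types, queue names, and the message format are assumptions rather than details from the talk.

```python
# Hypothetical sketch of the catch-up RPC requestor using pika.
# The request names the requestor's response exchange in reply_to and carries
# a correlation_id so the reply can be matched; the sender side (not shown)
# publishes its reply to the exchange given in reply_to.
import json
import uuid
import pika

REQUESTOR_ID = "replica1"                      # placeholder node id
SENDER_ID = "replica2"                         # placeholder node id

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# Local upstream exchange for requests; federation propagates it to the
# federated catch-up-request-exchange-<sender-id> on the sender's broker.
request_exchange = f"catch-up-request-exchange-{SENDER_ID}"
response_exchange = f"catch-up-response-exchange-{REQUESTOR_ID}"
channel.exchange_declare(exchange=request_exchange, exchange_type="fanout", durable=True)
channel.exchange_declare(exchange=response_exchange, exchange_type="fanout", durable=True)

# Private queue bound to the (federated) response exchange.
result = channel.queue_declare(queue="", exclusive=True)
response_queue = result.method.queue
channel.queue_bind(queue=response_queue, exchange=response_exchange)

corr_id = str(uuid.uuid4())
channel.basic_publish(
    exchange=request_exchange,
    routing_key="",
    body=json.dumps({"missing": [41, 42, 43]}).encode(),
    properties=pika.BasicProperties(reply_to=response_exchange,
                                    correlation_id=corr_id),
)

# Wait for the matching reply; give up after 30 seconds of inactivity.
for method, properties, body in channel.consume(queue=response_queue,
                                                inactivity_timeout=30):
    if method is None:
        break                                   # timed out, fall back to primary
    channel.basic_ack(method.delivery_tag)
    if properties.correlation_id == corr_id:
        print("recovered transactions:", json.loads(body))
        break
```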
Design Strategy: Extra Mile
Security – communication is secured by means of TLS, with server and client
certificates supported by RabbitMQ Federation. It is configured in the upstream
definition (an example upstream URI follows this slide).
Cluster Configuration – a runtime roulette: the upstream configuration occurs
at deployment time, while the exchange creation happens at runtime. The
Cluster Controller selects and promotes the new primary and notifies the
replicas.
Cluster Controller – it is a non-trivial task to build a cluster controller
that is itself highly available and avoids the split-brain issue. ZooKeeper and
other controller distribution options were evaluated; in the end it was decided
to build the controller on top of AWS infrastructure, using Multi-AZ RDS as the
configuration storage.
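For the security point above, the TLS settings can be carried in the federation upstream URI itself; a minimal sketch assuming the standard RabbitMQ URI query parameters for TLS, with placeholder host and certificate paths:

```python
# Hypothetical federation upstream URI with TLS (amqps) and client certificates.
# The host and file paths are placeholders; the query parameters follow the
# standard RabbitMQ URI query-parameter names for TLS options.
UPSTREAM_URI = (
    "amqps://primary.example.com:5671"
    "?cacertfile=/etc/rabbitmq/ssl/ca_certificate.pem"
    "&certfile=/etc/rabbitmq/ssl/client_certificate.pem"
    "&keyfile=/etc/rabbitmq/ssl/client_key.pem"
    "&verify=verify_peer"
)
```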
Thank You! Questions, please!