Published on ONLamp.com (http://www.onlamp.com/)
http://www.onlamp.com/pub/a/onlamp/2004/11/18/slony.html
Introducing Slony
by A. Elein Mustain
11/18/2004
Slony is the Russian plural for elephant. It is also the name of the new replication project being
developed by Jan Wieck. The mascot for Slony, Slon, is a good variation of the usual Postgres elephant
mascot, created by Jan.
Figure 1. Slon, the Slony mascot.
Slony-I, the first iteration of the project, is an asynchronous replicator of a single master database to
multiple replicas, which in turn may have cascaded replicas. It will include all features required to
replicate large databases with a reasonable number of replicas. Jan has targeted Slony-I toward data
centers and backup sites, implying that all nodes in the network are always available.
The master is the primary database with which the applications interact. Replicas are replications, or
copies of the primary database. Since the master database is always changing, data replication is the
system that enables the updates of secondary, or replica, databases as the master database updates. In
synchronous replication systems, the master and the replica are consistent exact copies. The client does
not receive a commit until all replicas have the transaction in question. Asynchronous replication
loosens that binding and allows the replica to copy transactions from the master, rolling forward, at its
own pace. The server issues a commit to the master client based on the state of the master database
transaction.
Cascading replicas over a WAN minimizes bandwidth use, enabling better scalability, and also allows read-only (for example, reporting) applications to take advantage of replicas.
Figure 2. Cascading replicas
Assume you have a primary site, with a database server and a replica as backup server. Then you create
a remote backup center with its own main server and its backup replica. The remote primary server is a
direct replica, replicating from the master over the WAN, while the remote secondary server is a
cascaded replica, replicating from the primary server via the LAN. This avoids transferring all of the
transactions twice over the WAN. More importantly, this configuration enables you to have a remote
backup with its own local failover already in place for cases such as a data center failure.
Slony's design goals differentiate it from other replication systems. The initial plan was to enable a few
very important key features as a basis for implementing these design goals. An underlying theme to the
design is to update only that which changes, enabling scalable replication for a reliable failover strategy.
The design goals for Slony are:
1. The ability to install, configure, and create a replica and let it join and catch up with a running
database.
This allows the replacement of both masters and replicas. This idea also enables
cascading replicas, which in turn adds scalability, limitation of bandwidth, and proper
handling of failover situations.
2. Allowing any node to take over for any other node that fails.
In the case of a failure of a replica that provides data to other replicas, the other replicas
can continue to replicate from another replica or directly from the master.
Figure 3. Replication continues after a failure
In the case where a master node fails, a replica can receive a promotion to become a
master. Any other replicas can then replicate from the new master. Because Slony-I is
asynchronous, the different replicas may be ahead of or behind each other. When a
replica becomes a master, it synchronizes itself with the state of the most recent other
replica.
In other replication solutions, this roll forward of the new master is not possible. In those
solutions, when promoting a replica to master, any other replicas that exist must rebuild
from scratch in order to synchronize with the new master correctly. A failover of a 1TB
database leaves the new master with no failover of its own for quite a while.
The Slony design handles the case where multiple replicas may be at different
synchronization times with the master and are able to resynchronize when a new master
arises. For example, different replicas could logically be in the future, compared to the
new master. There is a way to detect and correct this. If there weren't, you would have to
dump and restore the other replicas from the new master to synchronize again.
It's possible to roll forward the new master, if necessary, from other replicas because of
the packaging and saving of the replication transactions. Replication data is packaged
into blocks of transactions and sent to each replica. Each replica knows what blocks it has
consumed. Each replica can also pass those blocks along to other servers--this is the
mechanism of cascading replicas. A new master may be on transaction block 17 relative
to the old master, when another replica is on transaction block 20 relative to the old
master. Switching to the new master causes the other replicas to send blocks 18, 19, and
20 to the new master.
Jan said, "This feature took me a while to develop, even in theory."
3. Backup and point-in-time capability with a twist.
It is possible, with some scripting, to maintain a delayed replica as a backup that might,
for example, be two hours behind the master. This is done by storing and delaying the
application of the transaction blocks. With this technique, it is possible to do a point-in-time
recovery anytime within the last two hours on this replica. The time it takes to
recover depends only on the time to which you choose to recover. Choosing "45 minutes
ago" would take about one hour and 15 minutes, for example, independent of database
size.
4. Hot PostgreSQL installation and configuration.
For failover, it must be possible to put a new master into place and reconfigure the
system to allow the reassignment of any replica to the master or to cascade from another
replica. All of this must be possible without taking down the system.
This means that it must be possible to add and synchronize a new replica without
disrupting the master. When the new replica is in place, the master switch can happen.
This is particularly useful when the new replica is a different PostgreSQL version than
the previous one. If you create an 8.0 replica from your 7.4 master, it now is possible to
promote the 8.0 to master as a hot upgrade to the new version.
5. Schema changes.
Schema changes require special consideration. The bundling of the replication
transactions must be able to join all of the pertinent schema changes together, whether or
not they took place in the same transaction. Identifying these change sets is very difficult.
In order to address this issue, Slony-I has a way to execute SQL scripts in a controlled
fashion. This means that it is even more important to bundle and save your schema
changes in scripts. Tracking your schema changes in scripts is a key DBA procedure for
keeping your system in order and your database recreatable.
The first part of Slony-I also does not address any of the user interface
features required to set up and configure the system. After the core engine of
Slony-I becomes available, development of the configuration and maintenance
interface can begin. There may be multiple interfaces available, depending on
who develops the user interface and how.
Jan points out that "replication will never be something where you type SETUP
and all of a sudden your existing enterprise system will nicely replicate in a
disaster recovery scenario." Designing how to set up your replication is a
complex problem.
The user interface(s) will be important to clarify and simplify the
configuration and maintenance of your replication system. Some of the issues
to address include the configuration of which tables to replicate, the
requirement of primary keys, and the handling of sequence and trigger
coordination.
The Slony-I release does not address the issues of multi-master, synchronous
replication or sporadically synchronizable nodes (the "sales person on the
road" scenario). However, Jan is considering these issues in the architecture of
the system so that future Slony releases may implement some of them. It is
critical to design future features into the system; analysis of existing
replication systems has shown that it is next to impossible to add fundamental
features to an existing replication system.
The primary question to ask regarding the requirements for a failover system is how much down time
can you afford. Is five minutes acceptable? Is one hour? Must the failover be read/write, or is it
acceptable to have a read-only temporary failover? The second question you must ask is whether you are
willing to invest in the hardware required to support multiple copies of your database. A clear
cost/benefit analysis is necessary, especially for large databases.
References
General Bits Slony Articles on Tidbits
The Slony-I Project documentation on GBorg
Slonik Commands
Jan Wieck's Original Slony-I Talk and Scripts, July 2004 in Portland, OR, sponsored by Afilias
Global Registry Services
Information from IRC's #slony on freenode.net
The [email protected] mailing list
A. Elein Mustain has more than 15 years of experience working with databases, 10 of those working
exclusively with object relational database systems.
Building and Configuring Slony
by A. Elein Mustain
12/16/2004
Figure 1. Slony mascot
Editor's note: in Introducing Slony, A. Elein Mustain explained the goals of Slony, the replication
project for PostgreSQL. This follow-up explains how to install, configure, and start using it.
Building
I am pleased to report that the basic instructions for the download, build, and install of Slony-I release
1.0.5 were perfect.
Slony-I is fairly version independent, but you still need to build it for each PostgreSQL version (7.3 or
later) and installation on each machine participating in the replication. The same technique applies when
the installations live on different machines.
On one machine, I run several versions of PostgreSQL, each built from source. My plan is to replicate
between my 7.4 installation and my 8.0 installation, so I configured and built Slony-I against each of
those source trees. That took less than a minute for both.
Repeat these steps for each source tree and installation:
./configure --with-pgsourcetree=/local/src/postgresql-version
make all
sudo make install
Setting Up Slony-I
This step-by-step reading of instructions will be applied to replicate a small database named gb. The
plan is to replicate from a PostgreSQL 7.4 installation to a PostgreSQL 8.0 installation, making it
possible to upgrade the database.
Slonik is the command-line interface that defines the replication system. There will be Slonik scripts to
create, update, and change the replication cluster for gb. There are also tools under development to
simplify the creation of replication systems with Slony-I; however, this description will explore the
underlying Slonik requirements. It is important to learn the basic Slonik commands.
About the database
gb
is a simple eight-table database containing issues and articles for my General Bits web site. The
database is normalized, and all tables have natural primary keys.
There are several prerequisites:
Each installation that will participate in replication must have Slony-I built and installed. The
Slony-I Project on GBorg gives instructions for building and installing Slony-I. My experience
with building Slony-I from source against PostgreSQL 7.4 and 8.0Beta3 was very good.
Following the instructions provided a clean and fast build.
You need a set of master database tables to replicate and at least one other installation containing
the same schema objects. The other installation will be the replica. To achieve this initially, I
dumped and restored the schema for the master database on 7.4 into the 8.0 installation:
pg_dump -p 5434 -C -s gb | psql -p 5430
As you can see, these installations are on the same host and have different port numbers.
The real-time clocks of the servers hosting the nodes must be in sync. I recommend using NTP.
The pg_hba.conf files on each installation must allow each machine to contact the other.
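For example, a minimal pg_hba.conf sketch for this localhost-only setup (the trust method is assumed here for simplicity; on a real network use md5 and the actual client addresses):
# pg_hba.conf on each installation (7.4 and 8.0)
# TYPE  DATABASE  USER      CIDR-ADDRESS    METHOD
host    gb        postgres  127.0.0.1/32    trust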
Slonik
Slonik is a command-line interface for Slony-I. It can connect to the various databases involved in the
replication scheme to perform specific actions. It is an independent helper of Slony-I and of
PostgreSQL.
The first commands for most Slonik scripts constitute the identity of a group of databases and servers
and the connection parameters for accessing each database in the group. Each database and Slony-I
connection is a numbered node. The numbers are simply identifiers. The next parameter is the action
you wish to process.
Slonik commands work well when they are embedded in shell scripts, as in this example. (The next
section covers the commands to identify the cluster and node connection information.)
#!/bin/bash
slonik << _END_
cluster name = gb_cluster;
node 1 admin conninfo = 'dbname=db host=localhost port=5432 user=postgres';
node 2 admin conninfo = 'dbname=db host=localhost port=5430 user=postgres';
...additional nodes...
...slonik commands...
_END_
Both Slonik commands and Slony-I have full sets of commands.
Node networks
A node is the combination of a database in an installation and one slon process "belonging to" that
database. A cluster is a set of nodes cooperating in a replication scheme.
The documentation suggests that all nodes have a path to all other nodes. With only two nodes, this is
simple to describe. With more nodes, be sure to include a path to all other nodes, regardless of whether
you expect replication to take the paths.
Figure 2. A Slony-I cluster
Our first Slonik script initializes the cluster, defines each node, and defines the paths from each node to
every other node. Notice that each node has an identifying number. init cluster defines the cluster on
the first node. store node adds each subsequent node. The user is the slony superuser--in this case,
postgres. You can choose any privileged user established as the Postgres superuser on each
installation.
The path is defined by designating one node as a server and the other as a client for messaging. The
terminology does not relate to the replicator/replica relationship; instead it references the possible
network path. The connection information in each command belongs to the server node. The client's
slon daemon will connect to the server node using that connection information.
#!/bin/bash
#
# 01: Initialize Cluster
#
slonik << _END_
cluster name = gbcluster;
node 1 admin conninfo = 'dbname=gb host=localhost port=5434 user=postgres';
node 2 admin conninfo = 'dbname=gb host=localhost port=5430 user=postgres';
#
# Initialize the cluster and create the second node
#
init cluster (id=1, comment='gb 7.4 5434');
echo 'Initializing gb cluster';
echo 'Node 1 on pgsql74 port 5434 defined';
store node (id=2, comment='gb 8.0 5430');
echo 'Node 2 on pggsql80b port 5430 defined';
#
# create paths in both directions
#
store path (server=1, client=2, conninfo='dbname=gb host=localhost
port=5434 user=postgres');
store path (server=2, client=1, conninfo='dbname=gb host=localhost
port=5430 user=postgres');
echo 'path from server node 1 to client node 2 created.';
echo 'path from server node 2 to client node 1 created.';
_END_
Using Slonik's echo command can help log and track the commands in any Slonik script.
Listening for events
Events will occur throughout the cluster, and you must tell Slony-I what nodes listen to what nodes to
receive these events. The events may be replication information or administrative information that
requires propagation throughout the cluster.
In the simple case of two nodes, they listen to each other. In any case, all nodes should be able to listen
to all other nodes. The paths' definitions intentionally make this possible.
Specifying the origin identifies which node the receiver is listening for. The origin of an event may or
may not provide the event to the receiver; however, the default is to do so. It is possible for node 3 to
listen for events initiated on node 1 and have those events provided by node 2 (which, one assumes, is
also listening for events from node 1).
In our case, we are having both nodes listen for events on the other, with the events provided by the
origin node.
#!/bin/bash
#
# 02: Listen
#
slonik << _END_
cluster name = gbcluster;
node 1 admin conninfo = 'dbname=gb host=localhost port=5434 user=postgres';
node 2 admin conninfo = 'dbname=gb host=localhost port=5430 user=postgres';
#
# make the nodes listen on the paths
# in both directions
#
store listen (origin=1, receiver=2, provider=1);
store listen (origin=2, receiver=1, provider=2);
_END_
Starting the slon processes
Once the nodes can listen to each other for events, start slon. Each database participating in the
replication needs a slon process. Give slon a chance to start itself and its threads.
The output in our example goes to two logs, which you can tail to watch the activity and look for
errors.
slon
is essentially an event and messaging system. The events involve the replication of data and
administrative information to facilitate the replication of data.
#!/bin/sh
#
# 02: Start up Slon processes
#
#
# Start Slony for each node
#
slon gbcluster "dbname=gb user=postgres port=5434 host=localhost" > slon_gb_74.out 2>&1 &
slon gbcluster "dbname=gb user=postgres port=5430 host=localhost" > slon_gb_80.out 2>&1 &
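For example, to watch both logs at once for activity and errors:
tail -f slon_gb_74.out slon_gb_80.out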
Creating sets
Replication in Slony-I works by subscribing to sets of tables. The set usually should comprise the group
of related tables for an application or an entire schema.
To make this work, first define a set and designate the origin for the set. Then add the tables by naming
the set ID, the origin of the set, a table ID, the fully qualified table name, and an optional alternate key.
Make sure to enter the origin of the set as it was in the set creation (redundantly). All of the tables
participating in the replication must have a primary key. If the table does not have one, you can have
Slony-I add one for replication purposes only.
Be careful when setting the ID number of a table; it also designates the order in which Slony will lock
the tables. This means that master tables should have IDs lower than those of detail tables. The
relationship hierarchy of your schema should help you determine the order of the numbers. If the
ordering of the table IDs is backward or incorrect, there may be problems with deadlocking the slon
process or PostgreSQL.
In our example, the issues table is the topmost master, followed by articles. Each of the other tables
are lookup tables for those, so their numbers are higher, accordingly.
Figure 3. General Bits schema
You can create a set only once, and only while it has no active subscribers. To add tables to a replication
set later, create a new set; you can then combine the two sets by using Slonik's MERGE SET command, as sketched below.
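As a rough sketch (assuming sets 1 and 2 both originate on node 1; check the Slonik command reference for the exact syntax), the merge looks like:
merge set (id=1, add id=2, origin=1);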
#!/bin/sh
#
# 03: Create Set
#
slonik << _END_
#
# Define cluster namespace and node connection information
#
cluster name = gbcluster;
node 1 admin conninfo = 'dbname=gb host=localhost port=5434 user=postgres';
node 2 admin conninfo = 'dbname=gb host=localhost port=5430 user=postgres';
create set (id=1, origin=1, comment='gb tables');
echo 'Set created';
set add table (set id=1, origin=1, id=1,
fully qualified name = 'public.issues', comment='Issues table');
set add table (set id=1, origin=1, id=2,
fully qualified name = 'public.articles', comment='Articles table');
set add table (set id=1, origin=1, id=3,
fully qualified name = 'public.arttext', comment='Article Text table');
set add table (set id=1, origin=1, id=4,
fully qualified name = 'public.sqlbase', comment='Full SQL keywords');
set add table (set id=1, origin=1, id=5,
fully qualified name = 'public.whoall', comment='All contributors');
set add table (set id=1, origin=1, id=6,
fully qualified name = 'public.contrib', comment='Contributors by Article');
set add table (set id=1, origin=1, id=7,
fully qualified name = 'public.keywords', comment='Keywords by Article');
set add table (set id=1, origin=1, id=8,
fully qualified name = 'public.sqlkw', comment='Subset of SQL keywords');
echo 'set 1 of gb tables created';
_END_
Subscribing to sets
The nodes can now subscribe to the newly created sets. To subscribe to a set, identify the set, the node
that can provide the set, the receiver of the set, and whether the receiver of this set should be able to
forward the set to another node. In our case, the origin node of the set is the same as the provider of the
set, but for cascading subscriptions that is not necessarily the case.
Even though this replication system has only two nodes, we are saying that the receiving node may
forward the set. This is for the case in which we may want to switch masters or add other nodes to the
cluster. Here, node 2 is subscribing to set 1, originating on node 1 and provided by node 1.
#!/bin/sh
#
# gb_subscribeset.sh
#
slonik << _END_
#
# Define cluster namespace and node connection information
#
cluster name = gbcluster;
node 1 admin conninfo = 'dbname=gb host=localhost port=5434 user=postgres';
node 2 admin conninfo = 'dbname=gb host=localhost port=5430 user=postgres';
subscribe set (id=1, provider=1, receiver=2, forward=yes);
echo 'set 1 of gb tables subscribed by node 2';
_END_
Of course, you should assume that these scripts have no typos and that you've run them exactly as
intended. Yeah, right. Fortunately, you can recover from mistakes.
Undoing
By this time, you probably have made a typo or two and need to know how to start over. The simplest
way of undoing is to start fresh. There are subtler ways of correcting mistakes by updating the
underlying tables. However, I don't recommend those unless you have intimate knowledge of the
underlying tables.
To terminate the slon processes, list their process IDs and use kill -TERM to terminate the
oldest of the processes for each node.
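For example (the PIDs shown are hypothetical; pick the oldest slon PID for each node from the ps listing):
# list the slon processes; the [s] keeps grep itself out of the output
ps -ef | grep '[s]lon gbcluster'
# send SIGTERM to the oldest slon process of each node (hypothetical PIDs)
kill -TERM 4711 4715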
To completely remove all Slony-I definitions from your database, uninstall each node:
#!/bin/sh
# gb_uninstallnode.sh
slonik << _END_
#
# Define cluster namespace and node connection information
#
cluster name = gbcluster;
node 1 admin conninfo = 'dbname=gb host=localhost port=5434 user=postgres';
node 2 admin conninfo = 'dbname=gb host=localhost port=5430 user=postgres';
echo 'Cluster defined, nodes identified';
#
# UnInstall both nodes
#
uninstall node (id=1);
uninstall node (id=2);
echo 'Nodes 1 and 2 Removed';
_END_
NOTE: UNINSTALL NODE removes all definitions, and you must start cleanly after that.
Slony-I schema
The underlying tables for Slony-I are fairly straightforward. The cluster name is the name of the schema
in the database in which the Slony tables reside. (Use set search_path in psql.) You can verify your
commands to add nodes, listens, paths, and so on by examining these tables. It also looks tempting to
"fix" things by just changing the underlying tables. Resist doing so, however. Use Slonik so that it can
trigger the appropriate events to perform the updates in an orderly fashion across all nodes.
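For example, with the cluster named gbcluster, the Slony-I objects live in a schema named _gbcluster. A quick, read-only look from psql might be (the sl_* table names below are those of Slony-I 1.0.x and are an assumption on my part; adjust to what you actually find in the schema):
psql -p 5434 gb
gb=# set search_path to _gbcluster;
gb=# select * from sl_node;
gb=# select * from sl_path;
gb=# select * from sl_listen;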
Figure 4. Slony schema
References
General Bits Slony Articles on Tidbits
The Slony-I Project documentation on GBorg
Slonik Commands
Jan Wieck's Original Slony-I Talk and Scripts, July 2004 in Portland, Oregon, sponsored by
Afilias Global Registry Services
Information from IRC #slony on freenode.net
Mailing List: [email protected]
Elein Mustain has more than 15 years of experience working with databases, 10 of those working
exclusively with object relational database systems.
Modifying Slony Clusters
by A. Elein Mustain
03/17/2005
Replication clusters usually receive a great deal of forethought before their creation. However, in the course of systems
development, some changes always become necessary sooner or later. This article walks through the steps
required to:
Add a node to the cluster.
Switch data providers of a table set.
Promote a replication to master.
Apply schema changes to the replication cluster.
These examples will use the replication scheme originally set up in "Introducing Slony."
Figure 1.
Adding a Node
Suppose that in addition to the replica you created for the gb database, you want another replica of the
same database for reporting. Here's how to add a replica of the gb table set on a second database in the
8.0 installation. The 7.4 Node 1, database gb, will originate the data set and replicate it directly to Node
2, also database gb, and Node 3, database gb2.
Figure 2.
Before starting, be sure to create gb2 in the 8.0 installation, seeding it with the same empty schema as
the other two databases in this cluster. You do not want a dump of the schema of gb as it is now, but
rather as it was before you defined the Slony-I cluster.
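One way to do that, assuming you kept a schema-only dump taken before the cluster was defined (the file name here is hypothetical):
createdb -p 5430 -O postgres gb2
psql -p 5430 gb2 < gb_schema_before_slony.sql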
Next, define Node 3 and ensure there are paths from Node 3 to and from Nodes 1 and 2. From there,
enable listening along each path mirroring the expected table set replication. The listening of 2 and 3 via
Node 1 reflects this mirroring, rather than having a direct listen path between 2 and 3.
This is a really good time to remember that the connection information in the store path command
pertains to the server node. This is also a pretty good time to look up drop path and drop listen,
two more slonik commands.
#!/bin/sh
slonik << _END_
#
# Define cluster namespace and node connection information
#
cluster name = gbcluster;
node 1 admin conninfo = 'dbname=gb host=localhost port=5434 user=postgres';
node 2 admin conninfo = 'dbname=gb host=localhost port=5430 user=postgres';
node 3 admin conninfo = 'dbname=gb2 host=localhost port=5430 user=postgres';
echo 'Cluster defined, nodes identified';
#
# Initialize the cluster and create the second node
#
store node (id=3, comment='gb2 8.0 5430');
#
# create paths
#
store path (server=1, client=3, conninfo='dbname=gb host=localhost port=5434 user=postgres');
store path (server=2, client=3, conninfo='dbname=gb host=localhost port=5430 user=postgres');
store path (server=3, client=1, conninfo='dbname=gb2 host=localhost port=5430 user=postgres');
store path (server=3, client=2, conninfo='dbname=gb2 host=localhost port=5430 user=postgres');
#
# Enable listening along each path
#
store listen (origin=1, receiver=3, provider=1);
store listen (origin=3, receiver=1, provider=3);
store listen (origin=2, receiver=3, provider=1);
store listen (origin=3, receiver=2, provider=1);
_END_
Now you are ready to start a new slon process for Node 3, the 8.0 installation with the gb2 database.
Because there are now two replica databases on the 8.0 installation, each needs its own slon process:
#! /bin/bash
#
# varlena Slony Start Up
# ==============================
slon gb "dbname=gb2 host=localhost port=5430 user=postgres" > slon_gb_3.out 2>&1 &
Once the new slon process is up and running, you can subscribe Table Set 1, originating at Node 1, to
Node 3. At this point the log files are invaluable; tail -f the log files to watch for progress and errors.
The log files in this case are slon_gb_1.out, slon_gb_2.out, and slon_gb_3.out. If you see any problems,
you may have to remove the paths and/or listens, replacing them with corrected ones.
The log file slon_gb_1.out looks like:
CONFIG main: slon version 1.0.5 starting up
CONFIG main: local node id = 1
CONFIG main: loading current cluster configuration
CONFIG storeNode: no_id=2 no_comment='Node 2 dbname=gb host=localhost
port=5430 user=postgres'
CONFIG storePath: pa_server=2 pa_client=1 pa_conninfo="dbname=gb host=localhost
port=5430 user=postgres" pa_connretry=10
CONFIG storeListen: li_origin=2 li_receiver=1 li_provider=2
CONFIG main: configuration complete - starting threads
CONFIG enableNode: no_id=2
CONFIG storeSet: set_id=1 set_origin=1 set_comment=''
CONFIG storeNode: no_id=3 no_comment='gb2 8.0 5430'
CONFIG enableNode: no_id=3
CONFIG storePath: pa_server=3 pa_client=1 pa_conninfo="dbname=gb2 host=localhost
port=5430 user=postgres" pa_connretry=10
CONFIG storeListen: li_origin=3 li_receiver=1 li_provider=3
The one for Node 2, slon_gb_2.out, looks very similar except the paths and listens are appropriate for
Node 2.
Now subscribe Node 3 to the table set.
#!/bin/sh
slonik << _END_
#
# Define cluster namespace and node connection information
#
cluster name = gbcluster;
node 1 admin conninfo = 'dbname=gb host=localhost port=5434 user=postgres';
node 2 admin conninfo = 'dbname=gb host=localhost port=5430 user=postgres';
node 3 admin conninfo = 'dbname=gb2 host=localhost port=5430 user=postgres';
subscribe set (id=1, provider=1, receiver=3, forward=yes);
echo 'set 1 of gb tables subscribed by node 3';
_END_
Verification of data is reassuring at this point.
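For example, a quick spot check might compare row counts of a replicated table on the origin and the new replica (ports as configured above):
psql -p 5434 gb  -c "select count(*) from issues;"
psql -p 5430 gb2 -c "select count(*) from issues;"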
Swapping Providers
If your replication scheme has three or more nodes, you may want to switch providers. This is not the
same as failover or promotion of a new master. It simply changes the source of the replication data for a
node.
In the example case, Node 1, the origin of set 1, provided the information on set 1 to Node 2. When you
added Node 3, you subscribed it to Table Set 1. The Table Set both originated on and was provided by
Node 1.
Now the goal is to switch providers so that Node 3 retrieves Table Set 1 information from Node 2,
instead of directly from Node 1. Node 1 remains the origin of the Table Set.
Node 3 could have either Node 1 or Node 2 as a provider of that set information. Node 2 is available as
a provider, because when you subscribed Node 2 to Table Set 1, you also enabled it as a forwarder of
Table Set 1.
The listen paths, however, have Node 2 and Node 3 listening to each other via Node 1. One necessary
change is to ensure that 2 and 3 listen to each other directly, because Node 2 will now provide the data
for Node 3. Once the paths and listens are set up, simply resubscribe the set, setting the provider
to Node 2 instead of Node 1.
Figure 3.
#! /bin/bash
#
# varlena Slony Initialization
# ==============================
slonik << _END_
cluster name = gbcluster;
node 1 admin conninfo = 'dbname=gb host=localhost port=5434 user=postgres';
node 2 admin conninfo = 'dbname=gb host=localhost port=5430 user=postgres';
node 3 admin conninfo = 'dbname=gb2 host=localhost port=5430 user=postgres';
# ==============================
store listen (origin=3, receiver=2, provider=3);
store listen (origin=2, receiver=3, provider=2);
subscribe set ( id=1, provider=2, receiver=3, forward=yes);
_END_
After running this script, examine the log files to see that Slony stored the listens and updated the
subscription.
Switching Masters
In the example, the database called gb (mastered on a PostgreSQL 7.4 installation) replicates to two
nodes on the 8.0 installation. The decision has been made to move the database forward so that the 8.0
installation is the master. (If you are using Slony-I, be prepared to take advantage of a fast upgrade using
master promotion.)
Before you even consider swapping masters, you must have in hand a complete set of steps to follow to
switch your applications accessing the 7.4 installation to the 8.0 installation. These steps are
application-dependent, and so you are on your own. However, the steps probably consist of stopping each
application, changing the connection information (ideally in one place), and bringing the application
back up after the switchover. It is imperative for a smooth and fast switchover that you have the
application switch information at hand. Write the steps down and save the instructions in your source code
control mechanism. You will never be sorry you did that.
One more cautionary note: I highly recommend that you test your application on a copy of the 8.0
database. This copy should not be the replica, if your application writes to the database; it should be
another copy. Remember that replicas are read-only databases. Oh, yes, one more thing. Back up your
databases before performing any major operations on them. Switching replication masters is a major
operation. No excuses. Back up!
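A plain pg_dump of each node before the switch is cheap insurance (the file names here are hypothetical):
pg_dump -p 5434 gb  > gb_node1_before_switch.dump
pg_dump -p 5430 gb  > gb_node2_before_switch.dump
pg_dump -p 5430 gb2 > gb2_node3_before_switch.dump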
Everything that happens in a Slony-I replication happens because of an Event. One of the important
Events is a SYNC event. Every subscribing node replicates data up to a SYNC event and then commits the
changes to the database. All nodes capable of forwarding subscriptions also log those changes until
every node subscribed to that set has confirmed the SYNC event. This ensures that replication data
remains available in the system until Slony is sure that no node needs it.
To change the master of a set, you must first ensure that there are listeners for any of the new paths. The
example provided listening paths from and to both of the nodes already, so there are no new listener
paths required. Before swapping a master on a subscribed set, the set must be locked to ensure that no
updates occur during the swap-over part. Then you may move the set. Finally, the newly designated
replica node, formerly the master node, must subscribe to the set.
Before you run the script, ensure that write access to your master database is OFF.
#!/bin/sh
slonik << _EOF_
cluster name = gbcluster;
node 1 admin conninfo = 'dbname=gb host=localhost port=5434 user=postgres';
node 2 admin conninfo = 'dbname=gb host=localhost port=5430 user=postgres';
node 3 admin conninfo = 'dbname=gb2 host=localhost port=5430 user=postgres';
# add listener paths if required
#
# lock and move set
#
lock set (id=1, origin=1);
move set (id=1, old origin=1, new origin=2);
# subscribe set if required
subscribe set (id=1, provider=2, receiver=1, forward=yes);
_EOF_
After this script, Slony-I is ready for the change. Restart your application and have it point to the new
master for writes. This process should only take seconds.
Failover
The replication example here began as a 7.4 database gb (Node 1), replicating to an 8.0 installation
(Node 2). Then you added a third node, the 8.0 database called gb2. Initially the third node replicated
directly from Node 1. You switched Node 3's provider to Node 2, then promoted Node 2 to be the
master. Now you have an 8.0 master at Node 2 and two replicas: the 7.4 database gb, and gb2, which lives
in the same installation as the new master. Consider the second replica, gb2, as if it were on another machine or installation.
Suppose the master database failed or lost power and had to be taken offline. (Remember Node 3, for the
sake of this discussion, is not on this machine, and so this hypothetical situation treats it as if it were live
when the machine died.) What to do next is a human decision. You must prepare for this failover
scenario. In this case, you have decided to failover to Node 1 in case of a failure on Node 2.
The applications communicating with the database will also start to fail, so you must take them offline
and restart them, pointing at the new master quickly after the master switch takes effect.
The failover procedure for Slony is a combination of a provider change and a master change, both of
which "Building and Configuring Slony" covered. Previously, you added Node 3 and had Node 1
provide for it. Then you changed Node 3's provider to Node 2, and finally, promoted Node 2 to master.
In the failover case, Slony must do the reverse using the failover command:
Promote Node 1 to master.
Switch Node 3 to pull from Node 1.
Then you can safely remove Node 2 for repair.
#!/bin/sh
slonik <<_EOF_
cluster name = gbcluster;
node 1 admin conninfo = 'dbname=gb host=localhost user=postgres port=5434';
node 2 admin conninfo = 'dbname=gb host=localhost user=postgres port=5430';
node 3 admin conninfo = 'dbname=gb2 host=localhost user=postgres port=5430';
failover (id=2, backup node = 1);
_EOF_
At this point, the slon process for Node 2 should be dead. When you are ready to put Node 2 back into
place, add it as a fresh empty database into the replication scheme. (See "Add Node," above.) When it
catches up, then you can switch masters so that Node 2 can again be the master of the cluster. (See
"Switching Masters.")
While resolving some of the problems that I ran into, I found that it was easiest and clearest for me to
drop Node 2. drop node "erases" the node from the entire replication cluster. This is different than
uninstall node,
which removes the Slony-I schema from a specific database instance. Both tools are
useful, but don't confuse them.
The other issue I ran into was that while quickly cutting and pasting, I had mismatched paths, listens,
and connection information. It is very worthwhile to check each of these commands by hand to verify
that the commands are exactly what you meant. Also, don't forget that the connection information for
store path pertains to the server node. Undoing bad paths and listen connections is a delicate
operation and it is very tempting to throw the whole thing away and start from scratch.
Schema Changes
Changing the schema of a database being replicated is not simple. The schema changes must occur at the
exact transactional point in time. Direct schema changes could lead to serious corruption in Slony-I due
to the handling and disabling of triggers, rules, and constraints in the replica.
Slony-I provides a way to execute SQL statements via the Event mechanism. This provides the
transactional integrity necessary for schema changes, as well as the trigger, rule, and constraint changes
required for the replicas. You must initiate Events on the master node.
To add a "dummy" column to the artfile table, issue an Event to the master Node 1 pointing to the
ALTER TABLE script file so as to synchronize it between databases. The EXECUTE SCRIPT command in
Slonik will do this. Remember, Slony must be able to find your change script.
#!/bin/sh
#
# Create a new column in a table in the replicated set
#
echo "Creating new column in the artfile table"
slonik <<_EOF_
cluster name = gbcluster;
node 1 admin conninfo = 'dbname=gb host=localhost user=postgres port=5434';
node 2 admin conninfo = 'dbname=gb host=localhost user=postgres port=5430';
node 3 admin conninfo = 'dbname=gb2 host=localhost user=postgres port=5430';
execute script ( SET ID = 1, FILENAME = 'changes20050219.sql', EVENT NODE = 1);
_EOF_
Once this change propagates, you can do an update to populate the new column and verify it is being
updated on all replicas.
References
General Bits Slony Articles on Tidbits
The Slony-I Project documentation on GBorg
Slonik Commands
Jan Wieck's Original Slony-I Talk and Scripts (PDF), July 2004 in Portland, OR, sponsored by
Afilias Global Registry Services
Information from IRC #slony on freenode.net
Mailing list: [email protected]
A. Elein Mustain has more than 15 years of experience working with databases, 10 of those working
exclusively with object relational database systems.
http://gborg.postgresql.org/project/slony1/projdisplay.php
http://gborg.postgresql.org/project/slony1/genpage.php?slonik_commands
http://www.varlena.com/varlena/GeneralBits/Tidbits/
Why yet another replication system?
Slony-I was born from an idea to create a replication system that was not tied to a
specific version of PostgreSQL and that could be started and stopped on an
existing database without the need for a dump/reload cycle.
What Slony-I is:
Slony-I is a "master to multiple slaves" replication system with cascading and
slave promotion. The big picture for the development of Slony-I is a
master-slave system that includes all features and capabilities needed to
replicate large databases to a reasonably limited number of slave systems.
Slony-I is a system for data centers and backup sites, where the normal mode of
operation is that all nodes are available.
What Slony-I is not:
Slony-I is not a network management system. Slony-I does not have any
functionality within it to detect a node failure, or automatically promote a node
to a master or other data origin.
Slony-I is not multi-master; it's not a
connection broker, and it doesn't make you coffee and toast in the morning.
Why doesn't Slony-I do automatic fail-over/promotion?
This is the job of network monitoring software, not Slony. Every site's
configuration and fail-over path is different. For example, keep-alive
monitoring with redundant NIC's and intelligent HA switches that guarantee
race-condition-free takeover of a network address and disconnecting the
"failed" node vary in every network setup, vendor choice, hardware/software
combination.
This is clearly the realm of network management software and not
Slony-I. Let Slony-I do what it does best: provide database replication.
Current Limitations:
Slony-I does not automatically propagate schema changes, nor does it have any
ability to replicate large objects.
Getting started with Slony-I
Installation
Before you can begin replicating your databases with Slony-I, you need to
install it.
Requirements:
Any platform that can run PostgreSQL should be able to run Slony-I. The
platforms that have received specific testing at the time of this release are
FreeBSD-4X-i386, FreeBSD-5X-i386, FreeBSD-5X-alpha, OS X 10.3, Linux-2.4X-i386,
Linux-2.6X-i386, Linux-2.6X-amd64, Solaris-2.8-SPARC, Solaris-2.9-SPARC, and
OpenBSD-3.5-sparc64.
All the servers used within the replication cluster need to have their real-time
clocks in sync. This is to ensure that slon doesn't error out with messages
indicating that the slave is already ahead of the master during replication.
We recommend you use ntpd running on the master, with the slaves using it as
their time peer.
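A minimal ntp.conf sketch for each slave, assuming the master's hostname (master.example.com and the driftfile path are hypothetical; adjust for your OS):
# /etc/ntp.conf on each slave: use the replication master as time peer
server master.example.com
driftfile /var/lib/ntp/ntp.drift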
The following software packages are required to build Slony-I:

GNU make. Other make programs will not work. GNU make is often installed under
the name gmake; this document will always refer to it by that name. (On some
systems GNU make is the default tool with the name make.) To test for GNU make,
enter gmake --version. Version 3.76 or later is good; previous versions may not
be.
You need an ISO/ANSI C compiler.
Recent versions of GCC work.
You also need a recent version of PostgreSQL *source*. Slony-I depends on
namespace support so you must have version 7.3 or newer to be able to build and
use Slony-I.
If you need to get a GNU package, it comes in the standard packaging for your
operating system, or you can find it at your local GNU mirror (see
http://www.gnu.org/order/ftp.html for a list) or at ftp://ftp.gnu.org/gnu.
If you need to obtain PostgreSQL source, you can download it from your
favorite PostgreSQL mirror (see for a list), or via BitTorrent at .
Also check to make sure you have sufficient disk space. You will need
approximately 5MB for the source tree during build and installation.
Getting Slony-I Source
You can get the Slony-I source from
After you have obtained the file, unpack it.
gunzip slony.tar.gz
tar xf slony.tar
This will create a directory Slony-I under the current directory with the
Slony-I sources. Change into that directory for the rest of the
installation procedure.
Short Version
./configure --with-pgsourcetree=
gmake all
gmake install
1.
Configuration
The first step of the installation procedure is to configure the source tree for
your system. This is done by running the configure script. Configure needs to
know where your PostgreSQL source tree is; specify it with the
--with-pgsourcetree= option.
Example:
./configure --with-pgsourcetree=/usr/local/src/postgresql-7.4.3
This script will run a number of tests to guess values for various dependent
variables and try to detect some quirks of your system. Slony-I is known to need
a modified version of libpq on specific platforms such as Solaris 2.x on SPARC;
this patch can be found at .
2.
Build
To start the build process, type
gmake all
(Remember to use GNU make.) The build may take anywhere from 30 seconds to 2
minutes depending on your hardware. The last line displayed should be
All of Slony-I is successfully made.
3.
Ready to install.
Installing Slony-I
To install Slony-I, enter
gmake install
This will install files into the PostgreSQL install directory as specified by the
--prefix option used in the PostgreSQL configuration. Make sure you have
appropriate permissions to write into that area. Normally you need to do this as
root.
Replicating Your First Database
In this example, we will be replicating a brand new pgbench database. The
mechanics of replicating an existing database are covered here, however we
recommend that you learn how Slony-I functions by using a fresh new
non-production database.
The Slony-I replication engine is trigger-based, allowing us to replicate
databases (or portions thereof) running under the same postmaster.
This example will show how to replicate the pgbench database running on
localhost (master) to the pgbench slave database also running on localhost
(slave). We make a couple of assumptions about your PostgreSQL configuration:
1. You have tcpip_socket=true in your postgresql.conf and
2. You have localhost set to trust in pg_hba.conf
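For example (a hedged sketch; tcpip_socket is the PostgreSQL 7.x setting named above, while 8.0 and later use listen_addresses instead):
# postgresql.conf
tcpip_socket = true

# pg_hba.conf -- let local connections in without a password
host    all    all    127.0.0.1/32    trust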
The REPLICATIONUSER needs to be a PostgreSQL superuser. This is typically
postgres or pgsql.
You should also set the following shell variables:
CLUSTERNAME=slony_example
MASTERDBNAME=pgbench
SLAVEDBNAME=pgbenchslave
MASTERHOST=localhost
SLAVEHOST=localhost
REPLICATIONUSER=pgsql
PGBENCHUSER=pgbench
Here are a couple of examples for setting variables in common shells:
bash/sh:
export CLUSTERNAME=slony_example
(t)csh:
setenv CLUSTERNAME slony_example
Creating the pgbenchuser
createuser -A -D $PGBENCHUSER
Preparing the databases
createdb -O $PGBENCHUSER -h $MASTERHOST $MASTERDBNAME
createdb -O $PGBENCHUSER -h $SLAVEHOST $SLAVEDBNAME
pgbench -i -s 1 -U $PGBENCHUSER -h $MASTERHOST $MASTERDBNAME
Because Slony-I depends on the databases having the pl/pgSQL procedural
language installed, we should install it now. It is possible that you have
installed pl/pgSQL into the template1 database, in which case you can skip this
step because it's already installed in $MASTERDBNAME.
createlang plpgsql -h $MASTERHOST $MASTERDBNAME
Slony-I does not yet automatically copy table definitions from a master when a
slave subscribes to it, so we need to import this data. We do this with
pg_dump.
pg_dump -s -U $REPLICATIONUSER -h $MASTERHOST $MASTERDBNAME | psql -U $REPLICATIONUSER -h $SLAVEHOST $SLAVEDBNAME
To illustrate how Slony-I allows for on-the-fly replication subscription, let's
start up pgbench. If you run the pgbench application in the foreground of a
separate terminal window, you can stop and restart it with different
parameters at any time. You'll need to re-export the variables again so they
are available in this session as well.
The typical command to run pgbench would look like:
pgbench -s 1 -c 5 -t 1000 -U $PGBENCHUSER -h $MASTERHOST $MASTERDBNAME
This will run pgbench with 5 concurrent clients each processing 1000
transactions against the pgbench database running on localhost as the pgbench
user.
Configuring the Database for Replication.
Creating the configuration tables, stored procedures, triggers and
configuration is all done through the slonik tool. It is a specialized
scripting aid that mostly calls stored procedures in the master/salve (node)
databases. The script to create the initial configuration for the simple
master-slave setup of our pgbench database looks like this:
#!/bin/sh
slonik <<_EOF_
#--
# define the namespace the replication system uses
# in our example it is slony_example
#--
cluster name = $CLUSTERNAME;

#--
# admin conninfo's are used by slonik to connect to the nodes one for each
# node on each side of the cluster, the syntax is that of PQconnectdb in
# the C-API
#--
node 1 admin conninfo = 'dbname=$MASTERDBNAME host=$MASTERHOST user=$REPLICATIONUSER';
node 2 admin conninfo = 'dbname=$SLAVEDBNAME host=$SLAVEHOST user=$REPLICATIONUSER';

#--
# init the first node. Its id MUST be 1. This creates the schema
# _$CLUSTERNAME containing all replication system specific database
# objects.
#--
init cluster ( id=1, comment = 'Master Node');

#--
# Because the history table does not have a primary key or other unique
# constraint that could be used to identify a row, we need to add one.
# The following command adds a bigint column named
# _Slony-I_$CLUSTERNAME_rowID to the table. It will have a default value
# of nextval('_$CLUSTERNAME.s1_rowid_seq'), and have UNIQUE and NOT NULL
# constraints applied. All existing rows will be initialized with a
# number
#--
table add key (node id = 1, fully qualified name = 'public.history');

#--
# Slony-I organizes tables into sets. The smallest unit a node can
# subscribe is a set. The following commands create one set containing
# all 4 pgbench tables. The master or origin of the set is node 1.
#--
create set (id=1, origin=1, comment='All pgbench tables');
set add table (set id=1, origin=1, id=1, fully qualified name = 'public.accounts', comment='accounts table');
set add table (set id=1, origin=1, id=2, fully qualified name = 'public.branches', comment='branches table');
set add table (set id=1, origin=1, id=3, fully qualified name = 'public.tellers', comment='tellers table');
set add table (set id=1, origin=1, id=4, fully qualified name = 'public.history', comment='history table', key = serial);
#--
# Create the second node (the slave), tell the 2 nodes how to connect to
# each other and how they should listen for events.
#--
store node (id=2, comment = 'Slave node');
store path (server = 1, client = 2, conninfo = 'dbname=$MASTERDBNAME host=$MASTERHOST user=$REPLICATIONUSER');
store path (server = 2, client = 1, conninfo = 'dbname=$SLAVEDBNAME host=$SLAVEHOST user=$REPLICATIONUSER');
store listen (origin=1, provider = 1, receiver = 2);
store listen (origin=2, provider = 2, receiver = 1);
_EOF_
Is pgbench still running? If not, start it again.
At this point we have 2 databases that are fully prepared. One is the master
database, in which pgbench is busy accessing and changing rows. It's now time
to start the replication daemons.
On $MASTERHOST the command to start the replication engine is
slon $CLUSTERNAME "dbname=$MASTERDBNAME user=$REPLICATIONUSER host=$MASTERHOST"
Likewise we start the replication system on node 2 (the slave)
slon $CLUSTERNAME "dbname=$SLAVEDBNAME user=$REPLICATIONUSER host=$SLAVEHOST"
Even though we have slon running on both the master and slave and they are
both spitting out diagnostics and other messages, we aren't replicating any
data yet. The notices you are seeing are the synchronization of cluster
configurations between the 2 slon processes.
To start replicating the 4 pgbench tables (set 1) from the master (node id 1)
to the slave (node id 2), execute the following script.
#!/bin/sh
slonik <<_EOF_
# ----
# This defines which namespace the replication system uses
# ----
cluster name = $CLUSTERNAME;

# ----
# Admin conninfo's are used by the slonik program to connect
# to the node databases. So these are the PQconnectdb arguments
# that connect from the administrators workstation (where
# slonik is executed).
# ----
node 1 admin conninfo = 'dbname=$MASTERDBNAME host=$MASTERHOST user=$REPLICATIONUSER';
node 2 admin conninfo = 'dbname=$SLAVEDBNAME host=$SLAVEHOST user=$REPLICATIONUSER';
# ----
# Node 2 subscribes set 1
# ----
subscribe set ( id = 1, provider = 1, receiver = 2, forward = no);
_EOF_
Any second here, the replication daemon on $SLAVEHOST will start to copy the
current content of all 4 replicated tables. While doing so, of course, the
pgbench application will continue to modify the database. When the copy
process is finished, the replication daemon on $SLAVEHOST will start to catch
up by applying the accumulated replication log. It will do this in little
steps, 10 seconds worth of application work at a time. Depending on the
performance of the two systems involved, the sizing of the two databases, the
actual transaction load and how well the two databases are tuned and
maintained, this catchup process can be a matter of minutes, hours, or
eons.
You have now successfully set up your first basic master/slave replication
system, and, once the slave has caught up, the 2 databases contain identical
data. That's the theory. In practice, it's good to check that the datasets
are in fact the same.
The following script will create ordered dumps of the 2 databases and compare
them. Make sure that pgbench has completed its testing, and that your slon
sessions have caught up.
#!/bin/sh
echo -n "**** comparing sample1 ... "
psql -U $REPLICATIONUSER -h $MASTERHOST $MASTERDBNAME >dump.tmp.1.$$ <<_EOF_
select 'accounts:'::text, aid, bid, abalance, filler
from accounts order by aid;
select 'branches:'::text, bid, bbalance, filler
from branches order by bid;
select 'tellers:'::text, tid, bid, tbalance, filler
from tellers order by tid;
select 'history:'::text, tid, bid, aid, delta, mtime, filler,
"_Slony-I_${CLUSTERNAME}_rowID"
from history order by "_Slony-I_${CLUSTERNAME}_rowID";
_EOF_
psql -U $REPLICATIONUSER -h $SLAVEHOST $SLAVEDBNAME >dump.tmp.2.$$ <<_EOF_
select 'accounts:'::text, aid, bid, abalance, filler
from accounts order by aid;
select 'branches:'::text, bid, bbalance, filler
from branches order by bid;
select 'tellers:'::text, tid, bid, tbalance, filler
from tellers order by tid;
select 'history:'::text, tid, bid, aid, delta, mtime, filler,
"_Slony-I_${CLUSTERNAME}_rowID"
from history order by "_Slony-I_${CLUSTERNAME}_rowID";
_EOF_
if diff dump.tmp.1.$$ dump.tmp.2.$$ >$CLUSTERNAME.diff ; then
    echo "success - databases are equal."
    rm dump.tmp.?.$$
    rm $CLUSTERNAME.diff
else
    echo "FAILED - see $CLUSTERNAME.diff for database differences"
fi
If this script returns "FAILED" please contact the developers at
http://slony.org/
Doing switchover and failover with Slony-I
Foreword
Slony-I is an asynchronous replication system. Because of that, it
is almost certain that at the moment the current origin of a set
fails, the last transactions committed there have not yet propagated to the
subscribers. Servers always fail under heavy load, and you know
it. Thus the goal is to prevent the main server from failing.
The best way to do that is frequent maintenance.
Opening the case of a running server is not exactly what we
all consider professional system maintenance. And interestingly,
those users who use replication for backup and failover
purposes are usually the ones that have a very low tolerance for
words like "downtime". To meet these requirements, Slony-I has
not only failover capabilities, but controlled master role transfer
features too.
It is assumed in this document that the reader is familiar with
the slonik utility and knows at least how to set up a simple
2 node replication system with Slony-I.
Switchover
We assume a current "origin" as node1 (AKA master) with one
"subscriber" as node2 (AKA slave). A web application on a third
server is accessing the database on node1. Both databases are
up and running and replication is more or less in sync.
Step 1)
At the time of this writing, switchover to another server requires
the application to reconnect to the database. So in order to avoid
any complications, we simply shut down the web server. Users who
use pgpool for the application's database connections can shut down
only the pool.
Step 2)
A small slonik script executes the following commands:

lock set (id = 1, origin = 1);
wait for event (origin = 1, confirmed = 2);
move set (id = 1, old origin = 1, new origin = 2);
wait for event (origin = 1, confirmed = 2);
After these commands, the origin (master role) of data set 1
is now on node2. It is not simply transferred. It is done
in a fashion so that node1 is now a fully synchronized subscriber
actively replicating the set. So the two nodes completely switched
roles.
Step 3)
After reconfiguring the web application (or pgpool) to connect to
the database on node2 instead, the web server is restarted and
resumes normal operation.
Done in one shell script that does the shutdown, slonik, config
file move, and startup all together, this entire procedure
takes less than 10 seconds.
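A sketch of such a wrapper script, with the application stop/start commands and file names purely hypothetical:
#!/bin/sh
# Hypothetical switchover wrapper: stop the app, move the set, repoint, restart.
/etc/init.d/apache stop                     # or shut down only pgpool
slonik < move_set_to_node2.slonik           # lock set / wait for event / move set / wait for event
cp db_node2.conf /etc/myapp/db.conf         # point the application at node2
/etc/init.d/apache start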
It is now possible to simply shutdown node1 and do whatever is
required. When node1 is restarted later, it will start replicating
again and eventually catch up after a while. At this point the
whole procedure is executed with exchanged node IDs and the
original configuration is restored.
Failover
Because of the possibility of missing committed but not-yet-replicated
transactions, failover is the worst thing that can happen in a master-slave
replication scenario. If there is any possibility of bringing back the failed server,
even if only for a few minutes, we strongly recommend that you follow the
switchover procedure above.
Slony does not provide any automatic detection for failed systems.
Abandoning committed transactions is a business decision that
cannot be made by a database. If someone wants to put the
commands below into a script executed automatically from the
network monitoring system, well ... it's your data.
Step 1)
The slonik command
failover (id = 1, backup node = 2);
causes node2 to assume the ownership (origin) of all sets that
have node1 as their current origin. If there are more nodes, all
direct subscribers of node1 are instructed that this is happening.
Slonik would also query all direct subscribers to figure out which
node has the highest replication status (latest committed transaction)
for each set, and the configuration would be changed in a way that
node2 first applies those last-minute changes before actually
allowing write access to the tables.
In addition, all nodes that subscribed directly from node1 will
now use node2 as data provider for the set. This means that
after the failover command succeeded, no node in the entire
replication setup will receive anything from node1 any more.
Step 2)
Reconfigure and restart the application (or pgpool) to cause it
to reconnect to node2.
Step 3)
After the failover is complete and node2 accepts write operations
against the tables, remove all remnants of node1's configuration
information with the slonik command
drop node (id = 1, event node = 2);
After failover, getting back node1
After the above failover, the data stored on node1 must be
considered out of sync with the rest of the nodes. Therefore, the
only way to get node1 back and transfer the master role to it is
to rebuild it from scratch as a slave, let it catch up and then
follow the switchover procedure.