Document directory
- PostgreSQL cluster replication solution
- About pgq and londiste tools
- Install
- Configuration
- Start replication daemon
- Add a replica set
- Test conclusion
- Exploration of application scenarios
PostgreSQL cluster replication solution
Refer to Chapter 25, "High Availability, Load Balancing, and Replication", of the PostgreSQL documentation to learn about some common replication solutions.
About pgq and londiste tools
Both tools come from Skype's skytools toolkit. pgq is a generic queue implementation on top of PostgreSQL, and londiste is a replication tool built on pgq. Together they play a role similar to Slony, but are simpler to set up. The replication can be viewed as a producer (provider) / consumer (subscriber) model: the master writes change data into the queue, and the slave fetches the data from the queue and applies it to itself.
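Conceptually, londiste just automates the raw pgq cycle. The following is a minimal sketch of that cycle through psql, assuming the pgq schema is already installed in database foo and a ticker is running; the queue and consumer names are made up for illustration, and londiste performs all of these steps for you:
psql foo -c "SELECT pgq.create_queue('demo_queue');"
psql foo -c "SELECT pgq.register_consumer('demo_queue', 'demo_consumer');"
# producer side: push an event into the queue
psql foo -c "SELECT pgq.insert_event('demo_queue', 'demo_type', 'demo_payload');"
# consumer side: once the ticker has closed a batch, fetch it, read its events, mark it done
psql foo -c "SELECT pgq.next_batch('demo_queue', 'demo_consumer');"   # returns a batch id (or NULL)
psql foo -c "SELECT ev_id, ev_type, ev_data FROM pgq.get_batch_events(1);"   # 1 = the batch id returned above
psql foo -c "SELECT pgq.finish_batch(1);"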
Install
A brief description of the installation process follows (assume two machines, one master node and one slave node, with PostgreSQL already installed on both).
- Install psycopg2 (download it from the psycopg2 project site).
tar xzvf psycopg2-2.0.7.tar.gz
cd psycopg2-2.0.7
python setup.py build_ext -R <pgsql installation directory>/lib
python setup.py install
- Install skytools (download it from the skytools project site).
tar xzvf skytools-2.1.7.tar.gz
cd skytools-2.1.7
./configure
make
make install
python setup.py install
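A quick way to confirm both pieces landed where Python can find them (a sketch; the module and script names follow the skytools 2.x layout):
# psycopg2 should import cleanly
python -c "import psycopg2; print psycopg2.__version__"
# skytools installs its python modules plus the londiste.py / pgqadm.py scripts
python -c "import skytools, pgq"
which londiste.py pgqadm.py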
Configuration
Configuration file name: ticker.ini
[pgqadm]
job_name = ticker
db = dbname=foo host=127.0.0.1 port=5555 user=chry
# how often to run maintenance [minutes]
maint_delay_min = 1
# how often to check for activity [secs]
loop_delay = 0.1
logfile = %(job_name)s.log
pidfile = %(job_name)s.pid
use_skylog = 0
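With this file in place, the pgq schema can be created and inspected with pgqadm.py (a sketch based on the skytools 2.x commands; the londiste provider install step further below also creates the pgq schema, so the install here is optional):
# create the pgq schema in the database named on the db line above
pgqadm.py ticker.ini install
# once the ticker is running, show queues and registered consumers
pgqadm.py ticker.ini status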
- Master (provider) and slave (subscriber)
Configuration file name: replica.ini
[londiste]
job_name = lmaster_to_slave
provider_db = dbname=foo host=127.0.0.1 port=5555 user=chry
subscriber_db = dbname=foo host=205.203.*.* port=5555 user=chry
# it will be used as sql ident so no dots/spaces
pgq_queue_name = londiste.write
pidfile = %(job_name)s.pid
logfile = %(job_name)s.log
use_skylog = 0
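Before starting the daemons, it may be worth confirming that both DSNs above are reachable from the node that will run them (a sketch reusing the connection strings from replica.ini; the subscriber host stays masked as in the original):
psql "dbname=foo host=127.0.0.1 port=5555 user=chry" -c "SELECT 1"
psql "dbname=foo host=205.203.*.* port=5555 user=chry" -c "SELECT 1"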
Start replication daemon
pgqadm.py ticker.ini ticker -d (start the pgq ticker/management process)
londiste.py replica.ini provider install (install the schema required by londiste on the provider)
londiste.py replica.ini subscriber install (install the schema required by londiste on the subscriber)
londiste.py replica.ini replay -d (start data replication; replica.ini is deployed on both nodes)
The above process requires that the slave can log on to the master database.
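For that last requirement, one way to provide password-free login from the slave to the master is a ~/.pgpass file on the slave (a sketch; the host name and password are placeholders):
# format is hostname:port:database:username:password
echo "master-host:5555:foo:chry:secret" >> ~/.pgpass
chmod 600 ~/.pgpass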
Add a replica set
- Suppose you want to synchronize a table entry with two fields: id (serial, primary key) and item (text). The sequence entry_id_seq behind id must then be synchronized as well (a sketch of the table follows this list).
On the master:
londiste.py replica.ini provider add entry
londiste.py replica.ini provider add-seq entry_id_seq
On the slave:
londiste.py replica.ini subscriber add entry
londiste.py replica.ini subscriber add-seq entry_id_seq
The table and sequence are now part of the replica set.
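Londiste replicates row changes, not DDL, so the entry table (and therefore its sequence) must already exist on both master and slave before the add commands above. A minimal sketch of the table described in this section, to be run in database foo on both nodes:
psql foo <<'SQL'
CREATE TABLE entry (
    id   serial PRIMARY KEY,   -- serial creates the sequence entry_id_seq
    item text
);
SQL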
Test conclusion
- The table objects to be synchronized must have a primary key.
- Insert, update, and delete operations on the master are correctly synchronized to the corresponding slave table; truncate, however, is not supported.
- If the slave database goes down, changes made to the master data while the slave was down are synchronized to the corresponding slave table once the slave is restarted, but old data is not synchronized.
- Insert/update/delete operations executed on the slave do not affect the master. Conversely, any changes made to the sequence on the slave are overwritten the next time the sequence changes on the master.
- To force synchronization of the old data, run the following command on the slave:
londiste.py replica.ini subscriber resync -v --force
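To check which tables each side knows about and whether they are in sync, the table listings can help (a sketch, assuming the skytools 2.x command set):
# on the master
londiste.py replica.ini provider tables
# on the slave
londiste.py replica.ini subscriber tables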
Exploration of application scenarios
- This scheme can be applied to scenarios where reads and writes can be separated, for example the following kind of application combined with plproxy:
On one side, data is fed into the write cluster; on the other side, read/write splitting not only improves performance but also supports load balancing, so pgq/londiste is clearly a better fit here. Previously we might have had to write to two nodes at the same time to keep the data in sync, which not only hurts performance but also cannot guarantee transactional consistency, since standard plproxy does not support transactions.
Ticket search is one such application, and some OLTP and web applications may also be suitable.
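As an illustration only (the function, cluster, and database names are assumptions, not from the original), a PL/Proxy function on the proxy node could fan read queries out across the londiste replicas, while writes keep going to the master:
psql proxydb <<'SQL'
-- by default PL/Proxy calls the function of the same name on the chosen node,
-- so get_entry(integer) must also exist on every replica in the read cluster
CREATE FUNCTION get_entry(p_id integer) RETURNS text AS $$
    CLUSTER 'read_cluster';   -- cluster made of londiste subscribers (assumed name)
    RUN ON ANY;               -- any replica may answer, giving load-balanced reads
$$ LANGUAGE plproxy;
SQL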