PostgreSQL high availability, load balancing, replication and cluster scenario introduction

Source: Internet
Author: User
Tags: connection pooling, failover, postgresql



Catalogue


    • One, comparison of several high availability, load balancing, and replication scenarios
    • Two, comparison of multi-node cluster schemes


PostgreSQL 9.3 official documentation (Chinese): http://58.58.27.50:8079/doc/html/9.3.1_zh/high-availability.html



Replication, Clustering, and Connection Pooling: https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling



Cluster scenario Feature List: http://blog.osdba.net/46.html





One, comparison of several high availability, load balancing, and replication scenarios:


Shared disk failover



Shared disk failover avoids the overhead of synchronization by keeping only one copy of the database. Multiple servers share a single disk array; if the primary server fails, the standby server mounts the database and starts up immediately, as if it were recovering from a crash. This allows fast failover with no data loss.



The shared hardware is typically provided by a network storage device, although a fully POSIX-compliant network file system can also be used (see Section 17.2.1). The limitation of this approach is that if the shared disk array fails or becomes corrupt, the whole system is down. Another limitation is that the standby server can never access the shared storage while the primary server is running.



File system replication (block devices)



An improved approach is file system replication, in which any change to the file system is mirrored to a backup server. The only restriction is that the mirroring must guarantee the standby has a consistent copy of the file system; in particular, writes must be applied on the standby in the same order as on the primary. DRBD is a popular file system replication solution on Linux.



Transaction log shipping



A hot standby server can stay current by reading a stream of WAL records. If the primary server fails, the standby contains almost all of the primary's data and can quickly be promoted to become the new primary. This scheme is asynchronous and can only replicate the entire database server.



It can be implemented with file-based log shipping, with streaming replication, or with a combination of the two. For the former see Section 25.2; for the latter see Section 25.2.5. For information about hot standby see Section 25.5.
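To see this in practice, the following queries are a minimal sketch using the monitoring functions and the pg_stat_replication view available in PostgreSQL 9.3; they confirm that a standby is receiving and replaying the WAL stream:

```sql
-- On the standby: confirm it is in recovery and check how far the WAL
-- stream has been received and replayed.
SELECT pg_is_in_recovery();                -- true on a hot standby
SELECT pg_last_xlog_receive_location(),    -- last WAL location received from the primary
       pg_last_xlog_replay_location();     -- last WAL location replayed locally

-- On the primary (streaming replication only): one row per connected standby.
SELECT client_addr, state, sent_location, replay_location
FROM pg_stat_replication;
```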



Trigger-based primary and standby replication



In this approach all requests that modify data are sent to the master server. The master asynchronously sends data changes to the standby (slave) servers. While the master is running, the standbys can answer only read-only queries, which makes them ideal for data warehouse queries.



Slony-I is an example of this type of replication; it offers per-table granularity and supports multiple standby servers. Because it updates the standbys asynchronously and in batches, data can be lost during a failover.
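To make the mechanism concrete, here is a minimal, hypothetical sketch of the capture-and-queue idea that trigger-based tools such as Slony-I are built around; the table, function, and trigger names are invented for illustration and this is not Slony-I's actual schema:

```sql
-- Hypothetical change queue on the master; an external daemon would read it,
-- apply the changes on each slave in batches, and delete the processed rows.
CREATE TABLE replication_queue (
    id        bigserial PRIMARY KEY,
    tbl       text        NOT NULL,
    op        text        NOT NULL,              -- 'INSERT', 'UPDATE' or 'DELETE'
    row_data  text        NOT NULL,              -- captured row, serialized as JSON
    queued_at timestamptz NOT NULL DEFAULT now()
);

CREATE OR REPLACE FUNCTION capture_change() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO replication_queue (tbl, op, row_data)
        VALUES (TG_TABLE_NAME, TG_OP, row_to_json(OLD)::text);
        RETURN OLD;
    ELSE
        INSERT INTO replication_queue (tbl, op, row_data)
        VALUES (TG_TABLE_NAME, TG_OP, row_to_json(NEW)::text);
        RETURN NEW;
    END IF;
END;
$$ LANGUAGE plpgsql;

-- Per-table granularity: only tables carrying this trigger are replicated.
-- 'orders' is an example table.
CREATE TRIGGER orders_capture
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE PROCEDURE capture_change();
```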



Statement-based replication middleware



With statement-based replication middleware, a program intercepts every SQL query and sends it to one or all servers. Each server operates independently. Read-write queries are sent to all servers, so every server receives every change, while read-only queries can be sent to a single server, allowing the read workload to be balanced.



If queries that modify data are simply broadcast, functions like random(), CURRENT_TIMESTAMP, and sequences can produce different values on different servers, because each server operates independently and the SQL statements are broadcast rather than the modified rows. If this is unacceptable, either the middleware or the application must determine such values on a single source and use them in the write queries. Care must also be taken that every transaction either commits on all servers or is rolled back on all of them, perhaps by using two-phase commit (PREPARE TRANSACTION and COMMIT PREPARED). Pgpool-II and Continuent Tungsten are examples of this type of replication.
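The divergence problem and the usual workaround can be illustrated with a hedged example (the events table and its columns are invented for illustration):

```sql
-- If this statement is broadcast as-is, each server evaluates random(),
-- CURRENT_TIMESTAMP and the sequence itself, so the stored rows differ:
INSERT INTO events (id, created_at, token)
VALUES (nextval('events_id_seq'), CURRENT_TIMESTAMP, random());

-- The middleware or application must instead obtain those values once,
-- from a single designated server, and broadcast only literals:
INSERT INTO events (id, created_at, token)
VALUES (42, TIMESTAMPTZ '2015-06-01 10:23:54+08', 0.734982);
```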



Asynchronous multi-master server replication



For servers that are not continuously connected, such as laptops or remote servers, keeping data consistent among them is cumbersome. In this approach each server works independently and periodically communicates with the other servers to identify conflicting transactions. Conflicts can be resolved by users or by conflict-resolution rules.



Synchronous multi-master server replication



In this approach every server accepts write requests, and modified data must be broadcast from the originating server to every other server before the transaction commits. Heavy write activity can cause excessive locking and therefore poor performance; in fact, write performance on multiple servers is often worse than on a single server. Read requests can be distributed evenly across the servers. Some implementations use shared disk to reduce the communication overhead. Synchronous multi-master replication works best when reads greatly outnumber writes. Its advantage is that any server can accept write requests, so there is no need to partition the workload between master and standby servers; and because data changes (not statements) are sent between servers, nondeterministic functions such as random() cause no problems.



PostgreSQL does not offer this type of replication, although its two-phase commit (PREPARE TRANSACTION and COMMIT PREPARED) can be used to implement it in application code or middleware.
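As a minimal sketch of how application code or middleware could use two-phase commit for this (the accounts table and the transaction identifier are invented for illustration; max_prepared_transactions must be set to a nonzero value on each server):

```sql
-- Run the same sequence against every server.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
PREPARE TRANSACTION 'txn_20150601_0001';

-- Only after every server has prepared successfully:
COMMIT PREPARED 'txn_20150601_0001';

-- If any server failed to prepare, roll back on those that did:
-- ROLLBACK PREPARED 'txn_20150601_0001';
```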



Commercial solutions



Because PostgreSQL is open source and easily extended, a number of companies have created commercial closed-source solutions based on PostgreSQL, offering unique failover, replication, and load-balancing capabilities.


| Feature | Shared Disk Failover | File System Replication | Transaction Log Shipping | Trigger-Based Master-Standby Replication | Statement-Based Replication Middleware | Asynchronous Multimaster Replication | Synchronous Multimaster Replication |
|---|---|---|---|---|---|---|---|
| Most common implementation | NAS | DRBD | Streaming Repl. | Slony | pgpool-II | Bucardo | |
| Communication method | shared disk | disk blocks | WAL | table rows | SQL | table rows | table rows and row locks |
| No special hardware required | | • | • | • | • | • | • |
| Allows multiple master servers | | | | | • | • | • |
| No master server overhead | • | | • | | • | | |
| No waiting for multiple servers | • | | with sync off | • | | • | |
| Master failure will never lose data | • | • | with sync on | | • | | • |
| Standbys accept read-only queries | | | with hot standby | • | • | • | • |
| Per-table granularity | | | | • | | • | • |
| No conflict resolution necessary | • | • | • | • | | | • |


There are also a few solutions that do not fit into the categories above:



Data partitioning



Data partitioning splits a table into data sets, and each set can be modified by only one server. For example, data can be partitioned by office, e.g. London and Paris, with a server for each office. If queries need to combine London and Paris data, an application can query both servers, or master/standby replication can be used to keep a read-only copy of the other office's data on each server.
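A hedged sketch of the idea, using postgres_fdw (shipped as a contrib extension with 9.3) to let the London server combine its own rows with the Paris server's; all object names, hosts, and credentials here are examples:

```sql
-- On the London server, which owns only its own office's rows:
CREATE TABLE customers_london (
    id     serial PRIMARY KEY,
    office text NOT NULL CHECK (office = 'London'),
    name   text NOT NULL
);

-- Attach the Paris server's table as a foreign table:
CREATE EXTENSION IF NOT EXISTS postgres_fdw;
CREATE SERVER paris_srv FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'paris.example.com', dbname 'appdb');
CREATE USER MAPPING FOR CURRENT_USER SERVER paris_srv
    OPTIONS (user 'app', password 'secret');
CREATE FOREIGN TABLE customers_paris (
    id     integer,
    office text,
    name   text
) SERVER paris_srv OPTIONS (table_name 'customers');

-- A query combining both offices:
SELECT office, count(*)
FROM (SELECT office FROM customers_london
      UNION ALL
      SELECT office FROM customers_paris) AS all_customers
GROUP BY office;
```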



Multi-server parallel query execution



Many of the solutions above let multiple servers handle multiple queries, but none allow a single query to use multiple servers to complete faster. This approach does: it splits the data among the servers, each server executes its part of the query, and the results are returned to a central server that combines them and returns them to the user. pgpool-II has this capability; it can also be implemented with the PL/Proxy toolset.
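A hedged sketch of the PL/Proxy approach (the cluster name, table, and function are invented; it assumes the plproxy extension is installed and a cluster named 'salescluster' has been configured, which is not shown here):

```sql
-- Defined on the proxy database: fans the query out to every partition
-- and streams the partial results back to the caller.
CREATE FUNCTION sales_per_region(OUT region text, OUT total numeric)
RETURNS SETOF record AS $$
    CLUSTER 'salescluster';
    RUN ON ALL;
    SELECT region, sum(amount) AS total FROM orders GROUP BY region;
$$ LANGUAGE plproxy;

-- The central server then federates the partial results:
SELECT region, sum(total) AS total
FROM sales_per_region()
GROUP BY region;
```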





Two, comparison of multi-node cluster schemes


Some of these schemes can be based on streaming replication.


| Program | License | Maturity | Replication Method | Sync | Connection Pooling | Load Balancing | Query Partitioning |
|---|---|---|---|---|---|---|---|
| PGCluster | BSD | Stalled | Master-master | Synchronous | No | Yes | No |
| pgpool-I | BSD | Stable | Statement-based middleware | Synchronous | Yes | Yes | No |
| pgpool-II | BSD | Recent release | Statement-based middleware | Synchronous | Yes | Yes | Yes |
| Slony | BSD | Stable | Master-slave | Asynchronous | No | No | No |
| Bucardo | BSD | Stable | Master-master, master-slave | Asynchronous | No | No | No |
| Londiste | BSD | Stable | Master-slave | Asynchronous | No | No | No |
| Mammoth | BSD | Stalled | Master-slave | Asynchronous | No | No | No |
| Rubyrep | MIT | Stalled | Master-master, master-slave | Asynchronous | No | No | No |
| BDR (Bi-Directional Replication) | PostgreSQL (BSD) | Beta | Master-master (no triggers needed) | Asynchronous | No | No | No |
| pg_shard | LGPL | Recent release | Statement-based middleware (as an extension) | Synchronous | No | Yes | Yes |



