Contents
- 1. Comparison of high availability, load balancing, and replication scenarios
- 2. Comparison of multi-node cluster schemes
PostgreSQL 9.3 official documentation (Chinese): http://58.58.27.50:8079/doc/html/9.3.1_zh/high-availability.html
Replication, clustering, and connection pooling: https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling
Cluster solution feature list: http://blog.osdba.net/46.html
1. Comparison of high availability, load balancing, and replication scenarios
Shared disk failover
Shared disk failover avoids the overhead of synchronization by keeping only one copy of the database. Multiple servers share a single disk array; if the primary server fails, the standby server mounts the database immediately, as if recovering from a crash. This allows fast failover with no data loss.
The shared hardware is typically provided by a networked storage device, though a fully POSIX-compliant network file system can also be used (see Section 17.2.1). One limitation of this approach is that if the shared disk array fails, the entire system is down. Another is that the standby server cannot access the shared storage while the primary server is running.
File system (block device) replication
An improved version of this approach is file system replication: every change to the file system is mirrored to the standby server. The only restriction is that the mirror on the standby must be an exact, consistent copy of the primary; in particular, writes must be applied in the same order. DRBD is a popular file system replication solution for Linux.
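As a concrete illustration, a DRBD resource definition for mirroring the PostgreSQL data volume might look like the sketch below. The resource name, device paths, and host addresses are hypothetical; protocol C is DRBD's fully synchronous mode, which satisfies the write-ordering requirement described above.

```
# /etc/drbd.d/pgdata.res -- hypothetical resource; adjust devices and hosts
resource pgdata {
  protocol C;                # synchronous replication: a write completes only
                             # after it reaches both nodes, preserving order
  device    /dev/drbd0;      # block device the PostgreSQL data directory lives on
  disk      /dev/sdb1;       # backing disk on each node
  meta-disk internal;
  on node1 { address 10.0.0.1:7788; }
  on node2 { address 10.0.0.2:7788; }
}
```

Only the node currently holding the DRBD primary role may mount the file system and run PostgreSQL; the other node merely applies the mirrored blocks.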
Transaction log shipping
A hot standby server can stay current by reading the primary's stream of WAL records. If the primary server fails, the standby contains almost all of the primary's data and can quickly be promoted to become the new primary. This scheme is asynchronous and can only replicate the entire database server.
It can be implemented with file-based log shipping (see Section 25.2), streaming replication (see Section 25.2.5), or a combination of the two. See Section 25.5 for information on hot standby.
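For streaming replication, the most common implementation of log shipping, a minimal 9.3-era configuration might look like the sketch below; host addresses and the replication user are assumptions.

```
# postgresql.conf on the primary
wal_level = hot_standby        # emit enough WAL for a hot standby
max_wal_senders = 3            # allow walsender connections from standbys

# pg_hba.conf on the primary: allow the standby's replication connection
# host  replication  repl  10.0.0.2/32  md5

# recovery.conf in the standby's data directory
standby_mode = 'on'
primary_conninfo = 'host=10.0.0.1 port=5432 user=repl'

# postgresql.conf on the standby
hot_standby = on               # accept read-only queries during recovery
```

The standby's data directory is typically seeded with a base backup taken from the primary with pg_basebackup.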
Trigger-based master-standby replication
In this approach all data-modifying requests are sent to the master server. The master asynchronously sends the changed data to the standby (slave) servers. While the master is running, the standbys answer only read requests; they are well suited to data warehouse queries.
Slony-I is an example of this type of replication. It supports per-table granularity and multiple standby servers. Because it updates the standbys asynchronously and in batches, data may be lost during a failover.
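A minimal slonik script replicating one table from node 1 to node 2 might look like this sketch (the cluster name, connection strings, and table are hypothetical):

```
cluster name = demo_cluster;
node 1 admin conninfo = 'dbname=app host=primary user=slony';
node 2 admin conninfo = 'dbname=app host=standby user=slony';

init cluster (id = 1, comment = 'primary node');
create set (id = 1, origin = 1, comment = 'replicated tables');
set add table (set id = 1, origin = 1, id = 1,
               fully qualified name = 'public.accounts');

store node (id = 2, event node = 1, comment = 'standby node');
store path (server = 1, client = 2, conninfo = 'dbname=app host=primary user=slony');
store path (server = 2, client = 1, conninfo = 'dbname=app host=standby user=slony');

subscribe set (id = 1, provider = 1, receiver = 2, forward = no);
```

The `set add table` step is what gives Slony-I its per-table granularity: only the tables added to a set are replicated, not the whole cluster.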
Statement-based replication middleware
A statement-based replication middleware program intercepts every SQL query and sends it to one or all servers. Each server operates independently. Write queries are sent to all servers, so every server receives the changes, while read-only queries can be sent to just one server, allowing the read load to be balanced.
If data-modifying SQL statements are simply broadcast, functions such as random(), CURRENT_TIMESTAMP, and sequence functions will produce different values on different servers, because each server operates independently and it is the SQL statements that are broadcast rather than the modified rows. If this is unacceptable, either the middleware or the application must obtain such values from a single source and use them in the write queries. Care must also be taken that every transaction either commits or aborts on all servers, perhaps by using two-phase commit (PREPARE TRANSACTION and COMMIT PREPARED). Pgpool-II and Continuent Tungsten are examples of this type of replication.
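The nondeterminism problem can be seen with a simple insert. If the middleware broadcasts the statement verbatim, each server evaluates `now()` and `nextval()` itself; a middleware that resolves such values first would broadcast literals instead. (The table and sequence names below are hypothetical.)

```sql
-- Broadcast verbatim: each server computes its own values, so rows diverge.
INSERT INTO audit_log (id, created_at)
VALUES (nextval('audit_log_id_seq'), now());

-- Value-resolved form: the middleware evaluates the volatile expressions on
-- one designated server first, then broadcasts the resulting literals.
INSERT INTO audit_log (id, created_at)
VALUES (1042, '2015-06-01 12:00:00+08');
```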
Asynchronous multi-master replication
For servers that are only intermittently connected, such as laptops or remote servers, keeping their data consistent is a challenge. In this scheme each server works independently and periodically communicates with the other servers to identify conflicting transactions; conflicts are resolved by users or by conflict-resolution rules.
Synchronous multi-master replication
In this scheme every server can accept write requests, and modified data must be broadcast from the originating server to all other servers before the transaction commits. Heavy write activity causes excessive locking, leading to poor performance; in fact, write performance across multiple servers is always worse than on a single server. Read requests can be distributed evenly among the servers. Some implementations use shared disks to reduce communication overhead. Synchronous multi-master replication is best suited to workloads that read far more than they write. Its advantage is that every server accepts writes, so there is no need to partition the workload between master and standby servers; and because modified rows rather than statements are sent between servers, nondeterministic functions such as random() cause no problems.
PostgreSQL does not offer this type of replication. However, PostgreSQL's two-phase commit (PREPARE TRANSACTION and COMMIT PREPARED) can be used to implement it in application code or middleware.
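At the SQL level, the two-phase commit commands referred to above look like this sketch (the transaction identifier and table are hypothetical). The application or middleware runs the same sequence against every server and issues COMMIT PREPARED only after every server has prepared successfully:

```sql
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 42;
PREPARE TRANSACTION 'txn_20150601_42';  -- phase 1: persist the transaction,
                                        -- ready to commit, on this server

-- Phase 2, after all servers prepared successfully:
COMMIT PREPARED 'txn_20150601_42';

-- If any server failed to prepare, abort it everywhere instead:
-- ROLLBACK PREPARED 'txn_20150601_42';
```

Note that `max_prepared_transactions` must be set to a non-zero value in postgresql.conf before PREPARE TRANSACTION can be used.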
Commercial solutions
Because PostgreSQL is open source and easy to extend, a number of companies have built commercial closed-source solutions on PostgreSQL, offering their own failover, replication, and load-balancing capabilities.
| Feature | Shared Disk Failover | File System Replication | Transaction Log Shipping | Trigger-Based Master-Standby Replication | Statement-Based Replication Middleware | Asynchronous Multimaster Replication | Synchronous Multimaster Replication |
|---|---|---|---|---|---|---|---|
| Most common implementation | NAS | DRBD | Streaming repl. | Slony | pgpool-II | Bucardo | |
| Communication method | shared disk | disk blocks | WAL | table rows | SQL | table rows | table rows and row locks |
| No special hardware required | | • | • | • | • | • | • |
| Allows multiple master servers | | | | | • | • | • |
| No master server overhead | • | | • | | • | | |
| No waiting for multiple servers | • | | with sync off | • | | • | |
| Master failure will never lose data | • | • | with sync on | | • | | • |
| Standbys accept read-only queries | | | with hot standby | • | • | • | • |
| Per-table granularity | | | | • | | • | • |
| No conflict resolution necessary | • | • | • | • | | | • |
Several solutions do not fit into the categories above:
Data partitioning
Data partitioning splits a table into data sets, where each set can be modified by only one server. For example, data can be partitioned by office, say London and Paris, with one server for each office. If some queries must combine London and Paris data, the application can query both servers, or master/standby replication can be used to keep a read-only copy of the other office's data on each server.
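A sketch of the London/Paris example in SQL (the schema and connection string are hypothetical): each server enforces its own partition with a CHECK constraint, and a combined report reaches the other server through the dblink extension:

```sql
-- On the London server: only London rows may be written here.
CREATE TABLE orders (
    id     serial  PRIMARY KEY,
    office text    NOT NULL CHECK (office = 'london'),
    amount numeric NOT NULL
);
-- The Paris server defines the same table with CHECK (office = 'paris').

-- A report combining both offices, run from London via dblink:
SELECT id, office, amount FROM orders
UNION ALL
SELECT id, office, amount
FROM dblink('host=paris dbname=sales',
            'SELECT id, office, amount FROM orders')
     AS remote(id int, office text, amount numeric);
```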
Multi-server parallel query execution
Many of the solutions above let multiple servers handle multiple queries, but none lets a single query use multiple servers to complete faster. This solution allows a single query to run in parallel across multiple servers. It usually works by splitting the data among the servers, having each server execute its part of the query, and returning the partial results to a central server that combines them and returns the final result to the user. Pgpool-II has this capability; it can also be implemented with the PL/Proxy tool set.
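A PL/Proxy function illustrating the fan-out pattern (the cluster name and table are hypothetical; the cluster's partitions are defined separately in the PL/Proxy configuration):

```sql
-- Runs the aggregate on every partition; the caller sees the merged rows
-- and can combine them further (e.g. sum the per-partition sums).
CREATE FUNCTION sales_per_office(OUT office text, OUT total numeric)
RETURNS SETOF record AS $$
    CLUSTER 'sales_cluster';  -- partition list comes from plproxy config
    RUN ON ALL;               -- execute on every partition in parallel
    SELECT office, sum(amount) FROM orders GROUP BY office;
$$ LANGUAGE plproxy;
```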
2. Comparison of multi-node cluster schemes
Some of these schemes can be built on top of streaming replication.
| Program | License | Maturity | Replication Method | Sync | Connection Pooling | Load Balancing | Query Partitioning |
|---|---|---|---|---|---|---|---|
| PGCluster | BSD | stalled | master-master | synchronous | no | yes | no |
| pgpool-I | BSD | stable | statement-based middleware | synchronous | yes | yes | no |
| pgpool-II | BSD | recent release | statement-based middleware | synchronous | yes | yes | yes |
| Slony | BSD | stable | master-slave | asynchronous | no | no | no |
| Bucardo | BSD | stable | master-master, master-slave | asynchronous | no | no | no |
| Londiste | BSD | stable | master-slave | asynchronous | no | no | no |
| Mammoth | BSD | stalled | master-slave | asynchronous | no | no | no |
| rubyrep | MIT | stalled | master-master, master-slave | asynchronous | no | no | no |
| BDR (Bi-Directional Replication) | PostgreSQL (BSD) | beta | master-master (no triggers needed) | asynchronous | no | no | no |
| pg_shard | LGPL | recent release | statement-based middleware (as an extension) | synchronous | no | yes | yes |
PostgreSQL high availability, load balancing, replication and cluster scenario introduction