PostgreSQL high availability, load balancing, replication and cluster scenario introduction

Source: Internet
Author: User
Tags: connection pooling, failover, postgresql



Catalogue


    • One, comparison of several high availability, load balancing, and replication scenarios
    • Two, comparison of multi-node cluster schemes


PostgreSQL 9.3 official documentation (Chinese): http://58.58.27.50:8079/doc/html/9.3.1_zh/high-availability.html



Replication, Clustering, and Connection Pooling: https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling



Cluster scenario Feature List: http://blog.osdba.net/46.html





One, comparison of several high availability, load balancing, and replication scenarios:


Shared disk failover



Shared disk failover avoids the overhead of synchronization by keeping only one copy of the database. Multiple servers share a single disk array; if the primary server fails, the standby server mounts the database and starts up immediately, as if it were recovering from a crash. This allows fast failover with no data loss.



The shared hardware is typically provided by a network storage device, although a fully POSIX-compliant network file system can also be used (see Section 17.2.1). The limitation of this approach is that if the shared disk array fails or becomes corrupt, the whole system is down. Another limitation is that the standby server can never access the shared storage while the primary server is running.



File system replication (block devices)



An improved approach is file system replication, in which any change to the file system is mirrored to a backup server. The only restriction is that the mirroring must guarantee the standby has a consistent copy of the file system; in particular, writes must be applied on the standby in the same order as on the primary. DRBD is a popular file system replication solution on Linux.



Transaction log shipping



A hot standby server can stay current by reading a stream of WAL records. If the primary server fails, the standby contains almost all of the primary's data and can quickly be promoted to become the new primary. This scheme is asynchronous and can only replicate the entire database server.



It can be implemented with file-based log shipping, with streaming replication, or with a combination of the two. For the former see Section 25.2; for the latter see Section 25.2.5. For information about hot standby see Section 25.5.
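To see this in practice, the following queries are a minimal sketch using the monitoring functions and the pg_stat_replication view available in PostgreSQL 9.3; they confirm that a standby is receiving and replaying the WAL stream:

```sql
-- On the standby: confirm it is in recovery and check how far the WAL
-- stream has been received and replayed.
SELECT pg_is_in_recovery();                -- true on a hot standby
SELECT pg_last_xlog_receive_location(),    -- last WAL location received from the primary
       pg_last_xlog_replay_location();     -- last WAL location replayed locally

-- On the primary (streaming replication only): one row per connected standby.
SELECT client_addr, state, sent_location, replay_location
FROM pg_stat_replication;
```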



Trigger-based primary and standby replication



In this approach all requests that modify data are sent to the master server. The master asynchronously sends data changes to the standby (slave) servers. While the master is running, the standbys can answer only read-only queries, which makes them ideal for data warehouse queries.



Slony-I is an example of this type of replication; it offers per-table granularity and supports multiple standby servers. Because it updates the standbys asynchronously and in batches, data can be lost during a failover.
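To make the mechanism concrete, here is a minimal, hypothetical sketch of the capture-and-queue idea that trigger-based tools such as Slony-I are built around; the table, function, and trigger names are invented for illustration and this is not Slony-I's actual schema:

```sql
-- Hypothetical change queue on the master; an external daemon would read it,
-- apply the changes on each slave in batches, and delete the processed rows.
CREATE TABLE replication_queue (
    id        bigserial PRIMARY KEY,
    tbl       text        NOT NULL,
    op        text        NOT NULL,              -- 'INSERT', 'UPDATE' or 'DELETE'
    row_data  text        NOT NULL,              -- captured row, serialized as JSON
    queued_at timestamptz NOT NULL DEFAULT now()
);

CREATE OR REPLACE FUNCTION capture_change() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO replication_queue (tbl, op, row_data)
        VALUES (TG_TABLE_NAME, TG_OP, row_to_json(OLD)::text);
        RETURN OLD;
    ELSE
        INSERT INTO replication_queue (tbl, op, row_data)
        VALUES (TG_TABLE_NAME, TG_OP, row_to_json(NEW)::text);
        RETURN NEW;
    END IF;
END;
$$ LANGUAGE plpgsql;

-- Per-table granularity: only tables carrying this trigger are replicated.
-- 'orders' is an example table.
CREATE TRIGGER orders_capture
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE PROCEDURE capture_change();
```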



Statement-based replication middleware



With statement-based replication middleware, a program intercepts every SQL query and sends it to one or all servers. Each server operates independently. Read-write queries are sent to all servers, so every server receives every change, while read-only queries can be sent to a single server, allowing the read workload to be balanced.



If queries that modify data are simply broadcast, functions like random(), CURRENT_TIMESTAMP, and sequences can produce different values on different servers, because each server operates independently and the SQL statements are broadcast rather than the modified rows. If this is unacceptable, either the middleware or the application must determine such values on a single source and use them in the write queries. Care must also be taken that every transaction either commits on all servers or is rolled back on all of them, perhaps by using two-phase commit (PREPARE TRANSACTION and COMMIT PREPARED). Pgpool-II and Continuent Tungsten are examples of this type of replication.
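The divergence problem and the usual workaround can be illustrated with a hedged example (the events table and its columns are invented for illustration):

```sql
-- If this statement is broadcast as-is, each server evaluates random(),
-- CURRENT_TIMESTAMP and the sequence itself, so the stored rows differ:
INSERT INTO events (id, created_at, token)
VALUES (nextval('events_id_seq'), CURRENT_TIMESTAMP, random());

-- The middleware or application must instead obtain those values once,
-- from a single designated server, and broadcast only literals:
INSERT INTO events (id, created_at, token)
VALUES (42, TIMESTAMPTZ '2015-06-01 10:23:54+08', 0.734982);
```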



Asynchronous multi-master server replication



For servers that are not continuously connected, such as laptops or remote servers, keeping data consistent among them is cumbersome. In this approach each server works independently and periodically communicates with the other servers to identify conflicting transactions. Conflicts can be resolved by users or by conflict-resolution rules.



Synchronous multi-master server replication



In this approach every server accepts write requests, and modified data must be broadcast from the originating server to every other server before the transaction commits. Heavy write activity can cause excessive locking and therefore poor performance; in fact, write performance on multiple servers is often worse than on a single server. Read requests can be distributed evenly across the servers. Some implementations use shared disk to reduce the communication overhead. Synchronous multi-master replication works best when reads greatly outnumber writes. Its advantage is that any server can accept write requests, so there is no need to partition the workload between master and standby servers; and because data changes (not statements) are sent between servers, nondeterministic functions such as random() cause no problems.



PostgreSQL does not offer this type of replication, although its two-phase commit (PREPARE TRANSACTION and COMMIT PREPARED) can be used to implement it in application code or middleware.
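As a minimal sketch of how application code or middleware could use two-phase commit for this (the accounts table and the transaction identifier are invented for illustration; max_prepared_transactions must be set to a nonzero value on each server):

```sql
-- Run the same sequence against every server.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
PREPARE TRANSACTION 'txn_20150601_0001';

-- Only after every server has prepared successfully:
COMMIT PREPARED 'txn_20150601_0001';

-- If any server failed to prepare, roll back on those that did:
-- ROLLBACK PREPARED 'txn_20150601_0001';
```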



Commercial solutions



Because PostgreSQL is open source and easily extended, a number of companies have created commercial closed-source solutions based on PostgreSQL, offering unique failover, replication, and load-balancing capabilities.


| Feature | Shared Disk Failover | File System Replication | Transaction Log Shipping | Trigger-Based Master-Standby Replication | Statement-Based Replication Middleware | Asynchronous Multimaster Replication | Synchronous Multimaster Replication |
|---|---|---|---|---|---|---|---|
| Most common implementation | NAS | DRBD | Streaming Repl. | Slony | pgpool-II | Bucardo | |
| Communication method | shared disk | disk blocks | WAL | table rows | SQL | table rows | table rows and row locks |
| No special hardware required | | • | • | • | • | • | • |
| Allows multiple master servers | | | | | • | • | • |
| No master server overhead | • | | • | | • | | |
| No waiting for multiple servers | • | | with sync off | • | | • | |
| Master failure will never lose data | • | • | with sync on | | • | | • |
| Standbys accept read-only queries | | | with hot standby | • | • | • | • |
| Per-table granularity | | | | • | | • | • |
| No conflict resolution necessary | • | • | • | • | | | • |


There are also a few solutions that do not fit into the categories above:



Data partitioning



Data partitioning splits a table into data sets, and each set can be modified by only one server. For example, data can be partitioned by office, e.g. London and Paris, with a server for each office. If queries need to combine London and Paris data, an application can query both servers, or master/standby replication can be used to keep a read-only copy of the other office's data on each server.
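A hedged sketch of the idea, using postgres_fdw (shipped as a contrib extension with 9.3) to let the London server combine its own rows with the Paris server's; all object names, hosts, and credentials here are examples:

```sql
-- On the London server, which owns only its own office's rows:
CREATE TABLE customers_london (
    id     serial PRIMARY KEY,
    office text NOT NULL CHECK (office = 'London'),
    name   text NOT NULL
);

-- Attach the Paris server's table as a foreign table:
CREATE EXTENSION IF NOT EXISTS postgres_fdw;
CREATE SERVER paris_srv FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'paris.example.com', dbname 'appdb');
CREATE USER MAPPING FOR CURRENT_USER SERVER paris_srv
    OPTIONS (user 'app', password 'secret');
CREATE FOREIGN TABLE customers_paris (
    id     integer,
    office text,
    name   text
) SERVER paris_srv OPTIONS (table_name 'customers');

-- A query combining both offices:
SELECT office, count(*)
FROM (SELECT office FROM customers_london
      UNION ALL
      SELECT office FROM customers_paris) AS all_customers
GROUP BY office;
```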



Multi-server parallel query execution



Many of the solutions above let multiple servers handle multiple queries, but none allow a single query to use multiple servers to complete faster. This approach does: it splits the data among the servers, each server executes its part of the query, and the results are returned to a central server that combines them and returns them to the user. pgpool-II has this capability; it can also be implemented with the PL/Proxy toolset.
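A hedged sketch of the PL/Proxy approach (the cluster name, table, and function are invented; it assumes the plproxy extension is installed and a cluster named 'salescluster' has been configured, which is not shown here):

```sql
-- Defined on the proxy database: fans the query out to every partition
-- and streams the partial results back to the caller.
CREATE FUNCTION sales_per_region(OUT region text, OUT total numeric)
RETURNS SETOF record AS $$
    CLUSTER 'salescluster';
    RUN ON ALL;
    SELECT region, sum(amount) AS total FROM orders GROUP BY region;
$$ LANGUAGE plproxy;

-- The central server then federates the partial results:
SELECT region, sum(total) AS total
FROM sales_per_region()
GROUP BY region;
```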





Two, comparison of multi-node cluster schemes


Some of these schemes can be based on streaming replication.


| Program | License | Maturity | Replication Method | Sync | Connection Pooling | Load Balancing | Query Partitioning |
|---|---|---|---|---|---|---|---|
| PGCluster | BSD | Stalled | Master-master | Synchronous | No | Yes | No |
| pgpool-I | BSD | Stable | Statement-based middleware | Synchronous | Yes | Yes | No |
| pgpool-II | BSD | Recent release | Statement-based middleware | Synchronous | Yes | Yes | Yes |
| Slony | BSD | Stable | Master-slave | Asynchronous | No | No | No |
| Bucardo | BSD | Stable | Master-master, master-slave | Asynchronous | No | No | No |
| Londiste | BSD | Stable | Master-slave | Asynchronous | No | No | No |
| Mammoth | BSD | Stalled | Master-slave | Asynchronous | No | No | No |
| Rubyrep | MIT | Stalled | Master-master, master-slave | Asynchronous | No | No | No |
| BDR (Bi-Directional Replication) | PostgreSQL (BSD) | Beta | Master-master (no triggers needed) | Asynchronous | No | No | No |
| pg_shard | LGPL | Recent release | Statement-based middleware (as an extension) | Synchronous | No | Yes | Yes |



