Extending PostgreSQL replication with BDR


In this chapter, you will be introduced to a new technology: Bi-Directional Replication (BDR). In the world of PostgreSQL, BDR is definitely a rising star. Many new things will be seen in the near future, and people can expect a thriving project.

This chapter will be about the following topics:

• Understanding the BDR replication concept

• Installing BDR

• Setting up a simple cluster

• Modifying clusters and failover

• Understanding the performance of BDR

Before digging into all the technical details, it is important to understand the basic techniques behind BDR.

Understanding the BDR replication concept

In the past, before streaming replication was introduced in PostgreSQL 9.0, people had to use Slony to replicate data. The core problem with a solution such as Slony is that it requires a change trigger, which actually writes data twice. Trigger-based solutions are hard to administer, cannot handle DDL, and are generally somewhat tricky to operate.

BDR has been created to put an end to trigger-based solutions and to turn PostgreSQL into a stronger, more scalable, and easier-to-administer solution. Trigger-based replication is really an outdated approach and should not be seen in modern infrastructure anymore. You can bet on BDR; it is a safe, long-term solution.

Understanding eventual consistency

The CAP theorem was discussed in an earlier part of this book. It is an important concept, and it should always be kept in mind when a new database technology is evaluated. BDR is an eventually consistent system. What does that mean? Wikipedia (http://en.wikipedia.org/wiki/Eventual_consistency) provides the following definition:

Eventual consistency is a consistency model used in distributed computing to achieve high availability that informally guarantees that, if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value.

This definition is so good and so simple that it makes sense to include it here. The idea behind eventual consistency is that data is not immediately the same on all nodes, but over time, it will actually become identical (if nothing new happens). Eventual consistency also means that data is replicated asynchronously by default, so not all nodes see the same data at all times. You must expect to see slightly different data, depending on the host you are connected to.

[BDR also supports synchronous replication. However, it is not as strict as a classic two-phase commit transaction.]

Handling conflicts

Given this consistency model, an important topic arises: what about conflicts? In general, every system that uses eventual consistency needs some sort of conflict resolution. The same applies to BDR.

The beauty of BDR is that conflict management is very flexible. By default, BDR provides a straightforward conflict management algorithm, which defines that the last update always wins.

However, it is also possible to write your own conflict resolution algorithms in BDR. A server-side stored procedure can be used to define what has to happen in the event of a conflict. This mechanism gives users maximum flexibility and helps them achieve more sophisticated goals.
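To illustrate the idea, here is a minimal sketch of the kind of resolution logic such a stored procedure can encode. The table t_order, the function name, and the business rule are made up for this example; the exact signature a handler must have, and the registration call (bdr.bdr_create_conflict_handler in the BDR documentation), are version-specific, so check the documentation of your BDR release before wiring anything like this in:

-- hypothetical table; its row type is used by the handler below
CREATE TABLE t_order (id int PRIMARY KEY, amount numeric, updated_at timestamptz);

-- sketch of custom resolution logic: instead of "last update wins",
-- keep whichever conflicting version carries the larger amount
CREATE FUNCTION resolve_order_conflict(local_row t_order, remote_row t_order)
RETURNS t_order AS $$
BEGIN
    IF local_row.amount >= remote_row.amount THEN
        RETURN local_row;
    END IF;
    RETURN remote_row;
END;
$$ LANGUAGE plpgsql;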

Another advantage of BDR is that conflicting changes can be logged to a table. In other words, if a conflict occurs, it is still possible to trace what went on in the system. Data is not silently discarded; it is stored for later investigation.

When it comes to conflicts, the intended use of BDR must be kept in mind: BDR has been designed as a (geographically) distributed database system that allows you to handle large amounts of data. From a consistency point of view, one thing must be considered: how likely is it that a person in New York and a person in Rome will change the very same row on their respective nodes at the same time? If this kind of conflict is the standard case for your tables, BDR is really not the right solution. However, if conflicts hardly ever happen (which is the case for 99 percent of all applications), BDR is really an option to consider. Keep in mind that a conflict can only occur if the same row is changed by many people at once, or if a primary key is violated. If two people change two rows that are completely independent of each other, no conflict will occur. Therefore, on many occasions, eventual consistency is a perfectly reasonable choice.

Distributed sequences

Sequences are a potential source of conflicts. Imagine hundreds of users adding rows to the very same table at the same time. Any auto-increment column would instantly become a source of conflicts, because the instances would tend to hand out the same or similar numbers.

Therefore, BDR provides distributed sequences. Each node is assigned a range of values that it can use up until the next range of values is assigned. The availability of distributed sequences greatly reduces the number of potential conflicts and helps your system run much more smoothly than it otherwise would.
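As a sketch of how this looks in practice, the BDR 0.9 documentation describes "global" sequences that are created through a sequence access method; the table t_data and the sequence name here are made up, and the syntax may differ in other BDR releases:

-- a sketch, assuming the BDR 0.9 global sequence syntax:
CREATE SEQUENCE t_data_id_seq USING bdr;

CREATE TABLE t_data (
    id int NOT NULL DEFAULT nextval('t_data_id_seq'),
    payload text
);

-- alternatively, make all newly created sequences global by default:
-- SET default_sequenceam = 'bdr';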

Handling DDLs

Data structures are by no means constant. Every now and then, structural changes happen: a table has to be added, a column has to be dropped, and so on.

BDR handles these operations nicely. Most DDLs are simply replicated, as they are just sent to all the nodes and executed. However, there are also commands that are not allowed. Here are the two most prominent ones:

ALTER TABLE ... ALTER COLUMN ... USING ();

ALTER TABLE ... ADD COLUMN ... DEFAULT;

The problem with these commands is that they force a rewrite of the entire table, which cannot be replayed safely across the cluster. In most cases, these restrictions do not matter much. However, they have to be kept in mind.

In many cases, there are workarounds for these commands, for example, setting explicit values.
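For example, the forbidden ADD COLUMN ... DEFAULT form can usually be replaced by a sequence of commands that avoids the full table rewrite (the column name status is made up for this sketch):

-- not allowed under BDR (forces a full table rewrite):
-- ALTER TABLE t_test ADD COLUMN status int DEFAULT 0;

-- workaround: add the column without a default, backfill explicit
-- values, then attach the default for future rows only
ALTER TABLE t_test ADD COLUMN status int;
UPDATE t_test SET status = 0;
ALTER TABLE t_test ALTER COLUMN status SET DEFAULT 0;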

Use cases for BDR

For BDR, there are both good and bad use cases. Of course, this applies to every piece of software. However, database systems are a bit special, and it is necessary to think carefully before making a decision.

Good use cases for BDR

In general, BDR works best if a certain data set is only modified on one of the nodes. This greatly reduces the odds of conflicts and helps you enjoy a smooth process.

What does it actually mean to modify data on only one node? Let's assume there are three locations: Vienna, Berlin, and London. An operator working in Berlin will usually modify the data of German customers rather than Austrian or British data. An operator in Vienna, in turn, is far more likely to change Austrian data. Every operator may still be able to see all the data of the entire company in every country; what differs is where a given piece of data is typically created and changed. From a business point of view, it is close to impossible that two people in different countries will change the same data at the same time, so conflicts are next to impossible.

In addition, workloads that consist primarily of INSERT operations, rather than UPDATE or DELETE operations, are a good fit. An INSERT is unlikely to cause a conflict, so such a workload suits BDR well.

Bad use cases for BDR

However, there are also workloads that are generally not good for BDR. If consistency is the primary goal and the key requirement of your application, then using BDR is certainly not the right choice. Because of the asynchronous nature of the product, conflicts and eventual consistency simply have to be accepted.

Another bad scenario for BDR is one in which all the nodes have to see exactly the same data at the same time; this is something BDR struggles with by design.

BDR can take writes efficiently. However, it cannot scale writes indefinitely, because ultimately, all the writes still end up on every server. Keep in mind that you can scale writes only by actually splitting the data. So, if you are looking for a way to scale out writes, PL/Proxy might be a better choice.

Logical decoding does the trick

The main concept behind BDR is logical decoding. As has already been mentioned in this book, logical decoding was invented to dissect the transaction log stream and turn the binary log into a more readable format, such as SQL. The advantage of a logical stream over a binary stream is that replication can happen across version boundaries more easily.

Another important advantage is that there is no need to synchronize physical XLOG positions. As this book has shown in previous chapters, XLOG addresses are vital to making things work and cannot be changed. Therefore, XLOG-based replication is always single-master, multi-slave replication. There is no way to unite two binary XLOG streams into one stream of changes. Logical decoding solves that problem in an elegant way, because it leaves the entire XLOG synchronization problem aside. Replicating real changes in a SQL-like format gives you a lot of flexibility and leaves room for future improvements.

The whole XLOG decoding business basically happens behind the scenes; the end user will not even notice it.
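If you would like to see what such a decoded change stream looks like, stock PostgreSQL 9.4 already lets you peek at one using the test_decoding example plugin shipped in contrib (this is plain PostgreSQL, not BDR itself; the slot name peek and the table t_demo are made up for the example, and the configuration shown later in this chapter already satisfies the wal_level and replication-slot requirements):

test=# SELECT * FROM pg_create_logical_replication_slot('peek', 'test_decoding');
test=# CREATE TABLE t_demo (id int);
test=# INSERT INTO t_demo VALUES (1);
test=# SELECT data FROM pg_logical_slot_get_changes('peek', NULL, NULL);
test=# SELECT pg_drop_replication_slot('peek');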

Installing BDR

Installing BDR is easy. The software is available as a source package, and it can also be deployed directly using binary packages. Of course, installing from source is possible too. However, that process may change as more and more of the code moves into the PostgreSQL core, so the source installation is skipped here.

Installing binary packages

In this section, you will learn how to install BDR on a Linux system using precompiled binary packages. The installation shown here works if your Linux distribution is CentOS 7 (to find information about packages for other distributions, check out http://bdr-project.org/docs/stable/installation.html).

The installation process itself is simple. First, install the repository:

yum install http://packages.2ndquadrant.com/postgresql-bdr94-2ndquadrant/yum-repo-rpms/postgresql-bdr94-2ndquadrant-redhat-1.0-2.noarch.rpm

In the next step, you can deploy BDR itself:

yum install postgresql-bdr94-bdr

Once BDR has been installed on all the nodes, the system is ready for action.

[Keep in mind that BDR is still in a fairly early stage of development, so the process shown here may change over time.]

Setting up a simple cluster

Once the installation has been completed, it is time to get started and actually set up a simple cluster. In this scenario, a cluster consisting of three nodes will be created.

Note that, to make life easier for beginners, all the data nodes will be set up on the same physical server.

Arranging storage

The first thing an administrator has to do is to create some space for PostgreSQL. In this simple example, just three directories are created:

[root@localhost ~]# mkdir /data

[root@localhost ~]# mkdir /data/node1 /data/node2 /data/node3

Make sure that these directories belong to postgres (assuming that you run PostgreSQL as the postgres user, which is usually a good idea):

[root@localhost ~]# cd /data/

[root@localhost data]# chown postgres.postgres node*

Once these directories have been created, everything needed for a successful setup is in place:

[root@localhost data]# ls -l
total 0
drwxr-xr-x 2 postgres postgres 6 Apr 05:51 node1
drwxr-xr-x 2 postgres postgres 6 Apr 05:51 node2
drwxr-xr-x 2 postgres postgres 6 Apr 05:51 node3

Creating database instances

After creating some space for our experiments, it makes sense to cross-check the system path. Make sure that the right version of PostgreSQL is in your path. Some users have reported problems during installation because a version of PostgreSQL shipped by the operating system was unexpectedly in the path. Therefore, it makes sense to simply check the path and set it properly if necessary:

export PATH=/usr/pgsql-9.4/bin:$PATH

Then, the three database instances can be created. The initdb command can be used as usual:

[postgres@localhost ~]$ initdb -D /data/node1/ -A trust

[postgres@localhost ~]$ initdb -D /data/node2/ -A trust

[postgres@localhost ~]$ initdb -D /data/node3/ -A trust

To keep the setup process simple, trust is used as the authentication method. Of course, full user authentication is possible as well, but it is not at the heart of this chapter, so it is best to simplify this part as much as possible.

Now that the three database instances have been created, postgresql.conf can be adjusted. The following parameters are needed:

shared_preload_libraries = 'bdr'
wal_level = 'logical'
track_commit_timestamp = on
max_connections = 100
max_wal_senders = 10
max_replication_slots = 10
max_worker_processes = 10

The first thing to do is to load the bdr module into PostgreSQL; it contains important infrastructure for replication. The next thing is to enable logical decoding, which is the backbone of the entire infrastructure.

To make BDR work, track_commit_timestamp has to be turned on. This setting does not exist in standard PostgreSQL 9.4; it will most likely make it into the core of a future PostgreSQL version along with BDR. Knowing the timestamp of each commit is necessary for BDR's internal conflict resolution algorithm (last update wins).

Then, max_wal_senders has to be configured, along with replication slots. These settings are also needed for streaming replication, so they should not come as a big surprise.

Finally, there is max_worker_processes. As soon as PostgreSQL starts, BDR fires up a couple of worker processes in the background. These workers are based on the standard background worker API and are needed to handle the data transfer during replication. It is essential to make sure that enough processes are available.

In addition, there are a couple of conflict-related settings that can be used:

# handling conflicts
#bdr.default_apply_delay = 2000   # milliseconds
#bdr.log_conflicts_to_table = on

Now that postgresql.conf has been configured, it is time to focus on pg_hba.conf. In the simplest case, simple replication rules have to be created:

local   replication   postgres                   trust
host    replication   postgres   127.0.0.1/32    trust
host    replication   postgres   ::1/128         trust

Please note that in a real, production setup, a sensible administrator would configure a special replication user and set a password, or use some other authentication method. For the sake of simplicity, this has been left out here.
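For completeness, a minimal sketch of what such a setup can look like follows; the role name bdr_repl and the password are placeholders, and the connection strings used later in this chapter would then have to carry the matching user and password:

CREATE ROLE bdr_repl WITH LOGIN REPLICATION PASSWORD 'secret';

The corresponding pg_hba.conf rule would then use password authentication instead of trust:

host    replication   bdr_repl   127.0.0.1/32    md5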

Now the database instances can be started just like ordinary PostgreSQL instances:

pg_ctl -D /data/node1/ start
pg_ctl -D /data/node2/ start
pg_ctl -D /data/node3/ start

For our tests, a database is needed in each instance:

createdb test -p 5432
createdb test -p 5433
createdb test -p 5434
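Before loading any BDR-specific modules, it can be worth verifying that the configuration has actually taken effect on every instance. Plain psql is enough for this quick sanity check (shown here for the first instance only):

psql test -p 5432 -c 'SHOW wal_level'                 # should return "logical"
psql test -p 5432 -c 'SHOW track_commit_timestamp'    # should return "on"
psql test -p 5432 -c 'SHOW shared_preload_libraries'  # should contain "bdr"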

Loading the modules and firing up the cluster

So far, so good! To make sure that BDR can do its job, it has to be loaded into the database. Two extensions are needed, namely btree_gist and bdr:

[postgres@localhost node1]$ psql test -p 5432
test=# CREATE EXTENSION btree_gist;
CREATE EXTENSION
test=# CREATE EXTENSION bdr;
CREATE EXTENSION

These extensions have to be loaded into all three databases created earlier. It is not enough to load them into just one of them; it is really essential that all the databases contain them.
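Since the same two commands have to be repeated for each instance, a tiny shell loop over the three ports used in this setup saves some typing:

for port in 5432 5433 5434; do
    psql test -p $port -c 'CREATE EXTENSION btree_gist;'
    psql test -p $port -c 'CREATE EXTENSION bdr;'
done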

Finally, the database nodes have to be joined into a BDR group. So far, there are only three independent database instances, which happen to contain the necessary modules. In the next steps, these nodes will be connected with each other.

The first thing to do is to create a BDR group:

test=# SELECT bdr.bdr_group_create(
           local_node_name := 'node1',
           node_external_dsn := 'port=5432 dbname=test'
       );
 bdr_group_create
------------------

(1 row)

Basically, two parameters are needed: a local name and a database connection string that remote hosts can use to connect to this node. The simplest way to handle local_node_name is to just give the node a simple name.

To check whether the node is ready for BDR, the following function can be called. If the answer looks like the following, the configuration is fine:

test=# SELECT bdr.bdr_node_join_wait_for_ready();
 bdr_node_join_wait_for_ready
------------------------------

(1 row)

Now is the time to add the other nodes to the replication system:

test=# SELECT bdr.bdr_group_join(
           local_node_name := 'node2',
           node_external_dsn := 'port=5433 dbname=test',
           join_using_dsn := 'port=5432 dbname=test'
       );
 bdr_group_join
----------------

(1 row)

Again, an empty result is a good sign. Now that a second node has been added to the BDR group, the third node can join as well:

test=# SELECT bdr.bdr_group_join(
           local_node_name := 'node3',
           node_external_dsn := 'port=5434 dbname=test',
           join_using_dsn := 'port=5432 dbname=test'
       );
 bdr_group_join
----------------

(1 row)

Once all the nodes have been added, the administrator can check if all the nodes are ready:

[postgres@localhost node2]$ psql test -p 5433
test=# SELECT bdr.bdr_node_join_wait_for_ready();
 bdr_node_join_wait_for_ready
------------------------------

(1 row)

[postgres@localhost node2]$ psql test -p 5434
test=# SELECT bdr.bdr_node_join_wait_for_ready();
 bdr_node_join_wait_for_ready
------------------------------

(1 row)

If both queries return an empty result, it means that the system is up and running nicely.

Checking your setup

After this simple process, BDR is up and running. To check whether everything works as expected, it makes sense to look at the replication-related processes:

[postgres@localhost ~]$ ps ax | grep bdr
31296 ?  Ss  0:00 postgres: bgworker: bdr supervisor
31396 ?  Ss  0:00 postgres: bgworker: bdr db: test
31533 ?  Ss  0:00 postgres: bgworker: bdr supervisor
31545 ?  Ss  0:00 postgres: bgworker: bdr supervisor
31553 ?  Ss  0:00 postgres: bgworker: bdr db: test
31593 ?  Ss  0:00 postgres: bgworker: bdr db: test
31610 ?  Ss  0:00 postgres: bgworker: bdr (6136360420896274864,1,16385,)->bdr (6136360353631754624,1,
...
31616 ?  Ss  0:00 postgres: bgworker: bdr (6136360353631754624,1,16385,)->bdr (6136360420896274864,1,

As you can see, each instance has at least three BDR processes. If these processes are all present, it is usually a good sign, and replication should work as expected.

A simple test can reveal whether the system is working:

test=# CREATE TABLE t_test (id int, t timestamp DEFAULT now());

CREATE TABLE

After the table is created, the structure should look like this:

test=# \d t_test
            Table "public.t_test"
 Column |            Type             |   Modifiers
--------+-----------------------------+---------------
 id     | integer                     |
 t      | timestamp without time zone | default now()
Triggers:
    truncate_trigger AFTER TRUNCATE ON t_test FOR EACH STATEMENT EXECUTE PROCEDURE bdr.queue_truncate()

The table looks as expected, with one exception: a truncate trigger has been created automatically. Keep in mind that logical decoding can stream INSERT, UPDATE, and DELETE statements. DDLs and TRUNCATE are not row-level information, so these statements are not part of the standard stream. The trigger is needed to catch the TRUNCATE and replicate it as plain text. Do not attempt to change or drop the trigger.

To test replication, a simple INSERT statement can work:

test=# INSERT INTO t_test VALUES (1);
INSERT 0 1
test=# TABLE t_test;
 id |             t
----+----------------------------
  1 | 2015-04-11 08:48:46.637675
(1 row)

In this example, the value has been added to the instance listening on port 5432. A quick check reveals that the data has been nicely replicated to the instances listening on ports 5433 and 5434:

[postgres@localhost ~]$ psql test -p 5434
test=# TABLE t_test;
 id |             t
----+----------------------------
  1 | 2015-04-11 08:48:46.637675
(1 row)
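Thanks to the truncate trigger seen earlier, even truncation propagates through the cluster. A quick sketch of such a check: the statement is issued on the node listening on port 5432, and shortly afterwards, t_test will be empty on ports 5433 and 5434 as well:

test=# TRUNCATE t_test;
TRUNCATE TABLE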

Handling conflicts

As mentioned earlier in this chapter, conflicts are an important issue when working with BDR. Keep in mind that BDR has been designed as a distributed system, so it makes sense to use it when conflicts are unlikely. Still, it is important to understand what goes on in the event of a conflict.

To show what happens, here is a simple table:

test=# CREATE TABLE t_counter (id int PRIMARY KEY);

CREATE TABLE

Then, a single row is added:

test=# INSERT INTO t_counter VALUES (1);

INSERT 0 1

To run the test, a simple SQL script is needed. In this example, 10,000 UPDATE statements are used; here are the first three lines:

[postgres@localhost ~]$ head -n 3 /tmp/script.sql

UPDATE t_counter SET id = id + 1;

UPDATE t_counter SET id = id + 1;

UPDATE t_counter SET id = id + 1;

Now this script is executed concurrently, once against each of the three nodes:

[postgres@localhost ~]$ cat run.sh
#!/bin/sh
psql test -p 5432 < /tmp/script.sql > /dev/null &
psql test -p 5433 < /tmp/script.sql > /dev/null &
psql test -p 5434 < /tmp/script.sql > /dev/null &

The number of conflicts can be expected to skyrocket, as the same row is hit over and over again.

[Please note that this is not what BDR was originally designed for. It is just a demonstration of what happens in the event of a conflict.]

Once these three scripts have completed, it is possible to check what went on during all those conflicts:

test=# \x
Expanded display is on.
test=# TABLE bdr.bdr_conflict_history LIMIT 1;
-[ RECORD 1 ]------------+------------------------------
conflict_id              | 1
local_node_sysid         | 6136360318181427544
local_conflict_xid       | 0
local_conflict_lsn       | 0/19AAE00
local_conflict_time      | 2015-04-11 09:01:23.367467+02
object_schema            | public
object_name              | t_counter
remote_node_sysid        | 6136360353631754624
remote_txid              | 1974
remote_commit_time       | 2015-04-11 09:01:21.364068+02
remote_commit_lsn        | 0/1986900
conflict_type            | update_delete
conflict_resolution      | skip_change
local_tuple              |
remote_tuple             | {"id":2}
local_tuple_xmin         |
local_tuple_origin_sysid |
error_message            |
error_sqlstate           |
error_querystring        |
error_cursorpos          |
error_detail             |
error_hint               |
error_context            |
error_columnname         |
error_typename           |
error_constraintname     |
error_filename           |
error_lineno             |
error_funcname           |

BDR provides a simple and convenient way to see what has happened: a table containing all the conflicting changes. The record shown here includes the LSNs involved, the transaction IDs, and a lot more information about the conflict. In this example, BDR has decided on skip_change as the resolution. Remember that in an asynchronous multi-master setup, concurrent transactions running on different nodes keep hitting the very same row, and in such a case, the UPDATE statement in question is simply skipped. It is important to understand this: in the event of conflicting concurrent changes, BDR may skip changes in your cluster.
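Since every conflict ends up in this table (given that bdr.log_conflicts_to_table is enabled, as shown in the configuration section), a simple aggregate over the columns seen above gives a quick overview of how often each kind of conflict occurs:

test=# SELECT object_name, conflict_type, conflict_resolution, count(*)
       FROM bdr.bdr_conflict_history
       GROUP BY 1, 2, 3;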

Understanding replication sets

So far, the entire cluster has been replicated: every node replicates data to every other node. In many cases, this is not desirable. BDR is far more flexible in this respect.

One-way replication

BDR is capable of bidirectional as well as unidirectional replication. In some cases, this is very convenient. Consider a system that only serves read requests: a simple one-way slave may be exactly what you need.

BDR provides a simple function to register a node as a one-way slave:

bdr.bdr_subscribe(local_node_name,
                  subscribe_to_dsn,
                  node_local_dsn,
                  apply_delay integer DEFAULT NULL,
                  replication_sets text[] DEFAULT ARRAY['default'],
                  synchronize bdr_sync_type DEFAULT 'full')

Of course, it is also possible to remove a node from one-way replication:

bdr.bdr_unsubscribe(local_node_name)

The process is simple and fits the basic design principles of BDR well.
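Based on the signature shown above, registering a hypothetical fourth node as a one-way slave of node1 might look roughly as follows; the node name node4 and the port 5435 are made up for this example:

test=# SELECT bdr.bdr_subscribe(
           local_node_name := 'node4',
           subscribe_to_dsn := 'port=5432 dbname=test',
           node_local_dsn := 'port=5435 dbname=test'
       );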

Working with data tables

The beauty of BDR is that it is not necessary to replicate an entire instance to the whole cluster. Replication can be fairly fine-grained, and administrators can decide which data to replicate where. Two functions are available to manage the replication sets of a table:

bdr.table_set_replication_sets(p_relation regclass, p_sets text[])

This sets the replication sets of a table. Any previous assignment will be overwritten.

If you want to figure out which replication sets a table belongs to, the following function can be called:

bdr.table_get_replication_sets(relation regclass) text[]
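Using these two functions is straightforward. Here, t_test is reused from earlier in this chapter, and the set name important_data is made up for the example:

test=# SELECT bdr.table_set_replication_sets('t_test', ARRAY['important_data']);
test=# SELECT bdr.table_get_replication_sets('t_test');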

More features in the area of partial replication can be expected as BDR develops further. This will give you the flexibility to distribute your data as needed.

Controlling replication

For maintenance reasons, it can be necessary to pause and resume replication from time to time. Just consider a major software update: it might do nasty things to your data structures, and you definitely do not want faulty changes to be replicated through the system. Therefore, it is convenient to stop replication and resume it once things have proven to be working.

Two functions can be used for this work:

SELECT bdr.bdr_apply_pause();

To resume replication, the following function can be used:

SELECT bdr.bdr_apply_resume();

The connection to the remote node (or nodes) is retained while replication is paused, but no data is read from it. The effects of pausing are not persistent, so if PostgreSQL is restarted, or if the postmaster performs crash recovery after a backend failure, replay will resume again. Terminating individual backends using pg_terminate_backend will not cause replay to resume, and neither will reloading the postmaster without a full restart. There is no option to pause replay from only one peer node.
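A typical maintenance sequence on a node therefore simply brackets the risky work with these two calls; a minimal sketch:

test=# SELECT bdr.bdr_apply_pause();   -- stop applying changes from peer nodes
-- ... perform the software update and verify the results ...
test=# SELECT bdr.bdr_apply_resume();  -- resume applying changes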

Summary

BDR is a rising star in the PostgreSQL replication world. Currently, it is still under development, and we can expect a lot more in the near future (maybe even by the time you are holding this book in your hands).

BDR is an asynchronous multi-master system that allows people to run geographically distributed databases. It is important to remember that BDR is especially useful when the rate of replication conflicts is low.
