PostgreSQL Replication, Chapter 5: Setting up synchronous replication (1)


So far, we have dealt with file-based replication (or log shipping) and simple streaming-based replication setups. In both cases, data is sent to the slave after the transaction has been committed on the master. There is always a window between the commit on the master and the moment the slave has fully received the data, and during this window data can still be lost.

In this chapter, we will study the following topics:

• Ensure that no transactions are lost

• Configure PostgreSQL synchronous replication

• Understand and use application_name

• Performance impact of synchronous replication

• Speed optimization of replication

5.1 Setting up synchronous replication

As mentioned earlier, synchronous replication does everything it can, at a cost, to protect your data. The core idea of synchronous replication is that a transaction must exist on at least two servers before the master returns success to the client. Setting up synchronous replication works just like setting up asynchronous replication; only a handful of the parameters discussed in this chapter have to be changed to enjoy the blessings of synchronous replication. However, if you want to build your own setup based on synchronous replication, we recommend starting with an asynchronous setup and gradually extending the configuration to turn it into synchronous replication. This lets you debug the system more easily and avoid many problems along the way.

5.1.1 Understanding the drawbacks of synchronous replication

The most important thing you have to know about synchronous replication is that it is quite expensive. Do you remember the first chapter, covering the CAP theorem, the speed of light, and so on? Synchronous replication and its drawbacks are one of the core reasons why we decided to include all that background information in this book. Understanding the physical limitations of synchronous replication is essential; otherwise, you may end up in trouble.

When setting up synchronous replication, try to keep the following points in mind:

• Minimize delay

• Make sure that you have redundancy

• Synchronous replication is more expensive than asynchronous replication

5.1.2 Understanding the application_name parameter

application_name has an important role to play in a synchronous setup. In a typical application, people use the application_name parameter for debugging purposes. It can help you track down bugs, figure out what an application is doing, and so on:

test=# SHOW application_name;
 application_name
------------------
 psql
(1 row)
test=# SET application_name TO 'whatever';
SET
test=# SHOW application_name;
 application_name
------------------
 whatever
(1 row)

As you can see, the application_name parameter can be set freely. The setting applies to the session we are in and disappears as soon as we disconnect. The question now is: what does application_name have to do with synchronous replication?

The story goes like this: when a slave connects to the master, it sends an application_name as part of its primary_conninfo setting. If this application_name happens to be the first entry in synchronous_standby_names, the slave becomes the synchronous standby.

[In the case of cascading replication (meaning a slave that is in turn connected to another slave), the cascaded slave is never considered synchronous.]

Keeping all this information in mind, we can move ahead and configure our first synchronous replication setup.

5.1.3 Making synchronous replication work

In order to show you that synchronous replication is working, this chapter will include a complete working example that lists all the relevant configuration parameters.

Several changes have to be made on the master. The postgresql.conf file on the master requires the following settings:

wal_level = hot_standby
max_wal_senders = 5    # or any number
synchronous_standby_names = 'book_sample'
hot_standby = on       # on the slave to make it readable

Then we have to configure pg_hba.conf as shown in the previous chapters. After that, the server can be restarted and the master is ready for action.

[We also recommend setting wal_keep_segments to keep more transaction log on the master. This makes the entire setup much more robust.]
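For illustration, the master's postgresql.conf could be extended along these lines; the value shown here is only an assumption, so pick one that matches your disk space and write load:

```
# keep at least 32 WAL segments (16 MB each) around for slaves
# that fall behind; the number 32 is only an example
wal_keep_segments = 32
```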

Next, we can take a base backup, just as we did before. We have to run pg_basebackup on the slave. Ideally, we include the transaction log in the base backup (--xlog-method=stream). This allows us to bring the server up quickly and without any serious risk.
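As a sketch, the base backup could be taken with a command along the following lines; the host name, user, and data directory are placeholders, not values from this setup:

```shell
# run on the slave; adjust host, user, and target directory to your environment
pg_basebackup -h master.example.com -p 5432 -U hs \
    -D /var/lib/pgsql/slave_data \
    --xlog-method=stream --checkpoint=fast
```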

[--xlog-method=stream and wal_keep_segments are a good combination; in most cases, this duo is, in our view, the key to a setup that works perfectly and safely.]

We suggested setting hot_standby on the master earlier; the config file will be copied to the slave anyway, so you can save yourself the step of changing this setting in the slave's postgresql.conf later. Of course, this is not fine art, just a convenient, practical shortcut.

Once the base backup is complete, we can move ahead and write a simple recovery.conf file for synchronous replication:

imac:slavehs$ cat recovery.conf
primary_conninfo = 'host=localhost
                    application_name=book_sample
                    port=5432'
standby_mode = on

The config file looks just as it did before. The only difference is that we have added application_name. Note that the application_name parameter must be identical to the synchronous_standby_names setting on the master.

Once recovery.conf is in place, we can start the slave. In our example, the slave and the master run on the same server. In this case, you have to make sure that the two instances use different TCP ports; otherwise, the instance started second will fail to start. The port can easily be changed in postgresql.conf.
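For example, if the master already listens on the default port, the slave's postgresql.conf could be changed like this (5433 is just an example of a free port):

```
port = 5433   # any free TCP port different from the master's
```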

After these steps, the database instance is ready to start. The slave will check its connection information and connect to the master. Once it has replayed all relevant transaction logs, it will be in sync; from then on, the master and the slave hold exactly the same data.

5.1.4 Checking for replication

Now that we have started the database instances, we can connect to the system and check whether replication is working properly.

To check replication, we can connect to the master and take a look at pg_stat_replication. For this check, we can connect to any database in our (master) instance.

postgres=# \x
Expanded display is on.
postgres=# SELECT * FROM pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 62871
usesysid         | 10
usename          | hs
application_name | book_sample
client_addr      | ::1
client_hostname  |
client_port      | 59235
backend_start    | 2013-03-29 14:53:52.352741+01
state            | streaming
sent_location    | 0/30001e8
write_location   | 0/30001e8
flush_location   | 0/30001e8
replay_location  | 0/30001e8
sync_priority    | 1
sync_state       | sync

This system view shows one row for each slave connected to your master.

[\x makes the output more readable. If you do not use \x to transform the output, the lines will be very long and the content of the table can be quite hard to read. In expanded display mode, each column becomes a row.]

As you can see, the application_name parameter has been passed from the slave to the master as part of the connection string (book_sample in our example). Because the application_name parameter matches the master's synchronous_standby_names setting, the system replicates synchronously; no transaction can be lost anymore, because every transaction is present on two servers at commit time. The sync_state column tells you how the data moves from the master to the slave.

[You can also put a comma-separated list of application_names, or simply a *, into synchronous_standby_names to indicate that the first matching slave has to be synchronous.]
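To illustrate, the master's postgresql.conf could look like either of the following; the standby names are made up for this sketch:

```
# the first of these standbys that is running becomes the synchronous one
synchronous_standby_names = 'slave1, slave2, slave3'

# or: let any connected standby qualify as the synchronous one
# synchronous_standby_names = '*'
```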

5.1.5 Understanding Performance Issues

At various points in this book we have pointed out that synchronous replication is expensive. Keep in mind that we have to wait for a remote server, not just the local system, and the network between the two nodes adds latency of its own. Writing to more than one node is always more expensive than writing to just one. Therefore, we must keep an eye on speed, or we may face some nasty surprises.

[Think back to the CAP theorem covered earlier in this book; given the hard physical constraints involved, the performance impact of synchronous replication is quite obvious.]

The main question you really have to ask yourself is: do you really want to replicate all transactions synchronously? In many cases, you do not. To prove the point, let's imagine a typical scenario: a bank wants to store accounting-related data as well as some log data. We never want to lose millions just because a database node fails; this kind of data may well be worth the cost of synchronous replication. The log data, however, is completely different: the price of synchronous replication is too high for it, so we want to replicate it asynchronously to ensure maximum throughput.

How can we configure a system to handle important and unimportant transactions differently? The answer lies in a variable you have already seen earlier in this book: synchronous_commit.
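The bank scenario could then be handled per transaction. The following sketch assumes hypothetical t_log and t_account tables and a setup that replicates synchronously by default:

```sql
-- cheap log data: skip the synchronous wait for this transaction only
BEGIN;
SET LOCAL synchronous_commit TO off;
INSERT INTO t_log (message) VALUES ('user logged in');
COMMIT;

-- accounting data: rely on the default (synchronous_commit = on)
BEGIN;
UPDATE t_account SET balance = balance - 100 WHERE id = 1;
COMMIT;
```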

5.1.6 Setting synchronous_commit to on

In the default PostgreSQL configuration, synchronous_commit is set to on.

In this case, the commit waits until the current synchronous standby indicates that it has received the transaction's commit record and flushed it to disk. In other words, both servers must report that the data has been written safely. Unless both servers crash at the very same time, your data will survive potential problems, and a simultaneous crash of both servers is highly unlikely.

Setting synchronous_commit to remote_write

Flushing to disk on two nodes is expensive. In many cases, it is sufficient to know that the remote server has received the XLOG and passed it on to the slave's operating system, without flushing it to disk. As we can be reasonably sure that we will not lose both servers at the same time, this is a sensible compromise between performance and data protection.

Setting synchronous_commit to off

We have already dealt with this setting in previous chapters. The idea is to delay the WAL write in order to reduce disk flushes. It can be used if performance is more important than durability. In the context of replication, it means that we are not replicating in a fully synchronous way.

Keep in mind that this can have a serious impact on your application. Imagine that a transaction commits on the master and you want to query the data on the slave immediately afterwards. There is still a small window during which you may actually get outdated data.

Setting synchronous_commit to local

local flushes locally without waiting for a response from the replica. In other words, it turns your transaction into an asynchronous one. Setting synchronous_commit to local also opens a time window in which the slave can return slightly stale data. Keep this behavior in mind when you decide to offload reads to the slave. In short, if you want to replicate synchronously, make sure that synchronous_commit is set to either on or remote_write.
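As a quick reference for the levels discussed in this section:

```
# synchronous_commit levels, strongest to weakest:
#   on           - wait until the synchronous standby has flushed
#                  the commit record to disk
#   remote_write - wait until the standby has passed the record
#                  to its operating system (no remote flush)
#   local        - wait only for the local flush; replication
#                  becomes asynchronous
#   off          - do not even wait for the local flush
```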

Changing durability settings while the system is running

It is easy to change the way data is replicated while the system is running. In this chapter, we have adjusted the synchronous_standby_names (master) and application_name (slave) parameters to build a fully synchronous replication infrastructure. The nice thing about PostgreSQL is that you can change your durability requirements on the fly, while the system is running:

test=# BEGIN;
BEGIN
test=# CREATE TABLE t_test (id int4);
CREATE TABLE
test=# SET synchronous_commit TO local;
SET
test=# \x
Expanded display is on.
test=# SELECT * FROM pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 62871
usesysid         | 10
usename          | hs
application_name | book_sample
client_addr      | ::1
client_hostname  |
client_port      | 59235
backend_start    | 2013-03-29 14:53:52.352741+01
state            | streaming
sent_location    | 0/3026258
write_location   | 0/3026258
flush_location   | 0/3026258
replay_location  | 0/3026258
sync_priority    | 1
sync_state       | sync
test=# COMMIT;
COMMIT

In this case, we changed the durability requirements on the fly, while the system was running. This ensures that this particular transaction does not wait for the slave to flush to disk. Note that, as you can see, sync_state has not changed. Don't be misled by what you see here; you can fully rely on the behavior outlined in this section. PostgreSQL is perfectly able to handle each transaction separately. This is a unique feature of this wonderful open source database: it gives you the power to decide, per transaction, which durability requirements apply.
