After performing your first instant recovery (pitr,point-in-time-recovery), we are ready to work on a real replication setup. In this chapter, you will learn how to set up asynchronous replication and flow. Our goal is to ensure that you can achieve higher availability and higher data security.
In this chapter, we will discuss the following topics:
• Configuring asynchronous replication
• Understanding Flow
• Merge streams and Archives
• Manage Timelines
At the end of this chapter, you will easily set up stream replication within minutes.
4.1 Setting up Stream replication
In the previous section, we have recovered from the simple 16MB xlog file. Logically, the replay process can only replay 16MB at a time. This can cause a delay in your replication settings, because you must wait until the master DB instance has finished creating 16MB. In many cases, this delay is unacceptable.
[Lost the last Xlog file, the file has not yet been finalized (and therefore does not send the archive, due to a crash loss), is often why people report the core cause of data loss in the case of instant recovery (pitr,point-in-time-recovery). ]
In this scenario, stream replication will resolve your issue. With stream replication, the latency of replication will be minimal and you can enjoy some extra level of protection for your data.
Let's talk about the overall architecture of the PostgreSQL streaming infrastructure. Shows the basic system design:
You have seen this type of schema. We have added a stream connection here. It is basically a normal database connection, just as you would use any other application's connection. The only difference is that, in the case of a stream connection, the connection is in a special mode that can carry xlog.
4.1.1 Adjusting the configuration file on the master server
The question now is: how do you make a stream connection exist? In the previous example, most of the infrastructure has been done. On master, the following settings must be set:
Wal_level must be set to Hot_standby
Max_wal_senders must be set to a reasonable higher value to support enough slaves
[Archive_mode and Archive_command how to set? Many people use stream replication to make their systems replicate more data to slave as much as possible. In addition, file-based replication is often used to ensure that there is an additional layer of security. Basically, the two mechanisms use the same technology, and only the Xlog source is different in stream-based and archive-based recovery. ]
Since Master knows that it should generate enough xlog to handle xlog send and so on, we can proceed to the next step.
For security reasons, you must configure master to stream replication connections. This needs to change pg_hba.conf as shown in the previous section. Second, this requires running pg_basebackup and subsequent stream connections. If you are using the traditional way to do the base backup, you also need to allow replication to connect to the streaming xlog, so this step is necessary.
Once your master has been successfully configured, you can restart the database (to make wal_level and max_wal_senders work) and continue to work on slave.
4.1.2 Handling Pg_basebackup and recovery.conf
By now, you have seen that process perform a normal instant recovery (pitr,point-in-time-recovery) with absolute consistency. The only difference currently is wal_level, which must be configured with different parameters for normal instant recovery (pitr,point-in-time-recovery). This is the same technique, no difference.
In order to get the base Backup library we use Pg_basebackup as shown in the previous section. Here is an example:
imac:dbhs$ pg_basebackup-d/target_directory \
-H sample.postgresql-support.de\
--xlog-method=stream
Now that we've made a basic backup, we can go and configure the stream. To do this, we have to write a file called recovery.conf (just like before). The following is a simple example:
Standby_mode = On
primary_conninfo= ' host=sample.postgresql-support.de port=5432 '
We have two new settings:
Standby_mode: This setting ensures that PostgreSQL will not stop once the xlog is exhausted. Instead, it waits for the new xlog to arrive. To ensure that this setting is required for the second server as standby, standby continues to replay Xlog.
Primary_conninfo: This setting tells us where the slave can find master. You must place a standard PostgreSQL connection string, (as in LIBPQ) Primary_conninfo as the core tells PostgreSQL to send Xlog.
For basic settings, these two settings are completely sufficient. Now all we have to do is start slave, just as you start a normal DB instance.
imac:slavehs$ pg_ctl-d. Start
Server starting
Log:database system was interrupted; Last known up
At 2013-03-17 21:08:39 CET
Log:creating Missing WAL Directory
"Pg_xlog/archive_status"
log:entering Standby mode
Log:streaming Replication Successfully connected
To Primary
Log:redo starts at 0/2000020
Log:consistent recovery state reached at 0/3000000
The DB instance has successfully started. It detects that the normal operation has been interrupted. Then it goes into Standby mode and starts transferring Xlog from the primary stream. PostgreSQL then arrives in a consistent state, and the system is ready to act.
4.1.3 Make slave readable
So far, we have only set the stream to lose. Slave has started to consume the transaction log from master, but it is not yet readable. If you try to connect to the instance, you will face the following scenario:
imac:slavehs$ psql-l
Fatal:the database system is starting up
Psql:FATAL:the database system is starting up
This is the default configuration. Slave instances are always backup mode and remain replayed xlog.
If you want slave to be readable, you must have the Postgresql.conf;hot_standby on the slave system set to ON. You can set this directly, but you can also change it later, simply restarting the slave instance when you need this feature.
imac:slavehs$ pg_ctl-d. Restart
Waiting for server-shut down ....
Log:received Smart Shutdown Request
Fatal:terminating Walreceiver process due to Administrator command
Log:shutting down
Log:database system is shut down
Done
Server stopped
Server starting
Log:database system was shut under recovery at 2013-03-17 21:56:12
Cet
log:entering Standby mode
Log:consistent recovery state reached at 0/3000578
Log:redo starts at 0/30004e0
Log:record with zero length at 0/3000578
Log:database system is ready for accept read Only connections
Log:streaming replication successfully connected to primary
Restarting will shut down the server and start it back up again. It's not too much of a surprise; however, it's worth looking at the logs. As you can see, a process called walreceiver is terminated.
Once we do a backup and run, we can connect to the server. Logically, we are only allowed to perform read-only operations:
test=# CREATE TABLE x (id int4);
Error:cannot Execute CREATE TABLE in a read-only transaction
As expected, the server will not accept writes. Remember, slave is read-only.
4.1.4 Underlying protocol
When using stream replication, you should be aware of two processes:
Wal_sender
Wal_receiver
Wal_sender instances are processes that provide xlog to their slave on the master instance called the Wal_receiver process. Each slave has a wal_receiver process, and this process is connected to a wal_sender process in the data source.
How does the whole thing work inside? As we said before, the connection from slave to master is basically a normal database connection. The transaction log takes the same approach as the Copy command. Inside copy mode, PostgreSQL transmits information back and forth using a micro-language. The main advantage is that this micro-language has its own parser, so it is possible to add functionality quickly and in a fairly easy, non-intrusive way. By the PostgreSQL9.2, the command is supported:
Identify_system
start_replication <position>
Base_backup
°°[label ' LABEL ']
°°[progress]
°°[fast]
°°[wal]
°°[nowait]
What you see is the protocol level provided with Pg_basebackup as the command line flag.
The fourth chapter of PostgreSQL replication set up asynchronous replication (1)