4.5 Making streaming replication more robust
When a slave connects to the master, the first thing it does is try to catch up with the master. But will this always work? We have already seen that we can use a mixed setup that combines stream-based and file-based replication. This gives us some extra safety in case the stream stops working.
In real-world scenarios, two ways of transporting the XLOG may be more complexity than you need. In many cases, streaming alone is sufficient. The crux of the problem is that in a normal setup, as already described, the master can discard XLOG as soon as it no longer needs it to repair itself. Depending on your checkpoint configuration, the XLOG may be around for quite a long time, or only for a short time. The trouble is that when your slave connects to the master, the XLOG it needs may no longer be there. In this case, the slave cannot resynchronize itself. You may find this somewhat annoying, because it implicitly limits the maximum downtime of your slave to your master's checkpoint behavior.
Obviously, this can lead to problems on a production system. To make your setup more robust, we recommend making heavy use of wal_keep_segments. The idea of this postgresql.conf setting is to make the master keep more XLOG files around than are theoretically needed. If you set the variable to 1000, the master will keep about 16 GB more XLOG than it needs (1,000 segments of 16 MB each, the default segment size). In other words, your slave can be gone for 16 GB longer than usual, measured in changes made on the master. This greatly increases the odds that a slave can rejoin the cluster without having to resynchronize itself completely from scratch. For a 500 MB database this is not worth mentioning, but if your setup has to hold hundreds of gigabytes or terabytes of data, it is a huge advantage. Producing a base backup of a 20 TB instance is a lengthy process; you probably don't want to do it too often, and you certainly don't want to do it over and over again.
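To make the arithmetic above concrete, here is a minimal Python sketch that estimates how much extra XLOG a given wal_keep_segments value retains and roughly how long a slave could stay away before the XLOG it needs is recycled. The function names and the write rate of 2,000 MB per hour are hypothetical and only serve as an illustration; the 16 MB segment size is the PostgreSQL default. Note that from PostgreSQL 13 onward the setting is called wal_keep_size and takes a size instead of a segment count.

```python
# Default size of one XLOG (WAL) segment in PostgreSQL, in megabytes.
SEGMENT_SIZE_MB = 16


def extra_xlog_mb(wal_keep_segments, segment_size_mb=SEGMENT_SIZE_MB):
    """Extra XLOG (in MB) the master keeps beyond what it needs itself."""
    return wal_keep_segments * segment_size_mb


def tolerated_downtime_hours(wal_keep_segments, xlog_mb_per_hour,
                             segment_size_mb=SEGMENT_SIZE_MB):
    """Rough estimate of how long a slave may be gone and still catch up.

    xlog_mb_per_hour is an assumed, measured XLOG generation rate on the
    master; it is not something PostgreSQL reports under this name.
    """
    if xlog_mb_per_hour <= 0:
        raise ValueError("xlog_mb_per_hour must be positive")
    return extra_xlog_mb(wal_keep_segments, segment_size_mb) / xlog_mb_per_hour


if __name__ == "__main__":
    # Corresponds to this line in postgresql.conf on the master:
    #   wal_keep_segments = 1000
    segments = 1000
    kept_mb = extra_xlog_mb(segments)
    print(f"wal_keep_segments = {segments} keeps about {kept_mb / 1000:.0f} GB extra XLOG")

    # Hypothetical write load of 2,000 MB of XLOG per hour.
    hours = tolerated_downtime_hours(segments, xlog_mb_per_hour=2000)
    print(f"At 2000 MB/hour the slave can be away for roughly {hours:.1f} hours")
```

The estimate is only as good as the assumed write rate: during bulk loads or heavy maintenance the master produces XLOG much faster, so the tolerated downtime shrinks accordingly.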
PostgreSQL Replication, Chapter 4: Setting Up Asynchronous Replication (5)