Checkpoints can be a major drag on write-heavy PostgreSQL installations. The first step toward identifying issues in this area is to monitor how often they happen, which just got an easier-to-use interface added to the database recently.
Checkpoints are periodic maintenance operations the database performs to make sure that everything it's been caching in memory has been synchronized with the disk. The idea is that once you've finished one, you no longer need to worry about older entries placed into the write-ahead log of the database. That means less time to recover after a crash.
The problem with checkpoints is that they can be very intensive, because completing one requires writing every single bit of changed data in the database's buffer cache out to disk. A number of features were added in PostgreSQL 8.3 that allow you to better monitor the checkpoint overhead, and to lower it by spreading the activity over a longer period of time. I wrote a long article about those changes called "Checkpoints and the Background Writer" that goes over what changed, but it's pretty dry reading.
What you probably want to know is how to monitor checkpoints on your production system, and how to tell if they're happening too often. Even though things have improved, "checkpoint spikes" where disk I/O becomes really heavy are still possible even in current PostgreSQL versions. And it doesn't help that the default configuration is tuned for very low disk space usage and fast crash recovery rather than performance. The checkpoint_segments parameter, which is one input on how often a checkpoint happens, defaults to 3, which forces a checkpoint after only 48MB of writes.
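The 48MB figure is just the segment count multiplied by the WAL segment size. Here is a quick sketch of that arithmetic, assuming the standard 16MB segment size PostgreSQL is built with by default; the function name is made up for illustration and is not part of any PostgreSQL tool.

```python
# Assumption: WAL segments are 16 MB each (the default build-time size).
WAL_SEGMENT_MB = 16


def writes_before_forced_checkpoint_mb(checkpoint_segments: int) -> int:
    """Approximate WAL volume, in MB, that triggers a segment-driven checkpoint."""
    return checkpoint_segments * WAL_SEGMENT_MB


# The default setting discussed above.
print(writes_before_forced_checkpoint_mb(3))
# A hypothetical larger setting, to show how the threshold scales.
print(writes_before_forced_checkpoint_mb(32))
```

Raising checkpoint_segments scales this threshold linearly, at the cost of more disk space for WAL and a longer crash recovery.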
You can find out checkpoint frequency in two ways. You can turn on log_checkpoints and watch what happens in the logs. You can also use the pg_stat_bgwriter view, which gives a count of each of the two sources for checkpoints (time passing and writes occurring) as well as statistics on how much work they did.
The main problem with making this easier to do is that until recently, it's been impossible to reset the counters inside of pg_stat_bgwriter. That means you have to take a snapshot with a timestamp on it, wait a while, take another snapshot, then subtract all the values to derive any useful statistics from the data. That's a pain.
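To make the snapshot approach concrete, here is a minimal sketch of the subtraction step. It assumes each snapshot has already been fetched from pg_stat_bgwriter and stored as a dictionary with a taken_at timestamp; the dictionary layout, the function name, and the sample numbers are all invented for illustration.

```python
from datetime import datetime


def checkpoint_rate(first: dict, second: dict) -> dict:
    """Subtract two timestamped snapshots of the pg_stat_bgwriter counters."""
    elapsed = (second["taken_at"] - first["taken_at"]).total_seconds()
    if elapsed <= 0:
        raise ValueError("second snapshot must be taken after the first")
    timed = second["checkpoints_timed"] - first["checkpoints_timed"]
    req = second["checkpoints_req"] - first["checkpoints_req"]
    total = timed + req
    return {
        "checkpoints_timed": timed,
        "checkpoints_req": req,
        "total_checkpoints": total,
        # None when no checkpoint completed between the two snapshots.
        "minutes_between_checkpoints": (elapsed / total / 60) if total else None,
    }


# Hypothetical snapshots taken one hour apart.
before = {"taken_at": datetime(2010, 1, 1, 12, 0),
          "checkpoints_timed": 100, "checkpoints_req": 40}
later = {"taken_at": datetime(2010, 1, 1, 13, 0),
         "checkpoints_timed": 104, "checkpoints_req": 48}

result = checkpoint_rate(before, later)
print(result)
```

With these made-up numbers, 12 checkpoints in one hour works out to one every 5 minutes.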
Enough of a pain that I wrote a patch to make it easier. With the current development version of the database, you can now call pg_stat_reset_shared('bgwriter') and pop all these values back to 0 again. This allows following a practice that used to be common on PostgreSQL. Before 8.3, there was a parameter named stats_reset_on_server_start you could turn on. That reset all of the server's internal statistics each time you started it. That meant you could call the handy pg_postmaster_start_time() function, compare with the current time, and always have an accurate count in terms of operations/second of any statistic available on the system.
It's still not automatic, but now that resetting these shared pieces is possible you can do it yourself. The first key is to integrate statistics clearing into your server startup sequence. A script like this would work:
pg_ctl start -l $PGLOG -w
psql -c "select pg_stat_reset();"
psql -c "select pg_stat_reset_shared('bgwriter');"
Note the "-w" on the start command there; that makes pg_ctl wait until the server has finished starting before it returns, which is vital if you want to immediately execute a statement against it.
If you've done so, and your server start time is essentially the same as when the background writer stats started collection, you can now use this fun query:
SELECT
  total_checkpoints,
  seconds_since_start / total_checkpoints / 60 AS minutes_between_checkpoints
FROM
  (SELECT
    EXTRACT(EPOCH FROM (now() - pg_postmaster_start_time())) AS seconds_since_start,
    (checkpoints_timed+checkpoints_req) AS total_checkpoints
  FROM pg_stat_bgwriter
  ) AS sub;
And get a simple report of exactly how often checkpoints are happening on your system. The output looks like this:
total_checkpoints | 9
minutes_between_checkpoints | 3.82999310740741
What you do with this information is stare at the average time interval and see if it seems too fast. Normally, you'd want a checkpoint to happen no more often than every five minutes, and on a busy system you might need to push it to ten minutes or more to have a hope of keeping up. With this example, every 3.8 minutes is probably too fast; this is a system that needs checkpoint_segments to be higher.
Using this technique to measure the checkpoint interval lets you know if you need to increase the checkpoint_segments and checkpoint_timeout parameters in order to achieve that goal. You can compute the numbers manually right now, and once 9.0 ships it's something you can consider making completely automatic, so long as you don't mind your stats going away each time the server restarts.
There are some other interesting ways to analyze the data the background writer provides for you in pg_stat_bgwriter, but I'm not going to give away all of my tricks today.
Note:
1. The query above gives the average interval between checkpoints. Of course, the accumulated history has to be cleared before the calculation. Running SELECT pg_stat_reset() on its own does not clear these counters; you need to execute SELECT pg_stat_reset_shared('bgwriter'), which sets the values in the pg_stat_bgwriter view back to 0 and updates the stats_reset timestamp.
2. Columns of the pg_stat_bgwriter view:
swrd=# \d pg_stat_bgwriter
          View "pg_catalog.pg_stat_bgwriter"
        Column         |           Type           | Modifiers
-----------------------+--------------------------+-----------
 checkpoints_timed     | bigint                   |
 checkpoints_req       | bigint                   |
 checkpoint_write_time | double precision         |
 checkpoint_sync_time  | double precision         |
 buffers_checkpoint    | bigint                   |
 buffers_clean         | bigint                   |
 maxwritten_clean      | bigint                   |
 buffers_backend       | bigint                   |
 buffers_backend_fsync | bigint                   |
 buffers_alloc         | bigint                   |
 stats_reset           | timestamp with time zone |
Here checkpoints_timed is the number of checkpoints triggered by checkpoint_timeout, and checkpoints_req is the number of checkpoints triggered by checkpoint_segments. Manually executing the CHECKPOINT command is also counted in the checkpoints_req field. Comparing the size of the two lets you decide whether to change the checkpoint_timeout or checkpoint_segments value.
The sum of the two fields is the total number of checkpoints, which can be combined with the buffers_checkpoint value to calculate the average number of buffers written per checkpoint.
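As a rough sketch of that calculation, the snippet below combines the three counters into a per-checkpoint average and the share of checkpoints that were requested rather than timed. The function name and the sample counter values are invented for illustration, and the conversion to megabytes assumes the default 8 kB block size.

```python
# Assumption: the server uses the default 8 kB block size.
BLOCK_SIZE_BYTES = 8192


def checkpoint_summary(checkpoints_timed, checkpoints_req, buffers_checkpoint):
    """Summarize pg_stat_bgwriter checkpoint counters; None if no checkpoints yet."""
    total = checkpoints_timed + checkpoints_req
    if total == 0:
        return None
    avg_buffers = buffers_checkpoint / total
    return {
        "total_checkpoints": total,
        # Share of checkpoints forced by WAL volume or a manual CHECKPOINT.
        "pct_requested": 100.0 * checkpoints_req / total,
        "avg_buffers_per_checkpoint": avg_buffers,
        "avg_mb_per_checkpoint": avg_buffers * BLOCK_SIZE_BYTES / (1024 * 1024),
    }


# Hypothetical counter values read from the view.
summary = checkpoint_summary(checkpoints_timed=6, checkpoints_req=2,
                             buffers_checkpoint=25600)
print(summary)
```

A high requested percentage suggests checkpoint_segments is the limiting setting; a low one suggests checkpoint_timeout is what triggers most checkpoints.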
3. SQL to calculate the interval between checkpoints:
SELECT
  total_checkpoints,
  seconds_since_start / total_checkpoints / 60 AS minutes_between_checkpoints
FROM
  (SELECT
    EXTRACT(EPOCH FROM (now() - pg_postmaster_start_time())) AS seconds_since_start,
    (checkpoints_timed+checkpoints_req) AS total_checkpoints
  FROM pg_stat_bgwriter
  ) AS sub;
EPOCH:
The Unix epoch (or Unix time, POSIX time, or Unix timestamp) is the number of seconds that have elapsed since January 1, 1970 (midnight UTC/GMT), not counting leap seconds (in ISO 8601: 1970-01-01T00:00:00Z). Literally speaking, the epoch is Unix time 0 (midnight 1-1-1970), but 'epoch' is often used as a synonym for 'Unix time'. Many Unix systems store epoch dates as a signed 32-bit integer, which might cause problems on January 19, 2038 (known as the Year 2038 problem or Y2038).
EXTRACT:
EXTRACT(field FROM source)
The extract function retrieves subfields such as year or hour from date/time values. source must be a value expression of type timestamp, time, or interval. (Expressions of type date are cast to timestamp and can therefore be used as well.) field is an identifier or string that selects what field to extract from the source value. The extract function returns values of type double precision.
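In the checkpoint query, EXTRACT(EPOCH FROM ...) is applied to an interval (now() minus the server start time), where it yields the total number of seconds in that interval. A rough Python analogue of the same arithmetic, using timedelta.total_seconds() in place of EXTRACT and made-up uptime and checkpoint numbers, looks like this:

```python
from datetime import datetime, timedelta

# Hypothetical values standing in for pg_postmaster_start_time() and now().
server_start = datetime(2010, 1, 1, 8, 0, 0)
current_time = server_start + timedelta(hours=1, minutes=30)

# Equivalent of EXTRACT(EPOCH FROM (now() - pg_postmaster_start_time())).
seconds_since_start = (current_time - server_start).total_seconds()

# Hypothetical checkpoints_timed + checkpoints_req.
total_checkpoints = 9

minutes_between_checkpoints = seconds_since_start / total_checkpoints / 60
print(seconds_since_start, minutes_between_checkpoints)
```

Here 5400 seconds of uptime divided across 9 checkpoints gives one checkpoint every 10 minutes.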
Reference:
http://blog.2ndquadrant.com/measuring_postgresql_checkpoin/
http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm
http://yao.iteye.com/blog/628941
http://www.postgresql.org/docs/9.4/static/functions-datetime.html
Measuring PostgreSQL Checkpoint Statistics