PostgreSQL vacuum Principle 1 Functions and Parameters

Source: Internet
Author: User

PostgreSQL vacuum Principle 1 Functions and Parameters

From the previous article "PostgreSQL MVCC source code implementation []", we know that PG does not store the old version of undo like Oracle, but directly stores the old version in the relation file. The problem is that there are too many dead tuple files, resulting in increasing relation files and space expansion. To solve this problem, PG introduced the vacuum background process to specifically clean up the dead tuple and reduce the space. Of course, vacuum is not only used here, but also has the following functions:

I. vacuum Functions

1. Reclaim Space

This is usually the easiest thing to remember. Reclaim space and clear dead tuple. However, the allocated space is generally not released. Except for vacuum full, exclusive lock is required. Generally, it is not recommended because vacuum full is often used if the table will eventually rise to this high level. Generally, it is enough to set the vacuum parameter properly.

2. Freeze tuple's xid

PG stores the xmin and xmax information in the header of each record (tuple) (add, delete, and modify the transaction ID ). The maximum value of transactionID is 2 32 times, that is, it is expressed by no integer character. When transactionID exceeds the maximum value, it is recycled.

This brings about a problem: the transactionID of the latest transaction is smaller than the transactionID of the old transaction. If this happens, PG cannot identify the transaction sequence by transactionID or implement MVCC. Therefore, PG uses the vacuum background process to trigger the vacuum action according to a certain cycle and algorithm, and freeze the transaction ID in the header of the old tuple. Freeze the transaction ID, which is set to "2" ("0" indicates invalid transaction ID; "1" indicates bootstrap, that is, initialization; "3" indicates the smallest transaction ID ). PG considers the frozen transaction ID to be older than any other transaction. In this way, the above situation will not occur.

3. update statistics

Vacuum analyze updates statistics so that the PG planner can calculate a more accurate execution plan. The autovacuum_analyze_threshold and autovacuum_analyze_scale_factor parameters can control the trigger frequency of analyze.

4. Update visibility map

In PG, a visibility map is used to mark pages that do not have dead tuple. There are two advantages: one is that when vacuum performs scan, you can directly skip these pages. Second, when performing index-only scan, you can first check the visibility map. In this way, the visibility judgment of fetch tuple is reduced, IO operations are reduced, and performance is improved. In addition, the visibility map is much smaller than the entire relation and can be cached into the memory.

Ii. vacuum parameter Introduction

Autovacuum has many parameters to control its behavior, which are roughly as follows:
• Autovacuum: The default value is on, indicating whether to enable autovacuum. It is enabled by default. In particular, PG also performs vacuum even though this value is off when xid needs to be frozen.
• Autovacuum_naptime: The next vacuum time. The default value is 1 min. This naptime will be allocated to each DB by vacuum launcher. Autovacuum_naptime/num of db.
• Log_autovacuum_min_duration: records the autovacuum action to the log file when the vacuum action exceeds this value. "-1" indicates no record. "0" indicates that records are recorded every time.
• Autovacuum_max_workers: the maximum number of workers running simultaneously, excluding the launcher itself.
• Autovacuum_vacuum_threshold: The default value is 50. Used with autovacuum_vacuum_scale_factor. The default value of autovacuum_vacuum_scale_factor is 20%. When the number of tuples for update and delete exceeds autovacuum_vacuum_scale_factor * table_size + autovacuum_vacuum_threshold, vacuum is performed. Change this value to a smaller value if vacuum is to work diligently.
• Autovacuum_analyze_threshold: The default value is 50. Used with autovacuum_analyze_scale_factor. autovacuum_analyze_scale_factor is 10% by default. When the number of tuples for update, insert, and delete exceeds autovacuum_analyze_scale_factor * table_size + autovacuum_analyze_threshold, perform analyze.
• Autovacuum_freeze_max_age and autovacuum_multixact_freeze_max_age: the first 200 million and the last 400 million. The maximum number of transactions that will be frozen by xid next time.
• Autovacuum_vacuum_cost_delay: if it is-1, take the value of vacuum_cost_delay.
• Autovacuum_vacuum_cost_limit: if it is-1, it is the cumulative value of vacuum_cost_limit.

Cost-Based vacuum parameters:
• Vacuum_cost_delay: calculates the maximum I/O that can be consumed per millisecond, vacuum_cost_limit/vacuum_cost_dely. The default value of vacuum_cost_delay is 20 ms.
• Vacuum_cost_page_hit: The cost of page hit in buffer during vacuum. The default value is 1.
• Vacuum_cost_page_miss: When vacuum is used, the page is not in the buffer. The default read cost is 10. Vacuum_cost_page_dirty: When vacuum is used, the clean page is modified. This indicates that extra IO is required to fl dirty blocks to the disk. The default value is 20.
• Vacuum_cost_limit: When this value is exceeded, vacuum sleep. The default value is 200.

Adjust the small point of each cost value above and increase the limit value to extend the time of vacuum each time. In this way, IO may be affected in a high-load system because of vacuum. However, the increase in the physical storage space of tables slows down.

In the next article, we will focus on source code implementation. See

------------------------------------ Lili split line ------------------------------------

Install PostgreSQL 6.3 on yum in CentOS 9.3

PostgreSQL cache details

Compiling PostgreSQL on Windows

Configuration and installation of LAPP (Linux + Apache + PostgreSQL + PHP) Environment in Ubuntu

Install and configure phppgAdmin on Ubuntu

Install PostgreSQL9.3 on CentOS

Configure a Streaming Replication cluster in PostgreSQL

How to install PostgreSQL 7/6 and phpPgAdmin in CentOS 5/6. 4

------------------------------------ Lili split line ------------------------------------

PostgreSQL details: click here
PostgreSQL: click here

This article permanently updates the link address:

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.