PostgreSQL cluster solutions: a first look at hot standby (5) -- xlog records explained in detail


The previous article covered the xlog header. This one explains the record part in detail. I hope the two articles together help anyone studying the PostgreSQL xlog.


From: http://blog.csdn.net/lengzijian/article/details/7840332


First, let's look at the structure of an xlog record.

XLogRecord holds the control information of an xlog record. The data structure is as follows:

typedef struct XLogRecord
{
    pg_crc32      xl_crc;      /* CRC checksum of this record */
    XLogRecPtr    xl_prev;     /* pointer to the previous record in the log */
    TransactionId xl_xid;      /* transaction ID */
    uint32        xl_tot_len;  /* total length of the entire record */
    uint32        xl_len;      /* length of the resource manager (rmgr) data */
    uint8         xl_info;     /* information flag bits */
    RmgrId        xl_rmid;     /* resource manager ID; typedef uint8 RmgrId */
} XLogRecord;

The resource manager ID is central to the log system. The database classifies the data it needs to log and assigns each class a resource manager ID. When the system replays or reads log records, the ID tells it at once which kind of data a record belongs to. Combined with the information flag (xl_info), it also tells you which operation the database performed on that data. There are 16 resource managers in total (I do not yet know what several of them do):

#define RM_XLOG_ID       0   /* records checkpoint information */
#define RM_XACT_ID       1   /* records the commit or abort of a transaction */
#define RM_SMGR_ID       2
#define RM_CLOG_ID       3   /* initialization of a page in clog */
#define RM_DBASE_ID      4
#define RM_TBLSPC_ID     5
#define RM_MULTIXACT_ID  6
#define RM_RELMAP_ID     7
#define RM_STANDBY_ID    8
#define RM_HEAP2_ID      9
#define RM_HEAP_ID       10  /* records modifications to heap tuples */
#define RM_BTREE_ID      11  /* records modifications to a btree */
#define RM_HASH_ID       12
#define RM_GIN_ID        13
#define RM_GIST_ID       14
#define RM_SEQ_ID        15

Each resource manager uses the high four bits of the information flag (xl_info) to indicate the type of the log record, and the low four bits indicate whether the corresponding blocks were backed up. For the high four bits, the types include the following:

/* include/access/xact.h
 * XLOG allows to store some information in high 4 bits of log
 * record xl_info field
 */
#define XLOG_XACT_COMMIT          0x00 /* transaction commit */
#define XLOG_XACT_PREPARE         0x10 /* prepare */
#define XLOG_XACT_ABORT           0x20 /* transaction abort */
#define XLOG_XACT_COMMIT_PREPARED 0x30 /* commit of a prepared transaction */
#define XLOG_XACT_ABORT_PREPARED  0x40 /* abort of a prepared transaction */
#define XLOG_XACT_ASSIGNMENT      0x50 /* unknown for now (to be filled in later) */

 

 

/* include/access/htup.h
 * WAL record definitions for heapam.c's WAL operations
 * XLOG allows to store some information in high 4 bits of log
 * record xl_info field.  We use 3 for opcode and one for init bit.
 */
#define XLOG_HEAP_INSERT 0x00 /* insert a tuple */
#define XLOG_HEAP_DELETE 0x10 /* delete a tuple */
#define XLOG_HEAP_UPDATE 0x20 /* update a tuple */

 

The heap opcodes overlap with the transaction opcodes listed before them (both start at 0x00). This is why, as mentioned earlier, you must look at the xl_rmid field first: it determines which resource manager the record belongs to, and only then can xl_info be interpreted.
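The two-step lookup described here can be sketched in C. This is only an illustration, not code from PostgreSQL or xlogdump: the function name describe_record is hypothetical, and it handles only the constants quoted in this article. For heap records it masks with 0x70, following the header comment "3 for opcode and one for init bit".

```c
#include <stdint.h>

/* Constants copied from the listings in this article. */
#define RM_XACT_ID 1
#define RM_HEAP_ID 10

#define XLOG_XACT_COMMIT  0x00
#define XLOG_XACT_PREPARE 0x10
#define XLOG_XACT_ABORT   0x20

#define XLOG_HEAP_INSERT 0x00
#define XLOG_HEAP_DELETE 0x10
#define XLOG_HEAP_UPDATE 0x20

/* Heap uses 3 opcode bits plus 1 init bit, so the opcode mask is 0x70. */
#define HEAP_OPMASK 0x70

/*
 * Hypothetical helper: name the operation for a (xl_rmid, xl_info) pair.
 * xl_rmid is checked first because opcode values repeat across managers.
 */
const char *describe_record(uint8_t xl_rmid, uint8_t xl_info)
{
    if (xl_rmid == RM_XACT_ID) {
        switch (xl_info & 0xF0) {          /* high four bits */
        case XLOG_XACT_COMMIT:  return "xact commit";
        case XLOG_XACT_PREPARE: return "xact prepare";
        case XLOG_XACT_ABORT:   return "xact abort";
        default:                return "xact (other)";
        }
    }
    if (xl_rmid == RM_HEAP_ID) {
        switch (xl_info & HEAP_OPMASK) {   /* ignore the init bit */
        case XLOG_HEAP_INSERT: return "heap insert";
        case XLOG_HEAP_DELETE: return "heap delete";
        case XLOG_HEAP_UPDATE: return "heap update";
        default:               return "heap (other)";
        }
    }
    return "unknown resource manager";
}
```

With the same xl_info value 0x00, describe_record(1, 0x00) names a transaction commit while describe_record(10, 0x00) names a heap insert, which is exactly the ambiguity that xl_rmid resolves.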

Only three of the low four bits are used, as the following code shows (the lowest bit is unused):

/* include/access/xlog.h
 * If we backed up any disk blocks with the XLOG record, we use flag bits in
 * xl_info to signal it.  We support backup of up to 3 disk blocks per XLOG
 * record.
 */
#define XLR_BKP_BLOCK_MASK      0x0E /* all info bits used for bkp blocks */
#define XLR_MAX_BKP_BLOCKS      3
#define XLR_SET_BKP_BLOCK(iblk) (0x08 >> (iblk))
#define XLR_BKP_BLOCK_1         XLR_SET_BKP_BLOCK(0) /* 0x08 */
#define XLR_BKP_BLOCK_2         XLR_SET_BKP_BLOCK(1) /* 0x04 */
#define XLR_BKP_BLOCK_3         XLR_SET_BKP_BLOCK(2) /* 0x02 */
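As a small sketch of how these flag bits can be tested, the C fragment below reuses the macros from the listing. The two helper functions are hypothetical names introduced for illustration only.

```c
#include <stdint.h>

/* Macros as quoted from include/access/xlog.h in this article. */
#define XLR_BKP_BLOCK_MASK      0x0E
#define XLR_MAX_BKP_BLOCKS      3
#define XLR_SET_BKP_BLOCK(iblk) (0x08 >> (iblk))

/* Hypothetical helper: nonzero if backup block i (0-based) is flagged. */
int has_backup_block(uint8_t xl_info, int i)
{
    return (xl_info & XLR_SET_BKP_BLOCK(i)) != 0;
}

/* Hypothetical helper: number of backup blocks flagged in xl_info. */
int count_backup_blocks(uint8_t xl_info)
{
    int n = 0;

    for (int i = 0; i < XLR_MAX_BKP_BLOCKS; i++) {
        if (has_backup_block(xl_info, i))
            n++;
    }
    return n;
}
```

For an xl_info value of 0x09, only the 0x08 bit falls inside the mask, so count_backup_blocks reports one backup block and has_backup_block(0x09, 0) is true.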

 

Log record data:

The rmgr data to be written by XLogInsert() is described by a chain of one or more XLogRecData structs. Several structs are used in two situations: 1. the parts of the source data are not physically adjacent in memory; 2. several associated buffers need to be specified.

 

If the buffer is valid, xlog checks whether the buffer must be backed up (that is, whether this is the first change to that page since the last checkpoint). If so, the content of the whole page is attached to the xlog record, and xlog sets the corresponding XLR_BKP_BLOCK_X bit in xl_info. Note: when the buffer is backed up, the data pointed to by this XLogRecData struct is not inserted into the xlog record, because it is assumed to be present in the buffer already. The rmgr redo routines must therefore check the XLR_BKP_BLOCK_X bits to know what is actually stored in the xlog record.

If the buffer is valid, the caller must set buffer_std to indicate whether the page uses the standard pd_lower/pd_upper header fields.
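The effect of this rule on the record length can be sketched with a simplified model. The code below is not XLogInsert(): Buffer is a stand-in integer type, rmgr_data_length is a hypothetical helper, and the model assumes at most one buffer has been backed up. It only shows that chain entries tied to a backed-up buffer contribute nothing to the rmgr data.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int Buffer;          /* stand-in for PostgreSQL's Buffer type */
#define InvalidBuffer 0

/* Same fields as the XLogRecData struct discussed in this article. */
typedef struct XLogRecData
{
    char               *data;        /* start of rmgr data */
    uint32_t            len;         /* length of rmgr data */
    Buffer              buffer;      /* buffer associated with the data, if any */
    bool                buffer_std;  /* page uses standard pd_lower/pd_upper */
    struct XLogRecData *next;        /* next struct in the chain, or NULL */
} XLogRecData;

/*
 * Hypothetical helper: sum the rmgr data that would be copied into the
 * record.  Entries whose buffer equals backed_up_buffer are skipped,
 * because their content is assumed to be covered by the full-page image.
 * Pass InvalidBuffer when no buffer was backed up.
 */
uint32_t rmgr_data_length(const XLogRecData *rdata, Buffer backed_up_buffer)
{
    uint32_t total = 0;

    for (const XLogRecData *rd = rdata; rd != NULL; rd = rd->next) {
        if (rd->buffer != InvalidBuffer && rd->buffer == backed_up_buffer)
            continue;               /* data lives in the page image instead */
        total += rd->len;
    }
    return total;
}
```

In this model, a chain made of a fixed header with no buffer followed by tuple data tied to a buffer shrinks to just the header length once that buffer is backed up.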

 

The data for a log record is passed in through the XLogRecData structure, shown below:

typedef struct XLogRecData
{
    char       *data;         /* resource manager data */
    uint32      len;          /* length of the resource manager data */
    Buffer      buffer;       /* buffer associated with the data, if any */
    bool        buffer_std;   /* buffer uses the standard pd_lower/pd_upper layout */
    struct XLogRecData *next; /* pointer to the next node in the chain */
} XLogRecData;

All operation data goes through this structure. However, after reading the xlogdump source code, I found that the XLogRecData struct is not needed to read an xlog record. For example, to read an insert operation you only need to interpret the record data as an xl_heap_insert struct.

 

For example, after an insert is executed, xlogdump prints the following (update, delete, and the transaction operations commit and abort are read in the same way, since all of them modify data):

Insert: 2 row(s) found in the table 't_user'. // the table "t_user" has two columns

Insert: column 0, name userid, type 1043, value '20160301' // the column name, type, and value are all extracted from the record data; see the xlogdump source code for details

Insert: column 1, name, type 1043, value 'lengzijian'

 

The header of each backed-up data block in an xlog record is stored in a BkpBlock. The data structure is as follows:

typedef struct BkpBlock
{
    RelFileNode node;        /* relation that contains the block */
    ForkNumber  fork;        /* fork within the relation */
    BlockNumber block;       /* block number */
    uint16      hole_offset; /* offset of the "hole" */
    uint16      hole_length; /* length of the "hole" */
    /* the actual block data follows the struct */
} BkpBlock;
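The "hole" is the run of bytes left out of the stored page image, so the image that follows the header is shorter than a full page. The sketch below shows how a full page could be rebuilt from such an image. It is an illustration under stated assumptions: BLCKSZ is taken to be the default 8192 bytes, and stored_page_bytes and restore_page are hypothetical helper names, not PostgreSQL functions.

```c
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192   /* assumed default PostgreSQL page size */

/* Number of page bytes actually stored after the BkpBlock header. */
uint32_t stored_page_bytes(uint16_t hole_length)
{
    return BLCKSZ - hole_length;
}

/*
 * Hypothetical helper: rebuild a full page from the stored image by
 * putting hole_length zero bytes back at hole_offset.
 * page must have room for BLCKSZ bytes; stored holds
 * stored_page_bytes(hole_length) bytes.
 */
void restore_page(char *page, const char *stored,
                  uint16_t hole_offset, uint16_t hole_length)
{
    if (hole_length == 0) {
        memcpy(page, stored, BLCKSZ);      /* no hole: plain copy */
        return;
    }
    memcpy(page, stored, hole_offset);                    /* part before the hole */
    memset(page + hole_offset, 0, hole_length);           /* the hole itself */
    memcpy(page + hole_offset + hole_length,              /* part after the hole */
           stored + hole_offset,
           BLCKSZ - (hole_offset + hole_length));
}
```

With a hole offset of 268 and a hole length of 5484, as in the sample xlogdump output in this article, only 8192 - 5484 = 2708 bytes of the page would be stored under the 8192-byte assumption.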

 

While using xlogdump I ran into a problem: sometimes the statement for an update or insert is not printed. For example:

Lengzijian ----------- record-> xl_len [21]/sizeofheapupdate: [28]/sizeofheapheader: [5]

// Here I printed the values used in its check: computed as an unsigned type, (21 - 28 - 5) wraps around and is larger than MaxHeapTupleSize,

// so xlogdump gives up and no statement is printed.

[Cur: 0/4c6f0960, Xid: 5215584, rmid: 10 (HEAP), Len/tot_len: 21/2781, Info: 9, PREV: 0/4c6f0910] insert: s/D/R: pg_default/lengzijian/t_user BLK/off: 758/61 header: None

 

[Cur: 0/4c6f0960, Xid: 5215584, rmid: 10 (HEAP), Len/tot_len: 21/2781, Info: 9, PREV: 0/4c6f0910] bkpblock [1]: s/D/R: FIG/lengzijian/t_user BLK: 758 hole_off/Len: 268/5484

// As analysed above, the data has been backed up into bkpblock [1], which the log output confirms. The fact that xlogdump goes on to print bkpblock shows that a backup-block bit in the low four bits of xl_info is set.
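The wraparound behind the failed check can be reproduced in a few lines of C. This is a sketch of the arithmetic only, not xlogdump's code: the function names are hypothetical, and the MaxHeapTupleSize value of 8160 is an assumption for 8 kB pages.

```c
#include <stdint.h>

/* Assumed value of MaxHeapTupleSize for 8 kB pages (illustration only). */
#define MAX_HEAP_TUPLE_SIZE 8160u

/*
 * Hypothetical helper: tuple length derived from the record length.
 * All operands are unsigned, so a result below zero wraps around.
 */
uint32_t tuple_len(uint32_t xl_len, uint32_t fixed_size, uint32_t header_size)
{
    return xl_len - fixed_size - header_size;
}

/* Hypothetical helper: nonzero if the length fails the size check. */
int exceeds_max_tuple_size(uint32_t len)
{
    return len > MAX_HEAP_TUPLE_SIZE;
}
```

With the values from the debug line, tuple_len(21, 28, 5) evaluates to 4294967284 (that is, 2^32 - 12) rather than -12, so exceeds_max_tuple_size returns true and the tuple is not printed.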

 

The next article explains how to use the xlogdump tool and analyses some of its source code.

