Multi-version concurrency control: PostgreSQL vs InnoDB

Source: Internet
Author: User
Tags mysql manual
Multi-version concurrency control technology is used by many database or storage engines, such as Oracle, ms SQL Server 2005 +, PostgreSQL, Firebird, InnoDB, Falcon, pbxt, Maria and so on. The new database storage engine uses multi-version instead of single-version locking to implement concurrency control without exception. It can be said that multi-version has become a development trend in the future.

Although multiple versions are available, the implementations of different systems are quite different. The multi-version Implementation of PostgreSQL and InnoDB, the two most prestigious systems in the open-source database field, is a world of difference.

I. Multi-version Implementation of postgresql (based on version 8.4.1)
PostgreSQL uses the storage structure of heap + B + tree indexes (ignoring the r tree, hash, GIST, and other infrequently used indexes). The storage mode of heap and index is different.

Stack records contain versionized information. PostgreSQL stores the latest or old versions of the records in the stack. Simply put, every record in the heap records the t_xmin and t_xmax attributes, respectively indicating the transaction ID for creating and deleting this version, and the t_ctid attribute, indicates the Rid Of the next updated version of the record, that is, multiple versions of the record constitute a one-way linked list from the oldest to the latest (see heaptupleheaderdata structure ). When you delete a record, you set t_xmax instead of deleting the record. When you update a record, you do not directly update the record. Instead, you insert a new version to the original version, set t_xmax to the current transaction ID and set t_ctid to point to the new version.

This information is not enough. To determine the version visibility, two more things are required: transaction commit logs and transaction snapshots. The transaction commit log uses two bits for each transaction, recording whether the transaction is active, committed, or rolled back. Transaction snapshots are allocated at the beginning of the transaction. The most important information is the list of currently active transactions (see the snapshotdata structure ).

With these features, the system can determine whether a version is visible. The judgment process is complicated, but in simple terms, the system first checks whether t_xmin is in the global active transaction list, whether it is in the transaction snapshot active transaction list, and whether the transaction is committed or rolled back Based on the transaction commit log. the transaction has been committed at the beginning; then, use a similar method to determine whether t_xmax has been committed at the beginning of the transaction. If t_xmin is not committed at the beginning of the transaction, it is invisible; If t_xmin has been committed at the beginning of the transaction and t_xmax has not, it is visible; if t_xmin and t_xmax are committed at the beginning of the transaction, they are invisible. (For detailed procedures, see functions such as heaptuplesatisfiesmvcc, transactioniddidcommit, and xidinmvccsnapshot ).

The index does not contain version information. Generally, all versions of the record have corresponding index items in the index. For example, if a table has three indexes and a record is updated, not only will a new version be inserted into the heap, but the index items corresponding to the new version also need to be inserted into the three indexes, even if this update may not update the attributes of some indexes (see the execupdate function ). The hot (heap-only-tuple) technology is introduced in PostgreSQL 8.3. If the new and old versions are on the same page and update does not update any index attributes, the corresponding index items of the new version are not inserted.

Because the index does not have version information, even if all the attributes required for the query exist in the index, You need to retrieve the corresponding records from the heap to determine whether the query is visible (see the index_getnext function ).

The transaction commit or rollback operation is simple. In addition to writing a transaction during the transaction commit, you only need to update the transaction status in the transaction commit log. That is to say, during rollback, the operations performed by the transaction do not need to be cleared physically. As long as the transaction status is set to rollback, the version generated by the transaction is naturally invisible to other transactions.

Old versions that are no longer needed are cleared by vacuum. Because the old and new versions are mixed together, vacuum requires scanning of all data. Version 8.4 introduces the visibility map technology to skip pages that certainly do not contain older versions in vacuum. However, if the system is updated frequently and discretely, this technology is useless. Online vacuum can only clean up old versions on the page, but it cannot reduce the space occupied by the table, which actually generates fragments. Vacuum will lock the table when the tablespace is to be reduced, and the table cannot be updated during the period.

Ii. Multi-version Implementation of InnoDB (based on InnoDB in MySQL 5.1.33)
InnoDB uses indexes to organize the storage structure of the table. There is no heap, and records are stored in the primary key index. Other indexes are called secondary indexes. Each index item contains the primary key of the corresponding record. The storage format of primary key indexes and secondary indexes is also different.

The primary key index has versionized information, but different from PostgreSQL, the InnoDB primary key index generally only stores the latest version of the record, and the old version of the information is stored in the rollback segment in a centralized manner, multiple versions must be stored in the primary key index at the same time only when the primary key is updated. The primary key index record contains a 6-byte transaction ID and a 7-byte pointer to the old version of the rollback segment (see the MySQL Manual ). Delete is only marked but not deleted. Update the image locally and write it to the rollback segment.

There is a read view similar to the transaction snapshot in PostgreSQL, and it also records the list of active transactions at the beginning of the transaction (see the read_view_struct structure), but the transaction commit log in PostgreSQL is not required. Based on the transaction ID in the read view and record, you can determine whether a version has been committed at the beginning of the transaction, that is, whether it is visible. If the records stored in the primary key index are invisible, find the old information based on the pointer pointing to the old version in the rollback segment and create the old record. The rollback segment uses append-only log-type storage. The old version of the record is not a complete record, but only the front image of the updated attribute. The old version information in the rollback segment also contains the location of the old version, that is, the version linked list is from new to old.

Because no transaction log indicates whether the transaction is rolled back, you must clear the modifications made by the transaction during transaction rollback. The inserted records must be deleted, and the updated records must be updated (see the row_undo function ). No processing is required when the transaction is committed.

Each index item in the secondary index does not have versioning information. However, the maximum value of the transaction ID for the page operation is recorded in the page header. This value can be used to determine whether the page may contain invisible data. If yes, you need to access the primary key index to determine the visibility. Otherwise, you can directly obtain the required attributes from the index. Secondary indexes may store index items corresponding to multiple versions of a record. If the update operation updates the attributes of an index, it is similar to PostgreSQL, and a new index item is inserted into the secondary index, old indexes are not deleted. However, if the index is not updated by the update operation, you do not need to insert a new index.

The system uses a background thread to process rollback segments from time to time. When necessary, it clears old versions generated by the primary key update in the delete, secondary index, or primary key index, this process is called purge. If update does not update the index, the purge overhead is not involved.

Iii. My comments
The major difference between PostgreSQL and InnoDB is whether the latest and historical versions are separated from each other.

This design of postgresql was called no-overwrite by its original designer Mike Stonebraker, after several years of designing PostgreSQL, he wrote a retrospective paper titled the implementation of postgresql (PostgreSQL was called Postgres, stonebraker pointed out that the primary reason for this design was to look for a different storage mechanism from the wal model that was widely used at the time, which was somewhat innovative for innovation. This design has two benefits: one is that the transaction rollback does not require complex processing and is very fast; the other is that the previous historical data can be queried. Another possible benefit is that data can be used as a log, that is, when updating data, you only need to update the data. You do not need to write logs to describe the updates. However, to achieve this benefit, there must be a persistent storage medium with a random Write Performance similar to that of sequential write, because in order to ensure the durability after the transaction is committed, data updated by transactions must be written, which may be discrete. The wal system is different. When a transaction is committed, you only need to write logs, while logs are written in sequence. The current hardware environment is not like this, so PostgreSQL still needs to write logs, but it does not need to write undo logs, as long as redo logs are enough.

The latest PostgreSQL and Stonebraker design have been greatly improved. For example, the hot technology reduces the number of versions in the index, the visibility MAP Technology accelerates vacuum, and the record header structure is more compact. However, the no-overwrite design principle remains unchanged.

Compared with InnoDB, PostgreSQL has only one advantage: Transaction rollback can be completed immediately, regardless of the number of operations performed by the transaction. The function of querying historical data is not commonly used and is not useful in PostgreSQL.

The main disadvantage of PostgreSQL is:
1. The latest and historical versions are not separated from each other. As a result, more scans are required to clear old versions at a higher cost;
2. Update Is Not A local update and old versions need to be cleared. In contrast, InnoDB only needs to clear old record data during transaction rollback. Transaction rollback is rare;
3. As long as an index attribute is updated or the new version of the record is not on the same page as the original version, the new version of the index of all indexes will be inserted;
4. the space occupied by the heap cannot be recycled through the online vacuum. The online vacuum will generate many fragments (this is also caused by the use of the heap instead of the index organization table );
5. Because the index has no version information, coverage index scan cannot be implemented. That is, the query only scans the index and returns the required attributes directly from the index. In contrast, the transaction ID information of the most recently modified page in the header of the secondary index page in InnoDB can be coverage index scan in most cases. Coverage index scan is a frequently used optimization technique in applications. PostgreSQL does not support this restriction to improve system performance because index scanning is sequential access, accessing the heap is likely to become out-of-order access, and the performance may be a hundred times different;
6. Determine that version visibility is more complex and overhead is greater. When determining the visibility of PostgreSQL and InnoDB, you need to add operations to access the transaction commit log. Two bits are required for each transaction in the transaction commit log, which occupies a large space for systems with high update load, at this time, either the transaction commit log occupies a large amount of memory, or the visibility may produce additional Io. Compare the visibility judgment functions in PostgreSQL, heaptuplesatisfiesmvcc, and InnoDB, read_view_sees_trx_id. It is easy to see that the complexity of the two is different from that of the other two.

The main disadvantage of InnoDB is that all the modifications made by the transaction must be cleared during transaction rollback. Therefore, avoid using ultra-large transactions when using InnoDB. Otherwise, rollback may be extremely slow.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.