In the field of open source database, PostgreSQL is well-deserved for its superior performance, function and good stability, especially for high concurrency support. and superior performance and stability, it is nothing but a good infrastructure, this paper will have a good performance and stability of the business and locking mechanism to explore, and practical testing, to truly explain and prove its characteristics.
1. DDL that can be in a transaction
In PostgreSQL, DDL statements can be either committed or rolled back in a transaction, which, in practice, is of great practical significance and will also bring convenience and security to the work, as shown in:
2, MVCC
PostgreSQL, a good implementation of the MVCC function, PostgreSQL MVCC implementation of read-write non-blocking, but also explain the same data line write blocking, as shown in:
Session1: Starts a transaction, inserts a data into the transaction into the T1, but does not commit, and does not roll,
Session2: At this time, another session, connected to the same user and database, also insert into the same data row in T1, at this time, the transaction is blocked, expressed in the session is suspended,
Session1: In return to Session1, a commit command is issued at this time,
Session2: In return to Session2, at this point, because the same data in Session1 is successfully insert and commit because the T1 table C2 field has a primary key, the INSERT into Session2 transaction in T1 fails with an error,
3. MVCC realization Mechanism
Although the current relational library implements MVCC functionality, some of the underlying architecture and design factors lead to weaknesses in system stability and performance. In PostgreSQL, the implementation of this feature is very ingenious, which is one of the important factors that PostgreSQL claims to be free version of Oracle. Because of the MVCC function, the read and write of the same data row in PostgreSQL is not blocked, and writing blocks and waits. So, how does PostgreSQL implement the MVCC function? PostgreSQL, the MVCC function is implemented mainly through several hidden fields on the data rows, respectively: Xmin,amax,cmin,cmax, as shown in:
Session1: In this session, start a transaction, and in the transaction, update the fields C1=1 and the Delete field c1=2 two rows of data, each querying the data and the hidden fields of each data row in T1 respectively:
Session2: Open another session session2, and connect to the same database and users, after the delete, update, commit operation in Session1, query the table T1 data rows and several hidden column values, you can find: Session1, The data row before the update data row is filled with only the value of the Xmin hidden column, and the xmax of the old data row is filled in as the XID of the update transaction when the xid;update operation is then, and the new row of data is re-insert. And fill in the new data row xmin as the current transaction xid,session1 in the data is the new data row (Xmin is the current transaction XID), and because the Session1 in the update is not committed, Session2 found in the data behavior of old data rows ( Xmin the Xid,xmax for the first insert is the XID of the current transaction), the delete operation fills the xmax of the old data row as the XID of the current transaction, and when queried in Session1, the data row was not found because it was deleted. In Session2, because the delete operation in Session1 is not committed, you see the data row before the delete, but at this point the xmax of the old data row is filled in as XID of the transaction in Session1. After submission in Session1, the old data rows are no longer visible in the Session2, only the results after the update and delete are seen:
In this way, PostgreSQL cleverly implements the MVCC function through several hidden columns of the data row, while the same MVCC function, while Oracle achieves elegance, requires that the change vectors be taken to the undo segment to reconstruct the front image of the block, which may consume some of the system resources MySQL InnoDB, although it is not necessary to refactor the data block before the image, but also to the undo space to get the old data rows; DB2 and SQL Server libraries, although similar to MVCC functionality, but DB2, through the redo information to reconstruct the data before the image, This may cause the redo to become overheated and congested, and SQL Server puts the pre-image information in the temp database. While the MVCC functionality in PostgreSQL is ingenious and does not consume too much system resources in the process of fetching old data rows, there are situations in which data segments exist in different versions of the data rows, and when the old data is too long, it can lead to performance problems, and when the old data is cleaned up, can also cause jitter or bumps in system load and performance.
4, the free version of Oracle well-deserved
PostgreSQL In addition to the row lock implementation mechanism, the transaction level and Oracle are very similar, although support four standard transaction level settings, but PostgreSQL actually only support two transaction levels: Read Committed and Serializable. In addition, PostgreSQL also supports complete constraints, indexing, multi-language functions and process coding implementations, and for process models, which can be installed on almost every popular platform, these features and functions can be said to be the free version of Oracle well deserved.
Analysis of PostgreSQL database transaction and ROW lock features