MySQL database engine, transaction isolation level, lock

Source: Internet
Author: User
Tags mutex mysql in table definition types of tables

MySQL database engine, transaction isolation level, lock
  • What is the difference between the database engine InnoDB and MyISAM

    • The general difference is that the MyISAM type does not support advanced processing such as transaction processing, and InnoDB type support. MyISAM types of tables emphasize performance, which performs more efficiently than InnoDB types, but does not support transactions, while InnoDB provides transactional support and advanced database functionality such as foreign keys.

    • The difference between specific implementations:

      1. InnoDB does not support indexes of fulltext types

      2. The exact number of rows in a table is not saved in InnoDB, that is, when querying SQL is executed, INNODB scans the entire table to calculate how many rows, and MyISAM simply reads the saved rows, but when the where condition is included, the operation of the two tables is the same

      3. For fields of type auto_increment, InnoDB must contain only the index of the field, but in the MyISAM table, you can establish a federated index with other fields

      4. When you execute a delete SQL, InnoDB does not reestablish the table, but deletes a row of rows

      5. The LOAD TABLE from master operation has no effect on InnoDB. The workaround is to change the InnoDB table to a MyISAM table, import the data, and then change it to a InnoDB table, but not for tables that use additional InnoDB features (such as foreign keys)

    • The difference in composition:

        1. Each myisam is stored on a disk divided into three files. The first file name starts with the name of the table, and the extension indicates the file type

          • . frm file storage table definition

          • . myd files as data files

          • . Myi files as index files

        2. Disk-based resources are data files for the InnoDB tablespace and its log files, and the InnoDB table size is limited only by the size of the operating system files, typically 2G

      • Differences in the handling of things:

        1. InnoDB supports things, MyISAM does not support things. For INNODB Each SQL statement is encapsulated as a thing by default, autocommit, which affects speed, so it's best to put multiple SQL statements between Begin and commit to make up a thing.

        2. InnoDB supports foreign keys, while MyISAM is not supported. Converting a InnoDB table with a foreign key to MyISAM will fail.

        3. InnoDB is a clustered index, and the data file is tied to the index and must have a primary key, which is highly efficient through the primary key index. However, the secondary index requires two queries, first querying the primary key, and then querying the data through the primary key, so the primary key should not be too large, because the primary key is too large and the other indexes are large. While MyISAM is a non-focused index, the data file is detached, and the index holds a pointer to the data file. The primary key index and the secondary index are independent.

        4. InnoDB does not save the exact number of rows in the table, and a full table scan is required to execute SELECT COUNT (*) from table. Instead, MyISAM uses a variable to hold the entire table's row count, and it is only necessary to read the variable when executing the above statement, which is fast.

        5. InnoDB does not support full-text indexing, and MyISAM does not support full-text indexing, MyISAM to query efficiency

        6. The MyISAM type of table emphasizes performance, which executes faster than the InnoDB type, but does not support things. InnoDB support things, external keys and other advanced database functions

        7. If you perform a large number of query select operations, MyISAM is a better choice

        8. If you perform a large number of insert or update operations, for performance reasons, you should use the InnoDB engine

        9. Performing a delete data operation does not work for InnoDB, and the workaround is to first change the InnoDB table to a MyISAM table, import the data to perform the operation and then change to the InnoDB table, but not for tables that use an additional InnoDB feature (such as a foreign key)

      • Operation on the Auto_increment

        1. MyISAM is automatically updated for INSERT and update operations, which makes the Auto_increment column faster (at least 10%). The value at the top of the sequence cannot be exploited after it has been deleted. (When the Auto_increment column is defined as the last column of a multicolumn index, the use of values removed from the top of the sequence can occur)

        2. The value of auto_increment can be reset with alter or MYISAMCH

        3. For fields of type auto_increment, InnoDB must contain only the index of the field, but in the MyISAM table, you can establish a federated index with other fields

      • mysql in Lock:

        1. Lock is a mechanism by which computers coordinate multiple processes or threads for concurrent access to a resource.

        2. mysql are divided into table and row locks: As the name implies, a table lock is a table lock, And a row lock is a lock on one line.

        3. table lock features : Low overhead, no deadlock, high probability of lock collisions, and lower concurrency.

          row lock features : large overhead, resulting in deadlock, the probability of a lock collision is low, and the concurrency is high.

        4. So the MyISAM engine uses a table lock, The InnoDB storage engine uses a row lock.

    • How to select the Database engine

      • Whether the business needs to support things, if you need to choose InnoDB, if you do not need to consider MyISAM

      • If the vast majority of the tables are simply query operations, consider MyISAM, and if read and write operations are frequent, use InnoDB

      • Need to consider a system crash, MyISAM recovery is more difficult, can accept

      • MySQL5.5 version started InnoDB has become the default engine for MySQL (previously MyISAM)

    • MySQL Field width

      • The display width of the integer value is specified in parentheses following the MySQL type keyword (for example, INT (11)). The optional display width specifies that the width is filled from the left when the value that is less than the specified width of the column is displayed. The display width does not limit the range of values that can be saved within the column, nor does it limit the display of values that exceed the specified width of the column. so int (1) and int (11) are no different by default!!!

      • When combined with optional extended attribute Zerofill, the default supplemental space is replaced with 0. For example, for a column declared as int (5) Zerofill, the value 4 is retrieved as 00004. Note that if you save a value that exceeds the display width in an integer column, you will encounter problems when MySQL generates temporary tables for complex joins, because in these cases MySQL believes that the data fits the original column width.

      • All integer types can have an optional (non-standard) attribute of unsigned. Unsigned values can be used when you want to allow only non-negative numbers in a column and the column requires a large upper range of values. If you set the Zerofill extended Properties test, the default is unsigned (UNSIGNED)

      • So the characters in parentheses after int (1) and int (11) represent the width of the display, the display width of the integer column and the number of characters required by MySQL to display the value of the columns, and the size of the storage space required by the integer, the data limit of the type of int can be stored or 2147483647 ( Signed) and 4294967295 (unsigned). In fact, when we choose the type of int, whether int (1) or int (11), it stores the length of 4 bytes in the database.

      • Int (m) Zerofill, plus Zerofill after M to show a difference, such as int (3) Zerofill, you insert into the database is 10, the actual insert is 010, that is, add a 0 in front. if int (3) and int (10) Without Zerofill, they are no different. M is not used to limit the range of values stored in an int column. The maximum and minimum values of int (M) are related to unsigned.

    • In general, the main difference between the two types is that InnoDB supports object handling and foreign key and row level locks. And MyISAM does not support it. So MyISAM tend to be considered only for use in small projects, and MyISAM is definitely the first choice for convenience and high scalability. The reasons are as follows:

      1, most of the projects on the platform are read and write less projects, and MyISAM reading performance is stronger than InnoDB.

    2, MyISAM index and data are separate, and the index is compressed, the memory usage of the corresponding improved a lot. Can load more indexes, and InnoDB is the index and the data is tightly bound, do not use compression which will cause innodb than MyISAM volume is large.

    3, often 1, 2 months will occur application developers accidentally update a table where the scope of the wrong, resulting in this table can not be normal use, this time the superiority of the MyISAM is reflected, casually from the day of the copy of the compressed package out of the corresponding table file, casually put into a database directory, Then dump into SQL and then back to the main library, and the corresponding binlog complement. If it's InnoDB, I'm afraid it can't be so fast, don't tell me to let InnoDB regularly back up with an export xxx.sql mechanism, because the minimum amount of data for a database instance is roughly dozens of g in size.

    4, from the application logic of contact, select COUNT (*) and order BY is the most frequent, probably can account for the entire SQL total statement of more than 60% of the operation, and this operation InnoDB actually will lock the table, many people think InnoDB is a row-level lock, That's just where the primary key is valid, and the non-primary key will lock the full table.

    5, there is often a lot of application departments need me to give them regular data on some tables, MyISAM words are very convenient, as long as they correspond to the list of the frm. myd,myi files, let them in the corresponding version of the database to start the line, and InnoDB need to export xxx.sql, because the light to other people's files, by the dictionary data file, the other side is not available.

    6, if and myisam than insert write operation, InnoDB also not up to MyISAM write performance, if is for index-based update operation, although MyISAM may be inferior innodb, but so high concurrency of write, from the library can chase is also a problem, It might as well be solved by a multi-instance sub-Library table architecture.

    7, if it is used MyISAM, the merge engine can greatly speed up the development of the application department, they just do some select count (*) operation on this merge table, it is very suitable for a large project total of about hundreds of millions of rows of a type (such as log, survey statistics) business table.

    8, of course, InnoDB is not absolutely not used, the project with the business of InnoDB. In addition, someone might say that you myisam not be able to write too much, but you can make up for it through architecture.

  • What are the characteristics of a thing:

    1. Atomic Nature

    2. Consistency

    3. Isolation of

    4. Durability

  • The relationship between the transaction isolation level and the lock in InnoDB

    • One block && two-segment lock

        • one block: Because there is a lot of concurrent access, in order to prevent deadlocks, it is recommended to use a blocking method in general applications. It is at the beginning of the method that you know beforehand what data will be used, then lock it all up and unlock it after the method has been run. This method can effectively avoid the cyclic deadlock. However, this approach does not work in the database because the database does not know what data will be used at the beginning of the transaction.

        • Two-stage lock

          The database follows a two-phase lock protocol that divides a transaction into two phases, locking the stage and unlocking the stage

          Although this approach does not prevent deadlocks, the two-phase lock protocol ensures that concurrent scheduling of transactions is serializable (serialization is important, especially in database recovery and backup)

          • lock-up phase: The lock operation can be performed at this stage. To request and obtain an S lock (shared lock) before any data is read, other transactions can continue to share the lock, but not lock it. To request and obtain an X lock (exclusive lock) before the write operation, no other transaction can acquire any more locks. If the lock is unsuccessful, the transaction goes into a wait state until Ga Cheng can continue to execute

          • Unlocking phase: After a transaction has released a blockade, the transaction enters the unlocked extreme, which can only be unlocked at that stage and no additional lock operations are allowed

    • Four isolation levels for transactions

      In the database operation, in order to effectively guarantee the correctness of the concurrent read data, the transaction isolation level is proposed. Database locks are also available to build these levels.

      • READ UNCOMMITTED: Allow dirty reads, which may read data that has not been committed in other sessions

      • Commit read: Only the data that has been committed can be read. Most databases, such as Oracle, are this level by default

      • REPEATABLE READ: Data can be read repeatedly. Queries within the same transaction are consistent at the beginning of a transaction. In the SQL standard, this isolation level eliminates non-repeatable reads, but there is also a phantom read

      • Serial read: Fully serialized read data. A table-level shared lock is required for each read, and both reads and writes block each other

    • The lock in MySQL

      1. There are many types of locks in MySQL, there are common table locks and row locks, as well as newly added metadata lock and so on.

      2. Table lock is a whole table lock, although can be divided into read lock and write lock, but after all, is to lock the entire table, will cause the concurrency decreased, is generally used to do DDL processing

      3. Row locks are locked data rows, this acceleration is more complex, but because only a limited amount of data locked, for other data is not locked, so the concurrency is strong, MySQL is generally using row locks to handle concurrent transactions, row locks can prevent different transaction version of the data when the data conflict caused by changes in the situation

      4. In the RC (read Committed = read submission) level, the data read is not locked, but the data write, modify, delete is required to lock the

      5. Since MySQL InnoDB is using RR level by default, you need to turn the session into an RC level and set the Binlog mode

      6. If a condition cannot be quickly filtered through the index, the storage engine plane will lock all records back and then filter by the MySQL server layer

      7. but in the actual use of the process, MySQL made some improvements in MySQL server filter conditions, When the discovery is not satisfied, the Unlock_row method is called to release the record that does not satisfy the condition (violates the constraint of the two-segment protocol). This ensures that only the locks that satisfy the condition record will be held at the end, but the lock operation for each record cannot be omitted. This also applies to MySQL's default isolation level RR. So for a large number of tables to do batch modification, if the corresponding index can not be used, MySQL Server filter data is particularly slow, it will appear that although some rows of data is not modified, but they are still locked in the phenomenon.

    • The difference between non-accented and phantom reads:

      1. repeattable Read (reread), which is the INNODB default isolation level in MySQL. The concept of re-stressing is that multiple instances of a transaction will see the same data row when the data is read concurrently. In the RR level of MySQL, the problem of Phantom Read was resolved

      2. Non-stressed emphasis is on update and delete, and the focus of phantom reading is insert

      3. If you use a lock mechanism to implement both isolation levels, in repeatable reads, the SQL is locked for the first time after it has been read to the data, and other transactions cannot read the data, enabling repeatable reads. However, this method cannot lock the insert data, so when transaction A has previously read the data or modified all the data, transaction B can still insert the data to commit, then transaction a will be inexplicably more than a previous data, this is the Phantom read, can not be avoided by row lock. Need to serializable isolation level, read lock, write with write lock, read lock and write lock mutex, this can effectively avoid phantom reading, non-repeatable read, dirty read and other problems, but will greatly reduce the concurrency of the database.

      4. The biggest difference between non-repeatable reads and Phantom reads, is how to solve their problems through the locking mechanism. The pessimistic locking mechanism can be used to deal with these two problems, but the mature databases such as MySQL, Oracle, PostgreSQL, and so on, for performance reasons, are MVCC (multi-version concurrency control) that uses optimistic locking as a theoretical basis to avoid these two problems

    • Optimistic lock and pessimistic lock

      • Pessimistic lock ==> It refers to the conservative attitude of the data being modified by the outside world (including other transactions currently in the system, as well as transactions from external systems), so that data is locked during the entire data processing process

        1. Pessimistic lock implementation, often rely on the database provided by the lock mechanism (also only the database layer provides a lock mechanism to truly guarantee the exclusivity of data access, otherwise, even in this system to implement the locking mechanism, there is no guarantee that the external system will not modify the data).

        2. In the case of pessimistic locking, in order for you to ensure the isolation of the transaction, you need a consistent lock read. Lock when reading data, other transactions cannot modify this data. Locks are also added when data is modified, and other transactions cannot read the data.

        3. Pessimistic locking relies on the locking mechanism of the database in most cases to ensure the maximum independence of the operation. But what comes with this is a lot of overhead in database performance, especially for long transactions, which are often unsustainable

      • Optimistic lock ==> is relative to pessimistic lock, the optimistic locking mechanism adopts more relaxed locking mechanism. Optimistic locks, most of which are based on the data version (adding a version identity to the data), are implemented in a version solution based on a database table, typically by a single version field for the database table. When the data is read, this version is read together, and then, when updated, adds one to this version. At this point, the version data of the submitted data is compared to the current version of the database table, and if the version number of the submitted data is greater than the current version of the database table, it is updated, otherwise it is considered to be outdated data

    • MVCC the realization of the INOODB in MySQL

      1. MVCC implementations do not have a fixed specification, each database will have a different way of implementation

      2. In InnoDB, two additional hidden values are added after each row of data to implement MVCC, which is a record of when this row of data was created and when another record of when this row of data expires (or is deleted). In practice, the storage is not the time, but the version number of the transaction, and each time a new transaction is opened, the transaction version number is incremented. Under the re-repeatable Reds transaction ISOLATION level:

        • Select, read create version number <= current transaction version number, delete version number is empty or > Current transaction version number.

        • When you insert, save the current transaction version number to the created version number of the row

        • Delete, save the deleted version number of the row for the current transaction version number

        • UPDATE, insert a new record, save the current transaction version number for the line to create the version number, while saving the current transaction version number to the original deleted row

      3. Through MVCC, although each row of records requires additional storage space, more row checking work and some additional maintenance work, but can reduce the use of locks, most operations do not lock, read data operation is very simple, good performance, and can be guaranteed only to read to meet the standard line, but also only lock the necessary rows

    • The difference between "read" in MySQL and "read" in the transaction isolation level

      1. At the RR level, although the data becomes repeatable, the data we read may be historical data, which is not timely data, not the current data of the database! This can be problematic in some businesses that are particularly sensitive to the timeliness of data. For this way of reading historical data, it is called snapshot read, and the current version of the data in the database is read in the way called the current read. Obviously, in MVCC, the snapshot read is select, the current read is a special read operation, insert/update/delete operation, belongs to the current reading, processing is the current degree of data, need to lock

      2. The isolation level of a transaction is actually defined as the current level of reading, in order to reduce the time of the lock processing (including waiting for other locks), improve the concurrency capability, introduce the concept of snapshot read, so that select does not lock, and update, INSERT, delete these "current read", We need another module to solve the problem.

      3. Current read: Although the requirements for reading data are defined in the isolation level of a transaction, this can actually be said to be the requirement to write data. In order to solve the current read of the Phantom read problem, MySQL transaction uses the Next-key lock

      4. The Next-key lock is a combination of a row lock and Gap (Gap Lock), a row lock prevents other transactions from being modified or deleted, gap locks prevent other transactions from being added, and a next-key lock combined with a row lock and Gap lock resolves the problem of the RR level when writing data

      5. Serializable level: read operation plus shared lock, write operation plus exclusive lock, read and write mutex. The pessimistic locking theory used is simple, the data is more secure, but the concurrency ability is very poor. If the business concurrency is very small or not concurrency, while the data is reliable, you can use this mode, at the serializable level, select will be locked

MySQL database engine, transaction isolation level, lock

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.