Optimizing SQL statements when inserting data in MySQL

1) For MyISAM tables, you can speed up the import of a large amount of data with the following commands.
 
ALTER TABLE tblname DISABLE KEYS;
-- load the data here
ALTER TABLE tblname ENABLE KEYS;

These two commands disable and re-enable the update of non-unique indexes on a MyISAM table. When importing a large amount of data into a non-empty MyISAM table, wrapping the import between them improves efficiency. For an empty MyISAM table, the indexes are created only after the data has been imported by default, so this setting is unnecessary.
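
As a concrete illustration, a bulk load into an existing MyISAM table might be wrapped as follows (the table name, file name, and file format are assumptions for this sketch):

ALTER TABLE log_table DISABLE KEYS;

LOAD DATA INFILE '/tmp/log_data.csv'
INTO TABLE log_table
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

ALTER TABLE log_table ENABLE KEYS;  -- rebuilds the non-unique indexes in one pass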

For InnoDB tables, this method does not improve import efficiency.

2) For InnoDB tables, the following methods can improve import efficiency:

Because InnoDB tables are stored in primary key order, arranging the imported data in primary key order can effectively improve import efficiency. If an InnoDB table has no primary key, an internal hidden column is created by default to serve as the primary key, so if you can define a primary key for the table, you can take advantage of this to speed up the import.
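
As a minimal sketch, the `insert_table` used in the examples below could be defined with an explicit primary key so that source data can be sorted on it before loading (the column types here are assumptions):

CREATE TABLE `insert_table` (
    `datetime` BIGINT       NOT NULL,
    `uid`      VARCHAR(32)  NOT NULL,
    `content`  VARCHAR(255) NOT NULL,
    `type`     INT          NOT NULL,
    PRIMARY KEY (`datetime`)
) ENGINE = InnoDB;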

Run SET UNIQUE_CHECKS = 0 before importing data to disable uniqueness checks, and run SET UNIQUE_CHECKS = 1 after the import to restore them; this improves import efficiency.
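
A minimal sketch of that sequence (the actual import statements are placeholders):

SET UNIQUE_CHECKS = 0;  -- skip uniqueness checks on secondary indexes during the import
-- ... bulk INSERT or LOAD DATA statements here ...
SET UNIQUE_CHECKS = 1;  -- restore uniqueness checks afterwards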

If the application uses autocommit, we recommend executing SET autocommit = 0 before the import to disable autocommit, and SET autocommit = 1 after the import to re-enable it; this also improves import efficiency.
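
A sketch of the autocommit pattern; the single COMMIT at the end is what turns the whole import into one transaction:

SET autocommit = 0;  -- stop committing after every statement
-- ... import statements here ...
COMMIT;              -- commit the whole import once
SET autocommit = 1;  -- restore the default behavior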


Some performance tests on MySQL InnoDB uncovered several ways to improve insert efficiency; they are listed below for reference.

1. Combine multiple data records into one INSERT statement.

Common insert statements are as follows:
 
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`)
VALUES ('0', 'userid_0', 'content_0', 0);
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`)
VALUES ('1', 'userid_1', 'content_1', 1);

Modified to combine the rows into one statement:


INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`)
VALUES ('0', 'userid_0', 'content_0', 0), ('1', 'userid_1', 'content_1', 1);

The modified INSERT improves insert efficiency. The higher execution efficiency of the second form mainly comes from the reduced log volume (both the MySQL binlog and the InnoDB transaction log shrink), which lowers the amount and frequency of log flushing to disk. Combining statements also reduces the number of SQL statements that must be parsed and cuts network transmission I/O.
Here is some comparison data from tests of single-row statements versus combined statements, for 1 thousand, 10 thousand, and 1 million records respectively.

 


2. Perform inserts within a transaction.

 

Modify the insert statements to run inside a single transaction, so that all rows are committed together instead of one commit per statement:

START TRANSACTION;
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`)
VALUES ('0', 'userid_0', 'content_0', 0);
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`)
VALUES ('1', 'userid_1', 'content_1', 1);
...
COMMIT;

3. Insert data in primary key order.

Ordered insertion means that the inserted records are sorted by primary key. For example, if datetime is the primary key of the record:
 
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`)
VALUES ('1', 'userid_1', 'content_1', 1);
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`)
VALUES ('0', 'userid_0', 'content_0', 0);
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`)
VALUES ('2', 'userid_2', 'content_2', 2);

Modified so the rows are in primary key order:

INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`)
VALUES ('0', 'userid_0', 'content_0', 0);
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`)
VALUES ('1', 'userid_1', 'content_1', 1);
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`)
VALUES ('2', 'userid_2', 'content_2', 2);

Because index data must be maintained during insertion, out-of-order records increase the cost of index maintenance. Consider the B+ tree index used by InnoDB: if every inserted record lands at the end of the index, index positioning is fast and little index adjustment is needed; if records land in the middle of the index, B+ tree page splits and merges consume significant computing resources, index positioning for the inserted records becomes slower, and disk operations become frequent once the data volume is large.

The following compares the performance of inserting random data versus ordered data, for 10 thousand, 100 thousand, and 1 million records respectively.


The test results show that this optimization does improve performance, but the improvement is not very pronounced.

Comprehensive performance test:

Here is a test that applies all three of the above methods at the same time to optimize INSERT efficiency.

The test results show that the combination of merged data + transactions gives an obvious performance improvement when the data volume is small. When the data volume is large (more than 10 million rows), performance drops sharply, because at that point the data volume exceeds the capacity of the InnoDB buffer pool and every index lookup involves a large number of disk read/write operations. The combination of merged data + transactions + ordered data, however, still performs well at the 10-million-row level: with ordered data, index positioning stays cheap and frequent disk reads and writes are unnecessary, so performance remains high.
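
As a sketch of all three techniques combined, using the same hypothetical `insert_table`: rows are pre-sorted on the primary key, batched into multi-row INSERT statements, and committed once.

SET autocommit = 0;
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`)
VALUES ('0', 'userid_0', 'content_0', 0),
       ('1', 'userid_1', 'content_1', 1),
       ('2', 'userid_2', 'content_2', 2);
-- ... further multi-row INSERTs, still in primary key order ...
COMMIT;
SET autocommit = 1;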

Notes:

1. An SQL statement has a length limit; when merging data into one statement, the statement must not exceed this limit, which is controlled by the max_allowed_packet setting. Its default is 1 MB; it was raised to 8 MB for the tests above.
2. Transaction size needs to be controlled; a transaction that is too large can hurt execution efficiency. MySQL has the innodb_log_buffer_size configuration item; once this value is exceeded, InnoDB data is flushed to disk and efficiency decreases, so it is better to commit the transaction before the data reaches this value.
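
For reference, a sketch of how to inspect and adjust the two settings mentioned above (the 8 MB value is only an example, matching the test configuration described in note 1):

SHOW VARIABLES LIKE 'max_allowed_packet';
SHOW VARIABLES LIKE 'innodb_log_buffer_size';

-- Raise max_allowed_packet to 8 MB for new connections (requires the appropriate privilege).
SET GLOBAL max_allowed_packet = 8 * 1024 * 1024;

-- innodb_log_buffer_size is normally set in the server configuration file (my.cnf);
-- depending on the MySQL version, changing it may require a server restart.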
