MySQL data insert performance optimization in detail

Source: Internet
Author: User
Tags: commit, datetime, flush, mysql, insert

For systems that handle large volumes of data, slow queries are one problem, but another important one is that data insertion takes a long time. One of our business systems needs 4-5 hours per day for its data import. An operation that long is risky: if the program fails partway through, re-running the import is painful. It is therefore necessary to improve MySQL insert efficiency for systems with large data volumes.

After testing against MySQL, we found several methods that improve INSERT efficiency; they are described below for reference.
1. Insert multiple rows with a single SQL statement.

A typical INSERT statement looks like this:

The code is as follows:

INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`) VALUES ('0', 'userid_0', 'content_0', 0);
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`) VALUES ('1', 'userid_1', 'content_1', 1);

Rewritten as a multi-row insert:

INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`) VALUES ('0', 'userid_0', 'content_0', 0),
('1', 'userid_1', 'content_1', 1);

The modified INSERT improves insert throughput for two main reasons. First, it reduces SQL parsing: the statement is parsed once and then all the rows are inserted. Second, the merged statement is shorter overall than many separate statements, which reduces network I/O.

Here are some test results comparing single-row inserts with a merged multi-row insert, for 100, 1,000, and 10,000 records.

Records    Single-row inserts    Multi-row insert
100        0.149s                0.011s
1,000      1.231s                0.047s
10,000     11.678s               0.218s

2. Perform the inserts inside a transaction.
Wrap the insert statements in a transaction:

The code is as follows:
START TRANSACTION;
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`) VALUES ('0', 'userid_0', 'content_0', 0);
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`) VALUES ('1', 'userid_1', 'content_1', 1);
...
COMMIT;

Using a transaction improves insert efficiency because MySQL implicitly wraps every INSERT in its own transaction when one is not started explicitly. Grouping many inserts in one explicit transaction avoids the cost of creating and committing a transaction per statement: all the inserts run first and are committed together.

Test comparisons, with and without an explicit transaction, for 100, 1,000, and 10,000 records:

Records    Without transaction    With transaction
100        0.149s                 0.033s
1,000      1.231s                 0.115s
10,000     11.678s                1.050s

Performance test:

Here is a test applying both optimizations at once: merging multiple rows into one SQL statement and inserting inside a transaction.

Records      Single-row inserts    Merged rows + transaction
10,000       0m15.977s             0m0.309s
100,000      1m52.204s             0m2.271s
1,000,000    18m31.317s            0m23.332s

The test results show that insert performance improves by roughly 50 times, which is a very considerable gain.


If you insert many rows from the same client at the same time, use INSERT statements with multiple VALUES lists. This is considerably faster than issuing separate single-row INSERT statements (many times faster in some cases). If you are adding rows to a non-empty table, you can also tune the bulk_insert_buffer_size variable to speed things up, as sketched below.
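A minimal sketch of checking and raising that variable; the 16 MB value is only an illustration, not a recommendation:

-- bulk_insert_buffer_size affects MyISAM bulk inserts; check the current value.
SHOW VARIABLES LIKE 'bulk_insert_buffer_size';
-- Raise it for the current session (example value only).
SET SESSION bulk_insert_buffer_size = 16 * 1024 * 1024;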

If you are inserting many rows from different clients, you can also increase speed by using the INSERT DELAYED statement.

For MyISAM tables, you can insert rows while SELECT statements are running, as long as no rows are being deleted from the table at the same time.

To load a text file into a table, use LOAD DATA INFILE. This is usually about 20 times faster than using a large number of INSERT statements.
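As a minimal sketch, assuming a comma-separated file /tmp/insert_data.txt whose columns match the example table used earlier (the file name and delimiters are assumptions):

-- The server must be able to read the file; check the secure_file_priv setting if this fails.
LOAD DATA INFILE '/tmp/insert_data.txt'
INTO TABLE `insert_table`
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(`datetime`, `uid`, `content`, `type`);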

With some extra work, you can make LOAD DATA INFILE run even faster when the table has a large number of indexes. The steps are as follows:

Create the table with CREATE TABLE.

Execute the FLUSH TABLES statement or the mysqladmin flush-tables command.

Execute myisamchk --keys-used=0 -rq /path/to/db/tbl_name to deactivate all of the table's indexes.

Execute LOAD DATA INFILE to insert the data into the table. Because none of the indexes need to be updated, this step is very fast.

If you will only read from the table in the future, run myisampack to make it smaller.

Rebuild the indexes with myisamchk -r -q /path/to/db/tbl_name. The index tree is built in memory before being written to disk, which avoids disk seeks, so index creation is much faster and the resulting index tree is well balanced.

Execute the FLUSH TABLES statement or the mysqladmin flush-tables command.

Note that as of MySQL 4.0 you can run ALTER TABLE tbl_name DISABLE KEYS instead of myisamchk --keys-used=0 -rq /path/to/db/tbl_name, and ALTER TABLE tbl_name ENABLE KEYS instead of myisamchk -r -q /path/to/db/tbl_name. This way you can also skip the FLUSH TABLES steps. A sketch of this variant follows.
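A minimal sketch of the ALTER TABLE variant, reusing the hypothetical data file from the LOAD DATA example above (table and file names are assumptions):

-- Skip maintenance of non-unique indexes while loading (MyISAM).
ALTER TABLE `insert_table` DISABLE KEYS;

LOAD DATA INFILE '/tmp/insert_data.txt'
INTO TABLE `insert_table`
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(`datetime`, `uid`, `content`, `type`);

-- Rebuild the non-unique indexes in a single pass.
ALTER TABLE `insert_table` ENABLE KEYS;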



You can also speed up INSERT operations by locking the table and executing several statements together:

LOCK TABLES a WRITE;

INSERT INTO a VALUES (1,23), (2,23);

INSERT INTO a VALUES (8,7);

UNLOCK TABLES;

The main performance benefit is that the index buffer is flushed to disk only once, after all the INSERT statements have completed; normally there would be as many index buffer flushes as there are INSERT statements. Explicit locking is unnecessary if you can insert all the rows with a single statement. For transactional tables, use BEGIN/COMMIT instead of LOCK TABLES to obtain the speedup. Locking also lowers the total time in multi-connection tests, although the maximum wait time for an individual connection goes up because it waits for the lock. For example, suppose five connections insert concurrently:

Connection 1 does 1,000 inserts

Connections 2, 3, and 4 do 1 insert each

Connection 5 does 1,000 inserts

Without locking, connections 2, 3, and 4 finish before 1 and 5. With locking, connections 2, 3, and 4 probably do not finish before 1 and 5, but the total time is roughly 40% lower. INSERT, UPDATE, and DELETE are all very fast in MySQL, but you get better overall performance by adding locks around anything that does more than about five successive inserts or updates. If you do very many inserts in a row, issue LOCK TABLES and UNLOCK TABLES around each batch of the loop so that other processes can still access the table; this still yields a good performance gain. Even so, INSERT is always slower for loading data than LOAD DATA INFILE, because the two have very different implementation strategies.

To speed up both LOAD DATA INFILE and INSERT on MyISAM tables, enlarge the key cache by increasing the key_buffer_size system variable, as sketched below.
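A minimal sketch; the 256 MB value is only an illustration and should be sized to the server's available memory:

-- key_buffer_size controls the MyISAM index (key) cache; check the current value.
SHOW VARIABLES LIKE 'key_buffer_size';
-- Example value only; changing a global variable requires the appropriate privilege.
SET GLOBAL key_buffer_size = 256 * 1024 * 1024;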


Precautions:

1. SQL statements have a length limit. When merging rows into one statement, the statement must stay under this limit, which is controlled by the max_allowed_packet setting; the default is 1 MB. A sketch of checking and raising it appears after these notes.

2. Transactions need to be kept to a reasonable size; a transaction that is too large hurts execution efficiency. MySQL has the innodb_log_buffer_size setting: once a transaction's log data exceeds it, the data spills to disk and efficiency drops. It is therefore better to commit before the transaction grows to the configured size (see the sketch below).
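A rough sketch covering both notes; the 64 MB packet size and the two-row batch are illustrative assumptions, not recommendations:

-- Note 1: check and raise the packet/statement size limit (new connections pick up the global value).
SHOW VARIABLES LIKE 'max_allowed_packet';
SET GLOBAL max_allowed_packet = 64 * 1024 * 1024;

-- Note 2: check the InnoDB log buffer size and commit in batches rather than one huge transaction.
SHOW VARIABLES LIKE 'innodb_log_buffer_size';
START TRANSACTION;
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`) VALUES
  ('0', 'userid_0', 'content_0', 0),
  ('1', 'userid_1', 'content_1', 1);
COMMIT;
-- ...repeat START TRANSACTION / multi-row INSERT / COMMIT for each subsequent batch...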
