Five tips for parsing and optimizing the MySQL insertion method


Last Update:2018-04-12 Source: Internet

Author: User



At work I had to insert about 200,000 rows of data. Once the program was written, the operation timed out. Raising PHP's maximum execution time to 600 seconds did not help; it still timed out. Judging by the number of rows inserted before the timeout, the whole insert would take roughly 40 to 60 minutes. The program's write efficiency was clearly too low and had to be optimized.
Test computer configuration:
CPU: AMD Sempron (tm) Processor
Memory: 1.5 GB
The statement is as follows:

$sql = "INSERT INTO `test` (`test`) VALUES ('$content')";
// Send the same single-row INSERT 999 times, one query per row.
for ($i = 1; $i < 1000; $i++) {
    mysql_query($sql);
}
The execution time of mysql_unbuffered_query (in seconds) is:
9.85321879387
9.43223714828
9.46858215332
The execution time of mysql_query is:
10.0020229816
9.61053204536
9.24442720413
I think the most efficient method is as follows:
$sql = "INSERT INTO `test` (`test`) VALUES ('$content')";
// Append 998 more value groups so that one statement inserts 999 rows.
for ($i = 1; $i < 999; $i++) {
    $sql .= ", ('$content')";
}
mysql_query($sql);
Execution time:
0.0323481559753
0.0371758937836
0.0419669151306
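
The same idea can be sketched outside PHP. The following Python sketch is an illustration, not the author's code: it builds multi-row INSERT statements in chunks and uses `%s` placeholders so that the values are passed separately instead of being interpolated into the SQL string. The helper name `build_multi_row_inserts` and the default chunk size of 500 are assumptions made for this example; each (statement, parameters) pair is meant to be handed to a DB-API driver's `cursor.execute`.

```python
from typing import Iterator, List, Sequence, Tuple


def build_multi_row_inserts(
    table: str,
    column: str,
    values: Sequence[str],
    chunk_size: int = 500,
) -> Iterator[Tuple[str, List[str]]]:
    """Yield (sql, params) pairs, each inserting up to chunk_size rows.

    table and column are interpolated as identifiers, so they must come
    from trusted code, never from user input.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(values), chunk_size):
        chunk = list(values[start:start + chunk_size])
        # One "(%s)" group per row; the driver substitutes the values.
        placeholders = ", ".join(["(%s)"] * len(chunk))
        sql = f"INSERT INTO `{table}` (`{column}`) VALUES {placeholders}"
        yield sql, chunk


# Hypothetical data: 999 rows, as in the PHP test above.
rows = [f"content {i}" for i in range(999)]
for sql, params in build_multi_row_inserts("test", "test", rows):
    print(len(params), "rows in one statement")
```

Splitting into chunks rather than sending one enormous statement keeps each statement below the server's max_allowed_packet limit.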

INSERT statement speed
The time required to insert a row is determined by the following factors, where the numbers indicate approximate proportions:
Connecting: (3)
Sending query to server: (2)
Parsing query: (2)
Inserting row: (1 × size of row)
Inserting indexes: (1 × number of indexes)
Closing: (1)
This does not take into account the initial overhead of opening tables, which is incurred once for each concurrently running query.
The size of the table slows down index insertion by log N, assuming B-tree indexes.
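
To make the proportions concrete, here is a rough Python sketch that adds them up. It is a simplified model under stated assumptions, not a formula from the MySQL manual: all rows go through a single connection, the send and parse costs are charged once per statement, the row and index costs are charged once per row, and the log N index slowdown is ignored. The function name `relative_insert_cost` is hypothetical.

```python
def relative_insert_cost(rows, indexes, row_size=1, statements=None):
    """Return the relative cost of inserting `rows` rows over one connection.

    statements is the number of INSERT statements used to send the rows;
    by default every row gets its own statement.
    """
    if statements is None:
        statements = rows
    connect, send, parse, close = 3, 2, 2, 1
    per_statement = send + parse          # paid once per statement
    per_row = row_size * 1 + indexes * 1  # paid once per row
    return connect + statements * per_statement + rows * per_row + close


single = relative_insert_cost(999, indexes=1)               # 999 statements
multi = relative_insert_cost(999, indexes=1, statements=1)  # 1 statement
print(single, multi)  # 5998 2006
```

Under these assumptions, one multi-row statement costs about a third of 999 single-row statements. The timings measured earlier in this article differ by a far larger factor, so the numbers are best read as rough proportions, not as predictions.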
Some methods to accelerate insertion:
If you insert many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at once. This is faster than using separate single-row INSERT statements (many times faster in some cases). If you are adding data to a non-empty table, you can tune the bulk_insert_buffer_size variable to make data insertion even faster. See section 5.3.3, "Server System Variables".
If you insert many rows from different clients, you can get higher speed by using the INSERT DELAYED statement. See section 13.2.4, "INSERT Syntax".
With MyISAM tables, if the table contains no deleted rows, rows can be inserted while SELECT statements are running.
When loading a table from a text file, use LOAD DATA INFILE. This is usually 20 times faster than using many INSERT statements.
With some extra work, it is possible to make LOAD DATA INFILE run even faster when the table has many indexes. Use the following procedure:

Optionally, create the table with CREATE TABLE.
Execute a FLUSH TABLES statement or a mysqladmin flush-tables command.
Use myisamchk --keys-used=0 -rq /path/to/db/tbl_name. This removes all use of indexes for the table.
Insert data into the table with LOAD DATA INFILE. This does not update any indexes and therefore is very fast.
If you intend only to read from the table in the future, use myisampack to compress it.
Re-create the indexes with myisamchk -r -q /path/to/db/tbl_name. This builds the index tree in memory before writing it to disk, which is much faster because it avoids a large number of disk seeks. The resulting index tree is also perfectly balanced.
Execute a FLUSH TABLES statement or a mysqladmin flush-tables command.

Note that LOAD DATA INFILE performs this optimization automatically if the MyISAM table you insert into is empty. The main difference is that you can let myisamchk allocate much more temporary memory for index creation than you might want the server to allocate for index re-creation when it executes the LOAD DATA INFILE statement.
You can also use ALTER TABLE tbl_name DISABLE KEYS instead of myisamchk --keys-used=0 -rq /path/to/db/tbl_name, and ALTER TABLE tbl_name ENABLE KEYS instead of myisamchk -r -q /path/to/db/tbl_name. This way you can also skip the FLUSH TABLES steps.
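
The ALTER TABLE variant of the procedure can be sketched as an ordered list of statements. The Python helper below is illustrative only: the name `bulk_load_statements`, the table name, and the file path are assumptions, and the statements would still have to be sent to the server by a client of your choice.

```python
from typing import List


def bulk_load_statements(table: str, data_file: str) -> List[str]:
    """Return the statements for a bulk load with index updates deferred.

    table is interpolated as an identifier and must come from trusted code.
    """
    # Escape backslashes and single quotes for the SQL string literal.
    quoted = data_file.replace("\\", "\\\\").replace("'", "\\'")
    return [
        f"ALTER TABLE `{table}` DISABLE KEYS",               # stop index updates
        f"LOAD DATA INFILE '{quoted}' INTO TABLE `{table}`",  # load the rows
        f"ALTER TABLE `{table}` ENABLE KEYS",                 # rebuild the indexes
    ]


# Hypothetical table and file names.
for statement in bulk_load_statements("tbl_name", "/tmp/data.txt"):
    print(statement + ";")
```

DISABLE KEYS only affects non-unique indexes of MyISAM tables, so unique indexes are still maintained during the load.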
To speed up INSERT operations that are performed with multiple statements, lock the table:
LOCK TABLES a WRITE;
INSERT INTO a VALUES (1, 23), (2, 34), (4, 33);
INSERT INTO a VALUES (8, 26), (6, 29);
UNLOCK TABLES;
This improves performance because the index buffer is flushed to disk only once, after all INSERT statements have completed. Normally there would be as many index buffer flushes as there are INSERT statements. Explicit locking is not needed if you can insert all rows with a single statement.
For transactional tables, use BEGIN and COMMIT instead of LOCK TABLES to speed up insertion.
Locking also lowers the total time of multiple-connection tests, although the maximum wait time for individual connections may go up because they wait for locks. For example:
Connection 1 does 1000 inserts
Connections 2, 3, and 4 do 1 insert
Connection 5 does 1000 inserts
Without locking, connections 2, 3, and 4 finish before 1 and 5. With locking, connections 2, 3, and 4 probably do not finish before 1 or 5, but the total time should be about 40% faster.
INSERT, UPDATE, and DELETE operations are very fast in MySQL, but you can obtain better overall performance by adding locks around everything that does more than about five successive inserts or updates. If you do very many successive inserts, you can issue LOCK TABLES followed by UNLOCK TABLES once in a while (every 1000 rows or so) to allow other threads to access the table. This still gives a good performance gain.
INSERT is still much slower for loading data than LOAD DATA INFILE, even when the strategies above are used.
To speed up both LOAD DATA INFILE and INSERT on MyISAM tables, enlarge the key cache by increasing the key_buffer_size system variable.
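
The periodic lock-and-release pattern can be sketched as follows. This Python helper is a minimal illustration under assumptions: the name `with_periodic_locks` is hypothetical, each input statement is taken to insert one row (so 1000 statements per lock matches the "every 1000 rows" guideline), and the function only produces the statement sequence without talking to a server.

```python
from typing import List, Sequence


def with_periodic_locks(
    table: str,
    insert_statements: Sequence[str],
    statements_per_lock: int = 1000,
) -> List[str]:
    """Wrap batches of INSERT statements in LOCK TABLES / UNLOCK TABLES."""
    if statements_per_lock < 1:
        raise ValueError("statements_per_lock must be at least 1")
    script: List[str] = []
    for start in range(0, len(insert_statements), statements_per_lock):
        script.append(f"LOCK TABLES `{table}` WRITE")
        script.extend(insert_statements[start:start + statements_per_lock])
        # Releasing the lock here gives other threads a chance to run.
        script.append("UNLOCK TABLES")
    return script


# Hypothetical workload: 2500 single-row inserts into table a.
inserts = [f"INSERT INTO a VALUES ({i}, {i * 2})" for i in range(2500)]
script = with_periodic_locks("a", inserts)
print(script.count("UNLOCK TABLES"))  # 3 lock/unlock cycles
```

For transactional tables the same batching idea applies with BEGIN and COMMIT in place of the lock statements.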
INSERT syntax

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [(col_name,...)]
    VALUES ({expr | DEFAULT},...),(...),...
    [ON DUPLICATE KEY UPDATE col_name=expr, ...]

Or

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name
    SET col_name={expr | DEFAULT}, ...
    [ON DUPLICATE KEY UPDATE col_name=expr, ...]

Or

INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [(col_name,...)]
    SELECT ...
    [ON DUPLICATE KEY UPDATE col_name=expr, ...]

I. Use of DELAYED
Using delayed inserts
The DELAYED modifier applies to INSERT and REPLACE statements. When a DELAYED insert arrives, the server puts the rows in a queue and immediately returns a status to the client, so the client can continue before the rows have actually been inserted into the table. If readers are reading from the table, the queued rows are held until there are no more readers. The server then starts inserting the rows from the delayed-row queue. While inserting, it checks whether any new read requests have arrived and are waiting. If so, processing of the delayed-row queue is suspended and the readers are allowed to proceed. When no readers remain, the server resumes inserting the delayed rows. This process continues until the queue is empty.
Notes:
INSERT DELAYED should be used only for INSERT statements that specify value lists. The server ignores DELAYED for INSERT DELAYED ... SELECT statements.
The server ignores DELAYED for INSERT DELAYED ... ON DUPLICATE KEY UPDATE statements.
Because the statement returns immediately, before the rows are inserted, you cannot use LAST_INSERT_ID() to get the AUTO_INCREMENT value that the statement might generate.
DELAYED rows are not visible to SELECT statements until they have actually been inserted.
DELAYED is ignored on slave replication servers, because it could cause the slave to have different data than the master.
Note: Currently, queued rows are held only in memory until they are inserted into the table. This means that if you terminate mysqld forcibly (for example, with kill -9) or if mysqld dies unexpectedly, any queued rows that have not been written to disk are lost.

II. Use of IGNORE
IGNORE is a MySQL extension to standard SQL. For ALTER TABLE, it controls how the statement behaves if there are duplicates on unique keys in the new table, or if warnings occur when strict mode is enabled.
If IGNORE is not specified, the copy is aborted and rolled back when a duplicate-key error occurs.
If IGNORE is specified, only the first row is used of rows with duplicates on a unique key, and the other conflicting rows are deleted.
In addition, incorrect values are truncated to the closest matching acceptable value.
For INSERT, the modifier is used like this:
INSERT IGNORE INTO tb (...) VALUES (...)
This way you do not need to check first whether the row exists. If a row with the same key already exists, the new row is skipped; if not, it is inserted.

III. Use of ON DUPLICATE KEY UPDATE
If you specify ON DUPLICATE KEY UPDATE and a row is inserted that would cause a duplicate value in a UNIQUE index or PRIMARY KEY, an UPDATE of the old row is performed. For example, if column a is declared as UNIQUE and contains the value 1, the following two statements have the same effect:
mysql> INSERT INTO table (a, b, c) VALUES (1, 2, 3)
    -> ON DUPLICATE KEY UPDATE c = c + 1;

mysql> UPDATE table SET c = c + 1 WHERE a = 1;

The affected-rows value is 1 if the row is inserted as a new record and 2 if an existing record is updated.
Note: If column b is also unique, the INSERT is equivalent to this UPDATE statement instead:
mysql> UPDATE table SET c = c + 1 WHERE a = 1 OR b = 2 LIMIT 1;

If a = 1 OR b = 2 matches several rows, only one row is updated. In general, you should avoid using an ON DUPLICATE KEY clause on tables with multiple unique indexes.

You can use the VALUES(col_name) function in the UPDATE clause to refer to column values from the INSERT portion of the INSERT ... UPDATE statement. In other words, VALUES(col_name) in the UPDATE clause refers to the value of col_name that would have been inserted had no duplicate-key conflict occurred. This function is especially useful in multiple-row inserts. The VALUES() function is meaningful only in INSERT ... UPDATE statements and returns NULL otherwise.
Example:

mysql> INSERT INTO table (a, b, c) VALUES (1, 2, 3), (4, 5, 6)
    -> ON DUPLICATE KEY UPDATE c = VALUES(a) + VALUES(b);
This statement has the same effect as the following two statements:
mysql> INSERT INTO table (a, b, c) VALUES (1, 2, 3)
    -> ON DUPLICATE KEY UPDATE c = 3;
mysql> INSERT INTO table (a, b, c) VALUES (4, 5, 6)
    -> ON DUPLICATE KEY UPDATE c = 9;

When you use ON DUPLICATE KEY UPDATE, the DELAYED option is ignored.
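
The behavior described in this section can be modeled in a few lines of Python. This is a simplified in-memory sketch, not how the server is implemented: the table is a dict keyed by the unique column a, the function name `insert_on_duplicate_key_update` is hypothetical, and the affected-rows count follows the rule stated above (1 for a new row, 2 for an updated row).

```python
from typing import Callable, Dict, Iterable

Row = Dict[str, int]


def insert_on_duplicate_key_update(
    table: Dict[int, Row],
    new_rows: Iterable[Row],
    update: Callable[[Row, Row], None],
) -> int:
    """Insert rows keyed by column a; on a duplicate key, update instead.

    update(existing, values) modifies the existing row in place, where
    values plays the role of VALUES(): the row that would have been inserted.
    Returns the total affected-rows count.
    """
    affected = 0
    for values in new_rows:
        key = values["a"]
        if key in table:
            update(table[key], values)  # duplicate key: update the old row
            affected += 2
        else:
            table[key] = dict(values)   # no conflict: plain insert
            affected += 1
    return affected


def set_c(existing: Row, values: Row) -> None:
    # Corresponds to: ON DUPLICATE KEY UPDATE c = VALUES(a) + VALUES(b)
    existing["c"] = values["a"] + values["b"]


# Both keys already exist, so both rows are updated: c becomes 3 and 9.
table = {1: {"a": 1, "b": 0, "c": 0}, 4: {"a": 4, "b": 0, "c": 0}}
rows = [{"a": 1, "b": 2, "c": 3}, {"a": 4, "b": 5, "c": 6}]
print(insert_on_duplicate_key_update(table, rows, set_c))  # 4
print(table[1]["c"], table[4]["c"])  # 3 9
```

Only column c changes on a conflict; column b of the existing rows keeps its old value, because the UPDATE clause does not mention it.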


This article is an English version of an article originally published in Chinese on aliyun.com.
