"MYSQL Big Data volume quick insert and statement optimization"

Source: Internet
Author: User

INSERT the speed of the statement

The time required to insert a record is made up of the following factors, where the number represents the approximate scale: Connection: (3) Send query to server: (2) Analysis query: (2) Insert record: (1x record Size) Insert index: (1x index) Close: (1) This does not consider the initial cost of opening the table, Each concurrently running query is opened.

The size of the table slows down the insertion of the index at a speed of Logn (b-Tree).

Some ways to speed up insertions:

· If you insert many rows from the same client at the same time, insert a few rows using the INSERT statement with multiple value. This is faster than using a single-line INSERT statement (in some cases several times faster). If you are adding data to a non-empty table, you can adjust the bulk_insert_buffer_size variable to make the data insert faster. See section 5.3.3, "Server System Variables".

· If you insert many rows from different clients, you can speed up the insert delayed statement. See section 13.2, 4, "Insert Syntax".

· With MyISAM, if no rows are deleted in the table, the row can be inserted while the SELECT statement is running.

· When loading a table from a text file, use load DATA INFILE. This is usually 20 times times faster than using many INSERT statements. See section 13.2, 5, "LOAD DATA infile Syntax".

· When a table has many indexes, it is possible to do more work to make the load DATA infile faster. Use the following procedure:

Selectively use CREATE TABLE Create a table. executes the Flush tables statement or command mysqladmin flush-tables. Use Myisamchk--keys-used=0-rq/path/to/db/tbl_name. This cancels the use of all indexes from the table. Insert the data into the table with the load data infile, because no indexes are updated, so it's fast. If you only want to read the table later, use Myisampack to compress it. See section 15.1, 3.3, "Compression table characteristics". Re-create the index with Myisamchk-r-q/path/to/db/tbl_name. This creates an index tree in memory before writing to the disk, and it is faster because a large number of disk searches are avoided. The result index tree is also perfectly balanced. Executes the Flush tables statement or mysqladmin flush-tables command. Note that if you insert an empty MyISAM table, the load data infile can also perform the previous optimizations, and the main difference is that you can have myisamchk allocate more temporary memory for the creation of the index than the load data The infile statement is assigned more to the server when re-creating the index.

You can also use the ALTER TABLE tbl_name DISABLE keys instead of the MYISAMCHK--keys-used=0-rq/path/to/db/tbl_name, using ALTER TABLE Tbl_name ENABLE Keys instead of Myisamchk-r-q/path/to/db/tbl_name. In this way, you can also skip the flush TABLES.

· Locking a table can speed up insert operations with multiple statements:

LOCK TABLES a write;insert into a values (1,23), (2,34), (4,33), INSERT into a values (8,26), (6,29); UNLOCK TABLES; This improves performance because the index buffers are flushed to disk only once after all INSERT statements have completed. How many INSERT statements typically have a flush of index buffers. If you can insert all the rows with a single statement, you do not need to lock.

For a transaction table, you should use begin and commit instead of lock tables to speed up the insertion.

Locking will also reduce the overall time of the multi-connection test, although the maximum wait time will rise as they wait for the lock. For example:

Connection 1 Does inserts

Connections 2, 3, and 4 do 1 insert

Connection 5 Does inserts

If you do not use locks, 2, 3, and 4 will be completed before 1 and 5. If you use locks, 2, 3, and 4 will probably not be completed before 1 or 5, but the overall time should be about 40% faster.

Insert, update, and delete operations are fast in MySQL, and better overall performance can be achieved by locking up more than about 5 consecutive insertions or updates in a row. If you insert multiple times in a row, you can execute lock TABLES and then immediately execute unlock TABLES (approximately every 1000 rows) to allow other threads to access the table. This will also get good performance.

The insert load data is much slower than the load infile, even with the above policy.

· To get faster speed on the load DATA infile and insert in the MyISAM table, increase the key high-speed buffer by increasing the key_buffer_size system variable. See section 7.5.2, "Tuning Server Parameters".

INSERT Grammar

INSERT [Low_priority | DELAYED | High_priority] [Ignore][into] tbl_name [(Col_name,...)] VALUES ({expr | DEFAULT},...), (...),... [on DUPLICATE KEY UPDATE col_name=expr, ...] Or:

INSERT [Low_priority | DELAYED | High_priority] [Ignore][into] Tbl_nameset col_name={expr | DEFAULT}, ... [on DUPLICATE KEY UPDATE col_name=expr, ...] Or:

INSERT [Low_priority | High_priority] [Ignore][into] tbl_name [(Col_name,...)] SELECT ... [on DUPLICATE KEY UPDATE col_name=expr, ...]

First, DELAYED the Use

Use the deferred Insert action to apply the delayed modifier to the INSERT and replace statements. When the delayed insert operation arrives,

The server puts the data row in a queue and immediately returns a status message to the client so that the customer

Can continue operations before the data table is actually inserted into the record. If the reader is from this data

Data is read in the table, the data in the queue is persisted until there is no reader. Then the server

Begins inserting data rows in a deferred data line (Delayed-row) queue. At the same time as the insert operation, the server

Also check to see if there are new read requests arriving and waiting. If there is, the deferred data line queue is suspended,

Allows the reader to continue the operation. When there is no reader, the server begins inserting the deferred data row again.

This process continues until the queue is empty. A few things to note:

· Insert delayed should be used only for INSERT statements that specify a value list. Server ignored for insert DELAYED ... The delayed of the SELECT statement.

· Server ignored for insert DELAYED ... The delayed of the on DUPLICATE UPDATE statement.

· Because the statement returns immediately before the row is inserted, you cannot use last_insert_id () to get the auto_increment value. The auto_increment value may be generated by the statement.

· For SELECT statements, the delayed rows are not visible until the rows are actually inserted.

· Delayed is ignored in the subordinate replication server because delayed does not produce data that is not the same as the primary server in the secondary server. Note that the rows currently in the queue are only saved in memory until they are inserted into the table. This means that if you forcibly abort the mysqld (for example, using kill-9)

Or if mysqld stops unexpectedly, all rows that are not written to the disk will be lost.

Second, IGNORE the Use Ignore is an extension of MySQL relative to standard SQL. If there are duplicate keywords in the new table,

Or, if a warning occurs after strict mode is started, use ignore to control the operation of ALTER TABLE.

If ignore is not specified, the copy operation is discarded when a duplicate keyword error occurs, returning to the previous step.

If ignore is specified, for rows with duplicate keywords, only the first row is used, and the other conflicting rows are deleted.

Also, correct the error value so that it is as close to the correct value as possible. Insert ignore into TB (...) value (...) This does not need to check whether there is, there is ignored, none add Three, on DUPLICATE key update use if you specify on DUPLICATE key update, and insert the row will cause in a unique index or primary If duplicate values appear in key, the old line update is performed. For example, if column A is defined as unique and contains a value of 1, the following two statements have the same effect:

mysql> INSERT into table (a,b,c) VALUES

-On DUPLICATE KEY UPDATE c=c+1;

mysql> UPDATE table SET c=c+1 WHERE a=1;

If the row is inserted as a new record, the value of the affected row is 1, and if the original record is updated, the value of the affected row is 2.

NOTE: If column B is also the only column, the insert is equivalent to this UPDATE statement:

mysql> UPDATE table SET c=c+1 WHERE a=1 OR b=2 LIMIT 1;

If A=1 OR b=2 matches multiple rows, only one row is updated. In general, you should try to avoid using the on DUPLICATE key clause on a table with multiple unique keywords.

You can use the values (col_name) function from the Insert ... in the UPDATE clause. The insert portion of the UPDATE statement refers to the column value. In other words, if a duplicate keyword conflict does not occur, values (col_name) in the update clause can refer to the value of the col_name being inserted. This function is especially useful for multi-row insertions. The VALUES () function is only in the insert ... The UPDATE statement makes sense, and returns null at other times.

Example:

    1. mysql> INSERT into table (a,b,c) VALUES (4,5,6)

    • -On DUPLICATE KEY UPDATE c=values (a) +values (b);

    • This statement works the same as the following two statements:

    • mysql> INSERT into table (a,b,c) VALUES

    • -On DUPLICATE KEY UPDATE c=3;

    • mysql> INSERT into table (a,b,c) VALUES (4,5,6)

    • -On DUPLICATE KEY UPDATE c=9;

    • When you use the on DUPLICATE KEY update, the delayed option is ignored.


More Java,java learning, Java face Test http://techfoxbbs.com


"MYSQL Big Data volume quick insert and statement optimization"

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.