MySQL's strategy for improving the efficiency of data loading operations

Source: Internet
Author: User
Tags mysql insert requires file permissions

"Guide" This article describes MySQL's strategy to improve the efficiency of data loading operations. More often than not, it's about optimizing select queries, because they're the most commonly used queries, and it's not always straightforward to decide how to optimize them. In contrast, loading data into a database is straightforward.

More often than not, it's about optimizing select queries, because they're the most commonly used queries, and it's not always straightforward to decide how to optimize them. In contrast, loading data into a database is straightforward. However, there are also strategies that can be used to improve the efficiency of data loading operations, and the rationale is as follows:

A bulk mount is faster than a single-line load because after each record is loaded, the index cache is not refreshed, and can be refreshed after a batch record is mounted.

Loading the peso after the table has no index is faster to load. If you have an index, you must not only increase the record to the data file, but also modify each index to reflect the new record that was added.

Shorter SQL statements are faster than long SQL statements because they involve less analysis of the server side and are faster because they are sent from the client to the server over the network. Some of these factors seem trivial (especially the last one), but if a large amount of data is to be loaded, even small factors can produce very different results. We can deduce several practical conclusions about how to load data as quickly as possible using the above general principles:

Load DATA (including all of its forms) is more efficient than insert because it loads rows in batches. The index refreshes less, and the server only needs to parse and interpret a single statement rather than a few statements.

Load data is more efficient than load data local. With load DATA, files must be located on the server and must have file permissions, but the server can read files directly from disk. Using the load DATA Local, the client reads the file and sends it over the network to the server, which is slow.

If you must use INSERT, you should use a form that allows you to specify multiple lines in a single statement, such as:

The more rows you can specify in the statement, the better. This reduces the number of statements required and reduces the amount of index refreshes. If you use mysqldump to generate a database backup file, you should use the--extended-insert option to make the dump file contain multiple INSERT statements. You can also use-o P-t (optimization), which enables the--extended-insert option. Conversely, you should avoid using the mysqldump--complete-insert option, which causes the INSERT statement to be a single row, longer execution time, and requires more analysis than statements generated without the--complete-insert option.

Use compressed client/server protocols to reduce network data traffic. For most MySQL clients, you can specify them by using the--compress command-line option. It is typically used only for slower networks, because compression requires a large amount of processor time.

Let MySQL insert the default value; Do not specify the columns in the INSERT statement that will give the default values in any way. On average, this makes the statement shorter and reduces the number of characters sent over the network to the server. In addition, the statement contains fewer values, and the server does less analysis and conversion.

If the table is indexed, you can use bulk inserts (LOAD DATA or multiple-line INSERT statements) to reduce the cost of indexing. This minimizes the impact of index updates, because indexes need to be refreshed only when all row processing is obsolete, not after each row is processed.

If you need to load a large amount of data into a new table, you should create the table and load the data before the index is loaded, which is faster. It is faster to create an index at a time rather than to modify it once per row.

If the index is deleted or disabled before loading, recreating or enabling the index after loading the data may make the load faster. If you want to use a delete or disable policy for data loading, be sure to do some experimentation to see if it's worth it (if you load a small amount of data into a large table, the rebuild and index may take longer than the time it takes to load the data).

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.