Make your MySQL database load data more effectively _ MySQL

Source: Internet
Author: User
Most of the time, you are concerned about optimizing SELECT queries because they are the most common queries and it is not always straightforward to determine how to optimize them. Relatively speaking, it is straightforward to load data into the database. However, there are also policies that can be used to improve the efficiency of data loading. The basic principle is as follows: batch loading is faster than single-row loading, because after each record is loaded

Most of the time, you are concerned about optimizing SELECT queries because they are the most common queries and it is not always straightforward to determine how to optimize them. Relatively speaking, it is straightforward to load data into the database. However, there are also policies to improve the efficiency of data loading operations. The basic principles are as follows:

Batch loading is faster than single-row loading, because after each record is loaded, you do not need to refresh the index cache; you can refresh only after batch record loading.

Loading a table without indexing is faster than loading it after indexing. If there is an index, you must not only add records to the data file, but also modify each index to reflect the new records added.

Short SQL statements are faster because they involve less analysis on the server side and send them from the client to the server over the network. Some of these factors seem insignificant (especially the last one), but if you want to load a large amount of data, even a small factor will produce very different results. We can use the general principles described above to derive several practical conclusions on how to load data as quickly as possible:

Load data (including all its forms) is more efficient than INSERT because it loads rows in batches. Few indexes are refreshed, and the server only needs to analyze and explain one statement instead of several statements.

Load data is more efficient than load data local. When using load data, the FILE must be located on the server and have the FILE permission, but the server can directly read the FILE from the disk. Using load data local, the client reads the file and sends it to the server over the network, which is very slow.

If INSERT is required, multiple rows can be specified in a single statement. for example:

The more rows you can specify in a statement, the better. This reduces the number of required statements and the index refresh volume. If mysqldump is used to generate a database backup file, use the -- extended-insert option to include multiple INSERT statements. You can also use-o p t (optimized) to enable the -- extended-insert option.

Otherwise, you should avoid using the -- complete-insert option of mysqldump. this option will cause the INSERT statement to be a single row and the execution time will be longer, it requires more analysis than Statements generated without the -- complete-insert option.

Use the compressed client/server protocol to reduce network data traffic. For most MySQL clients, you can use the -- compress command line option to specify. It is generally used only for slow networks, because compression takes a lot of processing time.

INSERT the default value for MySQL. do not specify columns that will be given the default value in any way in the INSERT statement. On average, the statement will be shorter, reducing the number of characters transmitted to the server over the network. In addition, if the statement contains less values, the server performs less analysis and conversion.

If the table is indexed, you can use batch INSERT (load data or multiple rows of INSERT statements) to reduce the index overhead. This will minimize the impact of index updates, because indexes must be refreshed only when all rows are processed, rather than after each row is processed.

If you need to load a large amount of data into a new table, you should create the table and load the data before creating the index. this is faster. One index creation (instead of one index per row) is faster.

If you delete or disable indexes before loading, you may need to re-create or enable indexes after loading data. If you want to use a deletion or disabling policy for data loading, you must perform some experiments to check whether this is worthwhile (if you load a small amount of data into a large table, rebuilding and indexing may take longer than loading data ).

You can use drop index and create index to delete and recreate indexes. Another option is to use myisamchk or isamchk to disable and enable indexes. This requires an account on the MySQL server host and the permission to write data to the table file. To disable table indexes, you can enter the corresponding Database Directory and execute one of the following commands:

Use myisamchk for MyISAM tables with. MYI extension and isamchk for ISAM tables with. ISM extension. After loading data into the table, activate the index as follows:

If you decide to disable and activate indexes, use the table repair lock protocol described in chapter 13th to prevent the server from changing the lock at the same time (although the table is not repaired at this time, but you need to modify it like the table repair process, so you need to use the same locking protocol ).

The above data loading principle is also applicable to fixed queries related to clients that require different operations. For example, you generally want to avoid running the SELECT query for a long time on frequently updated tables. Running the SELECT query for a long time will produce a lot of contention and reduce the performance of the write program. One possible solution is to store the records in a temporary table and add these records to the primary table on a regular basis. If you need to access the new record immediately, this is not a feasible method. However, you can use this method as long as you can not access them in a short period of time. Using temporary tables has two advantages. First, it reduces contention with the SELECT query statement on the master table, so the execution speed is faster. Second, the total time for loading records from the temporary table to the master table is less than the total time for loading records separately. the corresponding index cache only needs to be refreshed at the end of each batch load, instead of refreshing after each row is loaded. An application in this policy accesses the MySQL database on the Web page of the Web server. In this case, it is not guaranteed that the record has high permissions to access the master table immediately.

If the data is not exactly a single record inserted in an abnormal shutdown event, another policy to reduce index refresh is to use the DELAYED_KEY_WRITE table creation option of the MyISAM table (this may happen if MySQL is used for some data input work ). This option only occasionally refreshes the index cache, instead of refreshing it after each insert.

If you want to use delayed index refresh within the server range, you only need to use the -- delayed-key-write option to start mysqld. In this case, the index block write operation is delayed until the block must be refreshed to free up space for other index values, or after a flush-tables command is executed, or the index table is closed.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.