Optimization of MySQL statements and tables

General steps to optimize SQL
=============================

1. Learn the execution frequency of the various kinds of SQL statements through SHOW STATUS and the characteristics of the application.

SHOW STATUS provides server status information; the same data is available through the mysqladmin extended-status command. SHOW STATUS can display session-level or global statistics as needed. The following counters cover both the MyISAM and InnoDB storage engines:

1. Com_select: the number of SELECT operations executed; incremented only once per query.
2. Com_insert: the number of INSERT operations executed; a batch insert is counted only once.
3. Com_update: the number of UPDATE operations executed.
4. Com_delete: the number of DELETE operations executed.

SHOW STATUS WHERE variable_name = 'Com_select';

The following counters apply only to the InnoDB storage engine, and the accumulation rules differ slightly:

1. Innodb_rows_read: the number of rows returned by SELECT queries.
2. Innodb_rows_inserted: the number of rows inserted by INSERT operations.
3. Innodb_rows_updated: the number of rows updated by UPDATE operations.
4. Innodb_rows_deleted: the number of rows deleted by DELETE operations.

From these counters you can easily tell whether the current database workload is dominated by inserts and updates or by queries, and estimate the rough execution ratio of each kind of SQL statement. The update-type counters count executions, and both committed and rolled-back statements are accumulated. For transactional applications, Com_commit and Com_rollback show how often transactions are committed and rolled back; frequent rollbacks may indicate a problem in how the application is written.

In addition, the following parameters give basic information about the server:

1. Connections: the number of attempts to connect to the MySQL server.
2. Uptime: how long the server has been running.
3. Slow_queries: the number of slow queries.

2. Locate the SQL statements that execute inefficiently, in either of two ways:

1. Use the slow query log. When mysqld is started with the --log-slow-queries[=file_name] option, it writes a log file containing every SQL statement whose execution time exceeds long_query_time (see the configuration sketch after this list, and the related chapters on administration and maintenance).
2. The slow query log records a query only after it has finished, so it is no help while an application is reporting an efficiency problem. In that case, use SHOW PROCESSLIST to view the current MySQL threads, including each thread's state and whether it is waiting on a table lock; this lets you observe SQL execution in real time and tune lock-heavy table operations.

SHOW PROCESSLIST;
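As a minimal sketch of the slow-query-log setup mentioned in point 1: on the older MySQL versions this article targets, the log is enabled with startup options in my.cnf. The file path and threshold below are illustrative assumptions:

[mysqld]
# write statements slower than long_query_time to this file (path is hypothetical)
log-slow-queries = /var/log/mysql/slow.log
# threshold in seconds
long_query_time = 2

From a client session, the active threshold can be checked with:

SHOW VARIABLES LIKE 'long_query_time';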
3. Analyze the execution plan of inefficient SQL with EXPLAIN.

After the steps above have located the inefficient SQL statements, use EXPLAIN or DESC to obtain information on how MySQL executes a SELECT statement, including how the tables are joined and in what order. EXPLAIN shows when an index should be added to a table so that the SELECT can use the index to look up records.

EXPLAIN SELECT * FROM message a LEFT JOIN mytable b ON a.id = b.id WHERE a.id = 1;

Returned result:

+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
|  1 | SIMPLE      | a     | const | PRIMARY       | PRIMARY | 4       | const |    1 |       |
|  1 | SIMPLE      | b     | ALL   | NULL          | NULL    | NULL    | NULL  | 9999 |       |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+

select_type: the type of the SELECT.
table: the table the output row refers to.
type: the join type. When the table has only one row, type is system, the best join type. When the join uses an index, type is ref. When the join uses no index, type is ALL, meaning a full table scan is performed on the table; an index should be created to improve the join efficiency.
possible_keys: the index columns that could be used for the query.
key: the index actually used.
key_len: the length of the index used.
rows: the number of rows to be scanned.
Extra: additional notes on the execution.

4. Identify the problem and take the corresponding optimization measure.

Once the cause of the problem is confirmed, take measures accordingly. In the example above, the full table scan of table b is what makes the efficiency unsatisfactory. After an index is created on the id column of table b (it became the table's primary key in the result below), the number of rows to be scanned drops sharply:

+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
|  1 | SIMPLE      | a     | const | PRIMARY       | PRIMARY | 4       | const |    1 |       |
|  1 | SIMPLE      | b     | const | PRIMARY       | PRIMARY | 4       | const |    1 |       |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+

Optimizing bulk data import
===========================

1. For MyISAM tables, large amounts of data can be imported quickly as follows:

ALTER TABLE tbl_name DISABLE KEYS;
-- batch insert the data
ALTER TABLE tbl_name ENABLE KEYS;

The DISABLE KEYS and ENABLE KEYS commands turn updating of the table's non-unique indexes off and back on. When importing a large amount of data into a non-empty MyISAM table, wrapping the import in these two commands improves the import efficiency. When importing into an empty MyISAM table, the indexes are created only after the data has been loaded anyway, so the setting is unnecessary.

ALTER TABLE mytable DISABLE KEYS;
INSERT INTO mytable (id, username, city, age) VALUES
    (1, 'name1', 'city1', 10),
    (2, 'name2', 'city2', 20),
    (3, 'name3', 'city3', 30);
ALTER TABLE mytable ENABLE KEYS;

2. For InnoDB tables this method does not improve the import efficiency. Instead, the following measures help:

1. InnoDB tables are stored in primary key order, so sorting the imported data by primary key beforehand effectively improves the import efficiency. If an InnoDB table has no primary key, an internal column is created by default to act as one; defining an explicit primary key lets you exploit this behavior.
2. Run SET unique_checks = 0 before the import to disable uniqueness checking, and SET unique_checks = 1 afterwards to restore it; this improves the import efficiency.
3. If the application uses autocommit, run SET autocommit = 0 before the import to disable automatic commits, and SET autocommit = 1 afterwards to re-enable them; this also improves the import efficiency.

SET unique_checks = 0;
SET unique_checks = 1;
SET autocommit = 0;
SET autocommit = 1;
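Putting the three InnoDB measures together, a minimal sketch of an import session could look like this (the table and rows reuse the illustrative mytable from above):

SET unique_checks = 0;  -- skip uniqueness checks while loading
SET autocommit = 0;     -- collect the inserts into one transaction

-- rows pre-sorted by the primary key (id) to match InnoDB's storage order
INSERT INTO mytable (id, username, city, age) VALUES (1, 'name1', 'city1', 10);
INSERT INTO mytable (id, username, city, age) VALUES (2, 'name2', 'city2', 20);
INSERT INTO mytable (id, username, city, age) VALUES (3, 'name3', 'city3', 30);

COMMIT;                 -- flush the whole batch at once
SET autocommit = 1;     -- restore the normal settings
SET unique_checks = 1;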
Optimizing INSERT statements
============================

1. If you insert many rows at the same time, use an INSERT statement with multiple VALUES lists. This is considerably faster than separate single-row INSERT statements (several times faster in some cases):

INSERT INTO test VALUES (1, 2), (1, 3), (1, 4), ...;

2. If you insert many rows from different clients, INSERT DELAYED gives a higher speed. DELAYED means the statement returns to the client immediately, while the rows are queued in memory and not yet actually written to disk; this is much faster than inserting each row separately. LOW_PRIORITY has the opposite effect: the insert is performed only after all other clients have finished reading from the table.
3. Store the index file and the data file on different disks (using the table creation options).
4. For batch inserts into MyISAM tables, increase the value of the bulk_insert_buffer_size variable to speed things up; this applies only to MyISAM tables.
5. When loading a table from a text file, use LOAD DATA INFILE; it is usually 20 times faster than a long series of INSERT statements (see the sketch after this list).
6. Use REPLACE instead of INSERT where the application calls for it.
7. Use the IGNORE keyword to skip duplicate records where the application allows it.

INSERT DELAYED INTO mytable (id, username, city, age) VALUES (4, 'name4', 'city4', 40);
INSERT LOW_PRIORITY INTO mytable (id, username, city, age) VALUES (5, 'name5', 'city5', 50);
REPLACE INTO mytable (id, username, city, age) VALUES (5, 'name5', 'city5', 50);
INSERT IGNORE INTO mytable (id, username, city, age) VALUES (5, 'name5', 'city5', 50);
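As a sketch of point 5, assuming the rows sit in a tab-separated text file, one row per line (the file path and format are illustrative assumptions):

-- load /tmp/mytable.txt straight into the table, bypassing per-row INSERT overhead
LOAD DATA INFILE '/tmp/mytable.txt'
    INTO TABLE mytable
    FIELDS TERMINATED BY '\t'
    LINES TERMINATED BY '\n'
    (id, username, city, age);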
Optimizing GROUP BY statements
==============================

By default, MySQL sorts the result of GROUP BY col1, col2, ... as if you had also specified ORDER BY col1, col2, ... in the query. If the statement explicitly includes an ORDER BY clause with the same columns, MySQL optimizes it away without a speed penalty, although the sort is still performed. If the query uses GROUP BY but you want to avoid the cost of sorting the result, you can suppress the sort by specifying ORDER BY NULL:

SELECT * FROM mytable GROUP BY username ORDER BY NULL;

Optimizing ORDER BY statements
==============================

In some cases, MySQL can use an index to satisfy an ORDER BY clause without any extra sorting. This requires that the WHERE condition and the ORDER BY clause use the same index, that the ORDER BY column order matches the index order, and that the ORDER BY columns are all ascending or all descending. For example, the following statements can use an index:

SELECT * FROM t1 ORDER BY key_part1, key_part2, ...;
SELECT * FROM t1 WHERE key_part1 = 1 ORDER BY key_part1 DESC, key_part2 DESC;
SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 DESC;

The following cases, however, cannot use an index:

SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 ASC;
-- the ORDER BY mixes ASC and DESC

SELECT * FROM t1 WHERE key2 = constant ORDER BY key1;
-- the key used to fetch the rows differs from the one in the ORDER BY

SELECT * FROM t1 ORDER BY key1, key2;
-- the ORDER BY uses columns from different indexes

Optimizing nested queries
=========================

MySQL 4.1 and later support SQL subqueries. This technique uses one SELECT statement to produce a single-column result and then uses that result as a filter condition in another query. Subqueries can accomplish in one step what would otherwise require several logical steps, can avoid transactions or table locks, and are easy to write. In some cases, however, a subquery can be replaced by a more efficient JOIN.

Suppose we want to fetch all customers who have no order records. This can be done with the following query:

SELECT * FROM customerinfo WHERE customerid NOT IN (SELECT customerid FROM salesinfo);

Completing the same query with a JOIN is much faster, especially when there is an index on salesinfo.customerid:

SELECT * FROM customerinfo
LEFT JOIN salesinfo ON customerinfo.customerid = salesinfo.customerid
WHERE salesinfo.customerid IS NULL;

The JOIN is more efficient because MySQL does not need to create a temporary table in memory to perform the query in two steps.
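If salesinfo.customerid is not yet indexed, a supporting index might look like this (the index name is an illustrative assumption):

CREATE INDEX idx_salesinfo_customerid ON salesinfo (customerid);

-- verify with EXPLAIN that the join now uses the index
EXPLAIN SELECT * FROM customerinfo
LEFT JOIN salesinfo ON customerinfo.customerid = salesinfo.customerid
WHERE salesinfo.customerid IS NULL;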
Adjusting statement scheduling priorities
=========================================

MySQL allows you to change the scheduling priority of statements, which lets queries from multiple clients cooperate better so that no single client waits a long time because of locking. Changing the priority can also ensure that particular kinds of queries are processed faster. First determine the type of the application: is it query-heavy or update-heavy, does query efficiency or update efficiency matter more, and should queries or updates be prioritized? The methods below for changing the scheduling policy are mainly for the MyISAM storage engine; for InnoDB, statement execution order is determined by the order in which row locks are acquired.

MySQL's default scheduling policy can be summarized as follows:

1. Write operations take precedence over read operations.
2. Only one write to a table can happen at any point in time; write requests are processed in the order they arrive.
3. Multiple reads of a table can be performed simultaneously.

MySQL provides several statement modifiers that alter this policy:

1. The LOW_PRIORITY keyword applies to DELETE, INSERT, LOAD DATA, REPLACE, and UPDATE.
2. The HIGH_PRIORITY keyword applies to SELECT and INSERT statements.
3. The DELAYED keyword applies to INSERT and REPLACE statements.

If a write operation is a LOW_PRIORITY request, the system no longer considers it higher priority than reads. In that case, if a second reader arrives while the writer is still waiting, the second reader is allowed in ahead of the writer; the writer gets to start only when there are no readers left, so a LOW_PRIORITY write can in principle be blocked forever. The HIGH_PRIORITY keyword for SELECT queries works similarly: it allows a SELECT to jump ahead of a pending write operation, even though the write would normally have the higher priority. A second effect is that a high-priority SELECT executes ahead of normal SELECT statements, because those are queued behind the write operation. To have all statements that support the LOW_PRIORITY option handled at low priority by default, start the server with the --low-priority-updates option. INSERT HIGH_PRIORITY then raises an individual INSERT statement back to the normal write priority, cancelling the effect of that option for a single statement.

INSERT LOW_PRIORITY INTO mytable (id, username, city, age) VALUES (7, 'name7', 'city7', 70);

Optimizing tables
=================

1. Optimize the table's data types. Which data types a table needs depends on the application. Although field lengths inevitably carry some slack at design time, a large number of over-sized, redundant fields is not recommended: they waste storage and memory. PROCEDURE ANALYSE() can be used to examine the columns of an existing table; each output column carries an optimization suggestion for the data type of the corresponding column, and you can decide whether to apply it based on the actual situation of the application. Syntax:

SELECT * FROM tbl_name PROCEDURE ANALYSE();
SELECT * FROM tbl_name PROCEDURE ANALYSE(16, 256);

The second form tells PROCEDURE ANALYSE() not to suggest ENUM types that would contain more than 16 values or more than 256 bytes; without such limits the output can be very long, and long ENUM definitions are hard to read. For example:

SELECT * FROM mytable PROCEDURE ANALYSE(16, 256);

2. Improve table access efficiency through splitting. The splitting discussed here is mainly for MyISAM tables, and comes in two forms:

1. Vertical splitting: according to how frequently the application accesses each field, put the frequently accessed fields and the rarely accessed fields into two separate tables, keeping the frequently accessed table as compact as possible. This effectively improves the efficiency of queries and updates on the table.
2. Horizontal splitting: according to the application, split the rows across several tables, or partition the data into multiple partitions. This effectively avoids the lock contention caused by reading and updating a MyISAM table at the same time.

3. Denormalize where it pays off. Normalized database design emphasizes the independence of data and keeps redundancy as low as possible, because too much redundant data occupies more physical space and complicates maintenance and consistency checking. But for applications with many query operations, a single query may need to access several tables; if the same record is stored redundantly in one table, the update cost rises only slightly while the query efficiency improves significantly. In such cases, consider using redundancy to improve efficiency.

4. Use the CREATE TEMPORARY TABLE syntax for interim statistics tables. A temporary table is session-based: its data can be held in memory, and the table disappears automatically when the session disconnects. For statistical analysis of a large table, if the amount of data involved in the statistics is small, moving it into a temporary table with INSERT ... SELECT and running the statistics there is more efficient than computing them directly against the large table, as sketched below.
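A minimal sketch of the temporary-table approach; the salesinfo columns and the date filter are illustrative assumptions:

CREATE TEMPORARY TABLE tmp_sales (
    customerid INT,
    amount     DECIMAL(10, 2)
) ENGINE = MEMORY;

-- copy only the small slice needed for the statistics
INSERT INTO tmp_sales
SELECT customerid, amount FROM salesinfo WHERE saledate >= '2009-01-01';

-- run the statistics against the small in-memory table
SELECT customerid, COUNT(*) AS orders, SUM(amount) AS total
FROM tmp_sales GROUP BY customerid;

DROP TEMPORARY TABLE tmp_sales;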
5. Select a more appropriate table type.

1. If serious lock conflicts occur in the application, consider switching the storage engine to InnoDB; its row-level locking mechanism effectively reduces lock conflicts.
2. If the application has many query operations and no strict requirements on transactional integrity, the MyISAM storage engine is a good fit.

Other optimization measures
===========================

1. Use a connection pool. For database access, establishing a connection is expensive, so it is worthwhile to build a "connection pool" to improve performance. Think of a connection as an object or a device: the pool holds a number of already established connections, and code that would otherwise open its own connection to the database borrows one from the pool and returns it once the results have been fetched.

2. Reduce access to MySQL.

1. Avoid repeated retrieval of the same data. The application needs a clear picture of its database access logic: accesses to the same table should be concentrated in the same SQL statement as far as possible, fetching the results in one pass, to reduce repeated round trips to the database.
2. Use the MySQL query cache. The query cache stores the text of a SELECT query together with the corresponding result sent to the client; if an identical query is received later, the server returns the result from the query cache instead of parsing and executing the query again. It is suitable for tables whose data rarely changes: when a table changes (its structure or its data), every query cache entry that uses the table is flushed. The main query cache parameters:

SHOW VARIABLES LIKE '%query_cache%';
-- can also be written as:
SHOW VARIABLES WHERE variable_name LIKE '%query_cache%';

have_query_cache: whether the server was configured with query cache support at installation.
query_cache_size: the size of the query cache.
query_cache_type: ranges from 0 to 2:
    0 or OFF: the cache is disabled.
    1 or ON: the cache is enabled, except for SELECT statements marked SQL_NO_CACHE.
    2 or DEMAND: only SELECT statements marked SQL_CACHE are cached.

Query cache performance can be monitored with SHOW STATUS:

SHOW STATUS LIKE '%Qcache%';

Qcache_queries_in_cache: the number of queries registered in the cache.
Qcache_inserts: the number of queries added to the cache.
Qcache_hits: the number of cache hits.
Qcache_lowmem_prunes: the number of queries deleted from the cache because of insufficient memory.
Qcache_not_cached: the number of non-cached queries (not cacheable, or not cached due to the query_cache_type setting).
Qcache_free_memory: the amount of free memory for the query cache.
Qcache_free_blocks: the number of free memory blocks in the query cache.
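For illustration, the per-statement hints mentioned under query_cache_type look like this (the queries themselves are arbitrary examples):

-- with query_cache_type = 2 (DEMAND), only explicitly marked queries are cached
SELECT SQL_CACHE id, username FROM mytable WHERE city = 'city1';

-- with query_cache_type = 1 (ON), a volatile query can opt out of the cache
SELECT SQL_NO_CACHE COUNT(*) FROM mytable;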
3. Add a cache layer. Caches, memory, and hard disks are all units of data access, but their access speeds differ by orders of magnitude, in descending order. For the CPU, fetching data from the nearest cache is far faster than fetching it from memory or disk. What a cache holds is the data that needs to be accessed repeatedly, and a dedicated mechanism (or program) maintains the hit rate of the data in the cache, so data access speeds up considerably once an application has a cache layer in front of the database. Because a cache manager is responsible for writing data into the cache, the cached content must be read-only for users. Little work is needed on your side: the SQL statements in the program are no different from those used when accessing the DBMS directly, and neither are the returned results. Database vendors usually provide cache-related parameters in the database server configuration file, and tuning them optimizes cache management for the application.

Load balancing
==============

1. Distribute query operations with MySQL replication. MySQL master-slave replication can effectively spread out update and query operations: one master server performs the updates, multiple slave servers handle the queries, and replication keeps the data synchronized between master and slaves. Multiple slaves also improve availability, and each slave can carry different indexes to serve different query needs. If not every table needs to be replicated from the master to the slaves, you can set up a virtual slave on the master server, declare the tables to be replicated as the blackhole engine there, and define the replicate-do-table parameter so that only those tables are replicated; the binlog to be shipped is filtered accordingly, reducing the bandwidth consumed by binlog transfer. Because the virtual slave only filters the binlog and does not actually store any data, its performance impact on the master server is very limited. Note that frequent updates on the master, or network problems, can cause the data on the slaves to diverge from the master, and the application has to be designed with this in mind.

2. Use a distributed database architecture. MySQL supports distributed transactions from version 5.0.3 on, and currently only the InnoDB storage engine supports them (see the sketch below). A distributed database architecture suits scenarios with large data volumes and heavy load, and offers good scalability and high availability. By spreading the data across multiple servers it balances the load among them and improves access efficiency. Concretely, you can use the MySQL Cluster feature (the NDB engine) or implement global transactions in your own programs.
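A minimal sketch of the XA statements behind MySQL distributed transactions, run here against a single InnoDB server; in a real distributed setup a transaction manager would coordinate one such branch per server. The transaction id 'xa1' and the row are illustrative:

XA START 'xa1';                 -- open a branch of the global transaction
INSERT INTO mytable (id, username, city, age) VALUES (8, 'name8', 'city8', 80);
XA END 'xa1';
XA PREPARE 'xa1';               -- phase one: make the branch durable
XA COMMIT 'xa1';                -- phase two: commit (or XA ROLLBACK 'xa1')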

 
