MySQL Optimization Summary (1)

Source: Internet
Author: User

I. The sql_mode setting of MySQL:

(1) The SQL mode controls how the server behaves, and it exists at both the global and the session level. To see the current global or session-level value of the SQL mode, use the following statements:

mysql> SELECT @@GLOBAL.sql_mode;
mysql> SELECT @@SESSION.sql_mode;

(2) To see the SQL mode currently in effect for your own connection (the session value), you can simply use: SELECT @@sql_mode;

 

II. Some MySQL versions use InnoDB as the default storage engine (MySQL 5.5 and later), while older versions use MyISAM. These two engines are the most commonly used. There are six major differences between them:

 

Storage engine comparison: MyISAM vs. InnoDB

Differences in composition:
MyISAM:
1. Each MyISAM table is stored on disk as three files. Each file name starts with the table name, and the extension indicates the file type.
2. The .frm file stores the table definition.
3. The data file has the extension .MYD (MYData).
4. The index file has the extension .MYI (MYIndex).
InnoDB:
The on-disk resources are the InnoDB tablespace data files and their log files. The size of an InnoDB table is limited by the maximum file size of the operating system (2 GB on some older file systems).
Transaction processing:
MyISAM: MyISAM tables emphasize performance and execute faster than InnoDB tables, but they do not support transactions.
InnoDB: Provides advanced database features such as transactions and foreign keys.
SELECT, UPDATE, INSERT, and DELETE operations:
MyISAM: If you mostly execute SELECT statements, MyISAM is the better choice.
InnoDB:
1. If you execute a large number of INSERT or UPDATE statements, use InnoDB tables for performance reasons.
2. For DELETE FROM table, InnoDB does not re-create the table; it deletes the rows one by one.
3. LOAD TABLE FROM MASTER does not work for InnoDB. The workaround is to change the InnoDB table to a MyISAM table, import the data, and then change it back to InnoDB. This workaround does not apply to tables that use InnoDB-only features such as foreign keys.
Operations on AUTO_INCREMENT:
MyISAM:
1. Each table has one AUTO_INCREMENT column that is handled internally.
2. MyISAM automatically updates this column on INSERT and UPDATE operations, which makes AUTO_INCREMENT columns faster (by at least 10%). Once a value at the top of the sequence has been deleted, it is not reused. (When the AUTO_INCREMENT column is defined as the last column of a multi-column index, values deleted from the top of the sequence are reused.)
3. The AUTO_INCREMENT value can be reset with ALTER TABLE or myisamchk.
4. In InnoDB, an AUTO_INCREMENT column must be the first (or only) column of an index. In a MyISAM table, it can also be a secondary column of a multi-column index together with other columns.
5. Better and faster AUTO_INCREMENT handling.
InnoDB:
1. If a table has an AUTO_INCREMENT column, the InnoDB table handle in the data dictionary contains a special counter, the auto-increment counter, which is used to assign new values to the column.
2. The auto-increment counter is stored only in main memory, not on disk.
3. For details of how this counter is implemented, see "How does the AUTO_INCREMENT column work in InnoDB?"
Number of rows in a table:
MyISAM: For SELECT COUNT(*) FROM table, MyISAM simply reads the stored row count. Note that when the COUNT(*) statement contains a WHERE condition, both engines do the same work.
InnoDB: InnoDB does not store the exact number of rows in a table. That is, when you execute SELECT COUNT(*) FROM table, InnoDB has to scan the entire table to count the rows.
Locks:
MyISAM: Table-level locks.
InnoDB: Provides row-level locking and Oracle-style non-locking consistent reads in SELECT statements. Note that InnoDB row locks are not absolute: if MySQL cannot determine the range to scan when executing a statement, InnoDB ends up locking the whole table as well, for example: UPDATE table SET num = 1 WHERE name LIKE "%aaa%"

 

 

III. Why should InnoDB be used when there are a large number of data operations?
Transaction processing means that a group of statements is executed as a single atomic operation.
For example, suppose a forum-style site is built on InnoDB tables, which support transactions, and users earn points for posting. When you publish a post, an INSERT statement stores the post content, and afterwards an UPDATE statement must run to increase your points. Suppose the INSERT succeeds but the UPDATE is never executed: you have published a post but did not receive the corresponding points, which leads to user dissatisfaction. With transaction processing, the INSERT and the UPDATE are executed inside one transaction, and the data is written to the database only when both statements succeed. If either statement fails, everything is rolled back to the initial state and nothing is written. This guarantees that the INSERT and the UPDATE take effect together or not at all.
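The post-and-points scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard-library sqlite3 module as a stand-in for an InnoDB connection, and the table names (users, posts) and the function names (make_demo_db, post_with_points) are hypothetical, not taken from the original article.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with one user who has 0 points."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, points INTEGER NOT NULL)")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, content TEXT)")
    conn.execute("INSERT INTO users (id, points) VALUES (1, 0)")
    conn.commit()
    return conn


def post_with_points(conn, user_id, content, fail_before_update=False):
    """Insert a post and award a point inside a single transaction.

    Returns True if both statements were committed, False if the
    transaction was rolled back.
    """
    try:
        # "with conn" commits if the block succeeds and rolls back on an exception.
        with conn:
            conn.execute(
                "INSERT INTO posts (user_id, content) VALUES (?, ?)",
                (user_id, content),
            )
            if fail_before_update:
                # Simulated failure between the INSERT and the UPDATE.
                raise RuntimeError("simulated failure before the UPDATE")
            conn.execute(
                "UPDATE users SET points = points + 1 WHERE id = ?",
                (user_id,),
            )
        return True
    except RuntimeError:
        return False
```

When the simulated failure is triggered, the already executed INSERT is rolled back as well, so the number of posts and the number of points stay consistent.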
MyISAM tables support neither transactions nor foreign keys. Foreign keys should need no introduction; if you are not familiar with them, look them up online.
In addition, when write operations (INSERT, UPDATE, DELETE) are performed, a MyISAM table locks the whole table, whereas an InnoDB table locks rows. In other words, while an UPDATE statement runs on a MyISAM table, the entire table is locked and other INSERT, DELETE, and UPDATE statements are blocked; they are executed one after another once the UPDATE has finished.
Row locking means that when you execute an UPDATE statement, only the affected record is locked. Only other writes to that record are blocked until the UPDATE completes; writes to other records are not affected.
Therefore, when your database has a large number of writes and updates and comparatively few queries, or when data integrity requirements are high, choose InnoDB tables. When your database mainly serves queries, with comparatively few updates and writes, and the integrity requirements for business data are less strict, choose MyISAM tables, because queries on MyISAM tables are faster than on InnoDB tables.

 

IV. How should we optimize large tables with tens of millions of records?

1) Data capacity: what is the total number of records over the next 1-3 years, and roughly how many bytes does each record take?
2) Data items: are there large fields, and are the values of those fields updated frequently?
3) SQL conditions for queries: which columns frequently appear in the WHERE, GROUP BY, and ORDER BY clauses?
4) SQL conditions for updates: how many columns frequently appear in the WHERE clause of UPDATE or DELETE statements?
5) Ratio of SQL statement types, for example: SELECT : UPDATE + DELETE : INSERT = ?
6) What is the average daily number of SQL statements executed against the large table and its associated tables?
7) Data in the table: is the business update-oriented or query-oriented?
8) Which physical database servers will be used, and what is the database server architecture?
9) What level of concurrency is expected?
10) Should the storage engine be InnoDB or MyISAM?
Once you have a rough answer to these 10 questions, it should be clear how to design such a large table.
If optimization refers to a table that already exists and whose structure cannot be changed, we recommend the InnoDB engine and using more memory to reduce the disk I/O load, because I/O is often the bottleneck of a database server. In addition, if you are thinking of changing the index structure to solve performance problems, we recommend first rewriting the SQL statements to make them faster rather than relying on the index structure alone; this assumes, of course, that the indexes have already been designed well. If the workload is read-oriented, you can consider enabling the query cache (query_cache, removed in MySQL 8.0) and adjusting some parameter values: sort_buffer_size, read_buffer_size, read_rnd_buffer_size, join_buffer_size.

 

V. Optimization rules for MySQL IN and EXISTS:

When the dataset of table B is smaller than that of table A, IN is better than EXISTS. When the dataset of table A is smaller than that of table B, EXISTS is better than IN.

Optimization principle: the small table drives the large table, that is, the small dataset drives the large dataset.

Example: select * from A where id in (select id from B) is equivalent to:

for select id from B
for select * from A where A.id = B.id

When the dataset of table B is smaller than that of table A, IN is better than EXISTS.

select * from A where exists (select 1 from B where B.id = A.id) is equivalent to:

for select * from A
for select * from B where B.id = A.id

When the dataset of table A is smaller than that of table B, EXISTS is better than IN.
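The "small dataset drives the large dataset" rule can be made concrete with a toy model in Python. This sketch follows the article's simplified nested-loop picture only; the function names (in_style, exists_style) are hypothetical, a Python set stands in for an index lookup, and A.id is assumed to be unique. Recent MySQL optimizers may rewrite IN subqueries (for example as semi-joins), so real execution plans can differ from this model.

```python
def in_style(a_ids, b_ids):
    """IN: the subquery result (B) is the outer loop; each B.id probes A."""
    a_index = set(a_ids)  # stands in for an index on A.id
    outer_iterations = 0
    matches = []
    for b_id in sorted(set(b_ids)):
        outer_iterations += 1
        if b_id in a_index:
            matches.append(b_id)
    return matches, outer_iterations


def exists_style(a_ids, b_ids):
    """EXISTS: table A is the outer loop; each A.id probes B."""
    b_index = set(b_ids)  # stands in for an index on B.id
    outer_iterations = 0
    matches = []
    for a_id in sorted(a_ids):
        outer_iterations += 1
        if a_id in b_index:
            matches.append(a_id)
    return matches, outer_iterations
```

Both functions return the same matching ids; what differs is the number of outer-loop iterations, which is driven by B for IN and by A for EXISTS. Whichever table is smaller should therefore be the one in the outer loop.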

 

VI. MySQL ORDER BY usage and optimization details:

The ORDER BY keyword sorts the records in a result set by the specified column(s).

SELECT column_name(s) FROM table_name ORDER BY column_name;

1) ORDER BY index optimization.

SELECT [column1], [column2], ... FROM [TABLE] ORDER BY [sort];

An index on the [sort] column lets MySQL use the index for the ORDER BY.

2) WHERE + ORDER BY index optimization.

SELECT [column1], [column2], ... FROM [TABLE] WHERE [columnX] = [value] ORDER BY [sort];

Create a composite index (columnX, sort) to optimize the ORDER BY. If the WHERE clause matches columnX against multiple values (for example with IN), this index cannot be used to optimize the ORDER BY.

3) WHERE + ORDER BY on multiple columns.

SELECT * FROM [TABLE] WHERE uid = 1 ORDER BY x, y LIMIT 0, 10;

Creating an index (uid, x, y) to optimize the ORDER BY is much more effective than creating an index (x, y, uid).

In some cases MySQL can use an index to satisfy the ORDER BY clause without an extra sort. This requires that the WHERE condition and the ORDER BY use the same index, that the ORDER BY columns follow the order of the index columns, and that the ORDER BY columns are either all ascending or all descending. For example:

SELECT * FROM t1 ORDER BY key_part1, key_part2, ...;

SELECT * FROM t1 WHERE key_part1 = 1 ORDER BY key_part1 DESC, key_part2 DESC;

SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 DESC;

However, the index cannot be used in the following situations:

① SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 ASC;
-- The ORDER BY columns mix ASC and DESC
② SELECT * FROM t1 WHERE key2 = constant ORDER BY key1;
-- The key used to fetch the rows is different from the key used in ORDER BY
③ SELECT * FROM t1 ORDER BY key1, key2;
-- ORDER BY is applied to different keys
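The rules above can be summarized as a small checker. This is a deliberately simplified model of the cases listed in this section, not the optimizer's actual logic: the function name index_can_sort and its parameters are hypothetical, and the "no mixed ASC/DESC" rule reflects the behavior described here (MySQL 8.0 descending indexes change that rule).

```python
def index_can_sort(index_cols, equality_cols, order_by):
    """Return True if a single index can deliver rows in ORDER BY order.

    index_cols    -- columns of the index used to fetch rows, in index order
    equality_cols -- columns compared to a constant in WHERE (col = const)
    order_by      -- list of (column, "ASC" | "DESC") pairs
    """
    # Rule: all ORDER BY columns must sort in the same direction.
    directions = {direction.upper() for _, direction in order_by}
    if len(directions) > 1:
        return False

    # Columns pinned to a constant by WHERE do not affect the ordering.
    wanted = [col for col, _ in order_by if col not in equality_cols]

    i = 0
    for col in wanted:
        # Leading index columns fixed by an equality condition can be skipped.
        while (i < len(index_cols)
               and index_cols[i] != col
               and index_cols[i] in equality_cols):
            i += 1
        # The next index column must be the next ORDER BY column.
        if i >= len(index_cols) or index_cols[i] != col:
            return False
        i += 1
    return True
```

For instance, index_can_sort(["uid", "x", "y"], {"uid"}, [("x", "ASC"), ("y", "ASC")]) models case 3) above, while passing a mixed-direction ORDER BY models situation ①.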

 

VII. Basic Optimization Configuration of InnoDB:

InnoDB settings

1. innodb_buffer_pool_size-- The default value is 128M. This is the most important option to tune because it specifies how much memory InnoDB uses to cache data and indexes. For a dedicated MySQL server, the recommended setting is 50-80% of physical memory. For example, on a machine with 64 GB of physical memory, the buffer pool should be set to around 50 GB.
Setting this value higher carries risks: for example, there may not be enough free memory left for the operating system, or for MySQL subsystems that depend on the file system cache, including the binary logs and the InnoDB transaction logs.
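The 50-80% guideline is simple arithmetic, sketched here in Python. The function name buffer_pool_range_gb is hypothetical, and the percentages are just the ones quoted above for a dedicated MySQL server.

```python
from typing import Tuple


def buffer_pool_range_gb(physical_ram_gb: float) -> Tuple[float, float]:
    """Return the (low, high) buffer pool size in GB for the 50-80% guideline."""
    if physical_ram_gb <= 0:
        raise ValueError("physical_ram_gb must be positive")
    low = physical_ram_gb * 50 / 100
    high = physical_ram_gb * 80 / 100
    return low, high
```

For a 64 GB machine this gives a range of roughly 32 to 51 GB, so the 50 GB suggested above sits near the upper end of the range.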

2. innodb_log_file_size-- The default value is 48M. Systems with high write throughput need to increase this value so that background checkpoint activity can be smoothed out over a longer period, which improves performance. Setting this value below 4 GB is safe. Historically, the downside of large log files was a longer crash recovery time, but this improved significantly in MySQL 5.5 and 5.6.

3. innodb_flush_method-- The default value is fdatasync. If you use a hardware RAID controller, you may need to set it to O_DIRECT. This prevents the "double buffering" effect when pages are read into the InnoDB buffer pool; otherwise the same data is cached twice, once in the file system cache and once in the InnoDB buffer pool.
If you do not use a hardware RAID controller, or if you use SAN storage, O_DIRECT may degrade performance. The MySQL user manual and Bug #54306 describe this in detail.

4. innodb_flush_neighbors-- The default value is 1. It should be set to 0 (disabled) for SSD storage, because on SSDs sequential I/O offers no performance advantage. It should also be disabled on some RAID hardware, because blocks that are logically contiguous cannot be guaranteed to be contiguous on the physical disks.

5. innodb_io_capacity and innodb_io_capacity_max-- These settings affect how much background work InnoDB performs per second. If you know your hardware well (for example, how many I/O operations it can perform per second), you can put that capacity to use rather than leaving it idle.

There is a good analogy: if a flight has unsold seats, it can be a good strategy to move some passengers booked on later flights onto it, in case bad weather strikes later. In other words, InnoDB handles background work opportunistically to reduce contention with real-time operations later on.

There is a simple calculation: if each disk can handle up to 200 reads/writes per second (IOPS), a RAID 10 array with 10 disks theoretically delivers (10/2) * 200 = 1000 IOPS. I call this calculation "very simple" because RAID controllers can usually merge requests and thereby deliver higher effective IOPS. For SSDs, IOPS can easily reach several thousand.
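The RAID 10 estimate can be written as a small helper. This is a sketch of the rule of thumb quoted above only; the function name raid10_iops is hypothetical, and the result ignores controller-level merging and caching, which the text notes can raise the real figure.

```python
def raid10_iops(disk_count: int, iops_per_disk: int) -> int:
    """Theoretical IOPS of a RAID 10 array: (disks / 2) * per-disk IOPS."""
    if disk_count < 4 or disk_count % 2 != 0:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    if iops_per_disk <= 0:
        raise ValueError("iops_per_disk must be positive")
    return (disk_count // 2) * iops_per_disk
```

With the numbers from the text, raid10_iops(10, 200) reproduces the (10/2) * 200 calculation.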

Setting these two values too high carries some risk. You certainly do not want background operations to get in the way of foreground I/O. Past experience shows that setting them too high causes performance degradation because of internal locks held by InnoDB (as far as I know, this has been greatly improved in MySQL 5.6).

6. innodb_lru_scan_depth-- The default value is 1024. This is a new option introduced in MySQL 5.6. Mark Callaghan provides some configuration suggestions. Simply put, if you increase innodb_io_capacity, you should increase innodb_lru_scan_depth as well.


Replication

If the server needs to support master-slave replication or point-in-time recovery, we need the following:

1. log-bin-- Enables the binary log. By default the binary log is not crash safe, but as I mentioned in an earlier article, I recommend that most users aim for durability. In that case you also need to enable: sync_binlog = 1, sync_relay_log = 1, relay-log-info-repository = TABLE, and master-info-repository = TABLE.

2. expire-logs-days-- By default old logs are kept forever. I recommend setting this to 1-10 days. Keeping logs for longer is not very helpful, because restoring from a backup will be much faster.

3. server-id-- Every server in a master-slave replication system (replication topology) must have a unique server-id.

4. binlog_format = ROW-- Switches to row-based replication. A recent article of mine on row-based replication explains why I really like it: it can improve performance by reducing resource locking. You also need to enable two additional settings: transaction-isolation = READ-COMMITTED and innodb_autoinc_lock_mode = 2.

Other configurations (Misc)

1. timezone = GMT-- Sets the time zone to GMT. A growing number of system administrators recommend setting all servers to GMT. I personally like this a lot, because almost all business is global these days, and picking your local time zone seems somewhat arbitrary.

2. character-set-server = utf8mb4 and collation-server = utf8mb4_general_ci-- As described in a previous article, UTF-8 is a better default for new applications. You can also set skip-character-set-client-handshake to ignore any other character set that the application tries to set.

3. sql-mode-- By default MySQL is very forgiving of non-standard data and silently truncates it. As I mentioned in a previous article, it is best to use the following setting for new applications:

STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZERO,NO_ENGINE_SUBSTITUTION,NO_ZERO_DATE,NO_ZERO_IN_DATE,ONLY_FULL_GROUP_BY.

4. skip-name-resolve-- Disables reverse DNS lookups. DNS resolution can be slow or unreliable on some systems, so if you do not need grants based on host names, I recommend avoiding these lookups.

 

5. max_connect_errors-- Todd Farmer wrote that this feature "provides no meaningful protection against brute-force access attacks". In fact, when skip-name-resolve is set, max_connect_errors does not even apply (see the previous item).

A firewall is a more suitable solution. I usually block port 3306 on both the public and the internal network, so that only specific applications can connect to MySQL.
I usually set max_connect_errors = 100000 so that I never have to double-check whether it is getting in the way.

6. max-connections-- The default value is 151. I have seen many users set it to a fairly large value, mostly between 300 and 1000.
Raising this value is usually unavoidable, but it makes me a little nervous, because a 16-core machine can actually execute only about 2x to 10x that many connections at once.
You may expect most open connections to be idle and sleeping, but if they all become active, a large number of new threads may be created (thread thrash).
Where possible, configure connection pools in the application to solve this problem rather than opening and holding a large number of connections.
Of course, it is also workable for applications without a connection pool (non-pooled) to open connections quickly and close them as soon as the task has finished.
Another solution, available from 5.5 onward (with some differences between the MySQL Community Edition and the Enterprise Edition), is the thread pool plug-in.


Conclusion

Assume that the MySQL server is set up as follows:
1. 64 GB of physical memory
2. Hardware RAID controller (assuming it can deliver 2000 IOPS)
3. Master-slave replication is required
4. New application (i.e. not a legacy system)
5. Firewall protection
6. Grants based on host names are not required
7. A global application that should not be tied to a particular time zone
8. The application should be reliable and durable

The configuration may be as follows:

# InnoDB settings
innodb_buffer_pool_size=50G
innodb_log_file_size=2G
innodb_flush_method=O_DIRECT
innodb_io_capacity=2000
innodb_io_capacity_max=6000
innodb_lru_scan_depth=2000

# Binary log/replication
log-bin
sync_binlog=1
sync_relay_log=1
relay-log-info-repository=TABLE
master-info-repository=TABLE
expire_logs_days=10
binlog_format=ROW
transaction-isolation=READ-COMMITTED
innodb_autoinc_lock_mode=2

# Other
timezone=GMT
character-set-server=utf8mb4
collation-server=utf8mb4_general_ci
sql-mode="STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZERO,NO_ENGINE_SUBSTITUTION,NO_ZERO_DATE,NO_ZERO_IN_DATE,ONLY_FULL_GROUP_BY"
skip-name-resolve
max-connect-errors=100000
max-connections=500

# Unique to this machine
server-id=123

 
