MySQL performance optimization

Source: Internet
Author: User
Tags bit set

Home page of http://blog.csdn.net/uestc_huan/article/category/536350 Bo Master

http://blog.csdn.net/uestc_huan/article/details/6071081 Blog Address

This article does not explain how MySQL server parameters are set from a DBA perspective, but from the perspective of programmers and architects, to illustrate some of the MySQL optimization points to note when writing programs and designing systems. Therefore, the setting meaning and tuning of server parameters are not discussed in detail in this article.

About Indexes

1. Calculation of the depth of b+tree used by MySQL:

B + Tree size per block = size/index value of each block

For example, MySQL each block is 4K, the index value is set to 4 bytes of int, then the size of each index block is 4K/4 = 1k.

So storing 100 million rows of data requires a B + tree height of

h = log1000 (100 million) = 3 (base 1000, logarithm of 100 million). The height of B + tree is not very high, the general depth will not exceed 5.

2. To avoid random IO operations, a joint index can be established to avoid random IO operations. The index-value pair is used to create a federated index. This allows you to read the data directly from the index without having to read the actual data from the index to the disk.

For example: Select Age from User_info where Name=lisi;

Then, to establish a federated index of the Name-age, the data can be read directly from the index. If only the index of name is established, then it needs to locate the row data according to Name=lisi, and then read the age from this location.

3. Using a hash index is faster than a B+tree index if it is just a point query and does not require a range query. is simply to need the where aaa=bbb such query conditions, and do not need where aaa> BBB such a query condition.

4. Using Logloader or dump operation Speed will be more than 5 times times faster than a simple SQL statement.

The 5.log is placed on a separate disk to mitigate the IO bottleneck.

6. Slow query is the main factor that affects MySQL performance. Slow queries are processed primarily by parsing slow query logs. Many of the existing tools can be analyzed. such as Mysqldumpslow, Mysql_slow_log_filter, Mysql_slow_log_parser, Mysqlsal and so on.

Understanding of the business

1. Transactions should have ACID properties, (atomicity, consistency, isolation, durability), and consistency and persistence are better understood. Atomicity and isolation require some clarification.

The atomicity of a transaction refers only to the atomicity of the same transaction itself. When one transaction is not executed or half executed, another transaction (in another connection) can be executed concurrently. This involves the concept of transaction isolation (isolation). You can do a test with two clients to connect to the database, start the transaction separately, and execute two processes respectively. The other transaction is blocked only if the Upate,insert,delete statement is executed in the transaction and the number of rows in the affect is greater than 0. And this is also related to the database isolation level.

The following is an explanation of the degree of isolation.

(1). View the current session isolation level

SELECT @ @tx_isolation;

(2). View the current isolation level of the system

SELECT @ @global. tx_isolation;

(3). Set the current session isolation level

Set session transaction Isolatin level repeatable read;

(4). Set the current isolation level of the system

Set global transaction ISOLATION level repeatable read;

(5). command line, when you start a transaction

Set Autocommit=off or start transaction

Understanding of Isolation Levels

(1) READ UNCOMMITTED

You can see uncommitted data (dirty reads), for example: You believe what people say, but maybe he just says it and doesn't actually do it.

(2) Read Committed

Reads the submitted data. However, data results that may be read multiple times are inconsistent (non-repeatable read, Phantom Read). The reading and writing view is: Read the row data, can write.

(3) Repeatable READ (MySQL default isolation level)

In MySQL, new data for other transactions is not visible and does not produce phantom reads. The multi-version concurrency control (MVCC) mechanism is used to solve the phantom reading problem.

(4) Serializable

Readable, non-writable. Write data must wait for another transaction to end.

7. Two transactions can occur with deadlocks,

As an example:

Transaction1

Start transaction;

Update StockPrice Set close=45.5 where stock_id = 4;

Update StockPrice set close=47.7 where stock_id=3;

Commit;

Transaction2

Start transaction;

Update StockPrice Set close=45.5 where stock_id = 3;

Update StockPrice set close=47.7 where stock_id=4;

Commit;

A deadlock occurs when two transactions have executed the first one and no second is executed.

About data types

1. Avoid using the default null numeric value for the sake of faster indexing (although the problem is not already present in the high version of MySQL), and on the other hand, avoid unnecessary puzzles and bugs from the application.

2. The difference between int (1) and int (20) is only on display, which is exactly the same for the calculation and stored procedure.

3. Decimal in the 4.1 version and before, only the storage type, it is less efficient than the float and double. Demical should only be used in financial figures, as it can specify the desired computational precision.

4. BLOBs and text are the most performance-consuming. Because the memory storage mechanism (storage engine) does not support both types, the disk staging table is used for querying both types of data. Avoid using both types of data whenever possible. If you want to sort the two types of data, use ORDER BY substring (column, length) to go to string type. At this point, if the substring is small, it can be turned into a temporary memory table, the speed is much faster. In addition, the maximum efficiency of the character matching search algorithm is not higher than the efficiency of the KMP algorithm.

5. Enum can be substituted for string type. An enum can save up to 65,535 characters. However, sorting is sorted by enum value instead of by string value.

6. Comparison of datetime and timestamp.

Datatime:8 bytes, independent of time zone, default value is null

Timestamp:4 bytes, associated with the time zone, the default insertion value is the current time.

If there is no special need, timestamp is recommended. Do not use integer data to store time, although it is possible, but not recommended, because there is no benefit.

7. It is generally not recommended to use bit data type, tinyint will be more conducive to expansion. The bit set can also be substituted by the bitwise operation of the tiny int.

8. Do not misuse MySQL type auto-conversion function

Query optimization

1. Using cross-Library operation statements may cause master and slaver inconsistencies

is a statement such as db.table, which may cause master and slaver to be inconsistent.

2.mysql is highly efficient for handling connections and releasing connections. It is designed to be a simple and fast way to query. Therefore, if you can split a very complex query into multiple simple queries, it will be more efficient to connect these small queries to the application layer. The main reason is that complex queries can lead to large-area locking and affect efficiency. Multiple small query locks are much smaller in scope. And the cost of the connection, in MySQL basically can be ignored.

Like what:

Select * FROM Tag

Join Tag_post on Tag_post.tag_id=tag.id

Join post on Tag_post.post_id=post.id

Where tag.tag= ' MySQL ';

Split into

Select * from tag where tag= ' MySQL ';

Select * from Tag_post where tag_id=1234;

Select * from Post where post.id in (123,2343,4545);

If the application layer can handle the splitting and grouping of these data, then the efficiency will be higher.

The benefits of doing this:

(1). The cache is more efficient and does not require the generation of intermediate table temporary tables.

(2) In the application layer to do a combination, more conducive to expansion, you can put some of the tables on a separate server, divided into open.

(3) Smaller range of locks

(4) In the execution efficiency is higher than the execution efficiency of the join. MySQL's sub-query is inefficient. It is said that 6.0 will be introduced Semijoin plan to solve. But you can use fewer joins or fewer joins at this time.

3. Several principles

(1) Optimization of query that needs optimization

(2) Positioning the performance bottleneck of the optimized object, IO or CPU or memory

(3) Clear optimization objectives

(4) Starting with explain, you can use the Force index to check the efficiency of different indexes. SELECT * from TABLE1 Force INDEX (FIELD1).

(5) Open profile: Use the command "set profiling" command. Use show profile to view the summary information. Other profile operations are not described in detail here.

(6) Always use to know the result set to drive a large result set

(7) Complete the sorting in the index whenever possible

(8) Do not take extra data

(9) Use the most effective filter conditions.

(10) Avoid using complex joins and subqueries whenever possible

4. Query statements do not operate on tabular data

For example, do not use SELECT * FROM tablename where From_unixtime (operatetime) < xxxxx;

Instead, use SELECT * FROM tablename where Operatetime < Unixtimstamp (xxxxx);

Table Design Optimization

1. The stored data encoding is best consistent with the encoding of the table.

2. Reasonable use of redundant data according to business characteristics, reduce join query.

3. Reasonable use of data types, can be simple not complicated, can be fixed long do not grow, can not use the character type of character.

4. Indexing is important and the index is designed reasonably.

5. Separate the infrequently used segments separately. A field that is large and not commonly used is not stored in a single table with some of the information that is commonly used.

Some statements

(1) Select for update

For query-update atomic operations

(2) LAST_INSERT_ID ()

Used to ensure uniqueness, such as order number

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.