Detailed MySQL performance optimization (I)

Source: Internet
Author: User

I. Main application scenarios of MySQL

1. Web site systems

2. Logging systems

3. Data warehouse systems

4. Embedded systems

II. MySQL architecture components

III. MySQL storage engine overview

1) MyISAM Storage Engine

Each table of the MyISAM storage engine is stored in the database as three physical files named after the table. There is always a .frm file holding the table structure definition, which every storage engine needs, plus a .MYD file and a .MYI file, which contain the table data (.MYD) and the index data (.MYI) respectively. A MyISAM table has exactly these three files, which means that no matter how many indexes the table has, they are all stored in the same .MYI file.

MyISAM supports the following three types of indexes:

1. B-tree Index

In a B-tree index, as the name suggests, the index nodes are stored in a balanced tree structure, and all the index data nodes are in the leaf nodes.

2. R-tree Index

The R-tree index is stored differently from the B-tree index and is designed mainly for indexing spatial and multidimensional data, so the current version of MySQL supports it only on columns of the geometry type.

3. Full-text Index

The Full-text index is the full-text index we often talk about, and its storage structure is also a B-tree. Its main purpose is to address the inefficiency of queries that would otherwise have to use LIKE.

2) InnoDB Storage Engine

1. Transaction-safe support

2. Multi-version data reads

3. Improved locking mechanism

4. FOREIGN KEY support

3) NDB Cluster Storage Engine

The NDB storage engine, also known as the NDBCLUSTER storage engine, is used mainly in MySQL Cluster distributed environments. Cluster is a feature that MySQL has offered since version 5.0.

4) Merge Storage engine

The MERGE storage engine is also referred to in the MySQL user manual as the MRG_MyISAM engine. Why? Because the MERGE engine can be understood simply as a wrapper that presents several MyISAM tables of identical structure through a single access point, which reduces the complexity of the application. To create a MERGE table, the base tables must have exactly the same structure, including the order of the fields, and their indexes must also be exactly the same.

5) Memory Storage engine

The MEMORY storage engine, as the name makes clear, is a storage engine that keeps its data in memory. It does not store any data on disk; only a .frm file with the table structure definition is kept on disk. So once MySQL or the host crashes, only the structure of a MEMORY table is left. MEMORY tables support indexes, in both hash and B-tree formats. Because the data lives in memory, MEMORY stores it in fixed-length space and does not support BLOB and TEXT fields. The MEMORY storage engine uses table-level locking.

6) BDB Storage Engine

The BDB storage engine, short for BerkeleyDB storage engine, was not developed by MySQL itself but is provided by Sleepycat Software. It is also an open-source storage engine, and it supports transaction safety.

7) Federated Storage Engine

The functionality of the FEDERATED storage engine is broadly similar to Oracle's DBLINK: it provides access to data on a remote MySQL server. If we install MySQL by compiling the source code, we must explicitly enable the FEDERATED storage engine, because MySQL does not enable it by default.

8) Archive Storage Engine

The ARCHIVE storage engine is used mainly to store outdated, infrequently accessed historical data in less storage space. ARCHIVE tables do not support indexes. A table is implemented as a .frm structure definition file, an .ARZ compressed data file, and an .ARM metadata file. Because of the nature of the data it holds, an ARCHIVE table does not support delete or update operations; only insert and query operations are supported. Its locking mechanism is row-level locking.

9) blackhole Storage Engine

The BLACKHOLE storage engine is a very interesting one; as its name says, it is a "black hole". Just like the /dev/null device on a UNIX system, whatever we write to it is discarded and never comes back.

10) CSV Storage Engine

The CSV storage engine actually operates on a standard CSV file and does not support indexes. Its main use is for cases where data needs to be exported from the database as a report file. CSV is a fairly standard format supported by many programs, so we can first create a CSV table in the database and then insert the generated report data into it, which gives us a CSV report file.

IV. Factors affecting MySQL Server performance

1 The impact of business requirements on performance

Typical requirement: a statistic showing the total number of forum posts, which must be updated in real time.

2 The impact of system architecture and implementation on performance

The following types of data are not suitable for storage in the database:

Binary Multimedia data

Streaming queue data

Very large text data

Types of data whose caching can improve system performance:

System configuration and rule data;

Basic information about active users;

Personalized, customized information for active users;

Quasi-real-time statistics;

Other data that is accessed frequently but changes rarely;
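The kinds of data listed above are all read far more often than they change, which is what makes them good cache candidates. As a rough illustration only, here is a minimal read-through cache with a time-to-live, written in Python. The class name ReadThroughCache, the loader callback and the TTL value are all assumptions made for this sketch; the article does not prescribe any particular cache implementation.

```python
import time


class ReadThroughCache:
    """Tiny in-process read-through cache with a time-to-live (illustrative only)."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # hypothetical callback that reads from the database
        self._ttl = ttl_seconds    # how long a cached value is considered fresh
        self._clock = clock        # injectable clock, handy for testing
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # still fresh: no database access needed
        value = self._loader(key)  # missing or expired: go to the database once
        self._entries[key] = (now + self._ttl, value)
        return value
```

With a loader that runs the real query, repeated calls to get() for the same key within the TTL hit the database only once; the TTL bounds how stale "quasi-real-time" data can become.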

3 The impact of query statements on system performance

Requirement: for a given group (assume its ID is 1), fetch the user ID (id) and nickname (nick_name) of its members, sorted in descending order of the time they joined the group (user_group.gmt_create), and return 20 of them. Both queries below use LIMIT 100, 20, which skips the first 100 rows and returns the next 20.

Solution One:

SELECT user.id, user.nick_name
FROM user, user_group
WHERE user_group.group_id = 1
AND user_group.user_id = user.id
ORDER BY user_group.gmt_create DESC
LIMIT 100, 20;

Solution Two:

SELECT user.id, user.nick_name
FROM (
SELECT user_id
FROM user_group
WHERE user_group.group_id = 1
ORDER BY gmt_create DESC
LIMIT 100, 20) t, user
WHERE t.user_id = user.id;

Comparing the execution plans of the two solutions, we can see that in the first solution the number of rows that must take part in the join with the user table, as estimated by MySQL from its statistics, is 31156, that is, every row in user_group that satisfies group_id = 1 (the actual number in the system is 20000). In the second solution, only 20 rows take part in the join with the user table, so the second solution should be significantly better than the first.
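To check that the two solutions really return the same rows, the following Python sketch runs both against an in-memory SQLite database. SQLite is only a stand-in here, so it says nothing about MySQL's execution plans or row estimates; the table contents, the index name idx_group_time and the use of integers for gmt_create are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, nick_name TEXT);
    CREATE TABLE user_group (user_id INTEGER, group_id INTEGER, gmt_create INTEGER);
    CREATE INDEX idx_group_time ON user_group (group_id, gmt_create);
""")

# Made-up data: 500 users; users 1..300 join group 1, the rest join group 2.
# gmt_create is simply the user id, so every join time is distinct.
conn.executemany("INSERT INTO user VALUES (?, ?)",
                 [(i, f"nick{i}") for i in range(1, 501)])
conn.executemany("INSERT INTO user_group VALUES (?, ?, ?)",
                 [(i, 1 if i <= 300 else 2, i) for i in range(1, 501)])

# Solution one: join everything in the group, then sort and cut.
SOLUTION_ONE = """
    SELECT user.id, user.nick_name
    FROM user, user_group
    WHERE user_group.group_id = 1
      AND user_group.user_id = user.id
    ORDER BY user_group.gmt_create DESC
    LIMIT 100, 20
"""

# Solution two: sort and cut inside user_group first, then join only 20 rows.
SOLUTION_TWO = """
    SELECT user.id, user.nick_name
    FROM (SELECT user_id
          FROM user_group
          WHERE user_group.group_id = 1
          ORDER BY gmt_create DESC
          LIMIT 100, 20) t, user
    WHERE t.user_id = user.id
"""

rows_one = conn.execute(SOLUTION_ONE).fetchall()
rows_two = conn.execute(SOLUTION_TWO).fetchall()

# The outer query of solution two has no ORDER BY, so compare as sets.
same_rows = set(rows_one) == set(rows_two)
```

Note that the outer query of the second solution has no ORDER BY of its own, so if the final order matters, the application (or an extra ORDER BY) has to restore it.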

4 The impact of schema design on system performance

Minimize the number of database access requests.

Minimize queries for data that is not needed.

5 The impact of hardware environment on system performance

1. Typical OLTP application system

Among the various database system environments, OLTP systems are the most common. They are characterized by high concurrency and a large overall data volume, but each access touches little data, the accessed data is relatively scattered, and active data makes up only a modest share of the total. For such systems the database is actually the hardest to maintain and to optimize, and the demands on overall host performance are also the highest, because not only is the access volume very high, the data volume is not small either.

From these characteristics and the analysis above, we can outline a rough direction for OLTP systems:

Although the overall data volume is large, active data accounts for a small proportion of it, so we can expand memory capacity to cache as much of the active data in memory as possible;

Although IO access is very frequent, each access touches little data and the accesses are scattered, so what we need from disk storage is good IOPS performance, with throughput a secondary factor;

Concurrency is very high, so the CPU naturally has to handle many requests per second and needs relatively strong processing capacity;

Although the amount of data in each interaction with the client is not particularly large, network interaction is very frequent, so the network equipment between the host and the clients cannot be too weak in traffic capacity.

2. Typical OLAP application system

The main features of OLAP systems, which are used for data analysis, are that the data volume is very large and concurrent access is low, but each access needs to retrieve a lot of data, the data accessed is relatively concentrated, and there is no clear notion of active data.

Based on these characteristics and the corresponding analysis, the rough strategy for hardware optimization of an OLAP system is as follows:

The data volume is very large, so the unit capacity of the disk storage system needs to be as large as possible;

Each access reads a large amount of data and the data accessed is concentrated, so the IO system should deliver as much throughput per second as possible, and disks should be chosen for maximum throughput;

Although the IO performance requirements are relatively high, there are few concurrent requests, so CPU processing capacity is unlikely to become a bottleneck and does not need to meet very strict requirements;

Although each request touches a large amount of data, most of it is not returned to the client and the final result set is small, so the requirements on the network equipment between the host and the clients are not very high;

In addition, because each operation in an OLAP system runs for a long time and parallelizes well, an OLAP system is usually a cluster of multiple hosts, and the volume of data exchanged between hosts in the cluster is generally very large, so the requirements on the network equipment between cluster hosts are very high.

3. Besides the two typical applications above, there is a special class of application systems whose data volume is not particularly large but whose access requests are extremely frequent, most of them reads. Such a system may have to serve tens of thousands of requests per second; each request is very simple and usually returns only one or a few small records, as in a database-backed DNS service.

Although the data volume is small, access is extremely frequent, so most of the data can be cached in a larger memory, which guarantees a very high hit rate; disk IO is low, so the disks do not need to be especially fast;

Concurrent requests are very frequent, so strong CPU processing capacity is needed to handle them;

Although the application interacts with the database very often, each interaction carries little data. The overall traffic may be fairly large, but generally an ordinary gigabit NIC is enough.

V. Introduction to the MySQL locking mechanism

Row-level locking (Row-level)

Table-level Locking (Table-level)

Page-level locking (Page-level)

In MySQL, the non-transactional storage engines such as MyISAM, MEMORY and CSV use table-level locking, while row-level locking is used mainly by the InnoDB and NDB Cluster storage engines. Page-level locking is mainly the locking method of the BerkeleyDB storage engine.

VI. Optimization of MySQL queries

The ideas and principles of query optimization mainly cover the following points:

1. Optimize the queries that most need optimizing;

2. Locate the performance bottleneck of the query being optimized;

3. Set a clear optimization goal;

4. Start from EXPLAIN;

5. Make frequent use of profiling;

6. Always drive a large result set with a small result set;

7. Complete sorting within the index as far as possible;

8. Take out only the columns you need;

9. Use only the most effective filter conditions;

10. Avoid complex joins and subqueries as far as possible;

Rational design and utilization of indexes

1) B-tree Index

In general, most B-tree index files in MySQL are stored in a balanced tree structure: all the actual data is stored in the leaf nodes of the tree, and the shortest path to any leaf node has exactly the same length, which is why we call it a B-tree index. Of course, different databases (or different MySQL storage engines) may alter the structure slightly when storing their own B-tree indexes. The B-tree index of the InnoDB storage engine, for example, actually uses a B+tree, a small modification of the B-tree data structure: besides the index key information, each leaf node also stores a pointer to the next adjacent leaf node, mainly to speed up retrieval across multiple adjacent leaf nodes.

2) Hash Index

Hash indexes are not used much in MySQL. At present they are used mainly by the MEMORY storage engine, where hash is the default index type. A hash index applies a hash algorithm to the key value to be indexed and stores the resulting hash value in a hash table. On every lookup, the search condition is hashed with the same algorithm, and the result is compared against the hash values in the hash table to obtain the corresponding information.

A hash index can only satisfy "=", "IN" and "<=>" queries and cannot be used for range queries;

A hash index cannot be used to avoid sorting the data;

A hash index cannot be queried with only part of the index key;

A hash index can never avoid accessing the table data;

When a hash index contains a large number of equal hash values, its performance is not necessarily better than that of a B-tree index;
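The first two limitations follow directly from the data structure. The Python sketch below models a hash index with a dict and an ordered (B-tree-like) index with a sorted list searched by bisect. It is a simplified model under stated assumptions, not how MySQL stores either index, and the sample rows and function names are invented for illustration.

```python
import bisect
from collections import defaultdict

# Made-up table rows: (id, name, age). The index is on age.
rows = [(1, "alice", 30), (2, "bob", 25), (3, "carol", 35), (4, "dave", 25)]

# Hash-style index: key -> positions of matching rows. No ordering between keys.
hash_index = defaultdict(list)
for pos, row in enumerate(rows):
    hash_index[row[2]].append(pos)

# Ordered index: (key, position) pairs kept sorted by key, like B-tree leaf entries.
ordered_index = sorted((row[2], pos) for pos, row in enumerate(rows))


def hash_lookup(age):
    """Equality lookup: one hash probe, then fetch the rows."""
    return [rows[p] for p in hash_index.get(age, [])]


def hash_range(low, high):
    """Range query on a hash index: every key has to be examined."""
    return [rows[p]
            for key, positions in hash_index.items()
            if low <= key <= high
            for p in positions]


def ordered_range(low, high):
    """Range query on an ordered index: two binary searches, then a contiguous slice."""
    start = bisect.bisect_left(ordered_index, (low, -1))
    end = bisect.bisect_right(ordered_index, (high, len(rows)))
    return [rows[p] for _, p in ordered_index[start:end]]
```

ordered_range() also returns its rows already sorted by age, which is why an ordered index can stand in for a sort while a hash index cannot.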

3) Full-text Index

The Full-text index is what we usually call full-text indexing. At the time of writing only the MyISAM storage engine supported it in MySQL (InnoDB gained full-text indexes in MySQL 5.6), and not all data types support it: only CHAR, VARCHAR and TEXT columns can have Full-text indexes.

Indexes can greatly improve the efficiency of data retrieval and also the performance of sorting and grouping operations, but we cannot overlook one problem: an index is a separate set of data, completely independent of the underlying data, so updating the data brings extra IO and the computation needed to adjust the index.

Some principles for deciding whether to create an index: columns used frequently as query conditions should be indexed; columns with very low uniqueness (poor selectivity) are not suitable for a single-column index, even if they are frequently used as query conditions; and columns that are updated very frequently are not suitable for indexing;

Columns that never appear in a WHERE clause should not be indexed;

Optimization of JOIN statements

Minimize the total number of nested loop iterations in a join statement; "always drive a large result set with a small result set".

Give priority to optimizing the inner loop of the nested loop;

Ensure that the join condition column on the driven table in the JOIN statement is indexed;

When it cannot be guaranteed that the join condition column of the driven table is indexed and memory resources are sufficient, do not be too stingy with the join buffer setting;
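The first and third points can be seen in a toy model of a nested loop join. In the Python sketch below, the driven table is represented by a dict that plays the role of an index on the join column, and the function counts how many index lookups the join performs. This is an illustration under simplifying assumptions (made-up tables, one lookup per driving row), not MySQL's join executor.

```python
def build_index(rows):
    """Model an index on the join column: join_key -> list of payloads."""
    index = {}
    for key, payload in rows:
        index.setdefault(key, []).append(payload)
    return index


def nested_loop_join(driving_rows, driven_index):
    """Outer loop over the driving rows, one index lookup into the driven table each."""
    lookups = 0
    result = []
    for key, payload in driving_rows:
        lookups += 1                       # one probe of the driven table's index
        for match in driven_index.get(key, []):
            result.append((key, payload, match))
    return result, lookups


# Made-up data: a 3-row result set and a 1000-row result set joined on the key.
small = [(k, f"s{k}") for k in (10, 20, 30)]
large = [(k, f"l{k}") for k in range(1000)]

# Small set drives: 3 lookups. Large set drives: 1000 lookups. Same join result size.
rows_small_drives, lookups_small_drives = nested_loop_join(small, build_index(large))
rows_large_drives, lookups_large_drives = nested_loop_join(large, build_index(small))
```

Both orders produce the same three joined rows, but the number of probes into the driven table equals the size of the driving set, which is the reason for letting the small result set drive. Without the index, each probe would instead be a scan of the driven table.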

ORDER BY, GROUP BY and DISTINCT optimizations

1) Implementation and optimization of ORDER BY

When optimizing ORDER BY in a query, making use of an existing index to avoid the actual sort computation can greatly improve the performance of the ORDER BY operation.
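The effect of an index on ORDER BY can be observed in a query planner's output. The Python sketch below uses SQLite's EXPLAIN QUERY PLAN purely as a stand-in, since it runs without a server; MySQL's EXPLAIN has a different format (there the sign of an explicit sort is "Using filesort" in the Extra column). The table layout and the index name idx_group_time are assumptions for the example, and the check relies on SQLite describing an explicit sort step with the words "ORDER BY" in its plan text.

```python
import sqlite3


def plan_details(conn, sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


def needs_sort(plan):
    """SQLite reports an explicit sort step as 'USE TEMP B-TREE FOR ORDER BY'."""
    return any("ORDER BY" in detail for detail in plan)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_group (user_id INTEGER, group_id INTEGER, gmt_create INTEGER)"
)

QUERY = "SELECT user_id FROM user_group WHERE group_id = 1 ORDER BY gmt_create DESC"

# Without a suitable index the planner has to sort the matching rows.
plan_without_index = plan_details(conn, QUERY)

# With an index on (group_id, gmt_create) the rows come out of the index in order.
conn.execute("CREATE INDEX idx_group_time ON user_group (group_id, gmt_create)")
plan_with_index = plan_details(conn, QUERY)
```

The index works here because its leading column matches the equality filter and its second column matches the sort column, so the rows for one group are already stored in gmt_create order.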

Optimizing Sorting:

1. Increase the max_length_for_sort_data parameter setting;

2. Remove unnecessary return fields;

3. Increase the sort_buffer_size parameter setting;

2) Implementation and optimization of GROUP BY

Because GROUP BY in practice also requires a sort, it is, compared with ORDER BY, essentially a grouping step added after the sort. If aggregate functions are used when grouping, those also need to be computed. As a result, GROUP BY can take advantage of indexes in the same way as ORDER BY.

3) Implementation and optimization of DISTINCT

DISTINCT is very similar to GROUP BY, except that only one record is taken from each group after the grouping. The implementation of DISTINCT is therefore basically the same as that of GROUP BY. It can likewise be done through a loose index scan or a tight index scan, and when the index alone cannot complete the DISTINCT, MySQL has to fall back on a temporary table. One difference from GROUP BY, however, is that DISTINCT does not need sorting. That is, for a query that performs only a DISTINCT operation and cannot be completed using the index alone, MySQL uses a temporary table to "cache" the data but does not filesort the data in that temporary table.
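The difference between the two operations can be modeled in a few lines of Python. This is only a conceptual sketch: a set plays the role of the temporary table used for DISTINCT, and sorted() followed by itertools.groupby plays the role of sort-then-group. The function names and sample values are invented, and nothing here reflects MySQL's actual internals.

```python
from itertools import groupby


def distinct_no_sort(values):
    """Deduplicate with a 'temporary table' (a set); rows keep their arrival order."""
    seen = set()
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


def group_by_with_sort(values):
    """Sort first, then collapse each run of equal values into (value, count)."""
    return [(key, len(list(group))) for key, group in groupby(sorted(values))]


sample = ["b", "a", "b", "c", "a"]
distinct_values = distinct_no_sort(sample)   # no sort step involved
grouped_counts = group_by_with_sort(sample)  # requires the sort
```

Both functions find the same set of distinct values; only the grouping variant pays for a sort, and only it returns the values in sorted order.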

Next post address: http://www.jb51.net/article/70530.htm

That is all for this article; I hope it helps with your learning.
