MySQL Performance Optimization (II) _ MySQL

Source: Internet
Author: User
This article summarizes and introduces MySQL Performance Optimization in detail. For more information, see VII. performance optimization of MySQL database Schema design
Efficient model design

Moderate redundancy-reduce Join in two queries

Vertical split of large fields-summary table optimization

Horizontal splitting of large tables-type-based splitting optimization

Statistical Table-quasi-real-time optimization

Suitable Data types

There are not too many time storage formats. we usually use DATETIME, DATE, and TIMESTAMP. In terms of storage space, the TIMESTAMP has at least four bytes, while the other two data types are eight bytes, which is doubled. However, the disadvantage of TIMESTAMP is that it can only store the time after January 1, 1970, while the other two time types can store the time from January 1, 1001 at the earliest. If you need to store the time before January 1, 1970, you must discard the TIMESTAMP type. However, if you do not need to use the time before January 1, 1970, try to use TIMESTAMP to reduce storage space usage.

Character storage type

CHAR [(M)] belongs to the static length type. the storage length is calculated based on the number of characters. Therefore, the final storage length is based on the character set, for example, if latin1 is used, the maximum storage length is 255 bytes, but if gbk is used, the maximum storage length is 510 bytes. The CHAR type storage feature is that no matter how long data we actually store, M characters will be stored in the database. if it is not enough to be filled with spaces, M is 1 by default. Although CHAR fills up the space to be stored through spaces, MySQL ignores all spaces at last when accessing data. Therefore, if space is required at the end of actual data, the CHAR type cannot be used for storage.

VARCHAR [(M)] is a type of dynamic storage length, which only occupies the length of the actual storage data. TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT belong to the same storage method. they all belong to the dynamic storage length type. The difference is only the limit of the maximum length.

Transaction optimization

1. dirty read: when a transaction is accessing data and modifying the data, the modification has not been committed to the database, another transaction also accesses the data and then uses the data.

2. non-repeated read: refers to reading the same data multiple times in a transaction. When the transaction is not completed, another transaction also accesses the same data. Therefore, the data read twice in the first transaction may be different because of the modification of the second transaction. In this way, the data read twice in a transaction is different, so it is called non-repeated read.

3. Phantom read: refers to a phenomenon that occurs when a transaction is not executed independently. for example, the first transaction modifies the data in a table, which involves all data rows in the table. At the same time, the second transaction also modifies the data in this table. this modification inserts a new row of data into the table. In the future, the user who operates the first transaction will find that there are still data rows in the table that have not been modified, just like an illusion.

Innodb supports the following information at the transaction isolation level:

1. READ UNCOMMITTED

It is often called Dirty Reads, which is the lowest isolation level in transactions: in normal non-lock mode, the execution of SELECT makes it possible that the data we see may not be the data at the start time of the query. Therefore, this isolation is not Consistent Reads (Consistent read );

2. READ COMMITTED

At this isolation level, DirtyRead is not displayed, but Non-RepeatableReads and PhantomReads may appear ).

3. REPEATABLE READ

The repeatable read isolation level is the default transaction isolation level of InnoDB. At the repeatable read isolation level, neither DirtyReads nor Non-Repeatable Read will occur, but there is still the possibility of PhantomReads.

4. SERIALIZABLE

The SERIALIZABLE isolation level is the highest level in the standard transaction isolation level. After the value is set to the SERIALIZABLE isolation level, the data seen at any time in the transaction is the state of the start time of the transaction, whether or not other transactions have modified some data during this period and committed. Therefore, PhantomReads does not appear at the SERIALIZABLE transaction isolation level.

8. scalable design-data splitting

Vertical Data splitting

Vertical Data splitting can also be called vertical data splitting. Think of a database as a "data block" (table) composed of many large blocks. We vertically cut these "data blocks, then they are distributed across multiple database hosts. This split method is a vertical data split.

Advantages of vertical splitting

◆ Database splitting is simple and clear, and sharding rules are clear;

◆ The application module is clear and clear, and integration is easy;

◆ Data maintenance is convenient and easy to locate;

Disadvantages of vertical splitting

◆ Some table associations cannot be completed at the database level and must be completed in the program;

◆ Tables with extremely frequent access and large data volumes still have stable performance and may not meet the requirements;

◆ Transaction processing is more complex;

◆ When the splitting reaches a certain degree, the scalability will be limited;

◆ Read splitting may result in complicated system transition and difficult to maintain.

Horizontal data splitting

Vertical Data splitting can be simply understood as splitting data by table by module, while horizontal data splitting is no longer based on tables or functional modules. In general, a simple horizontal split mainly refers to distributing a table with extremely ordinary access to multiple tables according to certain rules of a field. each table contains part of the data.

Advantages of horizontal splitting

◆ Table association can be completed at the Database end;

◆ There will be no bottlenecks in tables with large data volumes and high loads;

◆ The overall architecture of the application is relatively small;

◆ Transaction processing is relatively simple;

◆ As long as the splitting rules can be defined, it is basically difficult to meet scalability restrictions;

Disadvantages of horizontal splitting

◆ Splitting rules are more complex and it is difficult to abstract a segmentation rule that can satisfy the entire database;

◆ Data maintenance becomes more difficult in the future, making it more difficult to manually locate data;

◆ The modules of the application system have a high degree of coupling, which may cause some difficulties for subsequent data migration and splitting.

Possible problems in data splitting and integration

1. introduction of Distributed Transactions

A distributed transaction across multiple databases can be split into multiple small transactions that are only on a single database, and each small transaction can be controlled in an application. Of course, the requirement for this is that our Russian applications must be robust enough, and of course it will bring some technical difficulties to the applications.

2. Cross-node Join

We recommend that you use an application for processing. first, retrieve the corresponding driver result set from the MySQLServer where the driver table is located, then, the data is retrieved from the MySQL Server where the driver table is located based on the driver result set.

3. Cross-node merge sorting paging

Extract data from multiple data sources in parallel, and then the application summarizes and processes the data.

9. scalability design-utilization of Cache and Search

By introducing Cache (Redis and Memcached), we can reduce database access and increase performance.

By introducing Search (Lucene, Solr, ElasticSearch), you can use the Search engine's efficient full-text indexing and word segmentation algorithms, as well as efficient data retrieval, to solve the full-text fuzzy search and classification statistics query functions that are completely impossible for databases and traditional Cache software.

The above is all the content of this article, and I hope you will like it.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.