"MySQL" High performance MySQL learning notes, fourth, schema and data type optimization

Last Update:2018-03-15 Source: Internet

Author: User

Tags integer numbers one table

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

"MySQL" High performance MySQL learning notes, fourth, schema and data type optimization

Good logical and physical design is the cornerstone of high performance, and the schema should be designed according to the query statement that the system will execute.

The inverse paradigm is designed to speed up certain types of queries, which can slow down another type of query, such as adding a count table and a summary table, which is a good way to optimize queries, but these tables can be costly to maintain.

1. It is generally better to choose a smaller data type for optimization.

? You should try to use the smallest type of data that can be stored correctly, and smaller data types are usually faster because they consume less disk, memory, and CPU cache, and require less CPU cycles to process.

It's simple.

? Operations that are simpler data types typically require less CPU cycles. For example, integer numbers are less expensive than word Fu Cao because character sets and collation rules (collations) make it more complex to compare characters relative to integer numbers. For example, you should use the Interger storage IP address (inet_aton)

Try to avoid null

? Generally, it is best to specify column not NULL. If the query contains nullable columns, it is more difficult for MySQL to optimize because nullable columns make indexes, index statistics and value comparisons very complex, nullable columns use more storage space, and when null columns are indexed, an extra byte is required for each index record. However, changing a nullable column to not NULL provides a small performance boost, but if you plan to create an index on a column, you should avoid designing nullable columns.

1.1 Integer type

integer Type	occupy space	Range
TINYINT	8	[ -2^7,2^7-1]
SMALLINT	16	[ -2^15,2^15-1]
Mediumint	24	[ -2^23,2^23-1]
Int	32	[ -2^31,2^31-1]
BIGINT	64	[ -2^63,2^63-1]

An integral type has an optional unsigned attribute, which means that negative values are not allowed, which can increase the line of the original positive number by one more times. Signed and unsigned types use the same storage space and have the same performance. Integral types are calculated from each other, and are calculated using the 64-bit bigint as the intermediate type.

1.2 Real Type

Real numbers are numbers with fractional parts, and you can use decimal to store integers larger than bigint.

The decimal type is used to store accurate decimals, which supports accurate calculations. For example, a decimal (18,9) will store 9 digits on each side of the decimal point, using a total of 9 bytes, where the number preceding the decimal point is used. Decimal allows a maximum of 65 digits.

When a floating-point type stores a value of the same range, it usually consumes less space than a decimal, and the internal calculation uses a double as the calculation type.

Because of the need for additional space and computational overhead, as far as possible only in the accurate calculation of the mouse to use the decimal, when the amount of data is large, you can consider using bigint instead of decimal, said the need to store the currency units by the number of decimal places multiplied by the corresponding multiples.

1.3 String Type varchar

? The varchar type is used to store variable-length strings, which is more space-saving than fixed-length, and varchar requires 1 or 2 extra bytes to record the length of a string, if the maximum length of a column is less than or equal to 255 bytes, it is represented by 1 bytes, otherwise 2 bytes are used. VarChar saves storage space, so it is also useful for performance. However, because the row is longer, if the actual storage length of the side length column is increased at update time, this results in additional work if a row occupies more space and there is no more storage space within the page, in which case the INNODB needs to split the page so that the row can be placed in the page.

? varchar Use occasions: 1. The maximum length of a string column is much larger than the average length, and the column is updated very rarely

? 2. UTF-8 This composite character set (each character is stored with a different number of bytes)

? MySQL retains space at the end of varchar when it is stored and retrieved. InnoDB can store too long varchar as blobs.

Char

? Fixed-length string, where MySQL removes whitespace at the end of Char when it is stored. Creates a unique conflict between "a" and "a". How the data is stored depends on the storage engine, and the behavior of padding and intercepting spaces is done at the MySQL service layer.

? Longer columns consume higher memory, and MySQL usually allocates a fixed amount of memory to hold internal values, especially if the use of memory temporary tables for sorting or operation is always particularly bad.

Blob

? stored in a binary way, with no collation and character set. Contains Tinyblob,blob,mediumblob,longblob

Text

? stored in a string, with collations and character sets, including Tinytext,text,mediumtext,longtext.

Unlike other types, MySQL treats each BLOB value and text value as a separate object, and the storage engine usually does special processing when it is stored, and when the Blob and text values are too large, InnoDB uses a dedicated "external" storage area to store it. Use the pointer in the original row to point to the external storage area. Colleagues these two data formats can only have prefix indexes.

Enum

? Enumeration is not recommended (you can refer to the original book for more information)

1.4 Date and Time type datetime and timestamp

? DateTime is now recommended, with a larger range, regardless of time zone, 8 bytes

1.5-bit data type

? InnoDB uses a minimum integer type sufficient for each bit column to store, using the bit type does not save much storage space, MySQL treats bit as a string type, and when retrieving the value of bit (1), the result is a string containing binary 0 or 1.

2.MySQL Pattern Design Traps 2.1 too many columns

? The MySQL storage Engine API works by copying data in a row-buffered format at the server layer and storage engine layer, and then decoding the row buffer content into individual columns at the server layer. The cost of converting a coded column from a row buffer to a line structure is very high, and the cost of the conversion depends on the number of columns.

2.2 Too many associations

? A rough rule of thumb, if you want query execution to be fast and concurrency good, a single query is best associated with 12 tables

2.3NULL value

? When you need to store a de facto "null" in the list, you can use 0, a special value, or an empty string instead. MySQL stores null values in the index, and Oracle does not.

3. Paradigm and Inverse paradigm

? In a normalized database, each fact data appears only once,

? In a reverse-normalized database, information is redundant and may be stored in multiple places.

3.1 Advantages and disadvantages of normalization:

? The normalized update operation is faster, requiring only less data to be changed.

? The normalized table is smaller, better placed in memory, and faster to perform.

? There is no redundant data to reduce the distinct or group by operation.

Disadvantages:

? Often, associations are expensive and may invalidate some index policies.

3.2 Advantages and disadvantages of anti-paradigm advantages:

? All data is in one table and can be avoided.

? Not associated with even full table scan, is also sequential IO.

Disadvantages:

? Redundancy of redundant data, updated more slowly

? Table large, put in memory, occupy large, easy to extrude hot data

4. Faster reading, slower writing

? To improve the speed of reading queries, it is often necessary to build additional indexes, add redundant columns, or even create cache tables and summary tables, which can increase the burden of writing queries.

? Slow write operations are not the only cost of a read operation becoming faster, but may also increase the concurrency of read and write operations.

5. Speed up the ALTER TABLE operation

? The ALTER table operation is a big problem for extra-large tables.

? MySQL performs most of the steps to modify the table structure:

? 1. Create an empty table with a new structure

? 2. Isolate all data from the old table to insert a new table

? 3. Delete old tables

? In general, most ALTER TABLE operations will cause the MySQL service to break access to the table.

? For common scenarios, there are two common tricks:

? 1. Now perform an ALTER TABLE operation on a machine that does not serve, and then switch

? 2. Shadow copy, which is the same as the previous step, but updates the old table data of the new table by means of a trigger, and then renames

? All modify column operations cause the table to be rebuilt.

5.1 Modify the FRM (table structure) file only

? The following actions are likely to not need to be rebuilt:

? Remove the Auto_increment property of a column

? Add, remove, or change enum and set constants

? Step (This operation is chestnuts out):

? 1. Create an empty table with the same structure and make the necessary modifications

? 2. Execute flush TABLES with READ LOCK. Close all tables in use and prevent tables from being opened

? 3. Swap the frm file

? 4. Execute unlock tables to release the second step of the read lock.

6. Summary

? 1. Avoid designing overly complex database schemas

? 2. Use a small, simple and appropriate data type to avoid using null values whenever possible

? 3. Try to store similar or related values using the same data type.

? 4. Variable-length strings are likely to be pessimistic in allocating memory at the maximum length when staging tables and sorting.

? 5. Try to define the primary key using the self-increment integer sequence

? 6. Avoid using MySQL no longer recommended features

? 7. Treat Bit,enum,set with caution

MySQL, high performance MySQL learning note, fourth, schema and data type optimization

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More