This paper does not include database server configuration optimization, only from the angle of application design to consider the design of database optimization scheme.
Mainly from three parts: 1. Data type Optimization 2. Database design optimization 3.SQL query optimization, careful audience will feel less what, index, of course, indexed, index is important, this last said.
So both the macro design level, as well as the micro-data type level, this is the saying goes: On the hall, got the kitchen, fighting got the big one ...
If the database encounters a performance bottleneck, it usually needs to be designed from these 4 aspects, rather than isolated one-sided considerations.
First, data type optimization
1. Principles:
1) The smaller the better
For example, you can use tinyint solution do not use smallint, can use int do not use float, can use varchar (10) Do not use varchar (100), because small data types occupy less disk space, memory and CPU resources, consume less CPU cycles , varchar (10) and varchar (100), while occupying the same storage space, but the former has more advantages, the following introduction of the varchar type will be mentioned, so the absolute is the smaller the better, of course, to meet the storage requirements, which seems to be nonsense, who still do not know, which has such a stupid x architect , do you? This is true, like some of the so-called architects in the work really xx, the actual work encountered in the X, smile on the line.
2) The simpler the better.
Operations of simple data types require less CPU cycles, for example, manipulating an integral type is less costly than manipulating a string type.
2. Integral type
tinyint (8)->smallint->mediumint (a)->int->bigint (64), the number in parentheses is how many bits of storage space is used.
Another integral type has unsigned attributes, such as tinyint unsigned can store 0~255, and tinyint can store -128~127.
3. Real numbers
float and double are imprecise types, and decimal is an exact type, and the float type is faster than decimal because the CPU supports floating-point computations directly. When you have a large amount of data, you can use bigint instead of decimal, such as multiplying the amount by 1 million, and then storing the results in bigint to increase the calculation speed.
4. String type
char and varchar, this earth people all know varchar is to become longer, char is fixed long, do not say.
VARCHAR uses 1 to 2 bits to describe the length of the actual string. VarChar can cause fragmentation problems, such as the definition of varchar (1000), insert 1 characters when the initial insertion, and then update the record to 1000, which may cause the current page to be stored, causing the page to split, resulting in a new page to store this part of the data, Although the general database reserves some extra page space, this is still not completely avoided.
The trailing spaces are deleted when char is stored. If you store a short fixed-length character, such as Y/n,char, you need only one bit, and varchar needs an extra length to store the string.
varchar (10) is the same as varchar (1000)? Storage space is the same, but varchar (1000) needs to consume more memory, especially when using temporary table sorting, so the less the better the principle is not wrong.
5. Large-segment Blob and text
They are all used to store large segments, such as big file content, picture content, and so on, and their difference is that BLOBs store binary data, while text has character set and collation. Generally, you can open the store as text with Notepad and the other storage as a blob.
6. Date and time
MySQL supports the minimum time granularity to seconds, this is a bit thick, 1 seconds for the computer is already very long.
Timestamp is stored from 1970 to the current timestamp, and datetime stores the actual time value. Timestamp's space is more efficient, but timestamp can only be stored until 2038, and if your system is going to use this, you can give it up decisively.
7. Primary key
1 to ensure that the table and the association between the table data types consistent, if inconsistent to the production environment encountered a performance problem to change, it is not a joke.
2 The primary key uses the integral type performance to be higher.
If a string is used, it causes:
A. Poor insertion performance. Because inserts need to maintain the index, if you use random strings, such as the UUID in Java, because it is unordered, you need to maintain the index to a different location, which has a greater performance cost.
B.select is slower because logically adjacent rows are distributed in different places of disk and memory.
C. Random results in lower cache hit rates. Because the cached data is physically contiguous, but not logically contiguous, it leads to more misses, causes frequent flush caching, and loses the sense of caching, but increases consumption.