Good logic design and physical design is the cornerstone of high performance, should be based on the system to execute the query statement to design the schema. This often requires a balance of factors.
For example, anti-paradigm design can speed up certain types of queries, but it may also slow down other types of queries. For example, adding a count table and a summary table is a good way to refine the query.
However, the maintenance costs of these tables can be high. MySQL's unique features and implementation details have a significant impact on performance.
Simple principles for selecting an optimized data type:
1. Smaller usually better
In general, you should try to use the smallest data type that correctly stores your data.
2. Simple is good
Operations of simple data types typically require less CPU cycles.
3. Avoid null as much as possible
If the query contains nullable columns, it is more difficult for MySQL to optimize because nullable columns make indexes, index statistics, and value comparisons more complex.
Nullable columns use more storage space, and special handling is required in MySQL.
The first step in schema design is to build a table, and the first step in building a table is to design the field, but understanding the MySQL data type can also help with the design field.
Let's start by understanding MySQL's basic data type.
1. Integer type
TINYINT, Smallint,mediumint,int,bigint
8,16,24,32,64, bit storage space.
MySQL can set a width for the certificate type, such as int (11), but this is meaningless for most applications; it does not limit the value's legal scope. For storage and compute
Int (1) and int (20) are the same
2. Real type
Float,double,decimal (for precise calculation)
3. String type
Varchare
The most common type of string. It is more space-saving than a fixed-length type because it only applies the necessary space. There is one case exception if the MySQL table is created using row_format=fixed.
varchar requires a length of 1 or 2 extra bytes to record the string. If the maximum length of a column is less than or equal to 255, a byte is used to record the length.
CHAR
is a fixed length. MySQL always allocates enough space based on the defined string. When you store a char value, MySQL deletes all trailing spaces.
Char values are padded as needed to facilitate comparisons.
Char is suitable for storing very short strings. Char is also better than varchar for frequently changed data because the fixed-length char type is not prone to fragmentation.
Blob and text
Both blob and text are string data types that are designed to store large data and are stored in binary and character mode, respectively.
Using enumerations (enums) instead of string types
4. Date and Time type
Datetime
This type can hold a wide range of values. From 1001 to 9999, the precision was wonderful.
TIMESTAMP
Storage space is small, and depending on the time zone changes, there is a special automatic update capability. Precision is wonderful.
If you want to store time that is smaller than the granularity of seconds, you can store a microsecond-level timestamp with the bigint type.
5. Bit data type
BIT
Selection of identifiers:
Integer type
Integers are usually the best choice for identifying columns. Because they are quick and can use Auto_increment
Enum and set types
String type
If possible, avoid using string types as identifiers, because they are very space-consuming and often slower than numeric types.
Especially be careful when using strings as identity columns in the MyISAM table. MyISAM uses a compressed index on a string by default, which makes the query slow.
For a completely ' random ' string, you also need to be more careful, such as MD5 (), SHA1 (), or UUID () to produce a string.
The new values generated by these functions are arbitrarily distributed in a large amount of space, which causes the INSERT and some SELECT statements to become slow:
1. Because the insertion value is randomly written to a different location in the index, it makes the INSERT statement slower. This causes page splitting, random disk access, and clustered index fragmentation for the clustered storage engine.
2.select statements can become slower because logically neighboring behaviors are distributed in different places on disk and memory.
3. Random values cause the cache to be poorly performing for all types of query statements.
Pitfalls in Schema design
1. Too many columns
2. Too many associations
3. The Almighty Enumeration
Paradigm and Inverse paradigm
Three main paradigms:
1. must have primary key, column is not divided
2. When a table is a composite primary key, the non-primary key field does not depend on the partial primary key (that is, it must depend on all primary key fields)
3. All non-primary attributes in relational mode R (U,F) do not have transitive dependencies on any candidate keywords
The inverse paradigm, as the name implies, does not follow the three paradigms.
First, let's look at the pros and cons of paradigms:
Advantages:
Normalized update operations are usually faster than inverse paradigms.
When data is well-normalized, there is little or no duplication of data, so you only need to modify less data
The normalized table is usually small, and can be better placed in memory, so the operation will be faster.
There is little extra data that means fewer distinct or group by statements are needed to retrieve list data.
Disadvantages:
Overly dependent associations. This is expensive and may invalidate some index policies.
Inverse paradigm Advantages and disadvantages:
Because all the data in a table, can be very good to avoid association.
If the association is not required, the worst case for most queries, even if the index is not used, is a full table scan, which may be much faster than the association when the data is larger than memory.
Because this avoids the random I/O.
A separate table makes better use of the indexing strategy.
Mixed paradigm and inverse paradigm are the mainstream of the schema.
Cache tables and Summary tables
Easy to query, difficult to maintain.
Materialized view
Counter table
MySQL Schema design optimization