MySQL Design specification

Source: Internet
Author: User

1. Preface
This specification is a number of development specifications and techniques collated in the development of the project, and expects to take advantage of MySQL's features to get better performance. It is mainly provided to the people who need to do application development based on MySQL, which is convenient for more efficient development.
1.1 Database Design
The goal of database design is three: function implementation, scalability, usability. Design needs to balance all aspects of business technology, make a good choice. The architectural design of the database is most important, and 80% of the performance benefits come from the architectural design benefits.
2.1 Version Selection
The official version recommends the use of MySQL5.7.14, the branch version is recommended Percona 5.6, 5.7.

2.2 Architecture Design
MySQL database architecture design mainly consider reading and writing separation, sub-database table, hotspot data, avalanche effect and overload protection, reading and writing optimization.
Read/write separation is the design of the master-slave library, at least two servers, both sides of the data is synchronous, the main library is responsible for writing data, from the library responsible for reading data.
Sub-Library: As far as possible, the access to the different business data in different databases to store, so as to improve the efficiency of concurrent access.
Sub-table: As far as possible, the large data volume of the business table with some sort of identification to be divided into different tables. Consider storing historical and realistic data separately.
Hotspot data: For data that is frequently reused, it must be cached in memory and cannot be read from disk every time. can use global memory variables, memories cache and so on.
Avalanche effect means that when the concurrency is large, access to some tables causes a large number of locks to appear, so that subsequent database access will establish more database connections, resulting in degraded database performance or even downtime. The main solution is to optimize the database, optimize the design from the business, release the lock and system resources in time, use the connection pool and so on.
Read-Write Optimization: Reading optimization and hotspot data is similar, mainly through in-memory cache data to achieve as little as possible to read the hard disk, as much as possible to read memory. Write optimization is mainly implemented by primary key and index.

2.3 Schema Design
2.3.1 Control the number of libraries and tables
MySQL is a single-process multithreaded architecture database, which is similar to SQL Server, but differs from the Oracle Multi-Process architecture (Oracle's Windows version is also a single-process multithreaded architecture). This means that the MySQL database instance behaves as a process on the system.
From the performance considerations, a single MySQL library can not be too large, the total space capacity is generally not more than 100G, the library is not more than 500 tables, because the MySQL table table structure files, data files, index files are stored in the operating system in the same directory of the schema, When the number of tables in a schema exceeds 100, that is, when more than 300 files are in the same directory, the cost of operating system-managed files increases significantly, affecting server performance.
2.3.2 Control single-table data volume
Table Design Main considerations are: IO efficient, full table traversal, table repair fast, improve concurrency; ALTER TABLE fast.
The single-table data volume recommended control in pure int is not more than 1000W, with char not exceeding 500W, because when MySQL handles large tables (char table >500w rows, or int table >1000w), performance starts to decrease significantly, so you have to control the single-table capacity in different ways:
A, according to the data of hot and cold, data classification storage, historical archive.
B, using the sub-library/Sub-table/partition table, horizontal split control of the single-table capacity.
C, for the OLTP system, control the resource consumption of single transaction, encounter large transaction can be disassembled, adopt piecemeal mode, avoid special effects on the public.
D. No more than 500 tables in a library.
E, the number of table fields should not be too many, not more than 50.
2.3.3 Data Redundancy design
The purpose of database redundancy design:
A, when there is no foreign key, reduce the multi-table join query.
B, easy to distribute design, allow moderate redundancy, for capacity expansion to allow moderate cost.
C, based on business freedom optimization, based on I/O or query design, no need to follow the paradigm structure design.

Application Scenarios for redundancy design:
A, the original display program involves multiple tables of queries, hoping to simplify the query.
B, data table splitting is often based on the primary key, and the original data table often exists non-primary key-based query, can not be completed in the table structure.
C, there are more statistical requirements (count, sum, etc.), inefficient.

Examples of redundant design ideas:
A, based on the display of redundant design, such as:
Message table messages, there are fields from_userid,to_userid,msg,send_time four fields, and the display program needs to display the sender's name and gender.
You typically add redundant fields From_username and from_user_sex to the message table.
B, query-based redundancy design, such as:
User sub-table, divides the user library into several data tables. User-name-based queries and UserID-based queries are high concurrent requests. User sub-table can be divided into multiple tables based on UserID, and the corresponding redundancy table based on the user name.
C, statistical-based redundancy design, such as:
COUNT (*) operation, if you do not need accurate results, you can directly show the table status like ... Obtained, need accurate result, can add key-value pair in the cache layer, update the key-value in real time. Asynchronously updates a redundant field in the database, or in a redundant table.
2.3.4 Controlling transaction size
Limit large SQL (big sql), large transactions (big Transaction), large batches (big Batch). When you encounter large SQL, you can consider splitting into several small SQL based on the business, try not to do operations in the database, complex operations to the terminal CPU, as simple as possible application of MySQL.
Actions such as MD5 () or order by Rand () or calculated fields are not performed on the database table.

2.3.5 Storage Engine selection
The InnoDB engine is used by default. InnoDB is suitable for almost 99% of MySQL scenarios, and the MySQL 5.7 system table is changed to InnoDB, and what reason to stick to myisam it.

2.3.6 Character Set
Use priority utf8mb4 > UTF8 > Latin1

2.3.7 Table primary key
Displays the specified self-increment int/bigint unsigned not NULL as the primary key, and try not to use the UUID/HASH/MD5 type as the primary key.

2.4 Coding Specifications
2.4.1 Naming Conventions
Note: The database name length should be kept within 14 characters!
MySQL object name is up to 64 characters long, for easy reading, we require the object name to be controlled within 32 characters. and database name, table name, field name, index name, etc. strongly recommended only lowercase characters, numbers, underline combination, do not use Desc,select, show, update and other MySQL keywords, temporary table plus TMP suffix, statistics plus statistic suffix, Log table plus log suffix, and so on.

Object Specification
Table T_ Application Name _ Module Name _ Function name
Fields of the table English words or abbreviations, avoid keywords
View V_ Table Name
Stored Procedures P_ Table Name
Function f_ function Description
Package Pkg_ function Description
Trigger Tri_ Table Name
Primary key Primary
Index idx_ Field 1_ Field 2
Unique index uniq_ Field 1_ Field 2

The name of the table should reflect the stored data content as much as possible.

Design of 2.4.2 table field
The name of the field is dominated by word or word abbreviations, avoiding database keywords such as all, type, and so on.
MySQL field type:

Column type Scope of expression Storage requirements
1 tinyint[(M)] [UNSIGNED] [Zerofill] 128 to 127 or 0 to 255 1 bytes
2 smallint[(M)] [UNSIGNED] [Zerofill] 32768 to 32767 or 0 to 65535 2 bytes
3 int[(M)] [UNSIGNED] [Zerofill] 2147483648 to 2147483647 or 0 to 4294967295 4 bytes
4 bigint[(M)] [UNSIGNED] [Zerofill] 9223372036854775808 to 9223372036854775807 or 0 to 18446744073709551615 8 bytes
5 decimal[(M[,d]) [UNSIGNED] [Zerofill] The maximum number of digits (M) is 65, and the maximum number of decimal digits (D) is 30 Variable length
6 DATE Yyyy-mm-dd 3 bytes
7 Datetime YYYY-MM-DD HH:MM:SS (range from 1001 to 9999) 8 bytes
8 TIMESTAMP YYYY-MM-DD HH:MM:SS (range from 1970 to 2037) 4 bytes
9 CHAR (M) 0<m<=255 (suggested CHAR (1), exceeding this length with VARCHAR) M characters (related to the space and character set, etc.)
10 VARCHAR (M) 0<m<65532/n M characters (N size by character set, and whether it is Chinese or alphanumeric, etc.)
11 TEXT 64K characters The occupied space is related to the character set, etc.

Description

1. All dynamic-length strings use the VARCHAR type, similar to state, with a finite class of fields, and use a string that can be used to make a significant representation of the actual meaning, instead of using numbers such as int;
2. Fixed-length strings use the char type, and all single characters use the char type, rather than the varchar type;
3. You can use the TEXT type to hold character class data only when the number of characters may exceed 20,000. All fields that use the TEXT type must be separated from the original table, and the primary key of the original table is composed of a separate table for storage;
4. Fields that need to be accurate to the time (month, day, and seconds) can use DateTime or timestamp, but be aware of the scope of each expression and whether the timestamp feature is required , accurate to microseconds the recommended time type is converted to the Shaping bigint store (the recommended priority is to use the bigint type to store the date);
5. All fields that only need to be accurate to day use the DATE type and should not use the timestamp or datetime type;
6. The field of the self-increment sequence type can only use INT or BIGINT, and is clearly identified as unsigned (UNSIGNED), unless there is a negative number, the BIGINT type is used only if the value of the field is more than 4.2 billion; you can have tinyint do not use smallint, If you can use smallint, don't use int, you can use int instead of bigint
7. The index field uses the NOT NULL:MYSQL NULL type and Oracle's NULL to be in the index, and if it is a combined index, this null type of field can greatly affect the efficiency of the entire index. In addition, NULL processing in the index is also special and takes up additional storage space.
8. Fields with decimal points or fields with high precision require decimal, which disables float.
Design of 2.4.3 Index
The index is named according to "idx_ field name", and the index name uses lowercase.
The number of fields in the index does not exceed 5.
The unique key consists of 3 of the following fields, and when the fields are shaped, a unique key is used as the primary key.
When there are no unique keys or unique keys that do not conform to the criteria in 5, use the self-increment (or get through the generator) ID as the primary key.
The unique key does not repeat with the primary key.
The order of the indexed fields takes into account the number of the field values to be weighed back, and the number is placed in front.
The field for ORDER By,group by,distinct needs to be added after the index.
The number of indexes on a single table is within 5, and if multiple fields in a single table are used alone in the query requirements, they need to be evaluated by a DBA. Query performance problems cannot be solved, should be reconstructed from the product design.
Use explain to determine if the SQL statement uses the index reasonably, and try to avoid extra columns from appearing: Using File sort,using temporary.
The UPDATE, DELETE statement needs to be indexed based on the Where condition.
For string columns of more than 50 lengths, it is a good idea to create a prefix index rather than an entire column index (for example: ALTER TABLE T1 ADD Index (User (50)), which can effectively improve index utilization, although the disadvantage is that it does not use a prefix index when sorting the column. The length of the prefix index can be based on the statistics of the field, which is generally slightly larger than the average length.
Reasonable creation of a federated index (avoiding redundancy), (A,B,C) equivalent to (a), (b), (A,B,C), note not including (B,C), (c), (d).

2.4.4 table comments, field comments
The field in MySQL is more cumbersome to annotate, you need to use the ALTER TABLE statement, so try to add table comments and field comments when you create a new table. The comment for the Type field (the meaning of the field's initial value must be included in the note).

3.SQL specification
DDL specification:
? field specifies NOT null default XXX
? All fields, tables need to have comments comment
? All new tables, engines can only be used InnoDB
? All new tables have a self-increment ID to do the primary key
? Limit the number of new table text fields by a certain percentage
? Specify the number of new table indexes according to a certain percentage
? According to business knowledge, post_id, PUID, userid all have index
DML Specification:
? load data is not allowed
Delete Modify operation, where clause must contain primary key, unique index column, or good business sensitivity column
? prohibit multi-table join
? prohibit pending SQL, such as insert INTO select
? All INSERT statements must specify a field
? all Update/delete recommend using absolute values, following the reentrant principle

MySQL Design specification

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.