MySQL Development specification

Source: Internet
Author: User
Tags rand sorts

1. The Library name, table name, field name must use lowercase letters and be split with underscores.

A) MySQL has configuration parameters lower_case_table_names, cannot be changed dynamically, Linux system default is 0, that is, the library table name is stored in the actual situation, case sensitive. If it is 1, it is stored in lowercase and is not case sensitive. If it is 2, it is stored in the actual situation but is compared in lowercase.

b) If mixed case, there may be abc,abc,abc and other tables coexist, easy to cause confusion.

c) The field name is explicitly case-sensitive, but the actual use is not differentiated, that is, you can not create two names, but not the same case of a field.

D) In order to unify the specification, the Library name, table name, field name use lowercase letters.

2. The library name, table name, field name are forbidden for more than 32 characters.

Library names, table names, field names support up to 64 characters, but more than 32 characters are forbidden for uniform specification, easy identification, and reduced throughput.

3. Using InnoDBStorage engine.

InnoDB engine is the default after the MySQL5.5 version of the breaking, support transactions, row-level locks, better data recovery capability, better concurrency performance, while the multi-core, large memory, SSD and other hardware support better, support data hot backup, etc., so innodb compared to MyISAM has a clear advantage.

4. The Library name, table name, field name prohibit the use of MySQL reserved words.

When attributes such as library name, table name, field name, and so on contain reserved words, the SQL statement must refer to the property name in inverted quotation marks, which makes it very complex to write SQL statements, escape variables in shell scripts, and so on.

5. Prohibit the use of partitioned tables.

Partitioned tables have strict requirements for partitioning keys, and partitioned tables become more difficult to perform DDL, sharding, and single-table restores after the table becomes larger. It is therefore forbidden to use partitioned tables, and it is recommended to manually sharding the business side.

6. We recommend using unsigned to store non-negative values.

The same number of bytes, the value of the non-negative storage range is larger. If the tinyint has a symbol of-128-127, unsigned is 0-255.

7. It is recommended to use the int unsigned storage IPV4.

The unsinged int storage IP address occupies 4 bytes, and CHAR (15) occupies 15 bytes. In addition, the computer handles integer types faster than string types. Use the int unsigned instead of char (15) to store the IPV4 address and convert it through the MySQL function Inet_ntoa and Inet_aton. The IPV6 address currently has no conversion function and requires a decimal or two bigint to store it.

For example:

SELECT Inet_aton (' 209.207.224.40 '); 3520061480

SELECT Inet_ntoa (3520061480); 209.207.224.40

8. It is strongly recommended to use tinyint instead of the enum type.

The enum type requires an online DDL when the enumeration value needs to be modified or added, and the cost is greater; The enum column value, if it contains a numeric type, can cause a default value to be confused.

9. Use varbinary to store case-sensitive variable-length strings or binary content.

varbinary is case-sensitive by default, with no character set concept and fast speed.

the 10.INT type is fixed with 4 bytes of storage, such as INT (4) that represents only the display character width of 4 bits and does not represent the storage length.

Numeric type the number after the parentheses is only the width and not the storage range, for example, int (3) shows 3 bits by default, the space is padded, the normal display is exceeded, Python, Java client and so on do not have this function.

11. Distinguish between the use of datetime and timestamp. The year type is used for storage years. The date type is used for storing dates. Storage time (accurate to seconds) it is recommended to use the timestamp type.

Both datetime and timestamp are accurate to the second, with preference for timestamp, because timestamp has only 4 bytes and DATETIME8 bytes. At the same time, timestamp has automatic assignment and auto-Update feature. Note: In versions 5.5 and earlier, if there are multiple timestamp columns in a table, only one column can have the Automatic Update feature.

How do I use automatic assignment properties for timestamp?

A) automatic initialization and Automatic Update: Column1 TIMESTAMP DEFAULT current_timestamp on update current_timestamp

b) Just auto-initialize: Column1 TIMESTAMP DEFAULT Current_timestamp

c) Automatic update, initialized with a value of 0:column1 TIMESTAMP default 0 on update current_timestamp

d) The value initialized is 0:column1 TIMESTAMP default 0

12. All fields are defined as not NULL.

A) for each row of the table, each null column requires additional space to identify.

b) The B-Tree index does not store null values, so if the indexed field can be null, the index efficiency will decrease.

c) It is recommended to replace null values with 0, special values, or empty strings.

13. Split large and low-frequency fields into separate tables to store, separating hot and cold data.

Facilitates efficient use of cache, prevents read-in of useless cold data, and less disk IO, while maintaining thermal data resident memory to improve cache hit ratio.

14. Prohibit storing plaintext passwords in the database.

Use the encrypted string to store the password, and ensure that the password is not decrypted, while using a random string to add salt to ensure password security. Prevent database data by the company's internal personnel or hackers, using a dictionary attack and other means of brute force to crack user passwords.

15. The table must have a primary key, it is recommended to use the unsigned self-increment column as the primary key.

The table does not have a primary key, and InnoDB defaults to the hidden primary key columns; tables without a primary key are very difficult to locate data rows and reduce the efficiency of row-based replication.

16. Disable redundant indexing.

Indexes are double-edged swords that increase maintenance burdens and increase IO pressure. (A,b,c), (A, b), which is a redundant index. The prefix index can be used to achieve acceleration and reduce maintenance burden.

17. Disable duplicate indexing.

Primary key A;uniq index A; Duplicate indexes increase maintenance burdens and disk space, without any benefit.

18. Do not build indexes on low cardinality columns, such as "gender".

In most scenarios, an exact lookup of indexes on low cardinality columns has no advantage over non-indexed full-table scans and increases the IO burden.

19. Use the overlay index to reduce IO and avoid sorting.

The overlay index can get all the required fields from the index, thus avoiding two lookups of the back table and saving IO. In the InnoDB storage Engine, secondary index (non-primary key index, also known as secondary Index, level two index) does not store the row address directly, but instead stores the primary key value. If a user needs to query a data column that is not contained in secondary index, it needs to be queried two times by secondary index to find the primary key value and then querying to the other data column through the primary key. An overlay index can obtain all the required data in an index, and is therefore more efficient. The primary key query is a natural overwrite index. For example select Email,uid from User_email WHERE uid=xx, if the UID is not a primary key, it is appropriate to add the index as index (UID,EMAIL) for performance gains.

20. Replace or with in. The in SQL statement should contain no more than 1000 values.

In is a range lookup, MySQL internally sorts the list values of in to find, which is more efficient than or.

21. The table character set uses UTF8 and can request the use of the UTF8MB4 character set if necessary.

A) The UTF8 character set stores a Chinese character that occupies 3 bytes and stores English characters in one byte.

b) UTF8 Unified and Universal, there will be no transcoding garbled risk.

c) If you encounter a storage requirement for emoji such as emoj, you can request to use the UTF8MB4 character set.

22. Use UNION ALL instead of union.

UNION all does not require any further sorting of the result set.

23. Prohibit the use of order by rand ().

Order BY Rand () adds a pseudo-column to the table, calculates the rand () value for each row of data with the rand () function, and then sorts it based on that row, which usually generates a temporary table on the disk and is therefore very inefficient. It is recommended that you first use the rand () function to obtain a random primary key value and then obtain the data from the primary key.

24. It is recommended that a reasonable paging method be used to improve paging efficiency.

The first form of paging:

SELECT *
From T
where thread_id = 771025
and deleted = 0
ORDER BY gmt_create ASC limit 0, 15;
SELECT * FROM t

where thread_id = 771025

and deleted = 0

ORDER BY gmt_create ASC limit 0, 15;

Principle: Once all fields are sorted and returned according to the filter criteria.

Data Access Cost = Index io+ index all record results corresponding table data IO

Disadvantage: This kind of writing is turned to the back of the execution efficiency is worse, the longer, especially when the table data volume is very large time.

Scenario: When the intermediate result set is small (below 10000 lines) or the query condition is complex (refers to multiple different query fields or multiple table joins) applies.

The second type of paging:

Select T.* from (

Select ID from t

where thread_id = 771025 and deleted = 0 ORDER by gmt_create ASC limit 0, a, t

where a.id = T.id;

Premise: Assuming that the T table primary key is an ID column and has an overlay index secondary key: (thread_id, deleted, gmt_create)

Principle: First use the overriding index to sort the primary key ID according to the filter condition, and then take the join operation to remove the other fields.

Data Access Cost = Index io+ the table data io for the result (15 rows in the example) after the index is paged.

Pros: Each page is consumed with the same resources and time as the first pages.

Scenario: applies when the Query and sort fields (that is, the fields involved in the WHERE clause and the ORDER BY clause) have corresponding overwrite indexes, and when the intermediate result set is large.

25.SELECT only gets the necessary fields and disables the use of select *.

Reduce network bandwidth consumption;

Can effectively use the coverage index;

Table structure changes have no effect on the program.

The 26.SQL avoids the function of the indeterminate result such as now (), Rand (), Sysdate (), Current_User ().

In a statement-level replication scenario, a function that causes the master-slave data to be inconsistent, and an indeterminate value, produces an SQL statement that cannot take advantage of query CACHE.

27. Adopt the appropriate sub-database sub-table strategy. such as the Thousand library ten tables, 10 library hundred meters and so on.

The use of appropriate sub-database sub-table strategy is conducive to rapid business development in the late stages of horizontal split, while the sub-Library can effectively utilize the MySQL multi-threaded replication features.

28. Reduce the number of interactions with the database and use bulk SQL statements as much as possible.

Use the following statement to reduce the number of interactions with DB:

A) INSERT ... On DUPLICATE KEY UPDATE

b) REPLACE into

c) INSERT IGNORE

d) INSERT into VALUES ()

29. Split complex SQL into multiple small SQL, avoiding large transactions.

Simple SQL is easy to use to MySQL's query CACHE; Reduce lock table time especially MyISAM; you can use multicore CPUs.

30. Multiple alter operations on the same table must be combined into a single operation.

MySQL changes the table most operations require a lock table and rebuild the table, while the lock table will have an impact on the online business. To reduce this effect, multiple alter operations on the table must be combined into a single operation. For example, to add a field B to table T, and to index an existing field AA, the usual practice is to split into two steps:

ALTER TABLE t add column B varchar (10);

Then increase the index:

ALTER TABLE T add index IDX_AA (AA);

The correct approach is to:

ALTER TABLE t add column B varchar (ten), add index IDX_AA (AA);

31. Avoid using stored procedures, triggers, views, custom functions, and so on.

These advanced features have performance issues and many unknown bugs. Putting the business logic into the database makes the database DDL, scale out, sharding, and so on more difficult.

32. No application account with super privilege exists.

Safety first. Super privileges can cause read only to fail, resulting in more bizarre problems and difficult to track.

33. Submit the online table Change requirements, you must specify all the SQL statements involved (including INSERT, DELETE, UPDATE), easy for DBAs to review and optimize.

The index is not the only SELECT statement that needs to be used. UPDATE, delete all need to navigate to the data before the change can be executed. Therefore it is necessary for the business to provide all the SQL statements to facilitate DBA review.

34. Do not store business logic in the MySQL database.

A database is a stateful service, and changes are complex and slow, and if you put the business logic into a database, it will limit the rapid growth of your business. It is recommended to advance the business logic to the front-end or intermediate logic layer, and to use the database as the storage layer to realize the separation of logic and storage.

35. Example of building a table statement

MySQL Development specification

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.