Database environment
- Dev: Development environment, development of read-write, can modify table structure. Developers can modify the table structure and can modify the data at will, but need to ensure that other development colleagues are not affected.
- QA: test environment, development readable, developer can modify table structure through tools.
- SIM: Simulates the environment, develops read-write, and initiates an on-line request, which is pre-executed on this environment, which is also available for deployment on-line walkthroughs or stress tests.
- Real: Production database from the library (quasi-real-time synchronization), read-only environment, does not allow the modification of data, not allowed to modify the table structure, for online problem search, data query and other uses.
- Online: Online environment, developers do not allow direct on-line environment for database operations, if necessary operations must find the DBA to operate and record accordingly, prohibit the stress test.
These environments of the machine, must be clear, read and write account separation, and identification, can distinguish the specific business. For example, user name W_account,r_ account represents read and write accounts respectively, account is read and write accounts.
Naming conventions
Basic naming principles
- Use meaningful English words, which are separated by an underscore in the middle of the vocabulary. (Do not use pinyin)
- Use only English letters, numbers, underscores, and start with the English alphabet.
- Libraries, tables, fields are all lowercase, do not use camel-named.
- Avoid using Oracle, MySQL reserved words, such as DESC, keywords such as index.
- Name Forbidden more than 32 characters, must see the meaning of the name, the proposed use of nouns is not a verb
- database, data tables use prefixes
- Temporary library, table name must be prefixed with TMP and date suffix
- Backup library, table must be prefixed with bak and date suffix
Why are libraries, tables, and fields all lowercase?
In MySQL, the database and table pairs are in the directories and files under those directories. Thus, the sensitivity of the operating system determines the case sensitivity of database and table naming.
- Under Windows is case-insensitive.
- Linux under casing rules:
- Database name and table name are strictly case-sensitive;
- Table aliases are strictly case-sensitive;
- The alias of the column name and column is ignored in all cases;
- Variable names are also strictly case-sensitive;
What if I've set up a camel-style naming solution? You need to add Lower_case_table_names = 1 to the configuration file My.ini in MySQL.
Table naming
- Tables of the same module use the same prefix as possible, and the table name conveys the meaning as much as possible. All log tables start with Log_
Field naming
- An English word or shorthand that expresses its actual meaning. The field of Boolean meaning is prefixed with "Is_", followed by the past participle of the verb.
- Words field the same meaning between tables have the same name. A field of the same meaning between tables, named after the table name _ field name that removes the module prefix.
- The foreign key field represents its association relationship with the table name _ field name.
- The primary key of the table is generally agreed to be the ID, the self-increment type, is the foreign key of the other table to use xxx_id way to indicate.
Index naming
- Non-unique indexes must be named according to "idx_ Field name _ Field name [_ Field names]"
- Unique indexes must be named according to "uniq_ Field name _ Field name [_ Field names]"
Constraint naming
- PRIMARY KEY constraint: Pk_ table name.
- Unique constraint: Uk_ table name _ field name. (You need to have unique check logic in your application.) )
Trigger naming
- Trg_ Table Name _ operation.
Function procedure naming
- Use verb + noun form to express its meaning.
Sequence naming
Table Design Specifications
1, the table engine depends on the actual application scenario; Log and report Class table suggested with the MyISAM, with the transaction, audit, the amount of related tables recommended with InnoDB engine. If not stated, the InnoDB engine is used when the table is built. The difference between MyISAM and InnoDB
2, the default uses the UTF8MB4 character set, the database collation uses UTF8MB4_GENERAL_CI, (because the database definition uses the default, the data table can no longer be defined, but for the sake of insurance, the recommendation is written).
Why does the character set not select UTF8, the collation does not use UTF8_GENERAL_CI?
MySQL with UTF8 encoding cannot save a placeholder is a 4-byte emoji expression. To enable back-end projects that fully support client input for emoji emoticons, the upgrade code for UTF8MB4 is the best solution. In cases where the JDBC connection string is set to characterencoding to UTF8 or emoji data is still not properly inserted in the above configuration, you need to specify the character set of the connection in the code as UTF8MB4.
3. All tables and fields apply the Comment column property to describe the true meaning of this table, the field, and, for example, the enumeration value suggests that the content used in the field be defined.
4. If no description is given, the first ID field in the table must be a primary key and automatically grow, prohibiting data passing as a condition within a non-transactional context. It is forbidden to use the varchar type as the primary key statement design.
5, if no description, the table must contain the Create_time and Modify_time fields, that is, the table must contain the record creation time and the modified time of the field
6, if no description, the table must contain Is_del, to indicate whether the data is deleted, in principle, database data does not allow physical deletion.
7. Save data in one field with as little storage space as possible
- can use int without char or varchar
- to use tinyint without int
- to store non-negative values using unsigned. The
- does not recommend the use of enum, set type, use tinyint instead of
- to use short data types, such as when the value range is 0-80, use tinyint UNSIGNED
- Storing an exact floating-point number must use decimal instead of the float and double
- time fields, except for special cases where an int is used to record the Unix_timestamp
- storage year type. The
- stores dates using the date type.
- Storage time (accurate to seconds) it is recommended to use the timestamp type because timestamp uses 4 bytes and datetime uses 8 bytes. The
- recommends using the int unsigned storage IPV4.
- do not use text, blob type
- as much as possible to prohibit the use of varbinary, blob storage pictures, files, and so on in the database. It is recommended to use other means of storage (TFS/SFS), MySQL only save pointer information.
- single record size is forbidden to exceed 8k (column length (Chinese) * * (UTF8) + column Length (English) *)
What is the difference between datetime and timestamp?
Same point: The timestamp column is displayed in the same format as the datetime column. The display width is fixed at 19 characters, and the format is Yyyy-mm-dd HH:MM:SS.
Different points:
- TIMESTAMP
- 4 bytes Storage, time range: 1970-01-01 08:00:01 ~ 2038-01-19 11:14:07
- Values are saved in UTC, involving time zone conversions, converting to the current time zone when stored, and then converting back to the current time zone when retrieved.
- Datetime
- 8 bytes Storage, Time range: 1000-01-01 00:00:00 ~ 9999-12-31 23:59:59
- Actual format storage, independent of time zone
How do I use automatic assignment properties for timestamp?
- Use the current time as the default value for TS: TS TIMESTAMP defaults current_timestamp.
- When the row is updated, update the value of TS: TS TIMESTAMP default 0 on update current_timestamp.
- You can combine 1 and 2: TS TIMESTAMP DEFAULT current_timestamp on UPDATE current_timestamp.
How do I use the int unsigned storage IP?
Use the int unsigned instead of char (15) to store the IPv4 address and convert it through the MySQL function Inet_ntoa and Inet_aton. The Ipv6 address currently has no conversion function and requires a decimal or two bigint to store it.
8, if no note, all fields are set not NULL, and set the default value;
9. Prohibit storing plaintext password in database
10, if no note, all the Boolean fields, such as Is_hot, is_deleted, must set a default value, and set to 0;
11, if no note, the sort field order_id in the program by default in descending order;
12, the shape definition does not add the length, for example uses the int, but is not int[4]
Int[m] , what does the M value mean?
Note that numbers that follow the parentheses of a numeric type only represent width and are not related to the storage range. Many people think that int (4) and int (10) have a value range of (-9999 to 9999) and (-9999999999 to 9999999999), which is wrong. In fact, when the M-value in the integral type is used in conjunction with the Zerofill attribute, the column values can be achieved. Regardless of the number of M values in Int[m], the range of values is (2147483648 to 2147483647 signed), (0 to 4294967295 unsigned).
The display width does not limit the range of values that can be saved within the column, nor does it limit the display of values that exceed the specified width of the column. When combined with optional extended attribute Zerofill, the default supplemental space is replaced with 0. For example: for a column declared int (5) Zerofill, the value 4 is retrieved as 00004. Note that if you save a value that exceeds the display width in the integer column, you will encounter problems when MySQL generates temporary tables for complex joins, because in these cases MySQL believes that the data fits the original column width, and if you specify Zerofill for a numeric column, MySQL automatically adds the unsigned property to the column.
13. Use varbinary to store case-sensitive variable-length strings
When to use char, when to use varchar?
Char and varchar types are similar, but they are saved and retrieved in different ways. They are also different in terms of their maximum length and whether trailing spaces are retained. The length of the char and varchar type declarations represents the maximum number of characters that you want to save. For example, a CHAR (30) can occupy 30 characters.
- The length of the char column is fixed to the length declared when the table was created. The length can be any value from 0 to 255. When you save a char value, a space is filled to the right of them to reach the specified length. When a char value is retrieved, the trailing space is removed. No case conversions are made during the storage or retrieval process.
- The value in the varchar column is a variable-length string. The length can be specified as a value between 0 and 65,535. (The maximum effective length of varchar is determined by the maximum row size and the character set used.) The overall maximum length is 65,532 bytes).
In contrast to char, the varchar value is saved with only the required number of characters, plus one byte to record the length (two bytes if the column declaration is longer than 255). The varchar value is not populated when it is saved. The trailing spaces on the value save and retrieve are still preserved, conforming to standard SQL.
Char is suitable for storing the MD5 hash value of the user's password, which is always the same length. For frequently changing values, char is better than varchar, because fixed-length rows are not prone to fragmentation, and for very short columns, char is more efficient than varchar. char (1) string takes only one byte for a single-byte character set, but varchar (1) consumes 2 bytes because 1 bytes are used to store the length information.
Index Design Specification
MySQL's query speed relies on a good index design, so indexing is critical for high performance. A reasonable index will speed up the query (including the speed of the update and delete, MySQL will load the page containing the row into memory, and then update or delete operations), the unreasonable index will slow down. The MySQL index looks similar to the pinyin and radical lookups of the Xinhua dictionary, and when the pinyin and the radical index does not exist, it can only be found by page-by-page paging. When the MySQL query cannot use the index, MySQL makes a full table scan and consumes a lot of IO. Purpose of the index: de-weight, accelerate positioning, avoid sorting, overwrite index.
What is an overlay index?
In the InnoDB storage engine, there is no direct storage of row addresses in secondary index (non-primary key indexes), and the primary key value is stored. If a user needs to query a data column that is not contained in secondary index, it needs to be queried two times by secondary index to find the primary key value and then querying to the other data column through the primary key. The concept of an overlay index is that the query can be done in an index, the coverage index is more efficient, and the primary key query is a natural overwrite index. Reasonable index creation and reasonable use of query statements, when used to overwrite the index can achieve performance gains. For example, select Email,uid from User_email WHERE uid=xx, if the UID is not a primary key, you can add the index as index (UID,EMAIL) at the appropriate time for a performance gain.
Basic Specifications for indexes
1, index number control, the number of indexes in a single table is not more than 5, the number of fields in a single index not more than 5.
- Comprehensive assessment of data density and distribution
- Consider querying and updating scales
Why can't there be too many indexes in a single table?
InnoDB's secondary index is stored using B+tree, so the b+tree needs to be adjusted at update, DELETE, insert, and too many indexes slow down the update.
2, the prefix index to the string, the prefix index length of not more than 8 characters, it is recommended to prioritize the prefix index, if necessary, can add pseudo-columns and build the index.
- Do not index fields such as Blob/text, do not index large fields, doing so will make the index occupy too much storage space
What is a prefix index?
The prefix index, in other words, is to index the first few characters of the text (specifically, when several characters are indexed) so that the index is smaller, so the query is faster. The prefix index can effectively reduce the size of the index file and increase the speed of the index. But the prefix index also has its drawbacks: MySQL cannot use prefix indexes in ORDER by or GROUP by, nor can they be used as an overlay index (covering index).
Syntax for establishing a prefix index: ALTER TABLE table_name ADD KEY (column_name (prefix_length));
3. Primary KEY criteria
- Table must have a primary key
- Do not use columns that are frequently updated
- Try not to select string columns
- Do not use UUID MD5 HASH
- Default use of non-null unique keys
- Recommended selection of self-increment or generator number
4. Important SQL must be indexed, core SQL takes precedence over overwrite index
- The Where Condition column of the UPDATE, DELETE statement
- ORDER by, GROUP by, distinct fields
- fields for multi-table joins
5. The field with the highest degree of differentiation is placed in front
- Select the better filter of the field in the front, such as the number, UserID, etc., type,status and other filtering generally not recommended in the front
- The index is based on the left prefix principle, when a federated index (A,B,C) is established, the query condition only contains (a) or (a, b) or (a,b,c) to go to the index, (a,c) as a condition can only use the column a index, so this time to determine the return of a column must not be too much, Otherwise the statement design is unreasonable, (b,c) can not go index
- Reasonable creation of federated indexes (avoid redundancy), (A,B,C) equivalent to (a), (b), (A,B,C)
6. Index Taboo
- Do not index on low cardinality columns, such as "gender"
- Do not perform mathematical operations and function operations on indexed columns
- Do not index small tables that are commonly used
7, try not to use foreign keys
- Foreign keys are used to protect referential integrity and can be implemented on the business side
- Actions on parent and child tables affect each other, reducing availability
- The InnoDB itself limits the online DDL
Limitations of indexing in MYSQL
- MYISAM storage Engine The sum of index lengths cannot exceed 1000 bytes
- BLOB and TEXT type columns can only create prefix indexes
- MYSQL does not currently support function indexing
- MYSQL cannot use an index when it is not equal to (! = or <>).
- After a filter field uses a function operation such as ABS (column), MySQL cannot use the index.
- MySQL cannot use index when join condition field type is inconsistent in join statement
- When using the like operation, if the condition starts with a wildcard character (such as '%abc ... '), MySQL cannot use the index.
- MYSQL cannot use a Hash index when using non-equivalence queries.
Statement Design specification
1. Using pre-compiled statements
- More efficient than passing SQL statements with only arguments
- One parse, multiple use
- Reduce SQL injection probability
2. Avoid implicit conversions
- Will cause the index to fail
3. Make full use of the prefix index
- Must be the leftmost prefix
- It is not possible to use two range conditions at the same time
- Queries that do not use% leading, such as like "%ab"
4, do not use negative query, such as not in/like
- Unable to use index, resulting in full table scan
- Full table scan results in reduced buffer pool utilization
5. Avoid using stored procedures, triggers, UDFs, events, etc.
- Let the database do the best thing
- Reduced business coupling, sacle out, sharding
- Avoid bugs
6, avoid the use of large table join
- MySQL is best at a single-table primary key/Two-level index query
- Join consumes more memory, resulting in temporary tables
7. Avoid mathematical operations in the database
- MySQL is not good at mathematical arithmetic and logic judgment
- Unable to use index
7, reduce the number of interactions with the database
- INSERT ... On DUPLICATE KEY UPDATE
- REPLACE into, insert IGNORE, insert into VALUES (), (), ()
- UPDATE ... WHERE ID in (10,20,50,...)
8. Reasonable use of paging
- Number of pages to limit page impressions
- You can only click on the previous page, next page
- Using deferred correlation
How to use paging correctly?
If there is a paging statement similar to the following: SELECT * FROM table ORDER by ID LIMIT 10000, 10
Since the processing of limit offset in MySQL is to remove all data from the Offset+limit, then offset is removed and the bottom limit is returned. Therefore, when the value of offset is large, the query performance of MySQL will be very low. can be resolved using the ID > N method:
The use of ID > N is limited in the way, for the problem of the ID discontinuity, it can be solved by passing the last ID method at the same time when turning the page.
1234567 |
//output, find the maximum minimum ID in the current result set//Next pagehttp://example.com/page.php?last=100 SELECT * from table where ID< +ORDER by ID desc limit< c14> Ten // prevhttp://example.com/page.php?first=110 SELECT * from table where ID> Theorder by ID desc limit< c14> Ten |
The big drawback of this approach is that if there is an insert/delete operation in the browse, the page turn will not be updated, and the total number of pages may still be calculated based on the new count (*), which may eventually result in some records not being accessed. To fix this problem, you can continue to introduce and cache the current page number and whether there are insertions/deletions that affect the total number of records after the last page turn
Other variant modes:
1 |
SELECT * FROM table where ID >= (select ID from table or Der by id limit #offset #, 1) |
9, reject large SQL, split into small SQL
- Make the most of query CACHE
- Take advantage of multi-core CPUs
10, use in instead of or,in value of not more than 1000
11. Prohibit the use of order by rand ()
12. Use explain diagnostics to avoid creating temporary tables
The explain statement (executed in the MySQL client) can obtain information on how MySQL executes the SELECT statement. By performing a explain on the SELECT statement, you can know if MySQL uses information such as indexes, full table scans, temporary tables, sorting, and so on when executing the SELECT statement. Try to avoid MySQL for full table scans, temporary tables, sorting, and so on. See official documentation.
13. Use UNION ALL instead of union
What is the difference between union all and union?
The Union and UNION ALL keywords combine two result sets into one, but they differ in both usage and efficiency.
The Union will filter out duplicate records after the table link is made, so the resulting set of results will be sorted after the table is connected, the duplicate records are deleted and the results returned. Such as:
123 |
SELECT * from test_union1 Union SELECT * from test_union2 |
This SQL takes out the results of two tables at run time, then sorts the duplicate records with the sort space, and finally returns the result set, which may cause the disk to be sorted if the table data volume is large.
The union all simply merges two results and returns. Thus, if there are duplicate data in the two result sets returned, the returned result set will contain duplicate data.
In terms of efficiency, union All is much faster than union, so if you can confirm that the combined two result sets do not contain duplicate data, then use Union ALL, as follows:
123 |
SELECT * from test_union1 Union All SELECT * from test_union2 |
14, the program should catch SQL exception processing mechanism
15. Prohibit a single SQL statement to update multiple tables at the same time
16, do not use SELECT *, select statement only get the required fields
- Consumes CPU and IO, consumes network bandwidth
- Cannot use overwrite index
- Reduce the impact of table structure changes
- Because of the large, select/join may generate temporary tables
17. UPDATE, DELETE statement does not use limit
18. The INSERT statement must explicitly indicate the field name without using INSERT into table ()
19. Insert statement using batch submission (INSERT into table VALUES (), (), () ... ), the number of values does not exceed 500
20. Use COUNT (*) instead of Count (Primary_key) and COUNT (1) for the number of records in the statistics table Note: only for MyISAM
21, data update is recommended to use a Level two index first query the primary key, and then the data update according to the primary key
22, prohibit the use of cross-Library query
23, prohibit the use of sub-query, it is recommended to convert the subquery into an associative query
24, for the VarChar Type field program processing, please verify the user input, do not exceed its preset length;
Specification of sub-table
Single-year data volume of more than 500w or data capacity of more than 10G to consider the sub-table, the need to consider the historical data migration or the application to delete historical data, the use of an equal balance of the table or according to Business rules table can be. The data table to be divided must be consulted with the DBA for a table policy
- Hash table with hash, table name suffix using decimal number, subscript starting from 0
- By date time table to conform to YYYY[MM][DD][HH] format
- Adopt the appropriate sub-database sub-table strategy. such as the Thousand library ten tables, 10 library hundred meters and so on
- It is forbidden to use partition table, partition table is strict to partition key, partition table is more difficult to perform DDL, sharding, single table recovery after the table becomes bigger.
- Split large and low-frequency fields, separating hot and cold data
Code of Conduct
- Batch import, export data must be notified to DBA for assistance in advance observation
- Prevents background management and statistical class queries from being performed on-line from the library
- No application account with super privilege exists
- Timely notification of DBA assistance in the event of a non-database-caused failure of the product
- Promotions or new features on-line must notify DBA in advance for traffic assessment
- Database data loss, timely contact DBA for recovery
- Multiple alter operations on single table must be combined into one operation
- Do not store business logic in MySQL database
- Database project selection and design of major projects must be informed of DBA involvement in advance
- Communicate with DBA in advance to determine maintenance and backup priorities for particularly important library tables
- Do not bulk update, query database other specifications during peak business hours
- Submit Online table Change request, must specify all relevant SQL statements
Other specifications
Log class data is not recommended to be stored on MySQL, giving priority to HBase or oceanbase, as needed for storage please look for DBA evaluation using compressed table storage.
MySQL naming, design, and usage specification MySQL naming, design and use code