DBA Operational Specifications
1, involved in business modification/deletion of data, in the business party, the CTO's mail approval before the executable, before the implementation of the backup, if necessary reversible.
2, all on-line demand must go to work Order system, oral notice as invalid.
3, when the table structure changes to the large table, such as modifying the field properties will cause the lock table, and will cause delay from the library, thus affecting the online business, must be in the service low peak period after 0:00 execution, and unified with the tool Pt-online-schema-change avoid lock table and reduce delay execution time.
Examples of Use:
#pt-online-schema-change--alter= "Add index Ix_id_no (id_no)" \
--no-check-replication-filters--recursion-method=none--user=dba \
--password=123456 d=test,t=t1--execute
For MongoDB to create an index to create in the background, avoid locking the table.
Examples of Use:
Db.t1.createIndex ({idcardnum:1},{background:1})
4, all online business libraries must be built MHA high-availability architecture, to avoid a single point of problem.
5, to the business party open permissions, password to use MD5 encryption, at least 16 bits. permissions, such as no special requirements, are select query permissions, and do the library table-level restrictions.
6. Delete the default blank password account.
Delete from Mysql.user where user= ' and password= ';
Flush privileges;
7, the summary library to open the Audit audit log function, when there is a problem can be traced.
Code of Conduct
8, prohibit a MySQL instance to store multiple business databases, will cause the business coupling is too high, once the problem will be implicating, increase the difficulty of locating the fault problem. Usually with multi-instance solution, one instance of a business library, non-interference.
9, prohibit in the main library to perform background management and statistical class function query, this complex class of SQL will cause the CPU rise, which will affect the business.
10, Batch cleaning data, need to develop and DBA review, should avoid the peak hours of business execution, and observe the service status during the execution.
11, promotion and so on should communicate with DBA in person in advance, carry on the flow assessment, such as increase the machine memory or extend the architecture in advance, prevent the DB to have the performance bottleneck.
12, prohibit the online database stress test.
Basic specifications
13. Prohibit storing plaintext passwords in the database.
14. Use the InnoDB storage engine.
支持事务,行级锁,更好的恢复性,高并发下性能更好。InnoDB表避免使用COUNT(*)操作,因内部没有计数器,需要一行一行累加计算,计数统计实时要求较强可以使用memcache或者Redis。
15, table character set unified use UTF8.
Does not generate garbled risk.
16, all tables and fields need to add Chinese comments.
Convenient for others, convenient for themselves.
17, do not store pictures, files and other big data in the database.
Pictures, files more suitable for the GFS Distributed file system, the database stored hyperlinks.
18. Avoid using stored procedures, views, triggers, events.
MySQL is an OLTP application, the most good at simple to increase, delete, change, check operation, but the application of Logic computing analysis class, it is not suitable, so this part of the demand is best achieved through the program.
19, avoid the use of foreign keys, foreign keys to protect referential integrity, can be implemented on the business side.
A foreign key causes coupling between the parent and child tables, greatly affecting SQL performance, excessive lock waits, and even deadlocks.
20, the transaction consistency requirements are not high business, such as log table, such as priority to choose to deposit MongoDB.
Its own support for the Sharding Shard feature enhances the ability to scale horizontally, developing without having to adjust the business code too much.
Library table Design Specification
21. The table must have a primary key, such as a self-increment primary key.
This ensures that the data rows are written sequentially, that the performance of the SAS traditional mechanical hard drive writes is better, and that the associated query is better based on the primary key, and also facilitates data warehouse extraction. From a performance standpoint, using the UUID as the primary key is the most bad way to make the insertion random.
22. Prohibit the use of partitioned tables.
The benefit of partitioned tables is that, for development, you can easily implement a split of a table without modifying the code, using the backend DB settings, such as splitting the Time field. But this involves a problem, the field of the query must be a partition key, otherwise it will traverse all the partition table, and will not bring a performance improvement. In addition, the partitioned table is still a table in the physical structure, and when we change the table structure, it does not result in a performance improvement. Therefore, you should use the form of a cut table to do the split, such as the program needs to do a query on historical data, can be associated with the query through union ALL. In addition, as time goes by, the historical data tables are no longer needed, simply dump them from the vault, which is a convenient way to migrate to the backup machine.
Field Design Specification
23. Use decimal instead of float and double to store the exact floating-point number.
The disadvantage of floating-point numbers is that it can cause accuracy problems, see the following example:
mysql> CREATE TABLE T3 (C1 float (10,2), C2 decimal (10,2));
Query OK, 0 rows affected (0.05 sec)
mysql> INSERT into T3 values (999998.02, 999998.02);
Query OK, 1 row affected (0.01 sec)
Mysql> select * from T3;
+-----------+-----------+
| C1 | C2 |
+-----------+-----------+
| 999998.00 | 999998.02 |
+-----------+-----------+
1 row in Set (0.00 sec)
You can see that the value of the C1 column is changed from 999998.02 to 999998.00, which is caused by the inaccuracy of the float floating-point number type. Therefore, the precision-sensitive data, such as currency, should be represented or stored in fixed-point numbers.
24. Use tinyint instead of enum type.
With enum enum type, there will be problems with the extension, such as the user's online status, if this increase: 5 means do not Disturb, 6 means that the meeting, 7 is invisible to the friend visible, then add the new enum value to do the DDL modification table structure operation.
25, the field length as far as possible according to the actual needs of distribution, do not arbitrarily allocate a large capacity.
The general principle of selecting a field is to keep small, so that you can use a field with fewer bytes. For example, the primary key, it is strongly recommended to use int integer type, without uuid, why? Save space AH. What is space? Space is efficiency! Press 4 bytes and press 32 bytes to locate a record, who is fast who is slow too obvious. The effect is more pronounced when several tables are involved in joins. Smaller field types consume less memory, consume less disk space and disk I/O, and consume less bandwidth.
There are a lot of developers in the Design table field, as long as the value type of all with int, but this is not necessarily appropriate, such as the user's age, generally speaking, the age is mostly between 1~100岁, length is only 3, then with int is not suitable, can use tinyint instead. Another example of user online status, 0 means offline, 1 means online, 2 means leave, 3 means busy, 4 means stealth, etc., in fact, such as this situation, with int is not necessary, waste space, using tinyint can fully meet the need, int occupies 4 bytes, and tinyint takes 1 bytes.
int integer signed (signed) the maximum value is 2147483647, while unsigned (unsigned) maximum is 4294967295, if your demand does not store negative numbers, then the proposed change to signed (unsigned), you can increase the range of int storage.
There is no difference between int (10) and int (1), 10 and 1 are only widths, which is useful when setting the Zerofill extended attribute, for example:
[Email protected] (test) 10:39>create table test (ID int (TEN) Zerofill,id2 int (1));
Query OK, 0 rows affected (0.13 sec)
[Email protected] (test) 10:39>insert into test values ();
Query OK, 1 row affected (0.04 sec)
[Email protected] (test) 10:56>insert into test values (1000000000,1000000000);
Query OK, 1 row affected (0.05 sec)
[Email protected] (test) 10:56>select * from test;
+------------+------------+
| ID | Id2 |
+------------+------------+
| 0000000001 | 1 |
| 1000000000 | 1000000000 |
+------------+------------+
2 rows in Set (0.01 sec)
26, the field is defined as NOT NULL to provide a default value.
From the application layer point of view, you can reduce the program to determine the code, such as you want to query a record, if there is no default, you have to determine whether the field corresponding to the variable is set, if not, you have to use Java to put the variable "or 0, if the default value, the judging condition can be directly skipped.
Null values are difficult to query optimization, which makes index statistics more complex and requires special processing within MySQL.
27. Do not use text or blob types as much as possible.
Increased storage space consumption, slow read speed.
Index specification
28, the index is not more the better, according to the actual need to create.
An index is a double-edged sword that can improve query efficiency but also reduces the speed of insertions and updates and consumes disk space. An appropriate index is critical to the performance of your application, and it is extremely fast to use indexing in MySQL. Unfortunately, indexes also have associated overhead. Each time you write to a table (such as INSERT, Updateh, or delete), if you have one or more indexes, MySQL also updates each index so that the index increases the cost of writing to each table. The benefits of an index's performance gain can only be enjoyed when a column is used in a WHERE clause. If you do not use an index, it will have no value and will result in maintenance overhead.
29. The field of the query must create an index.
For example: 1, SELECT, UPDATE, DELETE statement where Condition column, 2, multi-table join field.
30, not in the index column for mathematical operations and function operations.
Unable to use index, resulting in full table scan.
Example: SELECT from t WHERE year (d) >= 2016;
Because MySQL does not support function indexing as Oracle does, even if the D field is indexed, it is scanned directly across the table.
Should be changed to----->
SELECT from T WHERE d >= ' 2016-01-01 ';
31. Do not create indexes on low cardinality columns, such as ' gender '.
Sometimes, full table browsing is faster than having to read indexes and data tables, especially if the index contains an evenly distributed set of data. A typical example of this is gender, which has two evenly distributed values (both male and female). It takes about half of the line to read by sex. It is faster to scan a full table in a case.
32, do not use the% leading query, such as like '%xxx '.
Unable to use index, resulting in full table scan.
Low-efficiency queries
SELECT from t WHERE name is like '%de% ';
----->
Efficient query
SELECT from T WHERE name is like ' de% ';
33, do not use reverse query, such as not in/not like.
Unable to use index, resulting in full table scan.
34. Avoid redundant or duplicate indexes.
The Federated Index Ix_a_b_c (A,B,C) is equivalent to (a), (A, B), (A,B,C), and the Index (a), (a, b) is superfluous.
SQL Design Specification
35. Do not use SELECT * To obtain only the necessary fields.
消耗CPU和IO、消耗网络带宽;无法使用覆盖索引。
36. Replace or with in.
Low-efficiency queries
SELECT * from t WHERE loc_id = ten or loc_id = or loc_id = 30;
----->
Efficient query
SELECT * from T WHERE loc_in in (10,20,30);
37. Avoid inconsistent data types.
SELECT * FROM t WHERE id = ' 19 ';
----->
SELECT * FROM t WHERE id = 19;
38, reduce the number of interactions with the database.
INSERT into t (ID, name) VALUES (1, ' Bea ');
INSERT into t (ID, name) VALUES (2, ' Belle ');
INSERT into t (ID, name) VALUES (3, ' Bernice ');
----->
INSERT into t (ID, name) VALUES (1, ' Bea '), (2, ' Belle '), (3, ' Bernice ');
Update ... where ID in (1,2,3,4);
Alter table tbl_name Add column col1, add column col2;
39, reject large SQL, split into small SQL.
Low-efficiency queries
SELECT from tag
JOIN tag_post on tag_post.tag_id = tag.id
JOIN post on tag_post.post_id = Post.id
WHERE tag.tag = ' MySQL ';
Can be decomposed into the following queries instead
----->
Efficient query
SELECT from tag WHERE tag = ' MySQL '
SELECT from tag_post WHERE tag_id = 1234
SELECT from Post WHERE post_id in (123, 456, 567, 9098, 8904);
40. Prohibit the use of order by rand ()
SELECT * from T1 WHERE 1=1 ORDER by RAND () LIMIT 4;
---->
SELECT from t1 WHERE ID >= ceil (RAND ()) LIMIT 4;
?
The DBA's 40-mil rule.