[Mysql] database usage specifications, mysql database specifications

Source: Internet
Author: User
Tags md5 hash percona percona server

[Mysql] database usage specifications, mysql database specifications
I. MySQL Problems

  • The optimizer does not support complex SQL statements.

  • Poor Support for SQL standards

  • The large-scale cluster solution is not mature, mainly refers to Middleware

  • ID generator, global auto-increment ID

  • Asynchronous logical replication, data security issues

  • Online DDL

  • Incomplete HA solution

  • The backup and recovery solutions are still complex and dependent on external components.

  • Too little information is displayed to users, making troubleshooting difficult

  • Numerous Branches, making it hard to choose

Ii. database environment Introduction

Generally, Internet companies' databases are divided into five database environments:

  • Dev: Development environment, development can read and write, can modify the table structure; commonly used 163 of database tables; developers can modify the table structure, can modify the data in it; but need to ensure that it does not affect other development colleagues

  • Qa: Test environment. developers can read and write data. developers can use tools to modify the table structure.

  • Sim: Simulate the environment, develop readable and writable, and use the web Platform; when initiating a launch request, it will be preexecuted in this environment first, this environment can also be used to deploy online drills or stress tests and read and write data.

  • Real: Production database slave database (quasi-Real-Time Synchronization), read-only environment, not allowed to modify data, not allowed to modify the table structure; for online problem search, data query, etc.

  • Online: Online environment. Developers are not allowed to perform database operations directly in the online environment. If you need to perform operations, You must contact the DBA to perform the operations and make corresponding records.

Machines in these environments must have clear permission division, separate read/write accounts, and have a degree of identification to differentiate specific services. For example, if the user name is w_wap, r_wap can tell that the read/write account is a wap application.

Iii. Database Development specifications

The development specifications include the basic naming and constraints, field design specifications, index specifications, and usage specifications.

Normative meaning
  • Ensure online database schema specifications

  • Reduce the probability of a problem

  • Convenient automated management

  • Long-term adherence to specifications is a win-win situation for developers and DBAs.

Constraints
  • Select UTF8 as the table character set. To store emoj expressions, use UTF8mb4 (supported after MySQL 5.5.3)

  • The storage engine uses InnoDB

  • Use varchar varbinary as long as possible

  • Images and files are not stored in the database

  • Add comments when designing a table
  • The data volume of a single table is limited to less than 0.1 billion. The physical size of a single table cannot exceed 10 GB, and the average length of a row cannot exceed 8 KB.

  • Do not perform database stress testing online

  • Prohibit direct database connection from test and development environments

create database test_crm default character set=utf8;
Basic naming rules
  • Reserved Words are not allowed for database name, table name, and field name

  • The database name, table name, field name, and index name are separated by lowercase letters.

  • The database name, table name, field name, and index name must not be too long. It cannot contain more than 32 characters. Use as few characters as possible to express the purpose of the table.

  • Temporary database and temporary table names must be prefixed with tmp and suffixed with date

  • The backup database and table must be prefixed with bak and suffixed with a date.
  • The database name, table name, field name, and index name use nouns as database names. Only English is used, and Chinese Pinyin is not required.

  • The database name must contain letters, lowercase letters, and 3-7 letters.

  • If the database name contains multiple words, the words are separated by underscores (_).

Table sharding Specification
  • Partition Table prohibited

  • Split large fields and low access frequency fields to separate cold and hot data

  • HASH is used for table sharding. The table name suffix uses the decimal number, and the subscript starts from 0.

  • Table sharding by seasonal time must comply with YYYY [MM] [DD] [HH] format

  • Adopt appropriate database/table sharding policies

Field Specification
  • All fields are defined as not null, unless you really want to save NULL, but I cannot think of the need to use Null

  • The smaller the field type, the better. UNSIGNED is used to store non-negative integers. There are not many negative numbers in actual use.

  • Use TIMESTAMP to store the time, and use unsigned int to store IPv4 addresses instead of CHAR (15). This method can only store IPv4 addresses, but cannot store IPv6 addresses.

  • Use VARCHAR to store variable-length strings. Of course, note that M in varchar (M) refers to the number of characters rather than the number of bytes;

  • Use DECIMAL instead of FLOAT and DOUBLE to store precise floating point numbers

  • Try not to use BLOB TEXT

  • Use TINYINT to replace the ENUM type and convert the character to a number.

  • Do not store plaintext passwords in Databases

  • Use VARBINARY to store case-sensitive variable strings

Index Specification
  • The number of fields in a single index cannot exceed 5, and the number of indexes in a single table cannot exceed 5. The index design follows the B + Tree index leftmost prefix matching principle.

  • Select a column with a high degree of discrimination as the index.

  • Use a prefix index for a string. The prefix index length cannot exceed 8 characters.
  • Prefix indexes are recommended. If necessary, you can add pseudo columns and create indexes.

  • The index can cover 80% of the major queries, which is not comprehensive and solves the main contradiction of the problem.

  • Appropriate indexes should be created for DML, order by, and group by fields.

  • Avoid implicit conversion of Indexes

  • Avoid redundant Indexes

  • Primary Key: The table must have a primary key. Columns with frequent updates are not used; string columns are not selected; UUID MD5 HASH is not used; non-null unique keys are used by default, we recommend that you select Auto-incrementing or poster
  • Important SQL statements must be indexed: WHERE condition columns of UPDATE and DELETE statements; fields of ORDER BY, GROUP BY, and DISTINCT; fields of Multi-table JOIN

  • Covering indexes for core SQL statements

  • Do not create an index on a low base column, such as "gender"

  • Mathematical operations and function operations are not performed on index Columns

  • Try not to use the external foreign key. The foreign key is used to protect the integrity of the reference and can be implemented on the business side. operations on the parent and sub-tables affect each other to reduce the availability. INNODB itself limits online DDL.

  • Do not make a forward % query, such as like "% AB"

  • Do not use negative queries, such as not in/like "cannot use indexes, resulting in full table Scan

In the implicit conversion example, the field is defined as varchar, but the input value is an int, which will cause a full table scan, requiring the program to perform a type check.

Field: remark varchar (50) NOT Null

mysql>SELECT id, gift_code FROM gift WHERE deal_id = 640 AND remark=115127;1 row in set (0.14 sec)mysql>SELECT id, gift_code FROM pool_gift WHEREdeal_id = 640 AND remark=‘115127’; 1 row in set (0.005 sec)
SQL specifications
  • Make the explain pre-compiled statement and pass only parameters, which is more efficient than passing SQL statements and reduces the SQL Injection probability.

  • Full use of prefix Indexes

  • Try not to use stored procedures, triggers, functions, and so on, so that the database can do what it is best.

  • Avoid using JOIN for large tables. The MySQL optimizer is too simple to optimize the join policy.

  • Avoid performing mathematical operations and other computing tasks in the database

  • SQL merge mainly refers to merging multiple values in DML to reduce interaction with the database.

  • Reasonable paging, especially large Paging

  • The UPDATE and DELETE statements do not use LIMIT, which may cause inconsistency between the master and slave nodes.

  • Replace or with in. The in value cannot exceed 1000.

  • Disable order by rand ()

  • SQL statements avoid using temporary tables

  • Use union all instead of union

  • The program should handle SQL exceptions.

  • Prohibit a single SQL statement from updating multiple tables at the same time

  • When reading data, SELECT only the required columns. Do not SELECT * each time to avoid serious random read problems, especially reading some TEXT/BLOB columns.
  • Generally, the performance of subqueries is poor. We recommend that you convert it into JOIN statements.

  • When multi-table join queries are performed, the types of joined fields should be as consistent as possible, and all associated fields must have indexes.

  • When querying a multi-Table connection, use a table with a small result set (note: this refers to the filtered result set, which is not necessarily a small full table data volume) as the driving table.

  • When multiple tables are joined and sorted, the sorting field must be in the driving table. Otherwise, the index cannot be used in the sorting column.

  • Use multiple composite indexes instead of multiple independent indexes. For example, do not create independent indexes for columns with a base (Cardinality) that is too small (for example, the total number of unique values of this column is less than 255 ).

  • For SQL statements similar to the paging function, we recommend that you first associate with the primary key and then return the result set, which is much more efficient.

Iv. DBA standard version Selection
  • MySQL Community edition, the largest user group

  • MySQL Enterprise Edition, paid

  • Percona Server with many new features

  • MariaDB version, few domestic users

We recommend that you select the MySQL Community edition> Percona Server> MariaDB> MySQL Enterprise Edition.

Main Content
  • SQL audit, DDL audit, and operation time, especially OnlineDDL

  • High-risk operation check, data backup before Drop

  • Log analysis mainly refers to MySQL slow logs and error logs

  • Data backup solution

Online DDL

Native MySQL needs to lock the table when executing DDL statements, and the business cannot write data during the lock table, which has a great impact on the service. MySQL has poor support for this aspect.

Pt-online-schema-change is recommended.

The advantages of using pt-online-schema-change include:

  • No blocking write

  • Comprehensive condition detection and delayed load policy control

Restrictions on Using pt-online-schema-change include:

  • It takes a long time to change the table (compared to modifying the table directly using alter table)

  • The modified table must have a unique key or primary key.

  • Concurrent Modification on the same port cannot be too many

MySQL cluster Solution
  • Based on master-slave replication;

  • Based on middleware/proxy

  • NDB-based Engine

  • Galera-based protocol

MHA is recommended first:AvailableOne master, multiple slaves, OrDual-master, multi-slaveIn this mode, MHA or MMM can be used to manage the entire cluster. The latest MHA also supports the GTID mode of MySQL 5.6.

MHA has obvious advantages:

  • Open-source, Perl-based development, clear code structure, and easy secondary development;

  • The solution is mature. During failover, MHA will make strict judgments to minimize data loss and ensure data consistency;

  • Provides a general framework for custom development based on your own situation, especially for judging and switching operation steps;

  • Supports binlog server to improve binlog transfer efficiency and further reduce the risk of data loss.

However, MHA has some limitations:

  • Ssh trust needs to be interconnected between nodes, which is a challenge for some companies' security systems, because if a node is cracked by hackers, other nodes will suffer;

  • The provided scripts also need to be further supplemented and improved. Of course, the general usage is still enough.

Splitting Problems
  • Solve the problem of excessive single-host writing pressure and capacity

  • Vertical Split and horizontal split

  • Moderate splitting. Do not over-split

  • It is best to have an intermediate layer to control the splitting logic, otherwise the splitting will be too costly to manage.

Data backup
  • Full backup VS Incremental Backup

  • Hot Standby VS Cold Standby

  • Physical backup VS logical backup

  • Delayed backup

  • Full binlog backup

Recommended method:

  • Hot Backup + physical backup

  • Core Business: delayed backup + logical backup

  • Full binlog backup

The main tasks are as follows:

  • Centralized backup policy scheduling management

  • Xtrabackup hot standby

  • Backup result statistical analysis

  • Backup data consistency check

  • Distributed File System storage and backup

Reasons for the Distributed File System Used in the backup system:

  • Solve Storage Allocation Problems

  • Solve the problem of low storage NFS backup Efficiency

  • Centralized storage management

  • Better data reliability

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.