2014-10-13 15:04:34
goal : Support scalability. Optimize the structure of the database to improve query performance and support smooth scaling of tables.
anti-pattern : Clone table and Clone column
1. Split a very long table into multiple smaller tables, using a specific data field in the table to name the split tables.
2. Split a column into multiple columns, naming each new column after a distinct value of another column.
Both approaches aim to reduce the number of records per table, at the cost of creating either tables with many columns or many tables. In both scenarios,
you will find that as the amount of data grows, more and more tables or columns have to be created.
Cons: (1) Constantly generating new tables: to split the data into separate tables, you need a rule that defines which data belongs to which table.
(2) Managing data integrity: each split table has to define its own, different constraints.
(3) Synchronizing data: keeping the data in sync across tables takes multiple steps.
(4) Ensuring uniqueness: you need to ensure that primary keys are unique across all the split tables. If you need to move a record from one table to another,
you must make sure that its primary key value does not conflict with an existing primary key in the target table.
(If the primary key is an auto-increment int, it is difficult to keep the IDs consistent; if it is a GUID, this is easy to control.)
(5) Cross-table queries: as time goes on and more and more tables are created, the application code has to be updated constantly to include the newly created tables.
(6) Synchronizing metadata: once the table is split, adding a new column means adding it to every one of the split tables.
(7) Managing referential integrity: a foreign key must reference a single table, so it cannot point at a parent table that has been split. Problems also arise when the split table is a dependent table rather than a parent table.
(8) Identifying metadata-split columns: columns can be split by category as well. You can end up with a table whose many columns multiply along with the category they represent.
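Cons (4) and (5) can be reproduced in a few lines. The following is only a minimal sketch using Python's built-in sqlite3 module; the table names bugs_2009 and bugs_2010 and their columns are hypothetical, chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical clone tables: one copy of the same structure per year.
for year in (2009, 2010):
    conn.execute(
        f"CREATE TABLE bugs_{year} ("
        "bug_id INTEGER PRIMARY KEY AUTOINCREMENT, summary TEXT NOT NULL)"
    )

conn.execute("INSERT INTO bugs_2009 (summary) VALUES ('crash on save')")
conn.execute("INSERT INTO bugs_2010 (summary) VALUES ('slow report')")

# Con (5): a cross-table query has to list every clone table by name,
# so this statement must be edited whenever a new year's table appears.
rows = conn.execute(
    "SELECT 2009 AS year, bug_id, summary FROM bugs_2009 "
    "UNION ALL "
    "SELECT 2010 AS year, bug_id, summary FROM bugs_2010 "
    "ORDER BY year"
).fetchall()

# Con (4): each table numbers its rows independently, so the same
# auto-increment key value appears in both tables.
ids_2009 = {bug_id for year, bug_id, _ in rows if year == 2009}
ids_2010 = {bug_id for year, bug_id, _ in rows if year == 2010}
duplicate_ids = ids_2009 & ids_2010

print(rows)
print(duplicate_ids)
```

Moving a row from bugs_2009 into bugs_2010 with its original bug_id would fail here, because that key value is already taken in the target table.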
how to recognize the anti-pattern : It may be present when you hear questions or statements like these
1. So we need to create a table (or column) for every ...?
2. What is the maximum number of tables or columns the database supports?
3. We found that the program failed to add a record, because we forgot to add a new table for the new year.
4. How do I query many tables at once? Every table has the same columns.
"If you need to query many tables with the same structure, the data should all be stored in a single table, with an extra attribute column to group it."
5. How do I pass a table name as a parameter? The table name in the query has to be generated dynamically from an argument.
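The advice quoted under item 4 can be sketched as follows, again with sqlite3. The bugs table and its year_reported column are assumptions made for this example. The year becomes an ordinary query parameter, so no table name has to be built dynamically (item 5).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table for all years; year_reported is the extra attribute column
# that replaces the year suffix in the table name (hypothetical schema).
conn.execute(
    "CREATE TABLE bugs ("
    "bug_id INTEGER PRIMARY KEY, "
    "year_reported INTEGER NOT NULL, "
    "summary TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO bugs (year_reported, summary) VALUES (?, ?)",
    [(2009, "crash on save"), (2010, "slow report"), (2010, "wrong total")],
)


def bugs_in_year(year):
    """Return bug summaries for one year; the year is data, not metadata."""
    cursor = conn.execute(
        "SELECT summary FROM bugs WHERE year_reported = ? ORDER BY bug_id",
        (year,),
    )
    return [summary for (summary,) in cursor]


# A query across all years needs no UNION and no code change for a new year.
counts = conn.execute(
    "SELECT year_reported, COUNT(*) FROM bugs "
    "GROUP BY year_reported ORDER BY year_reported"
).fetchall()

print(bugs_in_year(2010))
print(counts)
```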
Legitimate uses of the anti-pattern :
A reasonable scenario for manually splitting tables is "archiving data": removing historical data from the database that is used daily.
This is usually done when queries against outdated data have become very rare.
If you don't need to query current and historical data at the same time, it's good practice to move old data out of the currently active table to somewhere else.
Archiving data into new tables that are compatible with the current table structure still supports occasional analytical queries while keeping daily queries very efficient.
"Backing up and restoring a medium-sized database is much more convenient than operating a database that stores terabytes of data."
Although it is not wrong to model the data and put everything into a single database, sensibly splitting
a database once its size exceeds a threshold can simplify database administration.
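An archiving step of this kind might look like the sketch below. It assumes a hypothetical bugs table with an ISO-formatted date_reported column and a bugs_archive table with the same structure. The copy and the delete run in one transaction, so a row is never lost or duplicated if one of the statements fails.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The active table and an archive table with a compatible structure
# (table and column names are assumptions made for this example).
for table in ("bugs", "bugs_archive"):
    conn.execute(
        f"CREATE TABLE {table} ("
        "bug_id INTEGER PRIMARY KEY, "
        "date_reported TEXT NOT NULL, "
        "summary TEXT NOT NULL)"
    )

conn.executemany(
    "INSERT INTO bugs (date_reported, summary) VALUES (?, ?)",
    [("2009-03-01", "crash on save"),
     ("2010-01-15", "slow report"),
     ("2010-06-30", "wrong total")],
)


def archive_before(cutoff):
    """Move rows reported before `cutoff` (ISO date) into bugs_archive."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO bugs_archive SELECT * FROM bugs WHERE date_reported < ?",
            (cutoff,),
        )
        deleted = conn.execute(
            "DELETE FROM bugs WHERE date_reported < ?", (cutoff,)
        ).rowcount
    return deleted


moved = archive_before("2010-01-01")
active = conn.execute("SELECT COUNT(*) FROM bugs").fetchone()[0]
archived = conn.execute("SELECT COUNT(*) FROM bugs_archive").fetchone()[0]
print(moved, active, archived)
```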
solution : Partitioning and normalizing
When a table holds a very large amount of data, instead of splitting it manually you can use horizontal partitioning, vertical partitioning, or a dependent table to improve query performance.
1. Horizontal partitioning (or sharding): splits the table by rows, without the drawbacks of manually split tables.
You only define the rules for splitting one logical table, and the database manages all the rest of the work for you.
Physically the table really is split, but you can still run SQL queries against it as if it were a single table.
2. Vertical partitioning: when certain columns are very large or rarely used, it is advantageous to split the table by column.
BLOB and TEXT columns are variable in size and can be very large. To improve storage and query performance, some databases automatically
store columns of these types separately from the other columns in the table. A query that does not touch the BLOB or TEXT columns can then
access the other columns more efficiently, which improves query performance.
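Where the database does not do this on its own, the same split can be made by hand: keep the large column in a second table that shares the primary key. This is only a sketch; the products and product_installers tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Frequently read, small columns stay in the main table.
conn.execute(
    "CREATE TABLE products ("
    "product_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# The large, rarely read BLOB lives in its own table, one row per product.
conn.execute(
    "CREATE TABLE product_installers ("
    "product_id INTEGER PRIMARY KEY REFERENCES products(product_id), "
    "installer_image BLOB NOT NULL)"
)

conn.execute("INSERT INTO products VALUES (1, 'Example product')")
conn.execute(
    "INSERT INTO product_installers VALUES (1, ?)", (b"\x00" * 1024,)
)

# Everyday queries never touch the BLOB table.
names = [name for (name,) in conn.execute(
    "SELECT name FROM products ORDER BY product_id"
)]

# The BLOB is joined in only when it is actually needed.
row = conn.execute(
    "SELECT p.name, length(i.installer_image) "
    "FROM products p JOIN product_installers i USING (product_id) "
    "WHERE p.product_id = ?",
    (1,),
).fetchone()

print(names)
print(row)
```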
3. Fixing metadata-split columns: create a dependent table, storing one row per value instead of one column per value.
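As a sketch of item 3, suppose a projects table had been growing one bugs-fixed column per year. The hypothetical project_history table below stores one row per project and year instead, so a new year adds rows rather than columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE projects ("
    "project_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# Dependent table: the year is a value in a row, not part of a column name.
conn.execute(
    "CREATE TABLE project_history ("
    "project_id INTEGER NOT NULL REFERENCES projects(project_id), "
    "year INTEGER NOT NULL, "
    "bugs_fixed INTEGER NOT NULL, "
    "PRIMARY KEY (project_id, year))"
)

conn.execute("INSERT INTO projects VALUES (1, 'Example project')")
conn.executemany(
    "INSERT INTO project_history VALUES (1, ?, ?)",
    [(2008, 23), (2009, 40), (2010, 12)],
)

# Totals across years are a plain aggregate; no column list to maintain.
total = conn.execute(
    "SELECT SUM(bugs_fixed) FROM project_history WHERE project_id = ?",
    (1,),
).fetchone()[0]

years = [year for (year,) in conn.execute(
    "SELECT year FROM project_history WHERE project_id = 1 ORDER BY year"
)]

print(total)
print(years)
```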
Conclusion : Don't let data multiply metadata.
SQL anti-patterns study note 9: Metadata splitting