Traps in the 4.2MySQL schema design
Because the MySQL implementation mechanism causes some specific errors, how to avoid them, and slowly:
1. Too many columns
The MySQL storage Engine API works by copying data in a row-buffered format at the server layer and storage engine layer, then decoding the buffered content into columns at the server layer, and converting the encoded columns from row buffers into rows of data is expensive . The MyISAM fixed-length line matches the server row structure exactly, no conversion is required, but the row structure of the variable-length row structure InnoDB always needs to be converted, and the cost of conversion depends on the number of columns.
2. Too many associations
Entity-Attribute-value EAV: Poor design mode, MySQL restricts each association operation to a maximum of 61 tables, but the EAV database requires a lot of self-correlation; a rough rule of thumb is that if you want queries to execute quickly and with good concurrency, a single query is best associated with 12 tables ;
3. Prevent excessive use of enumerations
Take care to prevent overuse of enumerations, use foreign keys to associate to a dictionary table or lookup table to find specific values, and in MySQL, to do an alter TABLEwhen adding values in the enumeration column table; MySQL5.0 earlier ALTER TABLE Blocking operations, 5.1 update versions, not adding values at the end of the list also require ALTER TABLE
4, not the invention is not invent here null
It is recommended to use 0, special value, empty string instead, try not to null, but do not go extreme, in some scenarios, using NULL is better:
CREATE TABLE ... (//Full 0 (impossible date) will cause many problems dt datetime NOT NULL default ' 0000-00-00 00:00:00 ' ... )
MySQL stores null values in the index , and Oracle does not
4.3 Paradigm and inverse paradigm
4.3.1 Advantages and disadvantages
1, normalization of the update operation faster
2, when the data is well-normalized, there is very little duplication of data, only need to modify less data
3, the normalized table is smaller, can be better put into memory, perform faster operation
4, rarely redundant data, retrieve list data less need distinct, GROUP BY statement
Disadvantages:
Need to correlate, have a price and may invalidate the index
Advantages and disadvantages of 4.3.2 inverse paradigm
Avoid correlation, data may be much faster than memory (avoids random I/O)
4.4 Cache tables and Summary tables
Cache table:
Works well for optimizing search and retrieval of query statements.
Stores tables that can easily fetch data from other tables (slower per fetch)
Summary table: Save a table that aggregates data by using the GROUP BY statement
Use to decide whether to maintain data in real time or to rebuild regularly, periodically rebuild : save resources, fewer fragments, and index of sequential Organization (efficient)
When rebuilding, ensure that the data is still available during operation, through the "shadow table" , the Shadow table: A table created behind the real table, after the completion of the table operation, you can switch the shadow table and the original table by the atomic rename operation
4.4.1 materialized View
pre-computed and existing on-disk tables can be refreshed and updated via various policies, MySQL is not natively supported, and can be implemented using the Justin Swanhart tool Flexviews:
Flexviews Composition:
Change data fetching, read server binary logs and resolve changes to related rows
A series of stored procedures that help you create and manage the definition of a view
Some tools to apply changes to materialized views in a database
Flexviews can incrementally recalculate the contents of a materialized view by extracting changes to the source table: No need to query raw data (efficient)
4.4.2 Counter Table
Counter table: Cache A user friend number, file download times, etc., it is recommended to create a separate table storage counters , to avoid the query cache failure;
Update plus transactions, only serial execution, in order to higher concurrency, the counter can be saved in more than one row, each time a random row of updates, to statistical results, aggregate query; (this I read two or three sides, may be more stupid, is the same counter to save a lot of points, each time to choose one of the updates, the final summation, it seems not very , read a few more times.)
4.5 Speed the ALTER TABLE operation
The most modified table structure for MySQL is: Create empty tables with new results, isolate all data from old tables, insert new tables, delete old tables
mysql5.1 and updates contain support for some types of "online" operations, which do not require a full lock table, and the latest version of InnoDB (the only InnoDB in MySQL5.5 and later versions) supports sorting to build indexes, faster and more compact layouts;
In general, most ALTER TABLE causes the MySQL service to break, using techniques for common scenarios :
1. Perform an ALTER TABLE operation on a machine that does not serve, and then switch to the main library of the extraction service
2, shadow copy, with the required table structure to create Zhang table-independent new table, by renaming, delete the table Exchange two tables (above)
Not all alter tables cause table rebuilds, and in theory you can skip the steps to create a table: column defaults actually exist in the table's. frm file, so can directly modify the file without altering the table itself, but MySQL has not yet adopted this optimization method, all of the modify column will cause the table to be rebuilt;
ALTER column: Pass frm file changes the default values for columns: ALTER TABLE allows you to modify columns with ALTER COLUMN, modify column change column, three different operations;
ALTER TABLE sakila.film ALTER COLUMN rental_duration set default 5;
4.5.1 Modify the frm file only
MySQL sometimes rebuilds the table when it is not necessary, and if willing to take some risks, make some other types of modifications without rebuilding the table: The following operation may not work properly, back up the data first
The following operation does not require rebuilding the table:
1, remove the auto_increment of a column
2, add, remove, change enum and set constants, if you remove the constants used, the query returns an empty string
The basic technique creates a new frm file for the desired table result, and then replaces the frm file of the table that already exists with it:
1. Create an empty table with the same structure and make the necessary modifications
2. Execute flush tables with read lock: Close all the tables being used and prohibit any tables from being opened
3. Exchanging frm files
4, the implementation of unlock tables release of the 2nd step of the Read lock
Example slightly
4.5.2 Quickly create MyISAM indexes
1, in order to efficiently load data into the MyISAM table, the common technique: Disable the index, load the data, restart the index: because the work of indexing is delayed until the data is loaded, the index can be built by sorting at this time, making the index tree less fragmented and more compact
However, if the unique index is not valid (disable keys), MyISAM constructs a unique index in memory and checks for uniqueness for each loaded row, and once the index size exceeds the valid memory, the load operation becomes slower;
2, in the modern version of InnoDB, there is a similar technique: first delete all non-unique index, then add new columns, and finally rebuild the deleted index (dependent on InnoDB fast Online index creation function) Percona Server can do these things automatically;
3, like the previous ALTER TABLE hacker method to speed up this operation, but need to do more work and take risks, which is useful to load data from the backup, such as already know all data is effective, and no need to do the unique Check
Create a table with the required table structure, excluding indexes (such as the load data file and the loaded table is empty, MyISAM can be sorted to build the index)
Load data into a table to build the MyD file
Create another empty table on the structure you want, this time to include the index, which creates the. frm. Myi file
Read Lock and Refresh table
Rename the second table of the frm file MYI, let MySQL think this is the first table of the file
Release Read lock
Use repair table to rebuild the table's index, which will be sorted to build the all index, including the unique index
4.6 Summary
Good schema design principles are common to use, but MySQL has its own implementation details to be aware of, generally speaking: as much as possible to keep anything small and simple is always good;MySQL likes simplicity (ok, me too)
It is best to avoid using bit
Use a small and simple suitable type;
Define an identity column with an integral type as much as possible
Avoid over-design, such as schema design that can lead to very complex queries, or many columns;
Null values should be avoided as much as possible unless there is a precise need in the real data model
Try to store similar, related values using the same type, especially the columns used in the association condition
Note variable-length strings, which can cause pessimistic allocation of memory by Max length when staging tables and sorting
Avoid the use of abandonment features, such as specifying the precision of floating-point numbers, or the display width of integers
Use enum and set carefully, although they are easy to use, but do not misuse, and sometimes they become traps.
The paradigm is good, but the inverse paradigm is sometimes necessary; it can also be a great benefit to pre-compute, cache, or generate a summary table
ALTER TABLE locks the table most of the way and rebuilds the entire table (painful) This chapter provides a number of risky approaches that most scenarios must use in other more general ways
Related articles:
"MySQL Database" chapter III Interpretation: Server performance profiling (top)
"MySQL Database" chapter III Interpretation: Server performance profiling (bottom)