MySQL Architecture optimization Combat Series 1: Data type and index tuning full resolution

Source: Internet
Author: User
Tags time and date

First, data type optimization

Data type

    • Integer

Number types: integers and reals

tinyint (8), smallint (16), Mediuint (24), int (32), bigint (64) numbers indicate the corresponding maximum number of storage bits, such as tinyint ( -127---), tinyint unsigned Indicates that negative numbers are not allowed, the range is (0--255).

an int (11) in a regular database simply means that the number of control display characters is 11, and int (1) and int (20) are stored and computed the same, that is, int (1) can store 1111 (4 digits) as well.

    • Real

The real number is divided into parts.

    • float and double types support approximate calculations using standard floating-point arithmetic

    • Float occupies 4 bytes double takes 8 bytes

    • The decimal type is used to hold exact decimals

    • Decimal (18,9) 18 indicates the total number of digits before and after the decimal point 9 indicates the number of digits after the decimal

    • MySQL 5.0 + 4 bytes Save 9 digits

    • Decimal (18,9) occupies 9 bytes before the decimal point 4 bytes After the decimal point is 1 decimal places after 4 bytes

    • String type

varchar and char types

    • VARCHAR holds variable-length strings, consumes less storage space than fixed-length types, and consumes only the space needed.

      VARCHAR uses an extra 1 to 2 bytes of storage length, columns less than 255 use 1 bytes to save the length, greater than 255 is saved with 2 bytes, and varchar retains the space at the end of the string.

    • Char is fixed length, when you save Char value **mysql Remove any trailing space * *, and the space-time lattice is populated to the end of the string.

      Many char columns are more efficient than varchar, such as char (1) occupies 1 bytes for a single-byte character set, and varchar (1) consumes two bytes because 1 bytes are saved in length.

Generosity is not wise, it is allocating the space that is really needed.

    • Blob and text types

The only difference between blob and text is that BLOBs hold binary data, no character sets, and collations.

Select the optimized data type

    • Smaller, usually better.

Use less disk, memory, and CPU to ensure that the saved values are not underestimated, but the text has a character set and collation. MySQL cannot index the full length of these data types, nor can it use indexes for sorting.

    • It's simple.

Comparing integers is less expensive than comparing characters, using MySQL built-in types to save time and date, and using integers to save IP.

    • Try to avoid null

MySQL makes it difficult to optimize nullable column queries to program variable size indexes on fixed indexes (indexes on integer columns), no values can be substituted with 0 or empty strings, and null columns to NOT NULL result in a small performance boost.

    • Determine type

Like numbers, strings, time, and visual types can be determined, but like datetime and timestamp, the same type can be saved. Timestamp uses only half of the datetime space. You can save the time zone.

    • Use enum instead of string type

The enum column can hold 65535 different strings stored in a lookup table where MySQL is stored internally in the list.

The internal storage is the location of the string, and the actual table is stored in a string.

Create a table Fruit category field is an enum type containing 4 different kinds of fruit:


Insert 4 data, i.e. 4 different fruits. Where the last pineapple (pineapple) does not have an enum value, the empty data is inserted.

Discovers whether the field category holds or strings, in fact, the inside has already associated these strings to the location of the enum character.

Support for string search and location search

The disadvantage of EMU is that the ALTER TABLE structure is required if there is no corresponding enum before inserting the data.

The advantage of enum is that it consumes less storage space.

It is said that the enum is also better for joining query performance.

    • Date and Time type

DateTime save is from 1001 to 9999, the precision is seconds, and the stored value is 2016-05-06 22:39:40.

Timestamp saves the number of seconds since midnight January 1, 1970, the same as the Unix timestamp, providing 4 bytes of storage that can only represent 1970 through 2038. The default timestamp value is not NULL.

From_unixtime () function in MySQL converts a Unix timestamp to a date

Unix_timestamp () Convert Date to UNIX timestamp

If you want the precision of the following seconds to hold the date and time, you can use the bigint type to save the timestamp format in milliseconds, or use a double to hold the fractional portion of the second.

    • Select an identifier

Integer types are often the best choice for identifiers, are fast, and can use auto_increment to avoid using strings for identifiers, occupy a lot of space, and be slower than integer types.

    • Special types of data

usually use varchar (15) To save the IP address, in fact, the IP address is an unsigned 32-bit integer, not a string, the decimal point for readability only.

MySQL provides Inet_aton () Inet_ntoa (), which is used for IP addresses and integers before conversion.

Second, index optimization

Index Basics

Indexing helps MySQL to get data structures efficiently, and the larger the index (called "key") data in MySQL, the more important it is. The index is like a book, in order to find a specific topic in the book, view the table of contents, get page numbers.

Select Fruit_name from fruit where id = 5 The index column is located in the ID column, and the index looks up by value and returns any rows that contain that value.

If you index multiple columns of data, the order of the columns is important.

Storage Engine Description

    • MyISAM Storage Engine

    • Table Lock: MyISAM table-level lock

    • Automatic recovery of data is not supported: check and perform possible fixes before using power loss

    • Transactions not supported: no guarantee that a single command will be completed, multiple line update errors only some rows are updated

    • Only the index is cached in memory: Mysiam caches only the internal indexes of the process

    • Tight storage: Rows are saved only together

    • InnoDB Storage Engine

    • Transactional: InnoDB supports transactions and four levels of transaction isolation

    • FOREIGN key: InnoDB the only storage engine that supports foreign keys the CREATE TABLE command accepts foreign keys

    • Row-level Locks: locks are set at the line level with good concurrency

    • Multi-version: Multi-version concurrency control

    • Aggregate by primary key: Index aggregates by primary key

    • All indexes contain primary key columns: Indexes reference rows by primary key if you do not maintain a short index on the primary key, you will grow very large.

    • Optimized cache: InnoDB data and memory caches to the buffer pool to automatically build the hash index

    • Uncompressed index: Index does not use prefix compression, blocking auto_increment:innodb using table-level locks to generate new auto_increment

    • Count () with no cache: MyISAM saves the number of rows in the table InnoDB the count () in the full table or index scan

Index type

The index is implemented in the storage engine, not the service layer.

    • B-tree Index

Most of the index types mentioned are B-tree types, which can be used in the CREATE table and other commands MyISAM use prefix compression to reduce the index, INNODB does not compress the index, Myiam index references the rows that are indexed by row storage physical location, InnoDB reference rows by primary key value, the B-tree data store is ordered, the indexed columns are saved sequentially, the data access is accelerated, and the storage engine does not scan the entire table for the required data.

    • B-tree Index Instance

A query type that uses the B-tree index is good for full-key values, key-value ranges, or key-prefix lookups, and is useful only if the leftmost prefix of the index is used for super-search.

Match full name: Full key value match and all column matches in index

Find a man named Tang Kang born in 1991-09-23

Match leftmost prefix: B-tree find a man named Tang

Match column prefix: The first part of a value that matches a column finds the person whose last name begins with T

Match Range Value: Index find person whose last name is greater than tang less than Zhu

Match exactly one part and match another part of a range:

Find the person whose surname is Tang and the name begins with the letter K to precisely match the last_name column and to

First_Name Querying for Scopes

Query that accesses indexes only: B-tree supports queries that only access indexes and does not access rows

    • B-tree Limitations

B-tree Limitations: (Index order in case: Last_Name first_name DOB)

If the lookup does not send the leftmost start of the indexed column, it is useless to look for all the people named Kang, nor to find all the people who were born in a certain day, because these columns are no longer indexed to the leftmost, and you cannot use the index to find someone whose last name ends with a specific character.

You cannot skip indexed columns, that is, you cannot find all people whose last name is Tang and are born on a specific date, and if you do not define a value for the first_name column, MySQL can use only the first column of the index.

The storage engine cannot optimize any column to the right of the first range condition, such as the query is where last_name = ' Tang ' and first_name like ' K% ' and dob= ' 1993-09-23 ' Access can only use the index header two columns.

The importance of index column order!

    • Hash index

Currently only the memory storage engine supports the displayed hash index, and the memory engine is not commonly used by me, so let's just downplay it.

    • R-tree (spatial index)

MyISAM supports spatial indexes and can use geometry spatial data types.

The spatial index does not require that the WHERE clause use the index leftmost prefix to index the data comprehensively, and can efficiently use any combination of data lookups with the Mercontains () function.

    • Full-Text Indexing

Fulltext is a special index of the MyISAM table, and the keyword from the text is not directly compared to the value in the index.

Full-text indexes can be mixed with b-tree indexes, and the index value does not affect each other.

The full-text index is used for the match against operation instead of the normal where clause.

    • Prefix index and index selectivity

You typically index several characters instead of all values to save space and get good performance, while also reducing selectivity.

Index selectivity is the ratio of non-repeating index values to the total number of rows. High-selectivity indexes have the benefit of finding matching filters for more rows, and a unique index selection rate of 1 is the best state.

Blob columns, text columns, and long varchar columns, you must define a prefix index, and MySQL does not allow indexing of their full text.

    • Prefix index and index selective instance

Build data

#复制一份与cs_area表结构

#插入1600数据

#模拟真实数据

#表area有name列 need to index the name column prefix

#计算得比值接近0.9350 just fine.

#分别取 3 4 5-bit Name value calculation

#可知name列添加5位前缀索引就可以了

#Mysql不能在order by or group by query uses a prefix index and cannot be used as an overwrite index

    • Clustered index

A clustered index is not a separate index type, but a way to store the data.

Innodb's clustered index actually holds the B-tree index and data rows, "aggregation" means that the actual data rows and associated key values are saved together, and each table can have only one clustered index, so you cannot save the rows in two places at a time. (because the clustered index is not common to me, we skip the ~)

    • Overwrite Index

Indexes support efficient lookup of rows, and MySQL can also use indexes to receive columns of data. This does not have to read the row data, and when an indexed query is initiated, the extra column of the explain interpreter sees the using index.

#满足条件: #

# The fields of the select query must have an index full overlay

Select Last_name,first_name where both last_name and first_name must have an index

#不能在索引执行like操作

    • Using index scans for sorting

How MySQL Sorts results: Using file sorting, scanning an ordered index

The type Liejo in explain is "index" to describe the MySQL scan index. Simply scan the index quickly, if MySQL does not overwrite the query with an index, it will have to look for each row found in the index.

MySQL can have the same index as the table user Index (Uid,birthday) that is exercised for sorting and finding.

Using the sort index:

    • Avoid redundant and duplicate indexes

Duplicate index: The same type, in the same order in the same column to create the index, such as the table User ID column to add a unique (ID) constraint, id not null.


Primary key constrains index (ID), in fact these are the same index!

Extra index: If present (A) the index should extend it satisfies (A, A) index
(a B) index <==> (b)
(b) Index <==> (A)
(A, B) A leftmost prefix (b,a) B the leftmost prefix

    • Index instance Research

Design User table fields: Country, state/region, city, sex, age, eye, color function: Support Group criteria Search user support user sort user last online time

    • Supports multiple filter conditions

Do not add an index to a poorly selective column

    • Optimize sorting

Index and table Maintenance

Table Maintenance three goals: Find and repair damage, maintain accurate index statistics, and reduce fragmentation

    • Find and fix table corruption

Check Table command: Determine if the table is corrupt and can catch most table and index errors

Repair Table command: Repairing corrupted tables

MYISAMCHK: Offline Repair Tool

    • Update index statistics

Analyze table Cs_area Update index statistics for optimizer optimized SQL

The show index command checks the cardinality of the index

    • Reduce index and data fragmentation

The MyISAM engine uses the Optimize table to clear the Shard INNODB engine using ALTER TABLE. Engine =. Re-create INDEX

Regularization and non-regularization

    • Regularization and non-regularization

    • Regularization database: Each factor will only be expressed once, teacher table teacher (id,school_id), School table School
      (school_id,school_name) Advantages: Update information only changes one table disadvantage: Simple School name query requires associated table
      Non-regularization database: Information is duplicated or stored in multiple places

    • Teacher Table Teacher (id,school_id,school_name) School Table School
      (School_id,school_name)

    • Advantages: Easy to direct statistics for the school name of the teacher

      Cons: Update a table that needs to change one more

    • Regularization and non-regularization use: for example, to count the number of users to post the user table can add a field num_message save the total number of posts to avoid high-density query statistics

    • Cache and Summary Tables

Example: Count the exact amount of information released over the past 24 hours

    • Table Periodic creation

Cycle creation can get efficient tables without fragmentation and full-sorted indexes

Note: This method clears the data, just to get an index table that is not fragmented and efficient.

Count tables: For example, cache the number of user friends, file downloads usually create a separate table to maintain the fast maintenance counters.

Schedule Task periodic aggregate function queries to update the corresponding fields.

MySQL Architecture optimization Combat Series 1: Data type and index tuning full resolution

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.