MySQL Database Optimization Summary 2

Source: Internet
Author: User
Tags: mysql, index

Description: The environment for this article is CentOS 5.5 (64-bit) / MySQL 5.1.50.

Introduction: After using MySQL for some time and doing a good deal of MySQL optimization, design, and maintenance work, I took a couple of days to write a brief summary, both to refresh my own memory and in the hope that it will be of some help to others.

I. Hardware Configuration Optimization

    • CPU: prefer multi-core, high-frequency CPUs
    • Memory: the more, the better
    • Disk: prefer faster disks, RAID, and array (RAID controller) cards
    • Network: deploy on a LAN where possible; use SCI or fiber links, gigabit networking, and dual network cables for redundancy; bind and listen on multiple ports (0.0.0.0)

II. OS-Level Optimization

    • Use a 64-bit operating system to make better use of large memory.
    • Mount filesystems with noatime and nodiratime:

$ cat /etc/fstab

LABEL=/     /      ext3  defaults,noatime,nodiratime  1 1
/dev/sda5   /data  xfs   defaults,noatime,nodiratime  1 2

    • Optimize kernel parameters (/etc/sysctl.conf):

net.ipv4.tcp_keepalive_time = 7200
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.neigh.default.gc_thresh3 = 2048
net.ipv4.neigh.default.gc_thresh2 = 1024
net.ipv4.neigh.default.gc_thresh1 = 256
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.forwarding = 1
net.ipv4.conf.default.proxy_arp = 0
net.core.netdev_max_backlog = 2048
net.core.dev_weight = 64
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_rfc1337 = 1
net.ipv4.tcp_sack = 0
net.ipv4.tcp_fin_timeout = 20
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_max_orphans = 32768
net.core.optmem_max = 20480
net.core.rmem_default = 16777216
net.core.rmem_max = 16777216
net.core.wmem_default = 16777216
net.core.wmem_max = 16777216
net.core.somaxconn = 500
net.ipv4.tcp_orphan_retries = 1
net.ipv4.tcp_max_tw_buckets = 18000
net.ipv4.ip_forward = 0
net.ipv4.conf.all.rp_filter = 1
kernel.sysrq = 1
net.ipv4.conf.default.send_redirects = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.ip_local_port_range = 5000 65000
kernel.shmmax = 167108864
vm.swappiness = 0

    • Increase the file descriptor limit:

vim /etc/security/limits.conf

Add:

* soft nofile 65535
* hard nofile 65535

    • File system: prefer XFS

/dev/sda5   /data  xfs   defaults,noatime,nodiratime  1 2

III. MySQL Design Optimization

III.1 Storage Engine Selection

    • MyISAM: applications with low concurrency, more reads than writes, good index usage, and relatively simple SQL statements; TB-scale data warehouses.
    • InnoDB: applications with high concurrency, many writes, or requirements such as foreign keys and transactions; systems with large memory.

III.2 Naming Rules

    • Follow the naming convention of most development languages, e.g. myAddress
    • Follow the naming convention of most open source projects, e.g. my_address
    • Avoid arbitrary naming

III.3 Field Type Selection

General principles for choosing field types:

    • Choose the appropriate type for the requirement, and make it as small as the requirement allows.
    • Allocate only the minimum number of characters needed; do not be over-generous.

Reason: smaller field types take up less memory, consume less disk space, less disk I/O, and less bandwidth.

III.3.1 Integer types

| Type      | Bytes | Min (signed)         | Max (signed)        | Min (unsigned) | Max (unsigned)       |
|-----------|-------|----------------------|---------------------|----------------|----------------------|
| TINYINT   | 1     | -128                 | 127                 | 0              | 255                  |
| SMALLINT  | 2     | -32768               | 32767               | 0              | 65535                |
| MEDIUMINT | 3     | -8388608             | 8388607             | 0              | 16777215             |
| INT       | 4     | -2147483648          | 2147483647          | 0              | 4294967295           |
| BIGINT    | 8     | -9223372036854775808 | 9223372036854775807 | 0              | 18446744073709551615 |

Choose the smallest integer type that satisfies the requirement; if INT is enough, do not use BIGINT.
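As a rough illustration of this principle, a small helper can mechanically pick the smallest signed type using the ranges from the table above (a sketch; the function name and structure are my own, not from the article):

```python
# Signed ranges of the MySQL integer types, as listed in the table above.
SIGNED_RANGES = [
    ("TINYINT", -128, 127),
    ("SMALLINT", -32768, 32767),
    ("MEDIUMINT", -8388608, 8388607),
    ("INT", -2147483648, 2147483647),
    ("BIGINT", -9223372036854775808, 9223372036854775807),
]

def smallest_int_type(lo, hi):
    """Return the smallest signed MySQL integer type covering [lo, hi]."""
    for name, tmin, tmax in SIGNED_RANGES:
        if lo >= tmin and hi <= tmax:
            return name
    raise ValueError("value range exceeds BIGINT")
```

For example, a column holding values 0..40000 fits in MEDIUMINT; there is no need for INT.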

Store IP addresses in an unsigned INT (e.g. via MySQL's INET_ATON()/INET_NTOA()) instead of CHAR(15).
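To see why an IPv4 address fits in an unsigned 32-bit integer, here is a minimal Python sketch of the dotted-quad mapping that MySQL's INET_ATON()/INET_NTOA() functions perform (helper names are illustrative):

```python
def ip_to_int(ip):
    """Pack a dotted-quad IPv4 string into an unsigned 32-bit integer."""
    a, b, c, d = (int(x) for x in ip.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def int_to_ip(n):
    """Unpack an unsigned 32-bit integer back into dotted-quad form."""
    return ".".join(str((n >> s) & 0xFF) for s in (24, 16, 8, 0))
```

This uses 4 bytes instead of up to 15, and range scans over address blocks become simple integer comparisons.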

III.3.2 Floating-point types

| Type                   | Bytes | Precision        | Usage scenarios             |
|------------------------|-------|------------------|-----------------------------|
| FLOAT(M,D)             | 4     | Single precision | Low precision, small values |
| DOUBLE(M,D) (REAL)     | 8     | Double precision | Low precision, large values |
| DECIMAL(M,D) (NUMERIC) | M+2   | Custom precision | High precision requirements |

III.3.3 Time types

| Type      | Range of values                           | Storage | Zero value          |
|-----------|-------------------------------------------|---------|---------------------|
| DATE      | 1000-01-01 ~ 9999-12-31                   | 3 bytes | 0000-00-00          |
| TIME      | -838:59:59 ~ 838:59:59                    | 3 bytes | 00:00:00            |
| DATETIME  | 1000-01-01 00:00:00 ~ 9999-12-31 23:59:59 | 8 bytes | 0000-00-00 00:00:00 |
| TIMESTAMP | 1970-01-01 00:00:00 ~ some time in 2037   | 4 bytes | 00000000000000      |
| YEAR      | YEAR(4): 1901~2155; YEAR(2): 1970~2069    | 1 byte  | 0000                |

III.3.4 Character types

| Type                      | Maximum length | Storage required       |
|---------------------------|----------------|------------------------|
| CHAR[(M)]                 | M bytes        | M bytes                |
| VARCHAR[(M)]              | M bytes        | L+1 bytes              |
| TINYBLOB, TINYTEXT        | 2^8-1 bytes    | L+1 bytes              |
| BLOB, TEXT                | 2^16-1 bytes   | L+2 bytes              |
| MEDIUMBLOB, MEDIUMTEXT    | 2^24-1 bytes   | L+3 bytes              |
| LONGBLOB, LONGTEXT        | 2^32-1 bytes   | L+4 bytes              |
| ENUM('value1','value2',…) | 65,535 members | 1 or 2 bytes           |
| SET('value1','value2',…)  | 64 members     | 1, 2, 3, 4, or 8 bytes |

Note: L denotes the actual (variable) length of the stored value.

The choice between VARCHAR and CHAR depends on the engine and the situation; follow these principles:

    1. If the column's values are all the same size, or differ only slightly, use CHAR.
    2. If the column's values vary considerably in size, use VARCHAR.
    3. For MyISAM tables, prefer CHAR, especially for MyISAM and ISAM tables that are modified frequently and are therefore prone to fragmentation; the drawback is more disk space.
    4. For InnoDB tables, the internal row-storage format does not distinguish fixed-length from variable-length rows (every row has a header holding pointers to the individual columns), so CHAR is not necessarily better than VARCHAR. In fact, because CHAR typically takes more space than VARCHAR, VARCHAR is the better choice from the standpoint of reducing space consumption and disk I/O.
    5. As soon as a table contains one VARCHAR field, all of its CHAR fields are silently converted to VARCHAR, so it is advisable to separate fixed-length and variable-length data into different tables.
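A quick back-of-the-envelope comparison based on the storage rules in the table above (assuming a single-byte character set; the helper names are my own):

```python
def char_bytes(m):
    """CHAR(M) always occupies M bytes, regardless of the stored value."""
    return m

def varchar_bytes(actual_len):
    """VARCHAR occupies the actual length L plus a 1-byte length prefix
    (for values up to 255 bytes)."""
    return actual_len + 1
```

For a CHAR(100) column whose values average 10 characters, switching to VARCHAR saves 100 - 11 = 89 bytes per row.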

III.4 Encoding Selection

Single-byte: latin1

Multi-byte: UTF-8 (a CJK character takes 3 bytes; an English letter takes 1 byte)

If the data contains Chinese characters, it is best to use UTF-8 uniformly, to avoid garbled text.

III.5 Primary Key Selection Principles

Note: the primary key design here is mainly aimed at the InnoDB engine.

    1. It must uniquely identify the row.
    2. Explicitly define a numeric auto-increment field as the primary key, used only as the key and for no other purpose.
    3. The primary key should be a single column, to improve the efficiency of join and filter operations.
    4. Keep the primary key field type as small as possible: use SMALLINT rather than INT, and INT rather than BIGINT, when you can.
    5. Try to ensure the primary key field is never updated; changes to primary key values cause data-storage fragmentation and reduce I/O performance.
    6. The primary key should not contain dynamically changing data such as timestamps or creation/modification time columns.
    7. The primary key should be generated automatically by the system.
    8. Place the primary key field first in the table's column order.

It is recommended to use a numeric primary key with the AUTO_INCREMENT attribute.

III.6 Other Considerations

    • NULL or NOT NULL

Where possible, define every field as NOT NULL unless there is a special requirement, for the following reasons:

    1. An index containing a nullable column takes more disk space, because the NULL values require extra storage.
    2. Comparisons involving NULL make programs more complex.
    3. Columns containing NULL are special cases that are hard for SQL to optimize; in a composite index, a nullable field can greatly reduce the efficiency of the whole index.

    • Indexes

Advantages of indexes: they greatly speed up queries, reducing the number of rows scanned and locked.

Disadvantages of indexes: they take up disk space, slow down data updates, and increase disk I/O.

Guidelines for adding indexes:

    1. Prefer unique indexes.
    2. Index fields that are frequently sorted, grouped, or used in unions.
    3. Index fields that are commonly used as query conditions.
    4. Limit the number of indexes; more is not better.
    5. Prefer indexing small amounts of data; consider prefix indexes for long fields.
    6. Delete indexes that are no longer used, or rarely used.
    7. Give covering indexes priority for core SQL.
    8. Avoid string primary keys.

    • Denormalized design

Appropriate use of redundant, denormalized design, trading space for time, can sometimes be very efficient.

IV. MySQL Software Optimization

    • Enable MySQL replication for read/write splitting and load balancing, spreading the read load across multiple slave servers to improve overall processing capacity.
    • Use the recommended GA version for better performance.
    • Use the new partitioning feature to split very large data sets.

V. MySQL Configuration Optimization

Note: once a global parameter is set, the pre-allocated resources are reserved when the server starts.

    • key_buffer_size

The MyISAM index buffer; focus on this parameter if you use MyISAM. Judge it by the ratio key_reads / key_read_requests.
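A sketch of the suggested check: the two counters come from SHOW GLOBAL STATUS LIKE 'Key_read%', and the helper below simply forms their ratio (the function name and the idea of a "healthy" threshold are my assumptions, not from the article):

```python
def key_cache_miss_ratio(key_reads, key_read_requests):
    """Fraction of index-read requests that had to go to disk."""
    if key_read_requests == 0:
        return 0.0
    return key_reads / key_read_requests
```

A ratio that stays small (e.g. well below 1%) suggests key_buffer_size is adequate.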

    • innodb_buffer_pool_size

The most important InnoDB engine parameter, buffering data, indexes, and logs; judge it by the hit ratio and file I/O.

    • wait_timeout

Thread connection timeout; avoid setting it very large. 10s is recommended.

    • max_connections

The maximum number of connections the server allows. Avoid setting it too large, since an excessive value can cause memory exhaustion; it should be checked against the following formula:

SET @k_bytes = 1024;
SET @m_bytes = @k_bytes * 1024;
SET @g_bytes = @m_bytes * 1024;

SELECT
(
    @@key_buffer_size + @@query_cache_size + @@tmp_table_size +
    @@innodb_buffer_pool_size + @@innodb_additional_mem_pool_size +
    @@innodb_log_buffer_size +
    @@max_connections *
    ( @@read_buffer_size + @@read_rnd_buffer_size + @@sort_buffer_size +
      @@join_buffer_size + @@binlog_cache_size + @@thread_stack )
) / @g_bytes AS max_memory_used_gb;
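The same estimate can be sketched outside the server, for example in Python; the sample values below are illustrative assumptions only, not recommendations:

```python
def max_memory_gb(global_buffers, per_thread_buffers, max_connections):
    """Worst-case memory: global buffers plus per-thread buffers
    multiplied by the connection limit (all sizes in bytes)."""
    total = sum(global_buffers.values()) \
        + max_connections * sum(per_thread_buffers.values())
    return total / 1024 ** 3

# Hypothetical settings for illustration.
est = max_memory_gb(
    {"key_buffer_size": 256 << 20, "innodb_buffer_pool_size": 2 << 30},
    {"sort_buffer_size": 2 << 20, "read_buffer_size": 1 << 20,
     "join_buffer_size": 1 << 20},
    max_connections=500,
)
```

If the worst-case total exceeds physical RAM, lower max_connections or the per-thread buffers.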

    • thread_concurrency

The number of concurrent threads, typically (number of CPUs + number of disks) * 2; judge it by the request queue shown by the OS.

    • sort_buffer_size

Speeds up ORDER BY, GROUP BY, SELECT DISTINCT, and UNION DISTINCT.

    • read_rnd_buffer_size

Speeds up ORDER BY when rows are read back in key-sorted order.

    • join_buffer_size

Buffer size used for joins that perform full table scans; judge it by Select_full_join.

    • read_buffer_size

Buffer reserved for queries doing full table scans; judge it by Select_scan.

    • tmp_table_size

Size limit for in-memory temporary tables; tables exceeding it are converted to on-disk tables. Judge it by Created_tmp_disk_tables.

    • innodb_log_file_size (default 5M)

Size of the InnoDB engine's redo log files; a larger value means a longer recovery time.

    • innodb_flush_method (default fdatasync)

On Linux, O_DIRECT can be used for data files to avoid OS-level caching; O_DIRECT improves the I/O submission performance of data files and log files.

    • innodb_flush_log_at_trx_commit (default 1)

    1. 0: the log is written to the buffer and flushed to disk once per second.
    2. 1: the log is written to the buffer and flushed to disk at every transaction commit.
    3. 2: the log is written to the buffer at every transaction commit and flushed to disk once per second.

VI. MySQL Statement-Level Optimization

    1. Counting rows in InnoDB is expensive; consider a separate MyISAM table with periodically updated statistics. This applies when the statistics do not need to be exact.
    2. Try not to do computation inside the database.
    3. Avoid negative conditions and fuzzy queries with a leading % wildcard.
    4. Do not apply operations or functions to indexed columns.
    5. Do not use SELECT * in production queries; select only the columns you need.
    6. Use LIMIT wherever possible to reduce the number of rows returned, cutting transfer time and bandwidth waste.
    7. Avoid using functions on columns in the WHERE clause; a function applied to a query column prevents index use.
    8. Avoid implicit type conversions: character values must be quoted, and numeric values must not be quoted.
    9. Write all SQL keywords in uppercase and be consistent, so that statements that differ only in case do not waste system resources.
    10. When joining tables, put the small result set first: let the small result set drive the large one.
    11. Enable the slow query log and periodically use EXPLAIN to optimize the statements it records.
    12. General query-optimization guidelines:

      1. To optimize a query, avoid full-table scans as far as possible; first consider building indexes on the columns involved in WHERE and ORDER BY.

      2. Try to avoid NULL tests on columns in the WHERE clause; otherwise the engine abandons the index and performs a full table scan.

      SQL code: SELECT id FROM t WHERE num IS NULL;

      You can instead set a default value of 0 on num, make sure the num column contains no NULLs, and query:

      SQL code: SELECT id FROM t WHERE num = 0;

      3. Try to avoid the != or <> operators in the WHERE clause; otherwise the engine abandons the index and performs a full table scan.

      4. Try to avoid joining conditions with OR in the WHERE clause; otherwise the engine abandons the index and performs a full table scan.

      SQL code: SELECT id FROM t WHERE num = 10 OR num = 20;

      You can query like this instead:

      SQL code: SELECT id FROM t WHERE num = 10 UNION ALL SELECT id FROM t WHERE num = 20;

      5. IN and NOT IN should also be used with caution, since they can result in full table scans, for example:

      SQL code: SELECT id FROM t WHERE num IN (1, 2, 3);

      For consecutive values, use BETWEEN instead of IN:

      SQL code: SELECT id FROM t WHERE num BETWEEN 1 AND 3;

      6. The following query also causes a full table scan:

      SQL code: SELECT id FROM t WHERE name LIKE '%c%';

      To be more efficient, consider a full-text index.

      7. Using a local variable in the WHERE clause also causes a full table scan. Because SQL resolves local variables only at run time, the optimizer cannot defer the choice of access plan until then; it must choose at compile time. But at compile time the variable's value is still unknown, so it cannot be used as an input for index selection. The following statement performs a full table scan:

      SQL code: SELECT id FROM t WHERE num = @num;

      You can instead force the query to use an index:

      SQL code: SELECT id FROM t WITH (INDEX(index_name)) WHERE num = @num;

      8. Try to avoid expression operations on fields in the WHERE clause; they cause the engine to abandon the index and perform a full table scan.

      SQL code: SELECT id FROM t WHERE num / 2 = 100;

      You can query like this instead:

      SQL code: SELECT id FROM t WHERE num = 100 * 2;

      9. Try to avoid function calls on fields in the WHERE clause; they cause the engine to abandon the index and perform a full table scan. For example:

      SQL code: SELECT id FROM t WHERE SUBSTRING(name, 1, 3) = 'abc'; -- ids whose name starts with 'abc'

      should read:

      SQL code: SELECT id FROM t WHERE name LIKE 'abc%';

      10. Do not perform functions, arithmetic, or other expression operations on the left side of the "=" in a WHERE clause, or the system may be unable to use the index correctly.

      11. When using an indexed field as a condition, if the index is a composite index, the first field of the index must appear in the condition for the system to use the index; also keep the field order consistent with the index order where possible.

      12. Do not write meaningless queries, such as generating an empty table structure with:

      SQL code: SELECT col1, col2 INTO #t FROM t WHERE 1 = 0;

      This kind of code returns no result set but still consumes system resources; change it to:

      SQL code: CREATE TABLE #t (...);

      13. In many cases, replacing IN with EXISTS is a good choice:

      SQL code: SELECT num FROM a WHERE num IN (SELECT num FROM b);

      Replace it with the following statement:

      SQL code: SELECT num FROM a WHERE EXISTS (SELECT 1 FROM b WHERE b.num = a.num);

      14. Not every index helps every query. SQL optimizes queries based on the data in the table; when an indexed column contains many duplicate values, the query may not use the index at all. For example, if a table has a sex column in which male and female each make up about half the rows, building an index on it does nothing for query efficiency.

      15. More indexes are not always better. Indexes improve the efficiency of the corresponding SELECTs but reduce the efficiency of INSERT and UPDATE, because each INSERT or UPDATE may rebuild the index, so the choice of indexes needs careful, case-by-case consideration. A table should preferably have no more than six indexes; if it has more, consider whether the rarely used ones are really necessary.

      16. Avoid updating clustered-index data columns where possible, because the order of clustered-index columns is the physical storage order of the table's rows; once a value changes, reordering the whole table's rows consumes considerable resources. If the application needs to update clustered-index columns frequently, reconsider whether the index should be clustered at all.

      17. Use numeric fields where possible; a field containing only numeric information should not be designed as a character type, which hurts query and join performance and increases storage overhead. The engine compares strings character by character when processing queries and joins, whereas a numeric type needs only one comparison.

      18. Use VARCHAR/NVARCHAR instead of CHAR/NCHAR where possible: first, variable-length fields take less storage; second, searching within a smaller field is obviously more efficient.

      19. Never use SELECT * FROM t anywhere; replace "*" with a specific column list, and do not return columns you will not use.

      20. Prefer table variables to temporary tables. If a table variable holds a large amount of data, note that its indexes are very limited (primary key index only).

      21. Avoid frequently creating and deleting temporary tables, to reduce consumption of system table resources.

      22. Temporary tables are not unusable; used appropriately they can make certain routines more efficient, for example when you need to repeatedly reference a data set from a large or frequently used table. For one-off operations, however, an export table is better.

      23. When creating a temporary table with a large one-time data load, use SELECT INTO instead of CREATE TABLE, to avoid generating a large amount of log and to increase speed; if the data volume is small, use CREATE TABLE followed by INSERT.

      24. If temporary tables are used, explicitly delete them all at the end of the stored procedure: TRUNCATE TABLE first, then DROP TABLE, to avoid long locks on system tables.

      25. Avoid cursors where possible; they are inefficient, and if a cursor operates on more than 10,000 rows, consider rewriting.

      26. Before adopting a cursor-based or temporary-table-based method, look first for a set-based solution; set-based approaches are usually more efficient.

      27. Like temporary tables, cursors are not unusable. Using a FAST_FORWARD cursor on a small data set is often better than other row-by-row methods, especially when several tables must be referenced to obtain the data. Routines that compute "totals" in the result set are usually faster than doing so with a cursor. If development time permits, try both the cursor-based and the set-based approach and see which works better.

      28. SET NOCOUNT ON at the beginning of all stored procedures and triggers, and SET NOCOUNT OFF at the end, so the server does not send a DONE_IN_PROC message to the client after each statement of a stored procedure or trigger.

      29. Avoid large transactions where possible, to improve the system's concurrency. Indexes let SQL traverse tables faster; the index created by default is non-clustered, but that is not always optimal. Under a non-clustered index, data is physically stored on data pages in random order. A sound index design should be based on analysis and prediction of the workload's queries. In general:

      A. For columns with many duplicate values that frequently appear in range queries (>, <, >=, <=), ORDER BY, or GROUP BY, consider a clustered index;

      B. For multiple columns that are frequently accessed together, each containing duplicate values, consider a composite index;

      C. Design composite indexes so that key queries are index-covered where possible, with the most frequently used column as the leading column. Indexes help performance, but more is not better: too many indexes make the system inefficient, since every index added to a table means corresponding index-maintenance work on every update.

      30. Analyze and check tables periodically.

      Syntax for analyzing a table: ANALYZE [LOCAL | NO_WRITE_TO_BINLOG] TABLE tbl_name [, tbl_name] ...

      This statement analyzes and stores the table's key distribution. The results give the system accurate statistics so that SQL can generate correct execution plans. If the actual execution plan is not what you expect, analyzing the table may solve the problem. During analysis, the table is locked with a read lock. This works for MyISAM, BDB, and InnoDB tables.

      For example, to analyze a table: ANALYZE TABLE table_name;

      Syntax for checking a table: CHECK TABLE tbl_name [, tbl_name] ... [option] ... option = {QUICK | FAST | MEDIUM | EXTENDED | CHANGED}

      Checking a table verifies that one or more tables have no errors. CHECK TABLE works for MyISAM and InnoDB tables, and for MyISAM tables it also updates the key statistics.

      CHECK TABLE also checks views for errors, such as a table referenced in the view definition that no longer exists.

      31. Optimize tables periodically.

      Syntax for optimizing a table: OPTIMIZE [LOCAL | NO_WRITE_TO_BINLOG] TABLE tbl_name [, tbl_name] ...

      If a large part of a table has been deleted, or a table with variable-length rows (VARCHAR, BLOB, or TEXT columns) has been changed heavily, use the OPTIMIZE TABLE command. It merges the table's space fragments and reclaims the space wasted by deletes and updates, but it only works on MyISAM, BDB, and InnoDB tables.

      Example: OPTIMIZE TABLE table_name;

      Note: tables are locked while ANALYZE, CHECK, and OPTIMIZE run, so perform these operations when the MySQL database is not busy.

      Additional points:

      1. Minimize format (type) conversions in high-volume queries.

      2. ORDER BY and GROUP BY: when using ORDER BY and GROUP BY clauses, any suitable index helps SELECT performance.

      3. Any operation on a column results in a table scan, including database functions and computed expressions; move operations to the right side of the equals sign whenever possible.

      4. IN and OR clauses often use worktables and defeat the index. If they do not produce large numbers of duplicate values, consider splitting the clause; the split clauses should be covered by an index.

      5. Use the smallest data type that meets your needs: for example, MEDIUMINT instead of INT.

      6. Try to make all columns NOT NULL; if you need to store a "no value" marker, set it explicitly rather than relying on a NULL default.

      7. Use the VARCHAR, TEXT, and BLOB types as little as possible.

      8. If a column's data is limited to a few known values, the ENUM type is best.

      9. As Graymice said, build indexes.

      10. Use split tables and partitioned tables appropriately to improve data storage and retrieval speed.

      Transferred from: http://blog.chinaunix.net/uid-20639775-id-3154234.html
