Summary of MySQL database optimization methods

Source: Internet
Author: User


Database optimization is a complex task, because doing it well ultimately requires a good understanding of the whole system. While it is possible to get some results with only limited knowledge of the system or application, the better you want the optimization to be, the more you need to understand about the system.

1. Optimization overview

The most important factor in making a system run fast is its basic database design. You also need to know what kind of work your system will be doing and where its bottlenecks lie.

The most common system bottlenecks are the following:

Disk seeks. It takes time for a disk to locate a block of data. For modern disks, the average seek time is under 10 milliseconds, so in theory about 100 seeks per second are possible. This time improves only slowly with newer disks and is very hard to optimize for a single table. The way to optimize seek time is to distribute the data across multiple disks.

Disk reading and writing. Once the disk is in the correct position, the data must be read or written. Modern disks deliver at least 10-20 MB/s of throughput. This is easier to optimize than seeks, because data can be read in parallel from multiple disks.

CPU cycles. Once the data is in main memory (or was already there), it must be processed to get the desired result.

Memory bandwidth. When the CPU needs more data than fits in its cache, main-memory bandwidth becomes the bottleneck. This is uncommon in most systems, but it is a factor to be aware of.

1.1 Limitations of MySQL design

With the MyISAM storage engine, MySQL uses very fast table-level locking that allows multiple readers or a single writer at a time. The biggest problem with this engine occurs when a single table receives a steady stream of updates mixed with slow queries. If this situation exists for some tables, you can use another table type for them, as sketched below.
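As a minimal sketch (the table name is hypothetical, and this assumes the server has the InnoDB engine available), converting such a table to a transactional engine with row-level locking looks like this:

mysql> ALTER TABLE busy_table ENGINE = InnoDB;

After the conversion, updates lock individual rows rather than the whole table, so a slow query no longer blocks every writer.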

MySQL can work with both transactional and non-transactional tables. To make it easy to work with non-transactional tables (which cannot be rolled back when an error occurs), MySQL follows these rules:

All fields have default values

If you insert an "inappropriate" value into a field, such as a too-large value into a numeric field, MySQL sets the field to the "best possible value" instead of reporting an error. For numeric types this is 0, or the smallest or largest representable value; for strings it is either the empty string or as much of the string as the field can hold.

All calculated expressions return a value rather than signaling an error condition; for example, 1/0 returns NULL.

These rules implicitly mean that you cannot rely on MySQL to validate field contents; instead, the application must check values before storing them in the database, as the example below illustrates.
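A minimal illustration of this behavior, assuming a made-up table and an older server (or one running with strict SQL mode disabled, which is the situation this article describes):

CREATE TABLE t (n TINYINT, s VARCHAR(5));
-- 999 does not fit in TINYINT and the string is too long; instead of an error,
-- MySQL stores the clipped value 127 and the truncated string 'abcde' (with warnings)
INSERT INTO t VALUES (999, 'abcdefgh');
-- Division by zero yields NULL rather than an error
SELECT n, s, 1/0 FROM t;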

1.2 Portability of application design

Because different database servers implement different subsets of the SQL standard, it takes effort to write portable SQL applications. Simple queries and inserts are easy to keep portable, but the more demanding your requirements, the harder it gets; making an application that is also fast on many different database systems is harder still.

To make a complex application portable, first determine which database systems it needs to run on, then find out which features those systems support. Every database system has some weaknesses; in other words, each makes design compromises that lead to differences in behavior and performance.

You can use MySQL's crash-me program to find out which functions, types, and limits you can rely on across the database servers you have selected. crash-me does not check for every possible feature, but with about 450 tests it still gives a reasonable picture. One example of the kind of information crash-me provides: it tells you not to use column names longer than 18 characters if you want to support Informix or DB2.

The crash-me program and the MySQL benchmarks are all written to be fairly database-independent. By reading how they are implemented, you can get a good idea of what it takes to keep a program independent of any particular database. The programs can be found in the 'sql-bench' directory of the MySQL source distribution. Most of them are written in Perl and use the DBI interface; DBI solves much of the porting problem because it provides database-independent access methods.

If you strive for database independence, you need a good feel for the bottlenecks of the various SQL servers. For example, MySQL is very fast at retrieving and updating records in MyISAM tables, but has trouble mixing slow readers and writers on the same table. Oracle, on the other hand, has a big problem accessing rows that have just been updated (until they are flushed to disk). Transactional databases in general are not very good at generating summary tables from log tables, because in that case row locking is almost useless.

For an application to be truly independent of the database, the interface through which it manipulates data should be simple and extensible. Since C++ is available on most systems, it makes sense to use C++ classes as the interface to the database.

If you use features that are specific to one database (for example, the REPLACE statement, which exists only in MySQL), you should code a workaround that implements the same functionality for the other databases. Although the workaround may be slower, it lets the other databases perform the same tasks, as sketched below.
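A rough sketch of such a workaround (the table and column names are invented for illustration): the MySQL-specific statement on one hand, and a portable delete-then-insert sequence on the other, which should be wrapped in a transaction where the database supports one.

-- MySQL-specific: replaces any existing row with the same primary key
REPLACE INTO users (id, name) VALUES (1, 'alice');

-- Portable workaround: remove any existing row, then insert the new one
DELETE FROM users WHERE id = 1;
INSERT INTO users (id, name) VALUES (1, 'alice');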

In MySQL, you can use the /*! */ syntax to add MySQL-specific keywords to a query. Most other databases treat everything between /* and */ as a comment and ignore it.
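For instance (STRAIGHT_JOIN is a real MySQL keyword; the table names are illustrative), the following query forces MySQL to join the tables in the order listed, while other servers simply see a comment and run an ordinary join:

SELECT /*! STRAIGHT_JOIN */ c.name, o.total
FROM customers c, orders o
WHERE c.id = o.customer_id;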

When higher performance matters more than perfectly up-to-date data, as in some Web applications, you can add a caching layer in the application that stores query results, which gives much higher performance. Let old data expire after a suitable amount of time so the cache stays reasonably fresh. This is also a way to handle peak load: you can increase the cache size and expiration time until the load returns to normal.

In that case, the information used to create a table should include the initial size of the cache and how often the table should normally be refreshed. An alternative to an application-level cache is the MySQL query cache. With the query cache enabled, the server itself determines which results can be reused, which greatly simplifies the application.
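A small sketch of this server-side option, assuming a MySQL version that still ships the query cache (it existed through MySQL 5.7 and was removed in 8.0) and an illustrative table name:

-- See whether the query cache is available and how much memory it has
SHOW VARIABLES LIKE 'query_cache%';
-- Give the cache some memory (the size is arbitrary; enabling the cache itself
-- usually has to be done in the server configuration file)
SET GLOBAL query_cache_size = 16777216;
-- With the cache in DEMAND mode, only queries that ask for it are cached
SELECT SQL_CACHE id, name FROM customers WHERE active = 1;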

1.3 What have we used MySQL for?

During MySQL's initial development, its features were shaped to fit our largest customer, which handled data warehousing for a couple of the largest retailers in Sweden.

Every week we received summaries of all transactions from all the stores, and we were expected to provide the store owners with useful information to help them understand how their advertising was affecting their customers.

The amount of data was quite large (about 7 million summary records per month), and we also had to present data covering 4-10 years. Every week we received requests from the customers, who wanted "instant" access to the latest reports on this data.

We solved this by storing all the information for each month in compressed 'transaction' tables. We had a set of simple macros that generated summary results from the stored transaction tables, grouped by different criteria (product group, customer ID, store, and so on); a sketch of such a grouped summary appears below. The reports were Web pages dynamically generated by a small Perl script that parsed a Web page template, executed the SQL statements in it, and inserted the results. We could have used PHP or mod_perl for this job, but they were not available to us at the time.
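The macros themselves are not shown in the text, but the kind of summary they produced can be sketched roughly like this (table and column names are invented for illustration):

CREATE TABLE monthly_store_totals AS
  SELECT store_id, SUM(amount) AS total
  FROM transactions
  GROUP BY store_id;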

For graphical output, we also wrote a simple tool in C that executes SQL queries and turns the results into GIF images. This tool, too, was invoked dynamically by the Perl script that parsed the Web pages.

In most cases, a new report could be created simply by copying an existing script and modifying its SQL query. Sometimes we needed to add more fields to an existing summary table or create a new one. This was also quite simple, because we kept all the transaction tables on disk (a total of about 50 GB of transactions and 20 GB of other customer information).

We also let customers access the summary tables directly through ODBC, so that advanced users could experiment with the data themselves. The system worked well, and we had no problem handling the data on a modest Sun Ultra SPARC workstation (2 x 200 MHz). The system was eventually migrated to Linux.

1.4 MySQL Benchmark Suite

The benchmark suite is meant to tell users which SQL operations a given implementation performs well or poorly. Note that this benchmark is single-threaded, so it measures the minimum time for the operations performed. We intend to add multithreaded tests to the benchmark suite in the future.

To use the benchmark suite, the following requirements must be met:

The benchmark scripts are written in Perl and use the Perl DBI module to connect to the databases, so DBI must be installed. You also need the server-specific DBD driver for each server you want to test. For example, to test MySQL, PostgreSQL, and DB2, you must install the DBD::mysql, DBD::Pg, and DBD::DB2 modules. See "2.7 Perl Installation Notes" for details.

After you obtain a MySQL source distribution, you will find the benchmark suite in its 'sql-bench' directory. To run the benchmarks, first start the server, then change into the 'sql-bench' directory and execute the run-all-tests script:

shell> cd sql-bench

shell> perl run-all-tests --server=server_name

Here server_name can be any of the supported servers. To list all options and supported servers, simply invoke this command:

shell> perl run-all-tests --help

The crash-me script is also located in the 'sql-bench' directory. crash-me tries to determine which features a database supports, along with its capabilities and limitations, by running actual queries. For example, it determines:

• What field types are supported

• How many indexes are supported

• What functions are supported

• How large a query can be

• How large a VARCHAR column can be

1.5 Use your own benchmark

Be sure to benchmark your own database or application to find its bottlenecks. After fixing one bottleneck (or replacing it with a dummy module), it is easy to move on and identify the next one. Even if your application's overall performance is currently acceptable, you should at least make a plan for each bottleneck; someday you will want the application to perform better.

The MySQL benchmark suite includes portable benchmark programs you can start from; see "7.1.4 The MySQL Benchmark Suite" for details. You can take any program from the suite and modify it to suit your needs. In this way you can try several different solutions to a problem and find out which one is really the fastest.

Problems usually appear only when the system is under load. We have had many customers contact us about a (tested) production system that ran into load problems. In most cases, performance problems turn out to be due to basic database design (for example, table scans that perform poorly under high load), the operating system, or the libraries used. Often these problems are much easier to fix before the system is put into production.

2. Optimize SELECT statements and other queries

First, one factor affects all statements: the more complex your permission setup, the greater the overhead. Using simpler GRANT statements lets MySQL reduce the cost of permission checking when clients execute statements. For example, if you do not grant any table-level or column-level privileges, the server never needs to check the tables_priv and columns_priv tables. Similarly, if you place no resource limits on accounts, the server does not have to track resource usage. If you run a very large number of queries, it is worth spending some time designing a simple authorization scheme to reduce the cost of privilege checking, as illustrated below.
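As an illustration (database, table, and account names are made up, and the accounts are assumed to already exist), the first grant keeps permission checks at the database level, while the second forces the server to consult the column-level privilege tables on every statement:

-- Database-level privileges: no tables_priv / columns_priv lookups are needed
GRANT SELECT, INSERT, UPDATE, DELETE ON shopdb.* TO 'appuser'@'localhost';

-- Column-level privileges: adds per-statement permission-checking overhead
GRANT SELECT (id, name) ON shopdb.customers TO 'reportuser'@'localhost';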

If the problem lies in some MySQL-specific expression or function, you can use the BENCHMARK() function in the mysql client program to run a timing test. Its syntax is BENCHMARK(loop_count, expression). For example:
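A typical invocation (the expression here is only an illustration) looks like this; BENCHMARK() always returns 0, and what matters is the elapsed time the mysql client reports for the statement:

mysql> SELECT BENCHMARK(1000000, 1+1);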

  

All MySQL functions should be highly optimized, but there may be exceptions. BENCHMARK() is an excellent tool for finding out whether a particular function is the problem in a query.
