Interview FAQ: database optimization, or how to optimize a database with millions of rows

Source: Internet
Author: User

One, the five rules of database access optimization

In actual development, most of our optimization work goes into SQL statements, so we need to locate the bottleneck quickly, that is, find where the main cost of a SQL statement lies. According to the barrel principle (a barrel holds only as much water as its shortest stave allows), the slowest device is usually the performance bottleneck. For example: network bandwidth when working over the Internet, or hard disk access speed when copying data locally.

Based on the basic performance figures of current computer hardware and the main operations a database performs, the following five basic optimization rules can be derived:

(1) Reduce data access (reduce disk access)

(2) Return less data (reduce network transport or disk access)

(3) Reduce the number of interactions (reduce network transmission)

(4) Reduce server CPU overhead (reduce CPU and memory overhead)

(5) Use more resources (add resources)

Because each rule addresses the performance problem of a different piece of hardware, the performance gain differs from rule to rule. Traditional database systems were designed to offer as many optimization methods as possible for low-speed devices, so problems on low-speed devices have more optimization options at lower cost. Any SQL performance optimization should work through these rules in order when diagnosing problems and proposing solutions, rather than reaching for more resources first.

The following gives rough, experience-based figures for the effect and cost of each optimization rule:

Optimization rule                     Performance improvement effect   Optimization cost
Reduce data access                    1~1000                           Low
Return less data                      1~100                            Low
Reduce the number of interactions     1~20                             Low
Reduce server CPU overhead            1~5                              Low
Use more resources                    @~10                             High

Two, the database access optimization rules in detail

2.1. Reduce data access

(1) Create indexes correctly

What kinds of indexes are there?

Common indexes include B-tree indexes, bitmap indexes, and full-text indexes. Bitmap indexes are generally used in data warehouse applications, and full-text indexes are not discussed in depth here because they are used less often. The B-tree index has many extended types, such as composite indexes, reverse indexes, and function-based indexes. A brief introduction to the B-tree index follows:

The B-tree index, also known as the balanced tree index, is a tree-shaped directory structure sorted by field value. It is used mainly to improve query performance and to support UNIQUE constraints. A B-tree index consists of a root node, branch nodes, and leaf nodes.
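As a minimal sketch of the UNIQUE constraint support mentioned above, the following uses Python's built-in sqlite3 module as a stand-in database (SQLite stores its indexes as B-trees). The `users` table, the `idx_users_email` index, and the sample rows are hypothetical and not from the original article; other databases report the violation with a different error type.

```python
import sqlite3

# In-memory SQLite database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, age INTEGER)")

# A unique B-tree index: speeds up lookups by email and enforces uniqueness.
conn.execute("CREATE UNIQUE INDEX idx_users_email ON users (email)")

conn.executemany(
    "INSERT INTO users (email, age) VALUES (?, ?)",
    [("a@example.com", 30), ("b@example.com", 25)],
)


def insert_duplicate(connection):
    """Try to insert an email that already exists; return True if it is rejected."""
    try:
        connection.execute(
            "INSERT INTO users (email, age) VALUES (?, ?)", ("a@example.com", 41)
        )
        return False
    except sqlite3.IntegrityError:
        # The unique index refused the second row with the same email.
        return True


duplicate_rejected = insert_duplicate(conn)
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Here the index does double duty: the same structure that makes a lookup by `email` fast is what lets the database detect the duplicate without scanning the table.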

Which fields do we usually index?

This is a very complex topic, and a sound answer requires a full analysis of the business and the data. Primary keys and foreign keys are usually indexed, and any other field to be indexed should meet the following criteria:

1. The field appears in query conditions, and those conditions can use the index;

2. The statement is executed frequently, more than several thousand times a day;

3. The field condition filters the data down to a small set of records. What filter ratio is appropriate?

There is no fixed value; it has to be evaluated against the amount of data in the table. The following empirical formula can be used for a quick evaluation:

Small table (fewer than 10000 rows): filter ratio < 10%;

Large table: (number of records returned by the filter) < (total number of records * single record length) / 10000 / 16

Single record length ≈ sum of the average content lengths of the fields + number of fields * 2
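The empirical formula above can be sketched as a few small functions. This is only an illustration of the arithmetic: the function names and the sample field lengths are hypothetical, and the formula gives no units for the lengths, so the numbers are simply whatever unit the averages were measured in.

```python
def single_record_length(avg_field_lengths):
    """Single record length ~= sum of average field lengths + number of fields * 2."""
    return sum(avg_field_lengths) + len(avg_field_lengths) * 2


def max_filtered_rows(total_rows, record_length):
    """Upper bound on rows a condition may return for an index to be worth having."""
    if total_rows < 10000:
        # Small table: the filter ratio should stay below 10%.
        return total_rows / 10
    # Large table: (total rows * single record length) / 10000 / 16.
    return total_rows * record_length / 10000 / 16


def index_is_worthwhile(filtered_rows, total_rows, record_length):
    """True when the condition is selective enough according to the formula."""
    return filtered_rows < max_filtered_rows(total_rows, record_length)


# Hypothetical table: 1,000,000 rows and five fields with these average lengths.
record_length = single_record_length([8, 20, 50, 10, 12])  # 100 + 5 * 2 = 110
threshold = max_filtered_rows(1_000_000, record_length)    # 687.5
```

With these assumed numbers, a condition that returns about 500 rows passes the check, while one that returns 5000 rows does not.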

How much overhead does an index add to DML (INSERT, UPDATE, DELETE)?

There is no fixed ratio; it depends closely on the record size of each table and the size of the indexed fields. The following test data from an ordinary table is for reference only:

Index reduces insert performance by 56%

Index reduces update performance by 47%

Index reduces delete performance by 29%

Therefore, for a system under heavy write I/O pressure, the indexes on each table need to be evaluated carefully. Indexes also occupy a certain amount of storage space.

(2) Access data only through the index

Sometimes we access only a few fields of a table, and those fields hold little content. We can create a separate composite index on these fields so that the query gets its data from the index alone, without touching the table. An index generally takes up much less disk space than the table, so this approach can significantly reduce disk I/O.
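A minimal sketch of index-only access, again using Python's built-in sqlite3 module as a stand-in database. The `orders` table, its columns, and the index name are hypothetical. SQLite marks a plan step with the words COVERING INDEX when the query can be answered from the index alone; other databases word this differently in their plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "status TEXT, notes TEXT)"
)
# Composite index on the two small fields that the frequent query needs.
conn.execute("CREATE INDEX idx_orders_customer_status ON orders (customer_id, status)")


def plan_details(connection, sql, params=()):
    """Return the human-readable detail column of SQLite's EXPLAIN QUERY PLAN."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


# Both needed fields are in the index, so the table itself is never read.
covered = plan_details(
    conn, "SELECT status FROM orders WHERE customer_id = ?", (42,)
)

# notes is not in the index, so each match requires an extra read of the table row.
not_covered = plan_details(
    conn, "SELECT notes FROM orders WHERE customer_id = ?", (42,)
)
```

Printing `covered` and `not_covered` shows the difference: the first plan mentions a covering index, while the second uses the same index only to locate rows and then goes back to the table.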

(3) Optimize the SQL execution plan

The SQL execution plan is one of the core technologies of a relational database; it represents the data access algorithm used when a SQL statement executes. Business requirements keep getting more complex, tables keep growing, and programmers keep getting lazier, so SQL has to support very complex business logic while still performing well. A good relational database therefore needs, besides support for complex SQL syntax and more functions, an excellent library of algorithms to improve SQL performance.
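To make the idea of an execution plan concrete, the following sketch asks SQLite (through Python's built-in sqlite3 module) for the plan of the same query before and after an index is created. The `events` table and the index name are hypothetical, and the plan text is specific to SQLite; other databases expose their plans through their own EXPLAIN commands and formats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO events (user_id, payload) VALUES (?, ?)",
    [(i % 100, "x") for i in range(1000)],
)

query = "SELECT payload FROM events WHERE user_id = ?"


def plan_details(connection, sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN for the given statement."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


# Without an index the only available access path is a full table scan.
plan_before = plan_details(conn, query, (7,))

# With an index on user_id the optimizer can switch to an index search.
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
plan_after = plan_details(conn, query, (7,))
```

The SQL text is identical in both cases; only the access algorithm chosen by the database changes, which is why reading the plan is the quickest way to see where a statement spends its cost.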

2.2. Return less data

Returning less data is also an important means of optimization. There are two main methods: paging, and returning only the required fields.

(1) Paging

There are three ways of paging: client paging, server-side paging, and database paging.
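The text does not show how each kind of paging is implemented. As one possible sketch of database paging, the query below lets the database return only the requested page using LIMIT and OFFSET, here in SQLite through Python's built-in sqlite3 module. The `articles` table and the `fetch_page` helper are hypothetical, and the paging syntax differs between databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO articles (title) VALUES (?)",
    [("article %d" % i,) for i in range(1, 101)],
)


def fetch_page(connection, page, page_size):
    """Return one page of rows; page numbers start at 1."""
    offset = (page - 1) * page_size
    # ORDER BY makes the page boundaries stable between calls.
    return connection.execute(
        "SELECT id, title FROM articles ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()


# Only 10 rows leave the database, instead of all 100.
page_3 = fetch_page(conn, 3, 10)
```

The point of doing the paging inside the database is that the rows outside the requested page are never transferred to the application.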
