SQL query optimization--Data structure design

Source: Internet
Author: User

This section focuses on MySQL, and some of it may not apply to other databases. It covers only database structure design; other query optimization topics will follow in later sections.

Designing and using databases is an essential basic skill for web developers, and a sound data structure together with SQL query optimization matters a great deal for projects with large data volumes and high concurrency. Most experienced programmers know that a program's bottleneck is often not in the program itself but in the data access layer. Inefficient data access has many causes, and how they are resolved directly affects the stability and robustness of the application. Here are a few common problems:

  • Table locks that block queries
  • Connection-count bottlenecks in high-concurrency scenarios
  • Inefficient queries that keep the program from finishing for a long time
  • Poor write performance, leading to heavy contention between reads and writes

These are only some of the more common problems in database usage. Their usual causes are listed below:

  • Unreasonable data structure design
  • Poor index design
  • Poor management of database connections in the program
  • Programmers being lazy and not doing the database work properly
  • Data redundancy
  • Poorly written SQL

This section discusses only unreasonable data structure design; later sections will cover the other causes.

I have always felt that query optimization is an essential basic skill for an intermediate-level programmer. Data structure design directly affects the performance, robustness, maintainability, and extensibility of the software. I have seen many cases where software ended up hard to extend and hard to maintain because of unreasonable data structure design. To avoid these problems, we need to become good at data structure design.

What kind of data structure is reasonable? There is no perfect universal answer; it depends on the specific application scenario. But there are some guidelines we should try to follow, listed below:

  1. Consider how to distribute the data across tables based on the business query scenarios
  2. If there is no business primary key, create an auto-increment ID primary key
  3. Use smaller data types to avoid wasted space
  4. Keep the number of columns in a table under control, and split the table when necessary
  5. Add column comments

Each of these points is discussed in detail below.

1. Consider how to distribute the data across tables based on the business scenario

The business scenario determines what data you need to store, but not how you store it. You can simply keep all of it in one table, such as a user table. But as more information is needed, such as a user's secondary attributes (school, address, and so on), cramming everything into one table is not much of a problem for a small database, yet with large data volumes queries can become slow. Splitting the table is a better approach: divide the information into two categories according to the business scenarios and store them in separate tables.

It is tempting to put everything into one table for convenience. That makes the program much easier to write, but it causes more problems later. For example, some people store a 1:n relationship in a single table, which introduces data redundancy and has many drawbacks: inserting, updating, and deleting the n-side records becomes very complex; the table grows larger and gains more columns, which slows queries; joins from other tables against it slow down as well; and so on.
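As a minimal sketch of the 1:n case, the following uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so that the example runs without a server). The `users` and `user_phones` tables and their columns are hypothetical. The n side lives in its own table, so removing one phone number is a single-row delete and the user's name is stored only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The "1" side: one row per user, no repeated data.
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- The "n" side: one row per phone number, linked by user_id.
    CREATE TABLE user_phones (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        phone   TEXT NOT NULL
    );
    CREATE INDEX idx_user_phones_user_id ON user_phones(user_id);
""")

conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
conn.executemany(
    "INSERT INTO user_phones (user_id, phone) VALUES (?, ?)",
    [(1, "555-0100"), (1, "555-0101")],
)

# Removing one phone is a single-row delete; the users row is untouched.
conn.execute("DELETE FROM user_phones WHERE user_id = 1 AND phone = '555-0100'")


def phones_for(conn, user_id):
    """Return the phone numbers stored for one user, sorted."""
    rows = conn.execute(
        "SELECT phone FROM user_phones WHERE user_id = ? ORDER BY phone",
        (user_id,),
    )
    return [phone for (phone,) in rows]


remaining_phones = phones_for(conn, 1)
user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Had the phone numbers been packed into the `users` row instead, each change would mean rewriting that row or duplicating the user's other columns.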

2. If there is no business primary key, create an auto-increment ID primary key

The primary key uniquely identifies a record, and much of the time we cannot manipulate data without it. There may be cases where setting a primary key seems unnecessary, but the table will have a row identifier whether or not you define one: if you do not, InnoDB, for example, falls back to the first unique index on NOT NULL columns, and failing that to a hidden row ID column that you cannot see. The primary key is very helpful for joins and queries in business code, so it is worth defining one in any case.
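A minimal sketch of an auto-increment surrogate key, again using Python's sqlite3 as a stand-in for MySQL. The `login_log` table is hypothetical, and the MySQL column definition in the comment is only the approximate counterpart of the SQLite one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite spelling; the MySQL counterpart would be roughly
#   id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
conn.execute("""
    CREATE TABLE login_log (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        username TEXT NOT NULL,
        note     TEXT
    )
""")

# No id is supplied; the database assigns one for each new row.
first_id = conn.execute(
    "INSERT INTO login_log (username) VALUES ('alice')"
).lastrowid
second_id = conn.execute(
    "INSERT INTO login_log (username) VALUES ('alice')"
).lastrowid

# The surrogate key picks out exactly one of two otherwise identical rows.
conn.execute("UPDATE login_log SET note = 'flagged' WHERE id = ?", (second_id,))
flagged = conn.execute(
    "SELECT id FROM login_log WHERE note = 'flagged'"
).fetchall()
```

Without the `id` column, the two rows for `alice` would be indistinguishable and the update could not target just one of them.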

3. Use smaller data types to avoid wasted space

Smaller data types mean lower storage costs and let the database use cache space more efficiently. Storage engines cache indexes or data in memory in different ways, and smaller data types let you keep more useful data in a limited amount of memory. For the variable-length VARCHAR type, if we declare a length of 20 but the values actually occupy only 10, MySQL can still reserve space for the full 20 when it loads the value into memory (for example in MEMORY-engine temporary tables, which use fixed-length rows). For variable-length types, therefore, choosing a reasonable length matters all the more.
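To make the space argument concrete, here is a small Python sketch that picks the narrowest MySQL integer type for a value range, using the documented storage sizes (TINYINT 1 byte, SMALLINT 2, MEDIUMINT 3, INT 4, BIGINT 8). The helper names `smallest_int_type` and `bytes_saved` are hypothetical, and the savings figure counts raw column bytes only, ignoring row and page overhead.

```python
# Storage size in bytes of MySQL's integer types, smallest first.
MYSQL_INT_TYPES = [
    ("TINYINT", 1),
    ("SMALLINT", 2),
    ("MEDIUMINT", 3),
    ("INT", 4),
    ("BIGINT", 8),
]


def smallest_int_type(min_value, max_value):
    """Return the narrowest MySQL integer type holding [min_value, max_value]."""
    unsigned = min_value >= 0
    for name, size in MYSQL_INT_TYPES:
        bits = size * 8
        if unsigned:
            low, high = 0, 2**bits - 1
        else:
            low, high = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        if low <= min_value and max_value <= high:
            return f"{name} UNSIGNED" if unsigned else name
    raise ValueError("range does not fit in any MySQL integer type")


def bytes_saved(rows, wide_bytes, narrow_bytes):
    """Raw column bytes saved across `rows` rows (ignores row/page overhead)."""
    return rows * (wide_bytes - narrow_bytes)


status_type = smallest_int_type(0, 5)              # a status code with six states
age_type = smallest_int_type(0, 150)
balance_type = smallest_int_type(-50_000, 50_000)
saved = bytes_saved(10_000_000, wide_bytes=4, narrow_bytes=1)
```

Under these assumptions, storing a six-state status code as TINYINT UNSIGNED rather than INT saves 3 bytes per row, which is about 30 MB of raw column data over ten million rows, before counting any index that includes the column.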

4. Keep the number of columns in a table under control, and split the table when necessary

If a table has too many columns, and this is not the result of genuine business needs within a sound design, it will most likely suffer from the following problems:

    • Data redundancy
    • Too many indexes
    • A large table size

For now, this is just a reminder to avoid unnecessary data redundancy; data redundancy itself will be discussed later.

Because the table has so many columns, the query scenarios tend to be complex and varied, so the number of indexes grows. Every index has to be maintained on each write, so indexes directly reduce the table's write performance, and the loss can be very large: a slowdown of as much as 10 times is possible. In write-heavy scenarios this can become a write bottleneck, and slow writes in turn often cause read performance problems.

Splitting the table is one way to deal with too many columns. After the split, the program may need fairly large changes, but we should aim for a sound software design and discard what does not belong. After the split, use a join query, or run two queries in the program. Some people assume that joins always perform poorly, but that is not actually so. A join lets a single SQL statement use two indexes, one per table, whereas a single-table query typically uses only one. If the indexes are well designed, then in most scenarios (meaning large-data-volume scenarios) a join query can be as efficient as a single-table query, or even more so. I have seen cases where things improved considerably after a table was split in this way.
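The two access patterns after a split can be sketched as follows, using Python's sqlite3 as a stand-in for MySQL. The `users` and `user_profiles` tables, their columns, and the index names are hypothetical. The last lines print SQLite's query plan so you can check which index serves each table; the exact wording of that output depends on the SQLite version, so it is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Core columns that most queries need.
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL
    );
    CREATE UNIQUE INDEX idx_users_email ON users(email);

    -- Secondary attributes moved out of the main table.
    CREATE TABLE user_profiles (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        school  TEXT,
        address TEXT
    );
    CREATE INDEX idx_user_profiles_user_id ON user_profiles(user_id);
""")
conn.execute("INSERT INTO users (id, email) VALUES (1, 'alice@example.com')")
conn.execute(
    "INSERT INTO user_profiles (user_id, school, address) "
    "VALUES (1, 'Example University', '1 Example Road')"
)

JOIN_SQL = """
    SELECT u.id, p.school, p.address
    FROM users AS u
    JOIN user_profiles AS p ON p.user_id = u.id
    WHERE u.email = ?
"""

# Option 1: one join query.
joined = conn.execute(JOIN_SQL, ("alice@example.com",)).fetchone()

# Option 2: two single-table queries issued by the program.
(user_id,) = conn.execute(
    "SELECT id FROM users WHERE email = ?", ("alice@example.com",)
).fetchone()
profile = conn.execute(
    "SELECT school, address FROM user_profiles WHERE user_id = ?", (user_id,)
).fetchone()

# Print the query plan to check which index serves each table.
for row in conn.execute("EXPLAIN QUERY PLAN " + JOIN_SQL, ("alice@example.com",)):
    print(row[-1])
```

In MySQL the equivalent check is `EXPLAIN SELECT ...`, where the `key` column reports the index chosen for each table in the join.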

5. Add column comments

This is simply a reminder to keep the database design standardized.

Original article. If you reproduce it, please credit the source: reprinted from always Enough
