How to cope with slow queries on tables that hold too much data (avoiding large-table joins as much as possible)

Source: Internet
Author: User

Developers who work on B/S (browser/server) systems are likely to run into high-concurrency database access sooner or later. The web keeps growing at rocket speed, high traffic brings a series of performance problems with it, and the database has always been the key link between users and the business. Users do not have the patience to wait for a query that takes 10 seconds or more. If the server crashes or queries time out frequently, I would call the project a failure. In several years of web work I had never encountered heavy traffic or massive data. That does not mean a project without massive data is a bad project; it depends on the project's application scenario.
 

Recently I got that opportunity by chance. At work I found a single table whose row count reaches 9 digits, and the Order table has 8 digits. An order query can rarely be answered from a single table; it usually has to be joined with other related tables, and those tables are not small either. Once a large table is joined to another large table, the query can take a long time.
 

Purpose: How can this problem be avoided? The data is there, and the requirement still has to be implemented. I recently optimized this database, and I have split the write-up into two articles.
Article 1: How to avoid joining large tables as much as possible.
Article 2: Partitioning a large table.

Background: There are two tables:
1: Order table (Order): records the details of user orders. It has a membership card number field, cardno, and the order generation time.
2: Member table (Member): records member information. A member has an agent number, proxyid. An agent has many membership cards (cardno), which all share the same agent number.

The two tables are associated through cardno.
 

Requirements: Query the orders generated by all membership cards of one or more agents within a given time period.
 

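SQL implementation:

-- 'proxy number 2' and the '000000' values are placeholders.
-- Order is a reserved word in SQL Server, so the table name is bracketed.
SELECT field FROM [Order]
INNER JOIN Member ON [Order].cardno = Member.cardno
AND Member.proxyid IN ('A-01', 'proxy number 2')
AND time BETWEEN '000000' AND '000000'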
 

My opinion: I think most people would write a query like this when they see such a requirement, using IN, and those who think IN performs poorly may replace it with UNION ALL. The SQL statement is about as simple as it can be, and there is nothing wrong with it. But if both tables hold more than a million rows and have many fields, the query may not perform as expected even with the help of indexes.
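
As a rough illustration of the UNION ALL rewrite mentioned above, the sketch below runs one branch per agent number. The table and column names come from the article; the second agent number 'A-02', the date literals, and the selected columns are placeholders chosen for illustration. It assumes each membership card belongs to exactly one agent, so the branches cannot return the same order twice and UNION ALL (which skips duplicate removal) is safe.

```sql
-- Sketch only: 'A-02' and the date range are hypothetical placeholders.
SELECT o.cardno, o.[time]
FROM [Order] AS o
INNER JOIN Member AS m
    ON o.cardno = m.cardno
WHERE m.proxyid = 'A-01'
  AND o.[time] BETWEEN '20240101' AND '20240131'

UNION ALL

SELECT o.cardno, o.[time]
FROM [Order] AS o
INNER JOIN Member AS m
    ON o.cardno = m.cardno
WHERE m.proxyid = 'A-02'
  AND o.[time] BETWEEN '20240101' AND '20240131';
```

Both forms still join the two large tables, which is the cost the solutions below try to avoid.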

Solution 1: Use a table variable to replace the large-table join. The scope of a table variable is the batch: once the batch finishes, the table variable is gone. Compared with a temporary table, a table variable has its own advantage: you do not need to drop it manually to clean it up.

Feasibility: Most of the output fields come from the Order table; the Member table only acts as a filter that supplies the membership card numbers. So you can first load the agents' membership card numbers into a table variable, and then join that table variable with the Order table.
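-- Step 1: collect the card numbers of the requested agents.
DECLARE @t TABLE
(cardno INT)
INSERT INTO @t
SELECT cardno FROM Member WHERE proxyid IN ('A-01', 'proxy number 2')
-- Step 2: join the small table variable to the Order table.
-- A table variable must be given an alias before its columns can be referenced.
SELECT field FROM [Order]
INNER JOIN @t AS t ON
[Order].cardno = t.cardno AND time BETWEEN '000000' AND '000000'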

I will not include a performance comparison chart here. If you are interested, try it yourself. This approach is particularly helpful when many agents are queried at once. Developers need to compare the options carefully against their own situation: there is no uniform result, and the outcome may differ from one environment to another. I hope everyone understands.
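
One variation worth testing, offered as a sketch rather than a measured result: declare the card number as the primary key of the table variable, so that it gets a unique clustered index, and add OPTION (RECOMPILE) so the optimizer can see how many rows the table variable actually holds when it compiles the join. This assumes cardno is unique in Member; the agent number 'A-02', the date range, and the selected columns are placeholders.

```sql
-- Sketch only: 'A-02' and the date range are hypothetical placeholders.
DECLARE @t TABLE
(
    cardno INT NOT NULL PRIMARY KEY  -- assumes cardno is unique in Member
);

INSERT INTO @t (cardno)
SELECT cardno
FROM Member
WHERE proxyid IN ('A-01', 'A-02');

SELECT o.cardno, o.[time]
FROM [Order] AS o
INNER JOIN @t AS t
    ON o.cardno = t.cardno
WHERE o.[time] BETWEEN '20240101' AND '20240131'
OPTION (RECOMPILE);  -- compile the join with the real row count of @t
```

All three statements have to run in the same batch, because the table variable disappears when the batch ends.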

 

Solution 2: Use an indexed view to improve the performance of large-table joins.
 

Feasibility: Generally, when large tables are joined, the output columns are far fewer than the columns of the two tables. For example, the query above uses only two fields of the Member table (cardno, proxyid). Wouldn't it be better if the Member table contained only these two fields? The answer is self-evident.
 

In my earlier understanding, I never thought a view could optimize a query, because I considered a view a virtual table: it has no physical location in the database where data is stored, and it simply lets users look at the results from different angles. The view's data is generated in real time; that is, when the view is referenced, it is expanded automatically and the corresponding SELECT statement is run. Later I learned that in SQL Server 2000 and later, views are divided into ordinary views and indexed views. An ordinary view is the kind I had in mind: a view with no index. Once an index is created on a view, it is called an indexed view. An indexed view exists physically. You first create a unique clustered index on the view, and can then create non-clustered indexes on other fields. The query is optimized without changing the base table.

 

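-- SCHEMABINDING requires two-part names; the dbo schema is assumed here.
CREATE VIEW dbo.memberview
WITH SCHEMABINDING
AS
SELECT cardno, proxyid FROM dbo.Member
GO
-- Create a unique clustered index on the membership card number.
-- The index is created on the view, not on the Member table.
CREATE UNIQUE CLUSTERED INDEX ix_member_cardno
ON dbo.memberview (cardno);
GO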
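
Since the query filters on proxyid, a non-clustered index on that column of the view may help as well. The sketch below assumes the view is named memberview in the dbo schema and already has its unique clustered index; the index name ix_memberview_proxyid is made up for illustration. SQL Server also requires certain SET options to be in effect in the session that creates an index on a view, so they are set explicitly first.

```sql
-- Session settings SQL Server requires when creating an index on a view.
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;
GO

-- Hypothetical index name; supports lookups by agent number.
CREATE NONCLUSTERED INDEX ix_memberview_proxyid
ON dbo.memberview (proxyid);
GO
```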

Note: Key points for creating an indexed view:

1: CREATE VIEW memberview must be followed by WITH SCHEMABINDING.

Reason: • All objects referenced by the view must be identified explicitly as schemaname.objectname, regardless of which user accesses the view.

• The objects referenced in the view definition cannot be changed in a way that would invalidate the view definition or force SQL Server to re-create the index on the view.

 

2: The first index on the view must be unique and clustered.
Reason: It must be unique so that, while the indexed view is maintained, rows in the view can be located easily by key value, and so that views containing duplicates (which would need special maintenance logic) cannot be created. It must be clustered because only a clustered index can enforce uniqueness and store the rows at the same time.

3: An indexed view is worth creating in the following situations:

• Aggregations can be pre-computed and stored in the index, minimizing expensive computation during query execution.

• Tables can be joined in advance and the resulting data set stored.

• Combinations of joins and aggregations can be stored.

4: Updating the base table causes the indexed view to be updated.

5: An indexed view brings maintenance overhead.

Reason: 1: The indexed view exists physically.

2: The additional indexes need to be maintained.
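
To make the pre-aggregation case in point 3 concrete, here is a minimal sketch of an indexed view that stores the order count per membership card. The view name ordercountview, the index name, and the dbo schema are assumptions for illustration; the Order table and the cardno column are from the article. SQL Server requires COUNT_BIG(*) in an indexed view that uses GROUP BY.

```sql
-- Hypothetical view: one row per membership card with its order count.
CREATE VIEW dbo.ordercountview
WITH SCHEMABINDING
AS
SELECT cardno,
       COUNT_BIG(*) AS order_count  -- required when the view has GROUP BY
FROM dbo.[Order]
GROUP BY cardno;
GO

-- The unique clustered index is what materializes the view.
CREATE UNIQUE CLUSTERED INDEX ix_ordercountview_cardno
ON dbo.ordercountview (cardno);
GO
```

The trade-off is the one described in points 4 and 5: every insert into the Order table now also has to maintain the stored count.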

SQL implementation:
-- 'proxy number 2' and the '000000' values are placeholders.
SELECT field FROM [Order]
INNER JOIN memberview ON
[Order].cardno = memberview.cardno
AND memberview.proxyid IN ('A-01', 'proxy number 2')
AND time BETWEEN '000000' AND '000000'
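
One caveat to check in your own environment: depending on the SQL Server edition and version, the optimizer may expand the view back into its base table instead of using the view's index. The NOEXPAND table hint asks SQL Server to treat the indexed view like a table with its own indexes. A sketch, with 'A-02', the date range, and the selected columns as placeholders:

```sql
-- Sketch only: 'A-02' and the date range are hypothetical placeholders.
SELECT o.cardno, o.[time]
FROM [Order] AS o
INNER JOIN memberview AS mv WITH (NOEXPAND)  -- use the view's own index
    ON o.cardno = mv.cardno
WHERE mv.proxyid IN ('A-01', 'A-02')
  AND o.[time] BETWEEN '20240101' AND '20240131';
```

Comparing the execution plans with and without the hint shows whether the view's index is actually being used.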

 

Summary: Both solutions have their own advantages. In general, an indexed view can be used to optimize large-table joins. These are my measures for avoiding large-table joins as much as possible.

 

Note:

Reference: http://www.51cto.com/html/2005/1115/11396.htm
