Massive Data Query and Paging Optimization

Source: Internet
Author: User
Tags: rowcount
Massive Data Query and Paging Optimization (Author: jyk, from CSDN)

http://community.csdn.net/Expert/TopicView3.asp?Id=4180563
http://www.jyklzz.com/bbs/ (demo page)
Thank you all for your support!
Yesterday I posted an invitation asking everyone to help test; here is a summary of the results.

According to the internal counter, there were 1,071 visits (many of them my own :)). That is fewer than I hoped; I had wanted to see how the page behaves with tens of thousands of simultaneous visitors :)

System resource usage

Memory: ideal. SQL Server never used more than 65 MB, usually staying around 35 MB; ASP.NET never used more than 40 MB, usually around 25 MB.

CPU: around 8%. This number does not say much, because the visits were few and not concentrated. When I clicked "next page" many times in a row, CPU usage rose noticeably, to about 50%.
Still, for one million records on my AMD XP+ machine, that speed is acceptable, even ideal.
After all, a production server's CPU is much faster than mine, and few tables ever actually reach one million records.

The results are quite satisfactory, but I still want to see how it holds up under heavy traffic,
so I hope you can keep supporting the test. Thank you. Haha.

Also, the first n pages each complete within 60 milliseconds; n is at least 500, though I have not yet tested how much higher it goes. The last n pages are slower, taking about 500 milliseconds each.

Now let's discuss the paging technique itself.
I did not use cursors, temporary tables, NOT IN, or IN. That does not mean those methods are inefficient; I simply have not tested them yet. I only used TOP and queried the table twice.
You are welcome to suggest other methods, and we can benchmark them against one million records. (Please do not concatenate SQL strings inside the stored procedure, though; that is too painful to read.)
This discussion assumes massive data, at least 100,000 rows. With a small amount of data, almost any approach works, and the differences cannot be large.

1. Set Reasonable Indexes
The first thing to do is set up reasonable indexes. This seems to be frequently overlooked, or at least rarely mentioned.

Note: a primary key is one kind of index, and the fastest one. If your sort field is the primary key, you are already using an index.

If indexes are not set up properly, queries can be very slow, or even time out.

You can run an experiment to see this: take a table, insert 100,000 records, and suppose it has fields such as ID and AddedDate. Then execute the following in Query Analyzer:

SELECT TOP 10 * FROM [Table]
The results should appear immediately.

Then execute SELECT TOP 10 * FROM [Table] ORDER BY ID (where ID is the primary key).
The results again appear immediately.

Then execute SELECT TOP 10 * FROM [Table] ORDER BY AddedDate (where AddedDate has no index).
You will find that this one is slow.

Now add a non-clustered index on AddedDate and run the same query again; it is fast once more.

You can see the amazing effects of indexes!
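The experiment above can be sketched as follows. This is SQL Server T-SQL; the table name TestTable and its columns are my own illustrative names, not from the original post.

```sql
-- A sketch of the index experiment, assuming a test table TestTable
-- with an int primary key ID and a datetime column AddedDate.

-- Slow on 100,000 rows: AddedDate has no index, so SQL Server must
-- scan and sort the whole table before returning the top 10.
SELECT TOP 10 * FROM TestTable ORDER BY AddedDate

-- Add a non-clustered index on the sort column...
CREATE NONCLUSTERED INDEX IX_TestTable_AddedDate ON TestTable (AddedDate)

-- ...and the same query now reads the first 10 entries straight off
-- the index, returning almost immediately.
SELECT TOP 10 * FROM TestTable ORDER BY AddedDate
```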

This is the most basic setup for paging through million-row tables. For my forum's paging I created a composite index on the BoardID and ReplyDate fields,
because pages are always turned within one discussion board and sorted by ReplyDate.

2. Return only the records you need
It is hard to imagine reading massive data wholesale into a cache (even with few records, caching depends on the hit rate and is usually wasteful).
So if a page displays 20 records, read only those 20 records; that saves both memory and time.

Note: ADO.NET does offer
SqlDataAdapter.Fill(dataSet1, startRecord, maxRecords, srcTable);
but the SQL statement still returns all matching records, from which the specified range is then extracted. To SQL Server it is the same full query, and it is slow for massive data.

On the forum's home page I use SELECT TOP 20 * FROM [Table] WHERE BoardID = 5 ORDER BY ReplyDate DESC.
This returns only 20 records, and with the index's help it is very fast.
3. Minimize Field Length
You can create many fields in a table, but the total row length cannot exceed 8,060 bytes. That is to say, if you create a char(8060) field, you cannot create any other fields.

In my first test (Sunday), I put all the topic information in one table, including an nvarchar(3600) field for the topic body. Copying records became very slow:
by 90,000 rows it was crawling, and I barely pushed the count to 350,000. After adding an index and testing, the paging speed was tolerable; the first n pages were fast, the later pages slow,
and with a query condition added it became very slow.

I then looked at the data file and was shocked: it occupied 1 GB of disk space. No wonder copying and querying were slow.

So I changed the table structure, kicked the nvarchar(3600) topic-body field out into a separate table.
Copying records became fast again, and the row count quickly grew from 16 up to 1,048,577. Yesterday's test was conducted under these conditions.
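The jump from 16 rows to 1,048,577 suggests the test data was generated by repeatedly doubling the table with INSERT ... SELECT (16 × 2^16 = 1,048,576). A minimal sketch of that trick, with illustrative table and column names of my own:

```sql
-- Populate a test table by doubling it: starting from 16 seed rows,
-- 16 iterations of "insert the table into itself" yields
-- 16 * 2^16 = 1,048,576 rows.
DECLARE @i INT
SET @i = 0
WHILE @i < 16
BEGIN
    INSERT INTO Topic (BoardID, Title, ReplyDate)
    SELECT BoardID, Title, ReplyDate FROM Topic
    SET @i = @i + 1
END
```

Each pass copies every existing row once, so the row count doubles per iteration; an identity column on Topic.ID keeps the IDs unique throughout.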

4. The Paging Trick
Finally, the paging algorithm itself. No hurry.
The idea: first locate a marker value, then take the first n records greater than (or less than) it.
What, you didn't follow? No problem, let me give you an example.

Suppose we sort by ID in descending order, show 10 records per page, and have 100 records whose IDs happen to be exactly 1 to 100 (for convenience).

Page 1 then holds records 100 down to 91, page 2 holds 90 down to 81, page 3 holds 80 down to 71, and so on.

Now, to jump to page 3, I first find the ID of the 21st row (which is 80), then use TOP 10 to retrieve the records with IDs less than or equal to 80.

The query statement:

DECLARE @pagesize INT  -- number of records per page
DECLARE @curpage INT   -- page number: 1 = first page; 2 = second page; ...; -1 = last page

DECLARE @count INT
DECLARE @ID INT

SET @pagesize = 10
SET @curpage = 1

IF @curpage = -1
BEGIN
    -- Last page: scan ascending so @ID ends up holding the
    -- @pagesize-th smallest ID
    SET ROWCOUNT @pagesize
    SELECT @ID = ID FROM [Table] ORDER BY ID
END

-- Positioning
IF @curpage > 0
BEGIN
    SET @count = @pagesize * (@curpage - 1) + 1
    SET ROWCOUNT @count
    -- @ID ends up holding the ID of row number @count,
    -- i.e. the first row of the requested page
    SELECT @ID = ID FROM [Table] ORDER BY ID DESC
END

-- Return the page
SET ROWCOUNT @pagesize
SELECT * FROM [Table] WHERE ID <= @ID ORDER BY ID DESC

SET ROWCOUNT 0

SELECT @ID = ID FROM [Table] ORDER BY ID DESC is what does the "positioning".
This step is very economical with memory, because only a single ID is recorded.

Then SELECT * FROM [Table] WHERE ID <= @ID ORDER BY ID DESC
fetches the records actually needed.

SET ROWCOUNT @pagesize is equivalent to TOP @pagesize.

Advantage: whichever page you turn to, memory usage stays the same, and it should stay flat under many concurrent users as well (though I have not tested many users yet :)).
Disadvantage: it only handles a single table and a single sort field.
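Wrapped as a parameterized stored procedure, the two-pass technique might look like the sketch below. The procedure name GetPage and table Topic are my own illustrative names, not from the original post; the last-page branch is omitted for brevity.

```sql
-- Sketch: two-pass TOP/ROWCOUNT paging as a stored procedure.
-- Assumes a table Topic with an int primary key ID, paged in
-- descending ID order (SQL Server 2000-era syntax).
CREATE PROCEDURE GetPage
    @pagesize INT,  -- records per page
    @curpage  INT   -- 1-based page number
AS
BEGIN
    DECLARE @count INT
    DECLARE @ID INT

    -- SET ROWCOUNT takes a variable, not an expression,
    -- so compute the row position first
    SET @count = @pagesize * (@curpage - 1) + 1

    -- Pass 1: position on the ID at the top of the requested page
    SET ROWCOUNT @count
    SELECT @ID = ID FROM Topic ORDER BY ID DESC

    -- Pass 2: return one page starting from that ID
    SET ROWCOUNT @pagesize
    SELECT * FROM Topic WHERE ID <= @ID ORDER BY ID DESC

    SET ROWCOUNT 0
END
```

In the 100-row example above, EXEC GetPage @pagesize = 10, @curpage = 3 would position on ID 80 and return IDs 80 down to 71.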
Original thread subject: Paging a million-row table in just a few dozen milliseconds, with the inside story revealed. Detailed explanations included; don't miss it.
Author: jyk (loves programming; creating wealth, improving together, progressing together)
