Efficiency Comparison between two databases and four paging Algorithms

Source: Internet
Author: User

From: http://www.cnblogs.com/jyk/archive/2008/08/04/1260406.html

 

PagingAlgorithmThere is no difference between speed and speed. What determines the reaction speed is --Can indexes be effectively used!

 

Algorithm Rating Disadvantages Applicable Database
Max The most efficient Only one sorting field is allowed. Excel, access, SQL Server 2000 (2005)
Top upside down The most widely used Multi-data on the last page Excel, access, SQL Server 2000 (2005)
Table Variables The most disappointing Too many SQL Server 2000 (2005)
Row_number Only applicable to SQL server2005 It can only be used in one type of Database SQL Server 2005

I. Max


The disadvantage is that there can only be one sorting field, and the value of the sorting field cannot have duplicate values or duplicate values, but it cannot be repeated at the top of the page or at the end of the page. The advantage is that it is easy to set the index. You can set the index for the Field in order of that field. After the index is set, the paging speed will be greatly improved, and the more records, the more obvious, because the index is used. What is the role of an index? To put it simply, first stand up and find the most advanced one. Suppose we want to find the highest class in the class, what should we do? Let's arrange the students according to their size, and then we can see who is the highest. After the index is set, it is equivalent to having already lined up, and the rest will be much faster.

 

2. Top upside down

 

This is an upside down optimization. It must have a primary key and cannot be a composite primary key. In most cases, index settings are better and indexes can be used, so the paging effect is also acceptable.

 

Iii. Table Variables

Because we have been using the inverted top method and the positioning method, we have never caught off with table variables. Because it is an algorithm recommended by the paging control of wuqiwa, we will continue to study this upgrade. I have read it carefully, but there are many shortcomings. Because the idea of table variables is to place the primary key and auto-incrementing fields in the table variables, and then use the auto-incrementing fields for paging. As in the preceding example, we need to first sort the data, then report the number, and then extract the data according to the number. This has brought about many shortcomings:

1. the table must have a primary key and cannot be a composite primary key. If the primary key is compound, it is difficult to extract data.
2. The primary key type affects paging algorithm writing. INT and datetime, the field type is different when defining table variables.
3. If only the primary key is sorted, the primary key index will be wasted. They had already lined up, but asked them to arrange again.
4. The more pages you need to go back, the more data you need to store in the table variables. It doesn't matter if you put more data. Other algorithms will also put some data into the "temporary table, however, the problem is that the auto-increment field does not have an index, and the more data it adds, the slower it will naturally be.
5. It can only be used in SQL Server 2000 and SQL server2005, and the speed in 2005 is obviously not faster than row_number. The paging algorithm recommended by the website of wuqiwa is divided into two versions, for SQL2000 and sql2005, row_number is recommended for the latter.

 

Having said so many shortcomings, I am afraid it is easier to write and understand. It is hard to understand if top is upside down. Therefore, this is the most disappointing paging algorithm.

4. row_number

I only use this, and I don't quite understand the internal principle. In most cases, the efficiency is also good, but I once found that he could not use the index, so he was depressed.

I'm afraid you don't trust it either, especially when it comes to table variables. Here we will use tests to prove the performance of each paging algorithm. It is really troublesome to say that there are at least three databases and four paging algorithms. There are also single-field sorting, multi-field sorting, single-Table paging, and variable-Association paging on the data, can a small amount of data and massive data be indexed.

3x4x2x2x2 = 192. So much is not a bit dizzy, just simplify it. Here we first use SQL Server2000 to test the massive data volume in a single table.

There are three paging algorithms, single table, and massive data (2 million, but the comparison is not effective ).

 

 

First introduce the test environment

 

Database: SQL Server2000

IDE: vs2008.

CPU: amd3000 +

Memory: DDR2 1g

Hard Disk: Serial Port 160 GB

 

Three paging algorithms: Max, table variable, and top upside down.

 

We use the northwind database and the products table for testing, and the built-in data is not enough. Let's copy the data and use the insert into select method to add 2523136 records, 15 records are displayed on one page, with a total of 168210 pages. Then we will record the usage time and memory usage of the first load, the first few pages, the first 100 pages, 1,000th pages, 10,000th pages, the last few pages, and the last pages.

 

This is the test result of sorting a single field.ProductidSort.
Unit: milliseconds.

Paging algorithm First time Previous pages First 100 pages Page 1 Page 1 Page 1 Last few pages Last page
Maxmin 953/40 m 400/40 m 0-15/50 m 0-15/50 m 78-93/68 m 3.62 S/210 m
765/210 m
2.80 S/310 m
1.28 S/310 m
5.62 S/310 m
1.28 S/310 m
Top upside down 875/40 m 400/40 m 0-15/50 m 15-31/50 m 281/69 m
93-125/69 m
2.96 S/210 m
1.3 s/210 m
3.78 S/310 m
1.7 S/310 m
15 (Special Handling)
Table Variables 968 468 0-15/50 m 11.18 S/55 m
93-109/59 m
24.265 S/76 m
859/77 m
Timeout
45.171 S/230 m
11.78 S/240 m
8.281 s
This will not be tested. This will not be tested.

 

Record description:

1. If a page has two times, the time required to jump from page 100 to page 1000 is displayed.

2. execution time and memory increment.

3. During the first running, the SQL server needs to load some data into the memory, so it takes a long time.

4. Count (*) is used to count the total number of records on the first page, so the time is a bit long. When accessing other pages, you do not need to count the total number of records, so the time will be very short.

5. The CPU usage is not recorded, and it is basically full. It seems that amd3000 + is a little weak.

 

The SQL statement on the first page:

Select top 15 * from products order by productid

 

Maxmin SQL statement:
Select top 15 * from products where productid> = (select max (productid) from (select top 526 productid from products order by productid) as t) order by productid

 

Top-down SQL statements:
Select * from products where productid in (select top 15 productid from (select top 420 productid from products order by productid) as t order by T. productid DESC) order by productid

 

The paging algorithm on the last page is specially processed to remove bugs rather than speed up.

 

Display the SQL statement on the last page of top upside down
Select * from (select top 1 * from products order by productid DESC) as t order by T. productid

 

SQL statement for table variables:
Declare @ TT table (ID int identity (2130), nid int) insert into @ TT (NID) Select top productid from products order by productid select * from products T1, @ TT T2 where t1.productid = t2.nid and t2.id between 2116 and 2130

 

It is really not enough. When I jumped from page 10 thousand to page 100,000, unfortunately, it timed out after 45.171 seconds. Because the data access function library does not throw an exception when it encounters an exception, so we can get the running time. By viewing the error log, we can see that "the timeout time has reached. The timeout time has elapsed before the operation is completed or the server has not responded ."
Refresh again to obtain the data.

 

During the test, the paging time of the table variables is unstable, and the time is slow for a while. during large-scale page jumping, the time is always timed out. Reversing top is relatively stable.

 

 

This is the result of Multi-field sorting.Unitprice, productid DESCSort. Index: unitprice, productid DESC
Unit: milliseconds.

Paging algorithm First time Previous pages Page 1 Page 1 Page 1 Page 1 Last few pages Last page
Top upside down 375/55 m 15-31/57 m 15-31/60 m 62-78/63 m 687/63 m
486/64 m
5.18 S/65 m
4.3 S/65 m
8.15 S/66 m
7-8 s/66 m
15 (Special Handling)
Table Variables 968 15-31/58 m 1.31/59 m
46/60 m
12.17 S/60 m
93-125/60 m
I don't really understand how it timed out.
27.39 S/76 m
8-9 S/77 m
900/77 m
Timeout
45.171 S/100 m
10.2 S/100 m
9.3 s
27.2 S/100 m
9.3 s
This will not be tested.

 

 

This is the test result of SQL Server 2005. Multiple fields are sorted accordingUnitprice, productid DESCSort. Index: unitprice, productid DESC
Unit: milliseconds.

Paging algorithm First time Previous pages Page 1 Page 1 Page 1 Page 1 Last few pages Last page
Top upside down 375/55 m 15-46/57 m 15-46/100 m 78-93/110 m 531/117 m
466/117 m
5-8 s/118 m 7-10 S/120 m 15 (Special Handling)
Table Variables 968 31-46/100 m 2.62/100 m
0.4 s or 1.6 s/100 m
17.34 S/110 m
93-156/110 m
25.57 S/76 m
1.25 S/77 m
Timeout
22.82 S/130 m
15.2 S/130 m
9.3 s
27.2 S/100 m
9.3 s
This will not be tested.
Row_number 500 15/93 m 546
15-31
4.82 S/200 m
93-106/210 m
3.15/265 m
3.96/263 m
Relatively slow, total timeout, do not want to test.    

 

The test of the row_number algorithm is added here,

 

The SQL statement of row_number:

With t_pager as (select myindex = row_number () over (order by unitprice, productid DESC), * from products) Select * From t_pager where myindex between 150031 and 150045

 

 

I suspect that this test result is correct. If the test result is correct, it means that the efficiency of "upside down" is the highest and the applicability is wide.

 

But I still doubt this result. Is row_number so slow? Is there any problem with the spliced SQL statement? Or is there a problem at that stage?

 

========================================================== ========================================================

 

TestedCode

 

Create a base class, define a gridview, A quickpager paging control, and a label, then override oninit () function, and then add an event. Because the attribute settings of the paging control are the same for the same table, but the attributes of the paging algorithm are different, we can put the same settings in the base class, the difference is placed in a specific page.

 

How to Use the quickpager paging control in the base class

 

Public   Class Baselist: system. Web. UI. Page
{
Protected Jyk. Controls. quickpager mypager =   Null ;
Protected System. Web. UI. webcontrols. gridview gv =   Null ;
Protected Label LBL =   Null ;

Protected Datetime dt1;

Protected   Override   Void Oninit (eventargs E)
{
Base . Oninit (E );

// Set the properties of the paging Control

Mypager. controlgrid = GV;

Mypager. tablename =   " Products " ;
Mypager. tableshowcolumns =   " * " ;
Mypager. tableidcolumns =   " Productid " ;
// Mypager. tableordercolumns = "productid"; productname
Mypager. tableordercolumns =   " Unitprice, productid DESC " ;

Mypager. pagesize =   15 ;

// Add event
Mypager. gridbinded + =   New Jyk. Controls. quickpager. eventpagechange (mypager_gridbinded );

Dt1 = Datetime. now;
}

Void Mypager_gridbinded ( Object Sender, jyk. Controls. Page. pageargs E)
{
Timespan TS = Datetime. Now - Dt1;
LBL. Text =   " Seconds: "   + TS. Seconds +   " , Millisecond: "   + TS. milliseconds;
LBL. Text + =   " <Br> "   + Mypager. getpagersql;
}

Void Mypager_pagechanged ( Object Sender, jyk. Controls. Page. pageargs E)
{

}
}

 

Create An ASPX page and drag in three widgets.

 

Code
< ASP: gridview ID = " Gv " Runat = " Server " Width = " 100% " >
</ ASP: gridview >
< Jyk: quickpager ID = " Mypager " Runat = " Server " />
< ASP: Label ID = " LBL " Runat = " Server " Text = "   " > </ ASP: Label >

 

 

Code
Public   Partial   Class Lst_max: baselist
{
Protected   Void Page_load ( Object Sender, eventargs E)
{
If ( ! Page. ispostback)
Mypager. setsqlkind = Jyk. Controls. Page. mypagesqlkind. row_number;
}
}

 

 

We can also compare SQL 2000 and SQL 2005 by the way. In this case, 05 does not have many advantages. Row_number is not as good as I think. Maybe I did not write it correctly. Maybe it can only show its advantages when there are four or five sorting fields? However, changing the paging algorithm is very easy for quickpager. if it finds that the speed is slow, try another paging algorithm and add an index.

 

Quick pager paging control v2.0.0.8:

Http://www.cnblogs.com/jyk/archive/2008/07/29/1255891.html

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.