Efficiency Comparison between two databases and four paging Algorithms

Last Update:2018-12-07 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

From: http://www.cnblogs.com/jyk/archive/2008/08/04/1260406.html

PagingAlgorithmThere is no difference between speed and speed. What determines the reaction speed is --Can indexes be effectively used!

Algorithm	Rating	Disadvantages	Applicable Database
Max	The most efficient	Only one sorting field is allowed.	Excel, access, SQL Server 2000 (2005)
Top upside down	The most widely used	Multi-data on the last page	Excel, access, SQL Server 2000 (2005)
Table Variables	The most disappointing	Too many	SQL Server 2000 (2005)
Row_number	Only applicable to SQL server2005	It can only be used in one type of Database	SQL Server 2005

I. Max

The disadvantage is that there can only be one sorting field, and the value of the sorting field cannot have duplicate values or duplicate values, but it cannot be repeated at the top of the page or at the end of the page. The advantage is that it is easy to set the index. You can set the index for the Field in order of that field. After the index is set, the paging speed will be greatly improved, and the more records, the more obvious, because the index is used. What is the role of an index? To put it simply, first stand up and find the most advanced one. Suppose we want to find the highest class in the class, what should we do? Let's arrange the students according to their size, and then we can see who is the highest. After the index is set, it is equivalent to having already lined up, and the rest will be much faster.

2. Top upside down

This is an upside down optimization. It must have a primary key and cannot be a composite primary key. In most cases, index settings are better and indexes can be used, so the paging effect is also acceptable.

Iii. Table Variables

Because we have been using the inverted top method and the positioning method, we have never caught off with table variables. Because it is an algorithm recommended by the paging control of wuqiwa, we will continue to study this upgrade. I have read it carefully, but there are many shortcomings. Because the idea of table variables is to place the primary key and auto-incrementing fields in the table variables, and then use the auto-incrementing fields for paging. As in the preceding example, we need to first sort the data, then report the number, and then extract the data according to the number. This has brought about many shortcomings:

1. the table must have a primary key and cannot be a composite primary key. If the primary key is compound, it is difficult to extract data.
2. The primary key type affects paging algorithm writing. INT and datetime, the field type is different when defining table variables.
3. If only the primary key is sorted, the primary key index will be wasted. They had already lined up, but asked them to arrange again.
4. The more pages you need to go back, the more data you need to store in the table variables. It doesn't matter if you put more data. Other algorithms will also put some data into the "temporary table, however, the problem is that the auto-increment field does not have an index, and the more data it adds, the slower it will naturally be.
5. It can only be used in SQL Server 2000 and SQL server2005, and the speed in 2005 is obviously not faster than row_number. The paging algorithm recommended by the website of wuqiwa is divided into two versions, for SQL2000 and sql2005, row_number is recommended for the latter.

Having said so many shortcomings, I am afraid it is easier to write and understand. It is hard to understand if top is upside down. Therefore, this is the most disappointing paging algorithm.

4. row_number

I only use this, and I don't quite understand the internal principle. In most cases, the efficiency is also good, but I once found that he could not use the index, so he was depressed.

I'm afraid you don't trust it either, especially when it comes to table variables. Here we will use tests to prove the performance of each paging algorithm. It is really troublesome to say that there are at least three databases and four paging algorithms. There are also single-field sorting, multi-field sorting, single-Table paging, and variable-Association paging on the data, can a small amount of data and massive data be indexed.

3x4x2x2x2 = 192. So much is not a bit dizzy, just simplify it. Here we first use SQL Server2000 to test the massive data volume in a single table.

There are three paging algorithms, single table, and massive data (2 million, but the comparison is not effective ).

First introduce the test environment

Database: SQL Server2000

IDE: vs2008.

CPU: amd3000 +

Memory: DDR2 1g

Hard Disk: Serial Port 160 GB

Three paging algorithms: Max, table variable, and top upside down.

We use the northwind database and the products table for testing, and the built-in data is not enough. Let's copy the data and use the insert into select method to add 2523136 records, 15 records are displayed on one page, with a total of 168210 pages. Then we will record the usage time and memory usage of the first load, the first few pages, the first 100 pages, 1,000th pages, 10,000th pages, the last few pages, and the last pages.

This is the test result of sorting a single field.ProductidSort.
Unit: milliseconds.

Paging algorithm	First time	Previous pages	First 100 pages	Page 1	Page 1	Page 1	Last few pages	Last page
Maxmin	953/40 m	400/40 m	0-15/50 m	0-15/50 m	78-93/68 m	3.62 S/210 m 765/210 m	2.80 S/310 m 1.28 S/310 m	5.62 S/310 m 1.28 S/310 m
Top upside down	875/40 m	400/40 m	0-15/50 m	15-31/50 m	281/69 m 93-125/69 m	2.96 S/210 m 1.3 s/210 m	3.78 S/310 m 1.7 S/310 m	15 (Special Handling)
Table Variables	968	468	0-15/50 m	11.18 S/55 m 93-109/59 m	24.265 S/76 m 859/77 m	Timeout 45.171 S/230 m 11.78 S/240 m 8.281 s	This will not be tested.	This will not be tested.

Record description:

1. If a page has two times, the time required to jump from page 100 to page 1000 is displayed.

2. execution time and memory increment.

3. During the first running, the SQL server needs to load some data into the memory, so it takes a long time.

4. Count (*) is used to count the total number of records on the first page, so the time is a bit long. When accessing other pages, you do not need to count the total number of records, so the time will be very short.

5. The CPU usage is not recorded, and it is basically full. It seems that amd3000 + is a little weak.

The SQL statement on the first page:

Select top 15 * from products order by productid

Maxmin SQL statement:
Select top 15 * from products where productid> = (select max (productid) from (select top 526 productid from products order by productid) as t) order by productid

Top-down SQL statements:
Select * from products where productid in (select top 15 productid from (select top 420 productid from products order by productid) as t order by T. productid DESC) order by productid

The paging algorithm on the last page is specially processed to remove bugs rather than speed up.

Display the SQL statement on the last page of top upside down
Select * from (select top 1 * from products order by productid DESC) as t order by T. productid

SQL statement for table variables:
Declare @ TT table (ID int identity (2130), nid int) insert into @ TT (NID) Select top productid from products order by productid select * from products T1, @ TT T2 where t1.productid = t2.nid and t2.id between 2116 and 2130

It is really not enough. When I jumped from page 10 thousand to page 100,000, unfortunately, it timed out after 45.171 seconds. Because the data access function library does not throw an exception when it encounters an exception, so we can get the running time. By viewing the error log, we can see that "the timeout time has reached. The timeout time has elapsed before the operation is completed or the server has not responded ."
Refresh again to obtain the data.

During the test, the paging time of the table variables is unstable, and the time is slow for a while. during large-scale page jumping, the time is always timed out. Reversing top is relatively stable.

This is the result of Multi-field sorting.Unitprice, productid DESCSort. Index: unitprice, productid DESC
Unit: milliseconds.

Paging algorithm	First time	Previous pages	Page 1	Page 1	Page 1	Page 1	Last few pages	Last page
Top upside down	375/55 m	15-31/57 m	15-31/60 m	62-78/63 m	687/63 m 486/64 m	5.18 S/65 m 4.3 S/65 m	8.15 S/66 m 7-8 s/66 m	15 (Special Handling)
Table Variables	968	15-31/58 m	1.31/59 m 46/60 m	12.17 S/60 m 93-125/60 m	I don't really understand how it timed out. 27.39 S/76 m 8-9 S/77 m 900/77 m	Timeout 45.171 S/100 m 10.2 S/100 m 9.3 s	27.2 S/100 m 9.3 s	This will not be tested.

This is the test result of SQL Server 2005. Multiple fields are sorted accordingUnitprice, productid DESCSort. Index: unitprice, productid DESC
Unit: milliseconds.

Paging algorithm	First time	Previous pages	Page 1	Page 1	Page 1	Page 1	Last few pages	Last page
Top upside down	375/55 m	15-46/57 m	15-46/100 m	78-93/110 m	531/117 m 466/117 m	5-8 s/118 m	7-10 S/120 m	15 (Special Handling)
Table Variables	968	31-46/100 m	2.62/100 m 0.4 s or 1.6 s/100 m	17.34 S/110 m 93-156/110 m	25.57 S/76 m 1.25 S/77 m	Timeout 22.82 S/130 m 15.2 S/130 m 9.3 s	27.2 S/100 m 9.3 s	This will not be tested.
Row_number	500	15/93 m	546 15-31	4.82 S/200 m 93-106/210 m	3.15/265 m 3.96/263 m	Relatively slow, total timeout, do not want to test.

The test of the row_number algorithm is added here,

The SQL statement of row_number:

With t_pager as (select myindex = row_number () over (order by unitprice, productid DESC), * from products) Select * From t_pager where myindex between 150031 and 150045

I suspect that this test result is correct. If the test result is correct, it means that the efficiency of "upside down" is the highest and the applicability is wide.

But I still doubt this result. Is row_number so slow? Is there any problem with the spliced SQL statement? Or is there a problem at that stage?

========================================================== ========================================================

TestedCode

Create a base class, define a gridview, A quickpager paging control, and a label, then override oninit () function, and then add an event. Because the attribute settings of the paging control are the same for the same table, but the attributes of the paging algorithm are different, we can put the same settings in the base class, the difference is placed in a specific page.

How to Use the quickpager paging control in the base class

Public Class Baselist: system. Web. UI. Page
{
Protected Jyk. Controls. quickpager mypager = Null ;
Protected System. Web. UI. webcontrols. gridview gv = Null ;
Protected Label LBL = Null ;

Protected Datetime dt1;

Protected Override Void Oninit (eventargs E)
{
Base . Oninit (E );

// Set the properties of the paging Control

Mypager. controlgrid = GV;

Mypager. tablename = " Products " ;
Mypager. tableshowcolumns = " * " ;
Mypager. tableidcolumns = " Productid " ;
// Mypager. tableordercolumns = "productid"; productname
Mypager. tableordercolumns = " Unitprice, productid DESC " ;

Mypager. pagesize = 15 ;

// Add event
Mypager. gridbinded + = New Jyk. Controls. quickpager. eventpagechange (mypager_gridbinded );

Dt1 = Datetime. now;
}

Void Mypager_gridbinded ( Object Sender, jyk. Controls. Page. pageargs E)
{
Timespan TS = Datetime. Now - Dt1;
LBL. Text = " Seconds: " + TS. Seconds + " , Millisecond: " + TS. milliseconds;
LBL. Text + = " <Br> " + Mypager. getpagersql;
}

Void Mypager_pagechanged ( Object Sender, jyk. Controls. Page. pageargs E)
{

}
}

Create An ASPX page and drag in three widgets.

Code
< ASP: gridview ID = " Gv " Runat = " Server " Width = " 100% " >
</ ASP: gridview >
< Jyk: quickpager ID = " Mypager " Runat = " Server " />
< ASP: Label ID = " LBL " Runat = " Server " Text = " " > </ ASP: Label >

Code
Public Partial Class Lst_max: baselist
{
Protected Void Page_load ( Object Sender, eventargs E)
{
If ( ! Page. ispostback)
Mypager. setsqlkind = Jyk. Controls. Page. mypagesqlkind. row_number;
}
}

We can also compare SQL 2000 and SQL 2005 by the way. In this case, 05 does not have many advantages. Row_number is not as good as I think. Maybe I did not write it correctly. Maybe it can only show its advantages when there are four or five sorting fields? However, changing the paging algorithm is very easy for quickpager. if it finds that the speed is slow, try another paging algorithm and add an index.

Quick pager paging control v2.0.0.8:

Http://www.cnblogs.com/jyk/archive/2008/07/29/1255891.html

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Efficiency Comparison between two databases and four paging Algorithms

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

Efficiency Comparison between two databases and four paging Algorithms

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support