When the DataGrid meets 1 million records

What happens when the DataGrid encounters 1 million records? Below is the reply posted on the BlueShop forum.
[Author: Yanjiang Hua temple priest]
Original thread: http://www.blueshop.com.tw/board/show.asp?subcde=brd20050216141103c3y&fumcde=fum20041006161839lrj

Performance is never enough, so performance issues will always exist. The causes fall into two categories:
1. Human (how the application is written)
2. System (software and hardware)
The problem with your 8000 records comes from exactly these two causes.

I have an ASP.NET web reporting application running on MSDE 2000. The database file is about 900 MB and holds two huge tables: one with about 1,700,000 records (more than 30 fields), the other with about 380,000 records (about 15 fields), and the data keeps growing by hundreds of thousands of records per month. On a P4 2.8 GHz machine with 1 GB of memory, each query renders 30,000 to 50,000 table cells, yet the results display within one second (though this was achieved only after many rounds of optimization and tuning; note that the queries do not join tables). So your 8000 records + DataGrid on ASP.NET is definitely not a problem; the approach just has to change. You cannot always rely on the defaults.

However, to make an objective comparison with your problem, I first ran the same simulation test on my notebook. The test environment is as follows:
1. OS: Windows 2003 Server Standard Edition
2. Visual Studio. NET 2003
3. MS SQL Server 2000 Standard Edition
4. CPU: Pentium-M 1.8 GHz
5. Memory: 512 MB

After booting, memory usage sits at 205 MB. The test database has a table of nearly 160,000 records with 15 fields, and my program code is as follows:
SqlConnection conn = new SqlConnection("Data Source=.;Initial Catalog=LK;User ID=sa;Password=admin");
conn.Open();

// Pull 8000 rows straight into an in-memory DataSet.
SqlDataAdapter da1 = new SqlDataAdapter("SELECT TOP 8000 * FROM lm1_detail", conn);
DataSet ds = new DataSet();

da1.Fill(ds, "ABC");

conn.Close();

// Bind the whole result set to the grid at once.
DataGrid1.DataSource = ds.Tables["ABC"].DefaultView;
DataGrid1.DataBind();

da1.Dispose();
ds.Dispose();
The above retrieves 8000 records with 15 fields. The execution results are as follows:
1. In SQL Query Analyzer I ran SELECT TOP 10000 * FROM lm1_detail to time the row retrieval. Ignoring the first run, from the second run onward Query Analyzer shows 0 seconds, i.e. less than 1 second.
2. When the ASP.NET DataGrid renders, CPU usage immediately jumps to 100% and stays there for 20 seconds, but the DataGrid does display the entire data volume.
3. Memory usage rose from 217 MB to 422 MB, an increase of 205 MB in total.

So you should be vaguely aware of the problems with such a system, right?

OK. With the above data sample we can start discussing the subject (this test case should be even harsher than yours). Where is the problem?
Basically, to clarify your problem, you must cut it apart and isolate it into three layers:
1. SQL Server layer
2. ADO.NET layer
3. DataGrid control layer

Why cut it into these three layers? Because each layer has its own problems. Briefly:
1. SQL Server -> lack of optimization (no useful indexes)
2. ADO.NET -> DataSet problems
3. DataGrid control -> problems with handling and paging large amounts of data

Now let's discuss your problem:
SQL Server:
1. Your query runs 5 seconds in SQL Query Analyzer. For a join across several tables returning more than 8000 rows that may be reasonable, but further analysis is needed to decide whether there is room for improvement.
2. You are using a view, and of course you cannot add an index to the view itself. But when the view joins multiple tables, performance problems can surface: the gap between having a clustered or non-clustered index and having none can be a factor of 3 to 10, and it decides whether SQL Server's optimization engine gets to do its job.
3. To solve the index problem, do not touch the view; add indexes to the tables involved in the join. The principle for creating indexes follows the WHERE conditions of your T-SQL query, which usually means indexing the keys of the relation. Then compare the execution time with and without the index; the difference should be substantial. The most objective check, though, is to see how SQL Server's optimization engine parses your T-SQL: use the execution plan to analyze how SQL Server executes your statement. If it performs a table scan, your index is not being used at all; if it uses any index, the optimization engine is doing its work (see the sketch below).
4. Whether the server's optimization engine works depends on (1) the WHERE clause of your T-SQL and (2) whether the index you created is the right one.
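As a minimal sketch of how to check this in SQL Server 2000's Query Analyzer (using the Northwind Customers table from later in this post as a stand-in; your own tables and WHERE clause go here):

SET SHOWPLAN_TEXT ON   -- show the plan as text instead of running the query
GO

SELECT CustomerID, CompanyName
FROM Customers
WHERE Country = 'UK'
GO

SET SHOWPLAN_TEXT OFF
GO

If the plan output contains "Table Scan", the optimizer is not using any index; "Index Seek" or "Clustered Index Seek" means it is. Query Analyzer's graphical execution plan (Ctrl+L) shows the same information.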

ADO.NET problems:
1. In ADO.NET, a DataSet is an in-memory database, i.e. a database living in memory, and it has a database's basic features. With thousands or tens of thousands of rows it copes, but what about hundreds of thousands to 1 million? In an ASP.NET environment a DataSet that size becomes a big problem and gets very expensive (the way ASP.NET, IIS, and the AP server operate is beyond the scope here), and how large a load it was designed to carry is an open question. If you have spare time, test it yourself...
2. How much memory and computation does SqlDataAdapter.Fill(ds) cost? In your code, because of ASP.NET's stateless nature, each request allocates tens of MB up to 100 MB, uses it once, and throws it away. How many concurrent users do you think that can support? In my experience, if each DataSet eats tens of MB, the server is out of breath before 10 concurrent requests (see the measurement sketch after this list).
3. Even though you only view 30 rows per page, the DataSet still carries 8000 - 30 = 7970 rows. The system pays to read and write those 7970 rows, then pays again to dispose of them as useless garbage; they are pure dead weight.
4. If the DataGrid pages across multiple pages, it gets even worse.
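To put a rough number on point 2, here is a minimal sketch that measures approximately how much managed memory one Fill costs; it reuses the connection string and lm1_detail table from the test code above, and GC.GetTotalMemory only yields an estimate:

using System;
using System.Data;
using System.Data.SqlClient;

class DataSetMemoryProbe
{
    static void Main()
    {
        SqlConnection conn = new SqlConnection(
            "Data Source=.;Initial Catalog=LK;User ID=sa;Password=admin");
        SqlDataAdapter da = new SqlDataAdapter(
            "SELECT TOP 8000 * FROM lm1_detail", conn);

        long before = GC.GetTotalMemory(true);   // force a full collection first

        DataSet ds = new DataSet();
        da.Fill(ds, "ABC");                      // all 8000 rows land in memory

        long after = GC.GetTotalMemory(true);

        // Approximate cost of the DataSet plus ADO.NET bookkeeping, in MB.
        Console.WriteLine("DataSet cost: {0:F1} MB",
            (after - before) / 1024.0 / 1024.0);

        da.Dispose();
        ds.Dispose();
    }
}

On a stateless page this entire allocation is repeated, and thrown away, on every single request.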

DataGrid problems:
1. Paged or not, the DataGrid must first take in all 8000 rows in order, process them, and then pick out the 30 rows the current page needs. Do you think that can be fast? When you grow to 16000 rows, it picks 30 out of 16000... even slower. From SQL Server -> ADO.NET DataSet -> DataGrid, a huge amount of useless data is being shuttled around, punishing system memory, performance, and the CPU.
2. What happens when you switch pages on a paged DataGrid? It fires a PageIndexChanged event, which re-executes the Fill(DataSet) and throws the 8000 rows at the DataGrid all over again. How can such a program be fast? (That is what processing a large amount of data with the DataGrid means.)
3. Under HTTP's stateless nature, every resource is discarded and cannot be reused unless you manage it with state management (Application, Session, ViewState), declare it static, or use component techniques.
4. When the DataGrid holds a large amount of read-only data, set its EnableViewState property to false, because ViewState consumes a large amount of memory. If you display all 8000 rows at once without paging, you will see the difference between true and false: the gap can be as much as 30 to 60 MB (for 30,000 to 60,000 table cells); a one-line sketch follows.
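For point 4, switching off view state on a read-only grid is a one-line change; here is the binding code from earlier, adjusted as a sketch:

// For read-only data there is no need to serialize the grid's contents
// into the __VIEWSTATE hidden field on every round trip.
DataGrid1.EnableViewState = false;
DataGrid1.DataSource = ds.Tables["ABC"].DefaultView;
DataGrid1.DataBind();

The same can be set declaratively in the .aspx markup with EnableViewState="false" on the asp:DataGrid tag.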

Solution:
First, let me stress: this is not an ASP.NET bug or any system error. It is simply that this way of doing things cannot handle the problem; the method has to change!

SQL Server:
1. Create effective indexes, whether clustered, non-clustered, or composite; if necessary, even specify the index's ascending or descending order to eliminate unnecessary sorting (a sketch follows this list).
2. Use the execution plan to confirm that the optimization engine uses your index; the WHERE clause is the key to the optimizer's analysis.
3. That 5-second query must improve dramatically to solve the problem on the SQL Server side!
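A minimal sketch of point 1, assuming a query that filters and sorts on hypothetical CustomerID and OrderDate columns of an Orders-like table:

-- Clustered index (one per table): the rows themselves are stored in this order.
CREATE CLUSTERED INDEX IX_Orders_OrderID
    ON Orders (OrderID)

-- Composite non-clustered index matching a WHERE ... ORDER BY ... DESC pattern;
-- the explicit DESC key lets the optimizer skip a separate sort step.
CREATE NONCLUSTERED INDEX IX_Orders_Customer_Date
    ON Orders (CustomerID ASC, OrderDate DESC)

Which columns to index, and in which order, should follow the WHERE clauses of your actual queries, as described above.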

DataGrid and DataSet problems:
1. Basically, DataGrid + DataSet starts gasping for air when it meets a large amount of data, and system resources get exhausted (as described above). If the data volume is large, use TOP xxx to limit the amount of data and relieve the pressure on the DataSet and the DataGrid.
2. When TOP alone is not enough, use T-SQL or a stored procedure to retrieve exactly the 30 rows the page needs. This takes a little work:
(1) Compute the total record count, then total / 30 = number of pages, and drive the DataGrid's page numbers with custom paging.
(2) Based on the DataGrid's current PageIndex, call the data access layer so that Fill(DataSet) loads only those 30 rows.
3. Remember to Close the connection and Dispose the SqlDataAdapter and the DataSet to release system resources. (A sketch combining these steps follows.)
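A minimal sketch combining steps 1-3 in code-behind, assuming a DataGrid named DataGrid1 whose OnPageIndexChanged is wired up in the markup; the Orders table, connection string, and page size of 30 are illustrative, and the paged SELECT uses the TOP ... NOT IN technique shown at the end of this post:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Web.UI.WebControls;

public class OrdersPage : System.Web.UI.Page
{
    protected DataGrid DataGrid1;
    const int PageSize = 30;
    const string ConnStr =
        "Data Source=.;Initial Catalog=Northwind;User ID=sa;Password=admin";

    void Page_Load(object sender, EventArgs e)
    {
        if (!IsPostBack)
        {
            DataGrid1.AllowPaging = true;
            DataGrid1.AllowCustomPaging = true;   // we do the paging; the grid only renders
            DataGrid1.PageSize = PageSize;
            BindPage(0);
        }
    }

    protected void DataGrid1_PageIndexChanged(object source, DataGridPageChangedEventArgs e)
    {
        BindPage(e.NewPageIndex);                 // refill with 30 rows, not 8000
    }

    void BindPage(int pageIndex)
    {
        using (SqlConnection conn = new SqlConnection(ConnStr))
        {
            conn.Open();

            // Step (1): the total count drives the grid's page-number buttons.
            SqlCommand countCmd = new SqlCommand("SELECT COUNT(*) FROM Orders", conn);
            DataGrid1.VirtualItemCount = (int)countCmd.ExecuteScalar();

            // Step (2): fetch only this page's 30 rows.
            string sql = string.Format(
                "SELECT TOP {0} * FROM Orders WHERE OrderID NOT IN " +
                "(SELECT TOP {1} OrderID FROM Orders ORDER BY OrderID) " +
                "ORDER BY OrderID",
                PageSize, PageSize * pageIndex);

            DataSet ds = new DataSet();
            SqlDataAdapter da = new SqlDataAdapter(sql, conn);
            da.Fill(ds, "Page");                  // only 30 rows enter the DataSet

            DataGrid1.CurrentPageIndex = pageIndex;
            DataGrid1.DataSource = ds.Tables["Page"].DefaultView;
            DataGrid1.DataBind();

            // Step 3: release resources explicitly.
            da.Dispose();
            ds.Dispose();
        }
    }
}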

Basically, your system will become very "light", and the speed will improve by a factor of 10 or more. Even when your data grows past 1 million records, you will still have power to spare!

Note:
1. SQL Server 2005 supports paging in cooperation with ADO.NET and retrieves only the rows your page needs, which is a great help to system performance and design (a sketch of the 2005 approach follows).
2. For code that retrieves only one page via T-SQL or a stored procedure, spend a little time doing your homework; I post the code below for reference.
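As a sketch of note 1's direction, on SQL Server 2005 the ROW_NUMBER() function makes the whole paging query self-contained (rows 41 to 50 of a hypothetical Orders table):

-- SQL Server 2005 and later: number the rows, then keep one page of them.
SELECT *
FROM (
    SELECT O.*, ROW_NUMBER() OVER (ORDER BY OrderID DESC) AS RowNum
    FROM Orders O
) AS Numbered
WHERE RowNum BETWEEN 41 AND 50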

If you have time, visit my blog: http://blog.sina.com.tw/weblog.php?blog_id=4907

Actually there are many experts here, each with their own strengths; this is just an exchange. I once spent a lot of time studying ASP.NET and ADO.NET performance problems, because my own system passed 2 million records at one point. Along the way I tested the performance of MSDE 2000, and I have to say it impressed me: MSDE 2000 plus ADO.NET's offline model work very well together, cost nothing, and can even be managed with Enterprise Manager. It really saves a lot of money. I am publishing my experience for everyone; thanks, everyone!

There are two paging techniques: one goes directly through T-SQL, the other through a stored procedure. I post both here to share:
T-SQL:
Suppose Northwind's Customers table is the source and you need to retrieve records 41 to 50. How is that done in T-SQL?

SELECT TOP 10 CustomerID, CompanyName, ContactName, Country
FROM Customers
WHERE CustomerID NOT IN
    (SELECT TOP 40 CustomerID FROM Customers ORDER BY Country, CustomerID)
ORDER BY Country, CustomerID
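One caveat: SQL Server 2000 does not accept a variable after TOP, so to generalize the hardcoded 10/40 to an arbitrary page, the statement is usually built as a string, either in the data access layer or, as in this sketch, with dynamic T-SQL (@PageIndex and @PageSize are illustrative):

DECLARE @PageIndex int, @PageSize int, @Sql nvarchar(1000)
SET @PageIndex = 4    -- zero-based, so this is records 41-50
SET @PageSize  = 10

-- Build 'TOP @PageSize' and 'TOP @PageSize * @PageIndex' into the text.
SET @Sql = N'SELECT TOP ' + CAST(@PageSize AS nvarchar(10))
         + N' CustomerID, CompanyName, ContactName, Country FROM Customers'
         + N' WHERE CustomerID NOT IN (SELECT TOP '
         + CAST(@PageSize * @PageIndex AS nvarchar(10))
         + N' CustomerID FROM Customers ORDER BY Country, CustomerID)'
         + N' ORDER BY Country, CustomerID'

EXEC sp_executesql @Sql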

Stored procedure:
(From MSDN Magazine; borrowing other people's cleverness.)
CREATE PROCEDURE northwind_OrdersPaged
(
    @PageIndex int,
    @PageSize int
)
AS
BEGIN
    DECLARE @PageLowerBound int
    DECLARE @PageUpperBound int
    DECLARE @RowsToReturn int

    -- First set the rowcount
    SET @RowsToReturn = @PageSize * (@PageIndex + 1)
    SET ROWCOUNT @RowsToReturn

    -- Set the page bounds
    SET @PageLowerBound = @PageSize * @PageIndex
    SET @PageUpperBound = @PageLowerBound + @PageSize + 1

    -- Create a temp table to store the select results
    CREATE TABLE #PageIndex
    (
        IndexID int IDENTITY (1, 1) NOT NULL,
        OrderID int
    )

    -- Insert into the temp table
    INSERT INTO #PageIndex (OrderID)
    SELECT OrderID
    FROM Orders
    ORDER BY OrderID DESC

    -- Return total count
    SELECT COUNT(OrderID) FROM Orders

    -- Return paged results
    SELECT O.*
    FROM Orders O,
         #PageIndex PageIndex
    WHERE O.OrderID = PageIndex.OrderID AND
          PageIndex.IndexID > @PageLowerBound AND
          PageIndex.IndexID < @PageUpperBound
    ORDER BY PageIndex.IndexID
END
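A minimal sketch of calling this procedure from ADO.NET; because it returns two result sets (the total count, then the page rows), Fill maps them to two tables, here "Orders" and "Orders1":

using System;
using System.Data;
using System.Data.SqlClient;

class PagedOrdersDemo
{
    static void Main()
    {
        SqlConnection conn = new SqlConnection(
            "Data Source=.;Initial Catalog=Northwind;User ID=sa;Password=admin");

        SqlCommand cmd = new SqlCommand("northwind_OrdersPaged", conn);
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add("@PageIndex", SqlDbType.Int).Value = 0;  // first page
        cmd.Parameters.Add("@PageSize", SqlDbType.Int).Value = 30;

        DataSet ds = new DataSet();
        SqlDataAdapter da = new SqlDataAdapter(cmd);

        // First result set -> ds.Tables["Orders"] (one row: the total count);
        // second result set -> ds.Tables["Orders1"] (this page's rows).
        da.Fill(ds, "Orders");

        int totalCount = (int)ds.Tables["Orders"].Rows[0][0];
        int pageCount  = (totalCount + 29) / 30;   // ceiling(total / PageSize)
        Console.WriteLine("{0} orders, {1} pages, {2} rows on this page",
            totalCount, pageCount, ds.Tables["Orders1"].Rows.Count);

        da.Dispose();
        ds.Dispose();
    }
}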

I believe that if you spend more time on it, you can write something better than the above. If you have a better algorithm, please share it!
