Personal experiences in ASP.NET program performance optimization, the series:
My experiences in ASP.NET program performance optimization (1): Database
My experiences in ASP.NET program performance optimization (1): Database (another article)
My experiences in ASP.NET program performance optimization (2): ASP.NET code optimization
My experiences in ASP.NET program performance optimization (3): Front-end performance optimization
------------------------------------------------------------------------------
In the previous article, "My experiences in ASP.NET program performance optimization (1): Database", many readers raised questions about some of my points. I have since read more material carefully, corrected some errors, added more detailed explanations of the controversial points, and backed up some of the performance claims with test data.
Table Structure:
News(newsid, newstitle, content, cateid, catetitle, postuserid, postusername, addtime, datenum, hits, commentnum) -- hits: click count; commentnum: comment count
Cate(cateid, catetitle)
Only primary keys are created: no foreign keys, and no other indexes for now. The News table is loaded with 1 million rows of test data, and the Cate table has three rows.
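For reference, a minimal DDL sketch of this test schema might look like the following. The column types and sizes are my assumptions; the original does not list them:

```sql
-- Hypothetical DDL for the test schema; exact types are not given in the article.
CREATE TABLE Cate (
    cateid    INT          NOT NULL PRIMARY KEY,  -- clustered by default
    catetitle NVARCHAR(50) NOT NULL
);

CREATE TABLE News (
    newsid       INT IDENTITY(1,1) NOT NULL PRIMARY KEY,  -- clustered by default
    newstitle    NVARCHAR(200)     NOT NULL,
    content      NVARCHAR(MAX)     NOT NULL,
    cateid       INT               NOT NULL,  -- logical foreign key, no constraint yet
    catetitle    NVARCHAR(50)      NOT NULL,  -- redundant copy discussed in section 1
    postuserid   INT               NOT NULL,
    postusername NVARCHAR(50)      NOT NULL,
    addtime      DATETIME          NOT NULL DEFAULT GETDATE(),
    datenum      INT               NOT NULL,
    hits         INT               NOT NULL DEFAULT 0,
    commentnum   INT               NOT NULL DEFAULT 0
);
```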
1. Is merging the foreign key's related column into the main table more efficient than strict normal form?
These two tables form a typical master/lookup ("fat and thin") table pair, which is a common query pattern. Assume a strictly normalized structure, i.e. the News table does not contain catetitle; a join query is then required:
SELECT n.*, c.* FROM News n INNER JOIN Cate c ON c.cateid = n.cateid AND c.cateid = 1
The execution plan is as follows:
[screenshot: execution plan]
In another case, the query statement is as follows:
SELECT * FROM News WHERE cateid = 1
The execution plan is as follows:
From the execution plans we can see that the first case performs two clustered index seeks and then merges the results with a nested loop join, while the second case performs only a single clustered index seek, so it costs somewhat less. Here is the comparison in SQL Server Profiler:
[screenshot: SQL Server Profiler comparison]
Reads are roughly the same in both cases, but because the first statement must also merge the join results, it consumes some extra CPU.
Conclusion: merging the foreign key's related column into the main table does improve the performance of this simple query, but it does so through data redundancy. Since the query condition hits the clustered index either way, this is not worth considering when the data volume is small.
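One cost of this redundancy is keeping the copied column in sync. A sketch, assuming the schema above, of how a category rename could be propagated to the redundant copy:

```sql
-- When a category title changes, the redundant copy in News must change too,
-- and both updates should succeed or fail together.
BEGIN TRANSACTION;

UPDATE Cate
SET    catetitle = N'Tech'
WHERE  cateid = 1;

UPDATE n
SET    n.catetitle = c.catetitle
FROM   News n
INNER JOIN Cate c ON c.cateid = n.cateid
WHERE  c.cateid = 1;

COMMIT TRANSACTION;
```

This extra write cost on every rename is the price paid for the cheaper reads.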
2. Problems with the datetime type
In the previous article I said that datetime performs better than int for sorting. That was an arbitrary conclusion, and here I explain it properly. Both types are ultimately stored as numbers, but datetime occupies 8 bytes while int occupies 4. For a simple sort like this, the performance is essentially identical. datetime does come with powerful date/time functions that int cannot match, so when those are needed, datetime is required. However, for a sort like ORDER BY addtime DESC, if addtime defaults to GETDATE(), its order matches the primary-key order, so ORDER BY newsid DESC is recommended instead.
The SQL statement is as follows:
SELECT TOP 5000 * FROM News ORDER BY addtime DESC
SELECT TOP 5000 * FROM News ORDER BY newsid DESC
The execution results in SQL Server Profiler are as follows:
Conclusion: the datetime type should be retained; and if the values fit within smalldatetime's range and precision, smalldatetime is recommended for storage.
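To illustrate that trade-off, a sketch (the column name addtime_small and the constraint name are my assumptions for demonstration):

```sql
-- smalldatetime: 4 bytes, minute precision, range 1900-01-01 .. 2079-06-06.
-- datetime:      8 bytes, ~3.33 ms precision, range 1753 .. 9999.
ALTER TABLE News ADD addtime_small SMALLDATETIME NOT NULL
    CONSTRAINT DF_News_addtime_small DEFAULT GETDATE();

-- The date/time functions work on both types, e.g. article age in days:
SELECT DATEDIFF(DAY, addtime, GETDATE()) AS days_old
FROM   News
WHERE  newsid = 1;
```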
3. The SELECT TOP 1 problem
If you understand clustered indexes, you will understand this problem. The primary key is a clustered index by default, and for a seek on the clustered index TOP 1 makes little difference. Below, a primary-key query is analyzed with and without TOP 1:
[screenshot: execution plan comparison]
Complex queries involve more condition columns, and those columns often have no index, so a full table scan is performed and performance suffers. Without TOP 1:
SELECT * FROM News WHERE newsid = 500000 AND postusername = 'mongoingp'
The execution plan is as follows:
[screenshot: execution plan]
When TOP 1 is added as a constraint:
SELECT TOP 1 * FROM News WHERE newsid = 500000 AND postusername = 'mongoingp'
We can see that the scan stops and returns as soon as the requested number of rows has been found, saving the remainder of the full table scan:
[screenshot: SQL Server Profiler comparison]
Conclusion: not every query condition lands on a (clustered) index. If only a few rows need to be returned, constrain the query with TOP; the difference is even more pronounced in nested queries such as IN subqueries.
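For instance, when a query only needs to know whether any matching row exists at all, TOP 1 (or EXISTS, which short-circuits the same way) avoids scanning the rest of the table. A sketch; the user name literal is a placeholder:

```sql
-- Stops at the first match instead of scanning all remaining rows.
IF EXISTS (SELECT 1 FROM News WHERE postusername = N'someuser')
    PRINT 'user has posted';

-- The same idea expressed with TOP 1:
SELECT TOP 1 newsid
FROM   News
WHERE  postusername = N'someuser';
```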
4. Should fields such as hits and updatetime be split into a separate table?
Fields of this type are the most frequently updated columns in the master table. When a large table is updated this often, performance visibly degrades, so with large data volumes it is advisable to split such columns into a separate table and index that table's logical foreign-key column to improve performance.
The new table structure is as follows:
News(newsid, newstitle, content, cateid, catetitle, postuserid, postusername, addtime, datenum, commentnum) -- commentnum: comment count
Hits(newsid, hits) -- hits: click count
Cate(cateid, catetitle)
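With the split schema, a hit-count update touches only the narrow Hits table, and the full row is reassembled with a join only when the count is actually displayed. A sketch, assuming the tables above:

```sql
-- Frequent updates now touch only the narrow Hits table,
-- leaving the wide News rows (and their pages) untouched:
UPDATE Hits SET hits = hits + 1 WHERE newsid = 500000;

-- Join back to News only when the hit count needs to be shown:
SELECT n.newsid, n.newstitle, h.hits
FROM   News n
INNER JOIN Hits h ON h.newsid = n.newsid
WHERE  n.newsid = 500000;
```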
5. Foreign key issues
It is well known that foreign keys affect performance; here performance is the main focus, but each case must be judged on its own merits. Foreign keys are an important means of maintaining data integrity, and in some applications integrity matters more than performance. In those cases foreign keys are recommended: the performance they cost is far less important than the business correctness they protect, and performance can be recovered through other optimizations.
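Where integrity outweighs performance, the constraint itself is straightforward. A sketch for the example schema (the constraint name is my assumption):

```sql
-- Enforce referential integrity between News.cateid and Cate.cateid:
-- a News row can no longer reference a nonexistent category, and a
-- referenced Cate row cannot be deleted out from under it.
ALTER TABLE News
ADD CONSTRAINT FK_News_Cate
    FOREIGN KEY (cateid) REFERENCES Cate (cateid);
```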
For Internet applications, data grows extremely fast, and unreasonable design plus imperfect coding can cause unpredictable problems in operation (apart from the self-styled gurus on CSDN, who can guarantee zero bugs?). In that setting the cost of foreign keys is comparatively high. For small systems such as corporate websites, however, foreign keys are still recommended; at the very least they help keep your own design disciplined.
6. Does a scan always mean a full table scan?
Physically, SQL Server organizes the data in its partitions in three ways: 1. a B-tree that stores the data pages of a table with a clustered index; 2. a heap that stores a table with no clustered index; 3. B-trees, similar to the clustered index's, that store nonclustered index structures. SQL Server retrieves data differently for each of these three structures.
This is actually a deep topic, and I only know it superficially; what follows comes from the book Microsoft SQL Server Enterprise Platform Management Practices. The conclusion: a scan is not always a full table scan, and in some cases a scan performs no worse than a seek. A full understanding requires deeper knowledge of SQL Server's core query mechanism.
That said, most of our queries are simple ones, and for those a scan should still be avoided as much as possible, especially in complex scenarios such as multi-level nested queries.
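To see for yourself whether a given query scans or seeks, SQL Server can report per-statement I/O and timing statistics alongside the graphical plan. A sketch; the second query's predicate column is unindexed in the example schema, so the comments state what I would expect rather than measured results:

```sql
-- Report logical/physical reads and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT * FROM News WHERE cateid = 1;           -- seek on the clustered index
SELECT * FROM News WHERE postusername = N'x';  -- likely a clustered index scan
                                               -- (no index on postusername)

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```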
7. The power of indexes
There is no need to restate what an index is; the following example shows how powerful one can be.
First, sort by commentnum (comment count) in descending order without an index:
SELECT TOP 5000 * FROM News ORDER BY commentnum DESC
Then create an index on commentnum:
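Presumably a nonclustered index such as the following; the index name is my assumption:

```sql
-- A descending nonclustered index lets ORDER BY commentnum DESC
-- become an ordered index scan instead of a sort over the whole table.
CREATE NONCLUSTERED INDEX IX_News_commentnum
ON News (commentnum DESC);
```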
Execute the SQL statement again:
We can see that before the index was created the time was spent mainly on sorting, while afterwards it is almost entirely spent on the retrieval itself:
By analyzing the complex conditions that appear in your queries, you can create appropriate indexes and greatly improve performance.
Finally, and most importantly: in real database applications, apply these ideas flexibly. Do not treat the theory as gospel; when in doubt, test it yourself and read authoritative material so your skills keep improving.
If you find any problems in this article, please point them out in a reply.
References:
Microsoft SQL Server Enterprise platform management practices
This article is also posted on my personal homepage: http://www.walkingp.com/?p=1136