Golden ocean student's "Performance Test on Int auto-increment field and GUID field"

Last Update:2018-12-08 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Original article: performance tests on the Int auto-increment field and GUID field. Only test, no analysis, huh, huh

1. Why does the execution plan become different after top is added?
After top 100 is added, SQL server can optimize the execution plan and use the nested loop method because only 100 pieces of data need to be returned and the association between the two tables has index information.
If top 100 is not added, all data needs to be returned, because the cost is relatively high and the computer is dual-core. Therefore, the parallel execution plan is adopted. Parallel Execution requires many more operations. SQL server requires partitions and merge results after execution. However, this may shorten the execution time (not absolute, in some cases, the time is longer than the non-parallel time)

2. Why is the int identity speed not significantly faster?
Speed! = Efficiency: efficiency is a comprehensive consideration. A simple example is as follows:
A) when SQL A is executed, the CPU usage is 25%, and the required time is 2 seconds.
B) when SQL B is executed, the CPU usage is 10%, and the required time is 4 seconds.
In ideal cases, what is the maximum number of requests processed per 10 seconds, a) and B? (A 20, B 25)
The result of efficiency is the support for throughput and data volume. The program we develop is not only used by one person, but the statements written in many cases may be very fast, but this does not necessarily mean high efficiency. What will happen when concurrent requests are 50, 100, or 1000? What happens when the data volume reaches 0.1 million, 1 million, or 10 million?
The usage of SQL statements on CPU, memory, and I/O is the main consideration. using SQL server profiler to monitor the execution of SQL statements, you can see the CPU, Reads, Writes, and Duration data, generally, SQL server profiler and execution plan information are enough. For more detailed monitoring and analysis, you can use the system performance monitor.
My opinion on this test:
1. 50 thousand join 0.2 million and 0.2 million data are returned. The cost is not small. As for whether the primary key is int, guid, or varchar, the execution time does not differ much.
2. because different types of primary keys are not significantly different, and SQL server itself has multi-task scheduling, memory management, and windows Multi-task environment, there may be a lot of interference from other factors, for example, anti-virus software monitors files and memory. It is impossible to determine who is faster than anyone else. Test the program several times, shut down other programs, or restart the machine for testing, or if someone else tests the same database, the results may be different.
3. Use SQL profiler to check the differences between CPU, Reads, Writes, and Duration. You can get a little detailed and comprehensive comparison and evaluation. There is no better argument between them than the other. Efficiency, performance, and design are all decisions between trade-offs.
4. For the primary key of guid, the performance is more comprehensive consideration of the storage structure, index and data page fragmentation, and the efficiency of insert.

3. Why does SQL server not return data for a long time when the CPU usage is very low? What is SQL server doing?
First, you must check the join method according to the query plan. Generally, the CPU and memory usage of the Nested Loop/Inner Join operation is relatively low, however, Hash Match/Inner Join operations require a lot of CPU usage, and the memory needs to be determined by the amount of data to be operated. Inaccurate statistics and other factors may cause SQL server to use nested loop when it is not suitable. This results in low CPU usage and long execution of query statements.
For details about the join method, refer to the "heaven-hell JOIN" Method for performance optimization.
If Hash Match/Inner Join is used and the CPU usage is low, it is likely that I/O is being waited, probably because the memory that SQL server can use is insufficient, or I/O itself is slow, as long as the memory is sufficient data can be loaded into the memory, and the data volume is not too small, Hash Match CPU usage is usually 80-100%

Note: Sorry, I have not seen several posts discussing this topic. You may have discussed these ideas.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Golden ocean student's "Performance Test on Int auto-increment field and GUID field"

Contact Us

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support