Key points:
Infrastructure metrics work best when they are correlated with application metrics.
The key to effective performance optimization is performance data.
Some APM tools support ASP.NET out of the box, so getting started requires minimal initial setup.
Code profilers give the most detailed view of application performance.
Lightweight profilers give a real-time view of web page performance and can be used in both development and production environments.
"This page is too slow to open!" "The complaints about Web sites are recurring and pervasive, especially since Web applications are beginning to replace desktop applications. While the web brings the ideal features of global delivery, it also poses challenges at the performance level.
Fundamentals of data acquisition and use
A user has just given you the URL of a "turtle-speed" page. What should you do? Where does the slow page load come from? Was it always this slow? Is it slow for all users? To fix the slow page, and to make sure it does not slow down again a week later, a number of such questions need to be answered.
Plenty of performance-optimization material can be found on the web, but it is typically tied to specific topics such as JIT, garbage collection, SQL query optimization, ORM pitfalls, and so on. Given the tempting prospect of applying such optimizations, there is a problem: how do you know that the chosen optimization will actually help with the performance issue at hand?
Clearly, a link is missing: we need a repeatable way to find out where performance problems lie. With such a method we can locate the slower parts of the system and have concrete measurements to back up our diagnosis. Once the performance problem is pinned down, we can determine whether an improvement is worthwhile and explain all of this to stakeholders.
For the performance problems mentioned above, accurate diagnosis is the more effective approach. The problem may not even be a slow page load in the first place: when a timeout occurs (for example, a load balancer may stop serving a connection after a few seconds), the page is completely inaccessible. Is that a deadlock or simply a slow response? Both lead to the same visible result, a timeout. Data is needed to find the real cause of the problem.
To illustrate the importance of accurately identifying performance issues, here are some possible culprits when a web application responds slowly:
Slow JavaScript;
Blocking during resource loading;
A proxy on the client side;
DNS issues;
ISP or network problems;
Switches and routers;
Load balancer;
Application code (including third-party libraries);
The HTTP server (for example, ASP.NET or IIS);
Third-party services, such as payment service providers, map service providers, etc.;
Subsystems, including SQL Server, Redis, Elasticsearch, RabbitMQ, and more.
You could list even more potential culprits, depending on the complexity and size of the system you have to handle. How can you diagnose performance problems when so many system components can affect them? The answer comes down to one word: data. You need relevant and meaningful data from each system component. For a slow web application, data can show whether each component contributes to the problem or is completely unrelated.
With the data in hand, you can start narrowing down the list above, much like walking down a decision tree. Each level you descend brings you closer to the details and the true nature of the performance issue, letting you identify, in turn, whether the problem lies in:
The client, the server, or somewhere in between?
Slow JavaScript, rendering, or resource blocking?
The load balancer, the web server, a subsystem, or third-party software?
In such a tree, the performance issue becomes clearer with every level you go down. At each level, the data used to locate the issue must match the precision that level requires, which may call for tools such as a code profiler or a SQL execution plan.
To spend your time effectively, it is worth restating Amdahl's law:
No matter how much a task is improved, the part of the task that does not benefit from the improvement limits the theoretical speedup.
For example, suppose a web request requires 100 milliseconds of server processing time and 5 seconds of SQL query time. Even if you optimize the server processing time down to 1 millisecond, the overall response time barely improves, going from 5.1 seconds to just over 5 seconds. The 5 seconds spent in SQL is where the largest potential gain lies.
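In formula form, Amdahl's law gives the overall speedup as S = 1 / ((1 − p) + p / s), where p is the fraction of total time affected by the optimization and s is the factor by which that part is sped up. For the example above, p = 0.1 / 5.1 ≈ 0.02, so even with s = 100 the overall speedup is only about 1.02 (5.1 seconds down to roughly 5.0 seconds); the SQL query, with p ≈ 0.98, offers far more room for improvement.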
Architectural issues
The top-down method described above, working through the problem one layer at a time, works well when the optimization problem is confined to a single page. What happens when the problem spans multiple pages? For example, some pages may open slowly intermittently because a subsystem cannot keep up with the overall load, or because an aging network switch in the system occasionally stops working until it is restarted.
In such cases, application-focused monitoring shows its limitations. Software-level and hardware-level metrics are needed to evaluate every component of the system.
At the hardware level, the web server and the database server come to mind first, but they are only the tip of the iceberg. Every hardware component in the system must be identified and monitored, including servers, network switches, routers, load balancers, firewalls, SANs, and so on.
Since hardware monitoring is usually the system administrator's job, all of these metrics may seem obvious. But here is an important caveat: if these hardware metrics are kept separate from the software metrics, most of them are useless from a performance perspective. In other words, a metric delivers its full value only when placed in the right context.
For example, an average CPU utilization of 50% may be perfectly normal on a database server yet be a ticking time bomb on another server. At peak times, 50% CPU usage means there is still plenty of headroom for heavier work; however, if 50% CPU usage occurs regularly during quiet periods, the application may not be able to withstand a sudden spike of incoming requests.
The bottom line is that, to assess the health of the system, system-wide metrics such as CPU, memory, and disk must be correlated with application metrics. To get a more complete view of system health, visualize system metrics such as CPU utilization alongside application metrics such as request throughput.
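As a minimal sketch of what such a correlation can look like in practice, the snippet below samples system CPU utilization and ASP.NET request throughput together on a Windows/IIS host so the two series can be charted side by side. The counter and instance names ("ASP.NET Applications", "__Total__") are assumptions that depend on the installed .NET/IIS version and may need adjusting:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class MetricSampler
{
    static void Main()
    {
        // System-wide CPU utilization.
        var cpu = new PerformanceCounter("Processor", "% Processor Time", "_Total");

        // ASP.NET request throughput across all applications
        // (category/instance names are assumptions; adjust for your host).
        var requests = new PerformanceCounter("ASP.NET Applications", "Requests/Sec", "__Total__");

        // The first NextValue() call only establishes a baseline and returns 0.
        cpu.NextValue();
        requests.NextValue();

        while (true)
        {
            Thread.Sleep(TimeSpan.FromSeconds(5));

            // Emit both values with the same timestamp so they can be
            // correlated on a single dashboard or chart.
            Console.WriteLine(
                $"{DateTime.UtcNow:o} cpu={cpu.NextValue():F1}% requests/s={requests.NextValue():F1}");
        }
    }
}
```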
Application Performance Management (APM) tools
APM tools cover the basics: data collection, data storage, and data visualization. Typically an agent collects the data and sends it to a data store, and the data is visualized through a web interface whose dashboards are centered on web requests.
APM can be used to:
Visualize the overall performance of a web application;
Visualize the performance of specific web requests;
Automatically send alerts when web application performance degrades or errors multiply;
Verify how the application responds when business volume increases.
The following is a non-exhaustive list of APM tools that support ASP.NET and IIS out of the box:
New Relic APM
Application Insights
AppDynamics
Stackify
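To make the agent/data-store idea concrete, here is a minimal sketch of reporting telemetry by hand with one of the tools above, Application Insights. It assumes the Microsoft.ApplicationInsights NuGet package and uses a placeholder instrumentation key; in a real ASP.NET application most of this data is collected automatically by the agent, so the sketch only illustrates the kind of data an APM tool stores and visualizes:

```csharp
using System;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.Extensibility;

class ApmSketch
{
    static void Main()
    {
        // Configuration normally comes from ApplicationInsights.config or app settings;
        // the key below is a placeholder.
        var config = TelemetryConfiguration.CreateDefault();
        config.InstrumentationKey = "00000000-0000-0000-0000-000000000000";
        var telemetry = new TelemetryClient(config);

        var start = DateTimeOffset.UtcNow;
        // ... handle the web request and call the database here ...

        // Report an external (SQL) call and the overall request so they appear
        // on the request-centric dashboards described above.
        telemetry.TrackDependency("SQL", "OrdersDb", "SELECT ... FROM Orders",
            start, TimeSpan.FromMilliseconds(120), success: true);
        telemetry.TrackRequest("GET /orders", start, TimeSpan.FromMilliseconds(180), "200", true);

        telemetry.Flush();
    }
}
```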
Infrastructure Monitoring Tools
Infrastructure monitoring tools capture metrics at the host level, giving a fuller picture of performance. These metrics are collected at both the hardware and the software level.
DataDog
Opserver (open source)
Lightweight profiling tools
Lightweight profilers provide high-level metrics for specific web requests and give developers real-time feedback as they browse web pages. These tools can run in every type of environment (development, QA, staging, production, etc.), which makes them ideal for quickly evaluating the performance of a specific page.
The essential difference between lightweight profilers and full-featured profilers is that they do not attach to the process, which means you do not have to worry about the overhead of leaving them enabled.
In the development environment, a lightweight profiler provides real-time feedback on the code that is currently being written. This is useful for catching problems such as N+1 queries or slow response times early, because the response time is always displayed in a corner of the page.
MiniProfiler (open source)
Glimpse (open source)
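As an illustration of how little ceremony a lightweight profiler needs, here is a minimal MiniProfiler sketch, assuming the StackExchange.Profiling package is installed and MiniProfiler has been initialized at application startup; the step names are invented for the example:

```csharp
using StackExchange.Profiling;

public class ProductsPage
{
    public void Render()
    {
        // MiniProfiler.Current is the profiler attached to the current request
        // (configured once at application startup).
        using (MiniProfiler.Current.Step("Load products"))
        {
            // ... run the database query here; repeated identical steps are how
            //     N+1 problems become visible in the page-corner widget ...
        }

        using (MiniProfiler.Current.Step("Render view"))
        {
            // ... build the HTML for the page ...
        }
    }
}
```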
Filling the gaps with performance counters
Performance counters in Windows provide metrics on many aspects of the hardware and software levels. Monitoring tools typically report common performance counters such as CPU and memory usage, but some useful counters, such as time spent in garbage collection, are often missing. The most practical way to get started is to use a basic list and add the relevant counters iteratively. Performance counters can also be captured and visualized in real time with perfmon, and in many cases custom metrics or plug-ins can be integrated with APM tools.
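As a small sketch of filling one such gap, the snippet below reads the "% Time in GC" counter from the ".NET CLR Memory" category for the current process. The instance name is assumed to match the process name, which is the usual convention but can differ when several processes share a name:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class GcCounterSample
{
    static void Main()
    {
        // Instance names in the ".NET CLR Memory" category normally match
        // the process name (for example "w3wp" for an IIS worker process).
        string instance = Process.GetCurrentProcess().ProcessName;

        var timeInGc = new PerformanceCounter(".NET CLR Memory", "% Time in GC", instance, readOnly: true);

        // The first read only establishes a baseline; later reads return real values.
        timeInGc.NextValue();
        Thread.Sleep(1000);
        Console.WriteLine($"% Time in GC: {timeInGc.NextValue():F1}");
    }
}
```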
SQL Tools
Because databases are so widely used, the persistence layer (that is, the SQL database) is often the performance bottleneck. Dedicated SQL monitoring tools can provide resource usage metrics as well as more specific metrics such as wait times or compilations per second, to name just a few.
With such data available, several classes of problems can be identified and performance improvements made:
Excessive throughput on one or several queries;
Excessive CPU usage, which implies the existence of query problems or the lack of indexes;
High-throughput queries that can be cached.
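As one example of where such data can come from on SQL Server, the sketch below queries the sys.dm_exec_query_stats dynamic management view for the queries that consume the most CPU, a common way to spot the problematic-query and missing-index situations listed above. The connection string and the TOP 10 limit are placeholders:

```csharp
using System;
using System.Data.SqlClient;

class TopCpuQueries
{
    static void Main()
    {
        const string sql = @"
            SELECT TOP 10
                   qs.execution_count,
                   qs.total_worker_time / 1000 AS total_cpu_ms,
                   qs.total_elapsed_time / 1000 AS total_elapsed_ms,
                   st.text AS query_text
            FROM sys.dm_exec_query_stats AS qs
            CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
            ORDER BY qs.total_worker_time DESC;";

        // Placeholder connection string; point it at the server being investigated.
        using (var connection = new SqlConnection("Server=.;Database=master;Integrated Security=true"))
        using (var command = new SqlCommand(sql, connection))
        {
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    Console.WriteLine($"{reader["execution_count"]} executions, " +
                                      $"{reader["total_cpu_ms"]} ms CPU: {reader["query_text"]}");
                }
            }
        }
    }
}
```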
A number of dedicated SQL monitoring tools are available.
Other persistence systems
All subsystems need to be monitored in some way. Simple data collection and visualization is sufficient for low-throughput or non-critical systems; otherwise, more advanced and specialized monitoring is required.
Code profiling tools
When a specific page or piece of code has been diagnosed as slow, a code profiler provides the most detailed view of the performance problem. Code profilers can also give an accurate view of external calls such as database queries and web requests.
Profiling tools:
Redgate ANTS
JetBrains dotTrace
Memory profiling tools
Memory monitoring and garbage collection metrics help detect potential problems, but they usually only show that a problem exists without pointing to its cause. A memory profiler is useful when you need to dig deeper into heap memory and garbage collection issues.
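Before reaching for a full memory profiler, a quick in-process check of garbage collection activity can already hint at whether memory is worth investigating; a minimal sketch using the built-in GC class:

```csharp
using System;

class GcQuickStats
{
    static void Main()
    {
        // Collection counts per generation: a rapidly growing Gen 2 count
        // usually indicates memory pressure worth profiling further.
        Console.WriteLine($"Gen 0 collections: {GC.CollectionCount(0)}");
        Console.WriteLine($"Gen 1 collections: {GC.CollectionCount(1)}");
        Console.WriteLine($"Gen 2 collections: {GC.CollectionCount(2)}");

        // Approximate managed heap size without forcing a collection.
        Console.WriteLine($"Managed heap: {GC.GetTotalMemory(forceFullCollection: false) / (1024 * 1024)} MB");
    }
}
```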
A number of dedicated memory profilers are available.
Client-side profiling tools
Performance problems can also originate in the front end. This has become a common situation with the proliferation of JavaScript-heavy single-page applications. All major browsers embed tools such as a code profiler and a memory profiler. Tools that display the sequence of events and requests make it easy to tell at a glance whether a problem originates on the front end or the back end.
Tools:
Google Chrome Timeline
Firefox
Page analysis Tools
High-level client-side tools provide a convenient starting point for discovering and resolving performance issues. They give a high-level view of the root causes of slow response times and offer recommendations accordingly. Google's PageSpeed Insights, for example, is one such free tool.
The number of factors and tools involved in system performance is very large and can seem overwhelming, but it all comes down to one word: data. A clear and accurate view of the system makes it possible to reason about performance problems. It also lets you learn to solve performance problems as they arise, because performance metrics and graphs will point you to exactly what is affecting system performance.