Perform a stress test on the Linux kernel

Source: Internet
Author: User
Tags: gcov, perl, script
Automated software testing lets you run the same tests repeatedly over a period of time, which ensures that the results you compare are truly comparable. In this article, members of the Linux Test Project team share the methodology, rationale, scripts, and tools they use to stress test the Linux kernel.
  
When testing the stability of a Linux kernel release, it is necessary to state explicitly, and to demonstrate, why the release is or is not stable. However, no existing system-wide stress test has been shown to exercise the overall stability of the Linux kernel. This article presents a method for building a system-wide Linux stress test and validating the result. Linux developers, users, and distributions each use their own methods to test kernel stability, but they do not publish basic information about which tests they chose to run, how much code those tests covered, and what stress levels they reached, which greatly reduces the value of their results.
  
Using lab machines and tests from the Linux Test Project test suite, we developed a test combination, guided by system resource utilization statistics, that puts sufficient stress on the system. We then analyzed this combination to determine which parts of the Linux kernel it exercised during execution, and modified it to raise the code coverage percentage while maintaining the desired high level of system stress. The final stress test covers enough of the Linux kernel to support stability claims, and it is backed by data on both system utilization and kernel code coverage.
  
The four steps of this combined test method are: Test selection, system resource utilization evaluation, kernel code coverage analysis, and final stress test evaluation.
  
   Select test
Test selection must satisfy two criteria:
- The tests should achieve a high level of resource utilization in the main kernel areas: CPU(s), memory, I/O, and networking.
- The tests should cover enough kernel code to support the stability claims drawn from the results.
  
Where possible, use tests that are automated or easy to modify so that the run can be automated. Automation makes testing faster and repeatable and reduces the risk of human error. Another consideration when selecting tests is whether the application allows results to be published freely. It is best to choose tests and test suites that firmly support open source practices and/or the GPL, which helps keep the publication process simple.
  
   Evaluate system resource utilization
The selected test combination must put sufficient stress on system resources. Four main aspects of the Linux kernel affect system response and execution time:
- CPU: the time spent processing data on the machine's CPU(s).
- Memory: the time spent reading and writing data in real memory.
- I/O: the time spent reading and writing data on disk storage.
- Networking: the time spent reading and writing data over the network.
  
The test designer should use the following two well-known open source Linux resource monitoring tools to evaluate resource utilization levels. (See the references later in this article for download links.)
- top: an open source tool maintained by Albert D. Cahalan. It is included in most Linux distributions and works with the current 2.4 and 2.6 kernels.
- sar: another open source tool, maintained by Sebastien Godard. It is also included in most Linux distributions and works with the current 2.4 and 2.6 kernels.
  
The system resource utilization evaluation phase usually takes several attempts before a suitable test combination reaches the desired utilization levels. Overuse of one resource is always a key concern when settling on a combination. For example, if the selected combination is too I/O bound, CPU utilization may suffer, and vice versa. This part of the method involves a good deal of trial and error until all resources reach the desired levels.
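
The trial-and-error loop described above amounts to checking, after each attempt, which resources still fall short of their goal. A minimal sketch of that check follows; the target percentages and the function name are assumptions made for illustration, since the article does not publish the levels it used.

```python
# Hypothetical target levels in percent; the article does not state its thresholds.
TARGETS = {"cpu": 90.0, "memory": 80.0, "io": 50.0, "network": 50.0}

def resources_below_target(measured, targets=TARGETS):
    """Return the names of resources whose measured utilization misses its target."""
    return sorted(name for name, goal in targets.items()
                  if measured.get(name, 0.0) < goal)

# An I/O-bound combination that leaves the CPU underused:
print(resources_below_target(
    {"cpu": 35.0, "memory": 85.0, "io": 97.0, "network": 60.0}))  # ['cpu']
```

An empty result would mean the combination has reached every target and can move on to the long evaluation run.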
  
The top tool can be used to quickly determine which resources (CPU, memory, or I/O) each test affects, and it displays in real time how much of each is used. The sar tool collects network utilization statistics over a period of time and records snapshots of all utilization data to a file.
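
For readers who want to see where a real-time CPU figure can come from, the sketch below computes a busy percentage from two samples of the aggregate `cpu` line in `/proc/stat`. It is a rough analogue, not a description of how top itself is implemented. Treating iowait as idle time is a choice made here, and the function name is hypothetical.

```python
def cpu_busy_percent(before, after):
    """Percent of CPU time spent non-idle between two 'cpu' lines from /proc/stat."""
    def totals(line):
        # Fields after the label: user nice system idle iowait irq softirq steal.
        # Later guest fields are skipped because they are already part of user time.
        fields = [int(x) for x in line.split()[1:9]]
        idle = fields[3] + (fields[4] if len(fields) > 4 else 0)  # idle + iowait
        return sum(fields), idle

    total0, idle0 = totals(before)
    total1, idle1 = totals(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0
    return 100.0 * (elapsed - (idle1 - idle0)) / elapsed
```

In use, the two arguments would be the first line of `/proc/stat` read a few seconds apart while a candidate test is running.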
  
Once a combination is selected, it must run long enough to evaluate resource utilization accurately. The run length depends on the length of the individual tests. If multiple tests run concurrently, the run must be long enough for the longest of them to complete. The sar tool should be running throughout this evaluation run. At the end of the run, collect and evaluate the utilization levels of all four resource types.
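
The sizing rule above can be sketched in a few lines: take the longest concurrent test as the minimum run length, then derive how many sar samples are needed to span it. The example relies on sar's `-o` option and its interval and count arguments; the file name, the two-minute interval, and the test lengths are assumptions for illustration.

```python
import math

def required_duration(test_lengths_s):
    """With tests running concurrently, the run must outlast the longest test."""
    return max(test_lengths_s)

def sar_command(duration_s, interval_s, outfile):
    """Build a sar invocation that samples for at least duration_s seconds."""
    count = math.ceil(duration_s / interval_s)
    return ["sar", "-o", outfile, str(interval_s), str(count)]

# Three hypothetical tests lasting 10, 90, and 30 minutes, sampled every 2 minutes:
print(sar_command(required_duration([600, 5400, 1800]), 120, "stress.sar"))
```

The saved file can later be read back with `sar -f stress.sar`.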
  
The following example shows the CPU, memory, and network usage of the sar output:
Listing 1. sar output example
  
10:48:27          CPU     %user     %nice   %system   %iowait     %idle
10:48:28          all      0.00      0.00      0.00      0.00    100.00
10:48:29          all      3.00      0.00      1.00      0.00     96.00
10:48:30          all    100.00      0.00      0.00      0.00      0.00
10:48:31          all    100.00      0.00      0.00      0.00      0.00

02:27:31    kbmemfree kbmemused  %memused kbswpfree kbswpused  %swpused
02:29:31       200948     53228     20.94    530104         -      0.00
02:31:31       199136     55040     21.65    530104         -      0.00
02:33:31       198824         -         -         -         -         -

                IFACE   rxpck/s   txpck/s   rxbyt/s   txbyt/s
02:29:31         eth0    738.79    741.66  76025.55 136941.85
02:31:31         eth0    743.30    744.97  76038.82 136907.77
02:33:31         eth0    744.80    745.02  76135.53 136901.38
02:35:31         eth0    742.35    744.34         -         -

(A dash marks a value that is missing from the source listing.)
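
To turn a report like Listing 1 into a single number for evaluation, the CPU rows can be averaged. The sketch below assumes the five-column CPU layout shown in the listing, with sample rows modeled on it; newer sar versions add columns and may print AM/PM timestamps, which this sketch does not handle. The function name is hypothetical.

```python
SAMPLE = """\
10:48:27 CPU %user %nice %system %iowait %idle
10:48:28 all 0.00 0.00 0.00 0.00 100.00
10:48:29 all 3.00 0.00 1.00 0.00 96.00
10:48:30 all 100.00 0.00 0.00 0.00 0.00
10:48:31 all 100.00 0.00 0.00 0.00 0.00
"""

def mean_cpu_busy(report):
    """Average of (100 - %idle) over the 'all' rows of a sar CPU report."""
    lines = report.strip().splitlines()
    idle_col = lines[0].split().index("%idle")  # locate the column by header name
    busy = [100.0 - float(row.split()[idle_col])
            for row in lines[1:] if row.split()[1] == "all"]
    return sum(busy) / len(busy)

print(mean_cpu_busy(SAMPLE))
```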
  
   Analyze kernel code coverage
Achieving adequate kernel coverage is the other requirement for a system stress test. Even if the selected test combination makes full use of the four main resources, it may exercise only a small part of the kernel. You should therefore analyze coverage to make sure that the combination is a true system stress test rather than just a system load generator. Currently, two open source tools help analyze Linux kernel code coverage:
- gcov: an open source tool maintained by the Linux Test Project. It analyzes kernel code coverage and reports which lines, functions, and branches are covered and how many times each is executed.
- lcov: another open source tool, developed by IBM and maintained by the Linux Test Project. It consists of a set of Perl scripts that build on the text-based gcov output to produce HTML-based output. The output includes coverage percentages, graphs, and overview pages so that you can browse the coverage data quickly.
You can find these tools on the Linux Test Project (LTP) home page (see references for links).
  
After the gcov module is loaded, every test in the system stress test combination must be executed. Although the original stress test may run its tests concurrently and in a loop, for coverage analysis each test should run once to completion, one after another, with none repeated. Running the tests singly and in sequence reduces the unpredictable, untargeted kernel code execution that results when the system tries to balance the load of multiple tests running at once. Perform the gcov analysis after the last test completes. Because the data must be formatted for analysis, the lcov tool is run with the gcov module loaded.
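
The coverage pass described above can be sketched as a small driver: run every test exactly once, in order, then capture and format the data. This is an illustration under stated assumptions, not the LTP team's script. The function names, `kernel.info`, and `coverage_html` are hypothetical; the lcov and genhtml options used are `--capture`, `--output-file`, and `--output-directory`.

```python
import subprocess

def run_once_in_sequence(test_commands, runner=subprocess.run):
    """Run each test command exactly once, one after another; return exit codes."""
    codes = []
    for cmd in test_commands:
        result = runner(cmd)  # blocks until this test finishes before starting the next
        codes.append(result.returncode)
    return codes

def coverage_commands(tracefile="kernel.info", html_dir="coverage_html"):
    """Commands to capture kernel coverage data with lcov and render it as HTML."""
    return [
        ["lcov", "--capture", "--output-file", tracefile],
        ["genhtml", "--output-directory", html_dir, tracefile],
    ]
```

A call such as `run_once_in_sequence([["./test_a"], ["./test_b"]])`, with placeholder test names, would be followed by passing each entry of `coverage_commands()` to `subprocess.run`.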
  
The lcov tool generates a complete HTML tree containing every line of code in the kernel, together with data (if any) on how many times each line was executed. The tool also quantifies the coverage data, producing a coverage percentage for each section of the kernel and each file. The following figure shows sample code coverage output:
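
The per-file percentage can be illustrated from lcov's intermediate tracefile, in which each record names a source file (`SF:`), lists per-line execution counts (`DA:`), and summarizes lines found and hit (`LF:`, `LH:`). The sketch below reads only the summary fields. The sample paths and counts are invented for illustration, and the function name is hypothetical.

```python
SAMPLE_TRACE = """\
SF:/usr/src/linux/fs/open.c
DA:10,4
DA:11,0
DA:12,7
DA:13,0
LF:4
LH:2
end_of_record
SF:/usr/src/linux/mm/mmap.c
LF:8
LH:6
end_of_record
"""

def line_coverage(tracefile_text):
    """Map each source file in an lcov tracefile to its line-coverage percentage."""
    coverage, current, found, hit = {}, None, 0, 0
    for line in tracefile_text.splitlines():
        line = line.strip()
        if line.startswith("SF:"):
            current, found, hit = line[3:], 0, 0
        elif line.startswith("LF:"):
            found = int(line[3:])
        elif line.startswith("LH:"):
            hit = int(line[3:])
        elif line == "end_of_record" and current is not None:
            coverage[current] = 100.0 * hit / found if found else 0.0
            current = None
    return coverage

print(line_coverage(SAMPLE_TRACE))
```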
  
   Figure 1. gcov output example
    
The threshold for "adequate coverage" (shown in green) is defined by the lcov maintainers, so the rating in this lcov example is only one assessment. However, the raw data is included, so any viewer can make his or her own judgment. After reviewing the coverage analysis, the test creator can modify the test combination to change and/or increase the amount of code covered.
  
   Evaluate the final stress test
The last step in the method is to validate the system stress test. Run the stress test on a kernel that is considered stable; the kernel shipped in a distribution usually meets this requirement, but not always. Run the stress test for an extended period (at least 24 hours is recommended) with the sar tool running at the same time, for the following reasons:
- A long run helps expose any problems in the combination that might go unnoticed in a short "sniff test."
- The data generated by sar forms a baseline for comparison during testing.
  
After the long run, you can decide from all the collected data whether the test combination is a suitable candidate for a system stress test.
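
One way to use the sar baseline is to express a later run as a percent change from it, metric by metric. The sketch below assumes the sar data has already been reduced to per-metric averages. The metric names, sample values, and function name are hypothetical.

```python
def deviation_from_baseline(baseline, run):
    """Percent change of each metric in run relative to the baseline run."""
    return {name: 100.0 * (run[name] - base) / base
            for name, base in baseline.items()
            if name in run and base}  # skip metrics that are missing or zero in the baseline

baseline = {"cpu_busy": 80.0, "rxpck_s": 740.0}   # averages from the stable-kernel run
candidate = {"cpu_busy": 60.0, "rxpck_s": 740.0}  # averages from a later run
print(deviation_from_baseline(baseline, candidate))
```

A large negative deviation in one metric would suggest that the later run did not stress that resource as hard as the baseline did.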
  
   Figure 2. design process Summary
  
The Linux Test Project used this design method when designing the Linux kernel stress test script ltpstress.sh. The script combines multiple tests from different areas of the LTP test suite with memory and network traffic load generators. Before execution, it adjusts total memory usage according to the amount of physical and virtual memory in the system. The test script is available in the LTP test suite (see references). To ensure accurate results, the script was created under controlled laboratory conditions.
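
The memory adjustment mentioned above can be pictured as reading the system's RAM and swap totals and scaling the load to them. The sketch below is not the logic of ltpstress.sh, which the article does not show. The 80 percent policy and the function names are assumptions; `MemTotal` and `SwapTotal` are the field names used in `/proc/meminfo`, and the sample figures are for illustration only.

```python
def memory_totals_kb(meminfo_text):
    """Extract MemTotal and SwapTotal (in kB) from /proc/meminfo-style text."""
    totals = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("MemTotal", "SwapTotal"):
            totals[key] = int(rest.split()[0])
    return totals["MemTotal"], totals["SwapTotal"]

def memory_load_kb(meminfo_text, percent=80):
    """Hypothetical policy: target a fixed percentage of RAM plus swap."""
    mem, swap = memory_totals_kb(meminfo_text)
    return (mem + swap) * percent // 100

SAMPLE_MEMINFO = "MemTotal:  254176 kB\nMemFree:  200948 kB\nSwapTotal:  530104 kB\n"
print(memory_load_kb(SAMPLE_MEMINFO))
```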
  
The IBM Linux Technology Center test team uses this stress test, along with other tools and tests, as a relatively quick and easy way to help confirm the stability of Linux kernel releases. To help ensure adequate coverage, tests are run both under lab conditions and in simulated user scenarios.
