Linux acceleration and extreme startup time optimization

Source: Internet
Author: User
After I finished trimming Linux for our embedded application last time, booting still took about 7 s. That is barely acceptable, but it falls short of my goal of under 2 s. Moreover, in a real commercial environment the device reliability requirement is "five nines" (99.999%, i.e. about 5 minutes of out-of-service (OOS) time per year), so every second shaved off the Linux startup (device reset) time noticeably improves reliability.
Back to the point: how can we optimize the Linux startup time?

CELF (the Consumer Electronics Linux Forum) points us in one direction.

(1) First, trace and analyze the Linux startup process and generate a detailed startup time report.

A simple and practical method is to use the printk-times feature to add a timestamp to every kernel message during startup, which makes summarizing and analyzing easy. Printk-times was one of the earliest kernel patches provided by CELF and was formally merged into the mainline kernel in 2.6.11, so on newer kernels you can enable it directly. If for some reason your kernel cannot be updated to 2.6.11 or later, you can follow the method described by CELF to modify it yourself, or download the patch they provide: http://tree.celinuxforum.org/CelfPubWiki/PrintkTimes

The simplest way to enable printk-times is to add "time" to the kernel boot parameters. Alternatively, you can select "Show timing information on printks" under "Kernel hacking" when configuring the kernel, which forces a timestamp onto every kernel message at each boot. The second method has an additional advantage: you also get timestamps for all kernel messages printed before the boot parameters are parsed. I therefore chose the second method.

After completing the preceding configuration, restart Linux and run the following command to output the kernel startup information to the file:

dmesg -s 131072 > ktime

Then, use the script "show_delta" (located in the scripts folder of the Linux Source Code) to convert the output file to the time increment display format:

/usr/src/linux-x.xx.xx/scripts/show_delta ktime > dtime

This gives you a detailed report of where the Linux startup time goes.
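
As a rough illustration of what this conversion does, the following Python sketch computes the increment between consecutive timestamped lines. It is not the show_delta script itself: the function names and the sample log lines are made up for illustration, and it assumes every relevant line starts with a "[seconds.microseconds]" stamp.

```python
import re

# Matches the "[    1.234567] message" prefix that printk timestamps produce.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")

def parse_line(line):
    """Return (timestamp, message), or None if the line has no timestamp."""
    m = STAMP.match(line)
    if m is None:
        return None
    return float(m.group(1)), m.group(2)

def deltas(lines):
    """Yield (timestamp, delta_since_previous_line, message) tuples."""
    previous = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip lines that carry no timestamp
        stamp, message = parsed
        delta = 0.0 if previous is None else stamp - previous
        previous = stamp
        yield stamp, delta, message

# Hypothetical sample input, not taken from a real boot log.
SAMPLE = [
    "[    0.000000] first message",
    "[    0.250000] second message",
    "[    1.000000] third message",
]

if __name__ == "__main__":
    for stamp, delta, message in deltas(SAMPLE):
        print(f"[{stamp:10.6f} < {delta:9.6f} >] {message}")
```

Each output line carries the absolute time and the increment since the previous line, which is the number to look at when hunting for slow steps.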

(2) Next, use this report to find the relatively time-consuming stages of startup.

Be aware that the time increments in the report do not necessarily correspond to the kernel messages next to them. The real time consumption has to be determined from the kernel source code.

This is not hard to understand for anyone with a little programming experience, because a time increment is simply the time between two calls to printk. Usually, time-consuming tasks during kernel startup, such as building hash tables or probing hardware devices, print a message through printk when they finish; in that case the increment reflects the time consumed by the step the message describes. In some cases, however, the kernel prints the message first and only then starts the work, so the time consumed by that step appears as the increment on the next line of the report. In still other cases the time is spent somewhere between two messages by code that prints nothing, so the increment cannot be attributed to any kernel message at all.

Therefore, to determine the real time consumption accurately, we have to analyze the report together with the kernel source code. When necessary, as in the third case above, you have to insert your own printk calls into the source to pin down where the time actually goes.
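
Because a large increment may belong to the message before it, the message after it, or to neither, it can help to list the biggest gaps together with both neighbouring messages before opening the source. A minimal Python sketch, assuming the log has already been parsed into (timestamp, message) pairs; the helper name and the sample data are hypothetical.

```python
def largest_gaps(entries, top=3):
    """entries: list of (timestamp_seconds, message) in log order.

    Returns up to `top` tuples (gap, message_before, message_after),
    largest gap first. The time may have been spent by the code that
    printed either message, or by code that printed nothing at all.
    """
    gaps = []
    for (t0, before), (t1, after) in zip(entries, entries[1:]):
        gaps.append((t1 - t0, before, after))
    gaps.sort(key=lambda g: g[0], reverse=True)
    return gaps[:top]

# Hypothetical parsed log, for illustration only.
ENTRIES = [
    (0.00, "message A"),
    (0.10, "message B"),
    (0.90, "message C"),
    (1.00, "message D"),
]

if __name__ == "__main__":
    for gap, before, after in largest_gaps(ENTRIES, top=2):
        print(f"{gap:.3f} s between {before!r} and {after!r}")
```

The sketch only narrows down where to look; deciding which side of the gap actually consumed the time still requires reading the kernel source.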

The startup analysis of the Linux kernel after my last round of trimming is as follows:

Total kernel startup time: 6.188 s

Key time-consuming parts:
1) 0.652 s - initialization of core components such as timer, IRQ, cache, and mem pages
2) 0.611 s - synchronizing the kernel with the RTC clock
3) 0.328 s - calibrating delay (total for 4 CPU cores)
4) 0.144 s - calibrating the APIC clock
5) 0.312 s - calibrating migration cost
6) 3.520 s - Intel e1000 NIC initialization
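
To put these numbers in proportion, here is a small Python sketch that computes each item's share of the total. The figures are copied from the list above; the short item labels and the helper name are my own.

```python
TOTAL = 6.188  # total kernel startup time in seconds, from the report above

# Seconds per item, copied from the list above.
ITEMS = {
    "core components": 0.652,
    "RTC synchronization": 0.611,
    "calibrating delay": 0.328,
    "APIC clock calibration": 0.144,
    "migration cost calibration": 0.312,
    "e1000 NIC initialization": 3.520,
}

def share(seconds, total=TOTAL):
    """Return the percentage of the total startup time."""
    return 100.0 * seconds / total

if __name__ == "__main__":
    for name, seconds in ITEMS.items():
        print(f"{name:28s} {seconds:6.3f} s  {share(seconds):5.1f} %")
    listed = sum(ITEMS.values())
    print(f"{'listed items together':28s} {listed:6.3f} s  {share(listed):5.1f} %")
```

The six items add up to 5.567 s of the 6.188 s total, and the NIC initialization alone accounts for more than half of the startup time.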

Next, we will analyze and resolve the above parts one by one.

(3) Next, perform the specific optimizations.

CELF has already proposed a fairly complete set of startup optimizations for embedded Linux in consumer electronics products. Because our application is different, however, we can only borrow from their experience and must analyze and experiment with our own specific problems.

For the initialization of the key kernel components (timer, IRQ, cache, mem pages, ...), there is currently no reliable and practical optimization.

For items 2 and 3 in the analysis above, CELF has dedicated optimizations: "RTCNoSync" and "PresetLPJ".

The former removes RTC clock synchronization from startup, or defers it until after startup (depending on how accurate the clock must be for the specific application). It is easy to implement, but the kernel must be patched. As far as I can tell, CELF's current patch only removes the synchronization rather than deferring it as described. For this reason I have not adopted this optimization for now (after all, the clock drift it introduces is on the order of seconds), but I will keep an eye on it.

The latter skips the actual calibration by specifying the lpj (loops per jiffy) value in the boot parameters, relying on the fact that lpj does not change as long as the hardware does not change. So, after a normal boot, record the lpj value from the "Calibrating delay" kernel message, and then specify it in the boot parameters in the following form:

lpj=9600700
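
Recording the value by hand is easy to get wrong, so a small script can pull it out of the saved boot log. This Python sketch assumes the "Calibrating delay" message ends with "(lpj=N)"; the sample line and its number are placeholders rather than measured values, and the function name is hypothetical.

```python
import re

# Looks for the "(lpj=N)" suffix on the calibrating delay message.
LPJ = re.compile(r"Calibrating delay.*\(lpj=(\d+)\)")

def lpj_parameter(log_text):
    """Return an 'lpj=<n>' boot parameter from the log, or None if absent."""
    m = LPJ.search(log_text)
    return None if m is None else f"lpj={m.group(1)}"

# Hypothetical log line; the number is a placeholder, not a measured value.
SAMPLE = "Calibrating delay loop... 2000.00 BogoMIPS (lpj=1000000)"

if __name__ == "__main__":
    print(lpj_parameter(SAMPLE))
```

The value must be taken from a boot on the same hardware; a preset copied from a different board would give wrong delay loops.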

Items 4 and 5 in the analysis above are part of SMP initialization, so they fall outside the scope of CELF's work (maybe multi-core MP4 players will appear some day?), and we are on our own. After studying the SMP initialization code, I found that the migration cost, like the calibrating delay, can be preset to skip the calibration. The method is similar; add the following to the kernel boot parameters:

migration_cost=4000,4000
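
Both presets end up on the same kernel command line. As a sketch of how they might be appended without duplicating a parameter that is already there, consider the following Python helper. The base command line, the lpj placeholder, and the function name are all assumptions for illustration.

```python
def add_presets(cmdline, presets):
    """Append 'name=value' presets to a kernel command line string.

    Parameters already present on the command line are left untouched.
    """
    present = {word.split("=", 1)[0] for word in cmdline.split()}
    extra = [f"{name}={value}" for name, value in presets.items()
             if name not in present]
    return " ".join(cmdline.split() + extra)

# Hypothetical base command line; the lpj value is a placeholder to be
# replaced with the one recorded from your own boot log.
BASE = "root=/dev/ram0 console=ttyS0"
PRESETS = {"lpj": "1000000", "migration_cost": "4000,4000"}

if __name__ == "__main__":
    print(add_presets(BASE, PRESETS))
```

Where the resulting string goes depends on the boot loader in use, which the sketch does not cover.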

The Intel NIC driver initialization, however, is harder to optimize. Although the driver is open source, reading a hardware driver is nothing like reading ordinary C code, and an "optimization" based on such a superficial understanding would be hard to maintain. With reliability in mind, I gave up on this path after two failed attempts. Looking at it from another angle, we can borrow the "parallel initialization" idea from CELF's "ParallelRCScripts" proposal: build the NIC driver as a separate module and load it from the initialization script in parallel with other modules and applications, so that the blocking probe no longer affects the startup time. Because application initialization may also need the network, and in our actual hardware environment only eth0 is available, we still have to count the roughly 0.3 s it takes to initialize the first network port.
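
The effect of running a blocking step in parallel can be illustrated with a short Python sketch. It is only a model of the idea, not an init script: the two task functions are stand-ins that sleep for made-up durations, and in a real system the NIC step would be the module load started in the background.

```python
import threading
import time

def run_parallel(tasks):
    """Run each callable in its own thread and wait for all of them.

    Returns the wall-clock time in seconds for the whole batch.
    """
    start = time.monotonic()
    threads = [threading.Thread(target=task) for task in tasks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start

# Stand-ins for real work; the durations are made up for illustration.
def load_nic_driver():
    time.sleep(0.3)   # pretend this is the blocking NIC probe

def start_middleware():
    time.sleep(0.2)   # pretend this is other initialization

if __name__ == "__main__":
    elapsed = run_parallel([load_nic_driver, start_middleware])
    print(f"parallel: {elapsed:.2f} s, serial would be about 0.50 s")
```

The batch takes roughly as long as its slowest task instead of the sum of all tasks, which is why moving the probe out of the serial kernel startup pays off.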

Besides the optimizations used in my solution above, CELF also proposes some more specialized optimizations that may interest you, such as:

ShortIDEDelays - shorten the IDE probe delays (my application has no hard disk, so I could not use it)
KernelXIP - run the kernel directly from ROM or flash (not used, for compatibility reasons)
IDENoProbe - skip IDE ports that have no device connected
OptimizeRCScripts - optimize the linuxrc script in the initrd (I used the more concise linuxrc script from busybox)

There are also other optimizations that are still at the proposal stage. If you are interested, visit the CELF developer wiki to learn more.

(4) Optimization results

After these specific optimizations, plus trimming redundant entries from inittab and the rcS script, the Linux kernel startup time dropped from 6.188 s to 2.016 s. Excluding eth0 initialization it is only 1.708 s (eth0 initialization can run in parallel with system middleware and some applications), which essentially meets the goal. Combined with kexec, this can greatly reduce the reset time after software faults and effectively improve product reliability.

If you have any suggestions or questions about kernel startup time optimization, you are welcome to discuss them with me. :)
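
The arithmetic behind these results, and behind the "five nines" target mentioned at the start, can be checked with a few lines of Python. The timings are the ones reported above; the helper names are my own, and the yearly budget assumes a 365-day year.

```python
BEFORE = 6.188         # seconds, before optimization
AFTER = 2.016          # seconds, after optimization
AFTER_NO_ETH0 = 1.708  # seconds, excluding eth0 initialization

def reduction_percent(before, after):
    """Percentage of startup time removed."""
    return 100.0 * (before - after) / before

def yearly_downtime_budget_seconds(availability):
    """Allowed out-of-service seconds per 365-day year."""
    return (1.0 - availability) * 365 * 24 * 3600

if __name__ == "__main__":
    saved = BEFORE - AFTER
    print(f"saved: {saved:.3f} s ({reduction_percent(BEFORE, AFTER):.1f} %)")
    print(f"eth0 initialization: {AFTER - AFTER_NO_ETH0:.3f} s")
    budget = yearly_downtime_budget_seconds(0.99999)
    print(f"five-nines budget: {budget:.0f} s per year")
```

The optimization saves about 4.2 s per boot, roughly two thirds of the original time, against a yearly out-of-service budget of about 315 s.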
