Notes on a foreign-language article about optimizing Linux boot time

The translation is fairly rough; these are my own notes.

Boot time optimization

Alexandre Belloni, Michael Opdenacker, Free Electrons

Overview

The optimization work is divided into the following areas:

Principles

Measuring

User space

Kernel

Bootloader

Practical work: build the sample image, measure the boot time, optimize the startup scripts, optimize the kernel

Principles:

1. Reducing boot time starts with measuring it.

2. You need to choose reference points: where the measured interval starts and where it ends.

Some ideas for reducing boot time:

1. The fastest code is code that is never executed.

2. A large part of boot time is spent moving code and data from storage to RAM. Reading less code and data is faster, because I/O operations are expensive.

3. The larger the file system, the longer it takes to load.

4. So even code that never runs can lengthen boot time, because it still has to be read.

5. Storage types differ in speed; in general an SD card is faster than NAND flash.

6. Compiling with the GCC option -Os makes the code smaller, at the cost of some speed optimizations; this is another possible approach.

Getting to know the development board

Learning how to work with it

Measuring

1. The best instrument is an oscilloscope.

2. It can measure from power-on, which makes it a very accurate way of testing.

3. It is simple to write to a GPIO or to memory at chosen points during boot.

4. Some oscilloscopes are affordable.

5. Often, though, you do not want to use an oscilloscope or risk making hardware connections.

6. Usually boot timing information is reported over the serial port, which requires some software on the host.

7. The serial port output must be captured in real time, from early in the boot process.

8. Limitation: power-on time cannot be measured this way. You can, however, assume that the duration of the first boot stage stays constant.

9. The serial port is usually connected through a USB-to-serial adapter.

10. Going through a USB-to-serial adapter loses some timing precision.

11. Many development boards provide such a standard USB serial port.

Tim Bird's grabserial command: http://elinux.org/Grabserial

A Python script that adds timestamps to the output of the serial console.

Key advantage: it starts timing very early, covering the bootstrap and the bootloader.

Another advantage: it runs on the host machine, so it adds no overhead on the target.

Drawback: the precision is limited, and it cannot measure power-on time.
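The idea behind this kind of tool can be sketched in a few lines of Python. This is not grabserial's actual code, only a minimal illustration of the technique: each console line is prefixed with the time elapsed since the first line and since the previous line. The function name `timestamp_lines` and its `clock` parameter are assumptions made for this example.

```python
import time
from typing import Callable, Iterable, Iterator


def timestamp_lines(
    lines: Iterable[str],
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[str]:
    """Prefix each line with [time since first line, time since previous line]."""
    start = None
    previous = None
    for line in lines:
        now = clock()  # host-side timestamp, taken when the line arrives
        if start is None:
            start = previous = now
        yield f"[{now - start:9.6f} {now - previous:9.6f}] {line.rstrip()}"
        previous = now
```

Here `lines` can be any iterable that yields console lines as they arrive. Because the timestamps are taken on the host, they include the latency of the serial link, which is consistent with the limited precision mentioned above.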

The components of boot time

Init scripts

There are several ways to reduce the time between the start of the init scripts and the start of the application:

1. Start the application as early as possible, as soon as the conditions it requires are met.

2. Simplify the shell scripts.

3. Start the application together with init, or even before it.

If you need more detail than grabserial provides, you can use bootchart.

BusyBox includes bootchartd; enable it with CONFIG_BOOTCHARTD=y.

Boot your board with init=/sbin/bootchartd added to the kernel command line.

Copy /var/log/bootlog.tgz from the target board to your host for analysis.

Generate the timechart:

cd bootchart-<version>

java -jar bootchart.jar bootlog.tgz

Bootchart website: http://www.bootchart.org/

If your init system is systemd, you can use systemd-analyze; see

http://www.freedesktop.org/software/systemd/man/systemd-analyze.html

Start your application as soon as the services it depends on are running:

1. How this is done depends on your init program; SysV init scripts are assumed here.

2. Init scripts run in alphanumeric order, and their names start with a letter (S for start, K for kill).

3. Give your application the lowest number you can.

4. You can even replace init with your application.
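The effect of the script number can be illustrated with a small sketch. It models only the simple case where start scripts are run in sorted name order; real init implementations differ in detail, and the script names used in the usage note are examples chosen for illustration.

```python
def run_order(script_names):
    """Return the start scripts (S*) in the order a simple rcS-style loop would run them."""
    return sorted(name for name in script_names if name.startswith("S"))
```

With scripts named `S01logging`, `S20urandom`, `S40network` and `S99app`, the application runs last; renaming it to `S02app` moves it to second place, ahead of services it does not depend on.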

How can we start the application sooner?

Start all services from a single startup script, to eliminate most invocations of /bin/sh.

Use mdev instead of udev. mdev is part of BusyBox. It is not a daemon, so you need to load the drivers for hot-plugged devices manually.

Or remove udev entirely: if you only need device files, use devtmpfs (CONFIG_DEVTMPFS), which the kernel manages automatically and which uses fewer resources.

Avoid fork/exec where you can: these system calls are expensive, which is why calling executables from a shell is slow.

Even an echo in a BusyBox shell can result in a fork system call.

Select Shells -> Standalone shell in the BusyBox configuration, so that the shell calls BusyBox applets directly whenever possible.

Pipes and backquotes are also implemented with fork/exec. Rework them or use them less in your scripts.

Reduce size

1. Strip executables and libraries, removing the ELF sections that are only needed for development and debugging. The strip command is provided by the cross-compilation toolchain, and Buildroot applies it when creating the file system (BR2_STRIP_strip).

2. superstrip goes further, removing additional parts of an executable that are not needed to run it.

3. Use mklibs. The mklibs program shrinks shared libraries so that they contain only what a given set of executables needs. It is most useful with large shared libraries such as OpenGL and Qt. It works on binaries without access to the source code, so be careful not to cut too much.

Practical work: generate the file system with Buildroot

Practical work: measure boot time with bootchart

C library

glibc

License: LGPL

C library from the GNU project

Designed for performance, standards compliance and portability

Found on all GNU/Linux host systems

Of course, actively maintained

Its size is a concern for small embedded systems.

uClibc

License: LGPL

Lightweight C library for embedded systems

Highly configurable: many features can be enabled or disabled through a menuconfig interface

Works only with Linux/uClinux, on many embedded architectures

No stable ABI; the ABI differs depending on the library configuration

Emphasis is on size and performance

Short compile time

uClibc (2)

Comparison of several C libraries

In the comparison of several libraries, uClibc built with Thumb-2 produced the smallest binaries.

A good idea is to use a small initramfs containing just the minimum needed to start the application, then switch to the final file system with switch_root.

Use the smallest C library: switch to uClibc if your file system does not use it yet.

Link applications statically (BR2_PREFER_STATIC_LIB in Buildroot).

Don't compress your initramfs if your kernel image is already compressed.

Practical work: simplify the user space scripts

Application

strace lets you trace all the system calls made by an application and its child processes.

It is useful to:

1. Find out where time is spent.

2. For example, see how much time goes into opening files (open()), file I/O (read(), write()) and memory allocation, without needing the source code.

3. Find the biggest time consumers.

4. Find unnecessary work in applications and scripts, for example opening the same file several times, or trying to open files that do not exist.

Limitation: you cannot trace the init process.

Note: strace is commonly used to trace the system calls a process makes and the signals it receives. On Linux, a process cannot access hardware devices directly; when it needs to (for example to read a file from disk or receive network data), it must switch from user mode to kernel mode through a system call. strace can trace the system calls made by a process, including their arguments, return values and execution time.

You do not need to compile strace for the target board yourself.

An easier option: download a ready-made static binary to your target board whenever you need it:

http://git.free-electrons.com/users/michael-opdenacker/static-binaries/tree/strace

Recommended command:

strace -f -tt -o strace.log <program> <arguments>

-f: trace child processes

-tt: display timestamps with microsecond precision
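A log produced with these options can be post-processed to see where the time goes. The following Python sketch is one possible way to do it, not a tool from the original article: for each process it computes the gap between consecutive log entries and lists the largest ones. The function names are made up for this example, and the parser assumes the usual line layout of an optional PID, an HH:MM:SS.microseconds timestamp, then the call.

```python
import re

# Optional PID, then HH:MM:SS.uuuuuu, then the system call text.
LINE_RE = re.compile(r"^(?:(\d+)\s+)?(\d{2}):(\d{2}):(\d{2})\.(\d{6})\s+(.*)$")


def parse_line(line):
    """Return (pid, timestamp in microseconds, call text), or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    pid, hours, minutes, seconds, micros, text = match.groups()
    stamp = ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1_000_000 + int(micros)
    return (int(pid) if pid else 0, stamp, text)


def largest_gaps(lines, top=5):
    """Return the `top` largest (gap_us, pid, call) tuples, measured per process.

    The gap is the time between an entry and the next entry of the same PID,
    so it covers the call itself plus any user space work that followed it.
    Traces that cross midnight are not handled.
    """
    last_seen = {}
    gaps = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        pid, stamp, text = parsed
        if pid in last_seen:
            previous_stamp, previous_text = last_seen[pid]
            gaps.append((stamp - previous_stamp, pid, previous_text))
        last_seen[pid] = (stamp, text)
    gaps.sort(key=lambda gap: gap[0], reverse=True)
    return gaps[:top]
```

Typical use would be `largest_gaps(open("strace.log"))`. The entries at the top of the list are the first places to look for slow file accesses or unnecessary work.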

OProfile has two modes of operation: legacy mode and perf_events mode

Legacy mode: low accuracy, implemented with a kernel driver

CONFIG_OPROFILE

User space tools: opcontrol and oprofiled

perf_events mode: uses hardware performance counters

CONFIG_PERF_EVENTS and CONFIG_HW_PERF_EVENTS

User space tool: operf

perf also uses hardware performance counters

CONFIG_PERF_EVENTS and CONFIG_HW_PERF_EVENTS

User space tool: perf. It is part of the kernel sources, so it is kept in sync with the kernel.

Usage: perf record /my/command

Get the results with: perf report

Toolchain optimization:

Use the best-optimized toolchain available for your project, and benchmark your whole system with it.

With GCC 4.6.3: 9.63 s

With GCC 4.7.3: 9.26 s

These results depend heavily on the size and nature of the file system. The benchmark is only representative when kernel features and the file system do not take up too much of the time. Note: build the file system with uClibc rather than glibc/eglibc; the resulting file system sizes are very different ...

Grouping the application code used at startup:

Find the functions called during startup, for example with the -finstrument-functions gcc option.

Create a custom linker script that reorders these functions according to the call order. You can do this by placing each function in its own section with the -ffunction-sections gcc option.

This is especially useful with storage that is read in large blocks, such as MTD flash: because whole blocks are read, grouping the startup code avoids reading blocks full of unneeded data.

Prelinking:

Prelinking reduces the time spent on dynamic linking each time an executable starts.

This is often used on Android.

It works by assigning known load addresses to the shared libraries in advance, so that library addresses and symbols do not have to be resolved every time the application starts.

Beware of the security implications: libraries end up at fixed, predictable addresses.

Code at http://people.redhat.com/jakub/prelink/

Practical work: trace and profile the main application with strace

Kernel optimization

To find out which kernel initialization functions take the longest, add initcall_debug to the kernel command line.

It is a good idea to increase the log buffer size with CONFIG_LOG_BUF_SHIFT in the kernel configuration. You also need CONFIG_PRINTK_TIME and CONFIG_KALLSYMS.
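With initcall_debug enabled, the kernel log contains one "initcall ... returned ... after N usecs" line per initialization function. A short Python sketch like the one below can rank them; it is an illustration written for these notes, and the function name `slowest_initcalls` is an assumption.

```python
import re

# Matches e.g. "initcall foo_init+0x0/0x20 returned 0 after 488 usecs"
INITCALL_RE = re.compile(r"initcall ([\w.]+)\+\S+.*? returned (-?\d+) after (\d+) usecs")


def slowest_initcalls(dmesg_lines, top=10):
    """Return the `top` slowest initcalls as (duration_us, function_name), slowest first."""
    results = []
    for line in dmesg_lines:
        match = INITCALL_RE.search(line)
        if match:
            results.append((int(match.group(3)), match.group(1)))
    results.sort(reverse=True)
    return results[:top]
```

The input can be the saved output of dmesg, read line by line. The functions at the top of the list are the candidates for being built as modules, deferred or removed.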

First of all, the most important thing is to reduce boot time without removing features.

1. The main technique is to use kernel modules.

2. Compile everything that is not needed at boot time as a module.

3. Two benefits: the kernel is smaller and loads faster, and less initialization code is executed.

4. Remove features that are not needed in production: CONFIG_KALLSYMS, CONFIG_DEBUG_FS, CONFIG_BUG

5. Enable options designed for embedded systems: CONFIG_SLOB, CONFIG_EMBEDDED

Kernel compression is a trade-off between storage read speed and CPU decompression speed; you need to benchmark the different compression methods to find the fastest one for your system.
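The trade-off can be written as: load time = image size / storage read speed + decompression time. The Python sketch below only illustrates this arithmetic; the function names are made up, and every size, speed and decompression time in the usage note is a hypothetical figure, not a measurement. Real values have to be measured on the target.

```python
def load_time(size_bytes, read_bytes_per_s, decompress_s):
    """Estimated time to read an image from storage and decompress it, in seconds."""
    return size_bytes / read_bytes_per_s + decompress_s


def fastest(candidates, read_bytes_per_s):
    """Pick the candidate with the lowest estimated load time.

    `candidates` maps a name to (image size in bytes, decompression time in seconds).
    """
    return min(
        candidates,
        key=lambda name: load_time(candidates[name][0], read_bytes_per_s, candidates[name][1]),
    )
```

For example, with hypothetical candidates `{"none": (8_000_000, 0.0), "lzo": (4_000_000, 0.15), "xz": (2_500_000, 0.90)}`, a storage speed of 5 MB/s favors "lzo", very slow storage at 1 MB/s favors "xz", and fast storage at 50 MB/s favors no compression at all. This is why the best choice depends on the board.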

If a feature cannot be compiled as a module, try deferred initcalls. The kernel does not get smaller, but some initialization is postponed: once your application is running, you trigger the deferred initcalls. Website:

http://elinux.org/Deferred_Initcalls

Tuning the kernel command line

At each boot, the kernel calibrates the delay loop (used by the udelay function). This measures the loops-per-jiffy (lpj) value, which only needs to be measured once. Find the lpj value in the boot log.

You can then pass this value directly with the lpj parameter. About 180 ms can be saved this way.

It is added directly to the kernel command line in U-Boot, as follows:

console=ttyS0,115200 mtdparts=atmel_nand:256k(bootstrap)ro,512k(uboot)ro,256k(env),256k(env_redundant),256k(spare),512k(dtb),6M(kernel)ro,-(rootfs) rootfstype=ubifs ubi.mtd=7 root=ubi0:rootfs rw lpj=1314816

In my tests, this did have an effect.
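Finding the lpj value and adding it to the command line can be sketched in Python as follows. This is a minimal illustration with made-up helper names (`find_lpj`, `with_lpj`); it assumes the boot log contains the kernel's usual "Calibrating delay loop... (lpj=N)" message.

```python
import re

LPJ_RE = re.compile(r"\(lpj=(\d+)\)")


def find_lpj(boot_log_lines):
    """Return the first lpj value reported in the boot log, or None if there is none."""
    for line in boot_log_lines:
        match = LPJ_RE.search(line)
        if match:
            return int(match.group(1))
    return None


def with_lpj(cmdline, lpj):
    """Return the kernel command line with any existing lpj= replaced by the given value."""
    args = [arg for arg in cmdline.split() if not arg.startswith("lpj=")]
    args.append(f"lpj={lpj}")
    return " ".join(args)
```

The value is specific to the hardware and kernel configuration it was measured on, so it should be measured again whenever the board, the clock settings or the kernel change.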

Console output takes up part of the boot time, and you usually don't need it in a product. It can be disabled by passing the quiet parameter on the kernel command line. You can still view the messages with the dmesg command.

Multiprocessor support (SMP)

SMP is fairly slow to initialize.

A uniprocessor (UP) system boots faster.

You can try to hotplug the other cores after your application has started.

Practical work: reduce kernel boot time

Recompile the kernel, selecting the initramfs

Use initcall_debug to find out where the most time is spent

Cut down on the features responsible

Adjust the kernel command-line arguments

Kernel: the last few milliseconds

To optimize the last few milliseconds, you need to remove unnecessary functionality.

CONFIG_PRINTK=n has the same effect as the quiet parameter.

Try CONFIG_CC_OPTIMIZE_FOR_SIZE=y. This can affect system performance, so benchmark and compare.

Try reducing the amount of RAM that is initialized, using the mem parameter; initializing less RAM reduces boot time.

Other features to consider removing or simplifying:

Module loading and unloading

Block layer

Network stack

USB stack

Support for the a.out format

Power management

CONFIG_SYSFS_DEPRECATED

Input: keyboard, mice, microphone, touch screen

CONFIG_LEGACY_PTY_COUNT, or the pty.legacy_count kernel parameter

Bootloader optimization

Typically, some bootloader features are needed only during development.

1. Several different bootloaders may be available for your development board. Give them a try.

2. Assess whether these features are really needed. Do you need the upgrade feature?

3. Remove the boot delay. In U-Boot this is configured with CONFIG_ZERO_BOOTDELAY_CHECK, which still lets you interrupt the boot to get a shell; see doc/README.autoboot.

4. Maybe you can skip the bootloader altogether. AT91 example: http://free-electrons.com/blog/starting-linux-directly-from-at91bootstrap3/

Now, let's cut down some of the bootloader features.

We can use an I/O pin to keep a fallback boot path. For example, use the gpio_direction_input and gpio_get_value commands in the boot script to start a rescue kernel for upgrades.

Note: this makes no real change to the kernel, but with this way of booting the kernel we cannot get exact timings through the serial port.

Warning: the kernel sometimes relies on the bootloader to initialize hardware, so be careful when removing these features.

You can try booting the Linux kernel directly from at91bootstrap, which removes the second-stage bootloader. However, you lose one advantage of Barebox: it uses the CPU caches while loading and decompressing the kernel.

Booting the kernel is easy with at91bootstrap3. You just have to configure it with one of the linux or linux_dt configurations:

make at91sama5d3xeknf_linux_dt_defconfig

make

For details, see:

http://free-electrons.com/blog/at91bootstrap-linux/

Practical work: reduce boot time with Barebox

Optimize Barebox

Hardware initialization

The hardware needs time to initialize:

1. The voltages and the crystal oscillator need to stabilize.

2. Stabilizing takes about 200 ms.

3. As a software engineer, there is nothing you can do about this time.

4. All you can do is measure it and ask the hardware engineers whether it can be improved. On top of this comes the CPU's own startup time, which cannot be shortened.

Alternatives

These are mostly used in mobile phones and PCs. They are not acceptable when the device stays idle most of the time or for long periods.

Also called suspend to disk.

Use
