Embedded Linux System Downsizing: Methods and Examples (Rev #2)
Source PDF: Linux Embedded System Introduction: Methods and Practices
Authors: Liang Yuanbiao, Lin Yingda. Compiled by Liu Jianwen (http://blog.csdn.net/keminlau)
Key: Linux, embedded, embedded operating systems
Department of Information Science, National Chiao Tung University
1001 University Road, Hsinchu 300
Tel: 03-5712121 ext. 56667 Fax: 03-5712121 ext. 59263
{upleong, ydlin}@cis.nctu.edu.tw
Contact: Liang Yuanbiao
Summary
Open source software is developing rapidly and gradually moving into the embedded field; the combination of the two will be an extremely important part of the post-PC era. First, we discuss how to use Linux as an embedded operating system: by reducing the size of its four main parts, the kernel, daemons, libraries, and applications/utilities, it can be deployed on an embedded system that uses flash memory as its storage device. Next, we use a security gateway [1] that provides VPN, firewall, intrusion detection system (IDS), and other functions as an example of how to build a complete embedded Linux system. Finally, we reduced a security gateway with all of these functions from a Red Hat system (about 1.3 Gbytes) plus additional RPM packages (about 15 Mbytes) to 19 Mbytes. Compressed to 7 Mbytes, it fits in a system with 64 Mbytes of memory and 8 Mbytes of flash memory.
Keywords: embedded, Linux, downsizing
1. Background
Embedded systems are already part of our daily lives. Digital cameras, personal digital assistants, and the ADSL or cable modems used for broadband Internet access all depend on program control. These examples share some traits: each has a specific function, is stable, and is widely used. This is the strength of the embedded system: it uses few resources, is highly stable, and is inexpensive. With these advantages, embedded systems will become an important product technology in the post-PC era. Table 1 lists the differences in resources and functions between general PCs and embedded systems. First, embedded systems use flash memory for storage, which is more durable than a hard disk but has little capacity. Second, unnecessary devices such as VGA are removed. Most importantly, an embedded system has a specific function, so it can be optimized and tested easily, which greatly improves the stability of its software.
For network devices, completeness of function, stability, and interoperability are critical. Before open-source software took off, most of the market was held by commercial software; with the rise of Linux and similar operating systems, open source has become one of the best choices in networking. Table 2 gives a simple comparison between Linux and the operating systems offered by various vendors. Commercial software is expensive and charges royalties. In addition, software from different vendors is mostly incompatible, and the operating systems are closed, so general-purpose programs are not supported and users depend heavily on what the vendor provides. Embedded Linux provides source code, is compatible with common Unix applications, and is free to use; all of these advantages threaten the long-established systems. However, because it is still in its infancy, current embedded Linux is not yet a very streamlined system. Making good use of resources, integrating general Unix programs, and trimming the system to fit embedded applications therefore become the main goals.
Next, we look at the highly flexible Linux system and the main parts of it that can be trimmed. We then analyze the methods and development tools used to reduce it to an embedded version. Finally, we take the security gateway as an example, introduce its functional specification and the software packages it uses, and present the results of applying the reduction process and porting it to an embedded system.
2. Parts that can be trimmed
As Linux software keeps developing and more and more applications accumulate, there are often several programs to choose from for the same function. What we need is a streamlined system, so we must cut it down. We first divide the whole system into four parts: the kernel, daemons, libraries, and applications and utilities. Our guiding principle is never to delete code directly, so that the source code stays intact. Table 3 lists the direction and goal of the reduction for each subsystem.
2.1 Kernel
Because the functional specification of an embedded system is clearly defined, only the necessary options or modules need to be kept in the kernel configuration; everything else can be discarded. Another method is substitution. On a network device, for example, console output can go to the serial port instead of a VGA display. This saves hardware cost and also saves 42 Kbytes in the Linux 2.4 kernel. In addition, drivers for the parallel port, plug-and-play, floppy drive, CD-ROM drive, keyboard, mouse, USB, and similar devices can be omitted.
2.2 Daemons
Thanks to open source, programs keep gaining features and support for more kinds of software and hardware. With limited resources, however, you must decide which functions each program really needs instead of installing ready-to-run software packages. For example, if squid is used only as an HTTP proxy and its caching function is not needed, support for the cache file system can be turned off at compile time with squid's configuration program (option --enable-storeio=null). For Squid 2.4.STABLE1, this shrinks the resulting program by as much as 26%. Other network programs such as GNU Zebra can likewise save space by turning off IPv6 support.
2.3 Library
Static and dynamic linking give a program different characteristics. Static linking reduces run-time overhead and simplifies the program. Dynamic linking saves more and more space as the number of programs sharing a library grows. You must therefore strike a balance between the two based on your needs. If only one program uses a library, static linking is a reasonable choice. With dynamic linking, the ldd utility shows the dependencies between a program and its libraries. In this way you can find and keep the minimum set of shared libraries the system requires.
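As a rough illustration of the last step above, the following Python sketch collects the union of shared libraries reported by ldd for a list of programs. The function names are hypothetical, and the parser assumes the common ldd line layout `name => path (address)`; lines in any other form, such as unresolved libraries, are skipped and would need separate handling.

```python
import re
import subprocess

# Assumed ldd line layout: "libc.so.6 => /lib/libc.so.6 (0x40020000)".
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")


def parse_ldd(output):
    """Return {library name: resolved path} from the text printed by ldd."""
    libs = {}
    for line in output.splitlines():
        match = _LDD_LINE.match(line)
        if match:
            libs[match.group(1)] = match.group(2)
    return libs


def minimal_library_set(ldd_outputs):
    """Union of the libraries needed by every program in the system."""
    needed = {}
    for output in ldd_outputs:
        needed.update(parse_ldd(output))
    return needed


def run_ldd(program):
    """Run ldd on one program and return its raw text output."""
    result = subprocess.run(["ldd", program], capture_output=True, text=True)
    return result.stdout
```

A typical use would be `minimal_library_set(run_ldd(p) for p in programs)`, where `programs` is the list of executables that will be installed on the target.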
2.4 Applications and utilities
This part offers the most flexibility, because there are many such programs to choose from and many ways to replace them. For example, with /proc support you can read or modify many system parameters without cumbersome or rarely used tools. To read the ARP table, the command # cat /proc/net/arp is enough, which makes the /sbin/arp program unnecessary. Furthermore, the users of an embedded system are not general PC users, so the program interface, online help, and even program functions can be simplified. BusyBox [2] is a representative set of such utilities. A single small program provides common Unix functions such as file tools, a shell, text processing, and compression. Table 4 lists the functions of BusyBox and TinyLogin [3] and their sizes relative to the original utilities. Making full use of tools designed specifically for embedded systems is clearly an effective way to save space without losing function.
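To illustrate the /proc approach described above, here is a minimal Python sketch that reads the ARP table without calling /sbin/arp. The function names are hypothetical, and the six-column layout (IP address, HW type, Flags, HW address, Mask, Device) is an assumption about the file format that should be checked on the target kernel.

```python
def parse_arp_table(text):
    """Parse the text of /proc/net/arp into a list of dictionaries."""
    entries = []
    for line in text.splitlines()[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) != 6:  # skip anything that does not match the assumed layout
            continue
        ip, hw_type, flags, mac, mask, device = fields
        entries.append({"ip": ip, "mac": mac, "device": device})
    return entries


def read_arp_table(path="/proc/net/arp"):
    """Read and parse the ARP table of the running system."""
    with open(path) as arp_file:
        return parse_arp_table(arp_file.read())
```

Keeping the parser separate from the file access makes it possible to test the parsing on the development platform with captured text.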
2.5 Debug information and symbol table
After the debugging stage ends, all debugging information in the programs can be deleted. Because it exists to support source-level debugging, it takes up a lot of space; the strip utility removes it. For example, the snort daemon (IDS) and the squid daemon shrink from 969 K and 670 K to 307 K and 419 K respectively. We can also scan all executables to find the symbols in the shared libraries that are never used and remove them from the libraries with strip, reducing the system further.
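The figures above correspond to reductions of roughly 68.3% for snort and 37.5% for squid. The short Python sketch below shows the arithmetic and, as an assumption about how one might script the step, a hypothetical helper that writes a stripped copy of a file with GNU strip and reports both sizes.

```python
import os
import subprocess


def reduction_percent(before, after):
    """Percentage of the original size that was removed."""
    return 100.0 * (before - after) / before


def strip_copy(src, dst, strip_cmd="strip"):
    """Write a stripped copy of src to dst; return (old_size, new_size) in bytes.

    strip_cmd is an assumption: on a cross-development host it would be the
    strip program of the cross tool chain rather than the native one.
    """
    subprocess.run([strip_cmd, "-o", dst, src], check=True)
    return os.path.getsize(src), os.path.getsize(dst)


# Sizes quoted in the text, in Kbytes.
print("snort: %.1f%%" % reduction_percent(969, 307))
print("squid: %.1f%%" % reduction_percent(670, 419))
```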
3. How to downsize
3.1 Approach
There are many Linux distributions, each with its own advantages and disadvantages, and with an appropriate method, building an embedded Linux from one of them is realistic. Starting from Red Hat 7.1, a full server installation needs more than 1.3 Gbytes, not counting some special software packages. Of all this software we need only a small part, perhaps 10 to 30 Mbytes. To build the system efficiently, there are two basic approaches:
(1) Start from a distribution, delete every unnecessary part, and keep the system we want.
(2) Rebuild the whole system according to the function specification.
As mentioned above, we need a clear understanding of the system's function specification before we can trim it. As shown in Figure 1, method 1 has to deal with hundreds of Mbytes of unnecessary data, while method 2 can produce an embedded system of 2 to 16 Mbytes. Rebuilding is therefore clearly the better choice. Before using this method, prepare three things: the overall function specification, the software packages to be used, and the specification of the target platform. These three are closely related: they define the system from three different perspectives, which makes the definition more complete and accurate.
3.2 Development Environment
Once the system to be built is understood, we can start building the embedded device with Linux. First, we must distinguish between the development platform and the target platform. The target platform is where the programs are finally ported and executed; because its resources are limited, the development and debugging environments are concentrated on the development platform. Table 5 lists the main development tools on the two platforms.
A basic development environment requires a cross-development tool chain, including a compiler [4], a linker, and a debugger, plus the programs needed to create a file system. On the target platform, only a boot loader is needed, such as Etherboot [5] or RedBoot [6]. During the debugging phase the boot loader can fetch the system image over the network and start it; otherwise it starts the system directly from flash memory. Once the system is up, you are in the Linux operating system and can use gdbserver as the remote debugging tool.
3.3 Development Process
The development process (Figure 2) can be divided into several parts. First, prepare the Linux kernel and the root file system, together with the daemons and applications. After compression, these are packaged into a single image file that contains the kernel. The target platform obtains the image over the network or from flash memory and decompresses it. Once the system has started and initialized, the machine is running embedded Linux as its operating system.
When creating the file system layout, pay attention to how permissions are set up. Because flash memory can be rewritten only a limited number of times (about 1 million), the system is mounted read-only. Directories that must be readable and writable, such as /var and /dev, and any temporary or log files are placed on a writable file system simulated in system memory (a RAM disk). Write access is provided through symbolic links, and everything else stays protected on the read-only flash memory. If a large number of log files must survive shutdown, it is appropriate to attach a hard disk and mount it at /var/log. A small number of log files can be written to flash memory, but the writes must be buffered; otherwise frequent writing shortens its service life. For the same reason, we cannot use swap on flash. If memory is insufficient, consider keeping the system on flash memory and adding a hard disk as the swap device. In an embedded system the RAM disk is used alongside the system, but it differs from an ordinary file system: because the RAM disk takes part of the memory, the memory available to programs is reduced. This can cause misunderstandings about memory usage, so pay special attention to it. Figure 3 shows the memory layout when a RAM disk is used.
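The advice to buffer log writes can be made concrete with a small Python sketch. Everything in it is hypothetical: the class name, the batch size, and the simplification that each flush counts as one rewrite of the same flash block, ignoring what the file system or any wear leveling actually does. The helper estimates how long a block would last at the rewrite limit of about 1 million quoted above.

```python
SECONDS_PER_DAY = 86400


def days_until_worn_out(rewrite_limit, seconds_between_writes):
    """Days until one block reaches its rewrite limit at a fixed write interval."""
    return rewrite_limit * seconds_between_writes / SECONDS_PER_DAY


class BufferedLog:
    """Collect log lines in RAM and append them to a file in batches."""

    def __init__(self, path, batch_size=100):
        self.path = path
        self.batch_size = batch_size
        self.pending = []       # lines held in memory
        self.flash_writes = 0   # number of batches written so far

    def log(self, line):
        self.pending.append(line)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Write all pending lines in one operation."""
        if not self.pending:
            return
        with open(self.path, "a") as log_file:
            log_file.write("\n".join(self.pending) + "\n")
        self.pending = []
        self.flash_writes += 1
```

Under these assumptions, rewriting a block once per second would exhaust it in under twelve days, while one write per minute stretches that to almost two years, which is why batching matters. Lines still held in memory are lost on power failure, so the batch size is a trade-off between flash wear and log completeness.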
4. Security gateway example
4.1 Features
Next, we take the security gateway as an example to show how effective the above method is for developing an embedded version. A security gateway is a gateway with a firewall, VPN, and intrusion detection system. It is built entirely from open-source software: Linux kernel 2.4.7 [7], netfilter (packet filter, port redirect, NAT), squid (URL filter) [8], TIS (content filter) [9], FreeS/WAN (VPN) [10], and snort (IDS) [11]. It also provides other functions such as DNS, a DHCP server, routing, bandwidth management [12], and web-based configuration.
4.2 Hardware specification
We chose an industrial-grade computer with flash memory: a Pentium II 350 CPU, 64 Mbytes of RAM, 8 Mbytes of flash, and a 10/100 Mbps NIC. The software is compressed before it is placed in flash memory, so the actual system size can exceed the physical limit of 8 Mbytes.
4.3 Main software packages and how they are handled
The software used for the main functions of the security gateway is listed in Table 6.
4.3.1 Kernel reduction process
After unpacking the kernel source, configure the kernel options (command # make menuconfig or # make xconfig): remove the VGA console (disable CONFIG_VT and CONFIG_{VT,VGA}_CONSOLE) and allow memory to be used as the disk during startup (enable CONFIG_BLK_DEV_{RAM,INITRD}). Finally, compile the new kernel (command # make dep bzImage).
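A quick way to confirm that a kernel configuration matches the choices above is to scan the generated .config file before compiling. The Python sketch below is only an illustration: the function names are hypothetical, and requiring the RAM disk options to be built in (=y) rather than modular is an assumption made here because the RAM disk is needed during startup.

```python
MUST_BE_OFF = ["CONFIG_VT", "CONFIG_VT_CONSOLE", "CONFIG_VGA_CONSOLE"]
MUST_BE_ON = ["CONFIG_BLK_DEV_RAM", "CONFIG_BLK_DEV_INITRD"]


def enabled_options(config_text):
    """Return {option: value} for every option that is set in a .config file."""
    options = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments ("# CONFIG_X is not set").
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def check_config(config_text):
    """List the options that differ from the embedded configuration described."""
    options = enabled_options(config_text)
    problems = []
    for name in MUST_BE_OFF:
        if name in options:
            problems.append(name + " should be disabled")
    for name in MUST_BE_ON:
        if options.get(name) != "y":
            problems.append(name + " should be built in (=y)")
    return problems
```

Calling `check_config(open(".config").read())` in the kernel source directory returns an empty list when the configuration agrees with the settings above.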
4.3.2 Daemon reduction process
The squid daemon serves as the example. Beyond the settings in its configuration file (squid.conf), the cache support compiled into the program can be removed. Version 2.4.STABLE1 provides three cache storage formats: ufs, aufs, and null. The last one means that no cache is used, which meets our requirements. Run the configuration program to select it and to use the Linux netfilter module (command # ./configure --enable-storeio=null --enable-linux-netfilter), then compile.
4.3.3 Library reduction process
Here we refer only to reducing dynamic libraries. Because they are shared, the whole system must be considered rather than a single library. A script first identifies the library dependencies of every executable in the system (using ldd, list dynamic dependencies), then works out which symbols are used between the executables and the libraries (using the command # objdump -T to view the dynamic symbol table) and records the result. Finally, based on that record, strip removes the unnecessary parts of the executables and libraries.
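The symbol bookkeeping described above might look like the following Python sketch. It is not the authors' script: the function names are hypothetical, and the parser assumes the usual objdump -T line layout, in which each entry starts with a hexadecimal address, undefined symbols carry the section marker `*UND*`, and the symbol name is the last field.

```python
import re

_HEX = re.compile(r"^[0-9a-fA-F]+$")


def parse_dynamic_symbols(objdump_output):
    """Split the symbols in `objdump -T` text into (defined, undefined) sets."""
    defined, undefined = set(), set()
    for line in objdump_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or not _HEX.match(fields[0]):
            continue  # header or blank line
        name = fields[-1]
        if "*UND*" in fields:
            undefined.add(name)
        else:
            defined.add(name)
    return defined, undefined


def unused_symbols(exported, imported):
    """Return, per library, the exported symbols that nothing imports.

    exported: {library: set of symbols it defines}
    imported: {executable or library: set of symbols it leaves undefined}
    """
    used = set()
    for symbols in imported.values():
        used |= symbols
    return {lib: symbols - used for lib, symbols in exported.items()}
```

The `imported` mapping should include the libraries themselves, since one shared library may use symbols from another. Symbols looked up at run time by name would not show up in this analysis, so the resulting list is a candidate set to review rather than something to remove blindly.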
4.3.4 Utility reduction process
BusyBox replaces the system tools with versions that have simplified functions. First, unpack the source code, then select the required functions by defining them in the configuration file (busybox-0.51/Config.h), for example #define BB_PING, #define BB_SLEEP, and #define BB_ROUTE. Edit this list to match the functions you need. After the build, each command is installed as a symbolic link to /bin/busybox.
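The final linking step can be sketched in Python as follows. The function name and the idea of a staging directory on the development platform are assumptions for illustration; the sketch creates relative links named after each command inside the staging tree's bin directory, so that on the target they resolve to /bin/busybox.

```python
import os


def link_applets(staging_root, applets, bindir="bin"):
    """Create <staging_root>/bin/<applet> -> busybox for each applet name."""
    target_dir = os.path.join(staging_root, bindir)
    os.makedirs(target_dir, exist_ok=True)
    created = []
    for name in applets:
        link = os.path.join(target_dir, name)
        if os.path.lexists(link):
            os.remove(link)          # replace a stale link or file
        os.symlink("busybox", link)  # relative link, resolved inside bin/
        created.append(link)
    return created
```

For the three functions named above, the call would be `link_applets(staging_root, ["ping", "sleep", "route"])`, with the busybox binary itself copied into the same bin directory.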
5. Comparison of results
Finally, the programs of the security gateway were each reduced using the methods described above. Table 7 lists some representative functions or software packages, compares their sizes on the two platforms, and gives the resulting reduction ratio.
The kernel shrinks by 11% once all unused drivers are removed. To reduce it further, you may need to modify some code or compile parts of it as modules. Because the Pluto daemon and TIS both carry a large amount of debugging information, stripping symbols alone already gives good results. Snort reaches 73% because the distribution version is preconfigured with MySQL database support; reconfiguring it and removing debugging information achieves this reduction. Some web server functions are likewise enabled by default in the distribution version, which leads to a high reduction ratio after recompilation. The libraries are already the smallest set in use, and the only space recovered comes from unused symbols, so the reduction is only about 24%. The last entries cover unprocessed and unclassified programs. Programs were left unprocessed for one of three reasons: the above methods do not apply, the program is already very lean, or it was used as-is to speed up system construction. Finally, to reduce the size further, you can modify program code directly to develop a dedicated embedded version (for example, using the APIs provided by BusyBox) or consolidate daemons with overlapping functions (for example, replacing squid with the proxy provided by Apache).
6. Conclusion
Using Linux as an embedded operating system is very appealing: users and contributors all over the world add their own efforts, and supported versions are available on all major platforms. Unfortunately, despite these abundant resources, there is no good integrated environment. Each individual development tool for embedded Linux is functionally complete and even has a graphical user interface, but the integrated development environment is still in its infancy and, compared with that of the embedded operating system VxWorks, too simple. Even vendors focused on embedded Linux, such as Lineo [13] with Embedix and MontaVista [14] with Hard Hat Linux, provide good development tools but lack a powerful graphical integrated environment. Such an environment should center on cross-platform development, supplemented by debugging over a network connection, dynamic downloading of executable modules, and real-time display of execution status; it would be an important milestone in challenging commercial software. We believe this is the same problem open-source software has always faced, and once it is solved, embedded Linux applications will grow rapidly. Besides the general Linux kernel used in this article, there are other options, for example uClinux [15] for processors without a memory management unit (MMU), and ARM Linux [16, 17] and LinuxPPC [18] for other CPU families; choose according to the application. We hope that you, too, will use embedded Linux and apply it in different fields with unlimited creativity.
7. References
[1] Ying-Dar Lin, Shao-Tang Yu, Huan-Yun Wei, Integrating and benchmarking security gateway open source firewall, VPN, and IDS, submitted for publication, August 2001.
[2] The Swiss Army Knife of Embedded Linux, http://busybox.lineo.com
[3] The worlds smallest login/passwd/getty/etc, http://tinylogin.lineo.com
[4] CrossGCC Frequently Asked Questions, http://www.objsw.com/CrossGCC
[5] Etherboot home page, http://etherboot.sourceforge.net
[6] The Red Hat Embedded Debug and Bootstrap firmware, http://sources.redhat.com/redboot
[7] The Linux Kernel Archives, http://www.kernel.org
[8] Squid Web Proxy Cache, http://www.squid-cache.org
[9] The Firewall Toolkit (FWTK) from TIS, http://www.fwtk.org
[10] Linux FreeS/WAN, http://www.freeswan.org
[11] The Open Source Network Intrusion Detection System, http://www.snort.org
[12] iproute2 + tc notes, http://snafu.freedom.org/linux2.2/iproute-notes.html
[13] Lineo, Inc., http://www.lineo.com
[14] MontaVista Software, http://www.mvista.com
[15] Embedded Linux Microcontroller Project, http://www.uclinux.org
[16] armlinux.org homepage, http://www.armlinux.org
[17] The ARM Linux Project, http://www.arm.linux.org.uk
[18] The home of the PowerPC GNU/Linux port, http://www.linuxppc.org