Assembly program design in Linux

Source: Internet
Author: User
Abstract: This article describes the advantages and disadvantages of using assembly language in Linux, as well as the usage and syntax features of common assembly tools, with a focus on NASM.

Introduction:

Assembly language is a low-level language that is closely tied to the hardware and the operating system. PCs used to run DOS and have since moved on to Windows 98, while another operating system, Linux, is also on the rise. The following compares the three operating systems:

DOS
Advantages: relatively stable, fast
Disadvantages: does not make full use of the computer's performance, no graphical interface

WINDOWS 98
Advantages: easy to operate, many applications, good hardware compatibility
Disadvantages: unstable, frequent crashes, slow

Linux
Advantages: excellent performance, very stable, attractive interface, easy to operate
Disadvantages: lack of support from software vendors, relatively little application software

The comparison above shows that Linux has a great advantage, and its wider adoption should be only a matter of time. How to develop software in Linux is therefore a subject that computer science students need to study.

The main programming language in Linux is C, but Linux also supports many other programming languages. Assembly language is one of the most important of them, because it can do things that many other languages cannot. To learn Linux programming, you should also learn assembly programming in Linux. The rest of this article introduces assembly programming in Linux.

Linux Assembly introduction:

I. Advantages and Disadvantages of assembly language:

Since Linux is written in C, C naturally became the standard programming language of Linux. Most people ignore assembly, information about it is hard to find on the Internet, and many problems have to be worked out by trial and error. In my opinion, it is unfair to treat assembly language this way. We should see not only its shortcomings but also its advantages. The following compares the two:

Advantages: assembly language can express very low-level operations.

- It can access registers and I/O directly.

- The code executes exactly as written.

- Hand-written code can be more efficient than the code generated by a general-purpose compiler.

- It can serve as an interface between different languages or standards.

Disadvantages: assembly language is a very low-level language.

- It is lengthy and tedious, as anyone who has programmed in assembly under DOS will know.

- It is prone to bugs and difficult to debug.

- The code is not easy to maintain.

- It is poorly portable, because it is closely tied to the hardware.

In general, assembly language should be used only where it is necessary. Large programs should contain as little assembly as possible, and what assembly they do contain should preferably be written inline.

II. Assembly language tools:

The tools commonly used under DOS, such as MASM and TASM, cannot be used in Linux. Linux has its own assembly tools, and there are many of them. Among them, GAS can be regarded as the standard: every Linux distribution includes it. However, GAS does not use the assembly syntax familiar from DOS. It uses the AT&T syntax, which is quite different from the Intel syntax.

To use a syntax similar to that of DOS, you need another assembler, NASM. Its syntax is largely the same as that of MASM, but there are also many differences, and wherever operating system principles are involved, programming for Linux is completely different from programming for DOS.

Linux Assembler Program Design:

I. Hello, world!

Almost every language tutorial starts with "Hello, world!", so I also use it as the first example.

; ------------- NASM's standalone Hello-World.asm for Linux --------
section .text
extern puts
global main

main:
    push dword msg      ; stash the address of msg on the stack
    call puts           ; call the 'puts' routine (from libc)
    add esp, byte 4     ; clean up the stack
    ret                 ; exit

msg:
    db "Hello World!", 0

Assemble and link:
nasm -f elf hello.asm
gcc -o hello hello.o

Note: This program actually calls the puts function provided by the Linux system, in the same way that a C library function is called under DOS. puts is declared as an external function with extern, its argument (the address of msg) is pushed onto the stack, and call invokes the function, which prints the string.
Let's look at another program:

section .text
global main

main:
    mov eax, 4          ; system call number 4 (write)
    mov ebx, 1          ; ebx = 1 selects stdout
    mov ecx, msg        ; address of the string goes in ecx
    mov edx, 14         ; length of the string goes in edx
    int 80h             ; output the string
    mov eax, 1          ; system call number 1 (exit)
    int 80h             ; end the program
msg:
    db "Hello World!", 0ah, 0dh
(Assemble and link it the same way as the previous program.)

This program is very similar to a DOS program. It uses interrupt 80h, which plays the same role in Linux as interrupt 21h does in DOS. Because Linux is a 32-bit operating system, 32-bit registers such as EAX and EBX are used. However, Linux is a multi-user operating system and is very different from DOS. It is impossible to write non-trivial programs without understanding the operating system and the hardware, so the next section introduces the Linux operating system.

II. Introduction to Linux:

An operating system is essentially an interface between abstract operations on resources and the details of operating specific hardware. A multi-user operating system such as Linux must prevent direct access to the hardware and interference between users. Linux therefore takes over BIOS calls and port input/output. For more information about port input/output, see the Linux IO-Port-Programming HOWTO. To access hardware under Linux, a program must go through system calls. In fact, many C functions can also be called from assembly programs. They are called in the same way as under DOS, and no additional library functions need to be linked when assembling.
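The system call mechanism described above can also be exercised from C. The following is a minimal sketch, not part of the original article: it uses the libc syscall() wrapper with SYS_write, which requests the same kernel service that the assembly listing reaches by loading the call number into eax and executing int 80h. The helper name raw_write is hypothetical.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: issue the write system call directly, bypassing
 * the buffered stdio layer. Returns the number of bytes written, or -1
 * on error. */
long raw_write(int fd, const char *text)
{
    return syscall(SYS_write, fd, text, strlen(text));
}
```

Calling raw_write(1, "Hello World!\n") from main sends the string to stdout, just as the assembly program does with ebx set to 1.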

The main differences between Linux and DOS lie in memory management, processes (DOS has no concept of a process), and the file system. Of these, memory management and processes are closely related to assembly programming:

1. Memory Management:

On any computer, memory and other resources are limited. Linux uses a memory management method called "virtual memory" so that limited physical memory can satisfy the large memory demands of applications. Linux divides memory into easily handled "memory pages". When applications need more memory than is physically available, Linux can swap memory pages that are not currently in use out to the hard disk. The freed pages can then be used to satisfy the applications' requests, and the applications do not notice that swapping has taken place.
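To make the idea of memory pages concrete, the following minimal sketch (an illustration added here, not taken from the original article) asks the running system for its page size through the POSIX sysconf() call and computes how many pages a request of a given size occupies. The helper names page_size and pages_needed are hypothetical.

```c
#include <stddef.h>
#include <unistd.h>

/* Size in bytes of one memory page, as reported by the running system. */
long page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}

/* Hypothetical helper: number of whole pages needed to hold `bytes`
 * bytes, rounding up because memory is managed in page-sized units. */
size_t pages_needed(size_t bytes)
{
    size_t page = (size_t)page_size();
    return (bytes + page - 1) / page;
}
```

Even a one-byte request occupies a whole page, which is why pages_needed(1) evaluates to 1.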

2. Process

A process is a running instance of a particular application. Linux can run multiple processes at the same time. It implements multitasking by letting the processes run in turn, each for a short interval. This short interval is called a "time slice", the method of letting processes run in turn is called "scheduling", and the program that performs the scheduling is called the scheduler. Thanks to multitasking, each process can behave as if it had the computer to itself, which simplifies programming. Each process has its own address space that only it can access. In this way the operating system prevents processes from interfering with one another and protects the system from the potential harm caused by "bad" programs.
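The separate address spaces described above can be observed with a short experiment. The sketch below is an illustration under stated assumptions, not code from the original article: it uses the POSIX fork() and waitpid() calls, and the function name value_after_child_write is hypothetical. The child process changes its copy of a variable, and the parent's copy is unaffected.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demonstration: the child overwrites its own copy of
 * `value`; the parent's copy stays untouched because each process has a
 * private address space. Returns the parent's value afterwards, or -1
 * on error. */
int value_after_child_write(void)
{
    volatile int value = 42;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        value = 99;              /* changes only the child's copy */
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) < 0)
        return -1;
    return value;                /* the parent's copy is unchanged */
}
```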

Completing a task sometimes requires combining two programs, for example one program that outputs text and another that sorts it. For this purpose the operating system provides inter-process communication mechanisms. Common inter-process communication mechanisms in Linux include signals, pipes, shared memory, semaphores, and sockets.
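Of the mechanisms listed above, the pipe is the simplest to show. The following minimal sketch (added for illustration, not from the original article) uses the POSIX pipe(), fork(), read() and write() calls: a child process produces text and the parent consumes it. The function name send_through_pipe is hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child process writes `msg` into a pipe and
 * the parent reads it back into `out` (at most cap-1 bytes, always
 * NUL-terminated). Returns the number of bytes received, or -1 on
 * error. */
long send_through_pipe(const char *msg, char *out, size_t cap)
{
    int fds[2];                  /* fds[0]: read end, fds[1]: write end */
    if (cap == 0 || pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {              /* child: producer */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(w < 0 ? 1 : 0);
    }

    close(fds[1]);               /* parent: consumer */
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fds[0], out + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (long)total;
}
```

Each side closes the end of the pipe it does not use, so the parent's read() sees end-of-file once the child has finished writing.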

III. Assembly tools in Linux:

Linux offers a wide variety of competing assembly tools, unlike DOS, where MASM and TASM dominate. The tools in Linux differ greatly from one another, and it is almost impossible to master them all. The following introduces several common ones, focusing on NASM and its usage and syntax.

1. GCC

GCC is actually the GNU C compiler, but it supports inline assembly. In GCC, inline assembly is used much like a macro, but it expresses the working state of the machine more clearly and precisely than a macro does.

C is a high-level abstraction of assembly programming and removes much of the tedium of writing assembly. With GCC's C compiler in particular, there seems to be little left for assembly to do.
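As an illustration of the inline assembly support mentioned above, the following minimal sketch uses GCC's extended asm statement, which is a GCC (and Clang) extension rather than standard C. It is not taken from the original article, and the function name add_with_asm is hypothetical. The instruction is written in AT&T syntax and is compiled only on x86; other architectures use a plain C fallback.

```c
/* Hypothetical helper: add two integers with one inline-assembly
 * instruction on x86, falling back to plain C on other architectures. */
int add_with_asm(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
    /* AT&T syntax: source operand first, destination last.
     * "+r"(a): a is read and written, and lives in a register.
     * "r"(b):  b is an input, and lives in a register.
     * "cc":    the instruction modifies the flags register. */
    __asm__ ("addl %1, %0" : "+r"(a) : "r"(b) : "cc");
#else
    a += b;   /* portable fallback so the sketch builds everywhere */
#endif
    return a;
}
```

The operand constraints tell the compiler which registers the instruction reads and writes, which is what lets inline assembly cooperate with the surrounding C code.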

2. GAS

GAS is the basic assembler included in all Linux distributions, but it uses the AT&T syntax, which differs considerably from the Intel syntax. For programmers who come from DOS, it is hard to learn. Of course, to become proficient in assembly programming in Linux, it is still necessary to learn GAS. For the details of its syntax, see Using GNU Assembler.

3. GASP

GASP is an extension of GAS, which enhances GAS's support for macros.

4. NASM

NASM is the assembler in Linux whose syntax is closest to the one used under DOS. Even so, it differs from MASM in many ways.

- The NASM command-line format is as follows:

nasm -f <format> <filename> [-o <output>]

For example:

nasm -f elf hello.asm

assembles hello.asm into an ELF object file, and

nasm -f bin hello.asm -o hello.com

assembles hello.asm into the binary executable file hello.com.

nasm -h

lists the complete description of the NASM command line.

NASM produces no output unless an error occurs.

In Linux, the -f option is mainly used to select aout or ELF. If you are not sure whether your Linux system uses aout or ELF, run file nasm in the directory that contains the nasm binary. If the output is nasm: ELF 32-bit LSB executable i386 (386 and up) Version 1, the system uses ELF. If the output is nasm: Linux/i386 demand-paged executable (QMAGIC), the system uses aout.

- Major differences between NASM and MASM:

First, like Linux itself, NASM is case-sensitive: Hello and hello are different identifiers. When assembling for DOS or OS/2, add the UPPERCASE directive.

Second, memory operands in NASM are written in [].

In MASM

foo equ 1
bar dw 2
mov ax, foo
mov ax, bar
the two mov lines are assembled into completely different instructions, although they are written in the same way. NASM avoids this confusion with a simple rule: every memory access must be written with []. In the example above, the access to bar must be written as mov ax, [bar]. The offset keyword is therefore unnecessary in NASM (NASM has no offset). The use of [] in NASM also differs from MASM in that the whole address expression must be written inside the []. The following examples illustrate this:

MASM                    NASM
mov ax, table[di]       mov ax, [table + di]
mov ax, es:[di]         mov ax, [es:di]
mov ax, [di]+1          mov ax, [di + 1]

NASM does not store variable types. Whereas MASM remembers the declared type of a variable, in NASM the operand size must be stated explicitly whenever a [] memory operand leaves it ambiguous. For the same reason, NASM does not support LODS, MOVS, STOS, SCAS, CMPS, INS, and OUTS. It supports only the forms that specify the size, such as lodsb and lodsw. NASM has no assume directive either: segment addressing depends entirely on the values stored in the segment registers.

For more information about NASM usage and syntax, see the NASM user manual.

Conclusion:

I think it is no longer realistic to write a large program entirely in assembly, whether under Windows/DOS or Linux, and no one would want to. In Windows we can use VC, and in Linux/X Window we can use C or even C++ Builder. However, tools such as VC and C++ Builder try to hide the underlying calls, and in doing so they take away the opportunity to become an expert: programmers who cannot follow how their program actually executes lose much of the "predictability" that matters most in programming. For this reason alone assembly still has a place. There is an even more important reason, as Liang Zhaoxin, author of "Super Solution", said: "The focus of programming is not the writing but the debugging. Something that is perfect in theory runs into many details in practice, and these problems can only be solved by debugging. My habit is to write for one day and debug for five. 'Super Solution' was debugged into existence rather than written. Debugging involves assembly, and without assembly-level debugging the work is neither thorough nor reassuring."