Assembly Is Only One Step Away from C - Small Talk on the C Language (19)

Author: Chen Xi

Date: 10:50:13

Environment: [Ubuntu 11.04 intel-based x64 gcc4.5.2 codeblocks10.05 at&t Intel assembly]

Please credit the source when reprinting.

Q: Can you give an example?

A: The purpose of the following code is to calculate the value of 1 + 2, put it in the temp variable, and output:

#include <stdio.h>
#include <string.h>

#define PRINT_D(longValue)       printf(#longValue" is %ld\n", ((long)longValue));
#define PRINT_STR(str)           printf(#str" is %s\n", (str));

static void assemble_func()
{
    int temp;
    __asm__("mov $1, %eax");
    __asm__("mov $2, %ebx");
    __asm__("add %ebx, %eax");  // 1 + 2
    __asm__("mov %%eax, %0":"=r"(temp));    // mov the value of register eax to the var "temp"
    PRINT_D(temp)               // print temp
}

int main()
{
    assemble_func();
    return 0;
}

Running result:

temp is 3

Q: What is the form of assembly code for assemble_func functions?

A:

   0x08048404 <+0>:     push   ebp
   0x08048405 <+1>:     mov    ebp,esp
   0x08048407 <+3>:     push   ebx
   0x08048408 <+4>:     sub    esp,0x24
=> 0x0804840b <+7>:     mov    eax,0x1
   0x08048410 <+12>:    mov    ebx,0x2
   0x08048415 <+17>:    add    eax,ebx
   0x08048417 <+19>:    mov    ebx,eax
   0x08048419 <+21>:    mov    DWORD PTR [ebp-0xc],ebx
   0x0804841c <+24>:    mov    eax,0x8048510
   0x08048421 <+29>:    mov    edx,DWORD PTR [ebp-0xc]
   0x08048424 <+32>:    mov    DWORD PTR [esp+0x4],edx
   0x08048428 <+36>:    mov    DWORD PTR [esp],eax
   0x0804842b <+39>:    call   0x8048340 <printf@plt>
   0x08048430 <+44>:    add    esp,0x24
   0x08048433 <+47>:    pop    ebx
   0x08048434 <+48>:    pop    ebp
   0x08048435 <+49>:    ret

The listing above was obtained by running the disassemble command while execution was stopped inside assemble_func in the debugger. The arrow to the left of the fifth line marks the instruction that is about to be executed.

Q: The assembly above is embedded in C code. How would a standalone, complete assembly program implement hello world?

A: In essence, we only need to understand the lower layer in terms of assembly. As far as compilation is concerned, C code is no different from assembly; only the syntax and the things being called look different.

The following code implements the standard console output function:

.section .rodata
str:
.ascii "Hello,world.\n"

.section .text
.globl _main
_main:
movl  $4,    %eax    # the number of system call
movl  $1,    %ebx    # file descriptor, 1 means stdout
movl  $str,  %ecx    # string address
movl  $13,   %edx    # string length
int   $0x80

Save it as hello.s.

Q: How do I compile it? Can GCC be used?

A: Of course. This file does not need preprocessing, though. It is already assembly, so it does not need compiling in the narrow sense either; processing only has to start from the assembly stage.

Assembling it directly produces the object file hello.o.

Q: What next? Can it be executed directly?

A: Try it.

Here, executable permission is added to hello.o, which is then run:

Q: Why?

A: Let's look further at the attributes of the hello.o file.

As can be seen, it is not an executable file. The reason is simple: hello.o is only an object file and has not been linked into an executable.

Q: When linking, why is the entry symbol _start reported as not found? Is _start the default entry symbol for ld?

A: Yes. The code uses _main instead, so the linker has to be told that the entry symbol is _main, for example through ld's -e option.

Q: Now it can be run. Run it:

Hello,world. is printed. Why is it followed by a segmentation fault?

A: First, let's look at the exit status returned by the run above.

The returned value is 139. What does it mean?

Q: I found the system's errno.h header and its related files, which list all the system error codes:

/usr/include/asm-generic/errno-base.h:

#ifndef _ASM_GENERIC_ERRNO_BASE_H
#define _ASM_GENERIC_ERRNO_BASE_H

#define EPERM 1 /* Operation not permitted */
#define ENOENT 2 /* No such file or directory */
#define ESRCH 3 /* No such process */
#define EINTR 4 /* Interrupted system call */
#define EIO 5 /* I/O error */
#define ENXIO 6 /* No such device or address */
#define E2BIG 7 /* Argument list too long */
#define ENOEXEC 8 /* Exec format error */
#define EBADF 9 /* Bad file number */
#define ECHILD 10 /* No child processes */
#define EAGAIN 11 /* Try again */
#define ENOMEM 12 /* Out of memory */
#define EACCES 13 /* Permission denied */
#define EFAULT 14 /* Bad address */
#define ENOTBLK 15 /* Block device required */
#define EBUSY 16 /* Device or resource busy */
#define EEXIST 17 /* File exists */
#define EXDEV 18 /* Cross-device link */
#define ENODEV 19 /* No such device */
#define ENOTDIR 20 /* Not a directory */
#define EISDIR 21 /* Is a directory */
#define EINVAL 22 /* Invalid argument */
#define ENFILE 23 /* File table overflow */
#define EMFILE 24 /* Too many open files */
#define ENOTTY 25 /* Not a typewriter */
#define ETXTBSY 26 /* Text file busy */
#define EFBIG 27 /* File too large */
#define ENOSPC 28 /* No space left on device */
#define ESPIPE 29 /* Illegal seek */
#define EROFS 30 /* Read-only file system */
#define EMLINK 31 /* Too many links */
#define EPIPE 32 /* Broken pipe */
#define EDOM 33 /* Math argument out of domain of func */
#define ERANGE 34 /* Math result not representable */

#endif

/usr/include/asm-generic/errno.h:

#ifndef _ASM_GENERIC_ERRNO_H
#define _ASM_GENERIC_ERRNO_H

#include <asm-generic/errno-base.h>

#define EDEADLK 35 /* Resource deadlock would occur */
#define ENAMETOOLONG 36 /* File name too long */
#define ENOLCK 37 /* No record locks available */
#define ENOSYS 38 /* Function not implemented */
#define ENOTEMPTY 39 /* Directory not empty */
#define ELOOP 40 /* Too many symbolic links encountered */
#define EWOULDBLOCK EAGAIN /* Operation would block */
#define ENOMSG 42 /* No message of desired type */
#define EIDRM 43 /* Identifier removed */
#define ECHRNG 44 /* Channel number out of range */
#define EL2NSYNC 45 /* Level 2 not synchronized */
#define EL3HLT 46 /* Level 3 halted */
#define EL3RST 47 /* Level 3 reset */
#define ELNRNG 48 /* Link number out of range */
#define EUNATCH 49 /* Protocol driver not attached */
#define ENOCSI 50 /* No CSI structure available */
#define EL2HLT 51 /* Level 2 halted */
#define EBADE 52 /* Invalid exchange */
#define EBADR 53 /* Invalid request descriptor */
#define EXFULL 54 /* Exchange full */
#define ENOANO 55 /* No anode */
#define EBADRQC 56 /* Invalid request code */
#define EBADSLT 57 /* Invalid slot */
#define EDEADLOCK EDEADLK
#define EBFONT 59 /* Bad font file format */
#define ENOSTR 60 /* Device not a stream */
#define ENODATA 61 /* No data available */
#define ETIME 62 /* Timer expired */
#define ENOSR 63 /* Out of streams resources */
#define ENONET 64 /* Machine is not on the network */
#define ENOPKG 65 /* Package not installed */
#define EREMOTE 66 /* Object is remote */
#define ENOLINK 67 /* Link has been severed */
#define EADV 68 /* Advertise error */
#define ESRMNT 69 /* Srmount error */
#define ECOMM 70 /* Communication error on send */
#define EPROTO 71 /* Protocol error */
#define EMULTIHOP 72 /* Multihop attempted */
#define EDOTDOT 73 /* RFS specific error */
#define EBADMSG 74 /* Not a data message */
#define EOVERFLOW 75 /* Value too large for defined data type */
#define ENOTUNIQ 76 /* Name not unique on network */
#define EBADFD 77 /* File descriptor in bad state */
#define EREMCHG 78 /* Remote address changed */
#define ELIBACC 79 /* Can not access a needed shared library */
#define ELIBBAD 80 /* Accessing a corrupted shared library */
#define ELIBSCN 81 /* .lib section in a.out corrupted */
#define ELIBMAX 82 /* Attempting to link in too many shared libraries */
#define ELIBEXEC 83 /* Cannot exec a shared library directly */
#define EILSEQ 84 /* Illegal byte sequence */
#define ERESTART 85 /* Interrupted system call should be restarted */
#define ESTRPIPE 86 /* Streams pipe error */
#define EUSERS 87 /* Too many users */
#define ENOTSOCK 88 /* Socket operation on non-socket */
#define EDESTADDRREQ 89 /* Destination address required */
#define EMSGSIZE 90 /* Message too long */
#define EPROTOTYPE 91 /* Protocol wrong type for socket */
#define ENOPROTOOPT 92 /* Protocol not available */
#define EPROTONOSUPPORT 93 /* Protocol not supported */
#define ESOCKTNOSUPPORT 94 /* Socket type not supported */
#define EOPNOTSUPP 95 /* Operation not supported on transport endpoint */
#define EPFNOSUPPORT 96 /* Protocol family not supported */
#define EAFNOSUPPORT 97 /* Address family not supported by protocol */
#define EADDRINUSE 98 /* Address already in use */
#define EADDRNOTAVAIL 99 /* Cannot assign requested address */
#define ENETDOWN 100 /* Network is down */
#define ENETUNREACH 101 /* Network is unreachable */
#define ENETRESET 102 /* Network dropped connection because of reset */
#define ECONNABORTED 103 /* Software caused connection abort */
#define ECONNRESET 104 /* Connection reset by peer */
#define ENOBUFS 105 /* No buffer space available */
#define EISCONN 106 /* Transport endpoint is already connected */
#define ENOTCONN 107 /* Transport endpoint is not connected */
#define ESHUTDOWN 108 /* Cannot send after transport endpoint shutdown */
#define ETOOMANYREFS 109 /* Too many references: cannot splice */
#define ETIMEDOUT 110 /* Connection timed out */
#define ECONNREFUSED 111 /* Connection refused */
#define EHOSTDOWN 112 /* Host is down */
#define EHOSTUNREACH 113 /* No route to host */
#define EALREADY 114 /* Operation already in progress */
#define EINPROGRESS 115 /* Operation now in progress */
#define ESTALE 116 /* Stale NFS file handle */
#define EUCLEAN 117 /* Structure needs cleaning */
#define ENOTNAM 118 /* Not a XENIX named type file */
#define ENAVAIL 119 /* No XENIX semaphores available */
#define EISNAM 120 /* Is a named type file */
#define EREMOTEIO 121 /* Remote I/O error */
#define EDQUOT 122 /* Quota exceeded */
#define ENOMEDIUM 123 /* No medium found */
#define EMEDIUMTYPE 124 /* Wrong medium type */
#define ECANCELED 125 /* Operation Canceled */
#define ENOKEY 126 /* Required key not available */
#define EKEYEXPIRED 127 /* Key has expired */
#define EKEYREVOKED 128 /* Key has been revoked */
#define EKEYREJECTED 129 /* Key was rejected by service */

/* for robust mutexes */
#define EOWNERDEAD 130 /* Owner died */
#define ENOTRECOVERABLE 131 /* State not recoverable */

#define ERFKILL 132 /* Operation not possible due to RF-kill */

#endif

There is no 139.

A: It seems the system ran into something unusual, and the value is not one of the ordinary error codes. To confirm that no error code 139 is defined, we recursively search for the string 139 under the /usr/include directory.

grep -R '139' *

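The results are long and are not listed here; no definition of an error 139 turns up. That is expected, because 139 is not an errno value: when a process is killed by a signal, the shell reports 128 plus the signal number, and 139 = 128 + 11, where 11 is SIGSEGV, the segmentation fault seen above.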

So let's take a look at the system logs, where the problem may be.

Q: The following command was used to obtain the error message:

At the end of the output is the system log entry for the crash of the hello program. It appears to be an invalid pointer access. Is it because the assembly code is too bare and does not properly set up registers such as the stack registers?

A: Very likely. To make the problem easier to locate, write C code with similar functionality, obtain its assembly code, and compare it with the assembly above.

Q: The following hello_1.c code is written:

#include <stdio.h>

int main()
{
    printf("Hello,world!\n");
    return 0;
}

View its assembly code:

    .file   "hello_1.c"
    .section    .rodata
.LC0:
    .string "Hello,world!"
    .text
.globl main
    .type   main, @function
main:
    pushl   %ebp
    movl    %esp, %ebp
    andl    $-16, %esp
    subl    $16, %esp
    movl    $.LC0, (%esp)
    call    puts
    movl    $0, %eax
    leave
    ret
    .size   main, .-main
    .ident  "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
    .section    .note.GNU-stack,"",@progbits

Sure enough, it differs from the hello.s code. Here ebp and esp are manipulated, and the leave and ret instructions are used. Could they be the cause?

A: In practice, whether or not code such as pushl %ebp, leave, or ret is added, the run still ends in a segmentation fault. I never fully understood this myself, and explanations are welcome. A likely cause is that the entry symbol is jumped to rather than called, so ret finds no return address on the stack, and without ret the CPU simply runs on into whatever bytes follow the last instruction. Either way, calling the exit system call to end the program avoids the segmentation fault, as follows:

.section .rodata
str:
.ascii "Hello,world.\n"

.section .text
.globl _main
_main:
movl  $4,    %eax    # the number of system call
movl  $1,    %ebx    # file descriptor, 1 means stdout
movl  $str,  %ecx    # string address
movl  $13,   %edx    # string length
int   $0x80

movl  $1,    %eax
movl  $0,    %ebx
int   $0x80

Running result:

Q: When the 0x80 software interrupt is invoked, where are the parameters stored? In the registers written above?

A: Yes. On 32-bit Linux, the system call number and the return value are both held in eax. The arguments are passed, in order, in ebx, ecx, edx, esi, and edi; most calls take no more than five, and a sixth, when present, goes in ebp. The write call above uses ebx, ecx, and edx; the exit call uses only ebx.

Q: What is system call number 4? Where can I look that up?

A: All system calls for the platform are listed in /usr/include/asm/unistd_32.h or /usr/include/asm/unistd_64.h. The following is the beginning of unistd_32.h:

#define __NR_restart_syscall      0
#define __NR_exit                 1
#define __NR_fork                 2
#define __NR_read                 3
#define __NR_write                4
#define __NR_open                 5
#define __NR_close                6
#define __NR_waitpid              7
#define __NR_creat                8
#define __NR_link                 9
#define __NR_unlink              10
#define __NR_execve              11
#define __NR_chdir               12
#define __NR_time                13
#define __NR_mknod               14
#define __NR_chmod               15
#define __NR_lchown              16
#define __NR_break               17

As you can see, system call number 1 is exit and number 4 is write, which are exactly the ones the code above uses.

Q: How are C library functions called from assembly?

A: Use the call instruction, pushing the arguments beforehand. The following code calls the C library's printf function:

.section .rodata
str:
.asciz "Hello,world.\n"    # NUL-terminated, as printf requires

.section .text
.globl main
main:
pushl $str
call  printf
pushl $0
call  exit

Save it as printf.s. Compile:

Run:

Q: Can as and ld be used to assemble and link it?

A: Yes. Note that because it uses the C library, the link step must name the C library: -lc.

Q: The multiplication instruction mul takes only one operand. Where is the other one?

A: The other operand is in al, ax, or eax, depending on whether mulb, mulw, or mull is used. The product is stored in ax, dx:ax, or edx:eax respectively, with the high-order half named first.

Similarly, the division instruction div is followed only by the divisor; the dividend is stored in ax, dx:ax, or edx:eax. The divisor is at most half as wide as the dividend. Where the quotient and remainder end up depends on which dividend is used:

If the dividend is in ax, the quotient is in al and the remainder in ah; if the dividend is in dx:ax, the quotient is in ax and the remainder in dx; if the dividend is in edx:eax, the quotient is in eax and the remainder in edx.

The following is the test code:

#include <stdio.h>
#include <string.h>

#define PRINT_D(longValue)       printf(#longValue" is %ld\n", ((long)longValue));
#define PRINT_STR(str)           printf(#str" is %s\n", (str));

static void assemble_func()
{
    int result_high, result_low;
    short result, remainder;

    // mul
    __asm__("mov $10, %eax");
    __asm__("mov $10, %ebx");
    __asm__("mull %ebx");
    __asm__("mov %%edx, %0":"=r"(result_high));
    __asm__("mov %%eax, %0":"=r"(result_low));
    PRINT_D(result_high)
    PRINT_D(result_low)

    // div
    __asm__("mov $0,   %dx");
    __asm__("mov $100, %ax");   // the dividend is dx:ax
    __asm__("mov $9,  %bx");
    __asm__("div %bx");         // the divisor is bx
    __asm__("movw %%ax, %0":"=r"(result));
    __asm__("movw %%dx, %0":"=r"(remainder));
    PRINT_D(result)
    PRINT_D(remainder)
}

int main()
{
    assemble_func();
    return 0;
}

Output result:

result_high is 0
result_low is 100
result is 11
remainder is 1

Q: How does the comparison instruction cmp work together with the conditional jump instructions?

A: The cmp instruction computes the difference between its two operands and sets the flags accordingly, without storing the result. If the difference is 0, the jz condition is true; otherwise the jnz condition is true. Example:

#include <stdio.h>
#include <string.h>

#define PRINT_D(longValue)      printf(#longValue" is %ld\n", ((long)longValue));
#define PRINT_STR(str)          printf(#str" is %s\n", (str));
#define PRINT(str)              printf(#str"\n");

static void assemble_func()
{
    __asm__("mov $10, %eax");
    __asm__("cmp $10, %eax ");
    __asm__("jz  end");
    PRINT("below jz")
    __asm__("end:");
    PRINT("the end")
}

int main()
{
    assemble_func();
    return 0;
}

Obviously, the jz condition holds, so the jump is taken and the output is:

"the end"

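The zero flag that jz tests can also be captured into a C variable instead of being used for a jump. A minimal sketch, assuming GCC or Clang on x86; the helper name zero_flag_after_cmp is made up for illustration, and other architectures fall back to a plain comparison.

```c
#include <stdio.h>

/* Hypothetical helper: returns the zero flag left by comparing a with b. */
static int zero_flag_after_cmp(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned char zf;
    __asm__("cmpl %2, %1\n\t"   /* AT&T order: computes %1 - %2, sets flags */
            "sete %0"           /* %0 = 1 if ZF is set, otherwise 0 */
            : "=q"(zf)          /* output: a byte-addressable register */
            : "r"(a), "r"(b)
            : "cc");            /* the flags are modified */
    return zf;
#else
    return a == b;              /* fallback for non-x86 targets */
#endif
}

int main(void)
{
    printf("ZF after cmp 10, 10: %d\n", zero_flag_after_cmp(10, 10));
    printf("ZF after cmp 10, 9:  %d\n", zero_flag_after_cmp(10, 9));
    return 0;
}
```
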
Q: In some cases an addition may overflow. How can this be detected?

A: The CPU's flags register contains the overflow flag OF, which can be tested with jo or jno.

#include <stdio.h>
#include <string.h>

#define PRINT_D(longValue)      printf(#longValue" is %ld\n", ((long)longValue));
#define PRINT_STR(str)          printf(#str" is %s\n", (str));
#define PRINT(str)              printf(#str"\n");

static void assemble_func()
{
    __asm__("movw   $0x7FFF,  %ax");
    __asm__("movw   $0x7FFF,  %bx");
    __asm__("addw   %bx,      %ax");
    __asm__("jo     overflow_set");
    __asm__("movl   $1,       %eax");
    __asm__("movl   $0,       %ebx");
    __asm__("int    $0x80");
    __asm__("overflow_set:");
    PRINT("overflow flag is set...")
}

int main()
{
    assemble_func();
    return 0;
}

Running result:

"overflow flag is set..."

Q: How is overflow determined?

A: Taking addition as an example: if two operands with the same sign produce a result with the opposite sign, the addition has overflowed.

Q: What is the difference between the flags OF and CF?

A: CF is the carry flag. A carry is not always an overflow. For example, adding 1 to 0xFFFF, which is -1 as a signed 16-bit integer, produces a carry but no overflow: in two's complement arithmetic the carry out is simply discarded, and the result, 0, is still correct.

#include <stdio.h>
#include <string.h>

#define PRINT_D(longValue)      printf(#longValue" is %ld\n", ((long)longValue));
#define PRINT_STR(str)          printf(#str" is %s\n", (str));
#define PRINT(str)              printf(#str"\n");

static void assemble_func()
{
    __asm__("movw   $0xFFFF,  %ax");
    __asm__("movw   $0x1,  %bx");
    __asm__("addw   %bx,      %ax");
    __asm__("jc     carry_set");    // jump if CF is set
    __asm__("movl   $1,       %eax");
    __asm__("movl   $0,       %ebx");
    __asm__("int    $0x80");
    __asm__("carry_set:");
    PRINT("carry flag is set...")
}

int main()
{
    assemble_func();
    return 0;
}

Running result:

"carry flag is set..."

Of course, jo can also be used to test whether the addition above overflows.

#include <stdio.h>
#include <string.h>

#define PRINT_D(longValue)      printf(#longValue" is %ld\n", ((long)longValue));
#define PRINT_STR(str)          printf(#str" is %s\n", (str));
#define PRINT(str)              printf(#str"\n");

static void assemble_func()
{
    __asm__("movw   $0xFFFF,  %ax");
    __asm__("movw   $0x1,  %bx");
    __asm__("addw   %bx,      %ax");
    __asm__("jo     overflow_set");
    __asm__("movl   $1,       %eax");
    __asm__("movl   $0,       %ebx");
    __asm__("int    $0x80");
    __asm__("overflow_set:");
    PRINT("overflow flag is set...")
}

int main()
{
    assemble_func();
    return 0;
}

Execution result:

There is no output, which means that OF is not set.
