Line: running native Linux programs on Windows (3): Bash and GCC now work, and the source code is released

Source: Internet
Author: User

Bash and GCC both run now, which is one more step toward a "usable" system. Today I tidied up the code and put it on Google Code; if you are interested, you can download it. If anyone wants to get involved, leave a message below and we can work on it together.

If we narrow the discussion to the x86 platform, the differences between Linux and Windows are much smaller than one might think, at least in user mode. So if you really want to, implementing the various *nix features on Windows is not as hard as it sounds, and conversely Windows features can be implemented on *nix. There are real examples of both: Cygwin and coLinux for the former, Wine for the latter, all mature projects. I have already described many times how my project differs from Cygwin. The difference from coLinux is that coLinux really has a virtual layer: it loads a hardware virtual layer into the Windows kernel and then runs a *real* Linux kernel on top of it (for the overall architecture, refer to this article). Line needs no virtual layer; it implements the Linux system calls **directly** in the Windows kernel. (That is the final goal, of course; for now it still relies on the Cygwin emulation layer.) Linux programs are basically all built on top of glibc, and glibc is pure user-mode code that talks to the kernel only through system calls, so if we implement all the Linux system calls in the Windows kernel, glibc cannot tell whether it is running on Windows or Linux.

Line has two core components: a kernel-mode int 80h handler and a user-mode ELF loader. The int 80h handler is currently very simple, about 20 lines of assembly, because in this first stage everything is implemented in user mode; moving the logic into the kernel is the next stage. The rough code is as follows:

    _InterruptHandler proc

        ; Check for SYSCALL_LINEXEC_HANDLER.
        ; The first thing line.exe does is register the address of its
        ; user-mode handler, so that a returning system call can be
        ; redirected there. For that one int 80h, eax is set to 0deadbeefh,
        ; which is not a valid system call number.
        cmp     eax, 0deadbeefh
        jne     reflect_syscall

        mov     ds:_SyscallHandlerPtr, ebx      ; save the address of the user-mode handler

        iretd


    reflect_syscall:
        push    eax

        ; Simple sanity check: has a handler been registered?
        mov     eax, ds:_SyscallHandlerPtr
        cmp     eax, 0
        je      no_handler

        push    ebx

        mov     ebx, dword ptr [esp + 8]        ; return address that int 80h saved automatically
        mov     dword ptr [esp + 8], eax        ; replace it with the user-mode handler

        mov     eax, dword ptr [esp + 8 + 12]   ; saved user-mode esp
        sub     eax, 4                          ; make room for one dword on the user stack
        mov     dword ptr [esp + 8 + 12], eax   ; write the adjusted esp back
        mov     dword ptr [eax], ebx            ; store the old return address there; the
                                                ; user-mode handler jumps back to it when done

        pop     ebx

        jmp     exit_handler

    no_handler:
        pop     eax
        push    -38                             ; -38 = -ENOSYS, returned in eax

    exit_handler:
        pop     eax
        iretd

    _InterruptHandler endp

    end

As shown above, this assembly changes the normal interrupt flow. The original flow is roughly: the program executes int 80h -> the CPU automatically pushes EIP onto the stack -> (1) the kernel does the work -> the kernel finishes and executes iretd -> control returns to user mode and EIP is restored automatically. The modified flow is roughly: the program executes int 80h -> the CPU automatically pushes EIP onto the stack (call it address 1) -> enter the kernel -> (2) replace the saved EIP on the stack with the address of the user-mode handler (call it address 2) -> save address 1 on the user stack -> iretd returns to user mode -> EIP is restored, now to address 2 -> the handler at address 2 does the work -> (3) when it finishes, it returns to address 1. The program that issued int 80h knows nothing about what happens between (2) and (3); it only sees execution resume at the instruction after the interrupt. Once the second stage is complete, the work between (2) and (3) will no longer be needed, because it will have moved into (1).
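The stack rewrite in step (2) can be modelled in plain C. This is only a simulation sketch, not driver code: `struct trap_frame`, `reflect_syscall`, the byte array standing in for user memory, and the bounds check that returns -1 are all assumptions made for illustration. Only the -38 (-ENOSYS) result and the order of the updates follow the assembly above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* What the CPU pushes on the kernel stack when int 80h arrives from user mode. */
struct trap_frame {
    uint32_t eip;     /* address 1: the instruction after int 80h */
    uint32_t cs;
    uint32_t eflags;
    uint32_t esp;     /* user-mode stack pointer */
    uint32_t ss;
};

/*
 * Model of step (2): make iretd "return" to the user-mode handler and leave
 * address 1 on top of the user stack so the handler can jump back to it.
 * mem simulates user memory starting at address mem_base (hypothetical).
 * Returns 0 on success, -38 if no handler is registered, and -1 if the
 * simulated stack has no room (a check the real handler does not make).
 */
int reflect_syscall(struct trap_frame *f, uint32_t handler,
                    uint8_t *mem, uint32_t mem_base, uint32_t mem_size)
{
    assert(f != NULL && mem != NULL);

    if (handler == 0)
        return -38;                       /* -ENOSYS, as in the assembly */

    if (mem_size < 4 || f->esp < mem_base + 4 || f->esp - mem_base > mem_size)
        return -1;

    uint32_t new_esp = f->esp - 4;        /* sub eax, 4 */
    memcpy(mem + (new_esp - mem_base), &f->eip, sizeof f->eip); /* save address 1 */
    f->esp = new_esp;                     /* user esp now points at address 1 */
    f->eip = handler;                     /* iretd will resume at address 2 */
    return 0;
}
```

After the call, the frame's `eip` holds the handler address and the dword at the new `esp` holds the original return address, which is the state the user-mode handler expects at step (3).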

Next, the ELF loader. Creating a process basically works like this: fork creates a new process -> the child process calls an exec* function, which replaces the current program with the target program. The replacement itself is roughly: load the ELF file -> parse the import table -> load all dependent libraries -> fix up the addresses of the functions exported by those libraries -> find the ELF entry point -> jump to it. Line reimplements this whole process, because Windows only has a PE loader, which does not recognize the ELF format, so we have to load ELF files ourselves. Suppose we are running bash and enter the command ls -al. On Linux this is roughly: fork -> the parent process waits, the child process calls exec*("ls", argv, env) // argv[0] = ls, argv[1] = -al. Under Line it becomes: fork -> the parent process waits, the child process calls exec*("line.exe", argv, env) // argv[0] = line, argv[1] = ls, argv[2] = -al. In other words, what would have been a new "ls -al" process becomes a new "line ls -al" process, so every process is loaded by Line. As for the details of ELF loading, both the Linux kernel source code and the source code of ld-linux explain them thoroughly.
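The argv rewrite described above amounts to putting the loader's name in front of the original argument vector. The following is a minimal sketch of that idea only; the function name `prepend_loader` and its ownership rules are hypothetical and are not taken from the Line source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Build a new NULL-terminated argument vector with loader_name in front,
 * so that {"ls", "-al"} becomes {"line", "ls", "-al"}.
 * The strings are shared, not copied; the caller frees only the returned
 * array. Returns NULL if the allocation fails.
 */
char **prepend_loader(const char *loader_name, char *const argv[])
{
    assert(loader_name != NULL && argv != NULL);

    size_t argc = 0;
    while (argv[argc] != NULL)
        argc++;

    /* One extra slot for the loader name, one for the terminating NULL. */
    char **out = malloc((argc + 2) * sizeof *out);
    if (out == NULL)
        return NULL;

    out[0] = (char *)loader_name;
    for (size_t i = 0; i < argc; i++)
        out[i + 1] = argv[i];
    out[argc + 1] = NULL;
    return out;
}
```

The resulting vector would then be passed to the exec* call in place of the original one, with the path of line.exe as the program to execute.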

There are two more difficult problems: memory layout and paths. line.exe is still a PE file loaded by Windows, and once loaded it occupies some memory blocks (generally at 0x40000000 or above). If the ELF file needs those addresses, we are in trouble. Fortunately, an ELF executable's base address is generally 0x08048000, far from 0x40000000, which leaves us enough room. In practice, however, some addresses below 0x40000000 are marked non-executable by Windows, so we must re-mark the required memory blocks before loading the ELF file.
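Before mapping an ELF segment, a loader can check whether the pages the segment needs collide with memory that is already in use. The sketch below shows one way to do that check; `struct range`, `segment_pages`, `ranges_overlap`, and `LINE_PAGE_SIZE` are names invented for this example, and it assumes 4 KiB pages and addresses that do not wrap around 4 GiB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LINE_PAGE_SIZE 0x1000u   /* assumed 4 KiB pages on x86 */

/* Half-open address range [start, end). */
struct range {
    uint32_t start;
    uint32_t end;
};

/* Expand [vaddr, vaddr + memsz) outwards to whole pages. */
struct range segment_pages(uint32_t vaddr, uint32_t memsz)
{
    assert(memsz > 0);

    struct range r;
    r.start = vaddr & ~(LINE_PAGE_SIZE - 1);
    r.end = (vaddr + memsz + LINE_PAGE_SIZE - 1) & ~(LINE_PAGE_SIZE - 1);
    return r;
}

/* Two half-open ranges overlap when each one starts before the other ends. */
bool ranges_overlap(struct range a, struct range b)
{
    return a.start < b.end && b.start < a.end;
}
```

A segment at 0x08048000 does not overlap a block occupied at 0x40000000, which matches the observation above that the two regions are far apart.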

Paths are definitely one of the nastiest parts of doing this on Windows. I remember saying in an earlier blog post that the Windows namespace could be used to simulate a single-rooted file system. That is entirely feasible in the second stage, but hard in the first stage, where everything is implemented in user mode. My current solution is to rewrite every absolute path relative to Line's own directory. For example, if line.exe is placed in C:/line, then when a Linux program accesses /bin, I prepend that directory and turn it into C:/line/bin. Likewise /lib becomes C:/line/lib, and so on. So far this mechanism has worked well. Virtual paths such as /proc and /dev are not implemented yet, though Cygwin's approach can probably be borrowed for them. The path conversion function is as follows:

    void change_path_to_relative(char *des, char *src)
    {
        char root_path[MAX_PATH] = {0};
        char *slash = NULL;

        if (!src || !*src || !des) {
            return;
        }

        /* Relative paths are copied through unchanged. */
        if (src[0] != '/') {
            strcpy(des, src);
            return;
        }

        strcpy(root_path, linexec_exe); // assume the path of line.exe was saved at startup

        /* Cut off the file name so that only the directory remains. */
        slash = strrchr(root_path, '/');
        if (!slash) {
            strcpy(des, src);
            return;
        }
        *slash = '\0';

        /* Prepend Line's directory to the absolute Linux path. */
        strcpy(des, root_path);
        strcat(des, src);
        return;
    }
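The function above trusts the caller to supply a large enough destination buffer. As a hedged variation, the same mapping can be written with an explicit buffer size. `map_path` and its parameters are hypothetical names for this sketch, and it takes the root directory as an argument instead of deriving it from `linexec_exe`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Map a Linux path into the directory tree under root_dir.
 * Absolute paths ("/bin/ls") get root_dir prepended; relative paths are
 * copied unchanged. Returns 0 on success, or -1 if the result (including
 * the terminating NUL) does not fit in des_size bytes.
 */
int map_path(char *des, size_t des_size, const char *root_dir, const char *src)
{
    assert(des != NULL && des_size > 0 && root_dir != NULL && src != NULL);

    size_t root_len = (src[0] == '/') ? strlen(root_dir) : 0;
    size_t src_len = strlen(src);

    if (root_len + src_len + 1 > des_size)
        return -1;

    memcpy(des, root_dir, root_len);
    memcpy(des + root_len, src, src_len + 1);   /* copies the NUL as well */
    return 0;
}
```

With root_dir set to "C:/line", the path "/bin/ls" maps to "C:/line/bin/ls", while a relative path such as "foo.txt" is left as it is.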

The build and run steps are given below. A prebuilt package, bundled with the libraries I copied from Red Hat 6.0, is over 200 MB even when compressed. I will upload and release it once I find some free hosting space.

Compiling the driver:

1. Download and install the WDK

2. Open the WDK build environment and change to the src directory of the source code

3. Enter the int80 directory

4. Run build -g -C

5. Change to the i386 directory that contains the generated int80.sys.

6. Run install.bat as administrator.

Compiling the program:

1. Download and install Cygwin

2. Start Cygwin and change to the src directory of the source code.

3. Run make

4. If nothing goes wrong, several DLL and EXE files are generated in the src directory. Copy all of them to another directory, such as C:/line.

5. Find cygwin1.dll and cyggcc_s-1.dll in the Cygwin installation directory and copy them to your directory.

6. Find some old Linux programs and libraries (make sure they do not use NPTL; I have not yet figured out how to implement that...), copy them into your directory, and keep their directory structure so that your directory acts as the root directory.

7. In cmd, change to your directory and run line.exe bash.
