Talking about the Compatible Kernel, Part 1: How Does ReactOS Implement System Calls?
Mao Decao
Some readers posted on the forum asking me to discuss how ReactOS implements system calls. Since I last discussed how the compatible kernel should implement Windows system calls, it is natural to follow up with how ReactOS does it, so that is the topic this time. However, this clearly does not belong in the "Talking about Wine" series, and there is no need to start a separate "Talking about ReactOS" series. We therefore decided to put all topics other than Wine into a "Talking about the Compatible Kernel" series.
The goal of the ReactOS project is to develop an open-source Windows. It goes without saying that the system calls it implements are those of Windows, that is, it faithfully implements the Windows system call interface. This article is not about the Windows system call interface itself but about how ReactOS implements it: mainly how an application in user space enters and leaves the kernel (system space) and how it calls the functions defined on this interface. ReactOS enters the kernel for a system call through the "int 0x2e" instruction. ReactOS is not Windows, and its authors may never have seen the Windows source code. Even so, I believe that the ReactOS code, at least in this respect, must be very similar to the "original" Windows code, differing only in details.
The following uses the system call NtReadFile() as an example to show how to read the ReactOS code and how ReactOS implements a system call.
First, a Windows application calls library functions defined by the Win32 API. These library functions are mostly implemented in dynamic-link libraries, that is, DLLs. For example, ReadFile() is a library function defined in the Win32 API. If you open kernel32.dll and look at its export table, you will find ReadFile() there. In addition, both Microsoft's VC development environment (Visual Studio) and the Win2k DDK include a header file, winbase.h, which contains the interface definition of ReadFile():
WINBASEAPI
BOOL
WINAPI
ReadFile(
    IN HANDLE hFile,
    OUT LPVOID lpBuffer,
    IN DWORD nNumberOfBytesToRead,
    OUT LPDWORD lpNumberOfBytesRead,
    IN LPOVERLAPPED lpOverlapped
    );
The keyword WINAPI before the function name marks this as a function defined in the Win32 API (the macro itself expands to the __stdcall calling convention).
The ReactOS code also contains a winbase.h, in the reactos/w32api/include directory:
BOOL WINAPI ReadFile(HANDLE, PVOID, DWORD, PDWORD, LPOVERLAPPED);
The two are obviously the same (otherwise they would not be compatible). Microsoft, of course, does not publish the code of this function, but ReactOS provides an open-source implementation. Its code is in reactos/lib/kernel32/file/rw.c.
BOOL STDCALL
ReadFile(HANDLE hFile, LPVOID lpBuffer, DWORD nNumberOfBytesToRead,
         LPDWORD lpNumberOfBytesRead, LPOVERLAPPED lpOverlapped)
{
    ......
    errCode = NtReadFile(hFile,
                         hEvent,
                         NULL,
                         NULL,
                         IoStatusBlock,
                         lpBuffer,
                         nNumberOfBytesToRead,
                         ptrOffset,
                         NULL);
    ......
    return(TRUE);
}
Here we care only about NtReadFile(), so the rest of the code is omitted.
In the ReactOS code, the kernel function NtReadFile() is declared in reactos/include/ntos/zw.h; the same declaration also appears in reactos/w32api/include/ddk/winddk.h:
NTSTATUS
STDCALL
NtReadFile(
    IN HANDLE FileHandle,
    IN HANDLE Event OPTIONAL,
    IN PIO_APC_ROUTINE UserApcRoutine OPTIONAL,
    IN PVOID UserApcContext OPTIONAL,
    OUT PIO_STATUS_BLOCK IoStatusBlock,
    OUT PVOID Buffer,
    IN ULONG BufferLength,
    IN PLARGE_INTEGER ByteOffset OPTIONAL,
    IN PULONG Key OPTIONAL
    );
The corresponding implementation is in reactos/ntoskrnl/io/rw.c.
On the surface this looks quite normal: ReadFile() calls NtReadFile(), and reactos/ntoskrnl/io/rw.c supplies the NtReadFile() being called. On closer thought, however, something is wrong. ReadFile() runs in user space, while the code in reactos/ntoskrnl/io/rw.c is in the kernel, in system space. Can a user-space program call a kernel function directly like this? If it could, what would be the point of traps and the system call mechanism? And how would the two even be linked together at build time?
We can conclude that there is something more behind this. Look at NtReadFile() in msvc6/iface/native/syscall/debug/zw.c:
__declspec(naked) __stdcall
NtReadFile(int dummy0, int dummy1, int dummy2)
{
    __asm {
        push ebp
        mov ebp, esp
        mov eax, 152
        lea edx, 8[ebp]
        int 0x2E
        pop ebp
        ret 9
    }
}
It turns out that user space also has an NtReadFile(), and it is this function that executes the trap instruction "int 0x2e". Look at the assembly code. Here 152 is the call number of the system call NtReadFile(), so when the CPU traps into system space, register eax holds the system call number. After the lea instruction, register edx holds the value the stack pointer had just before the call instruction that invoked this function, which is the start of the call's parameters on the stack. Windows and Linux differ markedly in how parameters are passed to a system call. In Linux, parameters are passed in registers, which is efficient but limits how many can be passed. Linux system calls therefore take few parameters; when many values are needed, they are packed into a data structure and only a pointer to it is passed. In Windows, parameters are passed on the stack. As seen above, ReadFile() calls NtReadFile() with nine parameters. These nine parameters are pushed onto the stack, and edx points to where they start (the lowest address). The stub neither touches the parameters on the stack nor pushes any more, so the nine parameters are passed on unmodified as the parameters of the int 0x2e trap. When the CPU traps into the kernel, edx therefore still points to these parameters on the user-space stack. Of course, once in the kernel the CPU runs on the system-space stack rather than the user-space stack, so the parameters have to be copied over from user space with something like copy_from_user(), using the value in edx as the source pointer. As for register ebp, it serves as the stack frame pointer for the call to this function.
When the kernel has completed the system call and the CPU returns to user space, the next instruction is "pop ebp", which restores the stack frame pointer of the calling function. The "ret 9" instruction then returns to the caller and adjusts the stack pointer to skip the nine call parameters on the stack. In "authentic" x86 assembly language the operand of ret is a byte count, so it should be "ret 24h" (9 parameters x 4 bytes = 36 = 0x24); here it is counted in 4-byte longwords, apparently because a different assembler tool is used.
The caller of a subroutine pushes the parameters onto the stack and so passes them to the callee. But who removes the parameters from the stack when the CPU returns from the subroutine? Either the caller or the callee must do it, and a convention is needed so that both sides agree. In the NtReadFile() stub above, it is the callee that takes on this responsibility, adjusting the stack pointer as it returns. The __stdcall in front of the function code says exactly this. Likewise, STDCALL is placed before the declaration of NtReadFile() in the .h file to state the same convention. The book "Undocumented Windows 2000 Secrets" (pp. 51-53) describes such conventions at length. On the other hand, the stub above declares 3 parameters rather than 9. The code shows, however, that these parameters are never used, and the caller, ReadFile(), calls NtReadFile() with nine parameters anyway. The three parameters here are therefore pure placeholders; it does not matter whether they exist or how many there are. No wonder the code calls them "dummy".
The user-space NtReadFile() represents the kernel function NtReadFile() to the code above it, and represents the function that wants to call the kernel NtReadFile(), here ReadFile(), to the kernel below it; it adds no functionality of its own. Such an intermediate function is called a "stub".
Admittedly, this ReactOS practice of giving both functions the same name easily confuses readers. The Linux practice is clearer by comparison: the application calls the library function write(), and the corresponding kernel function is sys_write().
So why does ReactOS do this? I can only guess:
(1) In Windows, a tool such as depends.exe shows that both ntdll.dll and ntoskrnl.exe export a function named NtReadFile(), and ReactOS simply followed the same pattern.
(2) As a development path, ReactOS may not have separated user space and system space at first, with all code running in the same space, so that applications could call kernel functions directly. This would make the development of, say, the file system easier. Then, once the major functions were in place, user space and system space would be separated and the mechanism for crossing between them added. Judging from the zw.c file in the native/syscall/debug directory, ReactOS seems to be in the middle of taking this step.
(3) The authors of ReactOS may also intend it to be usable in embedded systems. Embedded systems usually link the application and the kernel into one executable image without separating user space and system space. When the code is built for an embedded system the stubs are simply left out; when it is built for a desktop system, the stubs are added in user space and the handling of the trap instruction "int 0x2e" is added in the kernel.
In Windows, the stub function NtReadFile() lives in ntdll.dll. In fact, all stubs for the int 0x2e system calls are in this DLL. All of these stubs clearly follow the same pattern and differ only in the system call number and the number of parameters, so ReactOS generates them automatically with a tool. The tool's code is in msvc6/iface/native/genntdll.c. Here is an excerpt:
void write_syscall_stub(FILE* out, FILE* out3, char* name, char* name2,
                        char* nr_args, unsigned int sys_call_idx)
{
    int i;
    int nArgBytes = atoi(nr_args);
#ifdef PARAMETERIZED_LIBS
    ......
#else
    fprintf(out, "__asm__(\"\\n\\t.global _%s\\n\\t\"\n", name);
    fprintf(out, "\".global _%s\\n\\t\"\n", name2);
    fprintf(out, "\"_%s:\\n\\t\"\n", name);
    fprintf(out, "\"_%s:\\n\\t\"\n", name2);
#endif
    fprintf(out, "\t\"pushl\t%%ebp\\n\\t\"\n");
    fprintf(out, "\t\"movl\t%%esp, %%ebp\\n\\t\"\n");
    fprintf(out, "\t\"mov\t$%d, %%eax\\n\\t\"\n", sys_call_idx);
    fprintf(out, "\t\"lea\t8(%%ebp), %%edx\\n\\t\"\n");
    fprintf(out, "\t\"int\t$0x2E\\n\\t\"\n");
    fprintf(out, "\t\"popl\t%%ebp\\n\\t\"\n");
    fprintf(out, "\t\"ret\t$%s\\n\\t\");\n", nr_args);
    ......
}
'\t' in the code stands for the tab character. Readers should have no trouble with this code. It uses the parameters name, nr_args, and sys_call_idx to generate the assembly code of the stub for a given system call. So where do these parameters come from? The reactos/tools/nci directory of the ReactOS code contains a file sysfuncs.lst. Here are a few lines from that file:
NtAcceptConnectPort 6
NtAccessCheck 8
NtAccessCheckAndAuditAlarm 11
NtAddAtom 3
......
NtClose 1
......
NtReadFile 9
......
The first line is the system call NtAcceptConnectPort(), whose call number is 0 and which has six parameters. NtClose(), by contrast, has only one parameter. NtReadFile() has nine parameters and sits on line 153 of the table, so its call number is 152.
Once the user-space program executes int 0x2e, the CPU enters system space. I will not go into the physical process here; if needed, see "Scenario Analysis" or other relevant material. Here I will only describe how the CPU gets to the int 0x2e trap handler.
As with other interrupt vectors, ReactOS sets up the int 0x2e vector in its initialization routine KeInitExceptions(). The code of this function is in reactos/ntoskrnl/ke/i386/exp.c:
VOID INIT_FUNCTION
KeInitExceptions(VOID)
/*
 * FUNCTION: Initalize CPU exception handling
 */
{
    ......
    set_trap_gate(0, (ULONG)KiTrap0, 0);
    set_trap_gate(1, (ULONG)KiTrap1, 0);
    set_trap_gate(2, (ULONG)KiTrap2, 0);
    set_trap_gate(3, (ULONG)KiTrap3, 3);
    ......
    set_system_call_gate(0x2d, (int)interrupt_handler2d);
    set_system_call_gate(0x2e, (int)KiSystemService);
}
Obviously, the int 0x2e vector points to KiSystemService().
ReactOS strives to match Windows in the naming and definition of its kernel functions, so the ReactOS kernel also has functions prefixed with Ke and Ki. The prefix Ke means the function belongs to the "kernel" module. Note that the so-called "kernel" module in Windows is only one part of the kernel, not the whole of it; I will discuss this later in "Talking about Wine". The prefix Ki marks kernel functions related to interrupt response and handling. KiSystemService() is an assembly routine that plays the role of system_call() in the Linux kernel. Its code is in reactos/ntoskrnl/ke/i386/syscall.S. For reasons of space I will not list all of the function here, but I will go through the important parts in segments. Readers who can follow system_call() in the Linux kernel should be able to read this function, at least in outline.
_KiSystemService:
    /*
     * Construct a trap frame on the stack.
     * The following are already on the stack.
     */
    // SS                                  + 0x0
    // ESP                                 + 0x4
    // EFLAGS                              + 0x8
    // CS                                  + 0xC
    // EIP                                 + 0x10
    pushl $0                            // + 0x14
    pushl %ebp                          // + 0x18
    pushl %ebx                          // + 0x1C
    pushl %esi                          // + 0x20
    pushl %edi                          // + 0x24
    pushl %fs                           // + 0x28

    /* Load PCR Selector into fs */
    movw $PCR_SELECTOR, %bx
    movw %bx, %fs

    /* Save the previous exception list */
    pushl %fs:KPCR_EXCEPTION_LIST       // + 0x2C

    /* Set the exception handler chain terminator */
    movl $0xffffffff, %fs:KPCR_EXCEPTION_LIST

    /* Get a pointer to the current thread */
    movl %fs:KPCR_CURRENT_THREAD, %esi
The instructions above mainly save the context, much like the SAVE_ALL macro in the Linux kernel. The key step is fetching the pointer to the current thread from %fs:KPCR_CURRENT_THREAD into register %esi. Each thread has a KTHREAD data structure in the kernel, the counterpart of the process control block (task_struct) in the Linux kernel. Windows also has process control blocks in the kernel, but they hold only the information shared by the threads of a process, and the thread control block plays the more important role. The pointer to the current thread is the pointer to that thread's KTHREAD structure. When the kernel schedules a thread to run, it stores the address of the thread's KTHREAD structure at %fs:KPCR_CURRENT_THREAD. While the CPU is in system space, %fs always holds PCR_SELECTOR (defined as 0x30). In addition, the Win2k kernel maps %fs:0 to the linear address 0xffdff000 (see the "Secrets" book, p. 428).
In short, from now on register %esi points to the KTHREAD structure of the current thread. Why does this step matter for system calls? A look at a few components of this structure explains it:
typedef struct _KTHREAD
{
    /* For waiting on thread exit */
    DISPATCHER_HEADER DispatcherHeader;    /* 00 */
    ......
    SSDT_ENTRY *ServiceTable;              /* DC */
    ......
    UCHAR PreviousMode;                    /* 137 */
    ......
} KTHREAD;
The comment after each component gives its offset within the data structure, in bytes. For example, the offset of the pointer ServiceTable is 0xdc. This pointer is what concerns us most at the moment, because it leads directly to the function jump table for system calls. In each thread it points to an array of SSDT_ENTRY structures. Since every thread has such a pointer, each thread could have its own ServiceTable; in practice, however, the ServiceTable pointers of all threads usually point to the same array. We will look at this array shortly. First, more code.
    /* Save the old previous mode */
    pushl %ss:KTHREAD_PREVIOUS_MODE(%esi)      // + 0x30

    /* Set the new previous mode based on the saved CS selector */
    movl 0x24(%esp), %ebx
    andl $1, %ebx
    movb %bl, %ss:KTHREAD_PREVIOUS_MODE(%esi)

    /* Save other registers */
    pushl %eax                                 // + 0x34
    pushl %ecx                                 // + 0x38
    pushl %edx                                 // + 0x3C
    pushl %ds                                  // + 0x40
    pushl %es                                  // + 0x44
    pushl %gs                                  // + 0x48
    sub $0x28, %esp                            // + 0x70

#ifdef DBG
    ......
#else
    pushl 0x60(%esp) /* DebugEIP */            // + 0x74
#endif
    pushl %ebp /* DebugEBP */                  // + 0x78

    /* Load the segment registers */
    sti
    movw $KERNEL_DS, %bx
    movw %bx, %ds
    movw %bx, %es

    /* Save the old trap frame pointer where EDX would be saved */
    movl KTHREAD_TRAP_FRAME(%esi), %ebx
    movl %ebx, KTRAP_FRAME_EDX(%esp)

    /* Allocate new Kernel stack frame */
    movl %esp, %ebp

    /* Save a pointer to the trap frame in the TCB */
    movl %ebp, KTHREAD_TRAP_FRAME(%esi)

CheckValidCall:
#ifdef DBG
    ......
#endif
    /*
     * Find out which table offset to use. Converts 0x1124 into 0x10.
     * The offset is related to the Table Index as such: Offset = TableIndex x 10
     */
    movl %eax, %edi
    shrl $8, %edi
    andl $0x10, %edi
    movl %edi, %ecx

    /* Now add the thread's base system table to the offset */
    addl KTHREAD_SERVICE_TABLE(%esi), %edi
Here we are concerned with the last part. First, KTHREAD_SERVICE_TABLE(%esi) is the ServiceTable pointer of the current thread. The constant KTHREAD_SERVICE_TABLE is defined as 0xdc:
#define KTHREAD_SERVICE_TABLE 0xDC
This obviously agrees with the definition of the KTHREAD data structure shown earlier.
As mentioned above, the ServiceTable pointers of all threads normally point to the same array of structures, namely KeServiceDescriptorTable[]:
SSDT_ENTRY
__declspec(dllexport)
KeServiceDescriptorTable[SSDT_MAX_ENTRIES] = {
    { MainSSDT, NULL, NUMBER_OF_SYSCALLS, MainSSPT },
    { NULL, NULL, 0, NULL },
    { NULL, NULL, 0, NULL },
    { NULL, NULL, 0, NULL }
};
This array generally has 4 elements, but only the first two are used. Here we use only the first element, the jump table for ordinary Windows system calls.
I mentioned before that Windows moved many functions originally implemented in user space (mainly graphical interface operations) into the kernel as the kernel module win32k.sys, and added a group of "extended system calls". The second element of this array is reserved for the extended system calls, but it is empty in the source code because win32k.sys can be loaded dynamically, and the actual pointers are filled in after loading. Extended system calls differ from ordinary ones in that their call numbers are greater than or equal to 0x1000, while ordinary call numbers are below 0x1000. The kernel therefore has to decide from the system call number which jump table, that is, which element of the array above, to use. Each element is 16 bytes long, so computing an offset from the system call number is enough to select the jump table: an offset of 0 selects the ordinary jump table, and 0x10 selects the extended one.
The code above does exactly this. It shifts the copy of the system call number (in %edi) right by 8 bits and then ANDs it with 0x10. The instruction "addl KTHREAD_SERVICE_TABLE(%esi), %edi" then makes %edi point to the selected jump table descriptor, an SSDT_ENTRY structure. The code's author added a comment saying "Converts 0x1124 into 0x10", meaning that for system call number 0x1124 the computed offset is 0x10; the next comment line means "offset = array index multiplied by 0x10".
The third component of the SSDT_ENTRY structure, at offset 8, is an integer giving the number of pointers in the function jump table, that is, the upper bound on valid system call numbers. For ordinary system calls this integer is NUMBER_OF_SYSCALLS, defined as 232 in the ReactOS code, slightly fewer than in Win2k.
Let us continue with the code:
    /* Get the true syscall ID and check it */
    movl %eax, %ebx
    andl $0x0FFF, %eax
    cmpl 8(%edi), %eax

    /* Invalid ID, try to load Win32K Table */
    jnb KiBBTUnexpectedRange

    /* Users's current stack frame pointer is source */
    movl %edx, %esi

    /* Allocate room for argument list from kernel stack */
    movl 12(%edi), %ecx
    movb (%ecx, %eax), %cl
    movzx %cl, %ecx

    /* Allocate space on our stack */
    subl %ecx, %esp
As the comment says, the code first checks that the system call number is within the valid range. The value compared against is evidently NUMBER_OF_SYSCALLS.
As mentioned earlier, register %edx points to the call frame on the user-space stack, in effect to the passed parameters. This pointer is now copied into %esi in preparation for copying the parameters from the user-space stack. A source pointer alone is not enough, though: the copy length in bytes is also needed, that is, the number of parameters multiplied by 4, so the kernel must know how many parameters the particular system call takes. This information is kept in an unsigned byte array indexed by system call number (so the total parameter length of any system call cannot exceed 255 bytes), and the fourth component of the SSDT_ENTRY structure (at offset 12, or 0xc) is the pointer to this array. For ordinary system calls the array is MainSSPT. Its contents presumably also come from sysfuncs.lst. The code first points %ecx at MainSSPT, then adds the system call number in %eax to address the corresponding element, and the movb instruction fetches that byte. %ecx thus holds the parameter copy length for the given system call. Subtracting %ecx from %esp reserves that many bytes on the system-space stack, which prepares for copying the parameters from the user-space stack to the system-space stack. Let us read on:
    /* Get pointer to function */
    movl (%edi), %edi
    movl (%edi, %eax, 4), %eax

    /* Copy the arguments from the user stack to our stack */
    shr $2, %ecx
    movl %esp, %edi
    cld
    rep movsd

    /* Do the system call */
    call *%eax
    movl %eax, KTRAP_FRAME_EAX(%ebp)

    /* Deallocate the kernel stack frame */
    movl %ebp, %esp
By this point register %edi points to the SSDT_ENTRY structure for ordinary system calls, that is, to its first component. The first component of SSDT_ENTRY is a pointer to an array of function pointers; for ordinary system calls this is MainSSDT. The instruction "movl (%edi), %edi" loads into %edi the value that %edi points to, so %edi, which pointed at this pointer, now points at MainSSDT. This too is an array indexed by system call number, defined as follows:
SSDT MainSSDT[] = {
    { (ULONG)NtAcceptConnectPort },
    { (ULONG)NtAccessCheck },
    { (ULONG)NtAccessCheckAndAuditAlarm },
    ......
    { (ULONG)NtReadFile },
    ......
};
In our example, the instruction "movl (%edi, %eax, 4), %eax" loads %eax with the contents of the address %edi plus 'system call number multiplied by 4', so %eax now points to NtReadFile(). The parameters are then copied from the user-space stack to the system-space stack. Note that the length in %ecx is in bytes, so it is shifted right by two bits to convert it to a count of longwords.
Finally, the "call *%eax" instruction takes the CPU into NtReadFile() in the kernel, whose code is in reactos/ntoskrnl/io/rw.c. By Linux conventions this would be named sys_NtReadFile():
NTSTATUS STDCALL
NtReadFile(IN HANDLE FileHandle,
           IN HANDLE Event OPTIONAL,
           IN PIO_APC_ROUTINE ApcRoutine OPTIONAL,
           IN PVOID ApcContext OPTIONAL,
           OUT PIO_STATUS_BLOCK IoStatusBlock,
           OUT PVOID Buffer,
           IN ULONG Length,
           IN PLARGE_INTEGER ByteOffset OPTIONAL,
           IN PULONG Key OPTIONAL)
{
    ......
}
This function is called with the same interface that the application uses in user space for the system call, and the nine parameters the application pushed onto the user-space stack have been copied to the corresponding position on the system-space stack. To this function, therefore, it is as if its caller, ReadFile() in our scenario, were also in system space.
Back to the assembly code above. When the CPU returns from the target function, register %eax holds the function's return value. This value is to be returned to user space, so it is saved in the trap frame on the stack.
What follows is the return from the kernel to user space. I leave that code for readers to study on their own, but a few hints are in order:
(1) APC in the code stands for "Asynchronous Procedure Call", which is comparable to a signal in Linux.
(2) Windows divides the running state of the kernel into levels. At the highest levels hardware interrupts are disabled (or lower-priority hardware interrupts are disabled). At the next levels (level 2 and level 1) thread scheduling is disabled but hardware interrupts are allowed: DPCs (level 2, comparable to bottom-half functions) and APCs (level 1, comparable to signals) run with scheduling disabled. At the lowest level (level 0) scheduling is allowed.
(3) Code inside the kernel can also make a system call through _KiSystemService() (via a kernel version of the stub function). The code therefore has to detect which mode the CPU was running in before it entered _KiSystemService(), and the thread's KTHREAD structure has a component PreviousMode to hold this information. KTHREAD_PREVIOUS_MODE(%esi) refers to the PreviousMode of the current thread.
KeReturnFromSystemCall:
    /* Get the Current Thread */
    movl %fs:KPCR_CURRENT_THREAD, %esi

    /* Restore the old trap frame pointer */
    movl KTRAP_FRAME_EDX(%esp), %ebx
    movl %ebx, KTHREAD_TRAP_FRAME(%esi)

_KiServiceExit:
    /* Get the Current Thread */
    cli
    movl %fs:KPCR_CURRENT_THREAD, %esi

    /* Deliver APCs only if we were called from user mode */
    testb $1, KTRAP_FRAME_CS(%esp)
    je KiRosTrapReturn

    /* And only if any are actually pending */
    cmpb $0, KTHREAD_PENDING_USER_APC(%esi)
    je KiRosTrapReturn

    /* Save pointer to Trap Frame */
    movl %esp, %ebx

    /* Raise IRQL to APC_LEVEL */
    movl $1, %ecx
    call @KfRaiseIrql@4

    /* Save old IRQL */
    pushl %eax

    /* Deliver APCs */
    sti
    pushl %ebx
    pushl $0
    pushl $UserMode
    call _KiDeliverApc@12
    cli

    /* Return to old IRQL */
    popl %ecx
    call @KfLowerIrql@4

KiRosTrapReturn:
    /* Skip debug information and unsaved registers */
    addl $0x30, %esp                    // + 0x48
    popl %gs                            // + 0x44
    popl %es                            // + 0x40
    popl %ds                            // + 0x3C
    popl %edx                           // + 0x38
    popl %ecx                           // + 0x34
    popl %eax                           // + 0x30

    /* Restore the old previous mode */
    popl %ebx                           // + 0x2C
    movb %bl, %ss:KTHREAD_PREVIOUS_MODE(%esi)

    /* Restore the old exception handler list */
    popl %fs:KPCR_EXCEPTION_LIST        // + 0x28

    /* Restore final registers from trap frame */
    popl %fs                            // + 0x24
    popl %edi                           // + 0x20
    popl %esi                           // + 0x1C
    popl %ebx                           // + 0x18
    popl %ebp                           // + 0x14
    add $4, %esp                        // + 0x10

    /* Check if previous CS is from user-mode */
    testl $1, 4(%esp)

    /* It is, so use Fast Exit */
    jnz FastRet

    /*
     * Restore what the stub pushed, and return back to it.
     * Note that we were CALLed, so the first thing on our stack is the ret EIP!
     */
    pop %edx                            // + 0x0C
    pop %ecx                            // + 0x08
    popf                                // + 0x04
    jmp *%edx

IntRet:
    iret

FastRet:
    /* Is SYSEXIT Supported/Wanted? */
    cmpl $0, %ss:_KiFastSystemCallDisable
    jnz IntRet
    ......
Readers familiar with Linux know that before returning to user space the CPU should invoke the functions related to process (thread) scheduling, so they will expect to find such an operation in this code, yet none is visible. In fact it is hidden inside KfLowerIrql().
Readers who have followed this far will see where we are heading. Our goal, however, is not to pile KiSystemService() on top of Linux's system_call() and make the two coexist, but to merge the former into the latter. Besides, even if KiSystemService() could be copied over, KfLowerIrql() could not, because of the other functions KfLowerIrql() calls in turn; copying those too would end up piling the whole ReactOS kernel into the Linux kernel. We therefore need to study and learn from the ReactOS kernel's implementation and work out how to graft it into the Linux kernel. That is of course a challenging task.