Recently, some readers have repeatedly mentioned the open-source project kernel-win32, which aims to move the functions of wineserver into the kernel. Some have asked me to analyze and explain its code; others have asked how the compatible kernel relates to this project. So, starting with this article, let us talk about kernel-win32.
First of all, a compatible kernel project should draw nourishment from every available open-source project, and may sometimes even simply take over whatever is useful. These are all open-source projects, so this is fine as long as the relevant license terms are respected. In this sense we must certainly learn from kernel-win32, and may well want to "take" parts of it. However, what we learn and what we adopt must rest on objective analysis and be consistent with our ultimate goal. I believe that after reading the next few articles, readers will understand why I listed Wine, ReactOS, and NDISwrapper as the three main sources for the compatible kernel, and did not list kernel-win32 among them.
In general, kernel-win32 moves some of the functions and mechanisms originally provided by the Wine server process into the Linux kernel. Specifically, in the version currently available, these are:
1. File operations.
2. Semaphore operations.
3. Mutex operations.
4. Event operations.
5. The WaitForMultipleObjects() system call as a means of synchronization.
All these mechanisms and functions share a common basis: the implementation of the various kernel "objects" and their handles. Because opened objects are resources of a process and are shared by all threads in that process, they are tied to the implementation and management of processes and threads.
In addition, kernel-win32 provides an RPC mechanism more efficient than Wine's, to speed up communication between application processes and the Wine server process.
However, the implementation of kernel-win32 is not complete; it does not even form a self-contained part of the whole. The parts that have been implemented are relatively difficult ones, and the scheme adopted also deserves scrutiny.
More specifically, the goal of kernel-win32 is only to make Wine more efficient, so it does not touch device drivers. Measured against the ultimate goal of the compatible kernel, the two projects travel the same road for only a short stretch. From our perspective, kernel-win32 is undoubtedly heading in the right direction, but it does not go far enough.
The kernel-win32 code falls roughly into three parts. The first is a patch to the Linux kernel code, found in its kernel directory. The second is kernel-win32's own code, which is loaded into the kernel dynamically as a module; these source files are all in its root directory. The third is a set of test/demo programs, which run on Wine as application software, or partially bypass Wine and run alongside it; this part also includes a library, win32.c, used to issue the system calls. These programs are all in its test directory. Logically, the library win32.c belongs to kernel-win32 while the test/demo programs do not, just as the Linux kernel and libc belong to Linux while programs written to test or demonstrate their functions do not. In addition, to help with debugging, the strace directory contains patches and extensions for the Linux debugging tool strace. This tool traces the Linux system calls made by an application and displays, in real time, the arguments of each call and the value the kernel returns. This is certainly helpful for debugging, but it is not logically part of kernel-win32.
Starting from the kernel-win32 code, I will give a brief introduction to and analysis of each of its aspects. This article first introduces the object management of kernel-win32, that is, the implementation of objects and handles, which inevitably also involves processes and threads.
Implementation of objects and handles
I once mentioned that Linux treats a device as a file, so "file" is a broad concept, while Windows goes further and treats a file as an "object", so "object" is a broader concept still. A handle denotes an opened object. Although physically it resembles an open file descriptor (it is essentially an array subscript), it has many different properties, so the two must not be confused. To "graft" the file system mechanism of the Linux kernel onto the Windows system call interface, a place must be prepared for each process running a Windows application (a "Windows process" or "Wine process") in which to maintain an "open object table" analogous to the "open file table". On the other hand, Windows processes and threads differ from their Linux counterparts and therefore call for different data structures. So an additional data structure must be provided for each Windows process as a supplement to the Linux "process control block", that is, the task_struct data structure, and a connection must be established between the two, for example by adding a pointer to the task_struct structure.
This is essentially what kernel-win32 does. Let us look at its patch to the task_struct structure (tidied up somewhat for readability):
struct task_struct {
    ......
-   spinlock_t alloc_lock;
+   rwlock_t alloc_lock;
+   struct list_head ornaments;
};
alloc_lock was the last component of this data structure. Its type has now been changed from spinlock_t to rwlock_t, because the original lock was meant only for multiprocessor systems, to prevent conflicts between different processors, and its scope has now been widened. The added component is the head of a doubly linked queue called "ornaments". The word "ornament" originally means a decoration; here its meaning is extended to "accessory" or "supplement".
So what kind of data structure is linked into this queue? It is the task_ornament data structure:
struct task_ornament {
    atomic_t to_count;
    struct list_head to_list;
    const struct task_ornament_operations *to_ops;
};
Here to_count is clearly a counter; when it reaches 0, the data structure no longer has any "user" and can be freed. The queue head to_list is obviously used to link the data structure into the ornaments queue. The substantive component is therefore the pointer to_ops, which points to a task_ornament_operations data structure consisting mainly of function pointers. At present kernel-win32 defines only one task_ornament_operations data structure, wineserver_ornament_ops, which will be discussed later.
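The counting rule described for to_count can be sketched in user space as follows. This is only an illustration under stated assumptions: struct counted, counted_new(), counted_get(), counted_put(), and the freed_flag field are hypothetical names, and C11 atomics stand in for the kernel's atomic_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical counted structure; to_count mirrors the field in the text. */
struct counted {
    atomic_int to_count;
    int *freed_flag;   /* demo only: set to 1 when the structure is released */
};

static struct counted *counted_new(int *freed_flag)
{
    struct counted *c = malloc(sizeof(*c));
    if (!c)
        return NULL;
    atomic_init(&c->to_count, 1);   /* the creator holds the first reference */
    c->freed_flag = freed_flag;
    return c;
}

/* Take one more reference; the caller must already hold one. */
static void counted_get(struct counted *c)
{
    assert(atomic_load(&c->to_count) > 0);
    atomic_fetch_add(&c->to_count, 1);
}

/* Drop a reference; the last user releases the storage. */
static void counted_put(struct counted *c)
{
    /* atomic_fetch_sub returns the previous value: 1 means we were the last user */
    if (atomic_fetch_sub(&c->to_count, 1) == 1) {
        if (c->freed_flag)
            *c->freed_flag = 1;
        free(c);
    }
}
```

Every counted_get() must be balanced by one counted_put(); the structure disappears exactly when the last user lets go of it.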
A task_ornament data structure can in turn be embedded as a component of another data structure, such as winethread. So what is linked into the ornaments queue is (except in special cases) actually a winethread data structure, with the embedded task_ornament acting as a "connector".
struct winethread {
#ifdef wine_thread_magic
    unsigned wt_magic;                  /* magic number */
#endif
    struct task_ornament wt_ornament;   /* Linux task attachment */
    struct task_struct *wt_task;        /* Linux task */
    Object *wt_obj;                     /* thread object */
    struct wineprocess *wt_process;     /* wine process record */
    struct list_head wt_list;           /* process's thread list */
    enum winethreadstate wt_state;      /* thread state */
    unsigned wt_exit_status;            /* thread exit status */
    pid_t wt_tid;                       /* thread ID */
};
Of course, each winethread data structure represents a wine thread, that is, a Windows Thread.
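The "connector" role of the embedded task_ornament can be illustrated with the usual container-of idiom. The following is a minimal user-space sketch, not the kernel-win32 code: the structures are cut down to the fields needed, list_init(), list_add_tail_entry(), and first_winethread() are hypothetical helpers, and the sketch assumes that every ornament on the list is embedded in a winethread (real code would have to check to_ops first).

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumptions for illustration). */
struct list_head { struct list_head *next, *prev; };

struct task_ornament {
    int to_count;               /* plain int instead of atomic_t */
    struct list_head to_list;   /* links into the task's ornaments list */
};

struct winethread {
    struct task_ornament wt_ornament;   /* embedded "connector" */
    int wt_tid;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *head)
{
    head->next = head->prev = head;
}

static void list_add_tail_entry(struct list_head *entry, struct list_head *head)
{
    assert(entry != head);
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Return the wine thread behind the first ornament, or NULL if the list is empty. */
static struct winethread *first_winethread(struct list_head *ornaments)
{
    if (ornaments->next == ornaments)
        return NULL;
    struct task_ornament *orn =
        container_of(ornaments->next, struct task_ornament, to_list);
    return container_of(orn, struct winethread, wt_ornament);
}
```

Two steps of pointer arithmetic lead from the list link to the ornament and from the ornament to the winethread that contains it; no extra pointer needs to be stored for the way back.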
In the Linux kernel, a task_struct data structure represents a process or thread and is the unit of kernel scheduling. A Wine thread must be carried by a task_struct, that is, a Linux thread or process, in order to be scheduled. Conversely, a task_struct that represents a Wine thread, being a single scheduling unit, cannot represent several Wine threads; otherwise those threads would be merged into one scheduling unit. So the relationship between the two should be one-to-one. And since it is one-to-one, pointers rather than a queue would be the natural way to link them to each other. Given that a queue is used, and that it can contain only one Wine thread, any other members of the queue must be something else. Logically, the members of one queue stand on an equal footing, but what could stand on an equal footing with a thread? This is worth pondering, and I will return to it later.
Several components in the winethread structure need to be described:
The pointer wt_obj points to an object data structure. In Windows, threads and processes are all "objects" and must each be represented by an object data structure, which is described below.
The pointer wt_task points to the task_struct data structure of the current process (thread). In this way a two-way connection is established between a Wine thread and the Linux thread that carries it (the other direction follows the ornaments queue of the Linux thread). The other pointer, wt_process, points to a wineprocess data structure, which evidently represents the Windows process.
In the Linux kernel, a process has no data structure separate from that of a thread; both are represented by task_struct. The task that turns into a process (by breaking away from its parent through execve() or a similar call) can also be regarded as the first thread of that process. Later the process creates children through fork() and similar calls. A newly created child shares space with its parent and is thus a thread; if it later calls execve() to start another program, it acquires its own space and becomes a process. Windows is different. There a process and its first thread (and its other threads) are distinct concepts with distinct data structures. Roughly speaking, a process represents resources, above all a user address space, while a thread represents an execution context. By analogy, the process is the stage and the script, and the threads are the actors and their performance. In the Windows kernel this information is split across different data structures, whereas in the Linux kernel it is all kept in task_struct. However, some of the information about Windows processes and threads is either absent from task_struct or differs from what is there, which is exactly why task_struct needs to be "decorated" and supplemented.
The data structure of wineprocess is defined as follows:
struct wineprocess {
    int wp_magic;                   /* magic number */
    struct nls_table *wp_nls;       /* unicode-ascii translation */
    pid_t wp_pid;                   /* Linux task ID */
    enum wineprocessstate wp_state; /* process state */
    struct list_head wp_threads;    /* thread list */
    rwlock_t wp_lock;
    struct object *wp_obj;          /* process object */
    struct object *wp_handles[0];   /* handle map */
};
The queue head wp_threads pairs with wt_list in the winethread structure above to form a doubly linked queue connecting a Windows process with all the Windows threads it contains. As noted, a Windows process is conceptually not carried by any particular Linux process or thread, so this data structure has no pointer to a task_struct. In practice there is still a connection, because in Linux a "process" is the same thing as its "first thread". The wp_pid field here is the Linux "task ID" (that is, the Linux PID); it would actually be better to replace it with a task_struct pointer.
Incidentally, Windows and Linux use different process priority schemes, and the Linux kernel schedules according to the priority recorded in the task_struct data structure, so the question arises of how to convert between the two. The wineprocess data structure keeps no record of process priority, so apparently the author of kernel-win32 had not yet considered this issue.
In Windows, the process is also an "object", so there is also an object structure pointer wp_obj. This is the same as the wt_obj pointer in the winethread structure, except that the two objects have different types. One represents the process and the other represents the thread.
The pointer array wp_handles[] is the heart of this data structure: it is the "open object table" of a Windows process. Each valid (non-zero) pointer in the array points to an object structure. Objects of different types represent different things: some represent files, some "events", some processes, and so on. The subscript of a pointer in the array (strictly speaking, a converted subscript, as described later) is the handle of the opened object. In the code the size of this array is declared as 0, because its actual size depends on how much storage is allocated for the wineprocess data structure. At present kernel-win32 always allocates one 4 KB physical page for the wineprocess data structure; whatever remains after the header of the structure is used for this array, whose size is computed as follows:
#define maxhandles ((PAGE_SIZE - sizeof(struct wineprocess)) / sizeof(struct object *))
Therefore, with 4-byte pointers, the "open object table" holds about 1020 entries. This is quite different from Windows (where the size of the object table is in theory almost unlimited), but it is enough.
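The size calculation can be tried out in user space. The sketch below makes several assumptions: the page size is fixed at 4096 bytes, struct wineprocess_sketch only imitates the layout of the header with plain C types, and handle_table_capacity() is a hypothetical helper. Note that the subtraction has to be completed before the division; without the inner parentheses the division would bind first and the result would be close to the page size in bytes.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u      /* assumed page size */

struct object;                      /* opaque for this sketch */

/* Imitation of the wineprocess header followed by the handle array. */
struct wineprocess_sketch {
    int wp_magic;
    void *wp_nls;
    int wp_pid;
    int wp_state;
    void *wp_threads[2];            /* stands in for struct list_head */
    int wp_lock;                    /* stands in for rwlock_t */
    struct object *wp_obj;
    struct object *wp_handles[];    /* fills the rest of the page */
};

/* Bytes left in the page after the header, divided by the slot size. */
#define SKETCH_MAXHANDLES \
    ((SKETCH_PAGE_SIZE - sizeof(struct wineprocess_sketch)) / sizeof(struct object *))

static size_t handle_table_capacity(void)
{
    size_t n = SKETCH_MAXHANDLES;
    /* header plus n slots must fit in one page */
    assert(sizeof(struct wineprocess_sketch) + n * sizeof(struct object *)
           <= SKETCH_PAGE_SIZE);
    return n;
}
```

With 4-byte pointers this yields roughly a thousand slots; with 8-byte pointers the same page holds only about half as many.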
I have been saying that a handle is a subscript. In fact it is a subscript only in the logical sense, after conversion. First, 0 is not a valid handle value. Second, handle values are multiples of 4, reflecting a displacement in bytes. If index is the real subscript, the value of the handle is (index + 1) * sizeof(Object *).
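The conversion between subscript and handle can be written out as a pair of small functions. This is a sketch under assumptions: handle_t, handle_from_index(), and index_from_handle() are hypothetical names, and sizeof(void *) plays the role of sizeof(Object *).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t handle_t;     /* hypothetical handle type for the sketch */

/* Subscript -> handle: shift by one slot so that 0 is never produced. */
static handle_t handle_from_index(size_t index)
{
    return (handle_t)((index + 1) * sizeof(void *));
}

/*
 * Handle -> subscript. Returns 0 on success and stores the subscript in
 * *index; returns -1 if the handle is null, misaligned, or out of range.
 */
static int index_from_handle(handle_t h, size_t table_size, size_t *index)
{
    if (h == 0 || h % sizeof(void *) != 0)
        return -1;
    size_t i = h / sizeof(void *) - 1;
    if (i >= table_size)
        return -1;
    *index = i;
    return 0;
}
```

Because every valid handle is a non-zero multiple of the pointer size, a null or misaligned value can be rejected before the table is touched at all.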
In short, each process has one open object table, shared by all the threads in the process. Each valid pointer in the table points to an object data structure. Objects in the kernel are like files on a disk: each has a "life cycle" from being created, to being opened and closed, to finally being deleted. Each object is represented by an object data structure, defined as follows:
/*
 * Object definition
 * - object namespace is indexed by name and class
 */
typedef struct object {
    struct list_head o_objlist;     /* obj list (must be 1st) */
#ifdef object_magic
    int o_magic;                    /* magic number (debugging) */
#endif
    atomic_t o_count;               /* usage count */
    wait_queue_head_t o_wait;       /* waiting process list */
    struct objectclass *o_class;    /* object class */
    struct oname o_name;            /* name of object */
    void *o_private;                /* type-specific data */
} Object;
The type of an object is identified by o_class in its data structure. This pointer points to an objectclass data structure, which represents the object type. In addition, each object can have a name, just as each file has a file name; o_name is the data structure that holds the object name.
Obviously the number of objects in the kernel can be very large, which raises the question of how to find a specific one. The objects are therefore first divided into categories by object type, and each category then has several queues, selected by the hash value of the object name. The object data structure of a given object is linked into one queue of its category according to the hash value of its name; the queue head o_objlist in the structure serves this purpose.
It is not hard to see that the components of the object data structure reflect only what objects have in common, not the particulars of a specific object. The structure therefore has an untyped pointer, o_private, which points to a data structure describing the specific object. When the object type is thread, this is a winethread data structure; when the object type is file, it is a winefile data structure; and so on.
As mentioned above, the pointer o_class in the object data structure indicates the object type. This is a pointer to an objectclass structure. Each objectclass structure represents an object type, which is defined as follows.
struct objectclass {
    struct list_head oc_next;
    const char oc_type[6];          /* type name (5 chars + NUL) */
    int oc_flags;
#define ocf_dont_name_anon 0x00000001   /* don't name anonymous objects */
    int (*constructor)(Object *, void *);
    int (*reconstructor)(Object *, void *);
    void (*destructor)(Object *);
    int (*describe)(Object *, struct wineserver_read_buf *);
    int (*poll)(struct wait_table_entry *, struct winethread *);
    void (*detach)(Object *, struct wineprocess *);
    /* Lock governing access to object lists */
    rwlock_t oc_lock;
    /* Named object hash */
    struct list_head oc_nobjs[objclassnobjssize];
    /* Anonymous object list */
    struct list_head oc_aobjs;
};
The first queue head in the structure, oc_next, links the objectclass data structures into a queue of object types. The array of queue heads oc_nobjs[objclassnobjssize] forms the hash queues of the object class; each object is linked into one of them according to the hash value of its name. The array size objclassnobjssize is defined as 16. An object may also be anonymous; all anonymous objects of a class are linked into its anonymous queue, oc_aobjs.
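How a name selects one of the 16 hash queues can be sketched as follows. This is a simplified illustration, not the kernel-win32 code: singly linked chains replace list_head, the hash function name_hash() is an assumption (the article does not say which hash is used), and struct named_obj, struct obj_class, class_insert(), and class_lookup() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NBUCKETS 16                     /* same size as objclassnobjssize */
#define BUCKET_MASK (NBUCKETS - 1)      /* valid because 16 is a power of two */

struct named_obj {
    const char *name;                   /* NULL for an anonymous object */
    unsigned nhash;                     /* cached hash of the name */
    struct named_obj *next;
};

struct obj_class {
    struct named_obj *buckets[NBUCKETS];    /* named objects, by hash */
    struct named_obj *anon;                 /* anonymous objects */
};

/* Assumed hash function, for illustration only. */
static unsigned name_hash(const char *name)
{
    unsigned h = 0;
    while (*name)
        h = h * 31 + (unsigned char)*name++;
    return h;
}

static void class_insert(struct obj_class *c, struct named_obj *o)
{
    struct named_obj **head;
    if (o->name) {
        o->nhash = name_hash(o->name);
        head = &c->buckets[o->nhash & BUCKET_MASK];
    } else {
        head = &c->anon;
    }
    o->next = *head;
    *head = o;
}

static struct named_obj *class_lookup(const struct obj_class *c, const char *name)
{
    unsigned h = name_hash(name);
    assert(name != NULL);
    /* only one bucket has to be searched */
    for (struct named_obj *o = c->buckets[h & BUCKET_MASK]; o; o = o->next)
        if (o->nhash == h && strcmp(o->name, name) == 0)
            return o;
    return NULL;
}
```

Comparing the cached hash before calling strcmp() keeps most mismatches cheap, and a lookup never has to visit the anonymous queue.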
The most substantive part of the objectclass data structure is its set of function pointers, above all constructor, which determines how an object of the given type is built. At present kernel-win32 defines objectclass data structures for 11 object types, including event_objclass, file_objclass, mutex_objclass, semaphore_objclass, process_objclass, thread_objclass, and section_objclass.
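The division of labour between a generic object and its class can be sketched with a cut-down table of function pointers. Everything here is an assumption made for illustration: struct obj_ops keeps only a constructor and a destructor, struct sem_args and struct sem_data are hypothetical, and obj_create() and obj_destroy() are not kernel-win32 functions.

```c
#include <assert.h>
#include <stdlib.h>

struct obj;

/* Cut-down class descriptor: only two of the function pointers. */
struct obj_ops {
    const char *type;
    int (*constructor)(struct obj *, void *);
    void (*destructor)(struct obj *);
};

struct obj {
    const struct obj_ops *cls;  /* which type this object is */
    void *priv;                 /* type-specific data, like o_private */
};

struct sem_args { int initial; int maximum; };  /* hypothetical arguments */
struct sem_data { int count; int max; };        /* hypothetical private data */

static int sem_construct(struct obj *o, void *data)
{
    const struct sem_args *a = data;
    if (a->maximum <= 0 || a->initial < 0 || a->initial > a->maximum)
        return -1;
    struct sem_data *s = malloc(sizeof(*s));
    if (!s)
        return -1;
    s->count = a->initial;
    s->max = a->maximum;
    o->priv = s;
    return 0;
}

static void sem_destruct(struct obj *o)
{
    free(o->priv);
    o->priv = NULL;
}

static const struct obj_ops sem_class = { "sem", sem_construct, sem_destruct };

/* Generic creation: the caller picks the class, the class builds the details. */
static struct obj *obj_create(const struct obj_ops *cls, void *data)
{
    struct obj *o = calloc(1, sizeof(*o));
    if (!o)
        return NULL;
    o->cls = cls;
    if (cls->constructor(o, data) != 0) {
        free(o);
        return NULL;
    }
    return o;
}

static void obj_destroy(struct obj *o)
{
    assert(o != NULL);
    o->cls->destructor(o);
    free(o);
}
```

The generic code never looks inside priv; adding a new object type means supplying another descriptor, not changing obj_create().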
Let's take the simple "semaphore" type as an example to describe how to create an object.
The createsemaphorea() function is kernel-win32's in-kernel implementation of the system call of the same name. Its purpose is to create, and open for the current process, a named semaphore in the kernel. Let us skip the steps by which the system call enters the kernel and read the code starting from this function.
int createsemaphorea(struct winethread *filp, struct wioccreatesemaphorea *args)
{
    handle hsemaphore;
    Object *obj;

    obj = createobject(filp, &semaphore_objclass, args->lpname, args, &hsemaphore);
    ......
    return (int) hsemaphore;
} /* end createsemaphorea() */
The parameter filp points to the winethread data structure of the current thread and is supplied by the system call mechanism that kernel-win32 implements. The other parameter, args, points to a wioccreatesemaphorea data structure. The kernel-win32 system call mechanism packs all the arguments of a Windows system call into one data structure and passes the starting address of that structure to the kernel as a parameter. To this end kernel-win32 defines one data structure per Windows system call; wioccreatesemaphorea is the one defined for the arguments of the createsemaphorea() system call.
The body of createsemaphorea() is the call to createobject(). Because the object to be created is a semaphore, the address of semaphore_objclass, the data structure for this object type, is passed as a parameter. The &hsemaphore parameter is used to return the handle.
[createsemaphorea() > createobject()]
Object *createobject(struct winethread *thread, struct objectclass *clss, const char *name, void *data, handle *hobject)
{
    struct wineprocess *process;
    struct oname oname;
    Object *obj, **ppobj, **epobj;
    int err;

    *hobject = NULL;
    /* Retrieve the name */
    err = fetch_oname(&oname, name);
    if (err < 0)
        return ERR_PTR(err);
    /* Allocate an object */
    obj = _allocobject(clss, &oname, data);
    if (oname.name) putname(oname.name);
    if (IS_ERR(obj))
        return obj;
    /* Find a handle slot */
    process = getwineprocess(thread);
    epobj = &process->wp_handles[maxhandles];
    write_lock(&process->wp_lock);
    for (ppobj = process->wp_handles; ppobj < epobj; ppobj++)
        if (!*ppobj) goto found_handle;
    write_unlock(&process->wp_lock);
    objput(obj);
    return ERR_PTR(-EMFILE);
found_handle:
    /* Make link to object */
    objget(obj);
    *ppobj = obj;
    write_unlock(&process->wp_lock);
    ppobj++; /* don't use the null handle */
    *hobject = (handle) ((char *) ppobj - (char *) process->wp_handles);
    return obj;
} /* end createobject() */
This function's work falls into two parts. The first is the call to _allocobject(), which creates the object itself. The second "installs" a pointer to the created object in the "open object table" of the current process and converts the corresponding subscript into a handle. For ease of reading, let us assume for now that the first part has completed and _allocobject() has returned a pointer to the created object structure, and look first at the operations on the open object table, which begin at the comment line "/* Find a handle slot */". As for putname(), objget(), and objput(), they increment or decrement the reference count of the data structure concerned (releasing its storage when the count drops to 0) and do not affect the discussion of the substantive operations.
The "open object table" lives in the wineprocess data structure, but what is passed in is only a winethread pointer. So getwineprocess() is used to find the Wine process to which the current thread belongs, which really just means fetching the wt_process pointer from the winethread data structure.
Once the wineprocess data structure has been found, a for loop scans its "open object table" for a free slot; a pointer value of 0 means the slot is free. After a free slot is found, the obj pointer of the newly created object is stored in it. As for the handle value, the code shows that it is the byte displacement of the slot within the array plus 4, that is, the subscript plus 1 and then multiplied by 4; this is also why 0 can never be a handle value. Note that the handle value is returned through the hobject parameter.
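The slot search and the handle computation can be reproduced in a small user-space sketch. It is written under assumptions: the table has only 8 slots, locking and reference counting are left out, and struct proc_sketch, install_object(), lookup_object(), and release_handle() are hypothetical names rather than kernel-win32 functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8                /* tiny table, for demonstration only */

struct proc_sketch {
    void *handles[TABLE_SIZE];      /* NULL marks a free slot */
};

/* Store obj in the first free slot; return its handle, or 0 if the table is full. */
static uintptr_t install_object(struct proc_sketch *p, void *obj)
{
    void **slot, **end = &p->handles[TABLE_SIZE];

    assert(obj != NULL);
    for (slot = p->handles; slot < end; slot++) {
        if (!*slot) {
            *slot = obj;
            slot++;                 /* step past the slot: handle 0 is never used */
            return (uintptr_t)((char *)slot - (char *)p->handles);
        }
    }
    return 0;
}

/* Translate a handle back into the stored pointer, or NULL if it is invalid. */
static void *lookup_object(struct proc_sketch *p, uintptr_t h)
{
    if (h == 0 || h % sizeof(void *) != 0 || h / sizeof(void *) > TABLE_SIZE)
        return NULL;
    return p->handles[h / sizeof(void *) - 1];
}

/* Mark the slot behind a valid handle as free again. */
static void release_handle(struct proc_sketch *p, uintptr_t h)
{
    if (lookup_object(p, h))
        p->handles[h / sizeof(void *) - 1] = NULL;
}
```

Because the scan always starts at the beginning of the table, a released handle value is handed out again by the next creation.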
Now back to the first part, the creation of the object, which is done by _allocobject().
[createsemaphorea() > createobject() > _allocobject()]
static Object *_allocobject(struct objectclass *clss, struct oname *name, void *data)
{
    Object *obj;
    ......
    /* Create and initialise an object */
    obj = (Object *) kmalloc(sizeof(Object), GFP_KERNEL);
    ......
    atomic_set(&obj->o_count, 1);
    init_waitqueue_head(&obj->o_wait);
    ......
    /* Name anonymous objects as "class:objaddr" if so requested */
    if (!name->name && (~clss->oc_flags & ocf_dont_name_anon)) {
        ......
    }
    /* Cut 'n' paste the name from the caller's name buffer */
    else {
        obj->o_name.name = name->name;
        obj->o_name.nhash = name->nhash;
        name->name = NULL;
    }
    /* Attach to appropriate object class list */
    obj->o_class = clss;
    if (obj->o_name.name)
        list_add(&obj->o_objlist,
                 &clss->oc_nobjs[obj->o_name.nhash & objclassnobjsmask]);
    else
        list_add(&obj->o_objlist, &clss->oc_aobjs);
    ......
    err = clss->constructor(obj, data); /* call the object constructor */
    if (err == 0) goto cleanup_1;
    ......
cleanup_1:
    write_unlock(&clss->oc_lock);
cleanup_0:
    return obj;
} /* end _allocobject() */
First, kmalloc() allocates storage for the object being created. The object structure is then initialized, which includes moving the object name (and its hash value) from the caller's name buffer into o_name in the object structure.
Next, if the object has a name, the object structure is linked into the corresponding hash queue of its object class according to the hash value; otherwise it is linked into the anonymous object queue of the class.