Compatible kernel 25: Windows structured exception handling (2)

Last Update:2018-12-03 Source: Internet

Author: User


First, look at the call parameters of KiDispatchException(). The exception record ExceptionRecord was prepared earlier. The pointer ExceptionFrame, a KEXCEPTION_FRAME structure pointer, is NULL: in the code it is used only on PowerPC, where it holds some additional register contents, and the 386 architecture has no such need. The trap frame pointer Tf points to the frame that the exception built on the stack, which we earlier called the "exception frame". In addition, PreviousMode is KernelMode and FirstChance is TRUE, meaning that this is the first attempt at handling the exception.
A large part of this function's code deals with user-space exceptions, because KiUserTrapHandler() also ends up calling this function. Here we focus on system-space exceptions, so that code has been omitted.
Besides the exception record prepared earlier, exception handling needs much of the information in the trap frame, plus other state such as the floating-point processor context. KeTrapFrameToContext() is therefore used to gather this information into a CONTEXT data structure before any substantive handling takes place. When handling is finished and control is about to return from KiDispatchException(), KeContextToTrapFrame() writes the context back to the trap frame, because the context may have been changed while the exception was being handled.
Some exceptions occur while a program is being debugged. In that case the debugger must get the first look at them: some are consumed by the debugger itself, and others are decided by the person doing the debugging. For example, exceptions caused by breakpoints (which are in fact self-inflicted traps) can only be claimed by the debugger and its user. Therefore, whenever the kernel is being debugged, the first step of exception handling is to hand the exception to the debugger. Kernel debugging is of course not as simple as debugging an application, but the principle is the same. First, the kernel image must be a debuggable build, with debugging options enabled at compile and link time. Second, a debugging tool is needed, usually KD, the "kernel debugger". Both conditions are indispensable.
KdpEnterDebuggerException() hands the exception to the debugger and returns one of two results:
- The kernel is not being debugged, or the debugger (and its user) could not resolve the problem. The exception must then be handled by the SEH mechanism, and the constant kdHandleException is returned.
- The debugger has resolved the problem and execution can continue without SEH processing. The constant kdContinue is returned.
If kdContinue is returned, control jumps to the label Handled. By then the contents of the context structure may have changed, so KeContextToTrapFrame() is used to update the exception frame before returning, and the exception is considered handled. As for why the context may change: the person debugging may choose to move execution to a different point or modify a register, in which case the return address or the register image in the context has to be updated.
In general, the kernel handles a system-space exception in three steps:
1. The first step, the "first chance", starts by submitting the problem to the debugger. If the debugger does not resolve it (kdContinue is not returned), RtlDispatchException() is called for substantive SEH processing. In practice the kernel is rarely being debugged, so in most cases the core of the first step is SEH processing. SEH processing has three possible outcomes:
   - An SEH frame claims the exception and performs a long jump. The call does not return; the current function call frame is jumped over and discarded together with the other frames in between.
   - An SEH frame claims the exception but decides that execution can continue without a long jump (for example, only a cleanup function needs to run). RtlDispatchException() then returns TRUE. The problem is solved, so control returns through the code under the label Handled.
   - Every SEH frame declines to claim the exception, which means the first step has failed.
2. If the first step fails, the second step submits the problem to the debugger again through KdpEnterDebuggerException(). Note that the last two call parameters are FALSE this time, whereas they were TRUE in the first call. One of the two is FirstChance, which is self-explanatory; the other is Gdb, which indicates that support from another debugger is requested. If this succeeds and the problem is resolved, kdContinue is returned. Otherwise the third step must be taken.
3. The third step is in effect an admission that nothing more can be done. The macro KeBugCheckWithTf() displays the error information, dumps the error scene to a file for later analysis, and then halts the CPU.
For system-space exceptions, all three steps are carried out inside KiDispatchException(). If the FirstChance parameter is 1, they are attempted in one pass. A caller can also pass FirstChance as 0 to skip the first step.
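The three steps above can be condensed into a short C sketch. This is not the ReactOS source: dispatch_kernel_exception, the two callback types, and the outcome codes are hypothetical names invented here, standing in for KiDispatchException(), KdpEnterDebuggerException(), RtlDispatchException(), and the final bug check.

```c
#include <stdbool.h>

/* Hypothetical outcome codes, used only in this sketch. */
typedef enum { HANDLED_BY_DEBUGGER, HANDLED_BY_SEH, BUG_CHECK } outcome_t;

/* Hypothetical callback types standing in for the debugger hook
   and for the SEH dispatcher. */
typedef bool (*debugger_fn)(bool first_chance);
typedef bool (*seh_fn)(void);

outcome_t dispatch_kernel_exception(bool first_chance,
                                    debugger_fn debugger_handles,
                                    seh_fn seh_dispatch)
{
    if (first_chance) {
        /* Step 1a: offer the exception to the debugger. */
        if (debugger_handles(true))
            return HANDLED_BY_DEBUGGER;
        /* Step 1b: frame-based SEH processing. */
        if (seh_dispatch())
            return HANDLED_BY_SEH;
    }
    /* Step 2: second chance for the debugger. */
    if (debugger_handles(false))
        return HANDLED_BY_DEBUGGER;
    /* Step 3: nothing left to try. */
    return BUG_CHECK;
}

/* Sample callbacks for trying the sketch out. */
bool no_debugger(bool first_chance) { (void)first_chance; return false; }
bool second_chance_debugger(bool first_chance) { return !first_chance; }
bool seh_claims(void) { return true; }
bool seh_rejects(void) { return false; }
```

With no_debugger and seh_rejects the call falls through all three steps and ends in BUG_CHECK. Passing first_chance as false skips step 1 entirely, so seh_claims is never consulted.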
For exceptions that occur in user space, the scan of the ExceptionList is performed in user space, where a similar function exists. However, some measures can only be carried out in system space, so the steps cannot all be completed in one pass inside a single function.
Clearly, the core of SEH processing is the scan of the ExceptionList, which is done by RtlDispatchException(). In fact, most exceptions are properly handled through this function.
The following is the substantive SEH processing implemented by RtlDispatchException(). Some documents call this kind of processing "frame-based" exception handling, and its foundation is of course the ExceptionList.

[_KiTrap14() > KiPageFaultHandler() > KiKernelTrapHandler() > KiDispatchException()
 > RtlDispatchException()]

BOOLEAN
NTAPI
RtlDispatchException(IN PEXCEPTION_RECORD ExceptionRecord,
                     IN PCONTEXT Context)
{
    PEXCEPTION_REGISTRATION_RECORD RegistrationFrame, NestedFrame = NULL;
    ......

    /* Get the current stack limits and registration frame */
    RtlpGetStackLimits(&StackLow, &StackHigh);
    RegistrationFrame = RtlpGetExceptionList();
    DPRINT("RegistrationFrame is 0x%p\n", RegistrationFrame);

    /* Now loop every frame */
    while (RegistrationFrame != EXCEPTION_CHAIN_END)
    {
        /* Find out where it ends */
        RegistrationFrameEnd = (ULONG_PTR)RegistrationFrame + sizeof(*RegistrationFrame);

        /* Make sure the registration frame is located within the stack */
        if ((RegistrationFrameEnd > StackHigh) ||
            ((ULONG_PTR)RegistrationFrame < StackLow) ||
            ((ULONG_PTR)RegistrationFrame & 0x3))
        {
            ......
            continue;
            ......
        }

        ......

        /* Call the handler */
        DPRINT("Executing handler: %p\n", RegistrationFrame->Handler);
        ReturnValue = RtlpExecuteHandlerForException(ExceptionRecord,
                          RegistrationFrame, Context, &DispatcherContext,
                          RegistrationFrame->Handler);
        DPRINT("Handler returned: %p\n", (PVOID)ReturnValue);

        /* Check if this is a nested frame */
        if (RegistrationFrame == NestedFrame)
        {
            /* Mask out the flag and the nested frame */
            ExceptionRecord->ExceptionFlags &= ~EXCEPTION_NESTED_CALL;
            NestedFrame = NULL;
        }

        /* Handle the dispositions */
        if (ReturnValue == ExceptionContinueExecution)
        {
            /* Check if it was non-continuable */
            if (ExceptionRecord->ExceptionFlags & EXCEPTION_NONCONTINUABLE)
            {
                /* Set up the exception record */
                ExceptionRecord2.ExceptionRecord = ExceptionRecord;
                ExceptionRecord2.ExceptionCode =
                    STATUS_NONCONTINUABLE_EXCEPTION;
                ExceptionRecord2.ExceptionFlags = EXCEPTION_NONCONTINUABLE;
                ExceptionRecord2.NumberParameters = 0;

                /* Raise the exception */
                DPRINT("Non-continuable\n");
                RtlRaiseException(&ExceptionRecord2);
            }
            else
            {
                /* Return to caller */
                return TRUE;
            }
        }
        else if (ReturnValue == ExceptionNestedException)
        {
            /* Turn the nested flag on */
            ExceptionRecord->ExceptionFlags |= EXCEPTION_NESTED_CALL;

            /* Update the current nested frame */
            if (NestedFrame < DispatcherContext) NestedFrame = DispatcherContext;
        }
        else if (ReturnValue == ExceptionContinueSearch)
        {
            /* Do nothing */
        }
        else
        {
            /* Set up the exception record */
            ExceptionRecord2.ExceptionRecord = ExceptionRecord;
            ExceptionRecord2.ExceptionCode = STATUS_INVALID_DISPOSITION;
            ExceptionRecord2.ExceptionFlags = EXCEPTION_NONCONTINUABLE;
            ExceptionRecord2.NumberParameters = 0;

            /* Raise the exception */
            RtlRaiseException(&ExceptionRecord2);
        }

        /* Go to the next frame */
        RegistrationFrame = RegistrationFrame->Next;
    }

    /* Unhandled, return false */
    DPRINT("FALSE\n");
    return FALSE;
}

The foundation of SEH is the exception handling linked list. It is the nodes in this list and their contents that make the long jump possible. So the function starts by obtaining the list through RtlpGetExceptionList(), which in system space is of course the ExceptionList pointer in the KPCR structure of the current CPU. This pointer points to the first node in the list, whose data structure is EXCEPTION_REGISTRATION_RECORD, defined as follows:

typedef struct _EXCEPTION_REGISTRATION_RECORD
{
    struct _EXCEPTION_REGISTRATION_RECORD *Next;
    PEXCEPTION_ROUTINE Handler;
} EXCEPTION_REGISTRATION_RECORD, *PEXCEPTION_REGISTRATION_RECORD;

This is essentially the same as the _SEHRegistration_t structure described in the previous article; only the structure and field names differ. Handler here is a function pointer of type PEXCEPTION_ROUTINE, which corresponds to _SEHFrameHandler_t in _SEHRegistration_t. The function pointer Handler is therefore the SER_Handler pointer that was set up earlier. The two sets of names exist because the code is also reused for user-space exception handling, and the different names are historical. Note that the _SEHRegistration_t structure is the first member of the _SEHPortableFrame_t structure, so a pointer to a _SEHRegistration_t is also a pointer to the enclosing _SEHPortableFrame_t.
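The first-member relationship can be illustrated with a small C sketch. The two structures below are hypothetical, simplified stand-ins for _SEHRegistration_t and _SEHPortableFrame_t (the real ones carry more fields); only the layout rule matters: C guarantees that a pointer to a structure and a pointer to its first member can be converted into each other.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for _SEHRegistration_t. */
typedef struct registration {
    struct registration *next;   /* link to the next outer node */
    int (*handler)(void);        /* frame handler */
} registration_t;

/* Hypothetical, simplified stand-in for _SEHPortableFrame_t. */
typedef struct portable_frame {
    registration_t registration; /* must be the first member */
    int extra_state;             /* illustrative payload */
} portable_frame_t;

/* Recover the enclosing frame from a pointer to its first member. */
portable_frame_t *frame_from_registration(registration_t *reg)
{
    return (portable_frame_t *)reg;
}
```

A frame handler that receives only the list node can therefore reach the rest of its frame with a single cast, with no pointer arithmetic.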
As mentioned in the previous article, every data structure in the exception handling linked list lives on the stack, as a local variable in the corresponding function call frame. RtlpGetStackLimits() is therefore used to obtain the bounds of the current thread's (system-space) stack, StackLow and StackHigh. Each node in the list is first compared against these bounds to confirm that its location is plausible. If the exception occurred during DPC processing (a Windows DPC is comparable to Linux's bottom half, or "soft interrupt"), the bounds must be adjusted and the comparison repeated, because DPCs run on a separate stack. If the comparison still fails, processing cannot continue.
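The plausibility check on a node's location amounts to three comparisons, sketched below in portable C. The function name frame_is_valid and the use of plain integers for addresses are choices made for this illustration; the DPC stack adjustment is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a registration record of `size` bytes starting at
   address `frame` ends at or below stack_high, starts at or above
   stack_low, and is 4-byte aligned. */
bool frame_is_valid(uintptr_t frame, uintptr_t size,
                    uintptr_t stack_low, uintptr_t stack_high)
{
    uintptr_t frame_end = frame + size;

    if (frame_end > stack_high) return false; /* runs past the top of the stack */
    if (frame < stack_low)      return false; /* starts below the stack */
    if (frame & 0x3)            return false; /* not 4-byte aligned */
    return true;
}
```

For a stack spanning 0x1000 to 0x2000 and an 8-byte record, address 0x1ff8 is accepted because the record ends exactly at the upper bound, while 0x1ffc and 0x1002 are rejected.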
A while loop then examines each node in the ExceptionList in turn, trying each one with RtlpExecuteHandlerForException(). Each node in the list represents a local SEH frame stack. As described in the previous article, a local SEH frame stack is formed by a group of SEH domains that are nested not only in effect but also in form (in the code). In practice, however, no such formally nested SEH domains appear in the existing ReactOS code, so each node in the ExceptionList actually represents a single SEH domain. For convenience, and wherever it causes no misunderstanding or conflict with the code, we assume below that every node in the ExceptionList represents one SEH frame (rather than a stack of them).
Because the exception handling linked list is a last-in, first-out queue, its first node represents the most recently entered SEH frame. More than one node in the list means that SEH frames are nested, and the first node then represents the innermost protected domain. If that node (after running its filter function) declines to claim the exception, this SEH domain was not meant for it, and we move one level out to see whether the enclosing SEH domain is. Following the list from node to node thus moves outward layer by layer until some node claims the exception. A node that claims the exception normally performs its prearranged long jump straight into the corresponding if statement of that frame, and the call frame of the present function is discarded by the jump. As the reader will see, an "unwinding" pass precedes the long jump: it calls the cleanup functions of all the SEH frames that are jumped over.
Under normal conditions, therefore, this while loop is short-lived. It runs to the end of the list only when every SEH frame in it declines to claim the exception, or when an error occurs during processing.
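The innermost-to-outermost scan can be tried out with a simplified model. In the sketch below, node_t, the one-argument handler signature, and the sample handlers are all invented for illustration, and the chain ends with NULL, whereas the real list ends with EXCEPTION_CHAIN_END. Long jumps are not modelled; a claiming handler simply reports "continue execution".

```c
#include <stddef.h>

enum { SCAN_CONTINUE_EXECUTION = 0, SCAN_CONTINUE_SEARCH = 1 };

typedef struct node {
    struct node *next;          /* next outer frame; NULL ends the chain here */
    int (*handler)(int code);   /* returns one of the values above */
} node_t;

/* Sample handlers: one claims exception 14, the other claims nothing. */
int claims_14(int code)
{
    return code == 14 ? SCAN_CONTINUE_EXECUTION : SCAN_CONTINUE_SEARCH;
}

int claims_nothing(int code)
{
    (void)code;
    return SCAN_CONTINUE_SEARCH;
}

/* Walk from the innermost frame outward. Returns how many frames were
   passed over before one claimed the exception, or -1 if none did. */
int find_claiming_frame(node_t *head, int code)
{
    int depth = 0;
    for (node_t *n = head; n != NULL; n = n->next, depth++) {
        if (n->handler(code) == SCAN_CONTINUE_EXECUTION)
            return depth;
    }
    return -1;
}
```

With an inner frame using claims_nothing linked in front of an outer frame using claims_14, exception 14 is claimed one level out, and any other code exhausts the chain.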
When RtlpExecuteHandlerForException() returns, the return value is one of the following:

#define ExceptionContinueExecution 0
#define ExceptionContinueSearch 1
#define ExceptionNestedException 2
#define ExceptionCollidedUnwind 3

The return value is in effect an instruction on how to proceed, so the code that follows is essentially a case statement on this value.
- ExceptionContinueExecution: the exception was claimed, but without a long jump. The problem has been resolved or the exception can be ignored; either way the original program may continue. There is one further condition, though: the interrupted program must be able to continue. The EXCEPTION_NONCONTINUABLE flag in the exception record ExceptionRecord is checked, and if it is 1 execution cannot continue, so RtlRaiseException() is used to raise a "soft exception" of type STATUS_NONCONTINUABLE_EXCEPTION. If execution can continue, the loop ends at once and control eventually returns from the exception (the exception frame is still on the stack at this point). Note that the function returns TRUE, meaning the problem was resolved; FALSE means failure.
- ExceptionContinueSearch: the exception was not claimed. The next node in the list, that is, the SEH frame one level further out, should be examined. This is the reason for the ExceptionList and for the while loop. Nothing needs to be done in this iteration other than advancing to the next node in the ExceptionList.
- ExceptionNestedException: RtlpExecuteHandlerForException() ran into a nested exception, that is, a new exception occurred during exception handling. As will be seen later, to catch nested exceptions a temporary "protection node" is inserted at the head of the ExceptionList before a node is examined and processed, and removed again once that node has been processed. When a nested exception occurs, the scan of the ExceptionList is interrupted, the new exception must be handled first, and the while loop is entered again. The temporary protection node is naturally the first to be examined; it returns ExceptionNestedException and, through the DispatcherContext parameter, a pointer to the target node it protects, which is the node that was being processed. The local variable NestedFrame is then set to that protected target node, the EXCEPTION_NESTED_CALL flag in the exception record is set to 1, and the search through the ExceptionList continues. The flag and NestedFrame are cleared once the protected node has been passed. Thus, as long as the flag in the exception record is 1, the scan is still within the SEH frames whose processing gave rise to the nested exception. Note that handling a nested exception may travel further along the ExceptionList, into SEH frames further out. Because nesting may be more than one level deep, the actual code is somewhat more complex than described here.
- Any other return value, including ExceptionCollidedUnwind, signals a serious error. In essence such an error is equivalent to an exception, and some node in the ExceptionList ought to be prepared for it, but the CPU hardware will not raise a hard exception for it. RtlRaiseException() is therefore used to raise a soft exception of type STATUS_INVALID_DISPOSITION. This is of course a nested exception, because handling of the original exception is still in progress.
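The four cases above really do map onto a C switch. The sketch below reduces each branch to a hypothetical action code (the action_t names are invented here) and models the EXCEPTION_NESTED_CALL flag as a plain bool, so the decision logic can be checked in isolation from the kernel.

```c
#include <stdbool.h>

/* Disposition values as listed in the text. */
typedef enum {
    DISP_CONTINUE_EXECUTION = 0,
    DISP_CONTINUE_SEARCH    = 1,
    DISP_NESTED_EXCEPTION   = 2,
    DISP_COLLIDED_UNWIND    = 3
} disposition_t;

/* Hypothetical action codes for this sketch. */
typedef enum {
    ACT_RETURN_TRUE,           /* resume the interrupted code */
    ACT_RAISE_NONCONTINUABLE,  /* soft exception, STATUS_NONCONTINUABLE_EXCEPTION */
    ACT_NEXT_FRAME,            /* keep scanning the list */
    ACT_RAISE_INVALID          /* soft exception, STATUS_INVALID_DISPOSITION */
} action_t;

action_t action_for(int disposition, bool noncontinuable, bool *nested_flag)
{
    switch (disposition) {
    case DISP_CONTINUE_EXECUTION:
        return noncontinuable ? ACT_RAISE_NONCONTINUABLE : ACT_RETURN_TRUE;
    case DISP_NESTED_EXCEPTION:
        *nested_flag = true;   /* mirrors setting EXCEPTION_NESTED_CALL */
        return ACT_NEXT_FRAME;
    case DISP_CONTINUE_SEARCH:
        return ACT_NEXT_FRAME;
    default:                   /* includes DISP_COLLIDED_UNWIND */
        return ACT_RAISE_INVALID;
    }
}
```

The default branch is what makes an unknown or out-of-place disposition an error rather than a silent "continue search".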
Soft exceptions raised through RtlRaiseException() will be covered in detail in the next article; a brief introduction follows. A soft exception simulates an exception by means of a function call. Typically, each SEH domain has a filter function that checks the type in the exception record to decide whether to claim and handle the exception. Take the first case above as an example. Suppose exception 14 was caused by a failed memory access, with type STATUS_ACCESS_VIOLATION, and that it has been handled with the intent of continuing execution, yet the flag says execution cannot continue. The problem is then no longer the original memory access failure but a different one, of type STATUS_NONCONTINUABLE_EXCEPTION, which may have to be handled by another SEH domain intended for that type. An exception of this type is therefore simulated, causing the SEH mechanism to search the ExceptionList again. That is the purpose of raising a soft exception.
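The way the second record wraps the first can be shown with a trimmed-down model. record_t below is a hypothetical reduction of EXCEPTION_RECORD to four fields, and wrap_as_noncontinuable is a name invented here; the numeric constants are the values these symbols have in the Windows headers.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down exception record for this sketch. */
typedef struct record {
    unsigned long code;
    unsigned long flags;
    struct record *inner;          /* the exception that led to this one */
    unsigned long parameter_count;
} record_t;

#define SKETCH_EXCEPTION_NONCONTINUABLE        0x1UL
#define SKETCH_STATUS_ACCESS_VIOLATION         0xC0000005UL
#define SKETCH_STATUS_NONCONTINUABLE_EXCEPTION 0xC0000025UL

/* Build the record for the soft exception, chaining the original
   record behind it the way ExceptionRecord2 is filled in. */
record_t wrap_as_noncontinuable(record_t *original)
{
    record_t outer;
    outer.code = SKETCH_STATUS_NONCONTINUABLE_EXCEPTION;
    outer.flags = SKETCH_EXCEPTION_NONCONTINUABLE;
    outer.inner = original;
    outer.parameter_count = 0;
    return outer;
}
```

A filter function that receives the new record sees the new type first, but can still follow the inner pointer back to the original access violation.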
Finally, if the while loop runs to completion, no node in the exception handling linked list was prepared for this type of exception; in other words, the exception was not anticipated and no arrangement was made for it. FALSE is therefore returned, and the caller, KiDispatchException(), proceeds to its second step.
Clearly, the key is RtlpExecuteHandlerForException(), which examines and processes one specific node in the ExceptionList. Let us first look at its call interface:

EXCEPTION_DISPOSITION
RtlpExecuteHandlerForException(PEXCEPTION_RECORD ExceptionRecord,
    PEXCEPTION_REGISTRATION RegistrationFrame, PCONTEXT Context,
    PVOID DispatcherContext, PEXCEPTION_HANDLER ExceptionHandler);

The first parameter points to the exception record, which describes the current (actual) exception. The second points to a node in the ExceptionList, which describes the SEH domain currently under consideration. The third points to the context data structure captured when the exception occurred. The fourth, DispatcherContext, is a pointer that is meaningful only for the temporary protection nodes used for nested exceptions. The last is a function pointer to the frame handler supplied by the current node; for ordinary nodes this is _SEHFrameHandler(), which is set in _SEHEnterFrame_f().
Now for the implementation, which is a piece of assembly code.

[_KiTrap14() > KiPageFaultHandler() > KiKernelTrapHandler() > KiDispatchException()
 > RtlDispatchException() > RtlpExecuteHandlerForException()]

_RtlpExecuteHandlerForException@20:
    /* Copy the routine in EDX */
    mov edx, offset _RtlpExceptionProtector
    /* Jump to common routine */
    jmp _RtlpExecuteHandler@20

First, register EDX is made to point to the function RtlpExceptionProtector(), which is shown later. Then control jumps to the label _RtlpExecuteHandler. The code under _RtlpExecuteHandler is shared by RtlpExecuteHandlerForException() and another function, RtlpExecuteHandlerForUnwind(); the only difference between them is the function pointer placed in EDX.

_RtlpExecuteHandlerForUnwind@20:
    /* Copy the routine in EDX */
    mov edx, offset _RtlpExceptionProtector
    /* Run the common routine */
_RtlpExecuteHandler@20:
    /* Save non-volatile */
    push ebx
    push esi
    push edi
    /* Clear registers */
    xor eax, eax
    xor ebx, ebx
    xor esi, esi
    xor edi, edi

    /* Call the 2nd-stage executer */
    push [esp+0x20]
    push [esp+0x20]
    push [esp+0x20]
    push [esp+0x20]
    push [esp+0x20]
    call _RtlpExecuteHandler2@20

    /* Restore non-volatile */
    pop edi
    pop esi
    pop ebx
    ret 0x14

By rights, the pointer that RtlpExecuteHandlerForUnwind() places in EDX should point to RtlpUnwindProtector(), not to RtlpExceptionProtector() as in RtlpExecuteHandlerForException(). This is probably a bug; ReactOS 0.3.0 is quite new, and a few errors are not surprising. In version 0.2.6 the code does point to RtlpUnwindProtector(), which should be correct.
Clearly, the actual processing is done by _RtlpExecuteHandler2(); the code here only sets up the call and restores the registers afterwards.
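The five identical "push [esp+0x20]" instructions look odd at first: each push lowers ESP by four bytes, so the same offset reaches the next lower argument each time. The C sketch below replays this on a small simulated stack. The sim_stack_t type and the helper names are invented for the illustration, and 32-bit words are assumed, which is also why five arguments are cleaned up with "ret 0x14" (5 x 4 = 20 bytes).

```c
#include <stdint.h>

#define STACK_WORDS 32

typedef struct {
    uint32_t mem[STACK_WORDS];
    int esp;                   /* index of the top word; the stack grows downward */
} sim_stack_t;

static void push(sim_stack_t *s, uint32_t v) { s->mem[--s->esp] = v; }

/* Reads the word at byte offset `off` above the current top of stack. */
static uint32_t peek(const sim_stack_t *s, int off) { return s->mem[s->esp + off / 4]; }

/* Rebuild the stack as it is inside _RtlpExecuteHandler@20 and repeat the
   five pushes. The five words then on top of the stack are copied to out. */
void replay_argument_copy(const uint32_t args[5], uint32_t out[5])
{
    sim_stack_t s = { {0}, STACK_WORDS };

    for (int i = 4; i >= 0; i--) push(&s, args[i]); /* caller pushes arg5 .. arg1 */
    push(&s, 0xAAAAAAAAu);                          /* return address */
    push(&s, 0); push(&s, 0); push(&s, 0);          /* saved EBX, ESI, EDI */

    for (int i = 0; i < 5; i++) push(&s, peek(&s, 0x20)); /* push [esp+0x20] x 5 */

    for (int i = 0; i < 5; i++) out[i] = peek(&s, 4 * i);
}
```

After the three register saves and the return address, the last argument sits at offset 0x20, so the pushes copy arg5 down to arg1 and the callee sees the arguments in their original order.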

[_KiTrap14() > KiPageFaultHandler() > KiKernelTrapHandler() > KiDispatchException()
 > RtlDispatchException() > RtlpExecuteHandlerForException() > _RtlpExecuteHandler2()]

_RtlpExecuteHandler2@20:
    /* Set up stack frame */
    push ebp
    mov ebp, esp
    /* Save the frame */
    push [ebp+0xC]                      /* pointer to the original node, the target node to be protected */
    /* Push handler address */
    push edx                            /* becomes the handler pointer in the new node, pointing to the protector function */

    /* Push the exception list */
    push [fs:TEB_EXCEPTION_LIST]        /* becomes the next pointer in the new node */
    /* Link us to it */
    mov [fs:TEB_EXCEPTION_LIST], esp    /* make ExceptionList point to the new node */

    /* Call the handler */
    push [ebp+0x14]
    push [ebp+0x10]
    push [ebp+0xC]
    push [ebp+8]
    mov ecx, [ebp+0x18]                 /* the parameter ExceptionHandler */
    call ecx                            /* call ExceptionHandler, with four call parameters */

    /* Unlink us */
    mov esp, [fs:TEB_EXCEPTION_LIST]
    /* Restore it */
    pop [fs:TEB_EXCEPTION_LIST]         /* the new node has been removed from the ExceptionList */

    /* Undo stack frame and return */
    mov esp, ebp                        /* the new node no longer exists on the stack */
    pop ebp
    ret 0x14

The constant TEB_EXCEPTION_LIST is defined as 0, so "FS:TEB_EXCEPTION_LIST" is "FS:0", which refers to the first field in the KPCR structure, namely ExceptionList. Using the constant TEB_EXCEPTION_LIST here is somewhat misleading, since this is system space rather than user space. The reason is that the function is also used for exception handling in user space (code reuse), where the first field of the TEB is indeed the ExceptionList.
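Why one constant serves both spaces can be seen from a layout sketch. sketch_tib_t below is a hypothetical, heavily trimmed structure in which only the first field matters, and read_exception_list models "segment base plus constant offset" with ordinary pointer arithmetic; no real FS access is involved.

```c
#include <stddef.h>

/* Hypothetical, trimmed layout: only the position of the first field matters. */
typedef struct sketch_tib {
    void *ExceptionList;   /* head of the exception handling linked list */
    void *StackBase;
    void *StackLimit;
} sketch_tib_t;

/* Same value the text gives for TEB_EXCEPTION_LIST. */
#define SKETCH_TEB_EXCEPTION_LIST 0

/* Read the list head through a base address plus the constant offset,
   the way "FS:TEB_EXCEPTION_LIST" does with the FS segment base. */
void *read_exception_list(const void *segment_base)
{
    return *(void *const *)((const char *)segment_base + SKETCH_TEB_EXCEPTION_LIST);
}
```

Any structure whose first field is the list head can be read this way, whichever structure the segment base happens to point at.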
Note that the call instruction here goes through ECX rather than EDX. The content of ECX comes from a call parameter on the stack, the last parameter ExceptionHandler. For ordinary nodes (as opposed to protection nodes) this is in fact _SEHFrameHandler().
This code is a little subtle. Note that the parameter RegistrationFrame, a pointer, is pushed first; then the content of EDX, the function pointer to _RtlpExceptionProtector() or _RtlpUnwindProtector(); then [FS:0], the content of the ExceptionList pointer; and finally the current stack pointer is written to ExceptionList. ExceptionList now points to something that can be read as a _SEHRegistration_t structure: a link pointer with a function pointer above it (and, above that, a pointer to the target node). Through these operations on the stack and on [FS:0], another node has in effect been inserted at the head of the exception handling queue (logically its tail). However, this new node differs from the nodes already in the queue. Although those nodes are also _SEHRegistration_t structures, each is a member of a _SEHPortableFrame_t structure, whereas the node just linked in stands alone (accompanied by one extra pointer). This is not a problem, because the function pointers point to different functions, and it is the function that determines how the associated data structure is accessed and used. To distinguish it from the frame handler of the target node, we call this function a "protector". Correspondingly, we call such a node a "protection node" and the original nodes "ordinary nodes".
The protection node exists only temporarily. The two instructions that follow the call remove the new node and restore the ExceptionList to its previous state. Because the nodes live on the stack, the queue operations are very clean. As the reader will see later, a protection node never performs a long jump, so control is certain to return.
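The link, call, unlink sequence can be modelled in portable C. Everything below is a hypothetical simplification: the list head is an ordinary global instead of FS:0, handlers take a single argument instead of four, and the protector shown merely returns the value the text gives for ExceptionNestedException.

```c
#include <stddef.h>

typedef struct reg_node {
    struct reg_node *next;
    int (*handler)(struct reg_node *self);
} reg_node_t;

/* A protection node: an ordinary registration followed by one extra
   pointer to the frame being protected. */
typedef struct {
    reg_node_t registration;
    reg_node_t *protected_frame;
} protector_node_t;

/* Stand-in for the list head at FS:0. */
reg_node_t *exception_list = NULL;

/* Sample protector: only reports a nested exception (value 2). */
int protector(reg_node_t *self) { (void)self; return 2; }

/* Sample frame handler: records which frame the protection node at the
   head of the list points at, then declines (value 1, continue search). */
reg_node_t *observed_protected = NULL;
int sample_frame_handler(reg_node_t *self)
{
    (void)self;
    observed_protected = ((protector_node_t *)exception_list)->protected_frame;
    return 1;
}

int execute_handler(reg_node_t *target, int (*protector_fn)(reg_node_t *))
{
    protector_node_t guard;                     /* lives on this call's stack */
    guard.protected_frame = target;
    guard.registration.handler = protector_fn;
    guard.registration.next = exception_list;
    exception_list = &guard.registration;       /* link the protection node */

    int disposition = target->handler(target);  /* run the frame handler */

    exception_list = guard.registration.next;   /* unlink it again */
    return disposition;
}
```

While the frame handler runs, the head of the list is the protection node and its extra pointer identifies the frame being processed; once the handler returns, the list is exactly as it was before.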
Why insert a protection node into the ExceptionList? Because _SEHFrameHandler() is about to be executed for an ordinary node, and executing that function (for example, calling the filter function) may itself cause a new exception. It should therefore be guarded, in preparation for such an exception. If one occurs, the corresponding protector function is _RtlpExceptionProtector(). In fact this function does not really protect anything; its only purpose is to report that a nested exception has occurred.
A new exception that occurs during exception handling is a nested exception. Do not confuse nested exceptions with nested SEH domains.

This article is an English version of an article originally published in Chinese on aliyun.com.
