This article is a translation; the original was written by lusiphir.
I recently learned that the rules that trigger a static constructor differ between struct and class. A class's static constructor runs the first time the class is used. For a struct, merely accessing the fields of an instance does not trigger the static constructor. Tests show that accessing a static field, calling one of the struct's own functions (static or instance), or calling a constructor with parameters does trigger it, while calling the default constructor or a base class function that the struct has not overridden does not. Why?
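The trigger rules above can be reproduced with a small sketch. The names Log, Probe, Touch and TriggerDemo are made up for this illustration and do not come from the original tests. The flag lives in a separate class so that reading it does not itself touch the struct type.

```csharp
using System;

static class Log
{
    // Set by Probe's static constructor. Kept outside the struct so that
    // reading the flag never counts as a use of Probe itself.
    public static bool StaticCtorRan;
}

struct Probe
{
    public int Value;

    // Explicit static constructor of the struct.
    static Probe() { Log.StaticCtorRan = true; }

    // Instance function defined by the struct itself.
    public void Touch() { }
}

static class TriggerDemo
{
    // Default construction plus an instance field read.
    public static bool AfterDefaultAndFieldRead()
    {
        Probe p = new Probe();   // default construction, no constructor call
        int v = p.Value;         // instance field access only
        return Log.StaticCtorRan;
    }

    // Calling a function defined by the struct.
    public static bool AfterInstanceMethod()
    {
        Probe p = new Probe();
        p.Touch();
        return Log.StaticCtorRan;
    }

    static void Main()
    {
        Console.WriteLine(AfterDefaultAndFieldRead());
        Console.WriteLine(AfterInstanceMethod());
    }
}
```

If the rules hold on your runtime, the first line printed reports that the static constructor has not run yet and the second reports that it has. The order of the two calls matters, because a static constructor runs at most once.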
Let's look at how class and struct differ when a constructor is called. A class uses the newobj instruction, while a struct uses the initobj instruction to construct the object. newobj requests a block of memory on the heap, calls the corresponding constructor to initialize it, and then returns the object's address on the evaluation stack. initobj takes the address of an already allocated struct instance in the local variable table and initializes the struct's fields. This initialization is performed internally by the CLR; no default constructor is added to the struct. (This is why, in the C# versions discussed here, a struct field cannot be given an initial value in its declaration. For a class field declared with an initial value, the compiler automatically moves the assignment into the constructor.) If a constructor with parameters is defined for the struct and used, initobj is not emitted; the constructor is called directly with the call instruction.
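As a small illustration of the paragraph above, the sketch below contrasts default construction of a struct, a struct constructor with parameters, and a class field initializer. The types Point and Widget and the class InitDemo are hypothetical names chosen for the example.

```csharp
using System;

struct Point
{
    public int X;
    public int Y;

    // Constructor with parameters: reached through an explicit call.
    public Point(int x, int y) { X = x; Y = y; }
}

class Widget
{
    // Field initializer: the compiler moves this assignment into the constructor.
    public int Size = 7;
}

static class InitDemo
{
    public static int DefaultStructSum()
    {
        Point p = new Point();      // fields are zeroed, no user constructor runs
        return p.X + p.Y;
    }

    public static int ParameterizedStructSum()
    {
        Point p = new Point(3, 4);  // the parameterized constructor runs
        return p.X + p.Y;
    }

    public static int ClassFieldInitializer()
    {
        return new Widget().Size;   // value assigned inside the generated constructor
    }

    static void Main()
    {
        Console.WriteLine(DefaultStructSum());        // 0
        Console.WriteLine(ParameterizedStructSum());  // 7
        Console.WriteLine(ClassFieldInitializer());   // 7
    }
}
```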
call and callvirt are the most common instructions for calling a function. call is used for static functions, and callvirt is used for the instance functions of a class (whether or not they are virtual). For a class, call is used on an instance function only when a child class calls a parent class function (to avoid infinite recursion) and inside a constructor (where the compiler emits the call that initializes the parent class). For a struct, we find that call is used whenever the called function is defined by the struct itself. The difference between the two is that call treats the target like a static function and does not care whether the instance pointer (this) is null at the time of the call. This is why a struct can use call for all of its own functions: a struct instance can never be null. In fact, when a class calls a non-virtual function, what finally executes is also a call; there is only one extra step, a check of whether this is null. Let's verify it.
class class_test
{
    public void test1() { }
    public virtual void test2() { }
    public static void test3() { }
    public override string ToString()
    {
        return base.ToString();
    }
}

class_test c = new class_test();
c.test1();
c.test2();
class_test.test3();
string str = c.ToString();
The corresponding disassembly is as follows:
c.test1(); // non-virtual instance function
0000006b mov ecx,esi // put this in ecx; under the .NET calling convention ecx holds the first argument
0000006d cmp dword ptr [ecx],ecx // check whether this is null; if it is, dereferencing [ecx] faults
0000006f call FFEEC130 // call the function
00000074 nop
c.test2(); // virtual instance function
00000075 mov ecx,esi
00000077 mov eax,dword ptr [ecx] // get the method table address; the first four bytes of a reference type object on the heap hold it
00000079 call dword ptr [eax+38h] // the target address is computed on every virtual call
0000007c nop
class_test.test3(); // static function
00000083 call FFEEC140 // call the function
00000088 nop
public override string ToString() // child class calls the parent class function
{
    // preceding assembly omitted
    return base.ToString(); // would recurse endlessly if callvirt were used
00000026 mov ecx,edi // put this in ecx
00000028 call 77A00F68 // call the function
0000002d mov esi,eax // under the .NET calling convention eax holds the return value
0000002f mov ebx,esi
00000031 nop
00000032 jmp 00000034
}
From the assembly above we can see that a class calling a non-virtual function ultimately uses a call instruction, preceded by a null check on this, while a call to a parent class function uses call directly, with no null check, because this has already been validated by the time an instance function is running. A side note: in IL we constantly see local variables being loaded onto the evaluation stack, or results on the evaluation stack being stored back into local variables. Isn't that slow? In practice, in most cases registers such as esi and edi hold these values, and only when there are many local variables are they spilled to the stack. This also fits the picture that the stack frame created on the .NET thread stack for each function call contains a parameter table, a local variable table, the return address, and the evaluation stack.
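The null check on non-virtual calls can also be observed from C# without a disassembler. In the sketch below (Greeter, Answer and NullCheckDemo are hypothetical names), Answer never touches this, yet calling it through a null reference still throws, because the C# compiler emits callvirt for the call and the null check comes with it.

```csharp
using System;

class Greeter
{
    // Non-virtual, and the body never reads 'this'.
    public int Answer() { return 42; }
}

static class NullCheckDemo
{
    public static bool ThrowsOnNull()
    {
        Greeter g = null;
        try
        {
            g.Answer();          // emitted as callvirt, so 'this' is checked first
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(ThrowsOnNull());   // True
    }
}
```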
Let's continue with call. I said above that a struct uses call for any function it defines itself. If you run the experiment yourself, you will find this does not seem to hold: when a struct overrides a base class function (GetHashCode, ToString), the IL calls it with callvirt. Was I wrong?
struct struct_test
{
    bool _a;
    int _b;
    int _c;
    public struct_test(bool a, int c, int b)
    {
        this._a = a;
        this._b = b;
        this._c = c;
    }
    public void test() { }
    public override string ToString()
    {
        return string.Format("{0}, {1}, {2}", this._a, this._b, this._c);
    }
}

struct_test s = new struct_test(true, 15, 20);
string str = s.ToString();
IL_0001: ldloca.s s
IL_0003: ldc.i4.1
IL_0004: ldc.i4.s 15
IL_0006: ldc.i4.s 20
IL_0008: call instance void test_console.struct_test::.ctor(bool, int32, int32)
IL_000d: nop
IL_000e: ldloca.s s
IL_0010: constrained. test_console.struct_test
IL_0016: callvirt instance string [mscorlib]System.Object::ToString()
IL_001b: stloc.1
If you look carefully, you will find a constrained prefix in front of the callvirt call. Here is the rather dizzying explanation from MSDN:
When a callvirt method instruction has been prefixed by constrained thisType, the instruction is executed as follows:
If thisType is a reference type (as opposed to a value type), then ptr is dereferenced and passed as the "this" pointer to the callvirt of method.
If thisType is a value type and thisType implements method, then ptr is passed unmodified as the "this" pointer to a call method instruction, for the implementation of method by thisType.
If thisType is a value type and thisType does not implement method, then ptr is dereferenced, boxed, and passed as the "this" pointer to the callvirt method instruction.
To put it plainly: when a virtual function is called on a value type, if the value type itself implements (overrides) that function, the call is made in the form of call; if it does not, the value type is boxed and the call is made in the form of callvirt. The following simple test verifies this conclusion:
struct_test s = new struct_test(true, 15, 20);
Console.WriteLine(GC.GetTotalMemory(false));
int hash = 0;
for (int i = 0; i < 10000000; ++i)
{
    hash = s.GetHashCode(); // struct_test does not override GetHashCode
}
Console.WriteLine(GC.GetTotalMemory(false));
Console.WriteLine(GC.CollectionCount(0));
The running result is:
141200
399104
127
These results show that not overriding the virtual function does cause boxing. Let's compare this with the call to ToString, which struct_test does override. Here is the disassembly of both calls:
s.ToString();
0000003d lea ecx,[ebp-44h]
00000040 call FFE4C0B0
00000045 nop
s.GetHashCode();
00000046 mov ecx,7C3810h // address of the struct_test method table
0000004b call FFE31FAC // allocate space on the heap
00000050 mov ebx,eax
00000052 lea edi,[ebx+4]
00000055 cmp ecx,dword ptr [edi]
00000057 lea esi,[ebp-44h] // copy the data on the stack to the heap
0000005a movq xmm0,mmword ptr [esi]
0000005e movq mmword ptr [edi],xmm0
00000062 add esi,8
00000065 add edi,8
00000068 movs dword ptr es:[edi],dword ptr [esi]
00000069 mov ecx,ebx
0000006b mov eax,dword ptr [ecx] // call the virtual function
0000006d call dword ptr [eax+30h]
So be careful when using a struct: leaving the virtual functions un-overridden causes boxing and an unnecessary performance loss. Also, because in that case no function of struct_test itself is called, the static constructor is not triggered. Finally, a struct needs its this pointer when one of its functions is called, for example IL_000e: ldloca.s s. Note that this is ldloca, not ldloc, so the first parameter of a struct_test function call is effectively ref struct_test. This seems to be where the ref parameter modifier best shows its value.
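The point about this being passed like a ref parameter can be checked with a short sketch. Counter, ThisRefDemo and the two static helpers are hypothetical names introduced only for this comparison: an instance function that writes through this behaves like the ref version, not like the by-value version.

```csharp
using System;

struct Counter
{
    public int Count;

    // Writes through 'this', which refers to the caller's variable.
    public void Increment() { Count++; }
}

static class ThisRefDemo
{
    // Explicit equivalents, for comparison with the instance function.
    static void IncrementByRef(ref Counter c) { c.Count++; }
    static void IncrementByValue(Counter c) { c.Count++; }   // changes a copy only

    public static int ViaInstanceMethod()
    {
        Counter c = new Counter();
        c.Increment();
        return c.Count;
    }

    public static int ViaRef()
    {
        Counter c = new Counter();
        IncrementByRef(ref c);
        return c.Count;
    }

    public static int ViaValue()
    {
        Counter c = new Counter();
        IncrementByValue(c);
        return c.Count;
    }

    static void Main()
    {
        Console.WriteLine(ViaInstanceMethod());  // 1
        Console.WriteLine(ViaRef());             // 1
        Console.WriteLine(ViaValue());           // 0
    }
}
```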
That is my understanding of these struct issues. If you still have questions after reading this article, or if I have explained something incorrectly, please leave a comment. I would very much like to discuss this with anyone interested in the low-level details.