C#: Value Types, Reference Types, Stack, Heap, ref, and out Explained with Diagrams [Repost]

Last Update: 2015-08-20 Source: Internet

Author: User

Tags mscorlib


Principles of program execution

The concepts in the title, and how they relate to one another, are not easy to keep straight. Many C# programmers either know nothing about the managed heap ("the heap") and the thread stack ("the stack"), or know only the slogan: reference types are kept on the managed heap, and value types are "usually" kept on the stack. To sort out how these concepts relate, I think you first need to understand the basics of how a program executes and what roles the stack and the managed heap play in it. Consider the following code, in which Main calls Method1 and Method1 calls Method2:

    using System;

    class Program
    {
        static void Main(string[] args)
        {
            var num = 120;
            Method1(num);
        }

        static void Method1(int num)
        {
            var num2 = num + 250;
            Method2(num2);
            Console.WriteLine(num);
        }

        static void Method2(int i)
        {
            Console.WriteLine(i);
        }
    }

Windows programs are usually multithreaded, but we will ignore multithreading here. Execution starts at the Main method, at which point the (main) thread is given its own thread stack, 1 MB in size by default. This stack space is used to pass arguments to methods and to hold local variables. So just before Main calls Method1, we should have a "memory diagram" in mind in which the thread stack holds a single entry: num, with the value 120.

num is then passed as an argument to Method1, which also defines a local variable num2 and computes its value with the add instruction. So just before Method2 is called, the "memory diagram" shows Main's num, and above it Method1's parameter num (a copy, 120) and Method1's local variable num2 (370).

The call to Method2 works the same way. When Method2 returns, the stack goes back to the previous picture; when Method1 returns, it goes back to the first picture; then the program exits. Over the whole run, the stack grows and shrinks in step with the calls.

So if we set aside if, for, multithreading, and similar concepts and keep only what relates to memory allocation, program execution can be summarized as follows:

Execution starts at the Main method and repeats the cycle "define local variables, call methods (possibly passing arguments), return from methods" until Main finally exits. Throughout execution, parameters and local variables are continually pushed onto the thread stack and popped off it again.

Note that other things, such as the method's return address, are also pushed onto the stack; they are ignored here.
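The push and pop sequence described above can be traced with a toy model. The sketch below is only an illustration: StackTraceSketch, Push, Pop, and Log are hypothetical names, and a Stack<string> stands in for the real thread stack, which the runtime manages itself and which also holds return addresses and other data.

```csharp
using System;
using System.Collections.Generic;

// Toy model of the thread stack for the Main -> Method1 -> Method2 example.
// It tracks only parameters and locals; return addresses are ignored.
public static class StackTraceSketch
{
    private static readonly Stack<string> Slots = new Stack<string>();
    public static readonly List<string> Log = new List<string>();

    private static void Push(string slot)
    {
        Slots.Push(slot);
        Log.Add("push " + slot + " (depth " + Slots.Count + ")");
    }

    private static void Pop()
    {
        string slot = Slots.Pop();
        Log.Add("pop  " + slot + " (depth " + Slots.Count + ")");
    }

    public static void Run()
    {
        Slots.Clear();
        Log.Clear();
        Push("Main.num = 120");      // local variable in Main
        Push("Method1.num = 120");   // parameter: a copy of Main's num
        Push("Method1.num2 = 370");  // local variable: 120 + 250
        Push("Method2.i = 370");     // parameter: a copy of num2
        Pop();                       // Method2 returns
        Pop();                       // Method1 returns: num2 goes first...
        Pop();                       // ...then its parameter num
        Pop();                       // Main returns
    }

    public static void Main()
    {
        Run();
        foreach (string line in Log)
        {
            Console.WriteLine(line);
        }
    }
}
```

Running it prints four push lines followed by four pop lines, with the depth rising to 4 and falling back to 0, which mirrors the sequence of diagrams described in this section.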

Reference types and Heaps

The example above used only a simple value type, int, so that we could focus on how the thread stack grows (push) and shrinks (pop). C# also has reference types, of course, so let us bring one in and consider the same question again with the following code:

    using System;

    class Program
    {
        static void Main(string[] args)
        {
            var user = new User { Age = 15 };
            var num = 23;
            Console.WriteLine(user.Age);
            Console.WriteLine(num);
        }

        class User
        {
            public int Age;
        }
    }

Many people will know that this is where the managed heap comes in, but as before I want to look at the problem from the stack's point of view first. Just before the WriteLine calls, the "memory diagram" should show two stack entries: user, which holds an address (0x0612ECB4 in this illustration; the address is made up), and num, which holds 23.

This is what people mean when they say that, for a reference type, the stack holds the address (pointer, reference) of an instance object on the heap. Since it is only an address, getting at the instance takes an extra lookup step, and that is exactly what happens. Console.WriteLine(num) needs one step: read the value of num from the stack and hand it to WriteLine. Console.WriteLine(user.Age) needs two steps at run time: read the address from the stack, then find the field (or method) of the instance object on the managed heap at that address. Here is the IL disassembly of the Main method above, with unrelated code removed:

    // load local 0 => get local variable 0 (an address)
    IL_0012: ldloc.0
    // load field => push the value of a field of the specified object onto the stack
    IL_0013: ldfld int32 CILDemo.Program/User::Age
    IL_0018: call void [mscorlib]System.Console::WriteLine(int32)

    // load local 1 => get local variable 1 (a value)
    IL_001e: ldloc.1
    IL_001f: call void [mscorlib]System.Console::WriteLine(int32)

The second WriteLine call needs only one instruction, ldloc.1 (load local 1), to read local variable 1 and hand its value to WriteLine, whereas the first WriteLine call needs two instructions (ldloc.0 followed by ldfld) to do the same job. These are the two steps described above.

Of course, all of this is transparent to us, so many people prefer to draw diagrams with an arrow from the stack variable to the heap object instead; after all, we never actually see the address 0x0612ECB4.

This is also why people say a reference type is stored in two parts: the instance object on the managed heap, and the variable that holds the reference to it. For a local variable (or parameter), the reference is on the stack; for a field of a type, the reference is stored with the object that contains it.

Fields and local variables (parameters)

As you can see, the value of Age, although it is a value type, is stored on the managed heap rather than on the stack. This exposes a mistake many C# novices make: believing that values of value types are always stored on the stack.

What they miss is that this conclusion comes from one specific scenario, the one we used to discuss how programs run, in which local variables (and parameters) are pushed onto and popped off the stack. Besides declaring a local variable num of type int, as in the code above, we can also declare an int field Age in a type to store an integer, and that field is clearly not on the stack. So the conclusion should be: the value of a value type is stored wherever it is declared. As a local variable (or parameter) it is on the stack; as a member of a type it is stored with the containing object (an instance object, for example).

The value of a reference type (the instance object), on the other hand, is always on the managed heap; that conclusion is correct.
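Safe C# cannot print where a value lives, but the "stored where it is declared" rule has observable consequences. The following sketch uses hypothetical types (Point, Shape, FieldStorageSketch) that are not part of the original article: a struct held in a local variable is copied on assignment, while a struct held as a field travels with its containing object and is shared by every reference to that object.

```csharp
using System;

public struct Point { public int X; }          // value type
public class Shape { public Point Origin; }    // Origin is stored inside the Shape object

public static class FieldStorageSketch
{
    // A struct in a local variable: assignment copies the whole value.
    public static int CopyStructLocal()
    {
        var p1 = new Point { X = 1 };
        Point p2 = p1;   // independent copy
        p2.X = 2;        // does not touch p1
        return p1.X;
    }

    // A struct as a field: it lives with the containing object,
    // so two references to that object see the same field.
    public static int ShareStructField()
    {
        var s1 = new Shape();
        s1.Origin.X = 1;
        Shape s2 = s1;       // copies the reference only
        s2.Origin.X = 2;     // writes into the one shared object
        return s1.Origin.X;
    }

    public static void Main()
    {
        Console.WriteLine(CopyStructLocal());   // prints 1
        Console.WriteLine(ShareStructField());  // prints 2
    }
}
```

The exact placement of a struct local is up to the runtime, so the sketch demonstrates the copy and share behavior rather than physical addresses.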

Ref and Out

On top of the distinction between value types and reference types, C# has two keywords, ref and out, that muddy the picture further. To understand them, we again need to look at things from the stack's point of view. There are four cases to discuss: passing a value type normally, passing a reference type normally, passing a value type with ref (out), and passing a reference type with ref (out).

Note that ref and out are the same thing to the runtime; the difference lies only in the rules the C# compiler enforces. A ref argument must be initialized before the call, while an out argument need not be. Because an out argument may be uninitialized, the called method cannot read an out parameter before assigning it, and it must assign the parameter before it returns.
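A small sketch of those compiler rules follows. TryHalve and DoubleInPlace are hypothetical helper names chosen for illustration; the point is only which variable must be assigned where.

```csharp
using System;

public static class RefOutSketch
{
    // out: the callee may not read 'half' before assigning it,
    // and must assign it on every path before returning.
    public static bool TryHalve(int input, out int half)
    {
        if (input % 2 != 0)
        {
            half = 0;
            return false;
        }
        half = input / 2;
        return true;
    }

    // ref: the caller must have initialized the variable,
    // so the callee may both read and write it.
    public static void DoubleInPlace(ref int value)
    {
        value = value * 2;
    }

    public static void Main()
    {
        int result;                          // no initializer needed for out
        bool ok = TryHalve(10, out result);

        int n = 21;                          // must be assigned before passing by ref
        DoubleInPlace(ref n);

        Console.WriteLine(ok + " " + result + " " + n);  // prints "True 5 42"
    }
}
```

Removing the initializer from n, or removing either assignment to half, turns the program into a compile-time error, which is the distinction the compiler enforces.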

Passing a value type normally

    using System;

    class Program
    {
        static void Main(string[] args)
        {
            var num = 120;
            Method1(num);
            Console.WriteLine(num); // prints 120
        }

        static void Method1(int num)
        {
            Console.WriteLine(num);
            num = 180; // changes only Method1's own copy
        }
    }

Everyone is familiar with this scenario: the assignment in Method1 has no effect on Main. Drawn as a diagram, it looks like the second picture above, with Main's num and Method1's parameter num as separate stack entries.

In other words, passing the argument copies the value on the stack into Method1's num parameter. Method1 operates on its own parameter, so Main's local variable, and therefore Main's data on the stack, is unaffected.

Passing a reference type normally

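    using System;
    using System.Diagnostics;

    class Program
    {
        // User is the class with the public int Age field defined earlier.
        static void Main(string[] args)
        {
            var user = new User();
            user.Age = 15;
            Method2(user);
            Debug.Assert(user != null);
            Console.WriteLine(user.Age); // prints 18
        }

        static void Method2(User user)
        {
            user.Age = 18; // writes to the shared object on the heap
            user = null;   // changes only Method2's own copy of the reference
        }
    }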

Look at Method2 here: setting Age to 18 affects the user seen by Main, but setting user to null does not. To analyze this we again start from the stack. In the stack diagram, Main's user and Method2's user parameter are two separate entries that hold the same (made-up) address.

From that picture the key fact should be clear: whether the argument is a value type or a reference type, normal parameter passing copies the value on the stack into the parameter. From the stack's point of view, C# passes by value by default.

If everything is passed "by value", why can a method change what the caller sees through a reference type parameter but not through a value type parameter? It is not hard to work out. The difference comes not from how the arguments are passed but from how local variables (and parameters) of value types and reference types are laid out in memory. Main's local variable user and Method2's parameter user are stored separately on the stack, and the data in those two slots (addresses, pointers, references) are independent of each other, but both point to the same instance object on the managed heap. user.Age = 18 operates on that instance object on the managed heap, not on the data on the stack. In short, num = 180 operates on data on the stack, while user.Age = 18 operates on the managed heap, and that is why the two behave differently.

As for user = null, it has no effect on Main's local variable. It differs from user.Age = 18 in that it sets the data on the stack (address, pointer, reference) held in Method2's own parameter to null, so Main's user is unaffected.

To restate: for reference types, User user = null, var user = new User(), and user1 = user2 all change data on the stack (address, pointer, reference). The first sets it to null, the second stores a new address, and the third copies the stack data, just as parameter passing does.
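The three kinds of assignment can be seen side by side in a short sketch. The User class matches the one used earlier; ReferenceAssignmentSketch and its variable names are hypothetical.

```csharp
using System;

public class User { public int Age; }

public static class ReferenceAssignmentSketch
{
    public static int Run()
    {
        var a = new User { Age = 15 };
        User b = a;                   // copies the reference: a and b name one object
        b.Age = 18;                   // goes through the reference to the heap object

        b = new User { Age = 30 };    // b now holds a new reference; a is untouched
        b = null;                     // clears b's reference; a is still untouched

        return a.Age;                 // the change made through b before it was reassigned
    }

    public static void Main()
    {
        Console.WriteLine(Run());     // prints 18
    }
}
```

Only the write through b.Age reaches the object that a refers to; both later assignments to b itself change nothing but b's own slot.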

Passing a value type with ref (out)

    using System;

    class Program
    {
        static void Main(string[] args)
        {
            var num = 10;
            Method1(num);
            Console.WriteLine(num); // prints 10
            Method3(ref num);
            Console.WriteLine(num); // prints 28
        }

        static void Method1(int num)
        {
            Console.WriteLine(num);
            num = 18;
        }

        static void Method3(ref int num)
        {
            Console.WriteLine(num);
            num = 28;
        }
    }

The code is simple and the output should be no surprise. ref looks simple to use, but only because C# does most of the work for us. In the "stack diagram", Method1's num parameter holds a copy of the value, while Method3's num parameter holds the (made-up) address of Main's num on the stack.

Many people will find this confusing: Method3's parameter is declared as an int named num, so why does the stack hold a pointer (address, reference)? C# is in fact "deceiving" us here, as the IL disassembly shows.

As you can see, a method with a ref (out) parameter is compiled differently: Method3's parameter has the IL type int32& (the address of an int32) rather than int32. Now look at the IL that reads the parameter's value inside each method:

    // Method1
    // load arg 0 => read the argument at index 0, which is the value itself
    IL_0001: ldarg.0

    // Method3
    // load arg 0 => read the argument at index 0, which is an address
    IL_0001: ldarg.0
    // load the int32 value at that address onto the evaluation stack
    IL_0002: ldind.i4

To pass the parameter's value to WriteLine, Method1 needs one instruction while Method3 needs two, that is, one extra step to find the value from the address. Unsurprisingly, assignment shows the same difference:

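    // Method1
    // push 18 onto the evaluation stack
    IL_0008: ldc.i4.s 18
    // store arg => assign the value to the parameter num
    IL_000a: starg.s num

    // Method3
    // load arg 0 => read the argument at index 0, which is an address
    IL_0009: ldarg.0
    // push 28 onto the evaluation stack
    IL_000a: ldc.i4.s 28
    // store the int32 value at the given address
    IL_000c: stind.i4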

So although num = 18 and num = 28 both look like an assignment to a parameter, what happens at run time depends on whether the parameter has the ref (out) keyword. A method with a ref (out) parameter, just as when reading the value, first loads the address and then performs the operation (here, the assignment) through it.

By now it should be clear that when a parameter is marked ref (out), the argument is passed by reference: what is passed is the address (pointer, reference) of the caller's variable on the stack. Otherwise it is ordinary pass by value, in which the stack data is copied.
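The classic way to see the difference is a swap routine. The sketch below uses hypothetical names (SwapSketch, SwapByValue, SwapByRef); the two methods have identical bodies and differ only in the ref keyword.

```csharp
using System;

public static class SwapSketch
{
    // Operates on copies of the caller's values: the caller sees no change.
    public static void SwapByValue(int x, int y)
    {
        int temp = x;
        x = y;
        y = temp;
    }

    // Operates through the addresses of the caller's variables.
    public static void SwapByRef(ref int x, ref int y)
    {
        int temp = x;
        x = y;
        y = temp;
    }

    public static string Run()
    {
        int a = 1, b = 2;

        SwapByValue(a, b);
        string afterValue = a + "," + b;   // still "1,2"

        SwapByRef(ref a, ref b);
        string afterRef = a + "," + b;     // now "2,1"

        return afterValue + " " + afterRef;
    }

    public static void Main()
    {
        Console.WriteLine(Run());          // prints "1,2 2,1"
    }
}
```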

Passing a reference type with ref (out)

Passing a reference type with ref (out) is left as an exercise for the reader. What is certain is that, from the stack's point of view, it is no different from the value type case: the address of the caller's stack slot is passed.
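As one possible answer to that exercise, here is a minimal sketch with hypothetical method names (ReplaceByValue, ReplaceByRef). With ref, the callee receives the address of the caller's variable, so assigning to the parameter replaces the reference that the caller holds.

```csharp
using System;

public class User { public int Age; }

public static class RefReferenceTypeSketch
{
    // Receives a copy of the reference: reassigning it is invisible to the caller.
    public static void ReplaceByValue(User user)
    {
        user = new User { Age = 99 };
    }

    // Receives the address of the caller's variable: reassigning it
    // makes the caller's variable point to the new object.
    public static void ReplaceByRef(ref User user)
    {
        user = new User { Age = 99 };
    }

    public static string Run()
    {
        var user = new User { Age = 15 };

        ReplaceByValue(user);
        int afterValue = user.Age;   // 15: caller still sees the original object

        ReplaceByRef(ref user);
        int afterRef = user.Age;     // 99: caller now sees the new object

        return afterValue + " " + afterRef;
    }

    public static void Main()
    {
        Console.WriteLine(Run());    // prints "15 99"
    }
}
```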

Personally, I think ref (out) is of little use with reference types.

Summary

To work through this pile of concepts, we first need to understand the basics of program execution, which comes down to how the stack grows and shrinks. Once that is clear, learn to think from the stack's point of view, and many questions answer themselves. Why are they called "value" types and "reference" types? The names describe what is on the stack: for a value type the data on the stack is the value itself, while for a reference type it is only an address (pointer, reference). Remember also that a variable can be a field of a type as well as a local variable (or parameter). With that in mind, questions like "where are objects of value types stored?" should be easy to answer. Finally, C# passes arguments by value by default, that is, it copies the stack data into the parameter, exactly as assigning one variable to another of the same type inside a method does. ref (out) seems magical only because C# does more work behind the scenes and compiles the method to different IL.

Reference: "CLR via C#"


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
