In-Depth C# Memory Management: Value Types vs. Reference Types, Boxing vs. Unboxing, and the Stack vs. the Managed Heap

Last Update:2018-12-07 Source: Internet

Author: User


Which questions are C# beginners asked most often? The differences between value types and reference types, between boxing and unboxing, and between the stack and the heap. After reading this article, you should be able to explain them.

As the saying goes, literary programmers program with ideas, ordinary programmers program with experience, and clueless programmers program with copy and paste. Just kidding ^_^.

I believe that anyone who has been through a C# interview is familiar with the following statement:

A value type stores its value directly, while a reference type stores a reference to the value. Value types live on the stack, and reference types are stored on the managed heap. Converting a value type to a reference type is called boxing; converting a reference type back to a value type is called unboxing.

However, simply memorizing this sentence is not enough.

C# programmers do not need to manage memory manually, but to write efficient code they still need to understand what is happening behind the scenes.

A line teachers often use at school is: "the concepts are unclear." To take the simplest example, I memorized every calculus formula, but when I met a problem I could only plug in formulas mechanically and still could not solve it, because I did not know how the formulas were derived; the underlying principles were not clear to me.

(Some people died so that we could live well: Newton and Leibniz =.=)

That was a bit of a digression. Next, let's discuss how the C# stack and managed heap work, and go deep into memory to understand the basic C# concepts above.

I. The concepts of stack and heap in different contexts

In C/C++:

Stack: this is the stack area. It is allocated and released automatically by the compiler and stores function parameter values and local variable values.

Heap: this is the heap area, which is allocated and released by the programmer. If the programmer does not release it, the OS may reclaim it when the program ends.

In C#:

Stack refers to the stack, and Heap refers to the managed heap. Different languages define these terms differently. (If there is an error, please correct me.)

What needs to be clarified is that the stack and heap of a language are regions of memory. They are different from the stack (a last-in, first-out linear list) and the heap (a binary tree kept in a certain order) of data structures.

Before discussing a concept, we must first establish its context.

Unless otherwise stated, "the stack" in this article refers to Stack, and "the managed heap" refers to Heap.

II. How the C# stack works

Windows uses a virtual addressing system to map the memory addresses available to a program onto actual addresses in hardware memory. Each process on a 32-bit processor can use 4 GB of memory, regardless of how much hardware memory the computer actually has (on a 64-bit processor this number is larger). This 4 GB contains every part of the program: executable code, loaded DLLs, and all variables. This 4 GB of memory is called virtual memory.

Each storage location in the 4 GB is numbered, starting from 0. To access a value stored at a particular location in memory, you need to supply the number of that location. In high-level languages, the compiler converts names that people can understand into memory addresses that the processor can understand.

In the virtual memory of a process there is a region called the stack, which stores value types that are not members of objects. In addition, when a method is called, the stack holds a copy of every parameter passed to the method.
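To make the parameter copying concrete, here is a minimal sketch. The class and method names (ParameterCopyDemo, AddTen) are made up for illustration; the point is only that the method works on its own copy of the int.

```csharp
using System;

public static class ParameterCopyDemo
{
    // Receives a copy of the caller's int; changing it does not affect the caller.
    public static int AddTen(int value)
    {
        value += 10;
        return value;
    }

    public static void Main()
    {
        int original = 5;
        int result = AddTen(original);
        Console.WriteLine($"original={original} result={result}"); // original=5 result=15
    }
}
```

The caller's variable keeps its value because only a copy was handed to the method.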

Note how variable scope works in C#. If variable a comes into scope before variable b, then b goes out of scope first. See the following example:

    {
        int a;
        // do something
        {
            int b;
            // do something
        }
    }

First a is declared, then b is declared in the inner code block. When the inner block ends, b goes out of scope, and then a goes out of scope. Variables are always released in the reverse order of their allocation. Does this remind you of the stack data structure (LIFO: last in, first out)? This is how the stack works.

We do not know exactly where in the address space the stack is located. In fact, C# development does not require knowing this.

Stack pointer: a variable maintained by the operating system that points to the next free address in the stack. When the program first starts running, the stack pointer points to the end of the memory block reserved for the stack.

The stack fills downward, that is, from high addresses to low addresses. When data is pushed onto the stack, the stack pointer is adjusted so that it keeps pointing to the next free location. Let's look at an example.

Suppose the stack pointer is 800000 and the next free location is 799999. The following code tells the compiler that it needs storage for an integer and a double-precision floating-point number.

    {
        int a = 1;
        double b = 1.1;
        // do something
    }

Both are value types, so they are naturally stored on the stack. When a is declared and assigned 1, a comes into scope. An int requires 4 bytes, so a is stored at 799996 to 799999. The stack pointer is then decreased by 4 to 799996, the end of the newly used space, and the next free location is 799995. The next line declares b and assigns it 1.1. A double occupies 8 bytes, so b is stored at 799988 to 799995, and the stack pointer is decreased by 8.

When b goes out of scope, the computer knows that the variable is no longer needed. Because variable lifetimes are always nested, no matter what happened while b was in scope, the stack pointer is guaranteed to point to the location where b is stored at the moment b goes out of scope.

To remove b, the stack pointer is incremented by 8, so that it points to where it pointed just before b was declared. At the closing curly brace a also goes out of scope, and the stack pointer is incremented by 4.

If a new variable is then added, the storage locations from 799999 downward are overwritten.
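The pointer arithmetic in this walkthrough can be replayed with a toy model. This is only a sketch: ToyStack is a hypothetical class that tracks a simulated stack pointer, and it says nothing about how the CLR actually lays out a stack frame.

```csharp
using System;

// Toy model only: tracks a simulated stack pointer, not real CLR memory.
public sealed class ToyStack
{
    // Address just above the highest free byte (800000 in the example).
    public int StackPointer { get; private set; }

    public ToyStack(int initialPointer)
    {
        StackPointer = initialPointer;
    }

    // Reserves 'size' bytes below the current pointer and returns the
    // lowest address of the reserved range.
    public int Push(int size)
    {
        StackPointer -= size;
        return StackPointer;
    }

    // Releases the most recently reserved 'size' bytes.
    public void Pop(int size)
    {
        StackPointer += size;
    }
}

public static class ToyStackDemo
{
    public static void Main()
    {
        var stack = new ToyStack(800000);

        int a = stack.Push(sizeof(int));     // a occupies 799996..799999
        int b = stack.Push(sizeof(double));  // b occupies 799988..799995
        Console.WriteLine($"a at {a}, b at {b}, pointer={stack.StackPointer}");

        stack.Pop(sizeof(double));           // b goes out of scope
        stack.Pop(sizeof(int));              // a goes out of scope
        Console.WriteLine($"pointer={stack.StackPointer}"); // pointer=800000
    }
}
```

Releasing in the reverse order of allocation is what lets a single pointer do all the bookkeeping.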

III. How the managed heap works

The stack has very high performance, but it requires that variable lifetimes be nested (a consequence of last in, first out). In many cases this requirement is too strict. We often want a method to allocate memory for some data and keep that data available long after the method exits. This is the situation whenever storage is requested with the new operator, as it is for all reference types. This is where the managed heap comes in.

If you have written C++ code that requires low-level memory management, you will be familiar with the heap. The managed heap is different from the heap used by C++: it works under the control of the garbage collector and has significant performance advantages over a traditional heap.

The managed heap is another region within the 4 GB available to the process. Let's use an example to see how the managed heap works and how memory is allocated for reference types. Suppose we have a Customer class.

    1  void DoSomething()
    2  {
    3      Customer john;
    4      john = new Customer();
    5  }

The code on line 3 declares a Customer reference named john and allocates storage for this reference on the stack. However, this is only a reference, not an actual Customer object. The john reference holds the address of a Customer object. Storing an address between 0 and 4 GB as an integer requires 4 bytes, so the john reference occupies 4 bytes.

The code on line 4 first allocates memory on the managed heap to store a Customer instance, and then sets the variable john to the memory address allocated to the Customer object.

Customer is a reference type, so it is placed on the managed heap. For convenience, assume that a Customer object occupies 32 bytes, including its instance fields and the information .NET uses to identify and manage instances of its class. To find a location for the new Customer object on the managed heap, the .NET runtime looks for a contiguous unused block of 32 bytes in the heap; assume its starting address is 200000.
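The 4-byte reference size above assumes a 32-bit process. As a small sketch, the pointer size of the current process can be checked with IntPtr.Size; the class name ReferenceSizeDemo is made up for illustration.

```csharp
using System;

public static class ReferenceSizeDemo
{
    public static void Main()
    {
        // IntPtr.Size is the size in bytes of a pointer in the current process:
        // 4 in a 32-bit process, 8 in a 64-bit process.
        Console.WriteLine($"Pointer size: {IntPtr.Size} bytes");
        Console.WriteLine($"64-bit process: {Environment.Is64BitProcess}");
    }
}
```

In a 64-bit process, the same reasoning gives an 8-byte reference instead of a 4-byte one.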

The john reference occupies locations 799996 to 799999 on the stack.

[Figure: memory before the Customer object is instantiated]

[Figure: memory contents after space is allocated for the Customer object]

Unlike the stack, memory on the heap is allocated upward, so all free space lies above the used space.

The preceding example shows that setting up a reference variable is much more complex than setting up a value variable, and some performance cost cannot be avoided: the .NET runtime needs to maintain information about the state of the heap, and this information must be updated whenever new data is added to the heap (more on this in the section on garbage collection). Despite this overhead, we gain a mechanism for allocating memory to variables that is not restricted by the stack:

Assign the value of reference variable a to another variable b of the same type. Both reference variables now refer to the same object. When variable b goes out of scope, it is removed from the stack, but the object it references remains on the heap, because another variable, a, still references it. The object is removed only when its data is no longer referenced by any variable.

This is the power of reference data types: we can control the lifetime of data independently. As long as there is a reference to the data, the data remains on the heap.
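A short sketch of two variables sharing one object. The Customer class here is a hypothetical stand-in with a single Name property; it is not taken from any library.

```csharp
using System;

// Hypothetical Customer class, for illustration only.
public class Customer
{
    public string Name { get; set; } = "";
}

public static class SharedReferenceDemo
{
    public static Customer CreateAndShare()
    {
        Customer a = new Customer { Name = "John" };
        {
            Customer b = a;      // b refers to the same heap object as a
            b.Name = "Mary";     // the change is visible through a
        }                        // b goes out of scope; the object stays alive
        return a;                // the object outlives this method too
    }

    public static void Main()
    {
        Customer c = CreateAndShare();
        Console.WriteLine(c.Name); // Mary
    }
}
```

The object is still reachable after both b and the method's own frame are gone, which a stack-allocated value could not be.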

IV. Garbage collection on the managed heap

Objects on the heap that are no longer referenced are deleted. If that were all that happened, the free space on the heap would become fragmented over time, and allocating memory for new objects would become difficult: the .NET runtime would have to search the entire heap to find a block large enough to store each new object.

However, when the garbage collector of the managed heap runs, after releasing every object that can be released, it compacts the remaining objects, moving them to one end of the heap so that they form a contiguous block. When an object is moved, every reference to it must be updated with the new address, which costs performance. In return, allocating on the managed heap only requires reading the value of the heap pointer, rather than searching a linked list of addresses to find a place for the new data.

Therefore, instantiating objects in .NET is much faster. Accessing them also tends to be faster, because the objects are compacted into the same area of the heap, so fewer pages need to be swapped. Microsoft believes that although the garbage collector has to do some work to update all references to the objects it moves, which costs performance, these gains compensate for the cost.
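The two ideas in this section, allocation by bumping a heap pointer and compaction of surviving objects, can be sketched with a toy model. ToyHeap and all of its members are hypothetical; the real .NET garbage collector is far more elaborate (generations, root tracing, and so on), and here objects are released by hand rather than discovered by the collector.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a compacting heap; not how the CLR is implemented.
public sealed class ToyHeap
{
    private sealed class Block
    {
        public int Address;
        public int Size;
        public bool Live = true;
    }

    private readonly List<Block> blocks = new List<Block>();
    private readonly int baseAddress;

    // Next free address; everything at or above it is unused.
    public int HeapPointer { get; private set; }

    public ToyHeap(int baseAddress)
    {
        this.baseAddress = baseAddress;
        HeapPointer = baseAddress;
    }

    // Allocation only reads and bumps the heap pointer; no search is needed.
    // Returns a handle so that callers are unaffected when the block moves.
    public int Allocate(int size)
    {
        blocks.Add(new Block { Address = HeapPointer, Size = size });
        HeapPointer += size;
        return blocks.Count - 1;
    }

    // Marks a block as no longer referenced (done by hand in this toy).
    public void Release(int handle)
    {
        blocks[handle].Live = false;
    }

    public int AddressOf(int handle)
    {
        return blocks[handle].Address;
    }

    // Slides live blocks toward the base address and updates their addresses.
    public void Compact()
    {
        int next = baseAddress;
        foreach (Block block in blocks)
        {
            if (!block.Live) continue;
            block.Address = next;
            next += block.Size;
        }
        HeapPointer = next;
    }
}

public static class ToyHeapDemo
{
    public static void Main()
    {
        var heap = new ToyHeap(200000);
        int first = heap.Allocate(32);   // 200000..200031
        int second = heap.Allocate(16);  // 200032..200047
        int third = heap.Allocate(32);   // 200048..200079

        heap.Release(second);            // leaves a 16-byte hole
        heap.Compact();                  // third slides down into the hole

        Console.WriteLine(heap.AddressOf(first));  // 200000
        Console.WriteLine(heap.AddressOf(third));  // 200032
        Console.WriteLine(heap.HeapPointer);       // 200064
    }
}
```

The handle indirection stands in for the reference updates the text describes: after compaction, every holder of a reference must see the new address.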

V. Boxing and unboxing

With the above knowledge, let's look at the following code.

    int i = 1;
    object o = i;    // boxing
    int j = (int)o;  // unboxing

int i = 1; allocates a 4-byte space on the stack to store the variable i.

object o = i;

The boxing process: first, a 4-byte space is allocated on the stack to store the reference variable o.

Then, space is allocated on the managed heap to store a copy of i. This space is somewhat larger than the space occupied by i, because a method table pointer and a SyncBlockIndex are added. The memory address of this space is returned.

Finally, the address is assigned to the variable o, which is now a reference to the object. No matter how the value of o changes, the value of i does not change; conversely, no matter how i changes, the value of o does not change, because they are stored in different places.
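A minimal sketch of that independence; the class and method names (BoxingCopyDemo, BoxThenModify) are made up for illustration.

```csharp
using System;

public static class BoxingCopyDemo
{
    public static (int Value, int Boxed) BoxThenModify()
    {
        int i = 1;
        object o = i;   // boxing: copies the value 1 into a new heap object
        i = 42;         // changes only the stack variable
        return (i, (int)o);
    }

    public static void Main()
    {
        var (value, boxed) = BoxThenModify();
        Console.WriteLine($"i={value} o={boxed}"); // i=42 o=1
    }
}
```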

int j = (int)o;

Unboxing: a 4-byte space is allocated on the stack for the variable j, and the value held by the instance that o references is copied into the memory of j, that is, assigned to j.

Note: only boxed objects can be unboxed. If o does not hold a boxed int, the code above throws an exception when executed.

A warning here: unboxing must be done carefully. Make sure that the receiving value variable has enough space to store the value obtained by unboxing.

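    long a = 999999999;
    object b = a;
    int c = (int)b;

A C# int has only 32 bits. Unboxing a boxed 64-bit long directly to int throws an InvalidCastException. An unboxing cast must name the exact type that was boxed, so the value has to be unboxed to long first and then converted to int.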
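The failing cast and one way around it can be sketched as follows. The class and method names are hypothetical; the two-step cast (int)(long) is standard C#.

```csharp
using System;

public static class UnboxingDemo
{
    // Returns true if unboxing a boxed long directly to int throws.
    public static bool DirectUnboxThrows()
    {
        object boxed = 999999999L;
        try
        {
            int c = (int)boxed;   // wrong: the box holds a long, not an int
            Console.WriteLine(c); // not reached
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // Unboxes to the exact boxed type first, then converts to int.
    public static int UnboxThenConvert()
    {
        object boxed = 999999999L;
        return (int)(long)boxed;
    }

    public static void Main()
    {
        Console.WriteLine(DirectUnboxThrows());  // True
        Console.WriteLine(UnboxThenConvert());   // 999999999
    }
}
```

The second cast is an ordinary numeric conversion, so it is up to the caller to make sure the value fits in an int.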

----------------------------------------------------------------- I am a split line --------------------------------------------------------------

If you find any errors, please point them out. I hope this helps you understand some basic concepts.

Following a tip from _ longmao, I found an interesting phenomenon. Let's look at the following code. Suppose we have a Member class with the fields Name and Num:

    Member member1 = new Member { Name = "Marry", Num = "001" };
    Member member2 = member1;
    member1.Name = "John";
    Console.WriteLine("member1.Name={0}  member2.Name={1}", member1.Name, member2.Name);

    int i = 1;
    object o = i;
    object o2 = o;
    o = 2;
    Console.WriteLine("o={0}  o2={1}", o, o2);

    string str1 = "Hello";
    string str2 = str1;
    str1 = "Hello, World!";
    Console.WriteLine("str1={0}  str2={1}", str1, str2);

    Console.ReadKey();

According to the theory above, member1 and member2 refer to the same object on the heap, so a change made through one of them is inevitably visible through the other.

Therefore, the output should be member1.Name=John  member2.Name=John, which is beyond doubt.

object and string are the only two predefined reference types in C#. What is the result for them?

By the same reasoning, the expected results would be o=2  o2=2 and str1=Hello, World!  str2=Hello, World!. Run the code and, OMG, that's wrong.

The actual result is o=2  o2=1 and str1=Hello, World!  str2=Hello.

The explanation for this phenomenon is that the string type is special (as explained in the link provided by _ Chinanet): once a string object is created, the space it occupies on the heap is fixed.

Assigning a new value to a string variable, for example str1 = "Hello, World!", does not modify the existing string. Suitable space must be allocated to hold the larger data (this happens at run time); that is, a new object is created, and the address stored in str1 is updated to point to the new object.

Therefore, str2 still points to the previous object, while str1 points to the newly created object. The two variables now reference different objects.

I could not understand why object behaves the same way... Perhaps because, as the two predefined reference types, they are two of a kind.

Thanks to alimama; otherwise I would not have noticed this.

Update: actually, object and string really are two of a kind. As the base class, an object variable can be bound to a value of any type. For example, give it an int:

    int i = 1;
    object o = i;

Obviously, the object referenced by o occupies more than 4 bytes on the heap (it also holds the information .NET uses to identify and manage instances of its class: a method table pointer and a SyncBlockIndex). Suppose it is 6 bytes.

What if we then bind a long to o?

    o = (long)100000000;

If the data were simply written into the original memory space, that small 6-byte space could not accommodate a value that needs at least 8 bytes.

New space must be allocated to hold the new object.

Strings, and boxed values held through object, cannot be changed once they are created (see C# Advanced Programming). Immutable here includes the memory size. Once the size is fixed, the methods and operators that appear to modify the content actually create a new object in newly allocated memory, because the previous space may not fit. Note that this is not an overload of the '=' operator, which C# does not allow to be overloaded: assignment simply makes the variable reference the new object, while any other variable keeps referencing the old one.
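One way to see that assignment rebinds a variable instead of changing the object is to compare references before and after, using object.ReferenceEquals. This is a sketch; the class and method names are made up for illustration.

```csharp
using System;

public static class RebindingDemo
{
    public static (bool Before, bool After) BoxedObject()
    {
        object o = 1;
        object o2 = o;
        bool before = object.ReferenceEquals(o, o2);  // same boxed object
        o = 2;                                        // o now refers to a new box
        bool after = object.ReferenceEquals(o, o2);
        return (before, after);
    }

    public static (bool Before, bool After) Strings()
    {
        string str1 = "Hello";
        string str2 = str1;
        bool before = object.ReferenceEquals(str1, str2);  // same string object
        str1 = "Hello, World!";                            // str1 now refers to another string
        bool after = object.ReferenceEquals(str1, str2);
        return (before, after);
    }

    public static void Main()
    {
        Console.WriteLine(BoxedObject()); // (True, False)
        Console.WriteLine(Strings());     // (True, False)
    }
}
```

By contrast, member1.Name = "John" in the earlier code changes a field of the shared object, which is why both member variables see the change.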

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
