.NET concepts that arise from a performance problem

Source: Internet
Author: User
Keywords: .NET, performance, GC, value type, reference type, heap, stack, string

1 Primer
Let's look at two groups of code. In each group, which of the two snippets is more efficient?

First group:

Code 1:

for (int i = 0; i < 10000; i++)
{
    AddressData ds = new AddressData();
    ds = Address.GetAddress();
}



Code 2:

for (int i = 0; i < 10000; i++)
{
    AddressData ds;
    ds = Address.GetAddress();
}

Second group:

Code 1:

string strNames = "@" + Guid.NewGuid().ToString().Replace("-", "") + ", @" + Guid.NewGuid().ToString().Replace("-", "") + ", @" + Guid.NewGuid().ToString().Replace("-", "");

for (int i = 0; i < 10000; i++)
{
    ......
}



Code 2:

for (int i = 0; i < 10000; i++)
{
    string strNames = "@" + Guid.NewGuid().ToString().Replace("-", "") + ", @" + Guid.NewGuid().ToString().Replace("-", "") + ", @" + Guid.NewGuid().ToString().Replace("-", "");
    ......
}

In each group the two snippets do the same job and differ only slightly, yet their efficiency differs markedly: one snippet in each group triggers garbage collection far more often than the other. Why?

To answer this question, we first need to understand a few .NET concepts.

2 What is GC
GC stands for garbage collection, the memory-management facility of .NET. The garbage collector tracks the objects allocated in managed memory and periodically reclaims the memory of objects that no longer have a valid reference. A collection is triggered automatically when a memory request cannot be satisfied from the available memory.

During a collection, the garbage collector first locates the managed objects in memory, then finds the objects that are still referenced from managed code and marks them as valid, then releases the objects that were not marked as valid and reclaims their memory, and finally compacts memory so that the valid objects sit together. These are the four steps of a GC.

As these steps suggest, a GC is expensive, so in general the fewer collections, the better.

To reduce this cost, the .NET GC supports object aging, also called generations. A generation is a measure of an object's age relative to the other objects in memory, and an object's generation number says which generation it belongs to. The current .NET garbage collector supports three generations (0, 1, and 2). Objects that survive a collection are promoted to the next generation. More recently created objects belong to newer generations and have lower generation numbers than objects created earlier in the application's life cycle; the newest objects are in generation 0. Each collection starts with generation 0, and higher generations are collected only if collecting the lower ones does not free enough memory.
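To make the idea of generations concrete, here is a minimal sketch that reads an object's generation before and after forced collections. The class and method names (GenerationDemo, TrackGenerations) are made up for illustration; GC.GetGeneration, GC.Collect, GC.MaxGeneration, and GC.CollectionCount are standard members of System.GC. Calling GC.Collect() is done here only to observe promotion and is normally not something production code should do.

```csharp
using System;

public static class GenerationDemo
{
    // Returns the generation of a fresh object before any collection and
    // after each of two forced collections that it survives.
    public static int[] TrackGenerations()
    {
        object survivor = new object();
        int before = GC.GetGeneration(survivor);

        GC.Collect();                              // forced here only for demonstration
        int afterFirst = GC.GetGeneration(survivor);

        GC.Collect();
        int afterSecond = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor);                    // keeps the object referenced up to this point
        return new[] { before, afterFirst, afterSecond };
    }

    public static void Main()
    {
        Console.WriteLine("Highest generation: " + GC.MaxGeneration);
        Console.WriteLine("Generations observed: " + string.Join(", ", TrackGenerations()));
        Console.WriteLine("Generation 0 collections so far: " + GC.CollectionCount(0));
    }
}
```

On a typical runtime the surviving object starts in generation 0 and moves up after each collection, but the exact numbers depend on the runtime and its settings, so treat the output as an observation rather than a guarantee.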

3 Stacks and heaps
Memory is divided into a stack and a heap. The stack follows the LIFO (last in, first out) principle: whatever is pushed last is popped first, which keeps this part of memory compact and means that addresses hardly need to be managed. The heap has no such rule: any object may be placed in the heap or removed from it at any time. We therefore have to keep track of where each object is stored, so the address of each heap object is kept in a reference, for example in a variable on the stack. Over time, gaps (fragments) appear in the heap, and to keep performance up the heap has to be compacted now and then to remove the fragmentation. [Illustration: stack and heap]


4 GC, stacks, and heaps
From these concepts we can see that the stack has no garbage collection problem: entries are simply pushed and popped. The heap, by contrast, faces the complicated problem of garbage collection. The GC operates entirely on the heap, and it decides whether a heap object is still valid by starting from the roots, such as the references held on the stack. Note that the .NET GC does not keep a reference count for each object; it is a tracing collector. During a collection, it walks the stack (and other roots such as static fields), follows every heap address it finds, and marks each object reached this way as valid. Every heap object that was not reached is then destroyed, its memory is reclaimed, and the fragments in the heap are compacted.
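The following sketch illustrates the difference between reachability and reference counting. Two objects that reference each other would never reach a count of zero under pure reference counting, yet a tracing collector can reclaim them once nothing on the stack refers to either one. The names Node, ReachabilityDemo, MakeUnreachableCycle, and ReachableObjectSurvives are hypothetical; WeakReference, GC.Collect, and GC.KeepAlive are standard .NET APIs.

```csharp
using System;
using System.Runtime.CompilerServices;

public sealed class Node
{
    public Node Other; // lets two nodes reference each other
}

public static class ReachabilityDemo
{
    // Builds two nodes that reference each other and returns only a weak
    // reference, so no strong reference to the pair survives this method.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference MakeUnreachableCycle()
    {
        var a = new Node();
        var b = new Node();
        a.Other = b;
        b.Other = a;
        return new WeakReference(a);
    }

    // Returns true if an object that is still referenced survives a collection.
    public static bool ReachableObjectSurvives()
    {
        var strong = new Node();
        var weak = new WeakReference(strong);
        GC.Collect();
        bool alive = weak.IsAlive;
        GC.KeepAlive(strong); // 'strong' stays reachable until this line
        return alive;
    }

    public static void Main()
    {
        WeakReference cycle = MakeUnreachableCycle();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine("Unreachable cycle still alive: " + cycle.IsAlive);
        Console.WriteLine("Reachable object survives: " + ReachableObjectSurvives());
    }
}
```

The first line usually reports that the cycle is gone, although the exact moment an unreachable object is reclaimed depends on the runtime and build settings. The second case is the dependable one: an object that is still referenced is never collected.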

5 Value types and reference types
As we all know, data types are divided into value types and reference types. So what exactly is a value type, and what is a reference type?

Most programming languages provide built-in data types, such as integers and floating-point numbers, that are copied when passed as parameters (that is, they are passed by value). In the .NET Framework, these are called value types. The runtime supports two kinds of value types: built-in value types and user-defined value types.

A reference type stores a reference to the memory address of the value. A reference type can be a self-describing type, a pointer type, or an interface type. The type of a reference type can be determined from the value of a self-describing type. Self-describing types are further subdivided into arrays and class types. Class types are user-defined classes, boxed value types, and delegates.

Each variable of a value type has its own copy of the data, so an operation on one variable does not affect any other variable. Variables of a reference type can refer to the same object, so an operation on one variable affects the object that another variable refers to.
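A small sketch of these copy semantics follows. PointValue, PointRef, and CopySemanticsDemo are names invented for this example.

```csharp
using System;

public struct PointValue { public int X; } // value type
public class PointRef { public int X; }    // reference type

public static class CopySemanticsDemo
{
    // Assigning a struct copies its data, so changing the copy leaves the original alone.
    public static int CopyStructThenModify()
    {
        PointValue a = new PointValue { X = 1 };
        PointValue b = a;   // b gets its own copy of the data
        b.X = 99;
        return a.X;
    }

    // Assigning a class variable copies only the reference, so both names see the change.
    public static int CopyReferenceThenModify()
    {
        PointRef a = new PointRef { X = 1 };
        PointRef b = a;     // b refers to the same object as a
        b.X = 99;
        return a.X;
    }

    public static void Main()
    {
        Console.WriteLine(CopyStructThenModify());    // prints 1
        Console.WriteLine(CopyReferenceThenModify()); // prints 99
    }
}
```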

6 Value types, reference types, stacks, and heaps
Knowing what value types and reference types are, how do the two behave in memory?

A local variable of a value type stores its data directly on the stack. An instance of a reference type is stored in the heap, and a reference to it (also called a pointer) is stored on the stack. A value type that is a field of a class lives inside that object, and therefore in the heap. [Illustration: value types on the stack, reference types in the heap with references on the stack]

Because of this difference in storage, operations on variables have different effects. If references B and C point to the same heap object, a change made to the data through B is also seen when the data is read through C. This does not happen with value types.

The difference also shows up in the way we write code. For example, assume the ModifyClass() method adds 2 to the Value1 field of the ClassA instance it receives.

ClassA ca = new ClassA();
ca.Value1 = 2;
ModifyClass(ca);
int getValue = ca.Value1;
......

At this point getValue is 4, even though ModifyClass() returns no data.

With a value type this does not work by default, because the method receives a copy; the method has to return the data, for example:

int ca = 2;
int getValue = ModifyValue(ca);

Another difference between value types and reference types appears when declaring a new variable: ClassA ca = null is legal, whereas int ca = null is not.
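Put together as a runnable sketch, the two calling styles look like this. ClassA, ModifyClass, and ModifyValue come from the text; their bodies, the TryModifyCopy helper, and the ParameterPassingDemo wrapper are assumptions made for illustration.

```csharp
using System;

public class ClassA
{
    public int Value1;
}

public static class ParameterPassingDemo
{
    // Receives a copy of the reference, so it changes the caller's object.
    public static void ModifyClass(ClassA target)
    {
        target.Value1 += 2;
    }

    // Receives a copy of the int, so the result has to be returned.
    public static int ModifyValue(int value)
    {
        return value + 2;
    }

    // Changes only its local copy; the caller never sees the change.
    public static void TryModifyCopy(int value)
    {
        value += 2;
    }

    public static void Main()
    {
        ClassA ca = new ClassA();
        ca.Value1 = 2;
        ModifyClass(ca);
        Console.WriteLine(ca.Value1);    // prints 4

        int number = 2;
        TryModifyCopy(number);
        Console.WriteLine(number);       // prints 2
        number = ModifyValue(number);
        Console.WriteLine(number);       // prints 4

        ClassA none = null;              // legal: a reference may point to nothing
        Console.WriteLine(none == null); // prints True
        // int broken = null;            // would not compile: int is a value type
    }
}
```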

7 Class instantiation steps
A class is the most commonly used kind of reference type, and instantiating one uses a very familiar statement:

ClassA ca = new ClassA();

What does the computer do in this short statement?

It actually does several things:

First, the declaration ClassA ca creates an empty reference and pushes it onto the stack.

Then new ClassA() creates an instance of ClassA and places it in the heap.

Finally, the assignment operator = points the reference ca at the newly created instance.

That completes the execution of the whole statement.
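The steps can be observed by splitting the statement apart. In this sketch the InstancesCreated counter and the InstantiationDemo class are additions for illustration only; the counter shows that declaring a reference or copying it creates nothing in the heap, and only new does.

```csharp
using System;

public class ClassA
{
    public static int InstancesCreated; // illustrative counter, not part of the original ClassA

    public ClassA()
    {
        InstancesCreated++;
    }
}

public static class InstantiationDemo
{
    // Returns the instance count after the declaration, after new, and after copying the reference.
    public static int[] RunSteps()
    {
        ClassA.InstancesCreated = 0;

        ClassA ca = null;                 // step 1: an empty reference, nothing in the heap yet
        int afterDeclaration = ClassA.InstancesCreated;

        ca = new ClassA();                // steps 2 and 3: create the instance, then point ca at it
        int afterNew = ClassA.InstancesCreated;

        ClassA alias = ca;                // copies the reference only
        int afterAlias = ClassA.InstancesCreated;

        GC.KeepAlive(alias);
        return new[] { afterDeclaration, afterNew, afterAlias };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", RunSteps())); // prints 0, 1, 1
    }
}
```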

Now that we have covered these concepts, we can answer the question posed at the start of this article.

8 Answering the opening question
For the first group of code, we first need to understand what

AddressData ds = new AddressData();
ds = Address.GetAddress();

does in memory. We already know the situation after the first statement completes: ds on the stack points to a newly created instance (instance 1) in the heap.

After the second statement completes, ds points to a different instance (instance 2) instead.

Instance 1 is created by new AddressData() in our code, and instance 2 is created by the Address.GetAddress() method. Instance 2 is the valid object; instance 1 is no longer referenced by anything and simply waits for the garbage collector to reclaim it.

Generally speaking, this practice does not cause much of a performance problem. But what about inside a loop, as in the example at the start of this article?

There it produces a large amount of garbage in the heap and occupies a lot of memory, forcing frequent GCs, which seriously hurts performance.
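A rough way to see the extra garbage is to count constructed instances. In this sketch the bodies of AddressData and Address.GetAddress() are assumptions (the article does not show them), and the InstancesCreated counter and LoopAllocationDemo class are added only for the demonstration.

```csharp
using System;

public class AddressData
{
    public static int InstancesCreated; // illustrative counter

    public AddressData()
    {
        InstancesCreated++;
    }
}

public static class Address
{
    // Assumed implementation: returns a freshly created instance.
    public static AddressData GetAddress()
    {
        return new AddressData();
    }
}

public static class LoopAllocationDemo
{
    // Code 1 of the first group: every iteration creates an instance that is discarded at once.
    public static int CountWithRedundantNew(int iterations)
    {
        AddressData.InstancesCreated = 0;
        for (int i = 0; i < iterations; i++)
        {
            AddressData ds = new AddressData(); // becomes garbage on the next line
            ds = Address.GetAddress();
        }
        return AddressData.InstancesCreated;
    }

    // Code 2 of the first group: only the instance from GetAddress() is created.
    public static int CountWithoutRedundantNew(int iterations)
    {
        AddressData.InstancesCreated = 0;
        for (int i = 0; i < iterations; i++)
        {
            AddressData ds;
            ds = Address.GetAddress();
        }
        return AddressData.InstancesCreated;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithRedundantNew(10000));    // prints 20000
        Console.WriteLine(CountWithoutRedundantNew(10000)); // prints 10000
    }
}
```

Half of the objects created by the first variant are never used, so the garbage collector has twice as much to clean up for the same result.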

So what about the second group of code?

The second group looks quite different from the first: the first deals with a class, the typical reference type, while the second deals with a string, which usually looks like a value type.

In fact, string is a rather special type. It is a reference type, but because it is immutable it behaves in some ways like a value type. For example, we have to work with it like this:

string ds = "This is a Test";
ds = ModifyString(ds);

Note the second statement. It is the typical way of handling a value type, and ModifyString() must return the data, because a string can never be changed in place: every "modification" produces a new string object in the heap. On the other hand, when we initialize a new string variable we can write string ds = null, as with any reference type. It's odd, right? My guess is that the designers wanted string to feel as consistent as possible with types such as int and float, which it resembles in everyday use; but unlike int and float, whose size in bits is fixed, a string has no fixed length.

This is what causes the performance difference in the second group. Code 2 rebuilds strNames on every pass through the loop, and each Guid.NewGuid().ToString(), Replace(), and concatenation creates new string objects that soon become garbage, whereas Code 1 builds the string once before the loop. Code 1 is the better choice whenever the loop does not need a fresh value on every pass.
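The immutability of string can be checked directly. In this sketch ModifyString is the method named in the text, with an assumed body that appends a suffix; StringImmutabilityDemo is a made-up wrapper class.

```csharp
using System;

public static class StringImmutabilityDemo
{
    // Assumed body: cannot change 's' in place, so it builds and returns a new string.
    public static string ModifyString(string s)
    {
        return s + " (modified)";
    }

    public static void Main()
    {
        string ds = "This is a Test";
        string original = ds;             // second reference to the same string object

        ds = ModifyString(ds);            // ds now refers to a new object

        Console.WriteLine(original);      // prints This is a Test
        Console.WriteLine(ds);            // prints This is a Test (modified)
        Console.WriteLine(ReferenceEquals(original, ds)); // prints False
    }
}
```

Because every such operation allocates a new string, building a long string inside a hot loop is a common source of heap garbage.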

9 A digression: must we use class?
We know that class is a reference type and struct is a value type. Class belongs to the era of object-oriented programming, while struct is often seen as a workaround left over from the time when object-oriented programming was still in its infancy, so much so that Java abandoned the concept of struct. So why does .NET keep what Java gave up?

struct is not rubbish. First, a struct is a value type, so it can be stored on the stack (or inline inside the object or array that contains it) rather than as a separate object in the heap, which avoids the performance cost of the GC. Second, suppose an object has 100 elements: how much memory does it occupy when defined as a class, and how much when defined as a struct? In simplified terms, the class version consumes 101 blocks of memory (100 elements plus one reference pointer), while the struct version takes only 100 blocks. You might say that one extra block hardly matters, since it is only a 1% increase, but what if the object has only three elements? Then the reference alone adds a third. (In practice a heap object also carries a small header, which this simplified count ignores.)

So we can clearly conclude that class is not the only choice.
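The difference becomes visible with arrays. In the sketch below all type and method names (PointStruct, PointClass, StructVsClassDemo, and so on) are invented for illustration. An array of structs holds its elements inline and is usable immediately, while an array of class instances starts as a row of null references and needs one extra heap object per element.

```csharp
using System;

public struct PointStruct
{
    public int X;
    public int Y;
    public int Z;
}

public class PointClass
{
    public static int InstancesCreated; // illustrative counter
    public int X;
    public int Y;
    public int Z;

    public PointClass()
    {
        InstancesCreated++;
    }
}

public static class StructVsClassDemo
{
    // A struct array needs no per-element 'new': every element already exists with default values.
    public static bool StructElementsAreReady(int n)
    {
        var items = new PointStruct[n];
        foreach (PointStruct p in items)
        {
            if (p.X != 0 || p.Y != 0 || p.Z != 0) return false;
        }
        return true;
    }

    // A class array starts as n null references; returns how many objects had to be created to fill it.
    public static int AllocateClassElements(int n)
    {
        PointClass.InstancesCreated = 0;
        var items = new PointClass[n];
        for (int i = 0; i < n; i++)
        {
            if (items[i] == null) items[i] = new PointClass();
        }
        return PointClass.InstancesCreated;
    }

    public static void Main()
    {
        Console.WriteLine(StructElementsAreReady(100)); // prints True
        Console.WriteLine(AllocateClassElements(100));  // prints 100
    }
}
```

This is a sketch of one situation where a struct saves heap objects. Large structs are costly to copy, so small, short-lived data is where they tend to pay off.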


