Primitive types, reference types, and value types

Source: Internet
Author: User
Tags mscorlib


1. primitive type

Some data types, such as int and string, are used constantly when writing code. For example, we can define an integer like this:

int a = 0;

We can also define it with the following statement:

System.Int32 a = new System.Int32();

Both statements produce the same result. Why? Because the C# compiler treats int as a built-in alias for System.Int32. I prefer the first form: it is easier to read, and the IL it generates is exactly the same as the IL generated for System.Int32. So what is a primitive type? A data type that the compiler supports directly is called a primitive type. Each primitive type maps directly to a type in the .NET Framework Class Library (FCL). For example, the int type in C# maps directly to System.Int32. The four lines below are written differently, but the IL they generate is identical. For types that comply with the Common Language Specification (CLS), other languages offer similar primitive types; languages are not required to support types that are not CLS-compliant.

int a = 0;
System.Int32 a = 0;
int a = new int();
System.Int32 a = new System.Int32();
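The claim that the keyword and the FCL name denote the same type can be checked at run time. The following is a minimal sketch; the class name AliasDemo and the helper SameType are made up for illustration and are not part of the original article.

```csharp
using System;

static class AliasDemo
{
    // True when the C# keyword and the FCL name denote the same type.
    public static bool SameType() => typeof(int) == typeof(System.Int32);

    public static void Main()
    {
        int a = 0;
        System.Int32 b = new System.Int32();      // also 0

        Console.WriteLine(SameType());            // True
        Console.WriteLine(a == b);                // True
        Console.WriteLine(a.GetType().FullName);  // System.Int32
    }
}
```

Because int is only another spelling of System.Int32, the choice between the two is a matter of style, not of behavior.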

However, the author of CLR via C# does not recommend using the primitive type keywords. His reasons are reproduced below for reference:

1. Many developers are unsure whether to write String or string. Because the C# keyword string maps directly to System.String (an FCL type), there is no difference between the two, and either can be used. Similarly, some developers claim that int represents a 32-bit integer when the application runs on a 32-bit operating system and a 64-bit integer when it runs on a 64-bit operating system. This claim is completely false. In C#, int always maps to System.Int32, so it represents a 32-bit integer no matter which operating system the code runs on. Programmers who are used to writing Int32 in their code will not make this mistake.

2. In C#, long maps to System.Int64, but in other programming languages long may map to Int16 or Int32. For example, C++/CLI treats long as Int32. Someone used to one language can easily misread the intent of source code written in another. In fact, most languages do not even treat long as a keyword and will not compile code that uses it.

3. Many FCL methods have type names as part of their method names. For example, the BinaryReader type offers ReadBoolean, ReadInt32, and ReadSingle, and the System.Convert type offers ToBoolean, ToInt32, and ToSingle. The following code is legal, but the line containing float looks unnatural, and it is not immediately obvious that it is correct:

BinaryReader br = new BinaryReader(...);
float val = br.ReadSingle();   // correct, but feels awkward
Single val = br.ReadSingle();  // correct, and feels natural

4. Many programmers who use only C# gradually forget that other languages can also target the CLR, and so C#-isms creep into class library code. For example, Microsoft's FCL is written almost entirely in C#, and the FCL team introduced methods such as Array's GetLongLength into the library. This method returns an Int64 value, which is a long in C# but not in other languages. Another example is the LongCount method of System.Linq.Enumerable.

These are the reasons the author does not recommend the primitive type keywords, and why every type in CLR via C# is written with its FCL type name. Personally, I still find the primitive type keywords more comfortable than the FCL names, so I keep using them. The IL generated for the two forms is the same, so neither performs better or worse than the other. The figure referenced here (borrowed from another blogger) lists the C# primitive types and their corresponding FCL types.
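The keyword-to-FCL mapping can also be printed directly. The following is a minimal sketch: the class PrimitiveTypeMap and its helpers are hypothetical names introduced for illustration. The last lines touch on the first point in the list above: int and long have fixed sizes, and only the native pointer size depends on the process.

```csharp
using System;

static class PrimitiveTypeMap
{
    // Returns the FCL type name behind a C# keyword, e.g. FclName<int>() gives "System.Int32".
    public static string FclName<T>() => typeof(T).FullName;

    static void Print<T>(string keyword) =>
        Console.WriteLine(keyword.PadRight(8) + "-> " + FclName<T>());

    public static void Main()
    {
        Print<sbyte>("sbyte");     // System.SByte
        Print<byte>("byte");       // System.Byte
        Print<short>("short");     // System.Int16
        Print<ushort>("ushort");   // System.UInt16
        Print<int>("int");         // System.Int32
        Print<uint>("uint");       // System.UInt32
        Print<long>("long");       // System.Int64
        Print<ulong>("ulong");     // System.UInt64
        Print<char>("char");       // System.Char
        Print<float>("float");     // System.Single
        Print<double>("double");   // System.Double
        Print<bool>("bool");       // System.Boolean
        Print<decimal>("decimal"); // System.Decimal
        Print<string>("string");   // System.String
        Print<object>("object");   // System.Object

        Console.WriteLine(sizeof(int));   // 4
        Console.WriteLine(sizeof(long));  // 8
        Console.WriteLine(IntPtr.Size);   // 4 in a 32-bit process, 8 in a 64-bit process
    }
}
```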

2. reference type and value type

// Reference type
class Ref { public int x; }
// Value type
struct Val { public int x; }
static void Demo()
{
    Ref r1 = new Ref();          // allocated on the managed heap
    Val v1 = new Val();          // allocated on the stack
    r1.x = 5;                    // dereferences the pointer
    v1.x = 5;                    // modified on the stack
    Console.WriteLine(r1.x);     // displays 5
    Console.WriteLine(v1.x);     // also displays 5
    Ref r2 = r1;                 // copies only the reference (pointer)
    Val v2 = v1;                 // allocates on the stack and copies the members
    r1.x = 8;                    // changes both r1.x and r2.x
    v1.x = 9;                    // changes v1.x but not v2.x
    Console.WriteLine(r1.x);     // displays 8
    Console.WriteLine(r2.x);     // displays 8
    Console.WriteLine(v1.x);     // displays 9
    Console.WriteLine(v2.x);     // displays 5
}

The main advantage of value types is that they are not allocated on the managed heap. Compared with reference types, however, value types also have some limitations. The differences between the two are listed below.

1. A value type object has two representations, unboxed and boxed; a reference type is always in boxed form.

2. A value type always derives from System.ValueType, which has the same methods as System.Object. However, ValueType overrides the Equals method so that it returns true when the field values of the two objects match exactly, and it also overrides the GetHashCode method.

3. A value type cannot serve as the base type of another type, so none of its methods can be abstract, and all of its methods are implicitly sealed (they cannot be overridden).

4. Assigning a value type variable to another value type variable copies it field by field. Assigning a reference type variable to another reference type variable copies only the memory address, so both variables point to the same object in the heap, and an operation performed through one variable can affect the object referenced by the other.

5. Because unboxed value types are not allocated on the heap, the storage allocated to them is freed as soon as the method that defines an instance of the type is no longer active, instead of waiting for a garbage collection.

6. A reference type variable contains the address of an object in the heap. By default, a reference type variable is initialized to null when it is created, meaning that it does not currently point to a valid object, and trying to use a null reference type variable throws a NullReferenceException. In contrast, all members of a value type are initialized to 0, and accessing a value type cannot throw a NullReferenceException. The CLR does add the notion of nullability to value types through nullable types.
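Several of the differences listed above can be observed directly. The following is a minimal sketch; the types RefPoint and ValPoint and the class DifferencesDemo are hypothetical names introduced for illustration only.

```csharp
using System;

class RefPoint { public int X; }      // reference type
struct ValPoint { public int X; }     // value type

static class DifferencesDemo
{
    // Item 2: ValueType.Equals compares field values.
    public static bool StructsWithSameFieldsAreEqual()
    {
        ValPoint a = new ValPoint { X = 1 };
        ValPoint b = new ValPoint { X = 1 };
        return a.Equals(b);
    }

    // For a class, Object.Equals compares identity unless it is overridden.
    public static bool ClassesWithSameFieldsAreEqual()
    {
        RefPoint a = new RefPoint { X = 1 };
        RefPoint b = new RefPoint { X = 1 };
        return a.Equals(b);
    }

    // Item 6: dereferencing a null reference throws NullReferenceException.
    public static bool NullReferenceThrows()
    {
        RefPoint r = null;
        try { Console.WriteLine(r.X); return false; }
        catch (NullReferenceException) { return true; }
    }

    public static void Main()
    {
        Console.WriteLine(StructsWithSameFieldsAreEqual());  // True
        Console.WriteLine(ClassesWithSameFieldsAreEqual());  // False
        Console.WriteLine(NullReferenceThrows());            // True
        Console.WriteLine(default(ValPoint).X);              // 0: value type members start at zero
        int? maybe = null;                                   // Nullable<int>
        Console.WriteLine(maybe.HasValue);                   // False
    }
}
```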

 

3. boxing and unboxing

Any discussion of reference types and value types has to cover boxing and unboxing, described below.

Value types are "lighter" than reference types because they are not allocated as objects in the managed heap, are not garbage collected, and are not referred to through pointers. In many cases, however, you need a reference to an instance of a value type. For example, the code below creates an ArrayList to hold a set of Point structures. (ArrayList holds value types here only for the sake of the example. It is better not to use it this way, because the FCL already provides the generic collection class List<T>, which does not box or unbox value types.)

// Declare a value type
struct Point { public int x, y; }
static void Main()
{
    ArrayList arraylist = new ArrayList();
    Point p;                     // allocate a Point (not on the heap)
    for (int i = 0; i < 10; i++)
    {
        p.x = p.y = i;           // initialize the members of the value type
        arraylist.Add(p);        // box the value type and add the reference to the ArrayList
    }
    Console.ReadLine();
}

As the code above shows, each iteration initializes the fields of a Point and stores the Point in the ArrayList. But what does the ArrayList actually store? The Point structure, an address, or something else? To find out, look at how the parameter of ArrayList's Add method is defined:

//
// Summary:
//     Adds an object to the end of the System.Collections.ArrayList.
//
// Parameters:
//   value:
//     The System.Object to be added to the end of the System.Collections.ArrayList. The value can be null.
//
// Returns:
//     The System.Collections.ArrayList index at which the value has been added.
//
// Exceptions:
//   System.NotSupportedException:
//     The System.Collections.ArrayList is read-only. -or- The System.Collections.ArrayList has a fixed size.
public virtual int Add(object value);

As you can see, Add takes an object parameter. In other words, Add requires a reference (or pointer) to an object on the managed heap. But Point is a value type, so it must be converted into a real object on the managed heap. Converting a value type to a reference type is called boxing. So what happens during boxing?

1. Memory is allocated in the managed heap. The amount allocated is the size required by the fields of the value type plus the two additional members that every object on the managed heap has (the type object pointer and the sync block index).

2. The fields of the value type are copied to the newly allocated heap memory.

3. The address of the object is returned. This address is a reference to an object, so the value type has become a reference type.
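The three steps above imply two things that can be observed: every boxing operation creates a separate heap object, and the boxed copy is independent of the original variable. The following is a minimal sketch; the class BoxingCopyDemo and its method names are made up for illustration.

```csharp
using System;

static class BoxingCopyDemo
{
    // Boxing the same value twice yields two distinct heap objects.
    public static bool EachBoxIsANewObject()
    {
        int val = 5;
        object first = val;    // box #1: allocate and copy
        object second = val;   // box #2: allocate and copy again
        return !object.ReferenceEquals(first, second);
    }

    // The box holds a copy of the fields, so later changes to val do not reach it.
    public static int BoxKeepsTheCopiedValue()
    {
        int val = 5;
        object boxed = val;
        val = 123;
        return (int)boxed;
    }

    public static void Main()
    {
        Console.WriteLine(EachBoxIsANewObject());     // True
        Console.WriteLine(BoxKeepsTheCopiedValue());  // 5
    }
}
```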

Now that we understand boxing, let's look at how unboxing works:

Unboxing is not the exact reverse of boxing. Unboxing itself is just the process of obtaining a pointer to the original value type (the data fields) contained in an object, and it is usually followed by a copy of those fields. Unboxing is therefore much cheaper than boxing.

Internally, the following happens when a boxed value type instance is unboxed:

1. If the variable containing the reference to the boxed value type instance is null, a NullReferenceException is thrown.

2. If the reference does not refer to a boxed instance of the desired value type, an InvalidCastException is thrown.
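Both failure cases can be reproduced in a few lines. The following is a minimal sketch; the class UnboxingDemo and the helper TryUnbox are hypothetical names used only for this illustration.

```csharp
using System;

static class UnboxingDemo
{
    // Tries to unbox the argument to Int32 and reports what happened.
    public static string TryUnbox(object boxed)
    {
        try
        {
            int value = (int)boxed;   // unboxing, followed by a copy of the value
            return "ok: " + value;
        }
        catch (NullReferenceException) { return "NullReferenceException"; }
        catch (InvalidCastException) { return "InvalidCastException"; }
    }

    public static void Main()
    {
        Console.WriteLine(TryUnbox(5));     // ok: 5
        Console.WriteLine(TryUnbox(null));  // NullReferenceException
        Console.WriteLine(TryUnbox(5L));    // InvalidCastException: the box holds an Int64, not an Int32
    }
}
```

The third call shows that unboxing requires the exact value type: a boxed Int64 must first be unboxed to long and only then converted to int.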

Let's look at boxing and unboxing in code:

static void Main()
{
    int val = 5;            // create an unboxed value type variable
    object obj = val;       // box val; obj refers to the boxed copy
    val = 123;              // change the unboxed value to 123
    Console.WriteLine(val + "," + (int)obj);   // displays 123,5
}

Can you tell from the code above how many boxing and unboxing operations take place? Viewing the IL for this code with ILDasm makes it clear:

.method private hidebysig static void Main() cil managed
{
  .entrypoint
  // Code size 47 (0x2f)
  .maxstack 3
  .locals init ([0] int32 val, [1] object obj)
  IL_0000: nop
  // Load 5 into val
  IL_0001: ldc.i4.5
  IL_0002: stloc.0
  // Box val and store the reference in obj
  IL_0003: ldloc.0
  IL_0004: box [mscorlib]System.Int32
  IL_0009: stloc.1
  // Load 123 into val
  IL_000a: ldc.i4.s 123
  IL_000c: stloc.0
  // Box val and leave the reference on the stack for the Concat call
  IL_000d: ldloc.0
  IL_000e: box [mscorlib]System.Int32
  // Load the string onto the stack
  IL_0013: ldstr ","
  // Unbox obj: get a pointer to the Int32 field and copy its value onto the stack
  IL_0018: ldloc.1
  IL_0019: unbox.any [mscorlib]System.Int32
  // Box the Int32 again and leave the reference on the stack for the Concat call
  IL_001e: box [mscorlib]System.Int32
  // Call the Concat method
  IL_0023: call string [mscorlib]System.String::Concat(object, object, object)
  // Pass the string returned by Concat to WriteLine
  IL_0028: call void [mscorlib]System.Console::WriteLine(string)
  IL_002d: nop
  // Return from Main, terminating the application
  IL_002e: ret
} // end of method Program::Main

The IL above shows three box instructions and one unbox instruction. The first boxing is obvious, but the second may puzzle some readers. The call to WriteLine needs a String object, so the C# compiler generates code that calls String's static Concat method to build one. Concat has several overloads, and the following one is called here:

public static string Concat(object arg0, object arg1, object arg2);

So val, and obj after it has been cast to Int32, are both boxed and passed to Concat. If you are interested, change the WriteLine call above to Console.WriteLine(val + "," + obj); and look at the IL again: the code is about 10 bytes smaller. The extra unboxing and boxing steps also allocate an additional object in the managed heap that must later be garbage collected. Too many boxing operations therefore hurt both performance and memory consumption, so we should try to reduce boxing in our own code.
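The variants discussed above all print the same text; they differ only in how much boxing and unboxing they ask for. The following is a minimal sketch with hypothetical names (ConcatDemo and its three methods). Note the assumption: the exact IL depends on the compiler version, and a newer compiler may well emit something different from the listing shown earlier.

```csharp
using System;

static class ConcatDemo
{
    // The article's original expression: unbox obj, then concatenate.
    public static string WithCast(int val, object obj) => val + "," + (int)obj;

    // The suggested change: pass obj along without unboxing it first.
    public static string WithoutCast(int val, object obj) => val + "," + obj;

    // Calling ToString() on the unboxed Int32 hands Concat a string,
    // so val itself does not need to be boxed.
    public static string WithToString(int val, object obj) => val.ToString() + "," + obj;

    public static void Main()
    {
        int val = 5;
        object obj = val;   // the only explicit boxing in this sketch
        val = 123;

        Console.WriteLine(WithCast(val, obj));      // 123,5
        Console.WriteLine(WithoutCast(val, obj));   // 123,5
        Console.WriteLine(WithToString(val, obj));  // 123,5
    }
}
```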

If you know that your code will cause the compiler to box the same value repeatedly, it is better to box it manually once, for example:

static void Main()
{
    int val = 5;
    // val is boxed three times
    Console.WriteLine("{0} {1} {2}", val, val, val);
    // Box val manually, once
    object obj = val;
    // No further boxing occurs
    Console.WriteLine("{0} {1} {2}", obj, obj, obj);
}
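Another way to avoid boxing, mentioned in the ArrayList discussion earlier, is to use the generic List<T>. The following is a minimal sketch comparing the two collections with the article's Point structure; the class CollectionDemo and its method names are hypothetical.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

struct Point { public int x, y; }

static class CollectionDemo
{
    public static int SumWithArrayList()
    {
        ArrayList list = new ArrayList();
        for (int i = 0; i < 10; i++)
        {
            Point p;
            p.x = p.y = i;
            list.Add(p);              // Add(object): p is boxed
        }
        int sum = 0;
        foreach (object o in list)
            sum += ((Point)o).x;      // each element must be unboxed
        return sum;
    }

    public static int SumWithGenericList()
    {
        List<Point> list = new List<Point>();
        for (int i = 0; i < 10; i++)
        {
            Point p;
            p.x = p.y = i;
            list.Add(p);              // Add(Point): the value is copied, not boxed
        }
        int sum = 0;
        foreach (Point item in list)
            sum += item.x;            // no cast and no unboxing
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumWithArrayList());    // 45
        Console.WriteLine(SumWithGenericList());  // 45
    }
}
```

Both methods compute the same sum; the generic version simply never converts a Point to an object.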

This article has covered primitive types, the differences between reference types and value types, and finally boxing and unboxing, illustrated with both text and code. I hope it leaves a lasting impression or, at the least, serves as a starting point. The article draws on CLR via C#, a book worth reading if you have the time. I plan to write more articles in this series and share them with you.
