Don't lose sight of the concepts: a deep understanding of C# value types and reference types

Source: Internet
Author: User

Conceptually, a value type stores its value directly, while a reference type stores a reference to its value. The two kinds of types are stored in different places in memory. In C#, we must decide how instances of a type will behave when we design the type. This decision is very important. In the words of Jeffrey Richter, author of CLR via C#: "I believe that a developer who misunderstands the difference between reference types and value types will introduce subtle bugs and performance issues into their code." This requires us to understand and use value types and reference types correctly.

  • 1. Common Type System
  • 2. Value Type
  • 3. Reference Type
  • 4. Deployment of value and reference types in memory
    • 4.1 Deployment of arrays in memory
    • 4.2 Nesting of value types and reference types
  • 5. Use the value type and reference type correctly
    • 5.1 Usage of value types and reference types
    • 5.2 Implement value types as immutable and atomic types as much as possible
    • 5.3 Make sure that 0 is a valid state of the value type
    • 5.4 Reduce boxing and unboxing as much as possible
  • 6. Summary
  • 7. Reference

1. Common Type System

In C#, whether a variable holds a value or a reference depends only on its data type.

The basic data types of C# are defined in a platform-independent manner. The predefined types of C# are not built into the language but into the .NET Framework. .NET uses the Common Type System (CTS) to define the predefined data types that can be used in intermediate language (IL). Every .NET language is eventually compiled to IL, that is, to code based on CTS types.

For example, when C# declares an int variable, it is actually declaring an instance of System.Int32 in the CTS. This has important implications:

  • Type safety can be enforced at the IL level;
  • Different .NET languages can interoperate;
  • All data types are objects. They can have methods, properties, and so on. For example:

int i;
i = 1;
string s;
s = i.ToString();

A diagram on MSDN shows how the various CTS types are related. Note that an instance of a type is either a value type or a self-describing type, even though these types have subcategories.
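The alias relationship between int and System.Int32 can be checked directly. The following is a minimal sketch; the class name CtsAliasDemo and the helper IntIsInt32 are illustrative names, not part of the original article.

```csharp
using System;

public static class CtsAliasDemo
{
    // True when the C# keyword and the CTS type name denote the same type.
    public static bool IntIsInt32() => typeof(int) == typeof(System.Int32);

    public static void Main()
    {
        int i = 1;
        Console.WriteLine(IntIsInt32());          // True
        Console.WriteLine(i.GetType().FullName);  // System.Int32
        Console.WriteLine(i.ToString());          // 1
    }
}
```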

2. Value Type

All value types of C# are implicitly derived from System.ValueType:

  • Structs: struct (derived directly from System.ValueType);
    • Numeric types:
      • Integral: sbyte (alias of System.SByte), short (System.Int16), int (System.Int32), long (System.Int64), byte (System.Byte), ushort (System.UInt16), uint (System.UInt32), ulong (System.UInt64), char (System.Char);
      • Floating-point: float (System.Single), double (System.Double);
      • The high-precision decimal type used for financial calculations: decimal (System.Decimal).
    • Boolean: bool (alias of System.Boolean);
    • User-defined structs (derived from System.ValueType).
  • Enumerations: enum (derived from System.Enum);
  • Nullable types (based on the generic struct System.Nullable<T>; T? is an alias of System.Nullable<T>).

Each value type has an implicit default constructor that initializes it to the default value of the type. For example:

int i = new int();

This is equivalent to:

Int32 i = new Int32();

This is equivalent to:

int i = 0;

This is equivalent to:

Int32 i = 0;

When the new operator is used, the default constructor of the specific type is called and the default value is assigned to the variable. In the preceding example, the default constructor assigns 0 to i. MSDN has a complete table of default values.

For more about int and Int32, see "Understanding System.Int32 and int in C#".

All value types are sealed, so no new type can be derived from a value type.

It is worth noting that System.ValueType is derived directly from System.Object. That is, System.ValueType is itself a class type rather than a value type. The key point is that ValueType overrides the Equals() method so that value types are compared by instance value rather than by reference address.

You can use the Type.IsValueType property to determine whether a type is a value type:

TestType testType = new TestType();
if (testType.GetType().IsValueType)
{
    Console.WriteLine("{0} is value type.", testType.ToString());
}
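The points above about System.ValueType can be verified with a short program. This is a sketch only; the struct Point and the class ValueTypeDemo are hypothetical names introduced for illustration.

```csharp
using System;

public struct Point
{
    public int X;
    public int Y;
}

public static class ValueTypeDemo
{
    public static void Main()
    {
        // System.ValueType itself is a class, not a value type.
        Console.WriteLine(typeof(ValueType).IsValueType);               // False
        Console.WriteLine(typeof(ValueType).IsClass);                   // True

        // A struct is a value type and derives from System.ValueType.
        Console.WriteLine(typeof(Point).IsValueType);                   // True
        Console.WriteLine(typeof(Point).BaseType == typeof(ValueType)); // True

        // The Equals() inherited from ValueType compares field values.
        Point a = new Point { X = 1, Y = 2 };
        Point b = new Point { X = 1, Y = 2 };
        Console.WriteLine(a.Equals(b));                                 // True

        // Two distinct class instances compare by reference by default.
        object o1 = new object();
        object o2 = new object();
        Console.WriteLine(o1.Equals(o2));                               // False
    }
}
```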

3. Reference Type

C# has the following reference types:

  • Arrays (derived from System.Array);
  • User-defined types:
    • Classes: class (derived from System.Object);
    • Interfaces: interface (an interface is not a "thing", so the question of what it derives from does not arise. Anders says in The C# Programming Language that an interface only represents a contract);
    • Delegates: delegate (derived from System.Delegate).
  • object (alias of System.Object);
  • Strings: string (alias of System.String).

We can see that:

  • Like reference types, structs can also implement interfaces;
  • A new type can be derived from a reference type, but not from a value type;
  • A reference type can hold null and a value type cannot (nullable types allow null to be assigned to a value type);
  • Assigning a reference type variable copies only the reference to the object, not the object itself. Assigning one value type variable to another copies the contained value.
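The difference in assignment behavior described in the last point can be seen in a small experiment. This is a minimal sketch; PointStruct, PointClass, and CopySemanticsDemo are hypothetical names used only for illustration.

```csharp
using System;

public struct PointStruct { public int X; }
public class PointClass { public int X; }

public static class CopySemanticsDemo
{
    public static int CopyStructThenModify()
    {
        PointStruct a = new PointStruct { X = 1 };
        PointStruct b = a;   // copies the value
        b.X = 2;
        return a.X;          // a is unaffected
    }

    public static int CopyClassThenModify()
    {
        PointClass a = new PointClass { X = 1 };
        PointClass b = a;    // copies only the reference
        b.X = 2;
        return a.X;          // a and b refer to the same object
    }

    public static void Main()
    {
        Console.WriteLine(CopyStructThenModify()); // 1
        Console.WriteLine(CopyClassThenModify());  // 2
    }
}
```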

On the last point, strings often cause confusion. I once read in an early edition of a book that String variables are more efficient than string variables, and I often hear that String is a reference type while string is a value type, and so on. For example:

string s1 = "Hello, ";
string s2 = "world!";
string s3 = s1 + s2; // s3 is "Hello, world!"

This does look like value type assignment. Another example:

string s1 = "";
string s2 = s1;
s1 = "B"; // s2 is still ""

Changing the value of s1 does not affect s2. This makes string look like a value type. In fact, this is the result of operator overloading together with the immutability of strings: when s1 is changed, .NET does not modify the existing object; s1 is made to refer to a different string on the managed heap. The purpose is to make string, which is implemented as a reference type, behave like a string in the ordinary sense.
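That string is a reference type with value-like behavior can be checked with object.ReferenceEquals. The sketch below is illustrative; the class name StringReferenceDemo and the helper StillSharedAfterReassign are assumptions, not something from the original text.

```csharp
using System;

public static class StringReferenceDemo
{
    // Returns whether s1 and s2 still refer to the same object after s1 is reassigned.
    public static bool StillSharedAfterReassign()
    {
        string s1 = "a";
        string s2 = s1;
        s1 = "b";
        return object.ReferenceEquals(s1, s2);
    }

    public static void Main()
    {
        string s1 = "a";
        string s2 = s1;     // copies the reference, not the characters

        Console.WriteLine(typeof(string).IsValueType);        // False
        Console.WriteLine(object.ReferenceEquals(s1, s2));    // True

        s1 = "b";           // s1 now refers to a different string object
        Console.WriteLine(s2);                                // a
        Console.WriteLine(StillSharedAfterReassign());        // False

        // == is overloaded for string and compares contents.
        string built = new string(new[] { 'a' });
        Console.WriteLine(built == s2);                       // True
        Console.WriteLine(object.ReferenceEquals(built, s2)); // False
    }
}
```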

4. Deployment of value and reference types in memory

I often hear, and often read in books, that value types are deployed on the stack and reference types on the managed heap. In fact, it is not that simple.

All reference types are deployed on the managed heap. This is easy to understand. When you create a reference type variable:

object reference = new object();

The new keyword allocates memory on the managed heap and returns the address of that memory. The variable reference on the left is located on the stack. It is a reference that stores a memory address, and the memory (on the managed heap) that this address points to holds the content (a System.Object instance). For convenience, we simply say that the reference type is deployed on the managed heap.

Now let's look at value types. The wording in the C# language specification is that structs do not require heap allocation ("however, unlike classes, structs are value types and do not require heap allocation"), not that they are allocated on the stack. This is confusing: where are value types actually deployed?

4.1 Arrays

Consider arrays:

int[] reference = new int[100];

By definition, all arrays are reference types, so an int array is of course a reference type (that is, reference.GetType().IsValueType is false).

The elements of the int array are all ints, and by definition int is a value type (that is, reference[i].GetType().IsValueType is true). So are the value type elements of a reference type array located on the stack or on the heap?

If you use WinDbg to look at where reference[i] actually lives in memory, you will find that the elements are not on the stack but on the managed heap.

In fact, for an array:

TestType[] testTypes = new TestType[100];

If TestType is a value type, storage for 100 value type elements is allocated on the managed heap in one go, the 100 elements are automatically initialized, and they are stored in that memory.

If TestType is a reference type, space is first allocated for testTypes on the managed heap, and no elements are automatically initialized (testTypes[i] is null). When code later initializes an element, storage for the referenced element is allocated on the managed heap.
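The two array cases can be compared side by side. The following is a sketch under the assumption of two hypothetical element types, ValueItem (a struct) and ReferenceItem (a class); it shows the observable initialization behavior, not the memory addresses themselves.

```csharp
using System;

public struct ValueItem { public int Id; }
public class ReferenceItem { public int Id; }

public static class ArrayElementDemo
{
    public static void Main()
    {
        // One allocation: 100 zero-initialized structs stored inline in the array.
        ValueItem[] values = new ValueItem[100];
        Console.WriteLine(values[0].Id);                  // 0
        Console.WriteLine(values.GetType().IsValueType);  // False: the array is a reference type

        // One allocation for the array of references; every element starts as null.
        ReferenceItem[] references = new ReferenceItem[100];
        Console.WriteLine(references[0] == null);         // True

        // Each element then needs its own allocation on the managed heap.
        for (int i = 0; i < references.Length; i++)
        {
            references[i] = new ReferenceItem { Id = i };
        }
        Console.WriteLine(references[99].Id);             // 99
    }
}
```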

4.2 Type nesting

Even more confusing are reference types that contain value types and value types that contain reference types:

public class ReferenceTypeClass
{
    private int _valueTypeField;
    public ReferenceTypeClass()
    {
        _valueTypeField = 0;
    }
    public void Method()
    {
        int valueTypeLocalVariable = 0;
    }
}
ReferenceTypeClass referenceTypeClassInstance = new ReferenceTypeClass(); // Where is _valueTypeField?
referenceTypeClassInstance.Method(); // Where is valueTypeLocalVariable?
public struct ValueTypeStruct
{
    private object _referenceTypeField;
    public ValueTypeStruct() // a parameterless struct constructor requires C# 10 or later
    {
        _referenceTypeField = new object();
    }
    public void Method()
    {
        object referenceTypeLocalVariable = new object();
    }
}
ValueTypeStruct valueTypeStructInstance = new ValueTypeStruct(); // Where is _referenceTypeField?
valueTypeStructInstance.Method(); // Where is referenceTypeLocalVariable?

Look at valueTypeStructInstance alone: it is a struct instance and seems to be placed on the stack as a whole. However, its field _referenceTypeField is a reference type, and the local variable referenceTypeLocalVariable is also a reference type.

referenceTypeClassInstance has the same problem. referenceTypeClassInstance itself is a reference type and seems to be deployed on the managed heap as a whole. But its field _valueTypeField is a value type, and the local variable valueTypeLocalVariable is also a value type. Are they on the stack or on the managed heap?

The rule is:

  • Reference types are deployed on the managed heap;
  • A value type is always allocated where it is declared: as a field, it is stored with the variable (instance) it belongs to; as a local variable, it is stored on the stack.

Let's analyze the code above. For the reference type instance, referenceTypeClassInstance:

  • From the context, referenceTypeClassInstance is a local variable, so the instance is deployed on the managed heap and held by a reference on the stack;
  • The value type field _valueTypeField is part of the reference type instance referenceTypeClassInstance, so it is deployed on the managed heap along with that instance (somewhat like the array case);
  • valueTypeLocalVariable is a value type local variable, so it is deployed on the stack.

For the value type instance, valueTypeStructInstance:

  • From the context, the value type instance valueTypeStructInstance is itself a local variable rather than a field, so it is located on the stack;
  • Its reference type field _referenceTypeField raises no such question: the object must be deployed on the managed heap and is held by a reference (this reference is part of valueTypeStructInstance and is located on the stack);
  • Its reference type local variable referenceTypeLocalVariable is obviously deployed on the managed heap and held by a reference on the stack.

Therefore, the simple statement "value types are stored on the stack and reference types are stored on the managed heap" is incorrect. Each case must be analyzed specifically.
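Safe C# code cannot print whether something lives on the stack or the heap, but the consequence of the rule is observable: copying a struct copies its value type fields and only the references held in its reference type fields. The sketch below assumes a hypothetical struct Wrapper to illustrate this.

```csharp
using System;
using System.Text;

public struct Wrapper
{
    public int Number;          // value type field: stored inside the struct
    public StringBuilder Text;  // reference type field: only the reference is stored inside
}

public static class NestingDemo
{
    public static void Main()
    {
        Wrapper first = new Wrapper { Number = 1, Text = new StringBuilder("x") };
        Wrapper second = first;   // copies Number and the reference held in Text

        second.Number = 2;
        second.Text.Append("y");  // modifies the single heap object both copies point to

        Console.WriteLine(first.Number);   // 1
        Console.WriteLine(first.Text);     // xy
        Console.WriteLine(object.ReferenceEquals(first.Text, second.Text)); // True
    }
}
```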

5. Use the value type and reference type correctly

This part draws mainly on Effective C# and is not my original work. I hope it helps you better understand value types and reference types.

5.1 Usage of value types and reference types

In C#, we use struct and class to declare a type as a value type or a reference type, respectively.

Consider the following example:

TestType[] testTypes = new TestType[100];

If TestType is a value type, only one allocation is needed, 100 times the size of TestType. If TestType is a reference type, one allocation is needed at first for the array; after it, every element of the array is null. Then the 100 elements are initialized, so 101 allocations are needed in total. This costs more time and causes more memory fragmentation. Therefore, if a type is mainly responsible for storing data, a value type is more suitable.

Generally, value types (which do not support polymorphism) are suitable for storing the data that a C# application operates on, and reference types (which support polymorphism) should be used to define the application's behavior.

Generally, we create more reference types than value types. If the answer to all of the following questions is yes, we should create a value type:

  • Is the primary responsibility of this type data storage?
  • Is the public interface of this type defined entirely by properties that access its data members?
  • Are you sure this type will never have subclasses?
  • Are you sure this type will never be treated polymorphically?

5.2 Implement value types as immutable and atomic types as much as possible

Immutable types are simple:

  • If the parameters are validated at construction, the object remains valid from then on;
  • Many error checks are saved, because changes are prohibited;
  • Thread safety is ensured, because multiple readers see the same content;
  • They can be safely exposed to the outside world, because callers cannot change the internal state of the object.

Atomic types are single entities. We usually replace the entire content of an atomic type at once.

The following is a typical mutable type:

// ValidateProvince and ValidateZipCode are validation helpers not shown here.
public struct Address
{
    private string _city;
    private string _province;
    private int _zipCode;
    public string City
    {
        get { return _city; }
        set { _city = value; }
    }
    public string Province
    {
        get { return _province; }
        set
        {
            ValidateProvince(value);
            _province = value;
        }
    }
    public int ZipCode
    {
        get { return _zipCode; }
        set
        {
            ValidateZipCode(value);
            _zipCode = value;
        }
    }
}

Create an instance as follows:

Address address = new Address();
address.City = "Chengdu";
address.Province = "Sichuan";
address.ZipCode = 610000;

Then change the instance:

address.City = "Nanjing"; // now Province and ZipCode are invalid
address.ZipCode = 210000; // now Province is still invalid
address.Province = "Jiangsu";

As you can see, changes to the internal state may violate the object's invariants, at least temporarily. If this were a multithreaded program, another thread might see an inconsistent view of the data while the city is being changed. Even in a single-threaded program there are problems:

  • If the ZipCode value is invalid and an exception is thrown, the object has been only partly changed and is left in an invalid state. To fix this, a considerable amount of internal validation code must be added to Address;
  • To achieve exception safety, we need to put defensive code around all client code that changes more than one field;
  • Thread safety also requires adding thread synchronization checks to the accessors of every property.

Obviously, this is a considerable amount of work. Below we implement the Address type as an immutable type:

public struct Address
{
    private readonly string _city;
    private readonly string _province;
    private readonly int _zipCode;
    public Address(string city, string province, int zipCode)
    {
        _city = city;
        _province = province;
        _zipCode = zipCode;
        ValidateProvince(province);
        ValidateZipCode(zipCode);
    }
    public string City
    {
        get { return _city; }
    }
    public string Province
    {
        get { return _province; }
    }
    public int ZipCode
    {
        get { return _zipCode; }
    }
}

To change the address, you cannot modify the existing instance; you can only create a new one:

Address address = new Address("Chengdu", "Sichuan", 610000); // create an instance
address = new Address("Nanjing", "Jiangsu", 210000); // replace the instance

Address no longer has any invalid temporary state. Temporary states exist only while the Address constructor is executing. In this way, Address is both exception-safe and thread-safe.
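The exception-safety claim can be illustrated with a self-contained version of the immutable type. This is a sketch only: the name ImmutableAddress and the validation rules (non-empty province, six-digit zip code) are assumptions made for the example, and the readonly struct syntax assumes C# 7.2 or later.

```csharp
using System;

public readonly struct ImmutableAddress
{
    public string City { get; }
    public string Province { get; }
    public int ZipCode { get; }

    public ImmutableAddress(string city, string province, int zipCode)
    {
        // Hypothetical validation rules, for illustration only.
        if (string.IsNullOrEmpty(province))
            throw new ArgumentException("Province is required.", nameof(province));
        if (zipCode < 100000 || zipCode > 999999)
            throw new ArgumentOutOfRangeException(nameof(zipCode));

        City = city;
        Province = province;
        ZipCode = zipCode;
    }
}

public static class ImmutableAddressDemo
{
    public static void Main()
    {
        ImmutableAddress address = new ImmutableAddress("Chengdu", "Sichuan", 610000);
        try
        {
            // Construction fails, so the assignment never happens.
            address = new ImmutableAddress("Nanjing", "Jiangsu", -1);
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("Invalid zip code rejected.");
        }
        Console.WriteLine(address.City);   // Chengdu
    }
}
```

Because the constructor either completes or throws, the variable address always holds a fully valid value: either the old one or the new one, never a mixture.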

5.3 Make sure that 0 is a valid state of the value type

The default initialization mechanism of .NET sets reference types to binary 0, that is, null. For value types, whether or not we provide constructors, there is always a default constructor that sets everything to 0.

A typical case is enumerations:

public enum Sex
{
    Male = 1,
    Female = 2
}

Then use it as a member of a value type:

public struct Employee
{
    private Sex _sex;
    // Other
}

An invalid Sex field is obtained when the Employee struct is created:

Employee employee = new Employee();

The employee's _sex is invalid because it is 0. We should explicitly define 0 as a value that means uninitialized:

public enum Sex
{
    None = 0,
    Male = 1,
    Female = 2
}
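The effect of defining a zero member can be checked with Enum.IsDefined. The following sketch uses two hypothetical enums, SexWithoutZero and SexWithZero, so that both variants can be compared in one program.

```csharp
using System;

public enum SexWithoutZero { Male = 1, Female = 2 }
public enum SexWithZero { None = 0, Male = 1, Female = 2 }

public static class EnumDefaultDemo
{
    public static void Main()
    {
        SexWithoutZero a = default(SexWithoutZero);
        SexWithZero b = default(SexWithZero);

        Console.WriteLine((int)a);                                    // 0
        Console.WriteLine(Enum.IsDefined(typeof(SexWithoutZero), a)); // False
        Console.WriteLine(b);                                         // None
        Console.WriteLine(Enum.IsDefined(typeof(SexWithZero), b));    // True
    }
}
```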

If a value type contains a reference type, another initialization problem arises:

public struct ErrorLog
{
    private string _message;
    // Other
}

Then create an ErrorLog:

ErrorLog errorLog = new ErrorLog();

The _message field of errorLog is a null reference. We should expose _message to client code through a property so that the problem is confined to ErrorLog:

public struct ErrorLog
{
    private string _message;
    public string Message
    {
        get
        {
            return (_message != null) ? _message : String.Empty;
        }
        set { _message = value; }
    }
    // Other
}

5.4 Reduce boxing and unboxing as much as possible

Boxing places a value type in an untyped reference object, for example:

int valueType = 0;
object referenceType = valueType; // boxing

Unboxing extracts the value type from the boxed object:

object referenceType = 0; // a boxed int
int valueType = (int)referenceType; // unboxing

Boxing and unboxing hurt performance and can introduce some strange bugs, so we should avoid them where possible.
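One source of such bugs is that a boxed value is an independent copy and can only be unboxed to its exact type. The sketch below illustrates both effects; the class name BoxingDemo and the helper UnboxToLongFails are illustrative assumptions.

```csharp
using System;

public static class BoxingDemo
{
    // Returns true if unboxing a boxed int as long throws InvalidCastException.
    public static bool UnboxToLongFails()
    {
        object boxed = 42;
        try
        {
            long wrong = (long)boxed; // the boxed value is an int, not a long
            return wrong != 42;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        int value = 42;
        object boxed = value;        // boxing: the int is copied into a new heap object

        value = 100;                 // does not touch the boxed copy
        Console.WriteLine(boxed);    // 42

        int unboxed = (int)boxed;    // unboxing: must name the exact boxed type
        Console.WriteLine(unboxed);  // 42

        Console.WriteLine(UnboxToLongFails()); // True
    }
}
```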

The biggest problem with boxing and unboxing is that they happen automatically. For example:

Console.WriteLine("A few numbers: {0}, {1}.", 25, 32);

The parameter types accepted by Console.WriteLine() are (string, object, object). Therefore, the following operations are actually performed:

int i = 25;
object o = i; // boxing

Then o is passed to the WriteLine() method. Inside WriteLine(), in order to call the ToString() method on i, code along these lines is executed:

int i = (int)o; // unboxing
string output = i.ToString();

Therefore, a better approach is:

Console.WriteLine("A few numbers: {0}, {1}.", 25.ToString(), 32.ToString());

25.ToString() simply executes a method and returns a reference type, so no boxing or unboxing is involved.

Another typical example is the use of ArrayList:

public struct Employee
{
    private string _name;
    public Employee(string name)
    {
        _name = name;
    }
    public string Name
    {
        get { return _name; }
        set { _name = value; }
    }
    public override string ToString()
    {
        return _name;
    }
}
ArrayList employees = new ArrayList();
employees.Add(new Employee("old name")); // boxing
Employee ceo = (Employee)employees[0]; // unboxing
ceo.Name = "new name"; // employees[0].ToString() is still "old name"

The code above not only has performance problems but also easily leads to errors.

In this case, it is better to use a generic collection:

List<Employee> employees = new List<Employee>();

Because List<T> is a strongly typed collection, the employees.Add() method performs no type conversion, so there is no boxing or unboxing.
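The two collections can be compared in one program. This is a sketch assuming a hypothetical struct Worker; note that List<T> removes the boxing, but reading an element of a struct type still yields a copy, so a modified copy has to be written back.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public struct Worker
{
    public string Name;
    public Worker(string name) { Name = name; }
}

public static class CollectionBoxingDemo
{
    public static void Main()
    {
        ArrayList untyped = new ArrayList();
        untyped.Add(new Worker("old name"));           // boxing
        Worker copy = (Worker)untyped[0];              // unboxing produces a copy
        copy.Name = "new name";
        Console.WriteLine(((Worker)untyped[0]).Name);  // old name

        List<Worker> typed = new List<Worker>();
        typed.Add(new Worker("old name"));             // no boxing: stored as Worker
        Worker item = typed[0];                        // still a copy, because Worker is a struct
        item.Name = "new name";
        typed[0] = item;                               // write the modified copy back
        Console.WriteLine(typed[0].Name);              // new name
    }
}
```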

6. Summary

In C#, whether a variable holds a value or a reference depends only on its data type.

The value types of C# include structs (numeric types, bool, and user-defined structs), enumerations, and nullable types.

The reference types of C# include arrays, user-defined classes, interfaces, delegates, object, and string.

Array elements, whether of reference type or value type, are stored on the managed heap.

A reference type stores a reference on the stack, and its actual storage is located on the managed heap. For convenience, this article simply says that reference types are deployed on the managed heap.

A value type is always allocated where it is declared: as a field, it is stored with the variable (instance) it belongs to; as a local variable, it is stored on the stack.

Value types are more efficient in terms of memory management, do not support polymorphism, and are suitable for storing data. Reference types support polymorphism and are suitable for defining the behavior of applications.

Value types should be implemented as immutable and atomic types as much as possible.

Make sure as far as possible that 0 is a valid state of a value type.

We should reduce boxing and unboxing as much as possible.
