Summary: Recently I saw people on cnblogs (Blog Park) discussing some of the features of the C# string. Most of these discussions look at strings from the coding perspective. I was curious how these features show up at run time. This article explains some of the features of the C# string by using WinDbg to observe how strings are laid out inside the process.
Problem
The C# string has two interesting features.
- String immutability. Immutability means that once a string is created, it cannot be changed. When we "change" a string value, a new piece of memory is allocated on the managed heap, and the value stored at the original memory address is not affected.
- String interning. The CLR maintains a table, called the intern pool, that contains a single reference to each unique literal string declared or created programmatically in a program. As a result, a literal string with a particular value exists only once in the system.
These two features raise some questions.
- How does immutability lead to interesting results when strings are compared? Why do the comparison results differ from those of other reference types?
- What kind of string is placed in the intern pool?
- What is the data structure of the intern pool? Is it really a Hashtable?
- Will a string in the intern pool be garbage collected, and how long is its life cycle (when is it reclaimed)?
String immutability
Let's take a look at the following example:
private static void Comparation()
{
    string a = "Test String";
    string b = "Test String";
    string c = a;
    Console.WriteLine("a vs b:" + Object.ReferenceEquals(a, b));
    Console.WriteLine("a vs c:" + Object.ReferenceEquals(a, c));
    SimpleObject smp1 = new SimpleObject(a);
    SimpleObject smp2 = new SimpleObject(a);
    Console.WriteLine("smp1 vs smp2:" + Object.ReferenceEquals(smp1, smp2));
    Console.ReadLine();
}

class SimpleObject
{
    public string Name = string.Empty;

    public SimpleObject(string name)
    {
        this.Name = name;
    }
}
From the results, although a, b, and c are different variables, both string comparisons return True: a and b are initialized from the same literal and c is a copy of the reference in a, so all three point to one string object. For the SimpleObject instances, smp1 and smp2 hold the same value, but the comparison returns False.
Let's look at what these objects look like at run time.
At run time, everything is an address. Whether two variables refer to the same object can be judged directly by whether they hold the same address.
Use the !dso command to print the objects on the stack. You can see that "Test String" appears 3 times, but all three entries point to the same address, 0000000002473f90. The SimpleObject instances appear 2 times, at different addresses: 0000000002477670 and 0000000002477688.
So when you use the string literal, you are in fact reusing the same string object. When you create a SimpleObject with new, each call initializes the object's structure at a new address, so each one is a new object.
0:000> !dso
OS Thread Id: 0x3f0c (0)
RSP/REG          Object           Name
...
000000000043e730 0000000002473f90 System.String
000000000043e738 0000000002473f90 System.String
000000000043e740 0000000002473f90 System.String
000000000043e748 0000000002477670 ConsoleApplication3.SimpleObject
000000000043e750 0000000002477688 ConsoleApplication3.SimpleObject
...
0:000> !do 0000000002473f90
Name: System.String
MethodTable: 00007ffdb0817df0
EEClass: 00007ffdb041e560
Size: 48(0x30) bytes
GC Generation: 0
 (C:\windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
String: Test String
Fields:
              MT    Field Offset          Type VT     Attr  Value Name
00007ffdb081f060  4000096      8  System.Int32  1 instance        m_arrayLength
00007ffdb081f060  4000097      c  System.Int32  1 instance        m_stringLength
00007ffdb0819838  4000098          System.Char  1 instance        m_firstChar
00007ffdb0817df0  4000099        System.String  0   shared static Empty
    >> Domain:Value  0000000000581880:0000000002471308 <<
00007ffdb08196e8  400009a        System.Char[]  0   shared static WhitespaceChars
    >> Domain:Value  0000000000581880:0000000002471be0 <<
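The same observation can be approximated from code, without a debugger. The following is a minimal sketch, not part of the original experiment: the class name LiteralVsRuntimeString, the helper BuildAtRunTime, and the variable d are made up for illustration. It contrasts two variables initialized from the same literal with an equal string that is built at run time.

```csharp
using System;

static class LiteralVsRuntimeString
{
    // Hypothetical helper: copies the characters into a new string,
    // so the result is a separate heap object with equal contents.
    public static string BuildAtRunTime(string source)
    {
        return new string(source.ToCharArray());
    }

    static void Main()
    {
        string a = "Test String";
        string b = "Test String";
        string d = BuildAtRunTime(a);

        // Same literal: expected to be one shared object, as in the dump above.
        Console.WriteLine("a vs b: " + Object.ReferenceEquals(a, b));
        // Run-time copy: a different object on the heap...
        Console.WriteLine("a vs d: " + Object.ReferenceEquals(a, d));
        // ...even though the contents compare equal.
        Console.WriteLine("a == d: " + (a == d));
    }
}
```

The second line should print False and the third True. The first line normally prints True for literals in the same assembly, which matches the single address seen in the !dso output.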
Whenever the content of a string changes, even slightly, a new string object is created. When we call this code,
Console.WriteLine("a vs b:" + Object.ReferenceEquals(a, b));
the CLR actually does two things. First, the literal "a vs b:" gets a string object at its own address. Then the comparison result is concatenated onto it, and the combined string is allocated at another new address. If you concatenate strings many times, each step allocates yet another new string, which can quickly consume a large amount of memory. That is why Microsoft recommends using StringBuilder in this case.
0:000> !dso
Listing objects from: 0000000000435000 to 0000000000440000 from thread: 0 [3f0c]
Address          Method Table     Heap Gen Size Type
...
0000000002473fc0 00007ffdb0817df0 0    0        a vs b:
0000000002474138 00007ffdb0817df0 0    0        a vs b:True
...
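For repeated concatenation, the StringBuilder recommendation looks roughly like this. It is a small sketch with a made-up class name (ConcatDemo) and method (JoinNumbers); the point is that the pieces are appended to one growable buffer and only the final ToString() call produces a string object, instead of one new string per + operation.

```csharp
using System;
using System.Text;

static class ConcatDemo
{
    // Hypothetical helper: joins 0..count-1, each followed by a comma.
    public static string JoinNumbers(int count)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            // Append writes into the builder's internal buffer;
            // no intermediate result string is created per iteration.
            sb.Append(i).Append(',');
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(JoinNumbers(5)); // 0,1,2,3,4,
    }
}
```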
String interning
The CLR maintains a table, called the intern pool, that contains a single reference to each unique literal string declared or created programmatically in a program. As a result, a literal string with a particular value exists only once in the system. Let's look at what this statement means in practice.
Here is the sample code:
static void Main(string[] args)
{
    int i = 0;
    while (true)
    {
        SimpleString(i++);
        Console.WriteLine(i + ": Run GC.Collect()");
        GC.Collect();
        Console.ReadLine();
    }
}

private static void SimpleString(int i)
{
    string s = "SimpleString Method";
    string c = "Concat String";
    Console.WriteLine(s + c);
    Console.WriteLine(s + i.ToString());
    Console.ReadLine();
}
This is the result of the first iteration. At this point execution is inside SimpleString and has not yet returned from the method.
On the stack we can see the two declared strings as well as the strings that the code produced by concatenation. This shows that concatenating strings really does create multiple string objects on the heap.
0:000> !dso
Listing objects from: 0000000000386000 to 0000000000390000 from thread: 0 [3f50]
...
0000000002a93f70 00007ffdb0817df0 0 0 SimpleString Method
0000000002a93fb8 00007ffdb0817df0 0 0 Concat String
0000000002a93ff0 00007ffdb0817df0 0 0 SimpleString MethodConcat String
0000000002a97a90 00007ffdb0817df0 0 0 0
0000000002a97ab0 00007ffdb0817df0 0 0 SimpleString Method0
...
Pick any one of them and check its references.
According to !gcroot, this string is referenced from two places. One is the current thread. That is entirely expected, because the current thread is using the string.
The other root is a System.Object[] array. This array is pinned on the app domain 0000000000491880. This shows that interned strings are actually referenced from a System.Object[], rather than from the Hashtable that many people guess. Presumably the CLR has some mechanism for quickly finding the right string in this array, but that is beyond the scope of this discussion.
0:000> !gcroot 0000000002a93f70
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 0 OSThread 81a0
RSP:b9e9b8:Root:0000000002a93f70(System.String)
Scan Thread 2 OSThread 7370
DOMAIN(0000000000c51880):HANDLE(Pinned):217e8:Root:0000000012a93030(System.Object[])->
0000000002a93f70(System.String)
We can check what is inside this System.Object[].
In this array you can see the strings that are explicitly declared in the code. The first element is an empty string, the instance behind the commonly used String.Empty. The second element is ": Run GC.Collect()". This literal belongs to the Main function. That line has not been executed yet, but the method has already been JIT-compiled, so the literal is already in the array. The other two declared strings are also found in this array. We can also confirm that the temporary strings produced by concatenation do not appear here: they are allocated on the heap, but they are not stored in the array.
0:000> !dumparray -details 0000000012a93030
Name: System.Object[]
MethodTable: 00007ffdb0805be0
EEClass: 00007ffdb041eb88
Size: 1056(0x420) bytes
Array: Rank 1, Number of elements, Type CLASS
Element Methodtable: 00007ffdb08176e0
[0] 0000000002a91308
    Name: System.String
    MethodTable: 00007ffdb0817df0
    EEClass: 00007ffdb041e560
    Size: 26(0x1a) bytes
     (C:\windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
    String:
    Fields:
                  MT    Field Offset          Type VT     Attr  Value Name
    00007ffdb081f060  4000096      8  System.Int32  1 instance      1 m_arrayLength
    00007ffdb081f060  4000097      c  System.Int32  1 instance      0 m_stringLength
    00007ffdb0819838  4000098          System.Char  1 instance      0 m_firstChar
    00007ffdb0817df0  4000099        System.String  0   shared static Empty
        >> Domain:Value  0000000000c51880:0000000002a91308 <<
    00007ffdb08196e8  400009a        System.Char[]  0   shared static WhitespaceChars
        >> Domain:Value  0000000000c51880:0000000002a91be0 <<
[1] 0000000002a93f30
    Name: System.String
    MethodTable: 00007ffdb0817df0
    EEClass: 00007ffdb041e5
    Size: 64(0x40) bytes
     (C:\windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
    String: : Run GC.Collect()
    Fields:
                  MT    Field Offset          Type VT     Attr  Value Name
    00007ffdb081f060  4000096      8  System.Int32  1 instance        m_arrayLength
    00007ffdb081f060  4000097      c  System.Int32  1 instance        m_stringLength
    00007ffdb0819838  4000098          System.Char  1 instance        m_firstChar
    00007ffdb0817df0  4000099        System.String  0   shared static Empty
        >> Domain:Value  0000000000c51880:0000000002a91308 <<
    00007ffdb08196e8  400009a        System.Char[]  0   shared static WhitespaceChars
        >> Domain:Value  0000000000c51880:0000000002a91be0 <<
[2] 0000000002a93f70
    Name: System.String
    MethodTable: 00007ffdb0817df0
    EEClass: 00007ffdb041e560
    Size: 66(0x42) bytes
     (C:\windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
    String: SimpleString Method
    Fields:
                  MT    Field Offset          Type VT     Attr  Value Name
    00007ffdb081f060  4000096      8  System.Int32  1 instance        m_arrayLength
    00007ffdb081f060  4000097      c  System.Int32  1 instance        m_stringLength
    00007ffdb0819838  4000098          System.Char  1 instance        m_firstChar
    00007ffdb0817df0  4000099        System.String  0   shared static Empty
        >> Domain:Value  0000000000c51880:0000000002a91308 <<
    00007ffdb08196e8  400009a        System.Char[]  0   shared static WhitespaceChars
        >> Domain:Value  0000000000c51880:0000000002a91be0 <<
[3] 0000000002a93fb8
    Name: System.String
    MethodTable: 00007ffdb0817df0
    EEClass: 00007ffdb041e560
    Size: 52(0x34) bytes
     (C:\windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
    String: Concat String
    Fields:
                  MT    Field Offset          Type VT     Attr  Value Name
    00007ffdb081f060  4000096      8  System.Int32  1 instance        m_arrayLength
    00007ffdb081f060  4000097      c  System.Int32  1 instance        m_stringLength
    00007ffdb0819838  4000098          System.Char  1 instance        m_firstChar
    00007ffdb0817df0  4000099        System.String  0   shared static Empty
        >> Domain:Value  0000000000c51880:0000000002a91308 <<
    00007ffdb08196e8  400009a        System.Char[]  0   shared static WhitespaceChars
        >> Domain:Value  0000000000c51880:0000000002a91be0 <<
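What the array dump shows can also be probed from code with the framework's own string.IsInterned and string.Intern methods. This is a minimal sketch under assumptions: the class InternDemo and the helper Build are made up for illustration, and the printed results assume the literal is compiled into this same assembly. IsInterned returns the pooled reference when a string with that value is in the intern pool and null otherwise.

```csharp
using System;

static class InternDemo
{
    // Hypothetical helper: concatenates at run time, like s + i.ToString()
    // in SimpleString, so the result is not a compile-time literal.
    public static string Build(int i)
    {
        return "SimpleString Method" + i.ToString();
    }

    static void Main()
    {
        string s = "SimpleString Method";
        string t = Build(0);

        // The declared literal is expected to be in the intern pool.
        Console.WriteLine("literal interned: " + (string.IsInterned(s) != null));
        // The concatenated string is a plain heap object, not in the pool.
        Console.WriteLine("concat interned:  " + (string.IsInterned(t) != null));

        // string.Intern adds the value to the pool explicitly and returns
        // the pooled reference (here t itself, since the value was new).
        string u = string.Intern(t);
        Console.WriteLine("t vs u: " + Object.ReferenceEquals(t, u));
        Console.WriteLine("concat interned after Intern: " + (string.IsInterned(t) != null));
    }
}
```

A string added with string.Intern is pooled in the same way as a literal, so it is worth using only for values that really are reused for the life of the program.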
Next, let the code keep running so that a few GCs take place, and check whether an interned string is collected once it is no longer in use.
After the GC completes, the strings on the call stack have been cleared, as expected. Because a GC has run, the GC heap has also been compacted, and the addresses of objects that are not pinned have changed. So to check whether the interned strings were collected, we start from the intern array. Since the array is pinned, its address does not change even when a GC occurs, and the same command can be used to list the strings referenced by the array.
The result matches my expectations. Only the explicitly declared strings remain in the array, and they have not been collected. The temporary strings produced by concatenation were never added to this array, and they are collected by the GC once nothing references them.
0:000> !dumparray -details 0000000012a93030
Name: System.Object[]
MethodTable: 00007ffdb0805be0
EEClass: 00007ffdb041eb88
Size: 1056(0x420) bytes
Array: Rank 1, Number of elements, Type CLASS
Element Methodtable: 00007ffdb08176e0
[0] 0000000002a91308
    Name: System.String
    MethodTable: 00007ffdb0817df0
    EEClass: 00007ffdb041e560
    Size: 26(0x1a) bytes
     (C:\windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
    String:
    ...
[1] 0000000002a93f30
    Name: System.String
    MethodTable: 00007ffdb0817df0
    EEClass: 00007ffdb041e560
    Size: 64(0x40) bytes
     (C:\windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
    String: : Run GC.Collect()
    ...
[2] 0000000002a93f70
    Name: System.String
    MethodTable: 00007ffdb0817df0
    EEClass: 00007ffdb041e560
    Size: 66(0x42) bytes
     (C:\windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
    String: SimpleString Method
    ...
[3] 0000000002a93fb8
    Name: System.String
    MethodTable: 00007ffdb0817df0
    EEClass: 00007ffdb041e560
    Size: 52(0x34) bytes
     (C:\windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
    String: Concat String
    ...
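A rough way to see the same difference in lifetime without a debugger is to hold both kinds of string only through a WeakReference and then force a collection. This is a sketch under assumptions, not a guaranteed result: the class LifetimeDemo and its helper methods are made up for illustration, and GC behavior depends on the runtime version and build configuration.

```csharp
using System;
using System.Runtime.CompilerServices;

static class LifetimeDemo
{
    // NoInlining keeps the strong reference confined to the helper's frame,
    // so nothing in Main keeps the temporary string alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference MakeTemporary(int i)
    {
        return new WeakReference("SimpleString Method" + i.ToString());
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference MakeLiteral()
    {
        return new WeakReference("SimpleString Method");
    }

    static void Main()
    {
        WeakReference temporary = MakeTemporary(0);
        WeakReference literal = MakeLiteral();

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        // The concatenated string has no remaining roots and is usually gone;
        // the literal is still referenced by the runtime's intern table.
        Console.WriteLine("temporary alive: " + temporary.IsAlive);
        Console.WriteLine("literal alive:   " + literal.IsAlive);
    }
}
```

On a typical run the first line reports False and the second True, which is consistent with the array contents before and after the GC.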
From these observations we can conclude that the lifetime of an interned string is very long. So when will it be collected?
From the !gcroot results above, you can see that the intern array is kept alive by a pinned handle, and that the app domain 0000000000c51880 references this array.
Print the information for all app domains with the !dumpdomain -stat command. You can see that this app domain is the domain (ConsoleApplication3.exe) in which our code runs. The intern array is maintained by the CLR and is tied to the current app domain. So, in theory, the lifetime of the intern array is the same as that of this app domain.
0:000> !dumpdomain -stat
--------------------------------------
System Domain: 00007ffdb1f16f60
LowFrequencyHeap: 00007ffdb1f16fa8
HighFrequencyHeap: 00007ffdb1f17038
StubHeap: 00007ffdb1f170c8
Stage: OPEN
Name: None
--------------------------------------
Shared Domain: 00007ffdb1f17860
LowFrequencyHeap: 00007ffdb1f178a8
HighFrequencyHeap: 00007ffdb1f17938
StubHeap: 00007ffdb1f179c8
Stage: OPEN
Name: None
Assembly: 000000000047fa60
--------------------------------------
Domain 1: 0000000000491880
LowFrequencyHeap: 00000000004918c8
HighFrequencyHeap: 0000000000491958
StubHeap: 00000000004919e8
Stage: OPEN
SecurityDescriptor: 0000000000494140
Name: ConsoleApplication3.exe
Assembly: 000000000047fa60 [C:\windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll]
ClassLoader: 000000000047f820
SecurityDescriptor: 000000000047f9a0
  Module Name
00007ffdb03e1000 C:\windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll
Closing notes
- String immutability. Immutability means that once a string is created, it cannot be changed. When we "change" a string value, a new piece of memory is allocated on the managed heap, and the value stored at the original memory address is not affected.
- String interning. The CLR maintains a table, called the intern pool, that contains a single reference to each unique literal string declared or created programmatically in a program. As a result, a literal string with a particular value has only one instance per app domain.
Strings declared directly in the code are maintained by the CLR inside an Object[].
Temporarily generated strings and concatenated strings are not kept in this intern array.
The intern array lives as long as the app domain it belongs to, so the GC does not collect the strings that the array references.
The following links can be used to deepen the understanding of these two features.
http://blog.csdn.net/fengshi_sh/article/details/14837445
http://www.cnblogs.com/charles2008/archive/2009/04/12/1434115.html
http://www.cnblogs.com/instance/archive/2011/05/24/2056091.html
Look at string from the perspective of WinDbg