C# String Knowledge Summary

Source: Internet
Author: User

1. How the system handles text

"New knowledge point": the .NET Framework

Definition of the .NET Framework: it consists of the Common Language Runtime (CLR) and a class library.

Several related concepts:

CLI, the Common Language Infrastructure: the CLI defines the specification for executable code and for the environment in which that code runs.

Execution environment: the Virtual Execution System (VES).

CTS, the Common Type System: the CTS is the core of the CLI.

C# code is compiled to MSIL (Microsoft Intermediate Language), which is in effect the instruction set of the CLR.

Managed heap:

The CLR reserves a block of memory for the application. This is the managed heap, and instances of reference types are stored on it.

"New knowledge point": the char structure represents a Unicode character as one UTF-16 code unit, which occupies 2 bytes. This holds as long as no surrogate pair is needed; a surrogate pair consists of two 2-byte code units.

"New knowledge point": the difference between value types and reference types. Value types include integers, enumerations, characters, and structs. Reference types include classes, interfaces, arrays, and strings. For a value type, the compiler knows the size of the data at compile time, so a local variable of that type can be allocated on the stack. For a reference type, the amount of memory needed is known only when the code runs, so the object is allocated dynamically on the managed heap and the variable holds a reference to it. Changing the value of such a variable means assigning it a new reference, for example: Apple a = new Apple(); a = new Apple();

Here a is assigned twice, and each new Apple() creates a separate object on the heap, so two objects are created in total. This is where the performance limits of the string type come from: a string is an immutable reference type, so every change to a string produces a new object.
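The effect of reassigning a reference-type variable can be seen with strings directly. The following is a minimal sketch; the class name ImmutableDemo and the helper AddSuffix are hypothetical names chosen for illustration.

```csharp
using System;

public static class ImmutableDemo
{
    // Hypothetical helper: returns a new string and leaves the argument unchanged.
    public static string AddSuffix(string s)
    {
        return s + " pie";
    }

    public static void Main()
    {
        string original = "apple";
        string alias = original;          // both variables refer to the same object

        original = AddSuffix(original);   // a new string object is created and assigned

        Console.WriteLine(alias);         // apple
        Console.WriteLine(original);      // apple pie
        Console.WriteLine(object.ReferenceEquals(alias, original)); // False
    }
}
```

The variable alias still refers to the first object, which shows that the concatenation did not modify the existing string but allocated a second one.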

"New knowledge point": in C#, char is a two-byte Unicode character, the same size as short. In storage, a string is essentially an array of char, so string is also Unicode. Systems generally mark the extent of a string in memory in one of two ways: the length is stored in front of the character data, or a terminating 0x00 is appended to mark the end of the string.

Some systems use a caching scheme so that a string takes as little memory as possible; Microsoft, for example, has BSTR (binary string). Concatenating long strings takes longer because of the time spent allocating on the heap, so shorter strings perform better.

"New knowledge point": interning. The CLR automatically maintains an intern pool. When a program initializes several string variables with the same string literal, the CLR makes them all reference a single instance in the intern pool instead of allocating new memory for each one. This saves memory, since string data is often large. The intern pool is implemented as a hash table and is looked up by hash code, so retrieval is fast. Strings created at run time are not interned automatically. String.IsInterned(s1) checks whether a string is in the pool, and String.Intern(s1) adds it to the pool if necessary and returns the pooled reference.
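The pooling behaviour described above (string interning) can be sketched as follows. The class name InternDemo and the helper BuildAtRuntime are hypothetical; the helper exists only to force a string to be created at run time so that the compiler cannot turn it into a literal.

```csharp
using System;
using System.Text;

public static class InternDemo
{
    // Hypothetical helper: builds "hello" at run time, so it is not a compile-time literal.
    public static string BuildAtRuntime()
    {
        return new StringBuilder().Append("hel").Append("lo").ToString();
    }

    public static void Main()
    {
        string literal1 = "hello";
        string literal2 = "hello";
        string runtime = BuildAtRuntime();

        // Two identical literals typically share one pooled instance.
        Console.WriteLine(object.ReferenceEquals(literal1, literal2));

        // A string created at run time is a separate object with equal content.
        Console.WriteLine(object.ReferenceEquals(literal1, runtime));
        Console.WriteLine(literal1 == runtime);

        // String.Intern returns the pooled instance for that content.
        Console.WriteLine(object.ReferenceEquals(literal1, string.Intern(runtime)));
    }
}
```

Reference equality is used here only to make the sharing visible; ordinary code should compare strings by value.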

2. String class and StringBuilder class

"New knowledge point": when a new value is assigned to a string, or strings are concatenated, the performance cost comes mainly from two places:

Reallocation of memory (string is a reference type)

Copying of the string contents

The main solution is to reduce the number of new string objects that are created by allocating extra memory up front, at initialization time, at the cost of some wasted memory. This is how the StringBuilder class works.

"New knowledge point": the StringBuilder class has two important properties, Length and Capacity. Length is the length of the string currently held in the StringBuilder; it can be read or set. Capacity is the amount of space currently allocated by the StringBuilder. The capacity grows automatically as the string grows.

"String optimization tips": the following tips are recommended.

Where possible, make strings constants so that the intern pool is used and fewer machine instructions are needed.

If the String class does the job efficiently, do not use the StringBuilder class.

If you build a string in a loop over a large amount of character data, use StringBuilder.

If culture-aware (internationalized) comparison is required, only Compare() can be used. Otherwise, use the CompareOrdinal() method.

If you only want to check whether two strings are identical, use Equals() rather than CompareOrdinal().

In general, use the Equals method rather than the "==" operator.
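The Length and Capacity properties of StringBuilder discussed in this section can be illustrated with a short sketch. The class name BuilderDemo, the helper JoinNumbers, and the initial capacity of 16 are assumptions made for the example.

```csharp
using System;
using System.Text;

public static class BuilderDemo
{
    // Hypothetical helper: joins 0..count-1 with commas using one StringBuilder.
    public static string JoinNumbers(int count)
    {
        var sb = new StringBuilder(16);   // initial capacity chosen for illustration
        for (int i = 0; i < count; i++)
        {
            sb.Append(i).Append(',');
        }
        if (sb.Length > 0)
        {
            sb.Length--;                  // setting Length truncates: drops the trailing comma
        }
        return sb.ToString();
    }

    public static void Main()
    {
        var sb = new StringBuilder(16);
        Console.WriteLine(sb.Length + " / " + sb.Capacity);

        sb.Append("a string that is longer than sixteen characters");
        // Length is the number of characters held; Capacity has grown to fit them.
        Console.WriteLine(sb.Length + " / " + sb.Capacity);

        Console.WriteLine(JoinNumbers(5)); // 0,1,2,3,4
    }
}
```

Appending in the loop reuses the builder's buffer instead of creating a new string on every iteration, which is the case the loop tip above refers to.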

3. Internationalization

"New knowledge point": Unicode. In different countries and regions, computers used different character sets, so the same character could have different numeric values. This made data conversion between computers more complex. To solve the problem, the Unicode Consortium assigned a code value to every written character in the world. At first, a Unicode code was two bytes long and could represent more than 65,000 characters, which is not enough for all characters. Surrogate pairs were introduced later: a pair of two-byte units (four bytes in total) that together represent one character.
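The difference between a two-byte char and a character stored as a surrogate pair can be sketched as follows. The class name SurrogateDemo and the helper CountCodePoints are hypothetical; the sample character U+1D11E (musical symbol G clef) was chosen only because it lies outside the two-byte range.

```csharp
using System;

public static class SurrogateDemo
{
    // Hypothetical helper: counts Unicode characters, treating a surrogate pair as one.
    public static int CountCodePoints(string s)
    {
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            if (char.IsHighSurrogate(s[i]) && i + 1 < s.Length && char.IsLowSurrogate(s[i + 1]))
            {
                i++;                      // skip the second half of the pair
            }
            count++;
        }
        return count;
    }

    public static void Main()
    {
        string s = "A\U0001D11E";         // 'A' followed by U+1D11E

        Console.WriteLine(sizeof(char));            // 2
        Console.WriteLine(s.Length);                // 3
        Console.WriteLine(CountCodePoints(s));      // 2
        Console.WriteLine(char.IsSurrogatePair(s, 1)); // True
    }
}
```

String.Length counts two-byte units, so a string containing surrogate pairs reports a length greater than the number of characters a reader would see.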

"New knowledge point": culture. Around the world, the same thing is written in many different ways. For example, an amount of 100000 in currency is written differently in China and in the United States. The system decides which representation to use from the regional settings configured on the PC. The class involved is CultureInfo.

"New knowledge point": classification of cultures:

Invariant culture: a culture that is not associated with any particular culture. Many system processes need to be independent of culture, so the way they handle text must not depend on cultural settings.

Neutral culture: a culture associated with a language but not with a region.

Specific culture: a culture associated with both a language and a country or region. The main difference between a neutral culture and a specific culture is that the latter provides additional information about the date, time, currency, and number formats of a specific region or country.

"New knowledge point": how cultures are applied in a program:

The value of CurrentUICulture determines how resources, such as form resources, are loaded. It can be a neutral culture.

The value of CurrentCulture determines the other aspects: date format, number format, string casing and comparison, and so on. It must be a specific culture.

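// Requires: using System; using System.Globalization; using System.Threading;

int money = 100000;

// Formatted as currency according to the current culture
Console.WriteLine(money.ToString("C"));

Thread.CurrentThread.CurrentCulture = new CultureInfo("en-US"); // specific culture

// Formatted as currency according to en-US
Console.WriteLine(money.ToString("C"));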
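As a variation on the snippet above, the culture can also be passed to ToString explicitly instead of changing the thread's current culture. The following is a minimal sketch; the class name CultureDemo and the helper FormatCurrency are hypothetical, and the exact output for each culture depends on the globalization data installed on the machine.

```csharp
using System;
using System.Globalization;

public static class CultureDemo
{
    // Hypothetical helper: formats an amount as currency for a named culture.
    public static string FormatCurrency(int amount, string cultureName)
    {
        return amount.ToString("C", CultureInfo.GetCultureInfo(cultureName));
    }

    public static void Main()
    {
        Console.WriteLine(FormatCurrency(100000, "en-US"));
        Console.WriteLine(FormatCurrency(100000, "zh-CN"));

        // "en" is a neutral culture (language only); "en-US" is a specific culture.
        Console.WriteLine(CultureInfo.GetCultureInfo("en").IsNeutralCulture);    // True
        Console.WriteLine(CultureInfo.GetCultureInfo("en-US").IsNeutralCulture); // False

        // The invariant culture has an empty name.
        Console.WriteLine(CultureInfo.InvariantCulture.Name.Length);             // 0
    }
}
```

Passing the culture explicitly keeps the formatting decision local to the call and leaves the rest of the thread's behaviour unchanged.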

"New knowledge point": culture and comparison

String comparisons can be culture-sensitive or culture-insensitive. A culture-insensitive comparison compares the Unicode code values; a culture-sensitive comparison uses the rules of the current culture.

For example:

string a = "Ciao";

string b = "character";

Without culture-specific rules, and ignoring case, it is obvious that a > b, because 'i' sorts after 'h'.

string a = "Ciao";

string b = "character";

if (String.Compare(a, b, true, new CultureInfo("cs-CZ")) < 0)

Console.Write("b>a");

If the culture is Czech, then b > a, because Czech treats "ch" as a single letter that sorts after "h".

String.CompareOrdinal() is a culture-insensitive comparison.

String.Compare() is a culture-sensitive comparison.

String.Equals() is also a culture-insensitive comparison. According to online sources it is implemented with unsafe code, which is why it is the fastest.
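The three comparison methods can be placed side by side in a short sketch. The class name CompareDemo and its helper methods are hypothetical. The result of the Czech comparison depends on the globalization data available on the machine, so no output is stated for it.

```csharp
using System;
using System.Globalization;

public static class CompareDemo
{
    // Hypothetical helper: sign of an ordinal (code value) comparison.
    public static int OrdinalSign(string x, string y)
    {
        return Math.Sign(string.CompareOrdinal(x, y));
    }

    // Hypothetical helper: sign of an ordinal comparison that ignores case.
    public static int OrdinalIgnoreCaseSign(string x, string y)
    {
        return Math.Sign(string.Compare(x, y, StringComparison.OrdinalIgnoreCase));
    }

    public static void Main()
    {
        string a = "Ciao";
        string b = "character";

        // Ordinal: 'C' (0x43) has a lower code value than 'c' (0x63).
        Console.WriteLine(OrdinalSign(a, b));            // -1

        // Ordinal, ignoring case: 'I' sorts after 'H'.
        Console.WriteLine(OrdinalIgnoreCaseSign(a, b));  // 1

        // Culture-sensitive comparison using Czech rules, ignoring case.
        Console.WriteLine(Math.Sign(string.Compare(a, b, true, new CultureInfo("cs-CZ"))));

        // Equals tests for identity of content only; it gives no ordering.
        Console.WriteLine(string.Equals("abc", "ABC"));                                     // False
        Console.WriteLine(string.Equals("abc", "ABC", StringComparison.OrdinalIgnoreCase)); // True
    }
}
```

The example shows that the same two strings can order differently depending on whether code values, case-insensitive code values, or culture rules are applied.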
