From: http://blog.sina.com.cn/s/blog_4d66279f010009ho.html
I have recently been studying VC and building a multithreaded server, and I ran into repeated setbacks with CString. I learned a great deal from articles by experts on the Internet (MSDN actually covers the topic well, but the English is hard to follow). The most important point is that CString is a class built on reference counting, and some of its operations work directly on memory. That is what makes CString special and easy to get wrong. Here I have put together several articles from the Internet that I found very good, so that we can learn from one another.
String operations are among the most common and basic operations in programming. Every VC programmer, novice or expert, has used CString, and in practice it seems hard to do without it (although it is not part of the Standard C++ library). The class that MFC provides makes string handling very convenient: CString offers a rich set of member functions and overloaded operators, so strings are as intuitive to use as in Basic, and it allocates memory dynamically, which reduces the risk of overrunning a character array. However, we have also found that CString is very easy to misuse, and some of the resulting errors are hard to predict. As a result, many experts recommend abandoning it.
In fact, CString is very well encapsulated and has many advantages: it is easy to use and powerful, it allocates memory dynamically, it saves memory and improves performance when strings are copied in bulk, it is fully compatible with standard C, it supports both multibyte and wide characters, and its use of exceptions makes it safe and convenient. It is error-prone in practice only because we do not understand it well enough, especially its implementation. Most of us do not read its documentation in depth at work, and that documentation is in English.
To avoid allocating and releasing memory too often, CString first allocates a memory block larger than the string needs. When the string later grows, no new allocation is needed as long as the new total length fits in the block already allocated. When the new length exceeds it, CString allocates a larger block and releases the original one. Similarly, when the string shrinks, the spare memory is not released immediately; the excess is released in one go once it has accumulated to a certain level.
1 CString implementation mechanism
CString manages strings through reference counting. The idea should be familiar: Windows kernel objects and COM objects, for example, are managed by reference counts. CString uses the same mechanism to manage the memory blocks it allocates. A CString object has only one member variable, a pointer, so any CString instance is only 4 bytes long (on a 32-bit build).
That is: int len = sizeof(CString); // len equals 4
This pointer points to a referenced memory block: CString str("abcd");
                 ______________
                |              |
                |  0x04040404  |  header: information about the referenced memory block
                |______________|
 str |___| ---> |     'a'      |  0x40404040
                |     'b'      |
                |     'c'      |
                |     'd'      |
                |      0       |
Because of this, one memory block can be referenced by several CString objects, as in the following code:
CString str("abcd");
CString a = str;
CString b(str);
CString c;
c = b;
After this code runs, the pointer members of all four objects (str, a, b, c) hold the same value, 0x40404040. How does the memory block know how many CString objects reference it? It records that information itself: the reference count, the string length, and the allocated length.
The header of the referenced memory block is defined as follows:
struct CStringData
{
    long nRefs;        // number of CString objects that reference this block (4 bytes)
    int nDataLength;   // actual length of the string (4 bytes)
    int nAllocLength;  // allocated length, excluding the 12 bytes of this header (4 bytes)
};
With this information, CString can allocate, manage, and release the referenced memory blocks correctly.
To inspect this information while debugging, enter one of the following expressions in the watch window:
((CStringData*)(this->m_pchData) - 1) or
((CStringData*)(str.m_pchData) - 1) // str is a CString instance
Thanks to this mechanism, copying CString objects in bulk is fast and allocates little memory.
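The layout described in this section can be sketched in standard C++. This is only a model under stated assumptions: RefString and StringHeader are hypothetical names, the header fields simply mirror CStringData, and none of this is MFC source code. It shows a string object whose only data member is a pointer to the characters, with the bookkeeping header sitting just in front of them.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical model of the layout described above; not MFC source code.
struct StringHeader {
    long nRefs;        // how many string objects share this block
    int  nDataLength;  // characters in use (terminator not counted)
    int  nAllocLength; // characters allocated (terminator not counted)
};

class RefString {
public:
    explicit RefString(const char* text) {
        const int len = static_cast<int>(std::strlen(text));
        void* raw = std::malloc(sizeof(StringHeader) + len + 1);
        assert(raw != nullptr);
        StringHeader* h = static_cast<StringHeader*>(raw);
        h->nRefs = 1;
        h->nDataLength = len;
        h->nAllocLength = len;
        m_pchData = reinterpret_cast<char*>(h + 1); // characters follow the header
        std::memcpy(m_pchData, text, len + 1);
    }
    RefString(const RefString& other) : m_pchData(other.m_pchData) {
        ++header()->nRefs;                          // share the block, copy nothing
    }
    RefString& operator=(const RefString& other) {
        if (m_pchData != other.m_pchData) {
            release();
            m_pchData = other.m_pchData;
            ++header()->nRefs;
        }
        return *this;
    }
    ~RefString() { release(); }

    const char* c_str() const { return m_pchData; }
    // Same pointer arithmetic as the watch-window expression: step back one header.
    StringHeader* header() const {
        return reinterpret_cast<StringHeader*>(m_pchData) - 1;
    }

private:
    void release() {
        if (--header()->nRefs == 0) std::free(header());
    }
    char* m_pchData; // the only data member
};
```

Copying or assigning a RefString only bumps nRefs, so four objects created as in the example above would all report the same c_str() address and a shared count of 4.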
2 LPCTSTR and GetBuffer(int nMinBufLength)
These two operations provide conversions compatible with standard C. They are used very often in practice, yet they are also where most mistakes are made. Both return a pointer, so how do they differ, and what happens behind the scenes when they are called?
(1) The LPCTSTR conversion is very simple: it just returns the address of the string inside the referenced memory block. It is provided as an overloaded conversion operator, so code can rely on the implicit conversion, although an explicit cast is sometimes needed. For example:
CString str;
const char* p = (LPCTSTR)str;
// Suppose there is a function test(const char* p); it can be called like this:
test(str); // implicit conversion
(2) GetBuffer(int nMinBufLength) is similar in that it also returns a pointer, but with one difference: it returns an LPTSTR.
(3) What is the difference between the two? They are completely different in nature. The pointer obtained from the LPCTSTR conversion should be used only as a constant or as an input parameter of a function, whereas the pointer obtained from GetBuffer(...) may be used to modify the contents or be passed as a function parameter. Why? Code like the following is common:
CString str("abcd");
char* p = (char*)(const char*)str;
p[2] = 'z';
Code like this may cause no visible error, and the program may appear to run fine, but it is very dangerous. Now look at this:
CString str("abcd");
CString test = str;
....
char* p = (char*)(const char*)str;
p[2] = 'z';
strcpy(p, "akfjaksjfakfakfakj"); // now it is all over: the copy overruns the 4-character block
Do you know the value of test after p[2] = 'z'? The answer is "abzd". test changes too, which is not what you would expect. Why? A CString points to a referenced memory block, and str and test point to the same one, so when you execute p[2] = 'z', test naturally changes as well. Therefore, a pointer obtained through this conversion may only be used to read the data, never to change it.
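The aliasing effect just described can be reproduced with a small standard C++ model. SharedText and Block are hypothetical stand-ins for two CString objects that share one referenced memory block; the sketch only illustrates why a write through the cast pointer shows up in the other object.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical shared block; a fixed-size array keeps the sketch short.
struct Block { char text[16]; };

// Hypothetical stand-in for a CString that shares its block on copy.
struct SharedText {
    std::shared_ptr<Block> block;
    explicit SharedText(const char* s) : block(std::make_shared<Block>()) {
        assert(std::strlen(s) < sizeof(block->text));
        std::strcpy(block->text, s);
    }
    const char* c_str() const { return block->text; }
};

// Writes through a pointer obtained by casting away const, as in the risky code above,
// and returns what the *other* object now contains.
inline std::string writeThroughCast() {
    SharedText str("abcd");
    SharedText test = str;                    // shares the block with str
    char* p = const_cast<char*>(str.c_str()); // same idea as (char*)(const char*)str
    p[2] = 'z';                               // modifies the shared block
    return test.c_str();                      // test was never touched directly
}
```

In this model writeThroughCast() returns "abzd", matching the behavior described for test.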
What if you want to modify the data directly through a pointer? Use GetBuffer(...). Look at the following code:
CString str("abcd");
CString test = str;
....
char* p = str.GetBuffer(20);
p[2] = 'z'; // at this point, test is still "abcd"
strcpy(p, "akfjaksjfakfakfakj"); // at this point, test is still "abcd"
Why? When GetBuffer(20) is called, it creates a new referenced memory block with a 20-character buffer, and the reference count of the original block is reduced by 1. After the code runs, str and test point to two different blocks, so both are safe.
(4) One more thing to note: after str.GetBuffer(20), the allocated length of str is 20, that is, the buffer that p points to is only 20 characters long. Do not write past it, or disaster is not far away. If the requested length is smaller than the original string length, as in GetBuffer(1), four characters (the original string length) are actually allocated. In addition, after calling GetBuffer(...) you must call ReleaseBuffer() so that the header of the referenced memory block is updated to match the string contents.
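The behavior described in notes (3) and (4) can be modelled roughly as follows. CowString is a hypothetical class written for illustration: its method names mirror CString, but the implementation (a shared std::vector<char> and a per-object length) is an assumption made for brevity, not how MFC stores its data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical copy-on-write string; the method names mirror CString, the internals do not.
class CowString {
public:
    explicit CowString(const char* s)
        : buf_(std::make_shared<std::vector<char>>(s, s + std::strlen(s) + 1)),
          len_(static_cast<int>(std::strlen(s))) {}

    // Returns a writable buffer of at least minLen characters that no other object shares.
    char* GetBuffer(int minLen) {
        assert(minLen >= 0);
        const std::size_t want = static_cast<std::size_t>(std::max(minLen, len_)) + 1;
        if (buf_.use_count() > 1 || buf_->size() < want) {
            auto fresh = std::make_shared<std::vector<char>>(want, '\0');
            std::memcpy(fresh->data(), buf_->data(), static_cast<std::size_t>(len_) + 1);
            buf_ = fresh; // the old block loses one reference
        }
        return buf_->data();
    }

    // Re-reads the length from the buffer contents after direct writes.
    void ReleaseBuffer() { len_ = static_cast<int>(std::strlen(buf_->data())); }

    int GetLength() const { return len_; }
    const char* c_str() const { return buf_->data(); }
    long refs() const { return buf_.use_count(); }

private:
    std::shared_ptr<std::vector<char>> buf_;
    int len_; // kept per object here for brevity; CString keeps it in the shared header
};
```

With this model, a second object copied from the first keeps "abcd" after the first calls GetBuffer(20) and writes through the pointer, and GetLength() on the first stays at 4 until ReleaseBuffer() is called.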
(5) The last point to note concerns the following code:
char* p = NULL;
const char* q = NULL;
{
    CString str = "abcd";
    q = (LPCTSTR)str;
    p = str.GetBuffer(20);
    AfxMessageBox(q); // valid
    strcpy(p, "this is test"); // valid
}
AfxMessageBox(q); // invalid, may crash
strcpy(p, "this is test"); // invalid, may crash
The point here is that once the CString object is destroyed, the pointers obtained from it become invalid as well.
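One common way to avoid the dangling pointers in note (5) is to copy the characters into an object that outlives the inner scope, rather than keeping the raw pointer. The sketch below uses std::string as a stand-in for CString, which is an assumption for portability; the lifetime rule it illustrates is the same.

```cpp
#include <cassert>
#include <string>

// std::string stands in for CString here; the function name is hypothetical.
inline std::string keepCopyNotPointer() {
    std::string kept;                // owns its own storage and outlives the inner scope
    {
        std::string str = "abcd";
        const char* q = str.c_str(); // valid only while str is alive and unmodified
        kept = q;                    // copy the characters while q is still valid
    }                                // str is destroyed here; q would now dangle
    assert(!kept.empty());
    return kept;
}
```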
The following code traces the whole process.
void test()
{
    CString str("abcd"); // str points to a referenced memory block (reference count 1,
                         // length 4, allocated length 4)
    CString a;           // a points to the initial (empty) data state
    a = str;             // a and str point to the same block (reference count 2,
                         // length 4, allocated length 4)
    CString b(a);        // a, b, and str point to the same block (reference
                         // count 3, length 4, allocated length 4)
    {
        LPCTSTR temp = (LPCTSTR)a; // temp points to the first character of the string in the block
                                   // (reference count 3, length 4, allocated length 4)
        CString d = a;   // a, b, d, and str point to the same block (reference count 4, length 4, allocated length 4)
        b = "testa";     // this statement actually calls CString::operator=(LPCTSTR);
                         // b now points to a newly allocated block (new block:
                         // reference count 1, length 5, allocated length 5)
                         // meanwhile the original block's reference count drops by 1; a, d, and str still point
                         // to the original block (reference count 3, length 4, allocated length 4)
    }                    // d goes out of scope, its destructor runs, and the reference count drops by 1
                         // (original block: reference count 2, length 4, allocated length 4)
    LPTSTR temp = a.GetBuffer(10); // this statement also allocates a new block;
                         // temp points to the first character of the string in the new block
                         // (new block: reference count 1, length
                         // 0, allocated length 10)
                         // meanwhile the original block's reference count drops by 1; only str still
                         // points to the original block (reference count 1,
                         // length 4, allocated length 4)
    strcpy(temp, "temp"); // the block a points to: reference count 1, length 0, allocated length 10
    a.ReleaseBuffer();    // note: the block a points to: reference count 1, length 4, allocated length 10
}
// When the function ends, str, a, and b call their destructors,
// and the reference count of each block they point to is reduced by 1.
// The counts of the blocks pointed to by str, a, and b all reach 0, so the allocated memory blocks are released.
By observing the above process, we can see that although CString lets several objects point to the same referenced memory block, its handling of copying, assignment, and modification of the string contents is intelligent and safe, so the objects never interfere with or affect one another. Of course, this holds only if your code uses CString correctly and appropriately. In more complex uses, such as passing a CString as a function parameter or by reference, or storing it in a CStringList, even a small misuse can produce unexpected errors.
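The reference counts traced in test() can be replayed with std::shared_ptr, whose use_count() plays the role of nRefs. This is a sketch under assumptions: Handle is a hypothetical stand-in for a CString, and re-pointing a handle at a fresh block models what the walkthrough says assignment from a literal and GetBuffer() do.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

using Handle = std::shared_ptr<std::string>; // hypothetical stand-in for a CString

// Replays the copies and assignments from test() and records the reference
// count of the original "abcd" block after each step.
inline std::vector<long> replayCounts() {
    std::vector<long> counts;
    Handle str = std::make_shared<std::string>("abcd");
    counts.push_back(str.use_count());              // only str
    Handle a = str;
    counts.push_back(str.use_count());              // str, a
    Handle b = a;
    counts.push_back(str.use_count());              // str, a, b
    {
        Handle d = a;
        counts.push_back(str.use_count());          // str, a, b, d
        b = std::make_shared<std::string>("testa"); // b moves to a new block
        counts.push_back(str.use_count());          // str, a, d
    }                                               // d is destroyed
    counts.push_back(str.use_count());              // str, a
    a = std::make_shared<std::string>("temp");      // models GetBuffer() detaching a
    counts.push_back(str.use_count());              // only str again
    assert(a.use_count() == 1 && b.use_count() == 1);
    return counts;
}
```

The recorded sequence follows the comments in the walkthrough: the count of the original block rises to 4, then falls back step by step until only str refers to it.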
5 The role of FreeExtra()
Look at this code:
(1) CString str("test");
(2) LPTSTR temp = str.GetBuffer(50);
(3) strcpy(temp, "there are 22 character");
(4) str.ReleaseBuffer();
(5) str.FreeExtra();
After line (4) has run, the block that str points to has reference count 1, length 22, and allocated length 50. Calling str.FreeExtra() then releases the excess memory (reference count 1, length 22, allocated length 22).
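A rough standard C++ analogue of this sequence uses std::string, with reserve() in the role of GetBuffer(50) and shrink_to_fit() in the role of FreeExtra(). This is an analogy only: shrink_to_fit() is a non-binding request, so the resulting capacity is not guaranteed to equal the length, and the Sizes struct and trimSpare function are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct Sizes {
    std::size_t length; // characters in use
    std::size_t before; // capacity before trimming
    std::size_t after;  // capacity after trimming
};

inline Sizes trimSpare() {
    std::string str = "test";
    str.reserve(50);                  // like GetBuffer(50): room for at least 50 characters
    str = "there are 22 character";   // 22 characters in use
    const std::size_t before = str.capacity();
    str.shrink_to_fit();              // like FreeExtra(): ask to drop the spare room
    assert(str.capacity() >= str.size());
    return {str.size(), before, str.capacity()};
}
```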
6 Format(...) and FormatV(...)
These functions are the most error-prone in use, because they are also the most flexible. I will not analyze them in detail here; they are used just like sprintf(...). Only one thing needs attention: the special nature of the arguments. The compiler cannot check at compile time that the types and sizes of the variadic arguments match the format string, so you must make sure the two correspond; otherwise, errors occur. For example:
CString str;
int a = 12;
str.Format("First: %l, second: %s", a, "error"); // result? Try changing %l to %d.
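Since Format(...) follows the sprintf(...) conventions, the corrected call can be illustrated with std::snprintf. The helper name formatPair and the 64-byte buffer are assumptions made for this sketch; the point is only that each specifier matches its argument type (%d for int, %s for a C string).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats with specifiers that match the argument types: %d for int, %s for a C string.
inline std::string formatPair(int first, const char* second) {
    assert(second != nullptr);
    char buf[64];
    std::snprintf(buf, sizeof(buf), "First: %d, second: %s", first, second);
    return buf;
}
```

For example, formatPair(12, "error") yields "First: 12, second: error".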
7 LockBuffer() and UnlockBuffer()
As the names suggest, these two functions lock and unlock the referenced memory block.
But what are they for, and what actually happens to the CString after they are called? It is quite simple. Look at the following code:
(1) CString str("test");
(2) str.LockBuffer();
(3) CString temp = str;
(4) str.UnlockBuffer();
(5) str.LockBuffer();
(6) str = "error";
(7) str.ReleaseBuffer();
After line (3) has run, unlike the usual case, temp and str do not point to the same referenced memory block. You can check this in the watch window with the expression ((CStringData*)(str.m_pchData) - 1).
MSDN describes it as follows:
While in a locked state, the string is protected in two ways:
No other string can get a reference to the data in the locked string, even if that string is assigned to the locked string.
The locked string will never reference another string, even if that other string is copied to the locked string.
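The two rules quoted from MSDN can be sketched as a small model. LockableString is a hypothetical class: it uses a shared_ptr and a boolean flag purely for illustration and makes no claim about how MFC represents the locked state internally.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical model of the locking rules quoted above; not the MFC implementation.
class LockableString {
public:
    explicit LockableString(const char* s)
        : block_(std::make_shared<std::string>(s)) {}

    // Copying from a locked string duplicates the characters instead of sharing the block.
    LockableString(const LockableString& src) : block_(src.block_) {
        if (src.locked_) block_ = std::make_shared<std::string>(*src.block_);
    }

    LockableString& operator=(const LockableString& src) {
        if (this == &src) return *this;
        if (locked_) {
            *block_ = *src.block_;   // a locked string keeps its own block
        } else if (src.locked_) {
            block_ = std::make_shared<std::string>(*src.block_);
        } else {
            block_ = src.block_;     // the usual case: share the block
        }
        return *this;
    }

    void LockBuffer() {
        if (block_.use_count() > 1) { // stop sharing before locking
            block_ = std::make_shared<std::string>(*block_);
        }
        locked_ = true;
    }
    void UnlockBuffer() { locked_ = false; }

    bool sharesBlockWith(const LockableString& other) const {
        return block_ == other.block_;
    }
    const std::string& text() const { assert(block_); return *block_; }

private:
    std::shared_ptr<std::string> block_;
    bool locked_ = false;
};
```

In this model a copy taken while the source is locked gets its own block, while a copy taken after UnlockBuffer() shares the block as usual.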
8 Can CString only handle strings?
No. CString can also handle blocks of raw memory, so it is quite versatile. Look at this code:
char p[20];
for (int loop = 0; loop < sizeof(p); loop++)
{
    p[loop] = 10 - loop;
}
CString str((LPCTSTR)p, 20);
char temp[20];
memcpy(temp, str, str.GetLength());
str carries the contents of memory block p over to memory block temp intact. So CString can also be used to handle binary data.
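The same round trip can be shown in standard C++, assuming std::string as a stand-in for CString. What matters in both cases is that the string is built with an explicit length, so the zero byte at p[10] does not cut the copy short. The function name roundTripBytes is hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Round-trips 20 raw bytes, including a zero byte, through a length-counted string.
inline bool roundTripBytes() {
    char p[20];
    for (int i = 0; i < static_cast<int>(sizeof(p)); ++i) {
        p[i] = static_cast<char>(10 - i);  // p[10] is 0, an embedded NUL
    }
    std::string str(p, sizeof(p));         // explicit length, so the NUL does not end the copy
    assert(str.size() == sizeof(p));
    char temp[20];
    std::memcpy(temp, str.data(), str.size());
    return std::memcmp(p, temp, sizeof(p)) == 0;
}
```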
8 AllocSysString() and SetSysString(BSTR*)
These two functions convert a string to a BSTR. Note: after calling AllocSysString(), you must call SysFreeString(...) to free the returned BSTR.
9. Parameter safety checks
MFC provides several macros for checking parameters, such as ASSERT, and CString is no exception: it contains many such checks, which shows how much care the code takes over safety. These checks can be annoying at times, and they make debug and release builds behave differently: sometimes a program runs normally in debug but crashes in release, and sometimes the opposite happens. Personally, I think that when using CString we should aim for code of such quality that no assertion box ever appears in the debug build. Even if the release build seems to run normally, that does not make it safe. Consider the following code:
(1) CString str("test");
(2) str.LockBuffer();
(3) LPTSTR temp = str.GetBuffer(10);
(4) strcpy(temp, "error");
(5) str.ReleaseBuffer();
(6) str.UnlockBuffer(); // by the time execution reaches here, the debug build pops up an assertion box
10 CString exception handling
I only want to stress that a CMemoryException can be thrown only when memory is allocated.
Accordingly, every function whose declaration in MSDN carries throw(CMemoryException) may reallocate or resize memory.
Because the data structure is fairly complex (it uses CStringData), many problems arise in use. The most typical is that the attributes describing the memory block become inconsistent with its actual contents. This happens because CString, for the convenience of some applications, provides operations that return the address of the string inside the memory block, and the caller can then modify the data at that address. If the corresponding operation (such as ReleaseBuffer()) is not called after the modification, the values in CStringData are not brought back into line. For example, you can obtain the string address through one of these operations and then append characters to the string, making it longer. Because the change was made directly through the pointer, nDataLength in CStringData, which describes the string length, still holds the old length. When the length is then obtained through GetLength(), the returned value is inevitably wrong.
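The stale-length problem can be reduced to a few lines. CountedBuffer is a hypothetical structure whose cachedLength field stands in for nDataLength, and resync() plays the role that the text assigns to the follow-up call; neither is MFC code.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical buffer with a cached length, standing in for nDataLength in CStringData.
struct CountedBuffer {
    char data[32];
    int  cachedLength;
};

inline void init(CountedBuffer& b, const char* s) {
    assert(std::strlen(s) < sizeof(b.data));
    std::strcpy(b.data, s);
    b.cachedLength = static_cast<int>(std::strlen(s));
}

// Recomputes the cached length from the actual contents.
inline void resync(CountedBuffer& b) {
    b.cachedLength = static_cast<int>(std::strlen(b.data));
}
```

If characters are appended directly to data, cachedLength keeps its old value until resync() is called, which is exactly the inconsistency described above.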
A final note: CString has many more features that I have not covered, such as its many overloaded operators and its search functions. For those, I think it is best to read MSDN in detail. I have focused only on the situations where errors are likely. If there are any mistakes in this description, corrections are welcome. Thank you very much!
Appendix: some CString type conversions:
1. CString to char*
With an explicit cast, a CString can be converted to char* (as explained above, use the result for reading only). For example:
CString cstr = "Hello, world!";
char* zstr = (char*)(LPCTSTR)cstr;
2. char* to CString
A char* can be assigned directly to a CString and is converted automatically. For example:
const char* zstr = "Hello, world!";
CString cstr = zstr;
3. CString to LPCSTR
To convert a CString to an LPCSTR, first obtain the length of the CString. For example:
CString cstr = _T("Hello, world!");
int nLen = cstr.GetLength();
LPCSTR lpszBuf = cstr.GetBuffer(nLen);
4. CString to LPSTR
This is the same as tip 3, for example:
CString cstr = _T("Hello, world!");
int nLen = cstr.GetLength();
LPSTR lpszBuf = cstr.GetBuffer(nLen);
5. char[] to int
Use the atoi function to convert a string to an integer. For example:
char c[10] = "100"; // sample value; the array must hold a NUL-terminated number
int n;
n = atoi(c);
6. char[] to float
As in tip 5, use the atof() function to convert to float, for example:
char c[10] = "1.5"; // sample value
float f;
f = (float)atof(c); // atof returns double
7. char* to int
This is exactly the same as tip 5, for example:
const char* str = "100";
int i;
i = atoi(str);
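One caveat with atoi is that it cannot report failure: non-numeric input such as "abc" simply yields 0. Where that matters, a checked conversion based on strtol is a possible alternative. The helper below is a sketch, and the name parseInt is hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Converts text to int and reports failure instead of silently returning 0.
inline bool parseInt(const char* text, int& out) {
    assert(text != nullptr);
    char* end = nullptr;
    errno = 0;
    const long value = std::strtol(text, &end, 10);
    if (end == text || *end != '\0') return false; // no digits, or trailing junk
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) return false;
    out = static_cast<int>(value);
    return true;
}
```

parseInt("100", n) succeeds and stores 100 in n, while parseInt("abc", n) and parseInt("12x", n) return false and leave n unchanged.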