String manipulation is one of the most common and basic operations in programming. Every VC programmer, novice or expert, has used CString, and it is hard to avoid it in real work (even though it is not part of the standard C++ library). This MFC class makes strings very convenient to handle: it provides a wide range of member functions and overloaded operators, so strings can be used as intuitively as in BASIC, and it allocates memory dynamically, which removes much of the risk of overrunning a character array. In practice, however, we also find that CString is easy to misuse and sometimes behaves in ways we did not predict. For that reason a number of experts have stood up and suggested abandoning it.

Personally, I think CString is very well designed. It has many advantages: it is easy to use, powerful, allocates memory dynamically, saves memory and time when strings are copied heavily, is fully compatible with standard C strings, supports both multi-byte and wide characters, and is safe and convenient to use because it has an exception mechanism. The reason we make mistakes with it is that we do not know it well enough, especially how it is implemented. Most of us do not like to dig into the documentation at work, and moreover the documentation is in English.

A few days ago at work I ran into a problem that looked trivial but was extremely hard to track down and thoroughly baffling. In the end it turned out to be caused by CString. With no other option, I read through the whole CString implementation, and only then did everything become clear and I fully understood the cause (I have posted this problem on CSDN).

Here I would like to summarize what I have learned about CString for others to draw on. My understanding may be wrong in places; if you find a mistake, please let me know. I would greatly appreciate it.
1 The implementation mechanism of CString

CString manages its string through "references". The idea should be familiar: Windows kernel objects, COM objects and so on are all managed by reference counting, and CString manages its allocated memory blocks the same way. A CString object actually has only one member variable, a pointer, so the size of any CString instance is just the size of one pointer, 4 bytes on a 32-bit build:

    int len = sizeof(CString);  // len equals 4

This pointer points into an associated reference memory block, as shown in the figure for

    CString str("ABCD");

     ___       ____________
    |   |     |            |
    |   |     | 0x04040404 |  header: information about the reference memory block
    |   |     |____________|
    |str|---->| 'A'        |  0x40404040
    |___|     | 'B'        |
              | 'C'        |
              | 'D'        |
              |  0         |
Because of this, one such memory block can be referenced by several CString objects, as in the following code:

    CString str("ABCD");
    CString a = str;
    CString b(str);
    CString c;
    c = b;

After this code runs, the pointer members of all four objects (str, a, b, c) have the same value, 0x40404040. How does the memory block know how many CString objects refer to it? It records some information of its own: the reference count, the length of the string, and the allocated length. The header of the reference memory block is defined as follows:

    struct CStringData
    {
        long nRefs;        // how many CString objects refer to this block (4 bytes)
        int nDataLength;   // actual length of the string (4 bytes)
        int nAllocLength;  // total allocated length, excluding this 12-byte header (4 bytes)
    };

With this information, CString can correctly allocate, manage and release reference memory blocks. If you want to see this information while debugging, type one of the following expressions in the Watch window:

    (CStringData*)((CStringData*)(this->m_pchData) - 1)

or

    (CStringData*)((CStringData*)(str.m_pchData) - 1)   // str is a CString instance

It is precisely this mechanism that makes CString both fast and economical with memory when strings are copied heavily.
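The layout described above, a single pointer member that points just past a small header holding the count and lengths, can be imitated in portable C++. This is only a minimal sketch under stated assumptions: ToyString, ToyStringData and demo_shared_refs are hypothetical names invented for illustration, and the code is not the MFC source.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical header mirroring the three fields described in the text.
struct ToyStringData {
    long nRefs;        // number of ToyString objects sharing this block
    int nDataLength;   // string length, without the terminator
    int nAllocLength;  // characters allocated, without header or terminator
    char* data() { return reinterpret_cast<char*>(this + 1); }
};

// Toy string whose only member is a pointer to the characters;
// the header sits immediately before them in the same allocation.
class ToyString {
public:
    explicit ToyString(const char* s) {
        int len = static_cast<int>(std::strlen(s));
        void* raw = std::malloc(sizeof(ToyStringData) + len + 1);
        if (!raw) throw std::bad_alloc();
        ToyStringData* d = static_cast<ToyStringData*>(raw);
        d->nRefs = 1;
        d->nDataLength = len;
        d->nAllocLength = len;
        std::memcpy(d->data(), s, len + 1);
        m_pch = d->data();
    }
    // Copying shares the block and bumps the count; no characters are copied.
    ToyString(const ToyString& other) : m_pch(other.m_pch) { ++header()->nRefs; }
    ToyString& operator=(const ToyString& other) {
        if (m_pch != other.m_pch) {
            release();
            m_pch = other.m_pch;
            ++header()->nRefs;
        }
        return *this;
    }
    ~ToyString() { release(); }

    const char* c_str() const { return m_pch; }
    long refs() const { return header()->nRefs; }

private:
    // Step back over the header, like the Watch-window expression in the text.
    ToyStringData* header() const {
        return reinterpret_cast<ToyStringData*>(m_pch) - 1;
    }
    void release() {
        if (--header()->nRefs == 0) std::free(header());
    }
    char* m_pch;
};

// Mirrors the four-object example: returns the shared reference count,
// or -1 if the objects unexpectedly point at different blocks.
long demo_shared_refs() {
    ToyString str("ABCD");
    ToyString a = str;
    ToyString b(str);
    ToyString c("");
    c = b;
    bool same = a.c_str() == str.c_str() && b.c_str() == str.c_str() &&
                c.c_str() == str.c_str();
    return same ? str.refs() : -1;
}
```

Calling demo_shared_refs() builds four objects that all share one block, so the count it reports is 4, and sizeof(ToyString) is the size of a single pointer.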
2 operator LPCTSTR and GetBuffer(int nMinBufLength)

These two functions provide conversion to standard C strings. They are used very frequently in practice, and they are also where most mistakes are made. Both return a pointer, but what is the difference between them, and what happens behind the scenes when they are called?

(1) operator LPCTSTR is very simple: it just returns the address of the string inside the reference memory block. Because it is provided as an overloaded conversion operator, the conversion is sometimes implicit and sometimes needs an explicit cast. For example:

    CString str;
    const char* p = (LPCTSTR)str;

Given a function test(const char* p), you can call

    test(str);  // str is implicitly converted to LPCTSTR

(2) GetBuffer(int nMinBufLength) is similar in that it also returns a pointer, but the pointer type is LPTSTR.

(3) What is the difference between the two? They are fundamentally different. In general, the pointer obtained from the LPCTSTR conversion should only be used as a constant or passed as a function argument, whereas the pointer returned by GetBuffer(...) can be used to modify the contents as well as being passed as a function argument. Why? There is probably a lot of code like this:

    CString str("ABCD");
    char* p = (char*)(const char*)str;
    p[2] = 'Z';

Code like this may well not fail, and the program may run fine, but it is very dangerous. Look again:

    CString str("ABCD");
    CString test = str;
    ....
    char* p = (char*)(const char*)str;
    p[2] = 'Z';
    strcpy(p, "akfjaksjfakfakfakj");  // this is where it all ends

Do you know what the value of test is after p[2] = 'Z'? The answer is "ABZD". It has changed, which is not what you expected, and the strcpy that follows is worse still, since it writes far more than the four characters the string held. But why did test change?

If you think about it you will understand. As mentioned earlier, a CString points to a reference memory block, and str and test point to the same block, so when you execute p[2] = 'Z', test of course changes as well. So when you use the LPCTSTR conversion, you may only read the data and must never change it.

What if you want to modify the data directly through a pointer? Use GetBuffer(...). Look at the following code:

    CString str("ABCD");
    CString test = str;
    ....
    char* p = str.GetBuffer(20);
    p[2] = 'Z';                        // test is still "ABCD"
    strcpy(p, "akfjaksjfakfakfakj");   // test is still "ABCD"

Why? When GetBuffer(20) is called, it actually creates a new memory block with a buffer length of 20, and the reference count of the original block is reduced by 1. After this code runs, str and test point to two different blocks, so they no longer interfere with each other.
(4) There is one more thing to note. After str.GetBuffer(20), the allocated length of str is 20, that is, the buffer that p points to is only 20 characters long. Whatever you write into it must not exceed that, or disaster is not far away. If the requested length is smaller than the original string length, as in GetBuffer(1), the allocation is actually 4 (the original string length). In addition, after calling GetBuffer(...) and changing the contents, remember to call ReleaseBuffer(), which updates the header of the reference memory block according to the string contents.

(5) A final note. Look at the following code:

    char* p = NULL;
    const char* q = NULL;
    {
        CString str = "ABCD";
        p = str.GetBuffer(20);
        q = (LPCTSTR)str;
        AfxMessageBox(q);            // legal
        strcpy(p, "This is Test");   // legal
    }
    AfxMessageBox(q);                // illegal, may crash
    strcpy(p, "This is Test");       // illegal, may crash

The point is that once the CString object's lifetime ends, the pointers obtained from it are no longer valid.
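The difference between the read-only conversion and GetBuffer(...) described in this section comes down to copy-on-write: a writable pointer is only handed out after the string has detached from any block it shares. The sketch below imitates that with std::shared_ptr and std::string. CowString, get_buffer, release_buffer and demo_get_buffer are hypothetical names chosen for illustration, and the buffer-growing rule is a simplification, not the MFC algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Toy copy-on-write string: copies share one block until someone asks to write.
class CowString {
public:
    explicit CowString(const char* s)
        : m_block(std::make_shared<std::string>(s)) {}

    // Read-only view, comparable in spirit to the LPCTSTR conversion:
    // the caller sees the shared block and must not modify it.
    const char* c_str() const { return m_block->c_str(); }

    // Writable buffer, comparable in spirit to GetBuffer: detach from any
    // sharers first, then make sure at least minLen characters are available.
    char* get_buffer(std::size_t minLen) {
        if (m_block.use_count() > 1)
            m_block = std::make_shared<std::string>(*m_block);
        if (m_block->size() < minLen)
            m_block->resize(minLen);  // pads with '\0'
        return &(*m_block)[0];
    }

    // Comparable in spirit to ReleaseBuffer: recompute the logical length
    // from the terminator that the caller wrote into the buffer.
    void release_buffer() {
        std::size_t len = std::strlen(m_block->c_str());
        m_block->resize(len);
    }

    long refs() const { return m_block.use_count(); }

private:
    std::shared_ptr<std::string> m_block;
};

// Mirrors the example in the text: writing through the buffer of str
// must leave test untouched. Returns the final value of test.
std::string demo_get_buffer() {
    CowString str("ABCD");
    CowString test = str;            // both share one block
    char* p = str.get_buffer(20);    // str now owns a private copy
    p[2] = 'Z';
    std::strcpy(p, "akfjaksjfakfakfakj");  // 18 characters, fits in 20
    str.release_buffer();
    return test.c_str();
}
```

Here demo_get_buffer() returns "ABCD", because the write went to the private copy that get_buffer created. Had the code written through a cast of c_str() instead, both objects would have seen the change.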
The following walks through the execution of a piece of code:

    void Test()
    {
        CString str("ABCD");
        // str points to a reference memory block
        // (reference count 1, length 4, allocated length 4)

        CString a;
        // a points to the initial (empty) data block

        a = str;
        // a and str point to the same block
        // (reference count 2, length 4, allocated length 4)

        CString b(a);
        // a, b and str point to the same block
        // (reference count 3, length 4, allocated length 4)

        {
            LPCTSTR temp = (LPCTSTR)a;
            // temp points to the first character of the string in the block
            // (reference count 3, length 4, allocated length 4)

            CString d = a;
            // a, b, d and str point to the same block
            // (reference count 4, length 4, allocated length 4)

            b = "Testa";
            // This statement actually calls CString::operator=(LPCTSTR).
            // b now points to a newly allocated block
            // (reference count 1, length 5, allocated length 5),
            // and the reference count of the original block is reduced by 1.
            // a, d and str still point to the original block
            // (reference count 3, length 4, allocated length 4)
        }
        // d's lifetime ends here; its destructor reduces the count by 1
        // (reference count 2, length 4, allocated length 4)

        LPTSTR temp = a.GetBuffer(10);
        // This statement also allocates a new block. temp points to the
        // first character of the new block
        // (reference count 1, length 0, allocated length 10),
        // and the reference count of the original block is reduced by 1.
        // Only str still points to the original block
        // (reference count 1, length 4, allocated length 4)

        strcpy(temp, "temp");
        // The block a points to: reference count 1, length 0, allocated length 10

        a.ReleaseBuffer();
        // Note: the block a points to is now
        // reference count 1, length 4, allocated length 10
    }
    // All local variables go out of scope here. str, a and b each call
    // their destructor, which reduces the count of the block they point to
    // by 1. The counts of the blocks pointed to by str, a and b all reach 0,
    // so the allocated blocks are freed.
Observing the process above, we see that although several CString objects can point to the same reference memory block, CString handles copying, assignment and modification of the string contents intelligently and safely: the objects do not interfere with or affect one another. Of course, this requires that your code be correct. Real use is often more complicated, with CString objects passed as function parameters, held by reference, or stored in a CStringList, and a single careless spot can cause unpredictable errors.
5 The effect of FreeExtra()

Look at this piece of code:

    (1) CString str("test");
    (2) LPTSTR temp = str.GetBuffer(50);
    (3) strcpy(temp, "There is character");
    (4) str.ReleaseBuffer();
    (5) str.FreeExtra();

After line (4) has executed, str points to a block with reference count 1, length 18 and allocated length 50. Executing str.FreeExtra() then frees the excess memory (reference count 1, length 18, allocated length 18).
6 Format(...) and FormatV(...)

These functions are the easiest to get wrong, because they are the most flexible and demand the most care. I do not intend to analyse them in detail here; they are used exactly the way sprintf(...) is used. I only want to point out one thing: the arguments are special, because the compiler does not check at compile time that the types and sizes of the arguments match the format string. You have to make sure the two correspond, or things will go wrong. For example:

    CString str;
    int a = 12;
    str.Format("first:%l, second:%s", a, "error");  // result? Try it.
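Since the text says Format(...) follows the conventions of sprintf(...), the rule that each conversion specifier must match its argument can be shown with the standard library alone. This is a minimal sketch using std::snprintf as a stand-in; format_pair is a hypothetical helper, and CString itself is not used.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats an int and a C string with specifiers that match the argument
// types: %d for int, %s for const char*. A mismatched specifier would not be
// rejected by the language itself, which is the hazard the text warns about.
std::string format_pair(int first, const char* second) {
    char buf[64];
    int n = std::snprintf(buf, sizeof(buf), "first:%d, second:%s", first, second);
    if (n < 0) return std::string();  // encoding error
    return std::string(buf);          // snprintf always terminates buf
}
```

With matching specifiers, format_pair(12, "error") produces "first:12, second:error". Many compilers can warn about mismatches for printf-style functions when warnings are enabled, but that check is optional and should not be relied on for wrappers such as Format(...).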
7 LockBuffer() and UnlockBuffer()

As the names imply, these two functions lock and unlock the reference memory block. But what effect do they have, and how is the CString affected after they have been called? It is actually quite simple. Look at the following code:

    (1) CString str("test");
    (2) str.LockBuffer();
    (3) CString temp = str;
    (4) str.UnlockBuffer();
    (5) str.LockBuffer();
    (6) str = "error";
    (7) str.ReleaseBuffer();

After (3) has executed, unlike the usual case, temp and str do not point to the same reference memory block. You can check this in the Watch window with the expression

    (CStringData*)((CStringData*)(str.m_pchData) - 1)

This is in fact described in MSDN:

    While in a locked state, the string is protected in two ways:
    No other string can get a reference to the data in the locked string, even if that string is assigned to the locked string.
    The locked string will never reference another string, even if that other string is copied to the locked string.
8 Does CString only handle strings?

No. CString can handle not only strings but also blocks of binary data. It is a very useful feature. Look at this piece of code:

    char p[20];
    for (int loop = 0; loop < sizeof(p); loop++)
    {
        p[loop] = 10 - loop;
    }
    CString str((LPCTSTR)p, 20);
    char temp[20];
    memcpy(temp, str, str.GetLength());

str reproduces the memory block p in the memory block temp completely, so we can use CString to handle binary data.
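The reason this works is that the string is constructed from a pointer plus an explicit length, so a zero byte in the middle (p[10] in the example) does not end the copy. The same idea can be tried without MFC by using std::string as a stand-in; demo_binary_round_trip is a hypothetical name, and the sketch makes no claim about how CString does it internally.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Copies a 20-byte block that contains a zero byte into a length-based
// string and back out again, then checks that nothing was lost.
bool demo_binary_round_trip() {
    char p[20];
    for (int i = 0; i < static_cast<int>(sizeof(p)); ++i) {
        p[i] = static_cast<char>(10 - i);  // p[10] is 0
    }

    // Pointer-plus-length construction keeps all 20 bytes,
    // including the embedded zero and everything after it.
    std::string str(p, sizeof(p));

    char temp[20];
    std::memcpy(temp, str.data(), str.size());

    return str.size() == sizeof(p) && std::memcmp(p, temp, sizeof(p)) == 0;
}
```

A construction that takes only a pointer would stop at the first zero byte and keep just the first ten bytes, which is why the explicit length matters for binary data.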
8 AllocSysString() and SetSysString(BSTR*)

These two functions convert a string to a BSTR. Note: after you call AllocSysString(), you must free the returned BSTR with SysFreeString(...).
9 Safety checks on parameters

MFC provides several macros for checking parameters, such as ASSERT. CString is no exception and contains many such checks, which shows that the code is written with safety in mind. Sometimes, though, we find them annoying, and they make the Debug and Release builds behave differently: sometimes the program runs normally in Debug and crashes in Release, and sometimes it is the other way round. Personally, I believe that when we use CString we should strive for high-quality code that never triggers an assertion box in the Debug build. Even if the Release build seems to run fine, it is not safe. Consider the following code:

    (1) CString str("test");
    (2) str.LockBuffer();
    (3) LPTSTR temp = str.GetBuffer(10);
    (4) strcpy(temp, "error");
    (5) str.ReleaseBuffer();
    (6) str.ReleaseBuffer();  // at this point the Debug build pops up an assertion box
10 Exception handling in CString

I only want to emphasize that a CMemoryException can be thrown only where memory is allocated. Conversely, any function declared with throw(CMemoryException) in MSDN may allocate or resize memory.
11 CString across modules

That is, what happens when a function exported from a DLL takes a CString& parameter? This answers the problem I ran into. I have posted the question at: http://www.csdn.net/expert/topic/741/741921.xml?temp=.2283136
When you construct a CString object such as CString str, do you know which reference memory block str points to? You might think it points to NULL. If it did, the reference mechanism would have trouble managing memory blocks, so when CString constructs an empty string object it points it at a fixed initial data block, declared as follows:

    AFX_STATIC_DATA int _afxInitData[] = { -1, 0, 0, 0 };

In short: when a CString object is empty, for example after Empty() or when declared as CString a, its member m_pchData points into the _afxInitData variable. When a CString object's lifetime ends, it normally reduces the count of its reference memory block by 1 and frees the block if the count reaches 0 (that is, no CString refers to it any more). But if the block the CString refers to is the initial data block, no memory is freed.

After all this, what does it have to do with the problem I ran into? A great deal. The real cause is this: if either the EXE module or the DLL module is statically linked to MFC, the CString initial data has a different address in the EXE and in the DLL, because static linking puts a copy of the code in each module. If instead both modules link to the shared MFC DLL, the CString implementation lives in that separate DLL and the AFX_STATIC_DATA variable exists only once, so _afxInitData has the same address in both modules. Now the problem is fully understood. You can try it yourself:

    __declspec(dllexport) void test(CString& str)
    {
        str = "ABDEFAKDFJ";  // if MFC is statically linked and the incoming str is an empty string, an error occurs
    }
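Why a second copy of the initial data block causes trouble can be modelled in a few lines. The sketch below is a toy under stated assumptions: ToyHeader and release_would_free are hypothetical, and the release rule (skip the module's own empty block by address, otherwise decrement and free when the count is zero or below) is an assumption made for illustration rather than a quotation of the MFC source.

```cpp
#include <cassert>

// Hypothetical header with the three fields described in section 1.
struct ToyHeader {
    long nRefs;
    int nDataLength;
    int nAllocLength;
};

// Assumed release rule: a module recognises its own empty-string block by
// address and leaves it alone; any other block is decremented and reported
// as "to be freed" once its count is zero or below.
bool release_would_free(ToyHeader* block, const ToyHeader* moduleInit) {
    if (block == moduleInit) return false;  // our own static empty block
    return --block->nRefs <= 0;
}

// Shared MFC DLL case: one copy of the initial data, same address everywhere.
bool demo_same_module() {
    ToyHeader init = { -1, 0, 0 };
    return release_would_free(&init, &init);
}

// Static linking case: the EXE and the DLL each carry their own copy, so the
// DLL does not recognise the EXE's empty block and tries to free it.
bool demo_cross_module() {
    ToyHeader exeInit = { -1, 0, 0 };
    ToyHeader dllInit = { -1, 0, 0 };
    return release_would_free(&exeInit, &dllInit);
}
```

In this model demo_same_module() returns false (nothing is freed), while demo_cross_module() returns true: the DLL would attempt to free memory that was never heap-allocated, which is consistent with the failure described for an empty string passed across statically linked modules.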
A final thought: there is a lot more good stuff in CString that I have not covered, such as the many overloaded operators, the search functions, and so on. MSDN describes these in more detail than I could. I have concentrated only on the situations that can go wrong. If there are mistakes in what I have described, I would be grateful for corrections from the experts.