String Conversion Methods in Visual C++ .NET (repost)

Source: Internet
Author: User

Visual C++ .NET supports multiple programming approaches, including ATL/ATL Server, MFC, and managed C++. It is both powerful and widely used. In programming, we often need to convert strings between encodings and types such as ANSI, Unicode, and BSTR. This article first introduces the basic string types, then describes the related classes, such as CComBSTR, _bstr_t, and CStringT, and finally discusses how to convert between them, including the ATL 7.0 conversion classes and macros such as CA2CT and CA2TEX.

  I. BSTR, LPSTR, and LPWSTR

In every Visual C++ .NET programming approach, we frequently use basic string types such as BSTR, LPSTR, and LPWSTR. These types exist because of the need to exchange data between different programming languages and to support ANSI, Unicode, and multi-byte character sets (MBCS).

So what are BSTR, LPSTR, and LPWSTR?

BSTR ("Basic string") is a Unicode string of type OLECHAR*. It is described as an Automation-compatible type. Because the operating system provides API functions (such as SysAllocString) to manage it, along with default marshaling code, BSTR is effectively the COM string type, although it is also widely used outside Automation. Figure 1 describes the structure of a BSTR: the DWORD value that precedes the characters is the number of bytes occupied by the string, which is twice the number of Unicode characters in the string.
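
To make the layout described above concrete, here is a minimal portable sketch that builds a BSTR-like buffer by hand. It is only a model under stated assumptions: real BSTRs are allocated with SysAllocString and freed with SysFreeString, and the names MakeBstrLikeBuffer and ReadBytePrefix are hypothetical helpers, not Windows APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: lays text out the way a BSTR is laid out in memory:
// [4-byte byte count][UTF-16 code units][2-byte null terminator].
// A real BSTR pointer points at the first code unit, just past the prefix.
std::vector<unsigned char> MakeBstrLikeBuffer(const std::u16string& text) {
    const std::uint32_t byteCount =
        static_cast<std::uint32_t>(text.size() * sizeof(char16_t));
    const std::size_t total = sizeof(byteCount) + byteCount + sizeof(char16_t);
    std::vector<unsigned char> buffer(total);  // value-initialized to zero
    std::memcpy(buffer.data(), &byteCount, sizeof(byteCount));
    if (byteCount != 0) {
        std::memcpy(buffer.data() + sizeof(byteCount), text.data(), byteCount);
    }
    return buffer;  // the trailing terminator bytes are already zero
}

// Hypothetical helper: reads the byte-count prefix back out of the buffer.
std::uint32_t ReadBytePrefix(const std::vector<unsigned char>& buffer) {
    assert(buffer.size() >= sizeof(std::uint32_t));
    std::uint32_t byteCount = 0;
    std::memcpy(&byteCount, buffer.data(), sizeof(byteCount));
    return byteCount;
}
```

For the four-character string "test", the prefix holds 8 (two bytes per character), and the terminator is not counted in the prefix.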

LPSTR and LPWSTR are string data types used by Win32 and VC++. LPSTR is defined as a pointer to a null-terminated ('\0') array of 8-bit ANSI characters, and LPWSTR is a pointer to a null-terminated array of 16-bit wide characters. VC++ also has similar string types, such as LPTSTR and LPCTSTR. Their meanings are shown in Figure 2.

For example, LPCTSTR stands for "long pointer to a constant generic string". In an ANSI build it maps to const char* in C/C++, while LPTSTR maps to char*.

Generally, the following types are also defined:

#ifdef UNICODE
typedef LPWSTR LPTSTR;
typedef LPCWSTR LPCTSTR;
#else
typedef LPSTR LPTSTR;
typedef LPCSTR LPCTSTR;
#endif
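
The effect of these typedefs can be imitated in portable C++. The sketch below is an illustration only: DEMO_UNICODE, DemoTChar, DEMO_T, DemoLPTSTR, DemoLPCTSTR, DemoLength, and GreetingLength are hypothetical stand-ins for UNICODE, TCHAR, _T, LPTSTR, LPCTSTR, and _tcslen, and are not part of any Windows header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// DEMO_UNICODE is a hypothetical stand-in for the UNICODE macro.
#ifdef DEMO_UNICODE
typedef wchar_t DemoTChar;             // plays the role of TCHAR
#define DEMO_T(x) L##x                 // plays the role of _T()
inline std::size_t DemoLength(const DemoTChar* s) {
    assert(s != NULL);
    return std::wcslen(s);
}
#else
typedef char DemoTChar;                // plays the role of TCHAR
#define DEMO_T(x) x                    // plays the role of _T()
inline std::size_t DemoLength(const DemoTChar* s) {
    assert(s != NULL);
    return std::strlen(s);
}
#endif

typedef DemoTChar* DemoLPTSTR;         // plays the role of LPTSTR
typedef const DemoTChar* DemoLPCTSTR;  // plays the role of LPCTSTR

// The same source line compiles as narrow or wide text depending on the macro.
inline std::size_t GreetingLength() {
    DemoLPCTSTR greeting = DEMO_T("this is a test");
    return DemoLength(greeting);
}
```

Whether or not DEMO_UNICODE is defined, GreetingLength() counts the same 14 characters; only the width of each character changes.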

  II. CString, CStringA, and CStringW

In Visual C++ .NET, CStringT is the "generic" string class shared by ATL and MFC. It has three forms, CString, CStringA, and CStringW, which operate on strings of the character types TCHAR, char, and wchar_t respectively. TCHAR is equivalent to WCHAR (a 16-bit Unicode character) in a Unicode build and to char in an ANSI build. wchar_t is generally defined as unsigned short unless the compiler treats it as a native type (/Zc:wchar_t). Because CString is so commonly used in MFC applications, it is not covered further here.

  III. VARIANT, COleVariant, and _variant_t

In OLE, ActiveX, and COM, the VARIANT data type provides a very effective mechanism: because it carries both the data and the type of the data, it can be used for all kinds of Automation data transfer. Here is a simplified version of the VARIANT definition from the OAIDL.h file:

struct tagVARIANT {
    VARTYPE vt;
    union {
        short iVal;       // VT_I2
        long lVal;        // VT_I4
        float fltVal;     // VT_R4
        double dblVal;    // VT_R8
        DATE date;        // VT_DATE
        BSTR bstrVal;     // VT_BSTR
        ...
        short *piVal;     // VT_BYREF | VT_I2
        long *plVal;      // VT_BYREF | VT_I4
        float *pfltVal;   // VT_BYREF | VT_R4
        double *pdblVal;  // VT_BYREF | VT_R8
        DATE *pdate;      // VT_BYREF | VT_DATE
        BSTR *pbstrVal;   // VT_BYREF | VT_BSTR
    };
};

Clearly, VARIANT is a C structure that contains a type member vt, some reserved bytes, and a large union. For example, if vt is VT_I2, the value of the VARIANT can be read from iVal. Likewise, when assigning a value to a VARIANT variable, you must set its type first. For example:

VARIANT va;
::VariantInit(&va);  // initialize
int a = 2002;
va.vt = VT_I4;       // specify the long data type
va.lVal = a;         // assign the value

To make VARIANT variables easier to work with, Windows also provides these useful functions:

VariantInit -- initializes the variable to VT_EMPTY;

VariantClear -- releases the contents of the VARIANT and resets it to VT_EMPTY;

VariantChangeType -- changes the type of the VARIANT;

VariantCopy -- frees the memory attached to the destination VARIANT and copies the source VARIANT.
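
The "set the type tag first, then the value" discipline described above can be modeled in portable C++. The sketch below is a hypothetical, simplified stand-in and not the real VARIANT: DemoVariant, DemoVarType, DemoInit, DemoSetLong, and DemoGetLong are invented names, and only the tag values are chosen to match VT_EMPTY, VT_I2, VT_I4, and VT_R8.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified model of VARIANT: a type tag plus a union.
enum DemoVarType { DEMO_EMPTY = 0, DEMO_I2 = 2, DEMO_I4 = 3, DEMO_R8 = 5 };

struct DemoVariant {
    DemoVarType vt;
    union {
        short iVal;     // valid when vt == DEMO_I2
        long lVal;      // valid when vt == DEMO_I4
        double dblVal;  // valid when vt == DEMO_R8
    };
};

// Plays the role of VariantInit: mark the value as empty.
inline void DemoInit(DemoVariant* v) {
    assert(v != NULL);
    v->vt = DEMO_EMPTY;
    v->dblVal = 0.0;
}

// Set the tag first, then the matching union member.
inline void DemoSetLong(DemoVariant* v, long value) {
    assert(v != NULL);
    v->vt = DEMO_I4;
    v->lVal = value;
}

// Read the value only through the member that the tag names.
inline bool DemoGetLong(const DemoVariant& v, long* out) {
    assert(out != NULL);
    if (v.vt != DEMO_I4) return false;
    *out = v.lVal;
    return true;
}
```

A reader that checks vt before touching the union, as DemoGetLong does, never interprets the stored bytes as the wrong type.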

The COleVariant class wraps the VARIANT structure, and its constructors are very capable. When an object is constructed, VariantInit is called first for initialization, then the constructor matching the standard type of the argument is invoked, and VariantCopy is used for conversion and assignment. When the object goes out of scope, its destructor is called automatically; because the destructor calls VariantClear, the associated memory is released automatically. In addition, COleVariant's assignment operators make conversions to and from VARIANT types very convenient. For example:

COleVariant v1("this is a test");  // construct directly
COleVariant v2 = "this is a test";
// The result is of type VT_BSTR with the value "this is a test"
COleVariant v3((long)2002);
COleVariant v4 = (long)2002;
// The result is of type VT_I4 with the value 2002

_variant_t is a VARIANT wrapper class for COM, with functionality similar to COleVariant. However, to use it in a Visual C++ .NET MFC application, you must add the following two lines at the top of the source file:

#include "comutil.h"

#pragma comment(lib, "comsupp.lib")

  IV. CComBSTR and _bstr_t

CComBSTR is the ATL wrapper class for the BSTR data type. It is easy to use. For example:

CComBSTR bstr1;
bstr1 = "bye";                         // direct assignment
OLECHAR* str = OLESTR("ta ta");        // wide string of 5 characters
CComBSTR bstr2((int)wcslen(str));      // allocate a length of 5
wcscpy(bstr2.m_str, str);              // copy the wide string into the BSTR
CComBSTR bstr3(5, OLESTR("Hello World"));
CComBSTR bstr4(5, "Hello World");
CComBSTR bstr5(OLESTR("Hey there"));
CComBSTR bstr6("Hey there");
CComBSTR bstr7(bstr6);
// Copied during construction; the content is "Hey there"

_bstr_t is the C++ wrapper for BSTR. Its constructor and destructor call SysAllocString and SysFreeString respectively, and its other operations use the BSTR API functions. As with _variant_t, comutil.h and comsupp.lib must be added to use it.

  V. Converting between BSTR, char*, and CString

(1) Convert char* to CString

To convert a char* to a CString, you can use CString::Format in addition to direct assignment. For example:

char chArray[] = "this is a test";
char* p = "this is a test";

Or

LPSTR p = "this is a test";

Or, in an application that has Unicode defined

TCHAR* p = _T("this is a test");

Or

LPTSTR p = _T("this is a test");
CString theString = chArray;
theString.Format(_T("%s"), chArray);
theString = p;

(2) Convert CString to char*

To convert a CString to char* (LPSTR), the following three methods are commonly used:

Method 1: Use a cast. For example:

CString theString("this is a test");
LPTSTR lpsz = (LPTSTR)(LPCTSTR)theString;  // the cast drops const; treat lpsz as read-only

Method 2: Use strcpy. For example:

CString theString("this is a test");
LPTSTR lpsz = new TCHAR[theString.GetLength() + 1];
_tcscpy(lpsz, theString);

Note that the second parameter of strcpy (or _tcscpy for Unicode/MBCS builds) is const wchar_t* (Unicode) or const char* (ANSI); the compiler converts the CString automatically. Because the buffer is allocated with new[], it must be released with delete[] when it is no longer needed.

Method 3: Use CString::GetBuffer. For example:

CString s(_T("this is a test"));
LPTSTR p = s.GetBuffer();
// Add the code that uses p here
if (p != NULL) *p = _T('\0');
s.ReleaseBuffer();
// Release immediately after use so that other CString member functions can be used

(3) Convert BSTR to char*

Method 1: Use ConvertBSTRToString. For example:

#include <comutil.h>
#pragma comment(lib, "comsupp.lib")
int _tmain(int argc, _TCHAR* argv[]) {
    BSTR bstrText = ::SysAllocString(L"test");
    char* lpszText2 = _com_util::ConvertBSTRToString(bstrText);
    SysFreeString(bstrText);  // release after use
    delete[] lpszText2;       // the converted string is owned by the caller
    return 0;
}

Method 2: Use the overloaded conversion operator of _bstr_t. For example:

_bstr_t b = bstrText;
char* lpszText2 = b;

(4) Convert char* to BSTR

Method 1: Use API functions such as SysAllocString. For example:

// Use any one of the following:
BSTR bstrText = ::SysAllocString(L"test");
BSTR bstrText = ::SysAllocStringLen(L"test", 4);
BSTR bstrText = ::SysAllocStringByteLen("test", 4);

Method 2: Use COleVariant or _variant_t. For example:

// COleVariant strVar("this is a test");
_variant_t strVar("this is a test");
BSTR bstrText = strVar.bstrVal;

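Method 3: Use _bstr_t, which is the simplest method. For example:

// The temporary _bstr_t frees its own BSTR at the end of the statement,
// so take a copy; release it later with SysFreeString.
BSTR bstrText = _bstr_t("this is a test").copy();

Method 4: Use CComBSTR. For example:

// Detach hands ownership to the caller; release it later with SysFreeString.
BSTR bstrText = CComBSTR("this is a test").Detach();

Or

CComBSTR bstr("this is a test");
BSTR bstrText = bstr.m_str;  // valid only while bstr is alive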

Method 5: Use ConvertStringToBSTR. For example:

char* lpszText = "test";
BSTR bstrText = _com_util::ConvertStringToBSTR(lpszText);

(5) Convert CString to BSTR

Generally, CStringT::AllocSysString is used. For example:

CString str("this is a test");
BSTR bstrText = str.AllocSysString();
...
SysFreeString(bstrText);  // release after use

(6) Convert BSTR to CString

Generally, you can do the following:

BSTR bstrText = ::SysAllocString(L"test");
CStringA str;
str.Empty();
str = bstrText;

Or

CStringA str(bstrText);

(7) Converting between ANSI, Unicode, and wide characters

Method 1: Use MultiByteToWideChar to convert ANSI to Unicode, and use WideCharToMultiByte to convert Unicode to ANSI.
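
As an illustration of Method 1, the following sketch wraps the two API functions in a pair of helpers using the usual two-call pattern: the first call asks for the required size, and the second performs the conversion. The helper names AnsiToWide and WideToAnsi are hypothetical, the choice of CP_ACP is an assumption, and the non-Windows branch uses the C library's mbstowcs and wcstombs only as a stand-in so the sketch compiles elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: ANSI (active code page) to wide string.
std::wstring AnsiToWide(const std::string& src) {
    if (src.empty()) return std::wstring();
#ifdef _WIN32
    const int srcLen = static_cast<int>(src.size());
    // First call: ask how many wide characters are needed.
    const int needed = ::MultiByteToWideChar(CP_ACP, 0, src.data(), srcLen, NULL, 0);
    if (needed <= 0) return std::wstring();
    std::wstring dst(static_cast<std::size_t>(needed), L'\0');
    // Second call: convert into the correctly sized buffer.
    const int written = ::MultiByteToWideChar(CP_ACP, 0, src.data(), srcLen, &dst[0], needed);
    assert(written == needed);
    (void)written;
    return dst;
#else
    // Portable stand-in (assumption): locale-based C library conversion.
    std::wstring dst(src.size(), L'\0');
    const std::size_t written = std::mbstowcs(&dst[0], src.c_str(), dst.size());
    if (written == static_cast<std::size_t>(-1)) return std::wstring();
    dst.resize(written);
    return dst;
#endif
}

// Hypothetical helper: wide string to ANSI (active code page).
std::string WideToAnsi(const std::wstring& src) {
    if (src.empty()) return std::string();
#ifdef _WIN32
    const int srcLen = static_cast<int>(src.size());
    const int needed = ::WideCharToMultiByte(CP_ACP, 0, src.data(), srcLen, NULL, 0, NULL, NULL);
    if (needed <= 0) return std::string();
    std::string dst(static_cast<std::size_t>(needed), '\0');
    ::WideCharToMultiByte(CP_ACP, 0, src.data(), srcLen, &dst[0], needed, NULL, NULL);
    return dst;
#else
    // Portable stand-in (assumption): locale-based C library conversion.
    std::string dst(src.size() * MB_CUR_MAX, '\0');
    const std::size_t written = std::wcstombs(&dst[0], src.c_str(), dst.size());
    if (written == static_cast<std::size_t>(-1)) return std::string();
    dst.resize(written);
    return dst;
#endif
}
```

Returning std::wstring and std::string by value keeps buffer ownership out of the caller's hands, which avoids the manual delete[] and SysFreeString bookkeeping seen in the earlier methods.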

Method 2: Use "_T" to turn an ANSI literal into a "generic" string and "L" to turn it into a Unicode string. In the managed C++ environment, you can also use S to turn an ANSI string literal into a String* object. For example:

TCHAR tstr[] = _T("this is a test");
wchar_t wszStr[] = L"this is a test";
String* str = S"this is a test";

Method 3: Use the conversion macros and classes of ATL 7.0. Building on ATL 3.0, ATL 7.0 improves and adds many string conversion macros and provides corresponding classes. They share the unified form shown in Figure 3, CSourceType2[C]DestinationType[EX].

Here the first C stands for "class", which distinguishes these names from the ATL 3.0 macros; the second C stands for constant; 2 stands for "to"; and EX means that a buffer of a specified size is used. SourceType and DestinationType can be A, T, W, or OLE, meaning ANSI, generic, Unicode (wide), and OLE strings respectively. For example, CA2CT converts an ANSI string to a constant generic string. The following is some sample code:

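CA2TEX<16> tstr("this is a test");  // tstr converts implicitly to LPTSTR
CA2CT tcstr("this is a test");      // tcstr converts implicitly to LPCTSTR
wchar_t wszStr[] = L"this is a test";
CW2A chstr(wszStr);                 // chstr converts implicitly to char*
// Keep the conversion objects in named variables: a pointer taken from a
// temporary, as in "char* p = CW2A(wszStr);", dangles when the statement ends.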
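
The lifetime rule for conversion classes can be shown without ATL. In the portable sketch below, DemoW2A is a hypothetical stand-in for a class such as CW2A (it uses the C library's wcstombs instead of the Windows conversion functions), and NarrowCopy is an invented helper that shows the safe pattern of keeping the conversion object in a named variable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for an ATL 7.0 conversion class such as CW2A:
// it owns the converted buffer and exposes it through a conversion operator.
class DemoW2A {
public:
    explicit DemoW2A(const wchar_t* wide) {
        assert(wide != NULL);
        const std::wstring src(wide);
        std::string buffer(src.size() * MB_CUR_MAX + 1, '\0');
        const std::size_t written = std::wcstombs(&buffer[0], src.c_str(), buffer.size());
        if (written != static_cast<std::size_t>(-1)) text_.assign(buffer, 0, written);
    }
    operator const char*() const { return text_.c_str(); }
private:
    std::string text_;  // released when the DemoW2A object is destroyed
};

// Safe pattern: a named object keeps the buffer alive while the pointer is used.
std::string NarrowCopy(const wchar_t* wide) {
    DemoW2A converted(wide);
    const char* p = converted;  // valid for as long as 'converted' exists
    return std::string(p);      // copy the text out before 'converted' goes away
}

// Unsafe pattern, shown only as a comment:
//   const char* p = DemoW2A(L"text");  // the temporary is destroyed here
//   use(p);                            // p now dangles
```

If the converted text has to outlive the conversion object, copying it into an owning string, as NarrowCopy does, is the simplest way to stay safe.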

  VI. Conclusion

Almost every program uses strings, and because Visual C++ .NET is powerful and widely used, conversions between string types come up frequently. This article covers almost all of the conversion methods currently available. For the .NET Framework, you can also use the Convert and Text classes to convert between different data types and character encodings.
