Understanding the magical BSTR Data Type

Source: Internet
Author: User

Http://blog.163.com/pugood@126/blog/static/1344175932009111111526409/

Most COM-aware languages do not work with null-terminated character arrays (Unicode or otherwise). Visual Basic, Java, VBScript, and JScript all expect strings to be prefixed with their length in bytes. The BSTR data type is a Unicode string that is both byte-length-prefixed and null-terminated, so it can be used from every COM-compatible language. Although all of these languages can use a BSTR, each works with it in its own way. A VB programmer creates a BSTR with code like this:

' VB developer making a BSTR.
'
Dim name As String
name = "Fred Flintstone"

As C++ programmers, we create and manipulate BSTRs through a set of COM library functions. Each of these functions has the prefix "Sys" in its name to mark it as a BSTR (system string) operation. Interestingly, BSTR is simply a typedef for OLECHAR*, so at the type level it is just a pointer to an array of OLECHAR characters.

// Behold the BSTR (<wtypes.h>).
typedef OLECHAR* BSTR;
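
The typedef hides what makes a BSTR different from a plain OLECHAR array: the allocator stores a 4-byte length (a byte count that excludes the terminating null) immediately before the characters, and the BSTR pointer points at the first character. The following is a portable sketch that only mimics this layout. The names SketchBstr, sketchAlloc, sketchByteLen, sketchLen, and sketchFree are hypothetical, char16_t stands in for the 16-bit OLECHAR of Win32, and real code would call the Sys functions instead.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for BSTR; char16_t mimics the 16-bit OLECHAR of Win32.
using SketchBstr = char16_t*;

// Allocate [4-byte byte count][characters][terminating null] and return a
// pointer to the first character, the way a BSTR pointer does.
SketchBstr sketchAlloc(const std::u16string& text) {
    const std::uint32_t byteLen =
        static_cast<std::uint32_t>(text.size() * sizeof(char16_t));
    unsigned char* block = static_cast<unsigned char*>(
        std::malloc(sizeof(byteLen) + byteLen + sizeof(char16_t)));
    assert(block != nullptr);
    std::memcpy(block, &byteLen, sizeof(byteLen));                        // length prefix
    std::memcpy(block + sizeof(byteLen), text.data(), byteLen);           // characters
    std::memset(block + sizeof(byteLen) + byteLen, 0, sizeof(char16_t));  // terminating null
    return reinterpret_cast<SketchBstr>(block + sizeof(byteLen));
}

// Read the prefix stored just before the characters (length in bytes).
std::uint32_t sketchByteLen(SketchBstr s) {
    std::uint32_t byteLen = 0;
    std::memcpy(&byteLen,
                reinterpret_cast<unsigned char*>(s) - sizeof(byteLen),
                sizeof(byteLen));
    return byteLen;
}

// Length in characters, derived from the stored byte count (no scanning).
std::uint32_t sketchLen(SketchBstr s) {
    return sketchByteLen(s) / sizeof(char16_t);
}

// Release the whole block, starting at the hidden prefix.
void sketchFree(SketchBstr s) {
    if (s) std::free(reinterpret_cast<unsigned char*>(s) - sizeof(std::uint32_t));
}
```

Because the length is stored rather than found by scanning for a null, a string laid out this way can be measured in constant time and can even contain embedded null characters.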

In practice, however, a BSTR is not handled like an ordinary character array. Let's look at the most commonly used BSTR functions and when to use each of them.

Creating a BSTR in C++

To create a BSTR in C++, call SysAllocString(). This function computes the length of the string and allocates a buffer large enough to hold a copy of it. For example, we can pass in a Unicode string literal and keep the returned value in the bstrName variable:

// SysAllocString() creates a BSTR.
BSTR bstrName;
bstrName = SysAllocString(L"Fred Flintstone");

Of course, in most cases you will initialize a BSTR from a variable rather than from a hard-coded string. You can therefore create a BSTR from an OLECHAR* variable (use the OLESTR macro to make sure the literal has the correct type):

// Create a BSTR from an array of OLECHAR (could be char or wchar_t, depending on the platform).
const OLECHAR* pOleStr;
pOleStr = OLESTR("Fred Flintstone");
BSTR bstrName;
bstrName = SysAllocString(pOleStr);

Working with a BSTR

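Once you have created a BSTR, you may need to change its value later in the program. Use SysReAllocString() to modify an existing BSTR. It releases the previous string, computes the length of the new one, and allocates a buffer holding a copy of it; the BSTR variable, which you pass by address, is updated to refer to the new string: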

// Change the existing bstrName to "Mr. Slate".
SysReAllocString(&bstrName, L"Mr. Slate");

SysStringLen() returns the length of an existing BSTR in characters, not counting the terminating null:

// "Mr. Slate" = 9 characters.
UINT length = SysStringLen(bstrName);

Importantly, every BSTR created with SysAllocString() must be released with SysFreeString(). Any BSTR handed to you by an interface method must also be released with SysFreeString().

// All done with the string.
SysFreeString(bstrName);

Note: If you forget to release a BSTR with SysFreeString(), you leak memory. This is just as serious as allocating memory with new in C++ and forgetting to release it with delete.
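
One common way to avoid forgetting that final call is to let a smart pointer make it. The sketch below wraps a BSTR in std::unique_ptr with a custom deleter. BstrDeleter, UniqueBstr, makeBstr, and demoLength are hypothetical names and not part of the COM API. So that the block also compiles off Windows, it declares simplified stand-ins for the three Sys functions when _WIN32 is not defined (an assumption made purely for illustration); on Windows the real declarations from <oleauto.h> are used.

```cpp
#include <cassert>
#include <cwchar>
#include <memory>

#ifdef _WIN32
#include <windows.h>
#include <oleauto.h>   // real SysAllocString / SysStringLen / SysFreeString
#else
// Simplified stand-ins, for illustration off Windows only (no length prefix).
#include <cstdlib>
typedef wchar_t OLECHAR;
typedef OLECHAR* BSTR;
typedef unsigned int UINT;
inline BSTR SysAllocString(const OLECHAR* src) {
    if (!src) return nullptr;
    const std::size_t n = std::wcslen(src);
    BSTR copy = static_cast<BSTR>(std::malloc((n + 1) * sizeof(OLECHAR)));
    if (copy) std::wmemcpy(copy, src, n + 1);   // copy text and terminator
    return copy;
}
inline UINT SysStringLen(BSTR s) { return s ? static_cast<UINT>(std::wcslen(s)) : 0; }
inline void SysFreeString(BSTR s) { std::free(s); }
#endif

// Deleter that releases the string with SysFreeString().
struct BstrDeleter {
    void operator()(OLECHAR* s) const { SysFreeString(s); }
};

// Owning handle: the BSTR is freed when the handle goes out of scope.
using UniqueBstr = std::unique_ptr<OLECHAR, BstrDeleter>;

// Hypothetical helper: allocate a BSTR and hand ownership to a UniqueBstr.
UniqueBstr makeBstr(const OLECHAR* text) {
    return UniqueBstr(SysAllocString(text));
}

// Usage: no explicit SysFreeString() call is needed.
UINT demoLength() {
    UniqueBstr name = makeBstr(L"Mr. Slate");
    assert(name != nullptr);           // allocation succeeded
    return SysStringLen(name.get());   // length in characters
}   // name is released here by BstrDeleter
```

The design choice is the usual RAII one: the release call lives in exactly one place (the deleter), so early returns and exceptions cannot skip it.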

Additional BSTR Functions

SysAllocString(), SysStringLen(), and SysFreeString() are a good starting point for working with BSTRs, but the BSTR API defines other functions as well. The BSTR functions declared in <oleauto.h> are listed here; the online help describes each one in more detail:

SysAllocString()
Creates a new BSTR.

SysReAllocString()
Reallocates an existing BSTR with a new value.

SysStringLen()
Returns the length of a BSTR in characters.

SysFreeString()
Destroys an existing BSTR.

SysReAllocStringLen()
Reallocates an existing BSTR from a given number of characters.

SysStringByteLen()
Returns the length of a BSTR in bytes. (Win32 only)

SysAllocStringByteLen()
Creates a BSTR from binary data of a given byte length. No ANSI-to-Unicode or Unicode-to-ANSI conversion is performed. (Win32 only)


Unicode to ANSI Conversion

Even if we all agree on BSTR (which maximizes language independence), one problem remains unsolved: the string parameters of Win32 API functions typically appear as ANSI strings. For example, the widely used MessageBox() looks like this:

// This is the MessageBox() function we think we know...
int MessageBox(HWND hWnd, LPCSTR lpText, LPCSTR lpCaption, UINT uType);

Judging by this prototype, it looks as though we must supply two constant character arrays (LPCSTR = long pointer to a constant string). The reality is stranger: the Win32 API has no function that is actually named MessageBox(). Instead, MessageBox (like every Win32 function that takes string parameters) is a macro that resolves to one of two possible functions:

// Every Win32 function that takes text strings has an ANSI (A) and a Unicode (W)
// variation.
#ifdef UNICODE
#define MessageBox MessageBoxW
#else
#define MessageBox MessageBoxA
#endif // !UNICODE
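
The same dispatch pattern can be reproduced in a few lines of portable code, which makes it easier to see that the generic name is nothing more than a macro. In this sketch ShowTextA, ShowTextW, ShowText, and TEXT_LIT are hypothetical names and not Win32 functions (the Win32 headers provide their own TEXT() macro for the literal part). Each variant returns a short label so that the chosen version is visible.

```cpp
#include <cassert>
#include <string>

// Hypothetical ANSI variant: takes narrow characters.
std::string ShowTextA(const char* text) {
    assert(text != nullptr);
    return "A:" + std::to_string(std::char_traits<char>::length(text));
}

// Hypothetical wide variant: takes wide characters.
std::string ShowTextW(const wchar_t* text) {
    assert(text != nullptr);
    return "W:" + std::to_string(std::char_traits<wchar_t>::length(text));
}

// The generic name is only a macro that picks one of the two real functions.
// TEXT_LIT turns a literal into the matching narrow or wide form.
#ifdef UNICODE
#define ShowText ShowTextW
#define TEXT_LIT(s) L##s
#else
#define ShowText ShowTextA
#define TEXT_LIT(s) s
#endif
```

A call such as ShowText(TEXT_LIT("Mr. Slate")) compiles unchanged in both configurations; only the function it ends up calling differs.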

Under Windows NT, when you choose to compile the current project as Unicode, the UNICODE preprocessor symbol is defined (through the Project | Settings menu). In that case every such API name automatically resolves to its wide-character version. For example, MessageBox() resolves to:

// Under Unicode builds, all strings come through as arrays of constant wchar_t.
int MessageBoxW(HWND hWnd, LPCWSTR lpText, LPCWSTR lpCaption, UINT uType);

In a non-Unicode build, MessageBox() resolves to the ANSI version:

// ANSI builds use const char arrays.
int MessageBoxA(HWND hWnd, LPCSTR lpText, LPCSTR lpCaption, UINT uType);

This leaves us with a dilemma. If we choose a Unicode build, the program runs correctly only under Windows NT. If we choose a non-Unicode build, the program runs on every platform, but on a Unicode platform such as Windows NT the system must convert every ANSI string to Unicode, which costs some efficiency.

Conversion Functions

Win32 defines two very powerful functions for converting ANSI to Unicode and Unicode to ANSI. They give you maximum flexibility, but their long parameter lists make them somewhat awkward to use:

• MultiByteToWideChar(): converts an ANSI string to its Unicode equivalent.
• WideCharToMultiByte(): converts a Unicode string to its ANSI equivalent.

Alternatively, the C runtime library provides simple, convenient, and portable conversion functions. To convert a Unicode string (such as a BSTR) to an ANSI string, call wcstombs() (wide-character string to multibyte string):

// size_t wcstombs(char* ansiString, const wchar_t* unicodeString, size_t count);
char buff[MAX_LENGTH];
BSTR bstr = SysAllocString(L"How did strings get so ugly?");
wcstombs(buff, bstr, MAX_LENGTH);  // 3rd parameter = size of the target buffer.
cout << buff << endl;              // Pump to console.
SysFreeString(bstr);

To convert an ANSI string to Unicode, call mbstowcs() (multibyte string to wide-character string):

// Transform an existing char* (ANSI) into a wchar_t* (Unicode).
size_t mbstowcs(wchar_t* unicodeString, const char* ansiString, size_t count);
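
Both C runtime functions can be wrapped so that the buffer is sized for the worst case and failures are reported. The following is a minimal portable sketch: toNarrow and toWide are hypothetical helper names, and std::wstring is used in place of a BSTR so that the code runs on any platform. The result depends on the current C locale; in the default "C" locale only plain ASCII text is guaranteed to convert, and conversion stops at the first embedded null.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: wide -> multibyte using wcstombs().
std::string toNarrow(const std::wstring& wide) {
    // Worst case: MB_CUR_MAX bytes per wide character, plus the terminator.
    std::vector<char> buf(wide.size() * MB_CUR_MAX + 1);
    const std::size_t written = std::wcstombs(buf.data(), wide.c_str(), buf.size());
    if (written == static_cast<std::size_t>(-1))
        throw std::runtime_error("character not representable in the current locale");
    assert(written < buf.size());   // room was left for the terminator
    return std::string(buf.data(), written);
}

// Hypothetical helper: multibyte -> wide using mbstowcs().
std::wstring toWide(const std::string& narrow) {
    // A multibyte string never holds more characters than bytes.
    std::vector<wchar_t> buf(narrow.size() + 1);
    const std::size_t written = std::mbstowcs(buf.data(), narrow.c_str(), buf.size());
    if (written == static_cast<std::size_t>(-1))
        throw std::runtime_error("invalid multibyte sequence");
    assert(written < buf.size());   // room was left for the terminator
    return std::wstring(buf.data(), written);
}
```

Checking the return value matters because both functions return (size_t)-1 when they meet a character they cannot convert, and sizing the buffer with one spare element ensures the result is always null-terminated.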

Once we move from raw COM to ATL, we will have a complete set of conversion macros that simplify character conversion and let us forget the four functions above. ATL's CComBSTR class will likewise spare us the low-level BSTR functions. For now, though, we still have some way to go and need to use these conversion functions.
