Internal implementation of the VBS string

Source: Internet
Author: User
The truncation of VBS strings by Chr(0)① has been much discussed recently, so it seems worth introducing how VBS strings are implemented internally. A note from Demon: this article requires some knowledge of C and Windows programming, so VBScript beginners should proceed with caution.

VBS is built on Microsoft's ActiveX/COM technology. So that COM objects can be used from any language, COM defines a set of common data types, which Microsoft calls Automation data types. One of them is BSTR. VBS represents strings internally as BSTR, which is defined in WTypes.h:

typedef wchar_t WCHAR;
typedef WCHAR OLECHAR;
typedef OLECHAR *BSTR;

As the definition shows, a BSTR is a pointer to wchar_t (the wide, that is Unicode, character type in C), but it is not an ordinary wchar_t pointer. A standard BSTR points to a wchar_t array that carries a length prefix and a NUL terminator. The prefix occupies the 4 bytes immediately before the address the BSTR points to, and its value is the number of bytes in the string, not counting the NUL terminator. See the MSDN documentation for the common BSTR handling functions.

The theory is a bit abstract, so here is some code to explain it:


str = "Hello" & Chr(0) & "World"

This is a very simple piece of VBS code, but what does the VBScript interpreter do internally? In effect, it initializes a BSTR variable (ignoring the string concatenation step):

/* For demonstration only; the actual code is certainly not like this */
BSTR str = SysAllocStringLen(L"Hello\0World", 11);

To get a clearer picture of the structure of a BSTR, let's write it another way:

/* The BSTR includes the length prefix, but actually points to the first character.
 * wchar_t is 16 bits on Windows, so arr[0] and arr[1] together form the 4-byte prefix. */
wchar_t arr[] = {22, 0, L'H', L'e', L'l', L'l', L'o', L'\0', L'W', L'o', L'r', L'l', L'd', L'\0'};
BSTR str = &arr[2];

The structure of this BSTR in memory is:

00000000  16 00 00 00 48 00 65 00 6C 00 6C 00 6F 00 00 00
00000010  57 00 6F 00 72 00 6C 00 64 00 00 00

The first four bytes (16 00 00 00) are the length prefix: 22 bytes, stored little-endian. The BSTR pointer points to offset 0x04 (48 00, the character 'H'). The two bytes at offset 0x0E (00 00) are the Chr(0) character inside the string, and the last two bytes at offset 0x1A (00 00) are the NUL character that ends the BSTR. SysAllocStringLen appends that terminator, and because it is a Unicode character it takes up two bytes. In other words, if you ignore the four bytes in front, a BSTR is a null-terminated wide string in C.
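To reproduce this byte layout by hand, here is a minimal, portable C sketch. It does not call the Windows API: demo_layout and demo_hex are hypothetical helpers invented for this illustration, and uint16_t stands in for the 16-bit wchar_t used on Windows. The prefix, the characters, and the terminator are written out as explicit little-endian bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the BSTR-style layout of `len` 16-bit code
 * units into `out` as explicit little-endian bytes and returns the number
 * of bytes written. `out` must hold at least 4 + 2 * len + 2 bytes. */
size_t demo_layout(const uint16_t *chars, uint32_t len, unsigned char *out)
{
    uint32_t byte_len = len * 2u;   /* the prefix counts bytes, not characters */
    size_t n = 0;

    assert(chars != NULL && out != NULL);
    for (int shift = 0; shift < 32; shift += 8)      /* 4-byte length prefix */
        out[n++] = (unsigned char)((byte_len >> shift) & 0xFFu);
    for (uint32_t i = 0; i < len; i++) {             /* characters, low byte first */
        out[n++] = (unsigned char)(chars[i] & 0xFFu);
        out[n++] = (unsigned char)(chars[i] >> 8);
    }
    memset(out + n, 0, 2);                           /* two-byte NUL terminator */
    return n + 2;
}

/* Hypothetical helper: formats `n` bytes (n >= 1) as space-separated
 * uppercase hex. `text` must hold at least 3 * n bytes. */
void demo_hex(const unsigned char *bytes, size_t n, char *text)
{
    for (size_t i = 0; i < n; i++)
        sprintf(text + 3 * i, i + 1 < n ? "%02X " : "%02X", bytes[i]);
}
```

For the 11 characters of "Hello", Chr(0), "World", demo_layout writes 28 bytes: a 4-byte prefix holding 22, then 22 bytes of characters, then the 2-byte terminator.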

Now look at another piece of VBS code:

MsgBox Len(str)

This uses MsgBox to display the length of the string just defined. What does the VBScript interpreter do internally? Does it work like the C standard library function strlen, walking the string until it reaches NUL and treating that as the end?

/* A simple implementation of the C strlen function */
size_t strlen(const char *str)
{
    const char *eos = str;
    while (*eos++)
        ;
    return (size_t)(eos - str - 1);
}

The answer is obviously no. The string contains Chr(0), so if Len were implemented like strlen, the count would stop at Chr(0) and Len would return 5. In fact it returns the correct value, 11.

Internally, the VBS Len function is probably implemented like this:

/* As above, for demonstration only */
size_t Len(const BSTR str)
{
    return SysStringLen(str);
}

Or, without calling the Windows API: because the 4-byte prefix of a BSTR holds the number of bytes in the string (excluding the trailing NUL character), it is enough to move the pointer:

/* Cast to an int pointer, step back one element to read the prefix, then divide by 2 (one Unicode character is two bytes) */
size_t Len(const BSTR str)
{
    return *((int *)str - 1) / 2;
}

As you can see, because the length of a BSTR can be read from its prefix, NUL is not needed as a string terminator, which means that VBS strings are binary safe.
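The difference between the two ways of measuring length can be sketched in portable C. This is only an illustration under stated assumptions: demo_block, demo_scan_len, and demo_prefix_len are hypothetical names, uint16_t stands in for the 16-bit wchar_t on Windows, and the prefix is read as two 16-bit units (low half first) so the sketch does not depend on the host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical emulated BSTR block for "Hello" & Chr(0) & "World":
 * two 16-bit units hold the 4-byte prefix (22 bytes, low half first),
 * followed by 11 characters and the NUL terminator. */
const uint16_t demo_block[] = {
    22, 0, 'H', 'e', 'l', 'l', 'o', 0, 'W', 'o', 'r', 'l', 'd', 0
};

/* strlen-style scan: stops at the first zero code unit. */
size_t demo_scan_len(const uint16_t *s)
{
    const uint16_t *end = s;

    assert(s != NULL);
    while (*end != 0)
        end++;
    return (size_t)(end - s);
}

/* Prefix-based length: reads the byte count stored in the two units just
 * before the first character and converts it to a character count. */
size_t demo_prefix_len(const uint16_t *s)
{
    uint32_t byte_len;

    assert(s != NULL);
    byte_len = (uint32_t)s[-2] | ((uint32_t)s[-1] << 16);
    return byte_len / sizeof(uint16_t);
}
```

With s = &demo_block[2], the scan stops at the embedded zero and reports 5 characters, while the prefix-based version reports all 11.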

So why does the following code only show Hello?

MsgBox str

This seems to contradict the above, but it does not. VBS strings really can hold the Chr(0) character. MsgBox output is truncated at Chr(0) because MsgBox calls the MessageBox function internally, and that function treats NUL as the string terminator.

/* Simplified to a single parameter.
 * The second parameter of MessageBox is NUL-terminated:
 * "Pointer to a null-terminated string that contains the message to be displayed."
 * So a Chr(0) inside the VBS string truncates the displayed text.
 */
int MsgBox(const BSTR str)
{
    return MessageBoxW(NULL, str, L"", 0);
}

In other words, if a VBS built-in function or a COM component method is implemented on top of a Windows API whose string parameter is NUL-terminated, the string will be truncated at the Chr(0) character.
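The same effect can be imitated without Windows. In the rough sketch below, demo_nul_consumer is a hypothetical stand-in for any API that expects a NUL-terminated wide string, and demo_counted_consumer is a hypothetical stand-in for one that takes an explicit length. Neither is a real Windows function, and uint16_t again stands in for the 16-bit wchar_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical NUL-terminated consumer: copies code units into `out` as
 * ASCII until the first zero, so it never sees anything past Chr(0).
 * Returns the number of characters copied. */
size_t demo_nul_consumer(const uint16_t *text, char *out, size_t cap)
{
    size_t n = 0;

    assert(text != NULL && out != NULL && cap > 0);
    while (text[n] != 0 && n + 1 < cap) {
        out[n] = (char)text[n];
        n++;
    }
    out[n] = '\0';
    return n;
}

/* Hypothetical length-aware consumer: the caller passes the character
 * count, so embedded zeros are kept (rendered here as '.'). */
size_t demo_counted_consumer(const uint16_t *text, size_t len,
                             char *out, size_t cap)
{
    size_t n = 0;

    assert(text != NULL && out != NULL && cap > 0);
    while (n < len && n + 1 < cap) {
        out[n] = text[n] != 0 ? (char)text[n] : '.';
        n++;
    }
    out[n] = '\0';
    return n;
}
```

Given the 11 characters of "Hello", Chr(0), "World", the first function copies only "Hello", while the second keeps every character because it relies on the length it was given rather than on a terminator.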

With this in mind, go back to the articles on where Chr(0) comes from in ASP/VBScript and the security issues around it, such as "ASP upload exploit: using Chr(0) to bypass extension detection", "An ASP defect: the special character Chr(0)", and "Writing ASP pages with Python script". There should be no more doubts about them.

For lack of time I will not go into more detail here. If you want to learn more about COM components, I recommend reading Jeff Glatt's "COM in plain C".

This article is offered as an answer to the question raised by Wind Chimes in the Rain.

Note ①: Chr(0) and NUL are used interchangeably in this article and mean the same thing.

Original: http://demon.tw/programming/vbs-file-unicode.html
