TCHAR, char, and character-array string operations

Source: Internet
Author: User

The following member and initialization code come from a real class:

TCHAR m_illegal_chars[13];

TCHAR temp[13] = {_T('|'),
_T('*'),
_T('\\'),
_T(':'),
_T(';'),
_T('>'),
_T('<'),
_T('?'),
_T('"'),
_T(','),
_T('='),
_T('\'')}; // 12 characters; the 13th element is zero-initialized and acts as the terminator
int i = sizeof(temp) / sizeof(temp[0]); // i == 13 here
_tcscpy_s(m_illegal_chars, i, temp);

The code above works without any problem.

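But the following call fails:

_tcscpy_s(m_illegal_chars, 12, temp); // i replaced with 12

The program raises an exception and crashes. Afterwards m_illegal_chars holds \0*\:;><?",=' instead of |*\:;><?",=' and the first character is lost. Twelve characters plus the terminator need 13 elements, and when the destination is too small _tcscpy_s sets the first element to the null character before reporting the error.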

Explanation:

Pay attention to the small details when using _tcscpy_s inside a class.

The copied string always landed two or three bytes too late, and the first few bytes of the buffer stayed unchanged. At first we suspected the UTF-8 byte-order mark (EF BB BF), but the stray bytes did not match it, and the file was not UTF-8 anyway. Stepping into the call leads to errno_t __cdecl _FUNC_NAME(_CHAR *_DEST, size_t _SIZE, const _CHAR *_SRC).

The _DEST passed to this function had actually moved forward by several bytes. The disassembly view (Alt+8) showed that the address was computed as an offset from this, and that the offset was 4-byte aligned. A bool member is declared before the array, so the aligned offset differs from the expected one by 3 bytes: VS2010 still applied 4-byte alignment, which added the 3 extra bytes.

Members of a struct or class are aligned to byte boundaries.

Several values determine the alignment:

1) The compiler's alignment value. The default depends on the compiler and target (MSVC uses 8 on x86 and 16 on x64), and it can be changed with #pragma pack.

2) A member's own alignment value: the number of bytes the member's type occupies.

3) The class alignment value: the largest alignment value among the class's members. It determines how the class is padded at the end.

4) A member's effective alignment value: the smaller of the member's own alignment value and the compiler's alignment value. It determines the offset at which the member is placed.

You have probably seen the assertion window with the message:

"Buffer is too small"

What causes it? Suppose the string str contains 8 characters plus the string terminator '\0'. It needs a buffer of 9 elements; passing a smaller size triggers this error.

The point of the strcpy_s family of functions is that they never write past the end of the destination buffer.

A very important topic:

Common string handling on the Win32 platform

 

1: Programming with Unicode is strongly recommended.

1. Unicode lets a single EXE or DLL support multiple languages, which simplifies localization.

2. Unicode saves time and memory on Win32 API calls, as described in detail below.

3. It interoperates better with COM components, because COM components only support Unicode.

4. It interoperates better with the .NET Framework.

 

Windows implements the string-related Win32 API functions internally in Unicode. Usually, though, both an ANSI and a Unicode version are provided, such as CreateWindowA and CreateWindowW. Only the latter is actually implemented. The former allocates memory, converts the incoming ANSI string to wide characters, passes it to the latter, waits for the latter to return, frees the memory, and returns. In the header file, Microsoft defines the macro:

#ifdef UNICODE
#define CreateWindow CreateWindowW
#else
#define CreateWindow CreateWindowA
#endif

Whether UNICODE is defined therefore determines which version is called.

So what about the C standard library? Unlike the Win32 API, its string functions are not implemented only in a wide-character version. The library has two real implementations, one for ANSI and one for Unicode, and the programmer selects between them at compile time. For more information, see Chapter 1 of MFC. For example, strlen handles ANSI strings and wcslen handles Unicode strings. How do we decide which one to use? The library's headers define macros:

#ifdef _UNICODE
#define _tcslen wcslen
#else
#define _tcslen strlen
#endif

As the macros show, the C library spells its Unicode identifier with a leading underscore (_UNICODE), while the Windows API headers use UNICODE without one. Most programs need both API functions and library functions, so the rule is: define both UNICODE and _UNICODE, or define neither.

To summarize, here are some principles :)

1. Unicode is recommended.

2. Do not use char or wchar_t directly; use CHAR or WCHAR. Better still, include tchar.h and use TCHAR.

3. To calculate the length of a character array, use sizeof(ArrayName) / sizeof(ArrayName[0]).

4. Define both UNICODE and _UNICODE.

5. Use the Windows API functions MultiByteToWideChar and WideCharToMultiByte to convert between ANSI and Unicode strings.

6. You can also use wcstombs from the C standard library to perform the conversion.

 

 
