When parsing an X.509 digital certificate, the values of fields such as commonName and countryName are sometimes encoded as BMPString, especially when they contain Chinese text. If you print these values with wprintf() in the Windows console, the output is garbled.
To find the cause, take a certificate and look at its countryName. The ASN.1 type is BMPString and the encoding is 0x1E, 0x04, 0x4E, 0x2D, 0x56, 0xFD: the tag 0x1E (BMPString), the length 0x04, and four content bytes holding the value "中国" (China). Looking up the Unicode encoding online confirms that "中国" is {0x4E, 0x2D, 0x56, 0xFD}: 0x4E, 0x2D is the character "中" and 0x56, 0xFD is the character "国". Yet if you put the bytes 0x4E, 0x2D, 0x56, 0xFD into a character array, call setlocale(), and print the array with wprintf(), the output is garbled.
According to information found online, the content (payload) of a BMPString in ASN.1 is Unicode text in UTF-16, with each character occupying two bytes. (Strictly speaking, BMPString is limited to the Basic Multilingual Plane, so every character is a single 16-bit unit.) Which of the two bytes holds the high 8 bits and which holds the low 8 bits differs from one system to another. ASN.1 encodings generally use big-endian order for content bytes, so the bytes {0x4E, 0x2D, 0x56, 0xFD} extracted from the certificate for "中国" are in big-endian order. Intel CPUs are little-endian, and Windows stores wide characters in little-endian order as well, so when big-endian bytes are passed straight to wprintf() they are read as 0x2D4E and 0xFD56 instead of 0x4E2D and 0x56FD, and the output is garbled. (Incidentally, the payload of an ASN.1 UniversalString uses UTF-32, in which each character occupies four bytes.)
The fix is to convert the characters from big-endian to little-endian order before printing them; the output is then no longer garbled. A sample program is given below:
/**************************************************
* Author: Han Wei
* Author's blog: http://blog.csdn.net/henter/
* Date: Oct 30th
* Description: demonstrate how to print BMPString on Windows console
**************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <locale.h>
#include <wchar.h>

/**************************************************
* Function name: InterchangeEndianOrder
* Function: swaps the byte order of every UTF-16 character in a BMPString
* Parameters:
*   bmpstring      [in, out]  the BMPString bytes, modified in place
*   bmpstring_len  [in]       BMPString length, in bytes
* Return value: 0 on success, -1 on failure
* Remark: a BMPString is made up of UTF-16 characters, which are sometimes
*   stored in big-endian order and sometimes in little-endian order. This
*   function reverses whichever order the input is in.
**************************************************/
int InterchangeEndianOrder(unsigned char *bmpstring, unsigned int bmpstring_len)
{
    int i;
    unsigned char *p, temp;

    if ((bmpstring_len % 2) != 0)
    {
#ifdef _DEBUG
        printf("Invalid BMPString byte length: %u.\n", bmpstring_len);
        printf("BMPString byte length must be a multiple of 2!\n");
#endif
        return (-1);
    }

    p = bmpstring;
    for (i = 0; i < (int)(bmpstring_len / 2); i++)
    {
        /* swap the two bytes of one character */
        temp = *p;
        *p = *(p + 1);
        *(p + 1) = temp;
        p += 2;
    }
    return 0;
}

/**************************************************
* Function name: PrintBMPString
* Function: prints a BMPString in the Windows console
* Parameters:
*   bmpstring      [in]  the BMPString bytes, in big-endian order
*   bmpstring_len  [in]  BMPString length, in bytes
* Return value: 0 on success, -1 on failure
**************************************************/
int PrintBMPString(unsigned char *bmpstring, unsigned int bmpstring_len)
{
    unsigned char *buffer;
    unsigned int buffer_len;

    /* The buffer is two bytes longer than the BMPString. The two extra bytes
       hold the UTF-16 string terminator, which is encoded as 0x0, 0x0. */
    buffer_len = bmpstring_len + 2;
    if (!(buffer = (unsigned char *)malloc(buffer_len)))
    {
#ifdef _DEBUG
        printf("malloc() function failed!\n");
#endif
        return (-1);
    }
    memset(buffer, 0, buffer_len);
    memcpy(buffer, bmpstring, bmpstring_len);

    setlocale(LC_ALL, "chs");
    if (InterchangeEndianOrder(buffer, bmpstring_len) != 0)
    {
        free(buffer);
        return (-1);
    }
    wprintf(L"BMPString: %ls\n", (wchar_t *)buffer);
    free(buffer);
    return 0;
}

int main(void)
{
    int error_code;
    /* Unicode encoding of the Chinese string "中国" */
    unsigned char bmpstring_data1[] = {0x4e, 0x2d, 0x56, 0xfd};
    /* Unicode encoding of the English string "User" */
    unsigned char bmpstring_data2[] = {0x0, 0x55, 0x0, 0x73, 0x0, 0x65, 0x0, 0x72};
    wchar_t str[] = L"中国";
    unsigned char *p;
    int i;

    if ((error_code = PrintBMPString(bmpstring_data1, sizeof(bmpstring_data1))) != 0)
    {
        printf("Print BMPString on Windows console failed!\n");
        return (-1);
    }
    if ((error_code = PrintBMPString(bmpstring_data2, sizeof(bmpstring_data2))) != 0)
    {
        printf("Print BMPString on Windows console failed!\n");
        return (-1);
    }

    /* The following shows how Unicode characters are stored in memory on
       Windows. The result shows that each UTF-16 character is stored in
       little-endian order. */
    printf("\n");
    setlocale(LC_ALL, "chs");
    wprintf(L"%ls\n", str);
    p = (unsigned char *)str;
    printf("Wide character length is: %d\n", (int)wcslen(str));
    printf("Unicode encode on Windows platform: ");
    /* dump every byte of the string, not just one byte per character */
    for (i = 0; i < (int)(wcslen(str) * sizeof(wchar_t)); i++)
    {
        printf("0x%x ", *p);
        p++;
    }
    printf("\n");
    system("pause");
    return 0;
}
The output looks like this:
[Figure: the contents of the BMPStrings printed in the Windows console]