Constructing a Unicode Code Table for the GB2312 Chinese Character Library

Source: Internet
Author: User

Embedded systems cannot avoid processing Chinese characters. A common approach, taking a handset that receives a text message as an example, works like this: the message is decoded as UTF-16, each Chinese character's Unicode code is used to find its position in the GB2312 font library, and the corresponding dot-matrix data is then drawn on the screen.
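As a rough illustration of the decoding step, the sketch below turns a raw byte buffer into 16-bit code units. The function name utf16be_to_units is hypothetical, and two things are assumed rather than taken from the article: the payload is big-endian, and it contains only characters that fit in a single 16-bit unit (no surrogate pairs).

```c
#include <stddef.h>

/* Convert a big-endian UTF-16 byte buffer into 16-bit code units.
 * Assumptions (not from the article): big-endian byte order and no
 * surrogate pairs. A trailing odd byte is ignored.
 * Returns the number of code units written to out. */
size_t utf16be_to_units(const unsigned char *bytes, size_t byte_len,
                        unsigned short *out, size_t out_cap)
{
    size_t n = 0;
    size_t i;

    for (i = 0; i + 1 < byte_len && n < out_cap; i += 2) {
        /* High byte first, then low byte. */
        out[n++] = (unsigned short)((bytes[i] << 8) | bytes[i + 1]);
    }
    return n;
}
```

Each resulting code unit is what would then be looked up in the Unicode code table.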

So there must be a way to match a Unicode code to the data of the Chinese font. The most common way is to build a Unicode code table: search the table array for the matching Unicode code, then use the index of the match to fetch the corresponding entry from a second array that holds the font data, and display it.

+----------------------+  look up  +--------------------+  by index  +----------------------+
| Unicode code of a    | ========> | Unicode code table | =========> | Chinese character    | ==> display output
| Chinese character    |           | array              |            | font data array      |
+----------------------+           +--------------------+            +----------------------+
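The lookup step can be sketched as a binary search over the sorted table. This is a minimal illustration, not the author's code: the names uni_table_sample and find_unicode_index are hypothetical, and the four sample entries are simply the first four values of the generated table.

```c
#include <stddef.h>

/* Hypothetical four-entry sample: the first four values of the generated
 * table. A real build would use the full generated array. */
const unsigned short uni_table_sample[] = {
    0x4E00, 0x4E01, 0x4E03, 0x4E07
};

/* Binary search in a table sorted in ascending order.
 * Returns the index of code, or -1 if it is not in the table. */
int find_unicode_index(const unsigned short *table, size_t count,
                       unsigned short code)
{
    size_t lo = 0;
    size_t hi = count;          /* search range is [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid] == code)
            return (int)mid;
        if (table[mid] < code)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```

The returned index then selects the glyph in the font data array, provided that array is stored in the same order as the table. With a 16x16 dot-matrix font, for instance, each glyph takes 32 bytes, so the glyph would start at byte offset index * 32.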

This article briefly describes how to generate the Unicode code table; other Chinese character processing techniques are outside its scope. :)

Use the following two functions to construct a Unicode code table (* Note 1):

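/* Both functions use CP_ACP, the system ANSI code page, so they only give
   GB2312 results when that code page is 936 (Simplified Chinese). */
void UnicodeToGB2312(unsigned char *pOut, unsigned short uData)
{
   /* The flags argument is a DWORD, so pass 0 rather than NULL. */
   WideCharToMultiByte(CP_ACP, 0, (LPCWSTR)&uData, 1, (LPSTR)pOut, sizeof(unsigned short), NULL, NULL);
   return;
}
void Gb2312ToUnicode(unsigned short *pOut, unsigned char *gbBuffer)
{
   MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, (LPCSTR)gbBuffer, 2, (LPWSTR)pOut, 1);
   return;
}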

A simple example follows (a quick piece of code written only to demonstrate how the array is built, so please go easy on it! ^_^):

/*-----------------------------------------------*\
|  GB2312 Unicode Table Constructor               |
|  Author: Spark Song                             |
|  File:   build_uni_table.c                      |
|  Date:   2005-11-18                             |
\*-----------------------------------------------*/
#include <stdio.h>
#include <windows.h>

void UnicodeToGB2312(unsigned char *pOut, unsigned short uData);
void Gb2312ToUnicode(unsigned short *pOut, unsigned char *gbBuffer);
void construct_unicode_table(void);

int main(int argc, char *argv[])
{
    construct_unicode_table();
    return 0;
}

void construct_unicode_table(void)
{
/* Assumed values: GB2312 is a 94 x 94 matrix and its Chinese characters
   occupy rows 16 to 87 (72 rows), which agrees with the sample output. */
#define GB2312_MATRIX   (94)
#define DELTA           (0xA0)
#define FONT_ROW_BEGIN  (16 + DELTA)
#define FONT_ROW_END    (87 + DELTA)
#define FONT_COL_BEGIN  (1 + DELTA)
#define FONT_COL_END    (GB2312_MATRIX + DELTA)
#define FONT_TOTAL      (72 * GB2312_MATRIX)

    int i, j;
    unsigned char chr[2];
    unsigned short uni = 0;
    unsigned short data[FONT_TOTAL] = {0};
    int index = 0;
    unsigned short buf;

    /* Generate the Unicode code table */
    for (i = FONT_ROW_BEGIN; i <= FONT_ROW_END; i++)
        for (j = FONT_COL_BEGIN; j <= FONT_COL_END; j++)
        {
            chr[0] = (unsigned char)i;
            chr[1] = (unsigned char)j;
            Gb2312ToUnicode(&uni, chr);
            data[index] = uni;
            index++;
        }

    /* Sort in ascending order so that binary search can be used */
    for (i = 0; i < index - 1; i++)
        for (j = i + 1; j < index; j++)
            if (data[i] > data[j])
            {
                buf = data[i];
                data[i] = data[j];
                data[j] = buf;
            }

    /* Output to stdout */
    printf("const unsigned short uni_table[]={\n");
    for (i = 0; i < index; i++)
    {
        uni = data[i];
        UnicodeToGB2312(chr, uni);
        printf("  0x%.4X%s/* GB2312 Code: 0x%.2X%.2X ==> Row:%.2d Col:%.2d */\n",
               uni, i == index - 1 ? " " : ", ",
               chr[0], chr[1], chr[0] - DELTA, chr[1] - DELTA);
    }
    printf("};\n");
    return;
}

/* Both conversions rely on CP_ACP being code page 936 (Simplified Chinese). */
void UnicodeToGB2312(unsigned char *pOut, unsigned short uData)
{
    WideCharToMultiByte(CP_ACP, 0, (LPCWSTR)&uData, 1, (LPSTR)pOut, sizeof(unsigned short), NULL, NULL);
    return;
}

void Gb2312ToUnicode(unsigned short *pOut, unsigned char *gbBuffer)
{
    MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, (LPCSTR)gbBuffer, 2, (LPWSTR)pOut, 1);
    return;
}
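The exchange sort in the listing is quadratic, which is acceptable for a one-off generator run. As an alternative sketch, the same ordering can be produced with the standard library's qsort. The names compare_u16 and sort_code_table are hypothetical and not part of the original program.

```c
#include <stdlib.h>

/* Comparator for qsort: orders unsigned short values ascending. */
static int compare_u16(const void *a, const void *b)
{
    unsigned short x = *(const unsigned short *)a;
    unsigned short y = *(const unsigned short *)b;

    /* Yields -1, 0 or 1 without relying on subtraction. */
    return (x > y) - (x < y);
}

/* Sort the code table in ascending order so it can be binary-searched. */
void sort_code_table(unsigned short *data, size_t count)
{
    qsort(data, count, sizeof data[0], compare_u16);
}
```

In the generator, a call such as sort_code_table(data, index) would stand in for the two nested sorting loops.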

After compiling with Visual C++, run the program from a command prompt and redirect its output:

build_uni_table.exe > report.txt

This produces a text file like the following:

const unsigned short uni_table[]={
  0x4E00, /* GB2312 Code: 0xD2BB ==> Row:50 Col:27 */
  0x4E01, /* GB2312 Code: 0xB6A1 ==> Row:22 Col:01 */
  0x4E03, /* GB2312 Code: 0xC6DF ==> Row:38 Col:63 */
  0x4E07, /* GB2312 Code: 0xCDF2 ==> Row:45 Col:82 */
... ...
  0x9F9F, /* GB2312 Code: 0xB9EA ==> Row:25 Col:74 */
  0x9FA0, /* GB2312 Code: 0xD9DF ==> Row:57 Col:63 */
  0xE810, /* GB2312 Code: 0xD7FA ==> Row:55 Col:90 */
  0xE811, /* GB2312 Code: 0xD7FB ==> Row:55 Col:91 */
  0xE812, /* GB2312 Code: 0xD7FC ==> Row:55 Col:92 */
  0xE813, /* GB2312 Code: 0xD7FD ==> Row:55 Col:93 */
  0xE814 /* GB2312 Code: 0xD7FE ==> Row:55 Col:94 */
};
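The Row and Col values in the comments come from simple arithmetic on the two GB2312 bytes: each byte minus 0xA0. The sketch below reproduces that calculation. The function names are hypothetical, and gb2312_hanzi_ordinal additionally assumes a layout of 94 columns per row that starts at row 16, which is the range the generator covers.

```c
#define GB_DELTA            0xA0  /* offset applied to row and column bytes */
#define GB_COLS             94    /* columns per GB2312 row */
#define GB_FIRST_HANZI_ROW  16    /* first row holding Chinese characters */

/* Row number (1-based) taken from the high byte of a GB2312 code. */
int gb2312_row(unsigned short gb)
{
    return (gb >> 8) - GB_DELTA;
}

/* Column number (1-based) taken from the low byte of a GB2312 code. */
int gb2312_col(unsigned short gb)
{
    return (gb & 0xFF) - GB_DELTA;
}

/* Zero-based position of a character when counting from row 16, column 1,
 * in row-major order (an assumed layout, see the paragraph above). */
int gb2312_hanzi_ordinal(unsigned short gb)
{
    return (gb2312_row(gb) - GB_FIRST_HANZI_ROW) * GB_COLS
         + (gb2312_col(gb) - 1);
}
```

For the first table entry, 0xD2BB, this gives 0xD2 - 0xA0 = 50 and 0xBB - 0xA0 = 27, matching Row:50 Col:27 in the output.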

You can then copy the generated array straight into your project code. In fact, development offers plenty of chances to write code that generates code; a coder who does not let code help with their own development is wasting a good skill. :)
