How to Use php to read Chinese character lattice data

Source: Internet
Author: User
Welcome to the Linux community forum, and interact with 2 million technical staff to access the method of reading Chinese character lattice data using php: Here is the basic knowledge: What is a location code? According to GB2312 of the national standard, all Chinese characters and symbols of the national standard form a 94 × 94 matrix. in this square matrix, each row is called a "partition", and each column is called a "bit". Therefore

Welcome to the Linux community forum and interact with 2 million technical staff> go to the php-based Chinese character matrix data reading method: Here is the basic knowledge: What is a location code? According to GB2312 of the national standard, all Chinese characters and symbols of the national standard form a 94 × 94 matrix. in this square matrix, each row is called a "partition", and each column is called a "bit". Therefore

Welcome to the Linux community forum and interact with 2 million technicians>

How to Use php to read Chinese Character dot matrix data:

Below we will first popularize the basic knowledge:

What is a location code?

According to GB2312 of the national standard, all Chinese characters and symbols of the national standard form a 94 × 94 matrix. in this square matrix, each row is called a "area", and each column is called a "bit". Therefore, this square matrix actually forms a Chinese character set with 94 areas (0 1 to 94 area numbers respectively) and 94 places (bits are 01 to 94) in each area. the area code and location code of a Chinese character are simply combined to form the "location code" of the Chinese character ". in the Chinese character location code, the upper two digits are the area code, and the lower two digits are the location code. it can be seen that the Location Code corresponds to Chinese characters or symbols one by one.

What is an internal code?

The inner code of Chinese characters refers to the encoding of Chinese characters in the computer. The inner code is slightly different from the location code. Why not use the inner code as the encoding in the computer? This is because the range of the Chinese character code and bit code is between 1 and 94. If you directly use the field code as the machine internal code, it will conflict with the basic ASCII code. the inner code of Chinese characters is usually related to the computer system used. currently, for most computer systems in China, the internal code of a Chinese character occupies two bytes, namely, the high byte and the low byte. The relationship between these two bytes and the location code is as follows: internal code high = area code + A0H (H indicates hexadecimal) Internal code low = bit code + A0H for example, the Chinese character "ah" is "1601 ″, the area code and bit code are expressed as "1H" in hexadecimal notation, respectively, and their inner code is "B0A1H". b0H indicates the high byte of the inner code, and A1H indicates the low byte of the inner code.

Simplified Chinese National Library (set in 1981, mainland China ). 7445 characters, including 6773 first-level Chinese characters and 3755 second-level Chinese characters. it is encoded in 2 bytes (16-bit binary.

PHP code: returns a string consisting of 0 and 1.

$ Str = "php self-learning network";

$ Font_file_name = "simsun12.fon"; // File Name of the dot matrix Font Library

$ Font_width = 12;/***** word width

$ Font_height = 12; // the height of a single word

$ Start_offset = 0; // offset

$ Fp = fopen ($ font_file_name, "rb");

$ Offset_size = $ font_width * $ font_height/8;

$ String_size = $ font_width * $ font_height;

$ Dot_string = "";

For ($ I = 0; $ I <strlen ($ str); $ I ++ ){

If (ord ($ str {$ I})> 160 ){

// First obtain the location code, and then calculate its position in the location code comment table to obtain the offset of this character in the file.

$ Offset = (ord ($ str {$ I})-0xa1) * 94 + ord ($ str {$ I + 1})-0xa1) * $ offset_size;

$ I ++;

} Else {

$ Offset = (ord ($ str {$ I}) + 156-1) * $ offset_size;

}

// Read its dot matrix data

Fseek ($ fp, $ start_offset + $ offset, SEEK_SET );

$ Bindot = fread ($ fp, $ offset_size );

For ($ j = 0; $ j <$ offset_size; $ j ++ ){

// Convert Binary dot matrix data into strings

$ Dot_string. = sprintf ("% 08b", ord ($ bindot {$ j }));

}

}

Fclose ($ fp );

Echo $ dot_string;

?>

There are two dot matrix font files: one is a 16 × 16 chs16.fon, and the other is a 12 × 12 simsun12.fon. The offset is zero.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.