ASCII code and the internal machine code of Chinese characters:
While working on HDOJ 2030 (Chinese Character Statistics), I saw a post in the discussion area where the original poster said that the ASCII code value of a Chinese character is negative. But according to the textbook, ASCII codes range from 0 to 255 (0-127 is the international standard code, 128-255 is the extended code).
After asking an expert and consulting some references, I learned that Chinese characters do not have ASCII codes at all. Chinese characters are stored using an internal machine code, that is, the ANSI code: the local encoding that the system chooses according to the current region. On mainland China, for example, the ANSI code means the GB family of encodings (GBK). The internal codes differ between regions.
In the computer, the internal code of a Chinese character uses two bytes, and the highest bit of each byte is 1 (as in GB2312). In two's complement the first bit is the sign bit, and 1 means negative, so each byte of a Chinese character's internal code is negative when read as a signed decimal number. Therefore, to count how many Chinese characters an input string contains, it is enough to count how many bytes in the string are less than 0.
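To make the sign-bit argument concrete, here is a minimal sketch. The helper name as_signed_byte is made up for illustration, and the bytes 0xD6 0xD0 are assumed to be the GB2312 encoding of one Chinese character ("中"); the post itself does not name a specific character.

```cpp
// Hypothetical helper: read one raw byte as a signed 8-bit
// two's-complement value, the way a signed char would hold it.
int as_signed_byte(unsigned char b) {
    // Bytes with the highest bit set (128..255) map to -128..-1.
    return b >= 128 ? static_cast<int>(b) - 256 : static_cast<int>(b);
}
```

Under these assumptions, as_signed_byte(0xD6) gives -42 and as_signed_byte(0xD0) gives -48, while the ASCII letter 'A' (0x41) stays at 65.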
Note: one English letter occupies one byte (8 bits).
One Chinese character occupies two bytes (16 bits).
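The byte sizes in the note can be checked with strlen, which counts bytes rather than characters. This is only a sketch: the names kEnglishLetter, kChineseChar and byte_length are invented for illustration, and the Chinese character is written as explicit GB2312 bytes (0xD6 0xD0) so that the result does not depend on the encoding of the source file.

```cpp
#include <cstddef>
#include <cstring>

// One ASCII letter: a single byte.
const char kEnglishLetter[] = "A";
// One Chinese character, written as its two GB2312 bytes.
const char kChineseChar[] = "\xD6\xD0";

// Number of bytes before the terminating NUL; this is a byte count,
// not a character count.
std::size_t byte_length(const char* s) {
    return std::strlen(s);
}
```

Here byte_length(kEnglishLetter) is 1 and byte_length(kChineseChar) is 2, which is why the number of negative bytes has to be halved to get the number of Chinese characters.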
Here is the problem; it is very simple:
Chinese Character Statistics
Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 65536/32768 K (Java/Others)
Total Submission(s): 30201    Accepted Submission(s): 16568
Problem Description
Count the number of Chinese characters in a given text file.
Input
The input first contains an integer n, the number of test cases, followed by n pieces of text.
Output
For each piece of text, output the number of Chinese characters in it; the output of each test case takes one line.
[Hint:] Think about the characteristics of the internal machine code of Chinese characters ~
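Sample Input
2
wahaha! wahaha! This year the festival does not speak to speak only Putonghua wahaha! Wahaha!
's going to take the final exams right now.
(In the original problem the parts other than "wahaha" are Chinese text; they appear here in English translation, and the answers below count the original Chinese characters.)
Sample Output
14
9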
From the above, every byte in the string whose value is less than 0 belongs to a Chinese character.
The code is as follows:
#include <cstdio>
#include <cstring>

int main()
{
    int n;
    char str[1010];
    scanf("%d", &n);
    getchar();  // consume the newline left after n
    while (n--)
    {
        int count = 0;
        // fgets replaces the unsafe gets; the trailing '\n' it keeps is not negative
        fgets(str, sizeof(str), stdin);
        int len = strlen(str);
        for (int i = 0; i < len; i++)
        {
            if (str[i] < 0)  // negative only where char is signed
                count++;
        }
        // A Chinese character takes two bytes, so each one is counted twice
        // when counting negative bytes; divide by two.
        printf("%d\n", count / 2);
    }
    return 0;
}
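The comparison str[i] < 0 relies on plain char being signed, which the C++ standard leaves to the implementation. As a hedged alternative, the sketch below tests the highest bit directly. The function name count_chinese_chars is made up for illustration, and it still assumes, as the post does, that every Chinese character is exactly two bytes with the highest bit set (true for GB2312, but not for every GBK character, and not for UTF-8).

```cpp
#include <cstddef>

// Hypothetical helper: count two-byte Chinese characters in a
// NUL-terminated GB2312-style byte string.
int count_chinese_chars(const char* text) {
    int high_bit_bytes = 0;
    for (std::size_t i = 0; text[i] != '\0'; ++i) {
        // Cast to unsigned char so the test does not depend on
        // whether plain char is signed on this platform.
        if (static_cast<unsigned char>(text[i]) & 0x80) {
            ++high_bit_bytes;
        }
    }
    // Each Chinese character contributes two high-bit bytes.
    return high_bit_bytes / 2;
}
```

For example, a string made of "Hi ", the four bytes 0xD6 0xD0 0xB9 0xFA, and "!" would give 2, and a pure ASCII string gives 0.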
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.