Much open source software has trouble with internationalization and localization. Internationalization is not actually that complicated, but some companies have always preferred to reject international standards in order to increase market share, and that has gradually led to the current situation. A lot of open source software supports UTF encodings such as UTF-8 first and other encodings second (by Gashero). Of course, some software casually prefers ISO-8859-1, or even ASCII. So let's talk about encodings, because Tomcat is one such piece of software.
Common Chinese encodings are GB2312, GBK, GB18030, and so on (not counting those for Traditional Chinese). These are storage encodings, not display encodings. More and more software now uses Unicode as its core display or processing encoding. In its basic two-byte form (UCS-2, or UTF-16 without surrogate pairs), Unicode can represent almost all of the world's written symbols, which makes it a good fit for internationalizing the internals of software.
Some obnoxious companies, however, hide the kernel's Unicode APIs entirely for business reasons and expose only localized encodings to the outside. Microsoft, for example, mainly pushes GBK in its Chinese software.
Another problem is very old software written in C/C++, where a string ends with '\0'. Two-byte Unicode text often contains zero bytes, so such software truncates some strings. UTF-8 was born for this reason: this variable-length encoding can reduce the size of strings, and it also prevents Unicode text from being truncated during storage and transmission.
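The truncation problem can be seen by looking at the raw bytes. The following is a minimal sketch, not taken from the original article: the class name ZeroByteDemo and the helper containsZeroByte are hypothetical, and UTF-16BE is used here as a stand-in for "two-byte Unicode".

```java
import java.nio.charset.StandardCharsets;

final class ZeroByteDemo {
    // Returns true if the array contains a 0x00 byte, which a C-style
    // string routine would treat as the end of the string.
    static boolean containsZeroByte(byte[] bytes) {
        for (byte b : bytes) {
            if (b == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // 'A' followed by the Chinese character U+4E2D.
        String text = "A\u4e2d";
        byte[] utf16 = text.getBytes(StandardCharsets.UTF_16BE);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        System.out.println("UTF-16BE has zero byte: " + containsZeroByte(utf16));
        System.out.println("UTF-8 has zero byte: " + containsZeroByte(utf8));
    }
}
```

In UTF-16BE the letter 'A' becomes the two bytes 00 41, so a C string routine would stop at the very first byte. In UTF-8 a zero byte appears only for the character U+0000 itself.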
As for traditional Western encodings, two are most common. The first is ASCII, whose high bit is always 0, leaving 7 bits to represent the data. The other is ISO-8859-1, which uses 1 byte per character and all 8 bits of that byte.
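The difference between the two ranges can be checked with a small sketch. The class SingleByteDemo and its two helpers are hypothetical names introduced only for illustration; they simply test whether every character fits in 7 bits (ASCII) or in 8 bits (ISO-8859-1).

```java
import java.nio.charset.StandardCharsets;

final class SingleByteDemo {
    // True if every character fits in 7 bits (high bit 0), i.e. is ASCII.
    static boolean isAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x7F) {
                return false;
            }
        }
        return true;
    }

    // True if every character fits in one 8-bit byte, i.e. is ISO-8859-1.
    static boolean isLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String cafe = "caf\u00e9"; // the last character is U+00E9
        System.out.println("ASCII: " + isAscii(cafe));
        System.out.println("ISO-8859-1: " + isLatin1(cafe));
        // One byte per character in ISO-8859-1.
        System.out.println(cafe.getBytes(StandardCharsets.ISO_8859_1).length);
    }
}
```

Chinese characters fall outside both ranges, which is why neither encoding can carry them directly.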
Tomcat's built-in encoding is ISO-8859-1. This is the most important sentence in this article and the basis of all the techniques that follow.
It follows that data submitted from a web page has to be decoded from ISO-8859-1. The following example reads the submitted parameter named number.
String number = new String(request.getParameter("number").getBytes("ISO-8859-1"), "UTF-8");
This statement converts the value from ISO-8859-1 to UTF-8: it takes back the raw bytes that Tomcat decoded as ISO-8859-1 and decodes them again as UTF-8.
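The conversion can be tried outside a servlet container. The sketch below is an illustration only: decodeAsLatin1 is a hypothetical stand-in for what the container does to the raw request bytes, and it assumes the page submitted the parameter as UTF-8. The recoverUtf8 helper applies the same conversion as the statement above.

```java
import java.nio.charset.StandardCharsets;

final class ParameterRecoveryDemo {
    // Stand-in for the container: decodes raw request bytes as ISO-8859-1.
    static String decodeAsLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Turns the mis-decoded string back into its original bytes,
    // then decodes those bytes as UTF-8.
    static String recoverUtf8(String misdecoded) {
        return new String(misdecoded.getBytes(StandardCharsets.ISO_8859_1),
                          StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // the two characters of "Chinese"
        byte[] submitted = original.getBytes(StandardCharsets.UTF_8);
        String seenByPage = decodeAsLatin1(submitted);
        String recovered = recoverUtf8(seenByPage);
        System.out.println("characters seen by the page: " + seenByPage.length());
        System.out.println("recovered correctly: " + recovered.equals(original));
    }
}
```

The round trip is lossless because ISO-8859-1 maps each of the 256 byte values to exactly one character, so no byte is lost when the string is turned back into bytes.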
Experienced readers have probably run into errors when passing parameters between pages through a tag. Whatever is sent, the string received is always a bunch of question marks. This too is caused by Tomcat's internal encoding. If you adapt to Tomcat's internal encoding, you can pass Chinese strings.
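One plausible way the question marks arise can be reproduced in plain Java. This is a sketch under the assumption that the Chinese string gets encoded as ISO-8859-1 somewhere along the way; the class QuestionMarkDemo and the helper throughLatin1 are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

final class QuestionMarkDemo {
    // Encodes the string as ISO-8859-1 and decodes it again.
    // Characters that ISO-8859-1 cannot represent are replaced
    // by '?' during encoding, so they cannot be recovered.
    static String throughLatin1(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(throughLatin1("\u4e2d\u6587")); // two Chinese characters
        System.out.println(throughLatin1("abc"));
    }
}
```

Once a character has been replaced by a question mark, no later conversion on the receiving page can bring it back, so the fix has to happen before the string is passed on.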
For example, an internal redirect between pages:
"/>
When the destination page receives the parameter, it likewise needs to convert it from ISO-8859-1 to UTF-8. This makes it possible to pass Chinese parameters between pages.
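Both sides of such a transfer can be sketched in plain Java. This is an interpretation rather than the article's exact code: it assumes that "adapting to Tomcat's internal encoding" means the sending page re-expresses the string's UTF-8 bytes as ISO-8859-1 characters before passing it on. The class ForwardParamDemo and both helper names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class ForwardParamDemo {
    // Sending page (assumed): re-express the UTF-8 bytes of the string as
    // ISO-8859-1 characters, so every character is in the range 0x00-0xFF.
    static String toContainerForm(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8),
                          StandardCharsets.ISO_8859_1);
    }

    // Destination page: convert from ISO-8859-1 back to UTF-8.
    static String fromContainerForm(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1),
                          StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters
        String passed = toContainerForm(original);
        String received = fromContainerForm(passed);
        System.out.println("round trip ok: " + received.equals(original));
    }
}
```

Because the string produced by toContainerForm contains only characters that ISO-8859-1 can represent, it is not turned into question marks on the way, and the destination page can restore the original text.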