new String(request.getParameter("newdefrayitem").getBytes("ISO-8859-1"), "GBK")

Source: Internet
Author: User
Tags: character set, Tomcat

Whatever encoding I used, the text came out garbled. I tried request.setCharacterEncoding("UTF-8"); several times.

I also tried String newdefrayitem = new String(request.getParameter("newdefrayitem").getBytes("ISO-8859-1"), "GBK"); and changed the encodings in it several times. What finally worked was exactly String newdefrayitem = new String(request.getParameter("newdefrayitem").getBytes("ISO-8859-1"), "GBK"); I then looked up the principle behind it on the Internet and am reposting it here.

By default, Tomcat (at least the older versions discussed here) treats everything as ISO-8859-1: regardless of the encoding your page uses, Tomcat ends up decoding all incoming characters as ISO-8859-1 for you. When the target page then treats that text as GBK, it is the wrongly decoded characters that get interpreted, and the text comes out garbled.

So we first need to turn the "characters" (whatever they look like) back into a byte array using ISO-8859-1, which gives us the bytes exactly as they were before Tomcat decoded them. For example, AB is represented as [65, 66]. Then we decode this array as GBK to turn it into a string.
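The reason this step loses nothing can be sketched as follows: ISO-8859-1 maps each of the 256 possible byte values to exactly one character, so decoding arbitrary bytes with it and encoding the result again returns the same bytes. The class name Latin1RoundTrip and the method roundTrip are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTrip {
    // Decode bytes as ISO-8859-1, then encode the resulting string as ISO-8859-1 again.
    static byte[] roundTrip(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.ISO_8859_1);
        return decoded.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "AB" is the byte array [65, 66] in ISO-8859-1.
        System.out.println(Arrays.toString("AB".getBytes(StandardCharsets.ISO_8859_1)));

        // Every possible byte value survives the round trip unchanged.
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i;
        }
        System.out.println(Arrays.equals(all, roundTrip(all))); // true
    }
}
```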

This gives us the following conversion process.
Suppose the page is GBK and the text is "你": GBK encoding of "你" (bytes C4 E3) -> URL encoding turns it into %C4%E3 -> Tomcat automatically decodes it once as ISO-8859-1 for you -> the receiving page gets a two-character string with code points C4 and E3 (each byte taken as one ISO-8859-1 character) -> getBytes("ISO-8859-1") gives back the byte array [C4, E3] -> decoding that array as GBK gives "你" again.
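The chain above can be tried outside a container. This is only a sketch under some assumptions: it needs Java 10 or later (for the Charset overloads of URLEncoder and URLDecoder), a JDK that ships the GBK charset, and URLDecoder stands in for what the server does. The class and method names are invented for the example.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ConversionChain {
    // What the browser sends: the text URL-encoded using the page's charset.
    static String encodeOnClient(String text, Charset pageCharset) {
        return URLEncoder.encode(text, pageCharset);
    }

    // What a server that assumes ISO-8859-1 hands to the application.
    static String decodeOnServer(String urlEncoded) {
        return URLDecoder.decode(urlEncoded, StandardCharsets.ISO_8859_1);
    }

    // The repair: recover the raw bytes, then decode them with the real charset.
    static String repair(String garbled, Charset pageCharset) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), pageCharset);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        String original = "\u4f60"; // the character 你
        String sent = encodeOnClient(original, gbk);
        String received = decodeOnServer(sent);
        String repaired = repair(received, gbk);
        System.out.println(sent);                      // %C4%E3
        System.out.println(received.equals(original)); // false: still garbled
        System.out.println(repaired.equals(original)); // true
    }
}
```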

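Apart from Unicode encodings such as UTF-16, the code values defined by different character sets overlap: the same number stands for different characters in different character sets.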

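For example, take the Chinese character for "I" (我) and assume its value is 22530 (only an assumption, I did not look up the real value).
In a Japanese character set, 22530 could just as well be the value of some Japanese character (again an assumption), and in a Korean one the value of some Korean character.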

A value wider than one byte cannot be sent over the network as it is, because the lowest layers of the network only know unsigned char, which corresponds to byte in Java. So
the int 22530 has to be converted to a byte array:

byte[] bytes = new byte[2];
bytes[0] = (byte) ((22530 >> 8) & 0xFF);
bytes[1] = (byte) (22530 & 0xFF);
I have not worked out the actual values, so let us assume the result is the byte array [125, 231].
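For readers who want the real numbers, here is a small sketch of the split and its inverse. The class and method names are hypothetical. For 22530 the two bytes actually come out as [88, 2]; the [125, 231] in the text is a placeholder that the rest of the discussion keeps using.

```java
import java.util.Arrays;

public class IntToBytes {
    // Split the low 16 bits of value into two bytes, high byte first.
    static byte[] toTwoBytes(int value) {
        byte[] bytes = new byte[2];
        bytes[0] = (byte) ((value >> 8) & 0xFF);
        bytes[1] = (byte) (value & 0xFF);
        return bytes;
    }

    // Reassemble the two bytes into the original value.
    static int fromTwoBytes(byte[] bytes) {
        return ((bytes[0] & 0xFF) << 8) | (bytes[1] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] bytes = toTwoBytes(22530);
        System.out.println(Arrays.toString(bytes)); // [88, 2]
        System.out.println(fromTwoBytes(bytes));    // 22530
    }
}
```

Note that byte is signed in Java, so a value such as 231 would print as -25; masking with & 0xFF, as in fromTwoBytes, recovers the unsigned value.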

These bytes, once sent to the server, could stand for the Chinese character for "I" (我), for some Japanese character, or for something else entirely.
Communication protocols generally state the character set. HTTP, for example, tells the server in the request:
Content-Type: xxxxxxxxxx; charset=GBK
The server then knows that the [125, 231] it has just received is the GBK character for "I" and not some other character.

That is the standard communication process. However, if a careless programmer submits a request without telling the server the character set, the server has no way of knowing it.
It has to guess, falling back on the most commonly used character set as a default.
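The two cases, a declared character set and a guessed default, can be sketched like this. It is a simplified illustration and not how any particular server parses headers; the class name, the method names and the choice of ISO-8859-1 as the fallback are assumptions made for the example, and the GBK charset must be available in the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class CharsetFromHeader {
    // Hypothetical fallback used when the request declares no charset.
    static final Charset DEFAULT = StandardCharsets.ISO_8859_1;

    // Extract the charset parameter from a Content-Type value, or fall back to DEFAULT.
    static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim().replace("\"", "");
                    return Charset.forName(name);
                }
            }
        }
        return DEFAULT;
    }

    // Decode the request body with the declared charset, or with the guessed default.
    static String decodeBody(byte[] body, String contentType) {
        return new String(body, charsetOf(contentType));
    }

    public static void main(String[] args) {
        String text = "\u4f60"; // the character 你
        byte[] body = text.getBytes(Charset.forName("GBK"));
        System.out.println(decodeBody(body, "text/plain; charset=GBK").equals(text)); // true
        System.out.println(decodeBody(body, "text/plain").equals(text));              // false: wrong guess
    }
}
```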

That much is tolerable. What really hurts is when the programmer who wrote the server is short on both skill and perspective. Take the programmers of the old Tomcat versions: they grew up in the West and assumed that the whole world gets by with 26 letters plus a few symbols, so whatever the client submitted was decoded as ISO-8859-1. The results can be imagined.

There is nothing to be done about it: those of us who use GBK did not write Tomcat. So we have to take the string that was produced by that mistake, use ISO-8859-1 to restore the original bytes
[125, 231], and then decode them again as GBK to generate the correct string.

That is what the expression is for: it takes the characters obtained on the server and regenerates the string from them using GBK.
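As a reusable form of the same idea, here is a hedged sketch of a helper (the name recode and its parameters are hypothetical). It also shows a limit of the trick: it only works when the wrongly applied charset preserves every byte, as ISO-8859-1 does. With US-ASCII as the wrong charset, for instance, the original bytes are already lost. The GBK charset is assumed to be available in the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Recode {
    // Undo a wrong decode: turn the string back into bytes with the charset that was
    // wrongly applied, then decode those bytes with the charset the sender really used.
    static String recode(String garbled, Charset wronglyUsed, Charset actual) {
        return new String(garbled.getBytes(wronglyUsed), actual);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        String original = "\u4f60\u597d"; // the text 你好
        byte[] sent = original.getBytes(gbk);

        // ISO-8859-1 keeps every byte, so the damage can be undone.
        String garbled = new String(sent, StandardCharsets.ISO_8859_1);
        System.out.println(recode(garbled, StandardCharsets.ISO_8859_1, gbk).equals(original)); // true

        // US-ASCII cannot represent bytes above 0x7F, so the original bytes are gone.
        String lost = new String(sent, StandardCharsets.US_ASCII);
        System.out.println(recode(lost, StandardCharsets.US_ASCII, gbk).equals(original)); // false
    }
}
```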


The following example simulates the whole process:

public class Getluanma {
    public static void main(String[] args) throws Exception {
        System.out.println("\t------ JSP simulation ------");
        System.out.println("The client has a request containing Chinese characters; it is converted to a byte sequence and sent to the server.");
        String request = "\u8bf7\u6c42"; // the Chinese word for "request" (请求)
        // Byte sequence sent by the client; the charset is named explicitly
        // so the result does not depend on the platform default.
        byte[] client = request.getBytes("UTF-8");
        print(client);
        System.out.println(); // line break after the hex dump
        System.out.println("A piece of middleware decodes the byte sequence with its default encoding (ISO-8859-1).");
        String server = new String(client, "ISO-8859-1");
        System.out.println(server);

        /* System.out.println("haha: " + new String(server.getBytes("ISO-8859-1")));
        print(new String(server.getBytes("ISO-8859-1")).getBytes());
        System.out.println(); */

        System.out.println("The programmer notices that the Chinese text is garbled and sets out to fix it.");
        // Restore the byte sequence, then decode it again as UTF-8.
        String debug = new String(server.getBytes("ISO-8859-1"), "UTF-8");
        System.out.println(debug);
        System.out.println("Problem solved.");
    }

    // Prints the byte sequence as hexadecimal values separated by spaces.
    public static void print(byte[] b) {
        for (byte b1 : b) {
            System.out.print(Integer.toHexString(b1 & 0xff) + " ");
        }
    }
}



