1. Why are there garbled characters?
Garbled characters ultimately come from the relationship between bytes and characters: the same bytes produce different text depending on the encoding used to read them.
When we were studying C in college, the teacher introduced characters and bytes.
A byte is 8 bits. The earliest encoding, ASCII, is a single-byte encoding. Because a single 8-bit byte is not enough to represent Chinese characters or the scripts of many other countries, more bits are needed per character. Common encodings include GBK, Big5, GB2312 and UTF-8. The mapping between bytes and characters is defined by each encoding's mapping table.
An application converts the text to be displayed into bytes on the server and sends them to the browser, which assembles the byte stream back into characters. The process is generally as follows (the application encoding is GBK):
Server: String text -> getBytes("GBK") -> byte[] bytes
Browser: byte[] bytes -> new String(bytes, "GBK") -> String text
If the encoding used to decode the byte stream differs from the one used to encode it, garbled characters may occur.
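The server/browser flow above can be sketched in plain Java. This is a minimal illustration, not the code of any real server or browser: the class name `CharsetRoundTrip` and the method `roundTrip` are hypothetical, and the sample text is "中文" written as Unicode escapes so that the encoding of the source file itself cannot interfere.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetRoundTrip {
    static final Charset GBK = Charset.forName("GBK");

    // Encode text with one charset and decode the resulting bytes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith); // "server" side: String -> byte[]
        return new String(bytes, decodeWith);     // "browser" side: byte[] -> String
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587";
        // Same charset on both sides: the text survives.
        System.out.println(text.equals(roundTrip(text, GBK, GBK)));                    // true
        // GBK bytes read as UTF-8: the text is garbled.
        System.out.println(text.equals(roundTrip(text, GBK, StandardCharsets.UTF_8))); // false
    }
}
```

The only thing that changes between the two calls is the charset used for decoding, which is exactly the inconsistency described above.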
2. Common garbled characters
A. Garbled characters when submitting a form (the page encoding is inconsistent with the server encoding)
This often happens when the page is a JSP. It is easy to find and fix: you only need to change the page directive at the top of the JSP file.
<%@ page contentType="text/html; charset=gb2312" language="java" %>
B. Garbled characters in interface calls between systems (if the two applications use different encodings, both GET and POST requests may be garbled)
We ran into this today: System A calls our HTTP interface and the data it submits arrives garbled, because the two systems use different encodings. Our application uses GBK and the other side uses UTF-8.
Solution:
The caller needs to specify the encoding of its HTTP requests so that it matches ours.
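As a rough sketch of what this can look like on the calling side, the snippet below builds a form body percent-encoded as GBK and shows a matching Content-Type value. The class `GbkFormBody`, the method `formField` and the field name `name` are hypothetical; the original post does not show how System A builds its requests, and whether the receiving container honors the `charset` parameter depends on its configuration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class GbkFormBody {
    // Header value the caller could send so the receiver knows the body charset.
    static final String CONTENT_TYPE = "application/x-www-form-urlencoded; charset=GBK";

    // Percent-encode one form field using GBK instead of the caller's default UTF-8.
    static String formField(String name, String value) {
        try {
            return URLEncoder.encode(name, "GBK") + "=" + URLEncoder.encode(value, "GBK");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("GBK is not available on this JVM", e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String value = "\u4e2d\u6587"; // "中文" as Unicode escapes
        System.out.println(formField("name", value));           // name=%D6%D0%CE%C4
        System.out.println(URLEncoder.encode(value, "UTF-8"));  // %E4%B8%AD%E6%96%87
    }
}
```

The two printed lines show why the mismatch matters: the same two characters become four bytes in GBK and six bytes in UTF-8, so a GBK receiver cannot read what a UTF-8 sender produced.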
C. Garbled characters when interacting with the front end (GET requests within the same application)
When a page script sends Chinese text to the backend, garbled characters appear, because different browsers encode Chinese text differently. The backend may try to repair the value with new String("garbled".getBytes("GBK"), "UTF-8"),
but the restored string comes back as question marks.
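The question marks appear because decoding with the wrong charset can destroy information before the repair is even attempted. The sketch below reproduces that with a single character whose three UTF-8 bytes cannot be read cleanly as GBK. The class `LossyRecovery` and the method `misdecodeAndRecover` are hypothetical, and the ISO-8859-1 comparison is added only as a contrast (it is not from the original post): that charset maps every byte to a character, so the same repair trick is lossless there.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class LossyRecovery {
    static final Charset GBK = Charset.forName("GBK");

    // Decode UTF-8 bytes with the wrong charset, then try to undo the damage
    // by re-encoding with that same charset and decoding as UTF-8.
    static String misdecodeAndRecover(String original, Charset wrong) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        String garbled = new String(utf8, wrong); // what the backend actually received
        return new String(garbled.getBytes(wrong), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d"; // one character, three UTF-8 bytes: E4 B8 AD
        // GBK cannot decode the dangling third byte, so it is replaced and lost.
        System.out.println(original.equals(misdecodeAndRecover(original, GBK)));                         // false
        // ISO-8859-1 keeps every byte, so the round trip succeeds.
        System.out.println(original.equals(misdecodeAndRecover(original, StandardCharsets.ISO_8859_1))); // true
    }
}
```

Once a byte has been replaced during the first wrong decode, re-encoding can only produce `?`, which is why fixing the encoding at the source is more reliable than repairing strings afterwards.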
The simplest front-end fix is to have the JavaScript encode Chinese text with encodeURI before sending it to the backend, and to decode it once it reaches the server.
(Tomcat decodes once by default, so the JavaScript sometimes encodes the Chinese text twice.)
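The double-encoding trick can be simulated on the Java side. This is only a sketch: `URLEncoder` stands in for the browser's encodeURI (for Chinese characters both produce the same UTF-8 percent-escapes, though they differ for some ASCII punctuation), and the class `DoubleEncoding` with its `encode` and `decode` helpers is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class DoubleEncoding {
    // Percent-encode as UTF-8, standing in for the browser-side encoding step.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Percent-decode as UTF-8, as the application would do explicitly.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String once = encode("\u4e2d\u6587");   // %E4%B8%AD%E6%96%87
        String twice = encode(once);            // %25E4%25B8%25AD%25E6%2596%2587
        String afterContainer = decode(twice);  // the container's single decode
        System.out.println(afterContainer.equals(once));                      // true
        System.out.println(decode(afterContainer).equals("\u4e2d\u6587"));    // true
    }
}
```

After the container's decode the value is still pure ASCII, so it survives whatever charset the container uses, and the application can then decode it as UTF-8 itself.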
Another fix for garbled GET parameters is to set the container's URIEncoding.
JBoss: modify /server/default/deploy/jbossweb.sar/server.xml
<Connector port="6666" address="${jboss.bind.address}" maxThreads="150" minSpareThreads="25" maxSpareThreads="75" enableLookups="false" redirectPort="8443" acceptCount="100" connectionTimeout="20000" disableUploadTimeout="true" URIEncoding="GBK" />
Tomcat: /conf/server.xml
<Connector connectionTimeout="20000" port="8080" protocol="HTTP/1.1" redirectPort="8443" URIEncoding="UTF-8" />
https://www.ibm.com/developerworks/cn/java/j-lo-chinesecoding/
http://www.cnblogs.com/iusmile/archive/2012/06/01/2531262.html