You have probably run into garbled text (mojibake) more than once during website development. I hit it again recently while writing a small blog example. The problem is common and widely searched, and solutions are easy to find, but most of them are simply pasted around the Internet without any explanation of the actual cause. I spent the last two days reading articles about garbled text and now have a rough idea of what is going on, so I am writing it down. I will go through the garbled-text problems I encountered one by one, analyze their causes, and give solutions.
If you have questions or spot mistakes, please point them out! One person's understanding is often limited, or even wrong. Discussing it together will make the analysis more correct and complete.
Let's look at a piece of code first and then explain its parts one by one:
<%@ page language="java" pageEncoding="gb2312" %>
<%@ page contentType="text/html; charset=iso8859-1" %>
<html>
First, <%@ page language="java" pageEncoding="gb2312" %> specifies the page encoding, that is, the encoding in which the JSP file itself is stored. Eclipse saves the file in this encoding, and the container uses it to read the file, Chinese characters included, when it compiles the JSP.
The second directive sets the encoding used to decode the page. A file stored as gb2312 but decoded as iso8859-1 is certain to show garbled Chinese, so the two must be consistent. If neither encoding is specified, iso8859-1 is used by default, and Chinese text is garbled in that case as well.
This second directive determines how the characters of the JSP page are delivered for display, that is, the encoding of the final web page content.
Note: if <%@ page language="java" pageEncoding="" %> is specified but <%@ page contentType="text/html; charset=" %> is not, then under the JSP 2.0 specification the response uses the pageEncoding value, so both end up in the same encoding; iso8859-1 is the default only when neither is specified. Setting both explicitly to the same value is the safest choice.
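The mismatch described above can be reproduced in plain Java without any JSP. The following is only a minimal sketch: the class and method names are hypothetical, and it assumes the JDK in use ships the GB2312 charset (standard desktop and server JDKs normally do). It stores two Chinese characters as gb2312 bytes and reads them back once with the same charset and once as iso8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; nothing here comes from the blog's own project.
final class EncodingMismatchDemo {

    // Encode the text with one charset, then decode the bytes with another.
    static String storeThenRead(String text, Charset storedAs, Charset readAs) {
        byte[] stored = text.getBytes(storedAs);
        return new String(stored, readAs);
    }

    public static void main(String[] args) {
        // Assumption: the GB2312 charset is available in this JDK.
        Charset gb2312 = Charset.forName("GB2312");
        // Two Chinese characters, written as Unicode escapes so the
        // source file's own encoding cannot interfere with the demo.
        String chinese = "\u4e2d\u6587";

        String consistent = storeThenRead(chinese, gb2312, gb2312);
        String mismatched = storeThenRead(chinese, gb2312, StandardCharsets.ISO_8859_1);

        System.out.println("consistent equals original: " + consistent.equals(chinese));
        System.out.println("mismatched equals original: " + mismatched.equals(chinese));
        System.out.println("mismatched length: " + mismatched.length());
    }
}
```

With matching charsets the text comes back unchanged. With the mismatch, each of the four stored bytes turns into its own Latin-1 character, which is the kind of garbling a browser shows.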
The third encoding controls how the browser decodes the page. If the preceding encodings are all consistent and correct, you do not need to set it. Some web pages are garbled because the browser cannot determine which encoding to use: a page is sometimes embedded in another page, the browser gets the encodings mixed up, and garbled characters appear.
Note: <%@ page contentType="text/html; charset=" %> also tells the browser which encoding to use. If it is specified in the JSP page, <meta http-equiv="Content-Type" content="text/html; charset="> has no effect; the meta tag only takes effect when the JSP has no <%@ page contentType="text/html; charset=" %>. In other words, the contentType directive takes priority.
Now for the main part:
Scenario 1:
The JSP page submits a form (method="post"), and the servlet reads the submitted value with request.getParameter("parameter name"). Garbled characters often occur here.
The example below handles this correctly. Pay attention to the key points.
Key points: charset="UTF-8" in the JSP and value = new String(value.getBytes("iso8859_1"), "UTF-8"); in Chinese.java. The character set named in both places must be the same.
index.jsp
<%@ page contentType="text/html; charset=UTF-8" language="java" import="java.sql.*" errorPage="" %>
<html>
ConsumerServlet.java
// Check the submitted account and password
public void checkConsumer(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    // Transcode the parameter before using it
    String account = Chinese.toChinese(request.getParameter("account"));
    ConsumerDao consumerDao = new ConsumerDao();
    ConsumerForm consumerForm = consumerDao.getConsumerForm(account);
    if (consumerForm == null) {
        request.setAttribute("information", "The user name you entered does not exist. Please enter it again!");
    } else if (!consumerForm.getPassword().equals(request.getParameter("password"))) {
        request.setAttribute("information", "The logon password you entered is incorrect. Please enter it again!");
    } else {
        // Verification passed
        request.setAttribute("form", consumerForm);
    }
    RequestDispatcher requestDispatcher = request.getRequestDispatcher("dealwith.jsp");
    requestDispatcher.forward(request, response);
}
Chinese.java (the key part)
public class Chinese {
    public static String toChinese(String value) {
        try {
            if (value == null) {
                return "";
            } else {
                // Turn the value back into bytes using iso8859_1,
                // then decode those bytes as UTF-8
                value = new String(value.getBytes("iso8859_1"), "UTF-8");
                return value;
            }
        } catch (Exception e) {
            return "";
        }
    }
}
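For comparison, here is a sketch of the same idea written with the Charset-based overloads, String.getBytes(Charset) and the String(byte[], Charset) constructor, which do not throw a checked exception, so the try/catch is not needed. The class and method names are hypothetical, and this is an alternative way to write the helper, not the code used in the example project.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical alternative to the Chinese class; names are illustrative only.
final class Transcoder {

    // Reinterpret a string that was wrongly decoded as ISO-8859-1 as UTF-8 text.
    static String latin1ToUtf8(String value) {
        if (value == null) {
            return "";
        }
        // Recover the raw bytes the container received ...
        byte[] rawBytes = value.getBytes(StandardCharsets.ISO_8859_1);
        // ... and decode them with the charset the page actually used.
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two Chinese characters written as Unicode escapes.
        String original = "\u7528\u6237";
        // Simulate the container decoding UTF-8 request bytes as ISO-8859-1.
        String garbled = new String(original.getBytes(StandardCharsets.UTF_8),
                                    StandardCharsets.ISO_8859_1);
        System.out.println("garbled equals original:  " + garbled.equals(original));
        System.out.println("repaired equals original: " + latin1ToUtf8(garbled).equals(original));
    }
}
```

The main method only simulates the wrong decoding; in a servlet the garbled string would be whatever request.getParameter returned.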
This class does the transcoding. The getBytes call and the String constructor it uses may be unfamiliar, so here are their descriptions from the Java API documentation:
public byte[] getBytes(String charsetName)
        throws UnsupportedEncodingException
Encodes this String into a sequence of bytes using the named charset, and stores the result in a new byte array.
The behavior of this method when this string cannot be encoded in the given charset is unspecified. Use the CharsetEncoder class when more control over the encoding process is required.
Parameters:
charsetName - the name of a supported charset
Returns:
the resulting byte array
Throws:
UnsupportedEncodingException - if the named charset is not supported
public String(byte[] bytes, String charsetName)
        throws UnsupportedEncodingException
Constructs a new String by decoding the specified array of bytes using the specified charset. The length of the new String is a function of the charset, and hence may not be equal to the length of the byte array.
The behavior of this constructor when the given bytes are not valid in the given charset is unspecified. Use the CharsetDecoder class when more control over the decoding process is required.
Parameters:
bytes - the bytes to be decoded into characters
charsetName - the name of a supported charset
Throws:
UnsupportedEncodingException - if the named charset is not supported
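The remark that the length of the new String depends on the charset is easy to check. The sketch below (hypothetical class and method names, standard charsets only) encodes the same two Chinese characters with two charsets and then decodes one byte array in two different ways.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class illustrating the two API methods quoted above.
final class ByteLengthDemo {

    // Number of bytes the text occupies when encoded with the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // Number of chars produced when the bytes are decoded with the given charset.
    static int decodedLength(byte[] bytes, Charset charset) {
        return new String(bytes, charset).length();
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters as Unicode escapes

        // The same string needs a different number of bytes per charset.
        System.out.println("UTF-8 bytes:    " + encodedLength(text, StandardCharsets.UTF_8));
        System.out.println("UTF-16BE bytes: " + encodedLength(text, StandardCharsets.UTF_16BE));

        // The same bytes produce a different number of chars per charset.
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        System.out.println("decoded as UTF-8:      " + decodedLength(utf8, StandardCharsets.UTF_8));
        System.out.println("decoded as ISO-8859-1: " + decodedLength(utf8, StandardCharsets.ISO_8859_1));
    }
}
```

Each of these characters takes three bytes in UTF-8 and two in UTF-16BE, and the six UTF-8 bytes decode to two characters as UTF-8 but to six characters as ISO-8859-1.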
Scenario 2:
A form submitted to another JSP page (using submit).
The following is the submission page (submit.jsp). The code is as follows:
<%@ page contentType="text/html; charset=gb2312" %>
<html>
The following is the process.jsp code:
<%@ page contentType="text/html; charset=gb2312" %>
<html>
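If you enter Chinese characters, garbled characters are displayed. The solution is the same as above, that is, transcode the submitted value from ISO-8859-1 to the page's encoding (gb2312 here):
<%@ page contentType="text/html; charset=gb2312" %>
<html>
With this change, the text displays correctly.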
Analysis:
By default Tomcat reads the submitted value as ISO-8859-1, but the actual character encoding is the one specified by the pageEncoding of the JSP page (generally not ISO-8859-1). The encoding used to read the value is therefore inconsistent with the encoding it was written in, so garbled characters appear.
Our transcoding solves exactly this problem: we first encode the string back into bytes using ISO-8859-1, and then decode that byte array using the same encoding as pageEncoding. The encoding used for reading now matches the encoding used for storing, and the garbled text disappears.
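One step the analysis leaves implicit is why going back through ISO-8859-1 loses nothing: that charset maps every possible byte value to exactly one character, so encoding the misread string with it returns the original bytes unchanged. The sketch below (hypothetical names) checks this for all 256 byte values and, for contrast, shows that US-ASCII does not have this property.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; it only illustrates the reasoning above.
final class LosslessRoundTripDemo {

    // Every possible byte value, 0x00 through 0xFF.
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    // Decode the bytes with a charset, encode the result back with the
    // same charset, and report whether the original bytes were recovered.
    static boolean survivesRoundTrip(byte[] bytes, Charset charset) {
        String decoded = new String(bytes, charset);
        return Arrays.equals(bytes, decoded.getBytes(charset));
    }

    public static void main(String[] args) {
        byte[] all = allByteValues();
        System.out.println("ISO-8859-1 lossless: "
                + survivesRoundTrip(all, StandardCharsets.ISO_8859_1));
        System.out.println("US-ASCII lossless:   "
                + survivesRoundTrip(all, StandardCharsets.US_ASCII));
    }
}
```

This is why the garbled string still carries the complete original byte sequence. Had the value been misread with a charset that replaces the bytes it cannot map, the information would already be gone and no transcoding could restore it.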
Searching online turns up another solution:
Set the request encoding uniformly with request.setCharacterEncoding("gb2312").
The modified process.jsp code is as follows:
<%@ page contentType="text/html; charset=gb2312" %>
<% request.setCharacterEncoding("gb2312"); %>
<html>
I tested this method and it did not work: the text was still garbled. Can anyone explain why?
Not finished yet, to be continued.