How to fix garbled Chinese characters in JSP on Tomcat

Source: Internet
Author: User
Tags: character set, tomcat
Java programs often run into problems processing and displaying Chinese characters: a small oversight produces garbled text or question marks. The root cause is that Java handles strings internally as Unicode, while Chinese users usually work with files and databases encoded in GB2312 or BIG5, so text is garbled whenever bytes are converted with the wrong encoding.
The fix differs slightly depending on the problem, the JDK version, and the application server (such as Tomcat, JBoss, or WebLogic). This article discusses the garbled Chinese problems that commonly appear when developing JSP pages on Tomcat. They generally fall into the following cases:
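
As a rough illustration of how both symptoms arise, the sketch below uses only the JDK. The class and method names are hypothetical, and the two-character test word is written with Unicode escapes so that the result does not depend on the encoding of the source file.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingMismatchDemo {
    static final Charset GBK = Charset.forName("GBK");

    // Encode as GBK, then decode with the wrong charset, as a misconfigured server would.
    static String misread(String text) {
        byte[] gbkBytes = text.getBytes(GBK);
        return new String(gbkBytes, StandardCharsets.ISO_8859_1);
    }

    // Encode into a charset that cannot represent the text, then decode it again.
    static String lossyRoundTrip(String text) {
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587"; // two Chinese characters, written as escapes
        System.out.println(misread(chinese).length()); // 4: one garbled char per GBK byte
        System.out.println(lossyRoundTrip(chinese));   // ??
    }
}
```

The first case is recoverable because the original bytes survive; the second is not, because each character has already been replaced by a question mark.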

1. Garbled Chinese in JSP output

"Chinese in JSP output" means writing Chinese text directly in a JSP page, or assigning a Chinese value to a variable and printing it. The output is usually garbled because the page does not declare a character encoding. There are two ways to solve this:

• Add the directive <%@ page contentType="text/html;charset=GBK" %> to the top of the JSP (in a servlet, call response.setContentType("text/html;charset=GBK") on the HttpServletResponse), and preferably also add <meta http-equiv="Content-Type" content="text/html;charset=GBK"> to the head section of the page.

• Convert the encoding explicitly at the point of output. For example, to print the word "中文" ("Chinese") on the page, you can use the following:

<%
// This only helps when the page source was read as ISO-8859-1:
// getBytes() then recovers the original bytes, which are decoded as GBK.
String str = "中文";
byte[] tmpByte = str.getBytes("ISO-8859-1");
str = new String(tmpByte, "GBK");
out.print(str);
%>


2. Garbled Chinese when reading submitted form data

When no other processing is in place, request.getParameter(paramName) returns a garbled string if the submitted form data contains Chinese. This happens because Tomcat's Java EE implementation decodes submitted form data, that is, the parameters of a POST request, as ISO-8859-1 by default.
For example, create a test.jsp that reads:




<%@ page contentType="text/html;charset=GBK" %>
<%
String str = request.getParameter("chstr");
if (str == null) str = "No value entered";
%>
<html>
<head>
<title>Chinese test</title>
<meta http-equiv="Content-Type" content="text/html;charset=GBK">
<meta http-equiv="pragma" content="no-cache">
</head>
<body>The content you entered is: <%= str %><br>
<form action="test.jsp" method="post">
Please enter Chinese: <input type="text" name="chstr">
<input type="submit" value="Submit">
</form>
</body>
</html>


After running the page, type Chinese text such as "中文" into the input box and submit it; the value comes back as a string of garbled characters. There are two ways to solve this. The first leaves all other settings alone and converts the encoding right after reading the form value, for example: String str = request.getParameter("chstr"); str = new String(str.getBytes("ISO-8859-1"), "GBK"); This only fixes the problem locally: every place that reads a parameter has to repeat the conversion, which is impractical in larger projects. The second is to route all page requests through a filter that sets the request character set to GBK. The steps are as follows (Tomcat ships a complete example in its webapps/servlets-examples directory; see the web.xml and SetCharacterEncodingFilter configuration there):

First, copy SetCharacterEncodingFilter.class from the %TOMCAT%/webapps/servlets-examples/WEB-INF/classes/filters/ directory into your own application's WEB-INF/classes/com/util/filter directory, then add the following configuration inside the <web-app> element of web.xml. The <filter-class> value must be the fully qualified name under which the class is actually compiled and deployed:




<filter>
<filter-name>Set Character Encoding</filter-name>
<filter-class>com.ccut.struts.SetCharacterEncodingFilter</filter-class>
<init-param>
<param-name>encoding</param-name>
<param-value>GBK</param-value>
</init-param>
</filter>
<filter-mapping>
<filter-name>Set Character Encoding</filter-name>
<url-pattern>/*</url-pattern>
</filter-mapping>
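
To see what setting the request character set changes, the sketch below mimics the decoding of one percent-encoded form value with java.net.URLDecoder. It is not Tomcat's own code: the class and method names are hypothetical, the raw value is what a GBK page would be expected to submit for the two-character test word, and the Charset overload of decode assumes Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class FormBodyDecodingDemo {
    // Decode one percent-encoded form value with the given charset.
    static String decodeValue(String rawValue, Charset charset) {
        return URLDecoder.decode(rawValue, charset);
    }

    public static void main(String[] args) {
        String raw = "%D6%D0%CE%C4"; // GBK bytes of the two-character test word

        // Default behaviour described in the article: decoded as ISO-8859-1.
        String wrong = decodeValue(raw, StandardCharsets.ISO_8859_1);
        // What the filter asks for: decoded as GBK.
        String right = decodeValue(raw, Charset.forName("GBK"));

        System.out.println(wrong.length());                // 4
        System.out.println(right.equals("\u4e2d\u6587")); // true
    }
}
```

The same four bytes yield four meaningless characters under ISO-8859-1 and the intended two characters under GBK, which is why the encoding has to be set before the parameters are first read.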


3. Garbled Chinese in URLs

For a GET request that passes Chinese parameters directly in the URL, such as "http://localhost/a.jsp?str=中文", the value returned by request.getParameter("str") on the server is often garbled. Setting up the filter as described above does not help, and calling request.setCharacterEncoding("GBK") does not work either.
For example, create a test2.jsp file that reads:



<%@ page contentType="text/html;charset=GBK" %>
<%
String str = request.getParameter("chstr");
if (str == null) str = "No value entered";
%>
<html>
<head>
<title>Chinese test</title>
<meta http-equiv="Content-Type" content="text/html;charset=GBK">
<meta http-equiv="pragma" content="no-cache">
</head>
<body>The content you entered is: <%= str %><br>
<form action="test.jsp" method="post">
<a href="test2.jsp?chstr=中文">Click here to submit Chinese parameters</a>
</form>
</body>
</html>

After running the page and clicking the link, the Chinese parameter passed through the URL comes out garbled. The cause is that Tomcat processes the query string of a GET request differently from the parameters of a POST request.
To solve this, open the conf/server.xml file under the Tomcat installation directory, find the Connector element, and add URIEncoding="GBK" to it. The complete Connector element then reads as follows:




<Connector port="8080"
maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" redirectPort="8443" acceptCount="100"
debug="0" connectionTimeout="20000"
disableUploadTimeout="true"
URIEncoding="GBK"
/>
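
The URIEncoding setting only helps if the bytes in the link were produced with the same encoding. As a hedged sketch, assuming the link is generated on the server side, the query string can be percent-encoded explicitly with java.net.URLEncoder; the class and method names here are hypothetical, and the Charset overload of encode assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class QueryLinkBuilder {
    // Build "page?name=value" with the name and value percent-encoded in the given charset.
    static String buildLink(String page, String name, String value, Charset charset) {
        return page + "?" + URLEncoder.encode(name, charset)
                + "=" + URLEncoder.encode(value, charset);
    }

    public static void main(String[] args) {
        String word = "\u4e2d\u6587"; // two Chinese characters, written as escapes

        // Matches a connector configured with URIEncoding="GBK".
        System.out.println(buildLink("test2.jsp", "chstr", word, Charset.forName("GBK")));
        // prints test2.jsp?chstr=%D6%D0%CE%C4

        // The same word in UTF-8 produces different bytes, so the server setting must agree.
        System.out.println(buildLink("test2.jsp", "chstr", word, StandardCharsets.UTF_8));
        // prints test2.jsp?chstr=%E4%B8%AD%E6%96%87
    }
}
```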


4. Garbled Chinese when accessing the database

When creating the database, set the encoding of all tables to GBK. Because the JSP pages also use GBK, this avoids unnecessary encoding conversions. In addition, when connecting to a MySQL database through JDBC, the connection string can be written in the following form to avoid some Chinese problems:




jdbc:mysql://hostname:port/dbname?user=username&password=pwd&useUnicode=true&characterEncoding=GBK

If you connect to the database through a data source, use the following in the configuration file:



<parameter>
<name>url</name>
<value>jdbc:mysql://hostname:port/dbname?useUnicode=true&amp;characterEncoding=GBK</value>
</parameter>
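
The two forms differ only in how the ampersand is written: inside an XML configuration file it has to appear as an entity. The helper below is a small sketch of that relationship; the class and method names are hypothetical, and the host, port, and database name in the example are placeholders.

```java
final class MysqlUrlBuilder {
    // Build a MySQL JDBC URL that asks the driver to use the given character encoding.
    static String jdbcUrl(String host, int port, String database, String encoding) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=" + encoding;
    }

    // Escape the URL for use as text inside an XML configuration file.
    static String forXml(String url) {
        return url.replace("&", "&amp;");
    }

    public static void main(String[] args) {
        String url = jdbcUrl("localhost", 3306, "testdb", "GBK"); // placeholder values
        System.out.println(url);         // form used directly in Java code
        System.out.println(forXml(url)); // form used inside an XML <value> element
    }
}
```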
However, suppose you must use an existing database that is encoded in ISO-8859-1 while the web application uses UTF-8, and the database already holds a lot of important data, so the problem cannot be fixed by changing the database encoding. In that case, when writing data to the database, be sure to include "useUnicode=true&characterEncoding=ISO-8859-1" in the JDBC connection string so that the data is written correctly. Data read back from the database will still appear garbled, so it has to be transcoded as it is read. The transcoding can be wrapped in a function, implemented as follows:



public String charConvert(String src) {
    String result = null;
    if (src != null) {
        try {
            // Recover the raw bytes, then decode them as GBK.
            result = new String(src.getBytes("ISO-8859-1"), "GBK");
        } catch (Exception e) {
            result = null;
        }
    }
    return result;
}

Then call charConvert(rs.getString("colName")) after the data is read from the database, so that the Chinese data in the database is displayed properly.
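
The following self-contained sketch exercises the same conversion without a database. It simulates a column value whose GBK bytes were decoded as ISO-8859-1; the class name is hypothetical, and the static variant uses Charset objects so that no checked exception is involved.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharConvertDemo {
    static final Charset GBK = Charset.forName("GBK");

    // Re-interpret a string that was decoded as ISO-8859-1 as GBK text.
    static String charConvert(String src) {
        if (src == null) {
            return null;
        }
        return new String(src.getBytes(StandardCharsets.ISO_8859_1), GBK);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters, written as escapes

        // Simulated column value: GBK bytes read back as ISO-8859-1.
        String garbled = new String(original.getBytes(GBK), StandardCharsets.ISO_8859_1);

        System.out.println(charConvert(garbled).equals(original)); // true
        // Converting a string that is already correct destroys it.
        System.out.println(charConvert(original)); // ??
    }
}
```

The second call shows why the conversion should be applied exactly once, and only to values that were actually misread.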

