http://www.lanceyan.com/tech/arch/web_luanma.html
I remember that when I first started Java web development, encoding problems made my head spin: text that displayed fine one moment would come out garbled the next. Under project deadlines I mostly fixed the symptom without understanding the cause. Later, when I had time, I went through the whole system and finally worked out the ins and outs.
In C++ CGI development, people like to use Latin-1, a single-byte encoding: it saves storage space in MySQL, and C++ makes it relatively easy to control data at the byte level. Once the framework wraps this up, encoding is basically not a problem.
In Java, there really are a lot of places where encoding has to be handled, and missing any one of them produces garbled text everywhere. Roughly, they fall into the following parts: browser, server, database, and operating system.
Browser:
If you use a template language, the HTML needs to declare its character set. This tells the browser which encoding to use when rendering the page.
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
As an aside, the browser determines the encoding in this order:
1. If the HTTP header declares a charset, the browser uses it.
2. If the HTTP header does not set one, the browser parses the meta tag.
3. If there is no meta tag either, the browser tries to detect the encoding, provided auto-detect is enabled.
4. Otherwise, it uses the character encoding of the local UI.
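The precedence above can be sketched in a few lines of Java. This is only an illustration, not how any real browser is implemented: the class name CharsetResolver, its method names, and the regular expressions are assumptions made for this example, and step 3 (auto-detection) is left out because it is browser-specific.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharsetResolver {
    // Finds "charset=xxx" inside a Content-Type header value or a meta tag.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*[\"']?([\\w.:-]+)", Pattern.CASE_INSENSITIVE);
    // Finds each <meta ...> tag in the page.
    private static final Pattern META =
            Pattern.compile("<meta\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Step 1: charset declared in the HTTP Content-Type header, or null.
    static String fromHeader(String contentType) {
        if (contentType == null) return null;
        Matcher m = CHARSET.matcher(contentType);
        return m.find() ? m.group(1) : null;
    }

    // Step 2: charset declared in the first meta tag that carries one, or null.
    static String fromMeta(String html) {
        if (html == null) return null;
        Matcher tag = META.matcher(html);
        while (tag.find()) {
            Matcher m = CHARSET.matcher(tag.group());
            if (m.find()) return m.group(1);
        }
        return null;
    }

    // Header first, then meta, then the local default (step 4).
    static String resolve(String contentType, String html, String localDefault) {
        String cs = fromHeader(contentType);
        if (cs == null) cs = fromMeta(html);
        return cs != null ? cs : localDefault;
    }

    public static void main(String[] args) {
        String html = "<html><head><meta http-equiv=\"Content-Type\" "
                + "content=\"text/html; charset=utf-8\" /></head></html>";
        System.out.println(resolve("text/html; charset=GBK", html, "ISO-8859-1"));
        System.out.println(resolve("text/html", html, "ISO-8859-1"));
        System.out.println(resolve(null, "<html></html>", "ISO-8859-1"));
    }
}
```

The point of the sketch is that a charset in the HTTP header wins over the meta tag, which is why a page with a correct meta tag can still display garbled text when the server sends a different charset.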
Server:
For dynamic pages such as JSP, the JSP header needs to set the encoding. With it, the Java EE server outputs the whole page as UTF-8 when it processes the JSP; without it, the server falls back to the default encoding, ISO-8859-1. The JSP directive looks like this:
<%@ page language="java" contentType="text/html; charset=utf-8"
    pageEncoding="UTF-8"%>
As we all know, a JSP is compiled into a servlet. The equivalent setting in a servlet is:
public void service(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    response.setContentType("text/html;charset=utf-8");
}
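To see why the response charset matters, the following sketch imitates what a response writer does when it turns text into bytes. It does not use the servlet API; the class ResponseEncodingDemo and its render method are hypothetical names for this example, and a ByteArrayOutputStream stands in for the network stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ResponseEncodingDemo {
    // Encodes the body the way a writer bound to the given charset would.
    static byte[] render(String body, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, charset)) {
            w.write(body);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so the source file's
        // own encoding cannot affect the demo.
        String body = "\u4e2d\u6587";
        // ISO-8859-1 cannot represent them, so each becomes '?' (byte 63).
        System.out.println(Arrays.toString(render(body, StandardCharsets.ISO_8859_1)));
        // UTF-8 uses three bytes for each of these characters.
        System.out.println(render(body, StandardCharsets.UTF_8).length);
    }
}
```

Once a character has been replaced by a question mark on the way out, no browser setting can recover it, so the charset has to be set before the response body is written.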
Don't overlook a common Spring utility either: the character encoding filter, which is very practical. When you use Struts or Spring MVC, this filter sets the encoding for requests that do not specify one. Configure it as follows:
<filter>
    <filter-name>Set Character Encoding</filter-name>
    <filter-class>
        org.springframework.web.filter.CharacterEncodingFilter
    </filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>utf-8</param-value>
    </init-param>
</filter>
What if text is still garbled? Parameters passed to the doGet method are very likely to be the cause. Set the URI encoding on the Tomcat connector (the file is usually conf/server.xml under the Tomcat installation directory):
<Connector port="8080" protocol="HTTP/1.1"
    connectionTimeout="20000"
    redirectPort="8443" URIEncoding="utf-8" />
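The effect of the URI encoding can be reproduced without a container. The sketch below assumes the query string was percent-encoded as UTF-8 by the browser and then decoded as ISO-8859-1, which was the default in older Tomcat versions; the class QueryDecodingDemo and its method names are made up for this illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

class QueryDecodingDemo {
    // Decodes a percent-encoded value with the named charset.
    static String decode(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Classic workaround: take back the original bytes from the wrongly
    // decoded ISO-8859-1 string and decode them again as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "%E4%B8%AD%E6%96%87"; // two Chinese characters, UTF-8 percent-encoded
        String wrong = decode(raw, "ISO-8859-1");
        String right = decode(raw, "UTF-8");
        System.out.println(wrong.length());             // six garbled characters
        System.out.println(right.length());             // two characters
        System.out.println(repair(wrong).equals(right));
    }
}
```

The repair trick works only because ISO-8859-1 maps every byte to a character and back without loss. Setting URIEncoding on the connector is the cleaner fix, since it removes the need to patch every parameter by hand.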
During development, don't forget that the Java source files themselves must be saved in the same encoding. Right-click the class file and view its properties to check.
A common case: you forget to change the file encoding at development time, so the files are saved as GBK (the Windows default) and then deployed to Linux, which uses UTF-8. With a large number of files you cannot convert them one by one. The fix is actually simple: pass -Dfile.encoding=GBK as a JVM option on the java command line.
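The mismatch described above comes from bytes written in one charset being read back in another. A minimal sketch, assuming the JDK in use ships the GBK charset (full JDK distributions normally do); DefaultCharsetDemo and decode are hypothetical names used only here:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DefaultCharsetDemo {
    // Decodes bytes with an explicit charset instead of the JVM default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The JVM default; on the JDK versions this article targets it
        // follows the file.encoding system property.
        System.out.println(Charset.defaultCharset());

        Charset gbk = Charset.forName("GBK");
        String text = "\u4e2d\u6587";       // two Chinese characters
        byte[] saved = text.getBytes(gbk);  // what a GBK-saved file contains

        System.out.println(decode(saved, gbk).equals(text));                    // same charset
        System.out.println(decode(saved, StandardCharsets.UTF_8).equals(text)); // mismatch
    }
}
```

Code that always passes the charset explicitly behaves the same on Windows and Linux, whereas code that relies on the default needs the -Dfile.encoding option to stay consistent.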
When compiling Java code with Ant, set the source encoding on the javac task, so that log output written to a file or the console is not garbled.
<javac debuglevel="source,lines" source="1.6" encoding="UTF-8">
The encoding for Maven compilation is set like this:
<artifactId>maven-compiler-plugin</artifactId>
<version>2.5</version>
<configuration>
    <optimize>true</optimize>
    <showDeprecation>false</showDeprecation>
    <debuglevel>lines,source</debuglevel>
    <source>1.6</source>
    <target>1.6</target>
    <encoding>UTF-8</encoding>
    <meminitial>128m</meminitial>
    <maxmem>768m</maxmem>
</configuration>
The SqlMap SQL XML files and the Spring XML files need an encoding declaration too, because they move across platforms. Add this at the top:
<?xml version="1.0" encoding="UTF-8"?>
Database:
Here are the most common MySQL character set settings. Open the MySQL configuration file (usually /etc/my.cnf on Linux, my.ini in the MySQL installation directory on Windows) and set:
[mysqld]
character_set_server = utf8
[mysql]
default-character-set = utf8
JDBC also needs the encoding set in the connection URL:
jdbc:mysql://192.168.0.237:3306/dzh_db?useUnicode=true&characterEncoding=utf-8
With all of the above in place, ordinary Chinese text is generally not a problem.
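But I ran into a rather funny problem recently. I used to think that once every character set was configured, any data could be written to the database. It turned out that some characters still failed, symbols of the ★ kind for example. When I dumped those characters to bytes, they were, to my surprise, not three-byte UTF-8 sequences, and MySQL's utf8 character set stores at most three bytes per character. One fix is to filter out such special characters while decoding the UTF-8 bytes:
// Decodes UTF-8 bytes by hand, keeping only 1-, 2- and 3-byte sequences.
// Bytes that belong to longer sequences match no branch and are dropped.
public static String utf2string(byte buf[]) {
    int len = buf.length;
    StringBuffer sb = new StringBuffer(len / 2);
    for (int i = 0; i < len; i++) {
        if (by2int(buf[i]) <= 0x7F) {
            // 1-byte sequence (ASCII)
            sb.append((char) buf[i]);
        } else if (by2int(buf[i]) <= 0xDF && by2int(buf[i]) >= 0xC0) {
            // 2-byte sequence: 5 bits + 6 bits
            int bh = by2int(buf[i] & 0x1F);
            int bl = by2int(buf[++i] & 0x3F);
            bl = by2int(bh << 6 | bl);
            bh = by2int(bh >> 2);
            int c = bh << 8 | bl;
            sb.append((char) c);
        } else if (by2int(buf[i]) <= 0xEF && by2int(buf[i]) >= 0xE0) {
            // 3-byte sequence: 4 bits + 6 bits + 6 bits
            int bh = by2int(buf[i] & 0x0F);
            int bl = by2int(buf[++i] & 0x3F);
            int bll = by2int(buf[++i] & 0x3F);
            bh = by2int(bh << 4 | bl >> 2);
            bl = by2int(bl << 6 | bll);
            int c = bh << 8 | bl;
            // convert this special space (code 58865) to a half-width space
            if (c == 58865) {
                c = 32;
            }
            sb.append((char) c);
        }
    }
    return sb.toString();
}

// by2int is not shown in the original; it is assumed here to return the
// unsigned value of the low byte.
private static int by2int(int b) {
    return b & 0xFF;
}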
Alternatively, change the MySQL character set to utf8mb4. Remember that only MySQL 5.5 and later support it.
[mysqld]
character_set_server = utf8mb4
[mysql]
default-character-set = utf8mb4
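When the table character set cannot be changed, the filtering idea can also be sketched with Java's standard code point API rather than manual byte arithmetic. This is a swapped-in alternative, not the author's method: the class Utf8mb3Filter and the method stripSupplementary are names invented for this example. It relies on the fact that exactly the supplementary code points (above U+FFFF) need four bytes in UTF-8.

```java
import java.nio.charset.StandardCharsets;

class Utf8mb3Filter {
    // Removes every code point that would need four bytes in UTF-8,
    // leaving text that fits a three-byte utf8 column.
    static String stripSupplementary(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isSupplementaryCodePoint(cp)) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp); // 2 for a surrogate pair, else 1
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1F600 written as a surrogate pair; it takes four bytes in UTF-8.
        String emoji = "\uD83D\uDE00";
        System.out.println(emoji.getBytes(StandardCharsets.UTF_8).length);
        System.out.println(stripSupplementary("a" + emoji + "b"));
    }
}
```

Filtering silently discards user input, so switching to utf8mb4 is usually preferable where the MySQL version allows it.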
Operating system:
Windows defaults to GBK, which generally does not need changing. But what if you want every new file to be created as UTF-8? Changing each file's properties after creating it is far too much trouble. Once the default encoding is set in Eclipse, new files of the same type are created as UTF-8.
On Linux, changing two places is usually enough:
vi /etc/sysconfig/i18n
Modify:
LANG="zh_CN.GB2312"
LANGUAGE="zh_CN.GB18030:zh_CN.GB2312:zh_CN"
SUPPORTED="zh_CN.GB18030:zh_CN:zh:en_US.UTF-8:en_US:en"
vi /etc/profile
export LC_ALL="zh_CN.GB2312"
export LANG="zh_CN.GB2312"
Original article. If you repost it, please credit the source: lanceyan.com
A discussion of the painful character set problems in web development