I never believed that Java really had a bug that kept it from displaying several languages mixed together, so this weekend I studied how servlets and JSP display multinational text, that is, the multiple-character-set problem in servlets. I am not entirely clear on the concept of character sets, so what I write here is not necessarily accurate.
This is how I understand character sets in Java: at run time, every String object is stored encoded as Unicode. (I think every language encodes its strings, because inside the computer a string is always represented by codes.) The difference is that string encoding in most languages is platform-dependent, while Java uses platform-independent Unicode.
When Java reads a string from a byte stream, it converts the platform-dependent bytes into a platform-independent Unicode string. On output, Java converts the Unicode string back into a platform-dependent byte stream, and if a Unicode character does not exist in the target character set, a '?' is output. For example, on Chinese Windows, when Java reads a GB2312-encoded file (or any other stream) into memory to construct a String object, it converts the GB2312-encoded text into a Unicode string. When that string is output, the Unicode string is converted back into a GB2312 byte stream or array: "中文测试" ("Chinese test") -----> "\u4e2d\u6587\u6d4b\u8bd5" -----> "中文测试".
The following routine illustrates this:
byte[] bytes = new byte[] {(byte) 0xd6, (byte) 0xd0, (byte) 0xce, (byte) 0xc4, (byte) 0xb2, (byte) 0xe2, (byte) 0xca, (byte) 0xd4}; // GBK-encoded "中文测试" ("Chinese test")
java.io.ByteArrayInputStream bin = new java.io.ByteArrayInputStream(bytes);
java.io.BufferedReader reader = new java.io.BufferedReader(new java.io.InputStreamReader(bin, "GBK"));
String msg = reader.readLine();
System.out.println(msg);
Run on a system that can display the four characters "中文测试" (such as a Chinese system), this program prints them correctly. The msg string holds the correct Unicode encoding of "中文测试", "\u4e2d\u6587\u6d4b\u8bd5", which is converted to the operating system's default character set when printed. Whether it displays correctly depends on the operating system's character set: only on systems that support the corresponding character set is the text output correctly; otherwise the result is garbage.
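The '?' substitution described above can be reproduced without any console involved. The following is a minimal sketch, not part of the original routine; the class name UnmappableDemo and the helper toHex are made up for illustration. It encodes the same four characters into UTF-8, which can represent them, and into ISO-8859-1, which cannot.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows what happens when a Unicode string is
// encoded into a character set that lacks some of its characters.
class UnmappableDemo {
    // Render a byte array as lowercase hex so the result is easy to inspect.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String msg = "\u4e2d\u6587\u6d4b\u8bd5"; // the four characters "中文测试"

        // UTF-8 can represent every Unicode character (3 bytes each here).
        System.out.println(toHex(msg.getBytes(StandardCharsets.UTF_8)));
        // prints e4b8ade69687e6b58be8af95

        // ISO-8859-1 has no Chinese characters, so String.getBytes replaces
        // each one with the charset's replacement byte, '?' (0x3f).
        System.out.println(toHex(msg.getBytes(StandardCharsets.ISO_8859_1)));
        // prints 3f3f3f3f
    }
}
```

Once the characters have been replaced by 0x3f the original text cannot be recovered, which is why the encoding has to be right at every step of the round trip.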
Now let's look at the multilingual problem in servlets and JSP. Our goal is that a client in any country can send information to the server through a form, the server stores it in the database, and the client still sees exactly what it sent when it retrieves the information later. In practice, this means that the SQL statements finally built on the server must contain the correct Unicode encoding of the text the client sent, and that the encoding used to communicate with the database must be able to represent that text. It is best to have JDBC communicate with the database directly in Unicode/UTF-8, which ensures that no information is lost. Likewise, the server should send information back to the client in an encoding that loses nothing, again Unicode/UTF-8.
If you do not specify the enctype attribute of a form, the browser URL-encodes the input using the character set of the current page, and the server receives a URL-encoded string. The resulting string depends on the page's encoding: a GB2312-encoded page that submits "中文测试" ("Chinese test") produces "%d6%d0%ce%c4%b2%e2%ca%d4", where each "%" is followed by two hexadecimal digits representing one byte, while a UTF-8 page produces "%e4%b8%ad%e6%96%87%e6%b5%8b%e8%af%95". The difference arises because each of these characters takes 16 bits in GB2312 but 24 bits in UTF-8. The Chinese, Japanese, and Korean editions of IE 4 and later all support UTF-8, and UTF-8 necessarily covers all three languages, so if our HTML pages use UTF-8 we can support at least these three languages.
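The two URL-encoded forms quoted above can be reproduced with the standard java.net.URLEncoder. This is only a sketch of what the browser does when it submits the form; the class name FormEncodingDemo and the helper encode are hypothetical, and GBK is used as the charset name because it encodes these four characters to the same bytes as GB2312.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class: URL-encodes the same text under two page encodings.
class FormEncodingDemo {
    // Wraps URLEncoder.encode so callers do not have to handle the checked exception.
    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587\u6d4b\u8bd5"; // "中文测试"

        // Two bytes per character: 8 escapes in total.
        System.out.println(encode(text, "GBK"));

        // Three bytes per character: 12 escapes in total.
        System.out.println(encode(text, "UTF-8"));
    }
}
```

URLEncoder writes the hex digits in upper case, whereas the strings quoted in the text are lower case; the two forms are equivalent.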
However, if our HTML/JSP pages use UTF-8, the application server may not know it. The browser's request normally carries no charset information; at most the server can read the Accept-Language request header, and that header alone does not reveal which encoding the browser used. The application server therefore cannot parse the submitted content correctly. Why? Because all strings in Java are 16-bit Unicode, the job of HttpServletRequest.getParameter(String) is to convert the URL-encoded information submitted by the client into a Unicode string. Some servers simply assume that the client's encoding is the same as the server platform's and decode directly with URLDecoder.decode(String). If the client's encoding happens to match the server's, you get the correct string; otherwise, if the submitted string contains non-ASCII characters, the result is garbage.
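The effect of a wrong guess is easy to see with the two-argument URLDecoder.decode, which takes the charset explicitly. In this sketch (the class name WrongCharsetDemo and the helper decodeAs are made up for illustration), ISO-8859-1 stands in for whatever encoding a server might wrongly assume.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class: decodes one submitted string under two assumed charsets.
class WrongCharsetDemo {
    // Wraps URLDecoder.decode so callers do not have to handle the checked exception.
    static String decodeAs(String urlEncoded, String charset) {
        try {
            return URLDecoder.decode(urlEncoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // What a UTF-8 page submits for "中文测试".
        String submitted = "%e4%b8%ad%e6%96%87%e6%b5%8b%e8%af%95";

        String right = decodeAs(submitted, "UTF-8");
        String wrong = decodeAs(submitted, "ISO-8859-1");

        System.out.println(right.length());      // prints 4: four Chinese characters
        System.out.println(wrong.length());      // prints 12: one garbage character per byte
        System.out.println(right.equals(wrong)); // prints false
    }
}
```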
My solution already specifies UTF-8 for the pages, so we can avoid this problem by writing our own decode method:
public static String decode(String s, String encoding) throws Exception {
    StringBuffer sb = new StringBuffer();
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        switch (c) {
            case '+':
                // '+' stands for a space in URL-encoded form data
                sb.append(' ');
                break;
            case '%':
                try {
                    // Each %XX escape is one raw byte, kept for now as a char
                    sb.append((char) Integer.parseInt(
                        s.substring(i + 1, i + 3), 16));
                }
                catch (NumberFormatException e) {
                    throw new IllegalArgumentException();
                }
                i += 2;
                break;
            default:
                sb.append(c);
                break;
        }
    }
    // Undo conversion to external encoding
    String result = sb.toString();
    byte[] inputBytes = result.getBytes("8859_1");
    return new String(inputBytes, encoding);
}
This method accepts an encoding, and if you pass UTF8 it meets our needs. For example, parsing "%e4%b8%ad%e6%96%87%e6%b5%8b%e8%af%95" with it yields the correct Unicode string for the Chinese text "中文测试" ("Chinese test").
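The last two lines of decode rely on a property of ISO-8859-1 ("8859_1"): it maps the chars U+0000 to U+00FF one-to-one onto the bytes 0x00 to 0xFF. Storing each raw byte as a char and then calling getBytes with that charset therefore gives back exactly the original bytes, which can then be decoded properly. A minimal sketch of just that step, with a hypothetical class name Latin1RoundTripDemo and helper reinterpret:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: isolates the "one char per byte" round trip used by decode.
class Latin1RoundTripDemo {
    // Treats every char of the input as one raw byte, then decodes those
    // bytes with the real encoding.
    static String reinterpret(String onePerByte, Charset realEncoding) {
        byte[] raw = onePerByte.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, realEncoding);
    }

    public static void main(String[] args) {
        // The six bytes e4 b8 ad e6 96 87, each held as a separate char,
        // as decode has them after processing the %XX escapes.
        String onePerByte = "\u00e4\u00b8\u00ad\u00e6\u0096\u0087";

        String text = reinterpret(onePerByte, StandardCharsets.UTF_8);
        System.out.println(text.length());                 // prints 2
        System.out.println(text.equals("\u4e2d\u6587"));   // prints true
    }
}
```

The trick only works with a charset that is lossless for every byte value, which is why ISO-8859-1 is used here rather than the platform default.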
The problem now is how to get hold of the URL-encoded string that the client submitted. For a form submitted with the GET method, it can be read with HttpServletRequest.getQueryString(); for a form submitted with the POST method, it can only be read from the ServletInputStream. In fact, the first time the standard getParameter method is called, the information submitted by the form is read out, and the ServletInputStream cannot be read a second time. So we must read and parse the form data before getParameter is called for the first time.
Here is what I did: I set up a servlet base class that overrides the service method and reads and parses the form submission before calling the parent's service method. See the following source code:
package com.hto.servlet;
import javax.servlet.http.HttpServletRequest;
import java.util.*;
/**
 * Reads the URL-encoded parameters of a request and decodes them as UTF-8 or a given encoding.
 * Creation Date: (2001-2-4-15:43:46)
 * @author: Chan Weichun
 */
public class UTF8ParameterReader {
    // Maps each parameter name to an ArrayList of its values
    Hashtable pairs = new Hashtable();
    /**
     * Parses the query string and the request body as UTF-8.
     */
    public UTF8ParameterReader(HttpServletRequest request) throws java.io.IOException {
        super();
        parse(request.getQueryString());
        parse(request.getReader().readLine());
    }
    /**
     * Parses the query string and the request body using the given encoding.
     */
    public UTF8ParameterReader(HttpServletRequest request, String encoding) throws java.io.IOException {
        super();
        parse(request.getQueryString(), encoding);
        parse(request.getReader().readLine(), encoding);
    }
    public static String decode(String s) throws Exception {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '+':
                    sb.append(' ');
                    break;
                case '%':
                    try {
                        sb.append((char) Integer.parseInt(
                            s.substring(i + 1, i + 3), 16));
                    }
                    catch (NumberFormatException e) {
                        throw new IllegalArgumentException();
                    }
                    i += 2;
                    break;
                default:
                    sb.append(c);
                    break;
            }
        }
        // Undo conversion to external encoding
        String result = sb.toString();
        byte[] inputBytes = result.getBytes("8859_1");
        return new String(inputBytes, "UTF8");
    }
    public static String decode(String s, String encoding) throws Exception {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '+':
                    sb.append(' ');
                    break;
                case '%':
                    try {
                        sb.append((char) Integer.parseInt(
                            s.substring(i + 1, i + 3), 16));
                    }
                    catch (NumberFormatException e) {
                        throw new IllegalArgumentException();
                    }
                    i += 2;
                    break;
                default:
                    sb.append(c);
                    break;
            }
        }
        // Undo conversion to external encoding
        String result = sb.toString();
        byte[] inputBytes = result.getBytes("8859_1");
        return new String(inputBytes, encoding);
    }
    /**
     * Returns the first value of the named parameter, or null if it is absent.
     * Creation Date: (2001-2-4-17:30:59)
     * @return java.lang.String
     * @param name java.lang.String
     */
    public String getParameter(String name) {
        if (pairs == null || !pairs.containsKey(name)) return null;
        return (String) (((ArrayList) pairs.get(name)).get(0));
    }
    /**
     * Returns the names of all parsed parameters.
     * Creation Date: (2001-2-4-17:28:17)
     * @return java.util.Enumeration
     */
    public Enumeration getParameterNames() {
        if (pairs == null) return null;
        return pairs.keys();
    }
    /**
     * Returns all values of the named parameter, or null if it is absent.
     * Creation Date: (2001-2-4-17:33:40)
     * @return java.lang.String[]
     * @param name java.lang.String
     */
    public String[] getParameterValues(String name) {
        if (pairs == null || !pairs.containsKey(name)) return null;
        ArrayList al = (ArrayList) pairs.get(name);
        String[] values = new String[al.size()];
        for (int i = 0; i < al.size(); i++)
            values[i] = (String) al.get(i);
        return values;
    }
    /**
     * Splits a URL-encoded string into name/value pairs and decodes them as UTF-8.
     * Creation Date: (2001-2-4-20:34:37)
     * @param urlenc java.lang.String
     */
    private void parse(String urlenc) throws java.io.IOException {
        if (urlenc == null) return;
        StringTokenizer tok = new StringTokenizer(urlenc, "&");
        try {
            while (tok.hasMoreTokens()) {
                String aPair = tok.nextToken();
                int pos = aPair.indexOf("=");
                String name = null;
                String value = null;
                if (pos != -1) {
                    name = decode(aPair.substring(0, pos));
                    value = decode(aPair.substring(pos + 1));
                } else {
                    name = aPair;
                    value = "";
                }
                if (pairs.containsKey(name)) {
                    ArrayList values = (ArrayList) pairs.get(name);
                    values.add(value);
                } else {
                    ArrayList values = new ArrayList();
                    values.add(value);
                    pairs.put(name, values);
                }
            }
        } catch (Exception e) {
            throw new java.io.IOException(e.getMessage());
        }
    }
    /**
     * Splits a URL-encoded string into name/value pairs and decodes them with the given encoding.
     * Creation Date: (2001-2-4-20:34:37)
     * @param urlenc java.lang.String
     */
    private void parse(String urlenc, String encoding) throws java.io.IOException {
        if (urlenc == null) return;
        StringTokenizer tok = new StringTokenizer(urlenc, "&");
        try {
            while (tok.hasMoreTokens()) {
                String aPair = tok.nextToken();
                int pos = aPair.indexOf("=");
                String name = null;
                String value = null;
                if (pos != -1) {
                    name = decode(aPair.substring(0, pos), encoding);
                    value = decode(aPair.substring(pos + 1), encoding);
                } else {
                    name = aPair;
                    value = "";
                }
                if (pairs.containsKey(name)) {
                    ArrayList values = (ArrayList) pairs.get(name);
                    values.add(value);
                } else {
                    ArrayList values = new ArrayList();
                    values.add(value);
                    pairs.put(name, values);
                }
            }
        } catch (Exception e) {
            throw new java.io.IOException(e.getMessage());
        }
    }
}
This class reads and saves the information submitted by a form and implements the usual getParameter methods.
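The parsing logic of the class can be tried out without a servlet container. The following is a rough stand-alone sketch, not the author's class: it swaps the hand-written decode for the standard URLDecoder.decode(String, String) and uses generic collections, and the names QueryStringSketch and parse are chosen here for illustration only.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch of the name/value parsing step.
class QueryStringSketch {
    // Splits "a=1&b=2&a=3" into a map from each name to the list of its values,
    // decoding names and values with the given encoding.
    static Map<String, List<String>> parse(String urlenc, String encoding) {
        Map<String, List<String>> pairs = new LinkedHashMap<>();
        if (urlenc == null || urlenc.isEmpty()) return pairs;
        try {
            for (String pair : urlenc.split("&")) {
                if (pair.isEmpty()) continue;
                int pos = pair.indexOf('=');
                // A pair without '=' is treated as a name with an empty value.
                String rawName = (pos == -1) ? pair : pair.substring(0, pos);
                String rawValue = (pos == -1) ? "" : pair.substring(pos + 1);
                String name = URLDecoder.decode(rawName, encoding);
                String value = URLDecoder.decode(rawValue, encoding);
                pairs.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
        return pairs;
    }

    public static void main(String[] args) {
        Map<String, List<String>> params =
            parse("lang=%e4%b8%ad%e6%96%87&tag=a&tag=b+c", "UTF-8");
        System.out.println(params.get("lang").get(0).equals("\u4e2d\u6587")); // prints true
        System.out.println(params.get("tag"));                                // prints [a, b c]
    }
}
```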
package com.hto.servlet;
import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;
/**
 * Servlet base class that parses form submissions as UTF-8 before the request is dispatched.
 * Creation Date: (2001-2-5-8:28:20)
 * @author: Chan Weichun
 */
public class UTFBaseServlet extends HttpServlet {
    public static final String PARAMS_ATTR_NAME = "PARAMS_ATTR_NAME";
    /**
     * Process incoming HTTP GET requests
     *
     * @param request Object that encapsulates the request to the servlet
     * @param response Object that encapsulates the response from the servlet
     */
    public void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        performTask(request, response);
    }
    /**
     * Process incoming HTTP POST requests
     *
     * @param request Object that encapsulates the request to the servlet
     * @param response Object that encapsulates the response from the servlet
     */
    public void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        performTask(request, response);
    }
    /**
     * Returns the named parameter as a java.sql.Date.
     * Creation Date: (2001-2-5-8:52:43)
     * @return java.sql.Date
     * @param request javax.servlet.http.HttpServletRequest
     * @param name java.lang.String
     * @param required boolean
     * @param defValue java.sql.Date
     */
    public static java.sql.Date getDateParameter(HttpServletRequest request, String name, boolean required, java.sql.Date defValue) throws ServletException {
        String value = getParameter(request, name, required, String.valueOf(defValue));
        return java.sql.Date.valueOf(value);
    }
    /**
     * Returns the named parameter as a double.
     * Creation Date: (2001-2-5-8:52:43)
     * @return double
     * @param request javax.servlet.http.HttpServletRequest
     * @param name java.lang.String
     * @param required boolean
     * @param defValue double
     */
    public static double getDoubleParameter(HttpServletRequest request, String name, boolean required, double defValue) throws ServletException {
        String value = getParameter(request, name, required, String.valueOf(defValue));
        return Double.parseDouble(value);
    }
    /**
     * Returns the named parameter as a float.
     * Creation Date: (2001-2-5-8:52:43)
     * @return float
     * @param request javax.servlet.http.HttpServletRequest
     * @param name java.lang.String
     * @param required boolean
     * @param defValue float
     */
    public static float getFloatParameter(HttpServletRequest request, String name, boolean required, float defValue) throws ServletException {
        String value = getParameter(request, name, required, String.valueOf(defValue));
        return Float.parseFloat(value);
    }
    /**
     * Returns the named parameter as an int.
     * Creation Date: (2001-2-5-8:52:43)
     * @return int
     * @param request javax.servlet.http.HttpServletRequest
     * @param name java.lang.String
     * @param required boolean
     * @param defValue int
     */
    public static int getIntParameter(HttpServletRequest request, String name, boolean required, int defValue) throws ServletException {
        String value = getParameter(request, name, required, String.valueOf(defValue));
        return Integer.parseInt(value);
    }
    /**
     * Returns the named parameter, preferring the UTF8ParameterReader stored on the request.
     * Creation Date: (2001-2-5-8:43:36)
     * @return java.lang.String
     * @param request javax.servlet.http.HttpServletRequest
     * @param name java.lang.String
     * @param required boolean
     * @param defValue java.lang.String
     */
    public static String getParameter(HttpServletRequest request, String name, boolean required, String defValue) throws ServletException {
        if (request.getAttribute(UTFBaseServlet.PARAMS_ATTR_NAME) != null) {
            UTF8ParameterReader params = (UTF8ParameterReader) request.getAttribute(UTFBaseServlet.PARAMS_ATTR_NAME);
            if (params.getParameter(name) != null) return params.getParameter(name);
            if (required) throw new ServletException("The Parameter " + name + " required but not provided!");
            else return defValue;
        } else {
            if (request.getParameter(name) != null) return request.getParameter(name);
            if (required) throw new ServletException("The Parameter " + name + " required but not provided!");
            else return defValue;
        }
    }
    /**
     * Returns the servlet info string.
     */
    public String getServletInfo() {
        return super.getServletInfo();
    }
    /**
     * Returns the named parameter as a java.sql.Timestamp.
     * Creation Date: (2001-2-5-8:52:43)
     * @return java.sql.Timestamp
     * @param request javax.servlet.http.HttpServletRequest
     * @param name java.lang.String
     * @param required boolean
     * @param defValue java.sql.Timestamp
     */
    public static java.sql.Timestamp getTimestampParameter(HttpServletRequest request, String name, boolean required, java.sql.Timestamp defValue) throws ServletException {
        String value = getParameter(request, name, required, String.valueOf(defValue));
        return java.sql.Timestamp.valueOf(value);
    }
    /**
     * Initializes the servlet.
     */
    public void init() {
        // Insert code to initialize the servlet here
    }
    /**
     * Process incoming requests for information
     *
     * @param request Object that encapsulates the request to the servlet
     * @param response Object that encapsulates the response from the servlet
     */
    public void performTask(HttpServletRequest request, HttpServletResponse response) {
        try
        {
            // Insert user code from here.
        }
        catch (Throwable theException)
        {
            // Uncomment the following line when unexpected exceptions
            // are occurring to aid in debugging the problem.
            // theException.printStackTrace();
        }
    }
    /**
     * Parses URL-encoded form data into a UTF8ParameterReader before dispatching the request.
     * Creation Date: (2001-2-5-8:31:54)
     * @param request javax.servlet.ServletRequest
     * @param response javax.servlet.ServletResponse
     * @exception javax.servlet.ServletException The exception description.
     * @exception java.io.IOException The exception description.
     */
    public void service(ServletRequest request, ServletResponse response) throws javax.servlet.ServletException, java.io.IOException {
        String content = request.getContentType();
        if (content == null || content != null && content.toLowerCase().startsWith("application/x-www-form-urlencoded"))
            request.setAttribute(PARAMS_ATTR_NAME, new UTF8ParameterReader((HttpServletRequest) request));
        super.service(request, response);
    }
}
This is the servlet base class. It overrides the parent class's service method: before invoking the parent's service, it creates a UTF8ParameterReader object that holds the information submitted in the form and saves that object on the request as an attribute. It then calls the parent class's service method.
Servlets that inherit from this class should note that the "standard" getParameter can no longer read POST data, because this class has already read it from the ServletInputStream. Use the getParameter method provided by this class instead.
What remains is the output problem: we want the output converted into a UTF-8 byte stream. As long as we specify the charset as UTF-8 when setting the Content-Type and then write through the PrintWriter, the conversion happens automatically. In a servlet, set:
response.setContentType("text/html;charset=UTF-8");
In a JSP, set:
<%@ page contentType="text/html;charset=UTF-8" %>
This ensures that the output is a UTF-8 stream; whether it displays correctly is then up to the client.
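The automatic conversion mentioned above is the ordinary Writer-to-bytes conversion in java.io. The sketch below imitates it outside a container, under the assumption that the response writer behaves like a PrintWriter wrapped around an OutputStreamWriter with the chosen charset; the class name Utf8OutputDemo and the helper render are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: a PrintWriter that encodes its output as UTF-8.
class Utf8OutputDemo {
    // Writes the text through a UTF-8 PrintWriter and returns the bytes produced.
    static byte[] render(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        PrintWriter writer = new PrintWriter(
            new OutputStreamWriter(out, StandardCharsets.UTF_8));
        writer.print(text);
        writer.flush(); // push buffered characters through the encoder
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = render("\u4e2d\u6587"); // "中文"
        System.out.println(bytes.length);                        // prints 6
        System.out.println(Integer.toHexString(bytes[0] & 0xff)); // prints e4
    }
}
```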
For multipart/form-data form submissions, I also have a class that handles them. Its constructor lets you specify the charset the page uses, with UTF-8 as the default. Its source code is omitted here for reasons of length.