Java Chinese Garbled Text: Solving Garbled Chinese Characters in URLs


We submit requests to the server mainly in two ways: through a URL or through a form. Forms generally do not produce garbled text; the problem appears mainly in URLs. As the previous post showed, the way a URL is encoded when a request is sent to the server is confusing: different operating systems, different browsers, and different page character sets produce completely different encoding results. It would be frightening if programmers had to account for every one of these outcomes. Is there a way to ensure that the client uses only one encoding when it sends requests to the server?

Yes! The main approaches are described below.

First, JavaScript
Encode with JavaScript so that the browser has no chance to intervene: encode the URL, send the request to the server, and then decode it on the server. To master this approach we need to know three JavaScript encoding functions: escape(), encodeURI(), and encodeURIComponent().

Escape
encodes the specified string using the Sio Latin character set. All non-ASCII characters are encoded into%XX-formatted strings, where XX represents the 16-digit number that the character corresponds to in the character set. For example, the encoding for the format corresponds to%20. Its corresponding decoding method is unescape ().

In fact, escape () cannot be used directly for URL encoding, and its true function is to return a Unicode encoded value of one character. For example, the result of "I am cm" above is%U6211%U662FCM, where "I" corresponds to a code of 6211, "yes" is encoded 662F, and "CM" is encoded as cm.

Note that escape () does not encode "+". But we know that when the Web page submits the form, if there are spaces, it will be converted to the + character. When the server processes the data, the + number is processed into spaces. So be careful when you use it.
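
The plus-sign behaviour on the server side can be checked with a small Java sketch. It assumes the server decodes the value with form-style rules, which is what java.net.URLDecoder implements; the class name PlusSignSketch and the helper decodeForm are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class PlusSignSketch {
    // Hypothetical helper: decode a form-encoded value as UTF-8.
    static String decodeForm(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeForm("1+1"));   // prints "1 1": the plus became a space
        System.out.println(decodeForm("1%2B1")); // prints "1+1": %2B survives as a plus
        System.out.println(decodeForm("a%20b")); // prints "a b"
    }
}
```

A literal plus sign therefore has to reach the server as %2B, which escape() will not produce.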

encodeURI()
Encodes an entire URL and outputs the encoded string in UTF-8. However, besides ASCII letters and digits, encodeURI() leaves a number of special characters unencoded, such as: ! @ # $ & * ( ) = : / ; ? + '

encodeURIComponent()
Converts a URI string into percent-escaped form using UTF-8. Compared with encodeURI(), encodeURIComponent() is more thorough: the symbols that encodeURI() leaves alone (; / ? : @ & = + $ , #) are all encoded. For that reason encodeURIComponent() should only be applied to the individual components of a URL, not to the entire URL. The corresponding decoding function is decodeURIComponent().

In practice we usually use encodeURI() for the encoding. The so-called "encode twice in JavaScript, decode twice on the back end" technique is built on this function. With JavaScript there are two ways to solve the problem: encoding once and encoding twice.

One turn code
JavaScript Transfer code:

var url = '/showmoblieqrcode.servlet?name= I am cm ';
window.location.href = encodeURI (URL);

URL after transcoding:http://127.0.0.1:8080/perbank/ShowMoblieQRCode.servlet?name=%E6%88%91%E6%98%AFcm

Back-end processing:

String name = request.getParameter("name");
System.out.println("Foreground incoming parameter: " + name);
// The container decoded the bytes as ISO-8859-1; recover them and decode as UTF-8
name = new String(name.getBytes("iso-8859-1"), "UTF-8");
System.out.println("After decoding parameter: " + name);

Output results:

Foreground incoming parameters:?????? Cm
After decoding the parameters: I am cm
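
Why the getBytes("iso-8859-1") step recovers the text can be seen in a standalone sketch. It assumes the container decoded the UTF-8 bytes of the parameter as ISO-8859-1, which maps every byte to exactly one character and so loses nothing. The class Latin1Repair and its two helpers are hypothetical names, not part of the servlet above.

```java
import java.nio.charset.StandardCharsets;

class Latin1Repair {
    // Simulates the container: UTF-8 bytes wrongly decoded as ISO-8859-1.
    static String misdecode(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // The repair step: take the original bytes back out and decode them as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u6211\u662Fcm";    // the article's example value
        String garbled = misdecode(original);
        // Each Chinese character became three Latin-1 characters: 3 + 3 + 2 = 8.
        System.out.println(garbled.length());                   // prints 8
        System.out.println(repair(garbled).equals(original));   // prints true
    }
}
```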

Two times turn code
JavaScript

var url = '/showmoblieqrcode.servlet?name= I am cm ';
Window.location.href = encodeURI (encodeURI (URL));

URL:HTTP://127.0.0.1:8080/PERBANK/SHOWMOBLIEQRCODE.SERVLET?NAME=%25E6%2588%2591%25E6%2598%25AFCM after the turn code

Back-end processing:

String name = request.getParameter("name");
System.out.println("Foreground incoming parameter: " + name);
// The container has already decoded once; decode the remaining layer as UTF-8
name = URLDecoder.decode(name, "UTF-8");
System.out.println("After decoding parameter: " + name);

Output results:

Foreground incoming parameters: E68891E698AFCM

After decoding the parameters: I am cm
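
The full round trip of the twice-encoded value can be traced in plain Java. This is only a sketch: URLEncoder stands in for encodeURI() on the parameter value, the container's first decode is assumed to use ISO-8859-1, and the class name DoubleEncodingSketch is made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class DoubleEncodingSketch {
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    static String decode(String s, String charset) {
        try {
            return URLDecoder.decode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u6211\u662Fcm";
        String once = encode(original);  // %E6%88%91%E6%98%AFcm
        String twice = encode(once);     // %25E6%2588%2591%25E6%2598%25AFcm
        // First decode (the container): the result is pure ASCII,
        // so the charset used here cannot damage it.
        String afterContainer = decode(twice, "ISO-8859-1");
        // Second decode (application code) with the charset the client used.
        String result = decode(afterContainer, "UTF-8");
        System.out.println(twice);
        System.out.println(afterContainer.equals(once)); // prints true
        System.out.println(result.equals(original));     // prints true
    }
}
```

The point of the second layer is that whatever charset the container applies, it only ever sees ASCII, so the application code alone decides how the real bytes are interpreted.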

Second, Filter
The second approach uses filters. Two kinds are provided here: the first sets the request encoding, and the second performs the decoding directly in the filter.

Filter 1
The filter directly sets the encoding format of the request.

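import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class CharacterEncoding implements Filter {
    private FilterConfig config;
    String encoding = null;

    public void destroy() {
        config = null;
    }

    public void doFilter(ServletRequest request, ServletResponse response,
            FilterChain chain) throws IOException, ServletException {
        // Set the request encoding before any parameter is read
        if (encoding != null) {
            request.setCharacterEncoding(encoding);
        }
        chain.doFilter(request, response);
    }

    public void init(FilterConfig config) throws ServletException {
        this.config = config;
        // Get the configuration parameter
        String str = config.getInitParameter("encoding");
        if (str != null) {
            encoding = str;
        }
    }
}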

Configuration:

<filter>
    <filter-name>chineseEncoding</filter-name>
    <filter-class>com.test.filter.CharacterEncoding</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>utf-8</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>chineseEncoding</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>

Filter 2
This filter decodes the parameters itself in doFilter() and then stores the decoded values back on the request as attributes.

import java.io.*;
import java.util.Enumeration;
import javax.servlet.*;
import javax.servlet.http.*;

public class CharacterEncoding implements Filter {
    protected FilterConfig filterConfig;
    String encoding = null;

    public void destroy() {
        this.filterConfig = null;
    }

    /**
     * Initialization
     */
    public void init(FilterConfig filterConfig) {
        this.filterConfig = filterConfig;
    }

    /**
     * Convert inStr to UTF-8 encoded form
     *
     * @param inStr input string
     * @return UTF-8 encoded string
     * @throws UnsupportedEncodingException
     */
    private String toUTF(String inStr) throws UnsupportedEncodingException {
        String outStr = "";
        if (inStr != null) {
            outStr = new String(inStr.getBytes("iso-8859-1"), "UTF-8");
        }
        return outStr;
    }

    /**
     * Chinese garbled-text filter processing
     */
    public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse,
            FilterChain chain) throws IOException, ServletException {
        HttpServletRequest request = (HttpServletRequest) servletRequest;
        HttpServletResponse response = (HttpServletResponse) servletResponse;

        // How the request was submitted (1. POST or 2. GET); each is handled differently
        String method = request.getMethod();
        if (method.equalsIgnoreCase("POST")) {
            // 1. A request submitted by POST: set the encoding to UTF-8 directly
            try {
                request.setCharacterEncoding("UTF-8");
            } catch (UnsupportedEncodingException e) {
                e.printStackTrace();
            }
        } else {
            // 2. A request submitted by GET
            // Take out the set of parameters submitted by the client
            Enumeration<?> paramNames = request.getParameterNames();
            // Walk the parameter set and take out each parameter's name and values
            while (paramNames.hasMoreElements()) {
                String name = (String) paramNames.nextElement(); // parameter name
                String values[] = request.getParameterValues(name); // parameter values
                // If the value set is not null
                if (values != null) {
                    // Walk the value set
                    for (int i = 0; i < values.length; i++) {
                        try {
                            // Convert the character encoding of each value with toUTF(values[i])
                            String vlustr = toUTF(values[i]);
                            values[i] = vlustr;
                        } catch (UnsupportedEncodingException e) {
                            e.printStackTrace();
                        }
                    }
                    // Store the values in the request in the form of an attribute
                    request.setAttribute(name, values);
                }
            }
        }
        // Set a response content type and character set that support Chinese
        response.setContentType("text/html;charset=utf-8");
        // Continue with the next filter; if there is none, execute the request
        chain.doFilter(request, response);
    }
}

Configuration:

<filter>
    <filter-name>chineseEncoding</filter-name>
    <filter-class>com.test.filter.CharacterEncoding</filter-class>
</filter>
<filter-mapping>
    <filter-name>chineseEncoding</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>

Third, Other

1. Set pageEncoding and contentType

<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>

2. Set Tomcat's URIEncoding

By default, Tomcat 7 and earlier decode the request URL as ISO-8859-1 (Tomcat 8 and later default to UTF-8). The URIEncoding attribute sets the character encoding used to decode the URL of a GET request, so we just need to add URIEncoding="UTF-8" to the <Connector> tag in Tomcat's server.xml file.
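
One caution when combining methods: once the container already decodes URLs as UTF-8, the manual ISO-8859-1 to UTF-8 conversion shown earlier should not be applied as well, because it corrupts text that is already correct. The sketch below illustrates this and shows one possible guard. The helper fixIfMisdecoded is a hypothetical heuristic, not a Tomcat or servlet API, and it can still misfire on values that legitimately consist only of Latin-1 characters such as accented letters.

```java
import java.nio.charset.StandardCharsets;

class RedecodeGuard {
    // Blind repair, as in the servlet code: assumes the value was mis-decoded.
    static String blindFix(String value) {
        return new String(value.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    // Hypothetical guard: only re-decode when every character fits in ISO-8859-1,
    // i.e. the value could be UTF-8 bytes that were read as Latin-1.
    static String fixIfMisdecoded(String value) {
        if (value == null
                || !StandardCharsets.ISO_8859_1.newEncoder().canEncode(value)) {
            return value; // contains characters outside Latin-1: already decoded
        }
        return blindFix(value);
    }

    public static void main(String[] args) {
        String correct = "\u6211\u662Fcm"; // already decoded properly
        // Characters outside Latin-1 are replaced by '?' when converted to bytes.
        System.out.println(blindFix(correct));                         // prints ??cm
        System.out.println(fixIfMisdecoded(correct).equals(correct));  // prints true
    }
}
```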

That is all for this article. I hope it helps you deal with garbled Chinese text in Java.
