The meaning of each part of JSP encoding

Source: Internet
Author: User

Server-side JSP encoding

pageEncoding

pageEncoding is the encoding of the JSP file itself.

In the first stage, the JSP is translated into a .java file. The container reads the JSP according to the pageEncoding setting (the encoding the file was actually saved in must match pageEncoding) and translates it from the specified encoding into Java source code (.java) in a uniform UTF-8. If pageEncoding is set wrongly, or not set at all, Chinese text comes out garbled. (In the JSP standard syntax, if the pageEncoding attribute exists, the character encoding of the JSP page is determined by pageEncoding; otherwise it is determined by the charset of the contentType attribute; if no charset exists either, the character encoding of the JSP page defaults to ISO-8859-1.) This attribute has one more function: when contentType is not specified in the JSP and the response.setCharacterEncoding method is not called, it also specifies the encoding used for the server response.

contentType

The charset in contentType is the encoding of the content the server sends to the client.

Note:

As shown above, both pageEncoding and contentType can set the character encoding of the JSP source file and of the response body, but their priorities differ. For the character set of the JSP source file, the priority is pageEncoding > contentType; if neither is set, the default is ISO-8859-1. For the character set of the response output, the priority is contentType > pageEncoding; if neither is set, the default is ISO-8859-1.
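The two priority rules in the note can be written down as a small sketch. The class and method names here (JspCharsets, sourceCharset, responseCharset) are hypothetical and chosen only for illustration; a real container applies these rules inside its JSP compiler. A null argument stands for an attribute that is not set.

```java
class JspCharsets {
    // Default named in the text when neither attribute is set.
    static final String DEFAULT = "ISO-8859-1";

    /** Encoding used to read the JSP source: pageEncoding > contentType charset > default. */
    static String sourceCharset(String pageEncoding, String contentTypeCharset) {
        if (pageEncoding != null) return pageEncoding;
        if (contentTypeCharset != null) return contentTypeCharset;
        return DEFAULT;
    }

    /** Encoding of the response body: contentType charset > pageEncoding > default. */
    static String responseCharset(String pageEncoding, String contentTypeCharset) {
        if (contentTypeCharset != null) return contentTypeCharset;
        if (pageEncoding != null) return pageEncoding;
        return DEFAULT;
    }

    public static void main(String[] args) {
        // Both attributes set: each side follows its own priority.
        System.out.println(sourceCharset("UTF-8", "GBK"));    // UTF-8
        System.out.println(responseCharset("UTF-8", "GBK"));  // GBK
        // Only pageEncoding set: it also decides the response encoding.
        System.out.println(responseCharset("UTF-8", null));   // UTF-8
        // Nothing set: the default applies.
        System.out.println(sourceCharset(null, null));        // ISO-8859-1
    }
}
```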

Client-side (browser) encoding

URL encoding

    1. Entering a URL directly in the browser: http://www.baidu.com/s?wd=春节 (Spring Festival)

Encoding of the query string:

IE: the encoding of the operating system.

Chrome: UTF-8

Firefox: UTF-8

    2. A link in a page: <a href="http://www.baidu.com/s?wd=春节">Click me</a>

Determined by the encoding of the web page, which is specified by Content-Type.

GET request

Determined by the encoding of the web page, which is specified by Content-Type.

POST request

Determined by the encoding of the web page, which is specified by Content-Type.

jQuery Ajax request

When sending a request, jQuery Ajax automatically encodes the query string as UTF-8.

$.ajax({
    data: {key: value} // an object like this is encoded automatically; a "key=value&..." string you assemble yourself is not encoded
});

Note: internally jQuery calls the jQuery.param method to encode the parameters (performing the encoding that would otherwise be handled by the browser).
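To make concrete what "encodes the query string as UTF-8" means, here is a minimal Java sketch that percent-encodes the wd value from the URL example. The class name QueryEncodingDemo and its helper are illustrative assumptions; java.net.URLEncoder is the standard JDK class. A browser that used a different charset would produce different bytes for the same two characters, which is why the server has to know which charset the client used.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class QueryEncodingDemo {
    /** Percent-encodes a query value using the given charset name. */
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // The two characters of "Spring Festival", written as escapes so the
        // source file's own encoding does not matter.
        String wd = "\u6625\u8282";
        // Each character becomes three UTF-8 bytes, each written as %XX.
        System.out.println("wd=" + encode(wd, "UTF-8")); // wd=%E6%98%A5%E8%8A%82
    }
}
```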

Encoding processing

GET request

There are two ways to handle encoding for GET requests (server: Tomcat):

    1. In code, hard-coded:

new String(request.getParameter("name").getBytes("ISO-8859-1"), "client encoding");

    2. In the server configuration (that is, the hard-coded operation is handed over to Tomcat)

In server.xml:

<Connector connectionTimeout="20000" port="8080" protocol="HTTP/1.1" redirectPort="8443" URIEncoding="UTF-8"/>

Or:

<Connector connectionTimeout="20000" port="8080" protocol="HTTP/1.1" redirectPort="8443" useBodyEncodingForURI="true"/>

    • URIEncoding re-decodes all data requested with the GET method using one uniform encoding.
    • useBodyEncodingForURI re-decodes the data according to the request.setCharacterEncoding parameter of the page that handles the request, so different pages can use different encodings. By default, this parameter is false.
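
The hard-coded fix in item 1 works because ISO-8859-1 maps every byte to exactly one character, so the original bytes can always be recovered. The sketch below simulates this outside a servlet container; the class GetParamRecovery and its method names are hypothetical, and UTF-8 is assumed as the client encoding.

```java
import java.nio.charset.StandardCharsets;

class GetParamRecovery {
    /** Simulates a container that decodes UTF-8 request bytes as ISO-8859-1. */
    static String misdecode(String original) {
        byte[] sentByClient = original.getBytes(StandardCharsets.UTF_8);
        return new String(sentByClient, StandardCharsets.ISO_8859_1);
    }

    /** The hard-coded fix: get the raw bytes back, then decode with the client encoding. */
    static String recover(String garbled) {
        byte[] rawBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u6625\u8282";          // two Chinese characters
        String garbled = misdecode(original);
        System.out.println(garbled.length());      // 6: one char per UTF-8 byte
        System.out.println(recover(garbled).equals(original)); // true
    }
}
```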

POST request

request.setCharacterEncoding(arg0); only works for POST.

JavaScript encoding functions

escape()

escape() cannot be used directly for URL encoding; what it really does is return the Unicode code value of a character. For example, the result for "春节" (Spring Festival) is %u6625%u8282, which means that in the Unicode character set "春" is character No. 6625 (hexadecimal) and "节" is character No. 8282 (hexadecimal).

Its rule is that there are 69 characters escape does not encode: *, +, -, ., /, @, _, 0-9, a-z, A-Z; all other characters are encoded. Symbols between \u0000 and \u00ff are converted to the form %xx, and the remaining symbols are converted to the form %uxxxx. The corresponding decoding function is unescape().

Note: First, regardless of the original encoding of the web page, once text is encoded by JavaScript it is treated as Unicode characters. In other words, the input and output of JavaScript functions are Unicode characters by default. Second, escape() does not encode "+". But when a form is submitted, any spaces in it are converted to "+" characters, and when the server processes the data, "+" is treated as a space. So be careful when you use it.
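The escape() rule described above is simple enough to re-implement, which makes its output easy to inspect. The following is a sketch in Java that follows only the rule as stated in this article (69 unencoded characters, %XX up to 0xFF, %uXXXX otherwise); the class name EscapeSketch is an assumption, and it is not a replacement for the browser function.

```java
class EscapeSketch {
    // The seven punctuation characters the text lists as left untouched.
    private static final String SAFE_PUNCTUATION = "*+-./@_";

    static String escape(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            boolean alphanumeric = (ch >= '0' && ch <= '9')
                    || (ch >= 'a' && ch <= 'z')
                    || (ch >= 'A' && ch <= 'Z');
            if (alphanumeric || SAFE_PUNCTUATION.indexOf(ch) >= 0) {
                out.append(ch);                                  // one of the 69 characters
            } else if (ch <= 0xFF) {
                out.append(String.format("%%%02X", (int) ch));   // code units up to 0xFF
            } else {
                out.append(String.format("%%u%04X", (int) ch));  // everything else
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("\u6625\u8282")); // %u6625%u8282
        System.out.println(escape("a b+c"));        // a%20b+c  (the "+" is left as-is)
    }
}
```

The second call shows the pitfall from the note: the "+" survives unencoded, so a server that treats "+" as a space will change the value.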

encodeURI()

encodeURI() is the function in JavaScript that is actually meant for encoding URLs.

It outputs each symbol in its UTF-8 form, with a % in front of every byte. Its corresponding decoding function is decodeURI(). Note that it does not encode single quotes.

There are 82 characters that encodeURI does not encode: !, #, $, &, ', (, ), *, +, ,, -, ., /, :, ;, =, ?, @, _, ~, 0-9, a-z, A-Z

encodeURIComponent()

The difference from encodeURI() is that it is used to encode the individual components of a URL, not the entire URL. There are 71 characters that encodeURIComponent does not encode: !, ', (, ), *, -, ., _, ~, 0-9, a-z, A-Z. Its corresponding decoding function is decodeURIComponent().
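
On the Java side, java.net.URLEncoder is close to encodeURIComponent but not identical: it writes a space as "+" and percent-encodes !, ', (, ) and ~. A common workaround, sketched below under the assumption that UTF-8 is wanted, is to post-process its output. The class name UriComponentSketch is hypothetical, and edge cases such as unpaired surrogates are not handled the same way as in JavaScript.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class UriComponentSketch {
    /** Approximates JavaScript's encodeURIComponent on top of URLEncoder. */
    static String encodeURIComponent(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8")
                    .replace("+", "%20")   // URLEncoder writes spaces as "+"
                    .replace("%21", "!")
                    .replace("%27", "'")
                    .replace("%28", "(")
                    .replace("%29", ")")
                    .replace("%7E", "~");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeURIComponent("\u6625\u8282 a!")); // %E6%98%A5%E8%8A%82%20a!
        System.out.println(encodeURIComponent("a+b"));             // a%2Bb
    }
}
```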

Why JavaScript encoding is sometimes applied twice

After the first encoding, the parameter content no longer contains multibyte characters; it is a pure ASCII string (call the result of this first encoding [str_enc1]). [str_enc1] is then encoded once more and submitted. On receipt, the container automatically decodes once, and whether it does so as GBK, UTF-8, or ISO-8859-1, it recovers [str_enc1] correctly, because that string contains only ASCII. The program then performs one more decodeURIComponent (in Java, usually java.net.URLDecoder.decode(**, "UTF-8")) to get the original value of the submitted parameter.

For example:

String str1 = URLEncoder.encode("Programmer", "utf-8"); // simulates the browser's first encoding

String str2 = URLEncoder.encode(str1, "utf-8"); // the browser's second encoding

String deStr1 = URLDecoder.decode(str2, "GBK"); // the server decodes once; whatever encoding it uses, it recovers the browser's first encoding correctly

String deStr2 = URLDecoder.decode(deStr1, "utf-8"); // finally yields the correct string

http://www.cnblogs.com/xckxue/p/4202278.html

1. byte and Unicode

The Java kernel is Unicode, and even class files are, but many media, including files and streams, are stored as byte streams, so Java has to convert these bytes. char is Unicode, and byte is a byte. The byte/char conversion functions in Java live in the sun.io package. The ByteToCharConverter class does the dispatching and can tell you which converter you are using. Its two most commonly used static functions are:

public static ByteToCharConverter getDefault();
public static ByteToCharConverter getConverter(String encoding);

If you do not specify a converter, the system automatically uses the current encoding: GBK on a GB platform, 8859_1 on an EN platform.

A simple example: the GB code of "你" is 0xC4E3 and its Unicode is 0x4F60.

String encoding = "gb2312";
byte b[] = {(byte) '\u00c4', (byte) '\u00e3'};
ByteToCharConverter converter = ByteToCharConverter.getConverter(encoding);
char[] c = converter.convertAll(b);
for (int i = 0; i < c.length; i++) {
    System.out.println(Integer.toHexString(c[i]));
}

This prints 0x4f60. If you use the 8859_1 encoding instead, it prints 0x00c4, 0x00e3. (Example 1)

In the other direction:

String encoding = "gb2312";
char c[] = {'\u4f60'};
CharToByteConverter converter = CharToByteConverter.getConverter(encoding);
byte[] b = converter.convertAll(c);
for (int i = 0; i < b.length; i++) {
    System.out.println(Integer.toHexString(b[i] & 0xff)); // mask so a negative byte prints as two hex digits
}

This prints 0xc4, 0xe3. (Example 2) With 8859_1 it prints 0x3f, a question mark, meaning the character cannot be converted.

Many Chinese-text problems derive from these two simplest classes. Many classes, however, do not directly accept an encoding as input, which causes a lot of inconvenience. Many programs rarely specify an encoding and simply use the default one, which makes porting difficult.

2. UTF-8

UTF-8 corresponds one-to-one with Unicode, and its implementation is very simple:

7-bit Unicode:  0_______
11-bit Unicode: 110_____ 10______
16-bit Unicode: 1110____ 10______ 10______
21-bit Unicode: 11110___ 10______ 10______ 10______

In most cases only Unicode of up to 16 bits is used. The GB code of "你" is 0xC4E3 and its Unicode is 0x4F60; we continue with the examples above.

Example 1: 0xC4E3 in binary is

1100 0100 1110 0011

Since there are only two bytes, we try to read them as a two-byte code, but this does not work, because the 7th bit is not 0, so "?" is returned.

Example 2: 0x4F60 in binary is

0100 1111 0110 0000

Filled into the UTF-8 pattern, it becomes

11100100 10111101 10100000
E4       BD       A0

so 0xE4, 0xBD, 0xA0 is returned.

3. String and byte[]

The core of a String is actually a char[], but converting bytes into a String requires an encoding. String.length() is actually the length of the char array, so with different encodings it is quite possible to split at the wrong point, producing broken characters and garbled text.

Example:

byte[] b = {(byte) '\u00c4', (byte) '\u00e3'};
String str = new String(b, encoding);

If encoding is 8859_1 there are two characters, but with encoding gb2312 there is only one. This problem often occurs when handling pagination.

4. Reader, Writer / InputStream, OutputStream

The core of Reader and Writer is char; the core of InputStream and OutputStream is byte. The main purpose of Reader and Writer is to read chars from, and write chars to, an InputStream/OutputStream.

A Reader example: the file test.txt contains only the single character "你", 0xC4, 0xE3.

String encoding = ...; // gb2312 or 8859_1, see below
InputStreamReader reader = new InputStreamReader(new FileInputStream("test.txt"), encoding);
char[] c = new char[10];
int length = reader.read(c);
for (int i = 0; i < length; i++) {
    System.out.println(c[i]);
}

If encoding is gb2312 there is only one character; if encoding is 8859_1 there are two characters.

We also need to understand the Java compiler: javac -encoding

We often do not pass the encoding parameter, but it is in fact important for cross-platform work. If encoding is not specified, the system default encoding is used: gb2312 on a GB platform and ISO8859_1 on an English platform.

The Java compiler actually calls the sun.tools.javac.Main class to compile files. The compile function of this class has an encoding variable, and the -encoding parameter is passed directly to it. The compiler reads the Java file according to this variable and then compiles it into a class file in UTF-8 form.

An example:

public void test() throws IOException {
    String str = "你";
    FileWriter write = new FileWriter("test.txt");
    write.write(str);
    write.close();
}

(Example 3)

If it is compiled with gb2312, you will find the bytes E4 BD A0 in the class file. If it is compiled with 8859_1, the characters are 00C4 00E3, in binary:

00000000 11000100 00000000 11100011

Because each character needs more than 7 bits, the 11-bit encoding is used:

11000011 10000100 11000011 10100011
C3       84       C3       A3

and you will find C3 84 C3 A3 in the class file.

But we tend to ignore this parameter, so cross-platform problems often arise:

Example 3 compiled on a Chinese platform produces ZhClass.
Example 3 compiled on an English platform produces EnClass.

1. ZhClass runs correctly on the Chinese platform, but not on the English platform.
2. EnClass runs correctly on the English platform, but not on the Chinese platform.

The reasons:

1. After compiling on the Chinese platform, str at run time is the char[] 0x4F60. When run on the Chinese platform, the default encoding of FileWriter is gb2312, so CharToByteConverter automatically calls the gb2312 converter, converts str into bytes for the FileOutputStream, and 0xC4, 0xE3 ends up in the file. On the English platform, however, the default for CharToByteConverter is 8859_1; FileWriter automatically uses 8859_1 to convert str, cannot represent it, and therefore outputs "?".

2. After compiling on the English platform, str at run time is the char[] 0x00C4 0x00E3. When run on the Chinese platform, these are not recognized as Chinese, so "??" appears. On the English platform, 0x00C4 -> 0xC4 and 0x00E3 -> 0xE3, so 0xC4, 0xE3 ends up in the file.

1. Interpretation of the JSP body: Tomcat first checks whether your page contains the "<%@ page include" symbols. If it does, Tomcat sets response.setContentType(..) at the same place and reads the file according to the encoding; if not, it reads the file as 8859_1. It then writes the .java file in UTF-8, uses sun.tools.Main to read that file (reading it as UTF-8, of course), and compiles it into a class file. What setContentType changes is a property of out; the default encoding of the out variable is 8859_1.

2. Interpretation of parameters: unfortunately, parameters are interpreted only as ISO8859_1; this can be seen in the servlet implementation code.

3. Interpretation of the include format: unfortunately, the person who wrote org.apache.jasper.compiler.Parser forgot to add one parameter, encoding, to the array JspUtil.ValidAttribute[], so this approach is not supported. You can recompile the source code yourself and add support for encoding.
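
The sun.io converter classes used in the examples of this section are JDK-internal and may not be available in a current JDK, so the byte values quoted there are easier to check with the public String and java.nio.charset API. The sketch below fills the UTF-8 bit patterns by hand for characters of up to 16 bits and shows the "?" (0x3f) replacement for ISO-8859-1. The class Utf8ByHand and its methods are hypothetical names; surrogate pairs (the 21-bit pattern) are deliberately left out.

```java
import java.nio.charset.StandardCharsets;

class Utf8ByHand {
    /** Encodes one char of up to 16 bits by filling the UTF-8 bit patterns. */
    static byte[] encode(char ch) {
        if (ch < 0x80) {                      // 7 bits:  0xxxxxxx
            return new byte[] {(byte) ch};
        }
        if (ch < 0x800) {                     // 11 bits: 110xxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xC0 | (ch >> 6)),
                (byte) (0x80 | (ch & 0x3F))};
        }
        return new byte[] {                   // 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
            (byte) (0xE0 | (ch >> 12)),
            (byte) (0x80 | ((ch >> 6) & 0x3F)),
            (byte) (0x80 | (ch & 0x3F))};
    }

    /** Formats bytes as lowercase hex pairs separated by spaces. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(encode('\u4f60')));  // e4 bd a0
        System.out.println(hex(encode('\u00c4')));  // c3 84
        System.out.println(hex(encode('\u00e3')));  // c3 a3
        // The hand-rolled result matches the JDK's own UTF-8 encoder.
        System.out.println(hex("\u4f60".getBytes(StandardCharsets.UTF_8)));      // e4 bd a0
        // ISO-8859-1 cannot represent the character, so "?" is substituted.
        System.out.println(hex("\u4f60".getBytes(StandardCharsets.ISO_8859_1))); // 3f
    }
}
```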

Summary:

If you are on NT, the simplest way is to trick Java and not use any encoding variables:

http://localhost/test/test.jsp?value=你

Result: Hello 你

However, this method is quite limited; for example, it fails for things like uploaded article fragments. The best solution is to use this directive: <%@ page contentType="text/html;charset=gb2312" %>

Workaround for Eclipse being unable to save files such as .js or .properties that contain Chinese

Window -> Preferences -> General -> Content Types -> Text -> JavaScript or Java Properties File: select it, change the Default encoding from ISO8859-1 to UTF-8, GBK, or gb2312, and then click "Update".
You can also change the default encoding for other file types, such as JSP and Java Source File.

Summary of garbled Chinese in JSP submissions

http://www.blogjava.net/luedipiaofeng/articles/307666.html

1: The most basic garbling problem
This is the simplest garbling problem, and newcomers usually run into it. It is caused by inconsistent page encodings.
<%@ page language="java" pageEncoding="UTF-8"%>
<%@ page contentType="text/html;charset=iso8859-1"%>
<title>Chinese issues</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<body>
I am a good person.
</body>

There are three encodings here.

The first encoding is the storage format of the JSP file. Eclipse saves the file in this encoding, and the JSP file, including the Chinese characters in it, is compiled according to it.

The second encoding is the decoding format. Because a file saved as UTF-8 is decoded as iso8859-1, the Chinese text is bound to be garbled, so the two must be consistent. The second line may also be omitted, but the default is then still ISO8859-1, so without this line "I am a good person" also comes out garbled. The two must be consistent.

The third encoding controls how the browser decodes the page. If the preceding encodings are consistent and correct, this one does not matter. Some pages appear garbled because the browser cannot determine which encoding to use: pages are sometimes embedded in other pages, which confuses the browser about the encoding and produces garbled text.


2: Garbled text received after a form is submitted with the POST method
This is also a common problem. It is again caused by Tomcat's internal encoding, iso8859-1: when a POST is submitted without the request encoding being set, the data is handled as iso8859-1 (Tomcat's default encoding), while the receiving JSP expects utf-8, which produces garbled text. Given this cause, there are several workarounds, compared below.
A: Convert the encoding when receiving the parameter
String str = new String(request.getParameter("something").getBytes("ISO-8859-1"), "utf-8");

In this case every parameter must be transcoded this way, which is very tedious, but it does recover the Chinese characters.

B: At the beginning of the page that handles the request, set the request encoding with request.setCharacterEncoding("UTF-8"), which sets the character set of the submitted content to UTF-8. The page that receives the parameter then does not need to transcode; it can use String str = request.getParameter("something"); directly to obtain the Chinese parameter. But every page has to execute this statement. This method is effective only for POST submissions; it has no effect for GET submissions or for file uploads with enctype="multipart/form-data". These two garbling cases are described separately later.

C: To avoid writing request.setCharacterEncoding("UTF-8") on every page, it is recommended to use a filter that sets the encoding for all JSPs.
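
What setting the request encoding changes can be seen by decoding a form body by hand. The sketch below is a simplified stand-in for what a container does with an application/x-www-form-urlencoded POST body; the class FormBodyDecoder and its parse method are hypothetical, and the body string is a made-up example using the parameter name from method A. Decoding with the right charset up front yields the correct value directly, while decoding as ISO-8859-1 yields one garbled character per byte.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

class FormBodyDecoder {
    /** Splits a urlencoded body into parameters, decoding with the given charset. */
    static Map<String, String> parse(String body, String charset) {
        Map<String, String> params = new LinkedHashMap<>();
        try {
            for (String pair : body.split("&")) {
                int eq = pair.indexOf('=');
                String key = eq < 0 ? pair : pair.substring(0, eq);
                String value = eq < 0 ? "" : pair.substring(eq + 1);
                params.put(URLDecoder.decode(key, charset),
                           URLDecoder.decode(value, charset));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
        return params;
    }

    public static void main(String[] args) {
        // Body as sent by a page whose encoding is UTF-8.
        String body = "something=%E6%98%A5%E8%8A%82&page=1";

        String right = parse(body, "UTF-8").get("something");
        String wrong = parse(body, "ISO-8859-1").get("something");

        System.out.println(right.equals("\u6625\u8282")); // true
        System.out.println(wrong.length());               // 6: garbled, one char per byte
    }
}
```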

3: Handling garbled text when a form is submitted with the GET method

If Chinese text is submitted with GET, the page that receives the parameters also shows garbled text, and the cause is again Tomcat's internal encoding, iso8859-1. Tomcat handles the Chinese characters of a GET request with its default encoding, iso8859-1, as they are appended to the URL, so the parameters arrive garbled on the receiving page.

Workaround:

A: Use the first method from the previous section: decode the received characters and then transcode them.

B: GET submits its data through the URL, and the data has already been encoded as iso8859-1 by the time it enters the URL. To influence this encoding, add the useBodyEncodingForURI="true" attribute to the Connector node in server.xml. It makes Tomcat handle the Chinese characters of GET requests using the encoding set by request.setCharacterEncoding("UTF-8"), so they are automatically treated as utf-8 and the receiving page gets them correctly. But I think that in the real process Tomcat also encodes once more according to the URIEncoding="UTF-8" set in the following Connector:
<Connector port="8080"
maxThreads="..." minSpareThreads="..." maxSpareThreads="75"
enableLookups="false" redirectPort="8443" acceptCount="100"
debug="0" connectionTimeout="20000" useBodyEncodingForURI="true"
disableUploadTimeout="true"
URIEncoding="UTF-8"/>
However, since the data has already been encoded as UTF-8, this second encoding changes nothing.
If the value is obtained from the URL, the receiving page decodes it according to URIEncoding="UTF-8".

1. (Client-side garbling) Garbled display in IE

1). <%@ page pageEncoding="utf-8"%> -----> specifies the encoding used to read the JSP file

2). Note: when saving the JSP file with "Save As", choose the utf-8 format ===> consistent with step 1

3). <%@ page contentType="text/html;charset=utf-8"%> -----> specifies the encoding of the response stream

4). Note: IE -----> automatically selects utf-8 ===> consistent with step 3

Note: the HTML tag <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> has no effect here. It is only useful when IE opens a local HTML file directly; in that case it tells IE which encoding to use to read the local file.

(This is probably not right either: before IE has read the HTML, how would it know the file contains this information, and once it has read the file, what use is the information?)

2. (Server-side garbling) request.getParameter("username"); in *.java

When a form in HTML contains Chinese, the encoding of the stream submitted to the server is the same as the one under "IE -> View -> Encoding -> Auto-Select (utf-8)". So, in Java, before reading parameters, call: request.setCharacterEncoding("UTF-8");
