Today I was working on a file download feature. When I passed a Chinese file name as a request parameter, the value I read in the action was garbled. After searching online for a long time, I found advice saying that URLEncoder and URLDecoder are required, so I rewrote the link generation:
buffer.append("<li><a href='" + request.getContextPath()+ "/fileDownload.do?filename=" + URLEncoder.encode(files[i].getName(),"UTF-8")+ "' >" + files[i].getName() + "</a></li>");
Then get the file name in the download action:
filename = URLDecoder.decode(request.getParameter("filename"),"UTF-8");
The result was still garbled. I kept searching and noticed a detail: one article called URLEncoder.encode twice. So I tried encoding twice when generating the link:
buffer.append("<li><a href='" + request.getContextPath()+ "/fileDownload.do?filename=" + URLEncoder.encode(URLEncoder.encode(files[i].getName(),"UTF-8"),"UTF-8")+ "' >" + files[i].getName() + "</a></li>");
This worked, and the file name was no longer garbled. That seemed very strange to me, so I searched Baidu for "URLEncoder twice". The underlying principle is as follows:
In the JSP, the Chinese text is encoded as UTF-8, but in the servlet, request.getParameter() automatically decodes the value using the encoding configured on the server. So the value encoded on the front end is decoded once on the back end, and because the encoding and decoding charsets are not the same, the characters come out garbled.
This is similar to the following code:
String name = java.net.URLEncoder.encode("测试", "UTF-8");
System.out.println(name);
System.out.println(java.net.URLDecoder.decode(name, "ISO-8859-1"));
The encoded value is %E6%B5%8B%E8%AF%95.
Decoded with ISO-8859-1, it comes out garbled: ??? ?
However, if you call
System.out.println(java.net.URLDecoder.decode(name, "UTF-8"));
the result is "测试".
This explains why, in a servlet, java.net.URLDecoder.decode(request.getParameter("name"), "UTF-8") and java.net.URLDecoder.decode(request.getQueryString(), "UTF-8") give different results: request.getParameter("name") has already decoded the value once, using the default ISO-8859-1, before URLDecoder ever sees it.
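The difference can be reproduced without a servlet container. The following is only a minimal sketch, assuming the container decodes with ISO-8859-1 as described above; the class SingleEncodeDemo and the helper containerDecode are hypothetical stand-ins for what getParameter() does, and the sample string is 测试 written as Unicode escapes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class SingleEncodeDemo {

    // Hypothetical stand-in for the container's automatic decoding in
    // getParameter(); ISO-8859-1 is the default assumed in this post.
    static String containerDecode(String raw) {
        try {
            return URLDecoder.decode(raw, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    static String encodeUtf8(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    static String decodeUtf8(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u6d4b\u8bd5";            // the two-character Chinese sample
        String rawQueryValue = encodeUtf8(original); // what getQueryString() would contain
        String fromGetParameter = containerDecode(rawQueryValue);

        System.out.println(rawQueryValue);                                      // %E6%B5%8B%E8%AF%95
        // Decoding the raw query value with UTF-8 restores the original.
        System.out.println(decodeUtf8(rawQueryValue).equals(original));         // true
        // The container-decoded value holds six Latin-1 characters and no '%',
        // so a later UTF-8 decode has nothing left to fix.
        System.out.println(fromGetParameter.length());                          // 6
        System.out.println(decodeUtf8(fromGetParameter).equals(original));      // false
    }
}
```

Once the value has passed through the ISO-8859-1 decode, it contains no percent escapes any more, which is why calling URLDecoder.decode on it afterwards changes nothing.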
Therefore, when using java.net.URLEncoder.encode() together with java.net.URLDecoder.decode(), you must call java.net.URLEncoder.encode() twice on the front-end page, and then call java.net.URLDecoder.decode() once in the servlet on the value returned by getParameter().
The process of using two encodings is equivalent to the following code:
String name = java.net.URLEncoder.encode("测试", "UTF-8");
System.out.println(name);
name = java.net.URLEncoder.encode(name, "UTF-8");
System.out.println(name);
name = java.net.URLDecoder.decode(name, "UTF-8");
System.out.println(name);
System.out.println(java.net.URLDecoder.decode(name, "UTF-8"));
Output:
%E6%B5%8B%E8%AF%95
%25E6%25B5%258B%25E8%25AF%2595
%E6%B5%8B%E8%AF%95
测试
The first encoding turns the Chinese characters into sequences of % followed by letters and digits. The second encoding then encodes only those % signs, letters, and digits (each % becomes %25). Even though the server's automatic decoding uses ISO-8859-1, decoding % and alphanumeric characters gives the same result under ISO-8859-1 and UTF-8, so that step returns the string that was encoded once. Decoding that string again with UTF-8 converts it back to the Chinese characters.
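The whole round trip can be sketched as below. This is an illustration under assumptions rather than the code from my project: DoubleEncodeDemo, encodeTwice, and receive are hypothetical names, and the containerCharset parameter stands in for whatever charset the server applies inside getParameter().

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class DoubleEncodeDemo {

    static String encode(String s, String charset) {
        try {
            return URLEncoder.encode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    static String decode(String s, String charset) {
        try {
            return URLDecoder.decode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Link-generation side: encode the file name twice with UTF-8.
    static String encodeTwice(String fileName) {
        return encode(encode(fileName, "UTF-8"), "UTF-8");
    }

    // Receiving side: the first decode simulates the container's automatic
    // decoding (hypothetical stand-in), the second is the explicit
    // URLDecoder.decode call made in the download action.
    static String receive(String rawQueryValue, String containerCharset) {
        String fromGetParameter = decode(rawQueryValue, containerCharset);
        return decode(fromGetParameter, "UTF-8");
    }

    public static void main(String[] args) {
        String original = "\u6d4b\u8bd5"; // the two-character Chinese sample
        String rawQueryValue = encodeTwice(original);

        System.out.println(rawQueryValue);                                       // %25E6%25B5%258B%25E8%25AF%2595
        // The result is the same whichever charset the container uses,
        // because the double-encoded value contains only ASCII characters.
        System.out.println(receive(rawQueryValue, "ISO-8859-1").equals(original)); // true
        System.out.println(receive(rawQueryValue, "UTF-8").equals(original));      // true
    }
}
```

The point of the second encoding is that the value on the wire no longer depends on the server's charset: the container only ever has to turn %25 back into %.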