Fixing garbled text caused by typing Chinese characters directly into a URL parameter in the IE address bar

Source: Internet
Author: User
This article puts a full stop to a problem I could not solve a few years ago while working on a search engine project. It is of little practical use now, but it makes up for an old regret.

The scenario at the time was this. Normally, users type a search term into the search box and run the search. Some users, however, copy the URL from the address bar, change the parameter by hand, and request it directly. A copied URL looks like http://www.xxx.com/search?keyword=%E4%B8%AD%E6%96%87 (that is how IE displays it; Chrome and Firefox show the Chinese characters in the address bar). When a user edits the parameter in IE, the request becomes http://www.xxx.com/search?keyword=中文, and the server (the web back end) cannot recognize the characters at all. The reason is that when a browser submits a request to the back end, its parameters must be URL-encoded in line with the ISO-8859-1 convention. IE does not do this for characters typed into the address bar, so when writing a web application the conversion has to be handled manually for IE. For Chrome and Firefox the manual step is optional, because they encode the parameters automatically during transmission.
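
As a minimal sketch of what the encoded parameter contains, the following Python snippet percent-encodes the search word from the example URL. Python is used here only for illustration and is not part of the original setup. It shows that the string in the copied URL is the UTF-8 encoding of 中文, and that the same word encoded from GBK bytes (the charset of a Chinese Windows client in this scenario) looks different.

```python
from urllib.parse import quote, unquote

keyword = "中文"

# Percent-encoding of the UTF-8 bytes, as in the URL copied from the address bar.
utf8_encoded = quote(keyword, encoding="utf-8")

# The same word percent-encoded from its GBK bytes gives a different string.
gbk_encoded = quote(keyword, encoding="gbk")

print(utf8_encoded)  # %E4%B8%AD%E6%96%87
print(gbk_encoded)   # %D6%D0%CE%C4

# Decoding only round-trips when the charset matches the one used to encode.
print(unquote(utf8_encoded, encoding="utf-8"))  # 中文
```

The back end has to know which of the two charsets the bytes came from. The percent-encoding itself does not carry that information.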

The back end failing to recognize the characters is what we usually call garbled text, and it is caused by a decoding error. The web container (a framework such as Jetty, Tomcat, or JBoss in Java, or Django in Python) automatically URL-decodes the query string. At that point the unencoded characters submitted by IE are decoded as well, and, as you can imagine, the original text cannot be recovered afterwards. (Many people, like me, have run into this kind of garbled text and, in desperation, tried every remedy they could find.)
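
The loss of information can be illustrated with a short Python sketch. It assumes, as in the scenario described in this article, that the client sent the raw GBK bytes of 中文 and that the back end decodes them as UTF-8; the variable names are made up for the example.

```python
# Assumption: IE sent the raw, unencoded GBK bytes of the search word.
raw_from_ie = "中文".encode("gbk")

# A back end that expects UTF-8 cannot decode these bytes, so it substitutes
# replacement characters (U+FFFD) for the parts it cannot interpret.
garbled = raw_from_ie.decode("utf-8", errors="replace")

# Re-encoding the garbled text does not give the original bytes back.
recovered = garbled.encode("utf-8")

print(garbled == "中文")         # False
print(recovered == raw_from_ie)  # False
```

Once the wrong decoding has been applied, the original bytes are gone, which is why the fix has to happen before the container decodes the parameter.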

There are two ways to solve this. The first is to handle the request before it reaches the web back end: the front-end server (nginx) preprocesses it and URL-encodes any unencoded characters. (The JavaScript layer cannot help, because the user presses Enter directly in the address bar.) The second is to rewrite the parameter-handling logic of the servlet in the web container so that it decides whether URL decoding is needed before decoding.

Given the implementation difficulty, I chose the first approach: handle it in nginx, using Lua to transcode the parameter, and then reverse-proxy the request to the web back end.

Depending on your project, there are several cases to watch for: whether your project uses UTF-8 or GBK, and whether the client environment uses UTF-8 or GBK. Each combination needs different handling. In my case the browsers run on Windows, so the client encoding is GBK, while my project uses UTF-8. Before URL-encoding, I therefore also need to convert from GBK to UTF-8.

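 set_by_lua $arg_name '
     local iconv = require("luaiconv")
     local cd = iconv.new("utf-8", "gbk")  -- iconv.new(to, from)
     local value = ngx.var.arg_name
     -- A literal % means the value is already URL-encoded: pass it through.
     if not value or string.find(value, "%", 1, true) then
         return value
     end
     local converted, err = cd:iconv(value)  -- GBK to UTF-8
     return ngx.escape_uri(err and value or converted)
 ';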
In this scenario my parameter is called name, and I use the luaiconv library for the conversion. The logic is admittedly not very rigorous: I do not actually detect the encoding, and I decide whether encoding is needed simply by checking whether the string contains %.
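
The % check can be tightened somewhat. The sketch below is one possible approach, written in Python for illustration rather than as the Lua code used above. The helper name normalize_param and the fallback order (UTF-8 first, then GBK) are assumptions, not part of the original setup.

```python
import re
from urllib.parse import quote

# A value counts as already URL-encoded when it consists only of
# unreserved characters, "+", and well-formed %XX escapes.
_ALREADY_ENCODED = re.compile(rb"(?:[A-Za-z0-9\-._~+]|%[0-9A-Fa-f]{2})*")


def normalize_param(raw: bytes) -> str:
    """Hypothetical helper: return a UTF-8 percent-encoded parameter value."""
    if _ALREADY_ENCODED.fullmatch(raw):
        return raw.decode("ascii")  # already encoded, pass through unchanged
    try:
        text = raw.decode("utf-8")  # client already sent UTF-8 bytes
    except UnicodeDecodeError:
        # Assumed fallback: a Chinese Windows client sending GBK bytes.
        text = raw.decode("gbk", errors="replace")
    return quote(text, safe="", encoding="utf-8")


print(normalize_param(b"%E4%B8%AD%E6%96%87"))    # %E4%B8%AD%E6%96%87
print(normalize_param("中文".encode("gbk")))      # %E4%B8%AD%E6%96%87
print(normalize_param("中文".encode("utf-8")))    # %E4%B8%AD%E6%96%87
```

Trying UTF-8 first is still only a heuristic: some GBK byte sequences are also valid UTF-8, so a stricter version would need to learn the client charset from another source.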

Three years ago, Google was the only search engine that handled Chinese characters typed manually into the IE address bar, but many companies do the same today.

