Garbled Chinese characters in PHP URLs

Source: Internet
Author: User
Tags: alphanumeric characters

If you are running Apache on Linux, garbled Chinese text in URLs is a very common problem. This article describes how to fix Chinese text that arrives garbled when it is passed through a URL in PHP. I hope it helps.

Garbled characters appear when a Chinese parameter is passed as ?id=中文. This is the result of the value being transcoded twice. In PHP, Chinese characters cannot be passed directly in a URL, which has always annoyed me, but there is no way around it. I do not know whether other languages have the same problem.

Adding header("Content-Type: text/html; charset=UTF-8") at the top of the page, and setting the database, the pages and so on to utf8, does not help at all: the Chinese characters passed in the URL still arrive garbled.

Although all the PHP on 04ie.com uses UTF-8 throughout, the value always arrived garbled. I tested several browsers and found that 360 passed it correctly but IE did not. I then tried converting with $msg = iconv('gbk', 'utf-8', $_GET["msg"]); and after that, most of the browsers I tested showed garbled text.

The conclusion: Chinese characters cannot be passed directly in a URL as a GET value. If they have to go in the URL, process them with urlencode() first. I do not know about POST; I have not tested it yet.
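A minimal sketch of that conclusion, assuming the script file is saved as UTF-8. The page name td.php and the parameter name msg are only illustrative. The link is built with urlencode(), and parse_str() stands in for the decoding PHP performs when it fills $_GET on the receiving page.

```php
<?php
// Assumption: this file is saved as UTF-8, so "中文" is the bytes E4 B8 AD E6 96 87.
$msg = "中文";

// Sending side: percent-encode the value before putting it in the URL.
$link = "td.php?msg=" . urlencode($msg);
echo $link . "\n"; // td.php?msg=%E4%B8%AD%E6%96%87

// Receiving side: PHP decodes the query string when it builds $_GET.
// parse_str() applies the same decoding, so it is used here to simulate that step.
parse_str(parse_url($link, PHP_URL_QUERY), $params);
echo $params['msg'] . "\n"; // 中文
```

Because $_GET values are already decoded, the receiving page normally does not need a second urldecode() call.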


The PHP manual describes urlencode() as follows:

urlencode(): this function encodes a string for use in a URL. For example, a space becomes a plus sign. In web pages, form data is urlencoded before it is sent.

That explains why submitting from a form worked fine, while Chinese placed directly in the URL came out garbled.


Encoding and decoding give different results depending on the character set:

中文 -> GB2312 Encode -> %D6%D0%CE%C4

中文 -> UTF-8 Encode -> %E4%B8%AD%E6%96%87
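The two results above can be checked with a short sketch. It assumes the file is saved as UTF-8 and that the mbstring extension is available; the variable names are illustrative.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is loaded.
$utf8 = "中文";
$gb   = mb_convert_encoding($utf8, 'GB2312', 'UTF-8');

// Same two characters, different bytes in each character set.
echo strtoupper(bin2hex($gb)) . "\n";   // D6D0CEC4
echo strtoupper(bin2hex($utf8)) . "\n"; // E4B8ADE69687

// urlencode() simply percent-encodes whatever bytes it is given.
echo urlencode($gb) . "\n";   // %D6%D0%CE%C4
echo urlencode($utf8) . "\n"; // %E4%B8%AD%E6%96%87
```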

URLEncode in HTML:

In an HTML file encoded as GB2312: /中文.rar -> the browser automatically converts it to -> /%D6%D0%CE%C4.rar

Note: Firefox does not support Chinese URLs encoded as GB2312, because by default it sends URLs as UTF-8. The ftp:// protocol does work, though. I tried it, and I think this is a Firefox bug.

In an HTML file encoded as UTF-8: /中文.rar -> the browser automatically converts it to -> /%E4%B8%AD%E6%96%87.rar

URLEncode in PHP:

The code is as follows:
<?php
// GB2312 Encode (the results below assume this file is saved as GB2312)
echo urlencode("中文-_. ") . "\n";           // %D6%D0%CE%C4-_.+
echo urldecode("%D6%D0%CE%C4-_.") . "\n";    // 中文-_.
echo rawurlencode("中文-_. ") . "\n";        // %D6%D0%CE%C4-_.%20
echo rawurldecode("%D6%D0%CE%C4-_.") . "\n"; // 中文-_.
?>


All non-alphanumeric characters except "-_." are replaced with a percent sign "%" followed by two hexadecimal digits.

The difference between urlencode and rawurlencode: urlencode encodes a space as the plus sign "+", while rawurlencode encodes a space as "%20".
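A short sketch of that difference, using an illustrative ASCII string so the result does not depend on the file encoding. It also shows the matching behaviour of the two decode functions.

```php
<?php
$s = "a b+c"; // illustrative input containing a space and a literal plus sign

$plus = urlencode($s);    // space becomes "+", "+" becomes %2B
$raw  = rawurlencode($s); // space becomes %20, "+" becomes %2B
echo $plus . "\n"; // a+b%2Bc
echo $raw . "\n";  // a%20b%2Bc

// urldecode() turns "+" back into a space; rawurldecode() leaves "+" alone.
$d1 = urldecode("a+b");
$d2 = rawurldecode("a+b");
echo $d1 . "\n"; // a b
echo $d2 . "\n"; // a+b
```

As a rule of thumb, urlencode() suits query-string values and rawurlencode() suits path segments, where a "+" would not be read as a space.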

If you want UTF-8 encoding instead, there are two ways:

1. Save the file as UTF-8 and use urlencode or rawurlencode directly.

2. Use the mb_convert_encoding function:

The code is as follows:
<?php
// This file is saved as GB2312, so convert the string to UTF-8 before encoding it
$url = 'http://s./中文.rar';
echo urlencode(mb_convert_encoding($url, 'utf-8', 'gb2312')) . "\n";
echo rawurlencode(mb_convert_encoding($url, 'utf-8', 'gb2312')) . "\n";
// http%3A%2F%2Fs.%2F%E4%B8%AD%E6%96%87.rar
?>


Example:

The code is as follows:
<?php
function parseurl($url = "")
{
    // Convert from UTF-8 to GB2312, then percent-encode the result
    $url = rawurlencode(mb_convert_encoding($url, 'gb2312', 'utf-8'));
    // Restore the URL delimiters that rawurlencode escaped
    $a = array("%3A", "%2F", "%40");
    $b = array(":", "/", "@");
    $url = str_replace($a, $b, $url);
    return $url;
}
$url = "ftp://ud03:password@s./中文/中文.rar";
echo parseurl($url);
// ftp://ud03:password@s./%D6%D0%CE%C4/%D6%D0%CE%C4.rar
?>


URLEncode in JavaScript:

Example: %E4%B8%AD%E6%96%87-_.%20 %E4%B8%AD%E6%96%87-_.%20

encodeURI does not encode the following characters: ":", "/", ";", "?" and "@".

For example:/users


So it seems that Chinese characters in a URL have to go through urlencode(): encode before sending, then decode after receiving. Two functions are used, encoding: ".urlencode('中文')." and decoding: ".urldecode('中文').", where the Chinese text in the parentheses is the value being passed.

The sending page first encodes the value: td.php?id=".urlencode('中文')."; the receiving page then decodes the id value with urldecode().

A snippet for detecting the encoding is attached below.

The code is as follows:

if (preg_match("/^([" . chr(228) . "-" . chr(233) . "]{1}[" . chr(128) . "-" . chr(191) . "]{1}[" . chr(128) . "-" . chr(191) . "]{1})+$/", $msg)) // if $msg is UTF-8 encoded
{
    $msg = iconv("UTF-8", "GB2312", $msg); // convert $msg from UTF-8 to GB2312
}
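The regular expression above only matches strings made up entirely of three-byte UTF-8 sequences. As an alternative sketch, and not the original author's method, the same check can be written with mb_check_encoding(), assuming the mbstring extension is available. The helper name to_gb2312_if_utf8 is hypothetical.

```php
<?php
// Hypothetical helper, not from the original article.
// Assumption: input that is not valid UTF-8 is already GB2312.
function to_gb2312_if_utf8(string $msg): string
{
    // mb_check_encoding() reports whether the bytes form valid UTF-8.
    if (mb_check_encoding($msg, 'UTF-8')) {
        return mb_convert_encoding($msg, 'GB2312', 'UTF-8');
    }
    return $msg;
}

// UTF-8 input is converted; GB2312 input passes through unchanged.
$fromUtf8 = urlencode(to_gb2312_if_utf8("\xE4\xB8\xAD\xE6\x96\x87")); // "中文" as UTF-8 bytes
$fromGb   = urlencode(to_gb2312_if_utf8("\xD6\xD0\xCE\xC4"));         // "中文" as GB2312 bytes
echo $fromUtf8 . "\n"; // %D6%D0%CE%C4
echo $fromGb . "\n";   // %D6%D0%CE%C4
```

This is still a heuristic: some GB2312 byte sequences happen to be valid UTF-8 as well, so knowing the sender's encoding in advance is more reliable than guessing it.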
