A detailed introduction to character encoding issues in PHP strings

Source: Internet
Author: User
As we all know, the number of bytes a character occupies in memory depends on its encoding. An ASCII character occupies 1 byte, while a Chinese character typically occupies 3 bytes in UTF-8 and 2 bytes in GBK.
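As a minimal sketch of these byte counts, strlen() can be used, since it counts bytes rather than characters. The sample character is written with byte escapes so the result does not depend on how the source file is saved; the variable names are illustrative, and the mbstring extension is assumed to be enabled.

```php
<?php
// "中" written as explicit UTF-8 bytes.
$utf8 = "\xE4\xB8\xAD";
// The same character converted to GBK (assumes mbstring is available).
$gbk = mb_convert_encoding($utf8, 'GBK', 'UTF-8');

echo strlen('A'), "\n";   // 1 byte in ASCII
echo strlen($utf8), "\n"; // 3 bytes in UTF-8
echo strlen($gbk), "\n";  // 2 bytes in GBK
```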

PHP also comes with several functions for taking substrings, of which substr and mb_substr are the most commonly used.

When you use substr on Chinese text, garbled characters can appear because substr cuts by byte. With UTF-8 encoded Chinese, substr may keep only one or two of a character's three bytes, and the incomplete character is displayed as garbage.

In mb_substr(string $str, int $start [, int $length [, string $encoding]]), the $encoding parameter names the character encoding of the string; if omitted, the internal character encoding is used.
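The difference between the two functions can be sketched as follows, assuming a UTF-8 string and the mbstring extension. The output is shown as hex so the broken byte is visible.

```php
<?php
$str = "\xE4\xB8\xAD\xE5\x9B\xBD"; // "中国" as UTF-8 bytes

$byte = substr($str, 0, 1);             // first byte only: an incomplete character
$char = mb_substr($str, 0, 1, 'UTF-8'); // first character: "中"

echo bin2hex($byte), "\n";           // e4
echo bin2hex($char), "\n";           // e4b8ad
echo mb_strlen($str, 'UTF-8'), "\n"; // 2
```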

If you do not know the encoding of the string, you can check it with mb_detect_encoding:

$encoding = mb_detect_encoding($string, array("ASCII", "UTF-8", "GB2312", "GBK", "BIG5"));

Then pass the detected encoding to mb_substr:

mb_substr($string, $start, $length, $encoding);
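Detection followed by a character-based cut might look like the sketch below. The helper name cut_detected is hypothetical, and the candidate list is deliberately short: detection results depend on the candidates and their order, and several multi-byte encodings can accept the same bytes, so the list should match the encodings you actually expect. Strict mode is used so that invalid input yields false.

```php
<?php
// Hypothetical helper: detect the encoding, then cut by characters.
function cut_detected(string $s, int $start, int $length)
{
    // Third argument enables strict mode: bytes invalid in a candidate rule it out.
    $enc = mb_detect_encoding($s, ['ASCII', 'UTF-8'], true);
    if ($enc === false) {
        return false; // none of the candidates fit; the caller must decide
    }
    return mb_substr($s, $start, $length, $enc);
}

$utf8 = "\xE4\xB8\xAD\xE5\x9B\xBD"; // "中国" as UTF-8 bytes
$gb   = "\xD6\xD0\xB9\xFA";         // "中国" as GB2312 bytes, not valid UTF-8

echo bin2hex(cut_detected($utf8, 1, 1)), "\n"; // e59bbd ("国")
var_dump(cut_detected($gb, 0, 1));             // bool(false)
```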

Note that cutting strings with mb_substr is not very efficient.

Using encoding-related PHP functions

ord(substr($str, $i, 1)) > 0xa0

ord($string) returns the byte value (ASCII code) of the first byte of the string. It is used here to decide whether the byte at the cut position belongs to a Chinese character, because a Chinese character takes 2 bytes in GB2312 and 3 bytes in UTF-8. In other words, the bytes of a Chinese character lie above the ASCII range (greater than 0x7F, and greater than 0xA0 in GB2312).
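The ord() check above can be turned into a byte-limited cut that never splits a character. This is only a sketch for GB2312 input; the function name gb_cut is hypothetical, and it assumes every byte above 0xA0 starts a two-byte character.

```php
<?php
// Hypothetical helper: cut a GB2312 string to at most $maxBytes bytes
// without splitting a two-byte character.
function gb_cut(string $str, int $maxBytes): string
{
    $out = '';
    $len = strlen($str);
    $i = 0;
    while ($i < $len) {
        // A lead byte above 0xA0 starts a two-byte GB2312 character.
        $width = (ord($str[$i]) > 0xa0 && $i + 1 < $len) ? 2 : 1;
        if (strlen($out) + $width > $maxBytes) {
            break; // the next character would not fit completely
        }
        $out .= substr($str, $i, $width);
        $i += $width;
    }
    return $out;
}

$gb = "a\xD6\xD0\xB9\xFA"; // "a中国" as GB2312 bytes

echo bin2hex(gb_cut($gb, 2)), "\n"; // 61 (the 2-byte character does not fit)
echo bin2hex(gb_cut($gb, 3)), "\n"; // 61d6d0
```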


Regular expressions:

Matching Chinese characters (runs of non-ASCII bytes): preg_match_all('/[\x80-\xff]+/', $string, $match);

Matching English (runs of ASCII characters): preg_match_all('/[\x01-\x7f]+/', $string, $match);
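A short sketch of both patterns on a mixed string follows. The byte-range pattern matches any non-ASCII bytes, not only Chinese, so the last line shows a Unicode-aware alternative with the /u modifier and \p{Han}; that alternative is an addition here and assumes the input is valid UTF-8.

```php
<?php
$string = "abc\xE4\xB8\xAD\xE5\x9B\xBDdef"; // "abc中国def" as UTF-8 bytes

// Byte-oriented: runs of bytes in the range 0x80-0xFF.
preg_match_all('/[\x80-\xff]+/', $string, $nonAscii);
// Runs of ASCII bytes.
preg_match_all('/[\x01-\x7f]+/', $string, $ascii);
// Unicode-aware alternative: one Han character per match.
$hanCount = preg_match_all('/\p{Han}/u', $string, $han);

echo bin2hex($nonAscii[0][0]), "\n"; // e4b8ade59bbd
echo implode(',', $ascii[0]), "\n";  // abc,def
echo $hanCount, "\n";                // 2
```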


Encoding Conversion

iconv(string $in_charset, string $out_charset, string $str)

For example, to convert GB2312 to UTF-8: iconv("GB2312", "UTF-8", $text)
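A minimal round trip with iconv, assuming the iconv extension is enabled and the platform's iconv implementation supports GB2312. The input is given as byte escapes so the example does not depend on the file's own encoding.

```php
<?php
$gb = "\xD6\xD0\xB9\xFA"; // "中国" as GB2312 bytes

$utf8 = iconv('GB2312', 'UTF-8', $gb);   // GB2312 -> UTF-8
$back = iconv('UTF-8', 'GB2312', $utf8); // and back again

echo bin2hex($utf8), "\n"; // e4b8ade59bbd
echo bin2hex($back), "\n"; // d6d0b9fa
```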

URL encoding: urlencode

In the returned string, all non-alphanumeric characters except -_. are replaced with a percent sign (%) followed by two hexadecimal digits, and spaces are encoded as plus signs (+). This is the same encoding used for WWW form POST data, that is, the application/x-www-form-urlencoded media type.

Note, however, that you should encode only individual parts of a URL (such as a query value) rather than the whole URL, otherwise the colon and slashes in the URL are escaped as well.
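The following sketch contrasts encoding a whole URL with encoding only the query value. The URL and the placeholder value x are illustrative.

```php
<?php
$keyword = "\xE4\xB8\xAD\xE5\x9B\xBD"; // "中国" as UTF-8 bytes

// Encoding the whole URL escapes the separators too.
$whole = urlencode('http://www.baidu.com/s?wd=x');
// Encoding only the value keeps the URL structure intact.
$url = 'http://www.baidu.com/s?wd=' . urlencode($keyword);

echo $whole, "\n"; // http%3A%2F%2Fwww.baidu.com%2Fs%3Fwd%3Dx
echo $url, "\n";   // http://www.baidu.com/s?wd=%E4%B8%AD%E5%9B%BD
```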

urlencode output generally comes in two forms, depending on the encoding of the input string: the traditional form based on GB2312 and the form based on UTF-8. For example:

$url = '中国';
echo urlencode($url);
// UTF-8:  %E4%B8%AD%E5%9B%BD
// GB2312: %D6%D0%B9%FA

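A Baidu search URL, for example, carries the keyword in its wd parameter:

http://www.baidu.com/s?wd=%e4%b8%ad%e5%9b%bd&rsv_bp=0&ch=&tn=baidu&bar=&rsv_spt=3&ie=utf-8&rsv_sug3=16&rsv_sug=0&rsv_sug4=302&rsv_sug1=11&inputt=22928

Here wd is %E4%B8%AD%E5%9B%BD, the UTF-8 based form.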
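Both forms can be reproduced from one script by converting the string before encoding it. This sketch assumes the iconv extension and writes the UTF-8 input as byte escapes.

```php
<?php
$utf8 = "\xE4\xB8\xAD\xE5\x9B\xBD";     // "中国" as UTF-8 bytes
$gb   = iconv('UTF-8', 'GB2312', $utf8); // the same text as GB2312 bytes

$encodedUtf8 = urlencode($utf8);
$encodedGb   = urlencode($gb);

echo $encodedUtf8, "\n"; // %E4%B8%AD%E5%9B%BD
echo $encodedGb, "\n";   // %D6%D0%B9%FA
```

urlencode() itself knows nothing about character sets; it percent-encodes whatever bytes it is given, which is why the two results differ.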


urlencode and rawurlencode: urlencode encodes a space as a plus sign "+", while rawurlencode encodes a space as "%20".
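A quick comparison on a string that contains both a space and a literal plus sign (the sample string is illustrative):

```php
<?php
$s = 'a b+c';

$form = urlencode($s);    // space becomes "+", literal "+" becomes %2B
$raw  = rawurlencode($s); // space becomes %20, literal "+" becomes %2B

echo $form, "\n"; // a+b%2Bc
echo $raw, "\n";  // a%20b%2Bc
```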

URL decoding: urldecode and rawurldecode
1. For decoding, use the corresponding urldecode() or rawurldecode(). rawurldecode() does not decode a plus sign ('+') into a space, whereas urldecode() does.
2. urldecode() and rawurldecode() return the decoded bytes unchanged. If the URL contains UTF-8 encoded Chinese, the decoded string is UTF-8 as well and has to be converted when the page uses a different encoding.
For example, first save the PHP file in GB2312 encoding. When the page is displayed as UTF-8, the first line of output is garbled and the second is normal.
$url = '中国';
echo $a = urldecode(urlencode($url)), '<br>';
echo iconv('gb2312', 'utf-8', $a);
// Output: garbled characters on the first line, 中国 on the second
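The two decoding points can be sketched without depending on the file's encoding by starting from percent-encoded input. The sample strings are illustrative, and the iconv extension is assumed.

```php
<?php
$plusDecoded = urldecode('a+b%20c');    // "+" and %20 both become spaces
$plusKept    = rawurldecode('a+b%20c'); // only %20 becomes a space

echo $plusDecoded, "\n"; // a b c
echo $plusKept, "\n";    // a+b c

// Decoding returns the original bytes; convert afterwards if needed.
$gb   = urldecode('%D6%D0%B9%FA');      // GB2312 bytes for "中国"
$utf8 = iconv('GB2312', 'UTF-8', $gb);  // now usable on a UTF-8 page

echo bin2hex($utf8), "\n"; // e4b8ade59bbd
```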
