PHP Regular Expressions for Matching Chinese Characters

Last Update:2017-01-13 Source: Internet

Author: User

Matching Chinese in a UTF-8 environment

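\w matches word characters in general (letters, digits, and, depending on the matching mode, Chinese characters as well), so it cannot pick out Chinese text on its own. Matching only Chinese characters is a common need; the patterns for it are given below.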

Regular expression for matching Chinese characters: [\u4e00-\u9fa5] (in PHP's PCRE syntax this is written [\x{4e00}-\x{9fa5}] and used with the u modifier)

You may also need to match double-byte characters in general; in encodings such as GB2312, Chinese characters are double-byte characters too.

Match Double-byte characters (including Chinese characters): [^\x00-\xff]

Note: this pattern can be used to calculate the length of a string, counting each double-byte character as 2 and each ASCII character as 1.

Matching Chinese in an ANSI (GB2312) environment

Match characters from the GB2312 code table: "/[" . chr(0xb0) . "-" . chr(0xf7) . "]+/"
Match only Chinese characters, excluding full-width punctuation: "/([" . chr(0xb0) . "-" . chr(0xf7) . "][" . chr(0xa1) . "-" . chr(0xfe) . "])/"

This expression matches a single Chinese character.

Match full-width punctuation, excluding Chinese characters: "/([" . chr(0xa1) . "-" . chr(0xa3) . "][" . chr(0xa1) . "-" . chr(0xff) . "])/"
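A small sketch of how the byte-range patterns for GB2312 might be applied. The subject string is written with byte escapes so that the encoding of the script file does not matter; the variable names and the sample bytes are illustrative assumptions, not part of the original article. No u modifier is used, because the subject is GB2312 and not UTF-8.

```php
<?php
// Sample GB2312 bytes (an assumption for illustration):
// "abc", then two Chinese characters (d6d0, cec4), then a full-width comma (a3ac).
$subject = "abc" . "\xd6\xd0\xce\xc4" . "\xa3\xac";

// Lead byte 0xb0-0xf7 followed by trail byte 0xa1-0xfe: one Chinese character.
$hanzi = "/([" . chr(0xb0) . "-" . chr(0xf7) . "][" . chr(0xa1) . "-" . chr(0xfe) . "])/";

// Lead byte 0xa1-0xa3 followed by trail byte 0xa1-0xff: full-width punctuation.
$punct = "/([" . chr(0xa1) . "-" . chr(0xa3) . "][" . chr(0xa1) . "-" . chr(0xff) . "])/";

$hanziCount = preg_match_all($hanzi, $subject, $hanziMatches);
$punctCount = preg_match_all($punct, $subject, $punctMatches);

// Print counts and the matched bytes in hex, since a UTF-8 terminal
// would not display raw GB2312 bytes correctly.
echo $hanziCount, ' ', bin2hex(implode('', $hanziMatches[0])), "\n"; // 2 d6d0cec4
echo $punctCount, ' ', bin2hex(implode('', $punctMatches[0])), "\n"; // 1 a3ac
```

Because these patterns work on raw bytes and are not anchored, a match can in principle begin on the second byte of a two-byte character for some inputs, so results on arbitrary text should be checked with care.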

Example

The code is as follows

1. Using Preg_match function to match Chinese characters

<?php
$str = ' asd US CD ';
$key = ' #[\x{4e00}-\x{9fa5}] #u ';
Preg_match ($key, $str, $res);
Print_r ($res);
?>
Results:
Array ([0]=> me)
2. Use Preg_match function to match Chinese characters (more than 1 consecutive)

<?php
$str = ' 34353434 US CD ';
$key = ' #[\x{4e00}-\x{9fa5}]{1,} #u ';
Preg_match ($key, $str, $res);
Print_r ($res);
?>
Results
Array ([0]=> us)
3, improve 1, using Preg_match_all function matching

<?php
$str = ' 34353434 US CD ';
$key = ' #[\x{4e00}-\x{9fa5}] #u ';
Preg_match_all ($key, $str, $res);
Print_r ($res);
?>
Results
Array ([0]=>array ([0]=> i [1]=>))
4, improve 2, using the Preg_match_all function to match Chinese characters (more than 1 consecutive)

<?php
$str = ' 34353434 US CD ';
$key = ' #[\x{4e00}-\x{9fa5}]{1,} #u ';
Preg_match_all ($key, $str, $res);
Print_r ($res);

?>
Results
Array ([0]=>array ([0]=> US))

As the results show, the regular expression [\x{4e00}-\x{9fa5}], used with the u modifier, matches Chinese characters.
The difference between preg_match and preg_match_all is that the former stops after the first match (or as soon as it is clear that nothing matches), while the latter keeps searching from the beginning to the end of the subject string and returns every match.

This article is an English version of an article originally published in Chinese on aliyun.com.
