Matching Chinese Characters with PHP Regular Expressions

Source: Internet
Author: User

Matching Chinese in a UTF-8 environment

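\w matches letters, digits, and the underscore, and under the u modifier it generally matches Chinese characters as well, so it cannot pick out Chinese text on its own. Matching only Chinese is a common need; see the pattern below.

Regular expression for Chinese characters: [\u4e00-\u9fa5] (PCRE, which PHP uses, does not accept \u escapes, so in PHP write [\x{4e00}-\x{9fa5}] with the u modifier)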

You may also need to match double-byte characters in general; in GB2312 and similar encodings, Chinese characters are double-byte characters too.

Matching double-byte characters (including Chinese characters): [^\x00-\xff]

Note: this can be used to calculate the length of a string, counting each double-byte character as 2 and each ASCII character as 1.


In an ANSI (GB2312) environment

Match all characters in the GB2312 encoding table: "/[" . chr(0xb0) . "-" . chr(0xf7) . "]+/"
Match only Chinese characters, excluding full-width punctuation: "/([" . chr(0xb0) . "-" . chr(0xf7) . "][" . chr(0xa1) . "-" . chr(0xfe) . "])/"

This expression matches one Chinese character.

Match full-width punctuation, excluding Chinese characters: "/([" . chr(0xa1) . "-" . chr(0xa3) . "][" . chr(0xa1) . "-" . chr(0xff) . "])/"

Example


1. Using Preg_match function to match Chinese characters

<?php
$str = ' asd US CD ';
$key = ' #[\x{4e00}-\x{9fa5}] #u ';
Preg_match ($key, $str, $res);
Print_r ($res);
?>
Results:
Array ([0]=> me)
2. Use Preg_match function to match Chinese characters (more than 1 consecutive)

<?php
$str = ' 34353434 US CD ';
$key = ' #[\x{4e00}-\x{9fa5}]{1,} #u ';
Preg_match ($key, $str, $res);
Print_r ($res);
?>
Results
Array ([0]=> us)
3, improve 1, using Preg_match_all function matching

<?php
$str = ' 34353434 US CD ';
$key = ' #[\x{4e00}-\x{9fa5}] #u ';
Preg_match_all ($key, $str, $res);
Print_r ($res);
?>
Results
Array ([0]=>array ([0]=> i [1]=>))
4, improve 2, using the Preg_match_all function to match Chinese characters (more than 1 consecutive)

<?php
$str = ' 34353434 US CD ';
$key = ' #[\x{4e00}-\x{9fa5}]{1,} #u ';
Preg_match_all ($key, $str, $res);
Print_r ($res);

?>
Results
Array ([0]=>array ([0]=> US))

As the results show, the regular expression [\x{4e00}-\x{9fa5}], used with the u modifier, matches Chinese characters.
The difference between preg_match and preg_match_all is that the former stops after the first match (or as soon as it finds that nothing matches), while the latter keeps searching to the end of the string and collects every match.
