How to match Chinese characters with regular expressions under UTF-8 encoding in PHP


In JavaScript, it is easy to test whether a string consists entirely of Chinese characters. For example:

var str = "PHP programming";
if (/^[\u4e00-\u9fa5]+$/.test(str)) {
    alert("The string is all Chinese");
} else {
    alert("This string is not all Chinese");
}

It seems natural to carry the same idea over to PHP when checking whether a string is all Chinese:

<?php
$str = "PHP programming";
// First attempt: the JavaScript pattern copied verbatim.
if (preg_match("/^[\u4e00-\u9fa5]+$/", $str)) {
    print("The string is all Chinese");
} else {
    print("This string is not all Chinese");
}
?>

However, it soon turns out that PHP does not support this expression and reports an error:
Warning: preg_match() [function.preg-match]: Compilation failed: PCRE does not support \L, \l, \N, \U, or \u at offset 3 in test.php on line 3
At first I searched Google repeatedly, hoping to find out how PHP regular expressions write hexadecimal values, and found that PHP uses \x to represent hexadecimal data. So I changed the code to the following:
$str = "PHP programming";
// Second attempt: \u replaced by \x, still without braces.
if (preg_match("/^[\x4e00-\x9fa5]+$/", $str)) {
    print("The string is all Chinese");
} else {
    print("This string is not all Chinese");
}
This seemed to run without errors, and the result looked correct. But when $str was replaced with a two-character, all-Chinese string (the Chinese word for "programming"), the result was still "This string is not all Chinese", so the check was clearly not accurate.
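A likely explanation for this behaviour, sketched below: in a double-quoted PHP string, \x consumes at most two hexadecimal digits, so "\x4e00" is the letter "N" followed by "00", not the code point U+4E00. The sample strings are hypothetical and are not from the original article; the "\u{...}" escapes assume PHP 7.0 or later.

```php
<?php
// In a double-quoted PHP string, \x takes at most two hex digits,
// so "\x4e00" is the letter "N" followed by the text "00".
var_dump("\x4e00" === "N00");

// The class therefore holds ordinary bytes: N, 0, the range 0x30-0x9F, a and 5.
$pattern = "/^[\x4e00-\x9fa5]+$/";

// Hypothetical sample: two Chinese characters, encoded as UTF-8 (PHP 7.0+ escape).
$chinese = "\u{4e2d}\u{6587}";

$asciiResult   = preg_match($pattern, "PHP");    // ASCII letters fall inside 0x30-0x9F
$chineseResult = preg_match($pattern, $chinese); // lead byte 0xE4 is outside the class
var_dump($asciiResult, $chineseResult);
```

Under these assumptions the pattern accepts plain ASCII letters and rejects real Chinese text, which is the opposite of what was intended.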
Later I went back to Baidu and searched for "PHP matching Chinese characters UTF-8". The results were far more relevant than Google's, so the claim that Baidu "understands Chinese better" seems true to some extent. The second result, a thread titled "★★★ Looking for a way to match Chinese characters under UTF8, waiting online ...", contained the following:
Original poster Zhiin (┈jcan┈), 2006-11-15 15:59:30, in Web Development / PHP Questions
Looking for a way to match Chinese characters under UTF8, excluding full-width characters and special symbols!
All I can find online is a regular expression that matches full-width characters: ^[\x80-\xff]*^/
[\u4e00-\u9fa5] can match Chinese, but PHP does not support it.
So frustrating ....
1/F pleasedotellmewhy (Allah bless you!), replied 2006-11-15 16:04:55, score 11
chr(0xa1) . '-' . chr(0xff) can match all Chinese characters, but I don't know how to do it under UTF-8! Bump.
2/F Zhiin (┈jcan┈), replied 2006-11-15 16:11:34, score 0
Even under GB2312, chr(0xa1) . '-' . chr(0xff) is not right.
It also matches the full-width symbols.
3/F xuzuning (NAG), replied 2006-11-15 16:19:56, score 90
Pattern modifier: u
After trying each of these clues, I found that the problem was, as they suggested, probably related to the encoding, and that I needed to learn about pattern modifiers, so I kept searching Baidu.
An article on pattern modifiers says:
u (PCRE_UTF8)
This modifier turns on additional PCRE functionality that is incompatible with Perl. The pattern string is treated as UTF-8. This modifier is available from PHP 4.1.0 on Unix and from PHP 4.2.3 on Win32.
Example:
preg_match('/[\x{2460}-\x{2468}]/u', $str); // matches Chinese characters by internal code
Following the approach given there, I wrote the code as follows:

$str = "PHP programming";
// Third attempt: the sample range from the modifier article, with the u modifier.
if (preg_match("/^[\x{2460}-\x{2468}]+$/u", $str)) {
    print("The string is all Chinese");
} else {
    print("This string is not all Chinese");
}
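To make the effect of the u modifier concrete, and to show what the sample range in this attempt covers, here is a minimal sketch. The sample strings are hypothetical and use PHP 7.0+ "\u{...}" escapes; U+2460 to U+2468 are the circled digits ① to ⑨ in Unicode, not Chinese characters.

```php
<?php
// Hypothetical samples written with PHP 7.0+ "\u{...}" escapes (UTF-8 output).
$chinese = "\u{4e2d}\u{6587}";   // two Chinese characters, six bytes in UTF-8
$circled = "\u{2460}";           // U+2460 CIRCLED DIGIT ONE

// Without u, "." matches one byte; with u, it matches one code point.
$bytes  = preg_match_all('/./', $chinese);
$points = preg_match_all('/./u', $chinese);
var_dump($bytes, $points);

// The sample range U+2460-U+2468 covers the circled digits 1 to 9.
$range = '/^[\x{2460}-\x{2468}]+$/u';
$circledResult = preg_match($range, $circled);
$chineseResult = preg_match($range, $chinese);
var_dump($circledResult, $chineseResult);
```

With these inputs the range accepts the circled digit and rejects the Chinese sample, so the sample range cannot be used as a test for Chinese text.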

The check for Chinese still gave the wrong result. But since \x represents hexadecimal data, why was the range different from the 4e00-9fa5 range used in the JavaScript version? So I switched to the following code:

$str = "PHP programming";
// Fourth attempt: the 4e00-9fa5 range with the u modifier, but no braces.
if (preg_match("/^[\x4e00-\x9fa5]+$/u", $str)) {
    print("The string is all Chinese");
} else {
    print("This string is not all Chinese");
}

I expected this to succeed for sure, but unexpectedly another warning appeared:
Warning: preg_match() [function.preg-match]: Compilation failed: invalid UTF-8 string at offset 6 in test.php on line 3
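One plausible reading of "offset 6", sketched under the assumption that the pattern is written in double quotes as in the attempt above: PHP expands "\x9f" into a single raw byte before PCRE sees the pattern, and a lone 0x9F byte is not valid UTF-8. The variable names here are illustrative only.

```php
<?php
// Double quotes: PHP itself expands \x4e and \x9f before PCRE sees the pattern.
$pattern = "/^[\x4e00-\x9fa5]+$/u";

// Strip the leading delimiter and the trailing "/u" to get the pattern body.
$body = substr($pattern, 1, -2);

// Hex dump of the bytes PCRE receives, and the position of the stray byte.
$hex    = bin2hex($body);
$offset = strpos($body, "\x9f");
var_dump($hex, $offset);

// Under the u modifier the pattern must be valid UTF-8, so compilation fails
// and preg_match() returns false (the warning is suppressed here with @).
$result = @preg_match($pattern, "abc");
var_dump($result);
```

The stray byte sits at position 6 of the pattern body, which matches the offset in the warning. The exact wording of the warning differs between PCRE versions.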
It seemed the expression was still written incorrectly. Comparing it with the expression in that article, I wrapped "4e00" and "9fa5" in "{" and "}", ran it again, and found that it finally judged correctly:

$str = "PHP programming";
// Final version: braces around the code points, plus the u modifier.
if (preg_match("/^[\x{4e00}-\x{9fa5}]+$/u", $str)) {
    print("The string is all Chinese");
} else {
    print("This string is not all Chinese");
}

So the correct expression for matching Chinese characters with a regular expression in PHP under UTF-8 encoding is /^[\x{4e00}-\x{9fa5}]+$/u. I then searched Baidu for this expression and found that someone else had indeed reached the same conclusion. It was just hard to find by ordinary searching, and I found only one such article, "Deleting Chinese characters with a regular expression". It seems that surfacing correct information on the internet still badly needs improvement.
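As a minimal sketch, the final expression can be wrapped in a small helper. The function name isAllChinese is hypothetical and not from the original article; the type declarations assume PHP 7.0 or later, and the sample strings use "\u{...}" escapes.

```php
<?php
/**
 * Hypothetical helper: returns true when $text is non-empty and every
 * character lies in U+4E00-U+9FA5, the range used in the article.
 */
function isAllChinese(string $text): bool
{
    // Single quotes pass \x{...} to PCRE untouched; u enables UTF-8 mode.
    return preg_match('/^[\x{4e00}-\x{9fa5}]+$/u', $text) === 1;
}

var_dump(isAllChinese("\u{4e2d}\u{6587}"));     // two Chinese characters
var_dump(isAllChinese("PHP \u{4e2d}\u{6587}")); // mixed ASCII and Chinese
var_dump(isAllChinese("\u{ff01}"));             // full-width exclamation mark
var_dump(isAllChinese(""));                     // empty string
```

The range U+4E00 to U+9FA5 covers the common CJK ideographs but not every Han character. With the u modifier, the Unicode script property \p{Han} may be a broader alternative, depending on which characters need to be accepted.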
PS: Not giving up on Google, I searched a while longer and found an article, "Commonly used PHP classes", which again turned out to be hosted on Baidu Space. Interesting!

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
