Regular expressions for matching Chinese in PHP (reprint)


Matching Chinese with PHP regular expressions (2011-09-26 10:10:46)

Reprint: http://hi.baidu.com/?_d/blog/item/063b77d5432f8f1aa18bb7fd.html

In JavaScript, it's easy to tell if a string is Chinese. Like what:

var str = "PHP programming";

if (/^[\u4e00-\u9fa5]+$/.test (str)) {

Alert ("The string is all Chinese");

} else {

Alert ("The string is not all Chinese");

}

Take it for granted, in PHP to determine whether the string is Chinese, will follow this idea:

<?php

$STR = "PHP programming";

if (Preg_match ("/^[\u4e00-\u9fa5]+$/", $str)) {

Print ("The string is all Chinese");

} else {

Print ("The string is not all Chinese");

}

However, it soon turns out that PHP does not support this syntax and reports an error:

Warning: preg_match() [function.preg-match]: Compilation failed: PCRE does not support \L, \l, \N, \U, or \u at offset 3 in test.php on line 3

Just started to look at Google a lot of times, want to from the PHP regular expression for the hexadecimal data

Breakthrough in expression, found in PHP, is to use \x to represent hexadecimal data. So

Transform it into the following code:

$STR = "PHP programming";

if (Preg_match ("/^[\x4e00-\x9fa5]+$/", $str)) {

Print ("The string is all Chinese");

} else {

Print ("The string is not all Chinese");

}

This seemed to raise no error, and the result was correct. But after replacing $str with just the two characters "编程" ("programming"), the result was still "The string is not all Chinese", so the check is clearly not accurate.
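Why this attempt misjudges can be checked directly. The following is a minimal sketch, assuming the script file is saved as UTF-8: in a double-quoted PHP string, \x consumes at most two hexadecimal digits, and without the u modifier PCRE compares single bytes. The variable name $pattern is illustrative.

```php
<?php
// Assumes this file is saved as UTF-8.

// PHP expands "\x4e" to the byte 0x4E ("N") before PCRE sees the pattern;
// the following "00" stays as two literal digits. Likewise "\x9f" becomes
// the single byte 0x9F, followed by the literal characters "a5".
$pattern = "/^[\x4e00-\x9fa5]+$/";
echo bin2hex($pattern), "\n"; // 2f5e5b4e30302d9f61355d2b242f

// The class is therefore: "N", "0", the byte range 0x30-0x9F, "a", "5".
var_dump(preg_match($pattern, "编程")); // int(0): the first byte 0xE7 is outside the range
var_dump(preg_match($pattern, "PHP"));  // int(1): ASCII letters fall inside the range
```

In other words, this pattern accepts plain ASCII text such as "PHP" and rejects real Chinese text, the opposite of what was intended.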

Then I went back to Baidu and searched for "php matching Chinese character utf 8", and found the results far more relevant than Google's. It seems Baidu's slogan, "Baidu understands Chinese better", is correct to some extent. The second result was a forum thread, titled roughly "Asking for a regular expression that matches Chinese characters under UTF-8, online ...":

Original poster Zhiin (┈jcan┈), 2006-11-15 15:59:30, in Web Development / PHP Questions:

    Looking for a regular expression that matches Chinese characters under UTF-8, not including full-width characters and special symbols!
    All I can find online is a regular expression that matches full-width characters: ^[\x80-\xff]*^/
    [\u4e00-\u9fa5] can match Chinese, but PHP does not support it.
    So frustrating.....

Reply 1, pleasedotellmewhy (Allah bless you!), 2006-11-15 16:04:55, score 11:

    chr(0xa1) . '-' . chr(0xff) can match all Chinese characters, but I don't know how to do it under UTF-8! Bump.

Reply 2, Zhiin (┈jcan┈), 2006-11-15 16:11:34, score 0:

    Even under GB2312, chr(0xa1) . '-' . chr(0xff) is wrong too.
    It also matches full-width symbols. Bump.

Reply 3, xuzuning (nagging), 2006-11-15 16:19:56, score 90:

    Pattern modifier: u
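The objection in reply 2 can be reproduced with raw GB2312 bytes. The following is a minimal sketch, assuming GB2312 encodes 中 as the bytes D6 D0 and the full-width comma ， as A3 AC; the variable names are illustrative.

```php
<?php
// Byte class 0xA1-0xFF, built as in reply 1. There is no "u" modifier,
// so PCRE compares the subject byte by byte.
$gbPattern = "/^[" . chr(0xa1) . "-" . chr(0xff) . "]+$/";

$zhong = "\xD6\xD0"; // 中 in GB2312 (assumed byte values)
$comma = "\xA3\xAC"; // full-width comma in GB2312 (assumed byte values)

var_dump(preg_match($gbPattern, $zhong)); // int(1)
var_dump(preg_match($gbPattern, $comma)); // int(1): the symbol is accepted as well
```

Both bytes of the full-width comma fall inside 0xA1-0xFF, so the byte range cannot tell Chinese characters from full-width symbols.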

Following these clues one by one, it did seem that, as they said, the problem might also be related to encoding, so I needed to learn about pattern modifiers and kept searching on Baidu. From an article on pattern modifiers, I learned:

    u (PCRE_UTF8)
    This modifier enables additional PCRE features that are incompatible with Perl. The pattern string is treated as UTF-8. This modifier is available from PHP 4.1.0 on Unix and from PHP 4.2.3 on Win32.

    Example:
    preg_match('/[\x{2460}-\x{2468}]/u', $str); // matches Chinese characters by their internal code
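What the u modifier changes can be seen by counting what "." matches. The following is a minimal sketch, assuming the script file is saved as UTF-8; the variable names are illustrative.

```php
<?php
// Assumes this file is saved as UTF-8.
$str = "编程"; // two characters, six bytes in UTF-8

// Without "u", "." matches one byte at a time.
$byteCount = preg_match_all('/./', $str, $m);
// With "u", "." matches one UTF-8 character at a time.
$charCount = preg_match_all('/./u', $str, $m);

echo $byteCount, "\n"; // 6
echo $charCount, "\n"; // 2
```

Only in the second mode can a character class range such as \x{4e00}-\x{9fa5} refer to whole characters rather than single bytes.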

Test it in the way he provides it, with the following code:

$STR = "PHP programming";

if (Preg_match ("/^[\x{2460}-\x{2468}]+$/u", $str)) {

Print ("The string is all Chinese");

} else {

Print ("The string is not all Chinese");

}

Found that this is still a judgment on whether the Chinese is abnormal. However, since \x represents the hexadecimal data,

Why and JS inside provide scope \x4e00-\x9fa5 not the same? So I replaced the code below:

$STR = "PHP programming";

if (Preg_match ("/^[\x4e00-\x9fa5]+$/u", $str)) {

Print ("The string is all Chinese");

} else {

Print ("The string is not all Chinese");

}

What should have worked unexpectedly produced another warning:

Warning: preg_match() [function.preg-match]: Compilation failed: invalid UTF-8 string at offset 6 in test.php on line 3
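The offset in this warning can be traced by hand. The following is a minimal sketch, assuming the offset counts bytes of the pattern body between the two "/" delimiters; $body is an illustrative name.

```php
<?php
// The double-quoted pattern body from the failing attempt, without delimiters.
// PHP has already turned "\x4e" into "N" and "\x9f" into the raw byte 0x9F.
$body = "^[\x4e00-\x9fa5]+$";

// Bytes 0-5 are: ^ [ N 0 0 -   Byte 6 is the lone 0x9F.
printf("%02x\n", ord($body[6])); // 9f

// A lone 0x9F is a UTF-8 continuation byte with no lead byte, so the string
// is not valid UTF-8. With "u", preg_match() returns false for such a subject.
var_dump(preg_match('//u', $body)); // bool(false)
```

Once the u modifier is on, the pattern itself must be valid UTF-8, which is why the same pattern that ran silently without u now fails to compile.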

It seems that there is a wrong way of expression, so against the expression of the article,

To "4e00" and "9fa5" on both sides with "{" and "}" wrapped up, ran again, found that really accurate:

$STR = "PHP programming";

if (Preg_match ("/^[\x{4e00}-\x{9fa5}]+$/u", $str)) {

Print ("The string is all Chinese");

} else {

Print ("The string is not all Chinese");

}

So the final, correct regular expression for matching Chinese characters under UTF-8 encoding in PHP is /^[\x{4e00}-\x{9fa5}]+$/u. I then searched Baidu for this expression and found that others had indeed reached the same conclusion, but it is hard to find through ordinary searches; only one article turned up, "Using regular expressions to delete Chinese characters". It seems the internet's filtering for correct information still urgently needs improvement.

PS: Not giving up on Google, I searched there too and found one article, "PHP Common Class", also hosted on Baidu Space. Heh, interesting!

--------------------------------------------------------------------------------

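Based on the article above, I wrote the following test code (copy it and save it as a .php file):

    <?php
    // Read the action from the query string; it is empty until the form is submitted.
    $action = isset($_GET['action']) ? trim($_GET['action']) : '';
    if ($action == "sub") {
        $str = isset($_POST['dir']) ? $_POST['dir'] : '';
        // GB2312: Chinese characters, letters, digits, underscore. Use this line instead on a GB2312 page.
        // if (!preg_match("/^[" . chr(0xa1) . "-" . chr(0xff) . "a-zA-Z0-9_]+$/", $str))
        // UTF-8: Chinese characters, letters, digits, underscore.
        if (!preg_match("/^[\x{4e00}-\x{9fa5}a-zA-Z0-9_]+$/u", $str)) {
            echo "<font color=red>Your input [" . htmlspecialchars($str) . "] contains illegal characters</font>";
        } else {
            echo "<font color=green>Your input [" . htmlspecialchars($str) . "] is valid, passed!</font>";
        }
    }
    ?>
    <!-- The form tag and fields are assumed from the PHP above, which reads "action" from the query string and "dir" from the POST data. -->
    <form method="post" action="?action=sub">
        Input characters (digits, letters, Chinese characters, underscores):
        <input type="text" name="dir">
        <input type="submit" value="Submit">
    </form>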
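The same check can be pulled out of the page and exercised on its own. The following is a minimal sketch, assuming UTF-8 input; is_valid_name() is a hypothetical helper, and the D modifier is an addition that the original code does not use.

```php
<?php
// Assumes this file is saved as UTF-8. is_valid_name() is a hypothetical helper.
function is_valid_name($str)
{
    // Same character class as the UTF-8 check above. The "D" modifier
    // (PCRE_DOLLAR_ENDONLY) is an addition: without it, "$" also matches
    // just before a trailing newline, so "abc\n" would be accepted.
    return preg_match('/^[\x{4e00}-\x{9fa5}a-zA-Z0-9_]+$/uD', $str) === 1;
}

var_dump(is_valid_name("php_编程2011")); // bool(true)
var_dump(is_valid_name("编程!"));        // bool(false)
var_dump(is_valid_name("abc\n"));        // bool(false)
```

Keeping the rule in one function makes it easy to reuse the same pattern wherever the input is validated.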


