Matching Chinese characters with regular expressions in PHP. Writing a regular expression for Chinese characters in PHP may look simple, but in practice the pattern differs depending on whether the text is encoded as GBK or UTF-8. The following is a brief introduction to both cases.
Regular expressions for Chinese characters in GBK encoding
1. Determine whether a string consists entirely of Chinese characters
The code is as follows:
<?php
// Save this file as GBK so that the string literal is GBK-encoded.
$str = '全是汉字测试'; // a test string made up only of Chinese characters
// Each GBK Chinese character is a lead byte 0x81-0xFE followed by a trail byte 0x40-0xFE.
if (preg_match('/^([\x81-\xfe][\x40-\xfe])+$/', $str)) {
    echo 'all Chinese characters';
} else {
    echo 'not all Chinese characters';
}
?>
When $str = '全是汉字测试', the output is "all Chinese characters";
When $str = '全a是汉字测试', the output is "not all Chinese characters";
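The check above can also be wrapped in a small function. The following is a minimal sketch, not the original author's code: the function name isAllGbkChinese is hypothetical, and the sample strings are written as raw GBK bytes (assuming D6 D0 CE C4 for "中文") so that the result does not depend on how the source file is saved.

```php
<?php
// Hypothetical helper; the pattern is the one used in this section.
function isAllGbkChinese(string $bytes): bool
{
    // One or more (lead byte 0x81-0xFE, trail byte 0x40-0xFE) pairs,
    // anchored to the start and end of the string.
    return preg_match('/^(?:[\x81-\xfe][\x40-\xfe])+$/', $bytes) === 1;
}

// "中文" as raw GBK bytes (assumed sample data).
$allChinese = "\xD6\xD0\xCE\xC4";
// The same two characters followed by an ASCII digit.
$mixed = "\xD6\xD0\xCE\xC4" . "3";

var_dump(isAllGbkChinese($allChinese)); // bool(true)
var_dump(isAllGbkChinese($mixed));      // bool(false)
```

Comparing the return value with 1 also treats a preg_match() error (which returns false) as "not all Chinese".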
2. Determine whether a string contains any Chinese characters
The code is as follows:
<?php
// Save this file as GBK so that the string literal is GBK-encoded.
$str = '汉字3测试'; // Chinese characters with an ASCII digit in the middle
// No anchors: one lead/trail byte pair anywhere in the string is enough.
if (preg_match('/[\x81-\xfe][\x40-\xfe]/', $str)) {
    echo 'contains Chinese characters';
} else {
    echo 'contains no Chinese characters';
}
?>
When $str = '汉字3测试', the output is "contains Chinese characters";
When $str = 'abc345', the output is "contains no Chinese characters";
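Both patterns above compare raw bytes rather than decoded characters. The "contains" check therefore gives the same result whether $str is encoded as UTF-8 or GBK, because the first two bytes of a UTF-8 Chinese character also fall within these ranges. The "all" check is only dependable for GBK, since a UTF-8 Chinese character occupies three bytes rather than two.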
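To see how the two byte-level patterns treat each encoding, the sketch below runs them on the same character written as raw GBK bytes and as raw UTF-8 bytes. The function names are hypothetical and the byte sequences are assumed sample data (D6 D0 for "中" in GBK, E4 B8 AD for "中" and E6 96 87 for "文" in UTF-8). It suggests that the "contains" check agrees across encodings, while the "all" check on UTF-8 input only succeeds when the total byte count happens to be even.

```php
<?php
// Hypothetical helpers built from the two patterns in this section.
function containsGbkStylePair(string $bytes): bool
{
    return preg_match('/[\x81-\xfe][\x40-\xfe]/', $bytes) === 1;
}

function isAllGbkStylePairs(string $bytes): bool
{
    return preg_match('/^(?:[\x81-\xfe][\x40-\xfe])+$/', $bytes) === 1;
}

$gbkOne  = "\xD6\xD0";                 // 中 in GBK (2 bytes)
$utf8One = "\xE4\xB8\xAD";             // 中 in UTF-8 (3 bytes)
$utf8Two = "\xE4\xB8\xAD\xE6\x96\x87"; // 中文 in UTF-8 (6 bytes)

var_dump(containsGbkStylePair($gbkOne));  // bool(true)
var_dump(containsGbkStylePair($utf8One)); // bool(true)
var_dump(isAllGbkStylePairs($gbkOne));    // bool(true)
var_dump(isAllGbkStylePairs($utf8One));   // bool(false): 3 bytes cannot split into pairs
var_dump(isAllGbkStylePairs($utf8Two));   // bool(true): 6 bytes happen to pair up
```

For UTF-8 text, a pattern that works on code points (with the /u modifier) is the safer choice.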
Matching Chinese characters with regular expressions in UTF-8 encoding
The code is as follows:
$ Str = "php programming ";
If (preg_match ("/^ [x {4e00}-x {9fa5}] + $/u", $ str )){
Print ("all strings are Chinese ");
} Else {
Print ("Not all strings are Chinese ");
}