Determining Whether Characters Are Chinese or English in PHP

Source: Internet
Author: User

Encoding table

Double-byte character encoding ranges

1. GBK (GB2312/GB18030)
\x00-\xff GBK double-byte encoding range
\x20-\x7f ASCII
\xa1-\xff Chinese (GB2312)
\x80-\xff Chinese (GBK)

2. UTF-8 (Unicode)

\u4e00-\u9fa5 (Chinese)
\u3130-\u318f (Korean)
\uac00-\ud7a3 (Korean)
\u0800-\u4e00 (Japanese; a broad span that also covers other scripts)
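The Chinese range listed above can be tested directly against UTF-8 input. The following is a minimal sketch, not a definitive implementation: the function name `isChineseUtf8` is hypothetical, and it assumes the input is UTF-8 and that the source file itself is saved as UTF-8.

```php
<?php
// Hypothetical helper: returns true when $str is non-empty and every
// character falls in the code point range U+4E00..U+9FA5.
function isChineseUtf8(string $str): bool
{
    // The /u modifier makes PCRE treat the pattern and the subject as UTF-8,
    // so \x{4e00} denotes a code point rather than a single byte.
    // preg_match() returns false for invalid UTF-8, which also yields false here.
    return preg_match('/\A[\x{4e00}-\x{9fa5}]+\z/u', $str) === 1;
}

var_dump(isChineseUtf8("中国"));   // bool(true)
var_dump(isChineseUtf8("China"));  // bool(false)
var_dump(isChineseUtf8("中国A"));  // bool(false)
```

Because the check works on code points, it does not apply to GBK-encoded input; such input would first have to be converted to UTF-8.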

<?php
$str = "中国"; // "China"
echo $str;
echo "<hr>";
// Works only when the input is GB2312-encoded:
// if (preg_match("/^[" . chr(0xa1) . "-" . chr(0xff) . "]+$/", $str)) {
// Byte-range check that works for both GB2312 and UTF-8 input:
if (preg_match("/^[\x7f-\xff]+$/", $str)) {
    echo "correct input";
} else {
    echo "incorrect input";
}
?>
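The byte-range test above only answers whether every byte is outside the ASCII range. To separate English from Chinese input, the same idea can be extended into a small classifier. This is a sketch under stated assumptions: `classifyBytes` and its return labels are hypothetical names, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: classify a string purely by its byte values.
function classifyBytes(string $str): string
{
    if ($str === '') {
        return 'empty';
    }
    // Printable ASCII only (space through tilde): treated as English text.
    if (preg_match('/\A[\x20-\x7e]+\z/', $str) === 1) {
        return 'ascii';
    }
    // Every byte has the high bit set: GB2312 or UTF-8 multi-byte text.
    if (preg_match('/\A[\x80-\xff]+\z/', $str) === 1) {
        return 'multibyte';
    }
    // Anything else mixes both kinds of bytes.
    return 'mixed';
}

echo classifyBytes("China"), "\n";      // ascii
echo classifyBytes("中国"), "\n";        // multibyte
echo classifyBytes("中国 China"), "\n";  // mixed
```

One caveat: in the GBK extension area the second byte of a character can fall in the ASCII range, so some GBK-only characters would be reported as mixed. The check cannot tell Chinese from other non-ASCII text either; it only separates single-byte from multi-byte input.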

Determining whether text is Chinese involves more than it first appears. The underlying encoding may be UTF-8, GBK, or GB18030. I have looked into how to tell what kind of character a given byte sequence represents, and there are many interrelated details.
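One of those details is working out which encoding a byte string is in before testing its characters. The sketch below assumes the mbstring extension is loaded. `guessEncoding` is a hypothetical name, and the candidate list and its order are assumptions, not a rule.

```php
<?php
// Hypothetical helper: return the first candidate encoding in which
// $bytes is a valid byte sequence, or 'unknown' if none fits.
function guessEncoding(string $bytes): string
{
    // Order matters: pure ASCII is valid in all three encodings, and
    // UTF-8 is checked before GB18030 because its validation is stricter.
    foreach (['ASCII', 'UTF-8', 'GB18030'] as $encoding) {
        if (mb_check_encoding($bytes, $encoding)) {
            return $encoding;
        }
    }
    return 'unknown';
}

echo guessEncoding("China"), "\n";  // ASCII
echo guessEncoding("中国"), "\n";    // UTF-8 (when this file is saved as UTF-8)

// The same two characters as GB2312 bytes; these are not valid UTF-8.
echo guessEncoding("\xD6\xD0\xB9\xFA"), "\n";
```

This is a heuristic: short byte sequences can be valid in more than one encoding, so the result is a guess and not proof of the original encoding.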
