A long time ago I worked on this and looked up a lot of material. I have forgotten most of it by now, and I no longer know where my notes from that time went, so here is the summary again:
JavaScript uses:
[\u4e00-\u9fa5]
Java uses the same pattern:
[\u4e00-\u9fa5]
Someone on the internet says:
Most patterns found online for detecting Chinese characters use \u4e00-\u9fa5, but that range covers only the "CJK Unified Ideographs" block. It is not everything: for complete coverage you would also need the extension sets, radicals, pictographic characters, phonetic (Bopomofo) letters, and so on.
That matches a vague memory of something I looked up before, but I have forgotten the details. I will summarize it later ...
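To make the point of the quoted remark concrete, here is a small JavaScript sketch comparing the common range with Unicode script properties. It assumes an engine that supports Unicode property escapes (ES2018 or later), and the constant names are hypothetical. This is one possible way to widen the match, not a definitive list of everything that counts as "Chinese".

```javascript
// Sketch only: constant names are hypothetical, chosen for this example.
const BASIC = /[\u4e00-\u9fa5]/;        // the commonly quoted range
const HAN = /\p{Script=Han}/u;          // any Han-script character; requires the u flag
const HAN_OR_BOPOMOFO = /[\p{Script=Han}\p{Script=Bopomofo}]/u;

const samples = {
  basic: "\u4e2d",         // 中, inside the common range
  extensionA: "\u3400",    // CJK Unified Ideographs Extension A
  extensionB: "\u{20000}", // CJK Unified Ideographs Extension B
  radical: "\u2f00",       // Kangxi radical "one"
  bopomofo: "\u3105",      // Bopomofo letter B
};

// Print which pattern accepts which sample.
for (const [name, ch] of Object.entries(samples)) {
  console.log(name, BASIC.test(ch), HAN.test(ch), HAN_OR_BOPOMOFO.test(ch));
}
```

Only the first sample is accepted by the basic range. The Han script property also accepts the extension and radical samples, and the Bopomofo sample needs its own script property.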
The main point: we need to match Chinese characters in PHP.
PHP usage:
/[\x{4e00}-\x{9fa5}]/u    (note the pattern modifier 'u')
The manual explains "u" (lowercase) as follows:
u (PCRE_UTF8): This modifier turns on an additional feature that is incompatible with Perl. The pattern string is treated as UTF-8. The modifier is available from PHP 4.1.0 on Unix and from PHP 4.2.3 on Win32. Starting with PHP 4.3.5, the pattern is also checked for UTF-8 validity.
For Chinese matching under gb2312 encoding:
if (!preg_match("/^[" . chr(0xa1) . "-" . chr(0xff) . "a-zA-Z0-9_]+$/", $str)) {
    // $str contains a byte outside 0xA1-0xFF, letters, digits, and underscore
}
// Regular expression for GB2312 Chinese characters, letters, digits, and underscores.
// Not necessarily accurate: this is from memory of something I researched for a long time, and I suspect it covers only part of the picture ...