Regular expressions for Chinese characters. Anyone familiar with character sets, please come in.
How many cases does matching Chinese characters break down into?
How do you write the pattern for each case?
For example: ASCII encoding versus Unicode encoding.
How do you match GB2312, GBK, and Big5? Does it depend on which character set the server uses?
For matching in Unicode, the range usually given online is:
[\u4e00-\u9fa5]
But when I looked up the Unicode code table, I found that
Chinese characters already appear from 3220 onwards.
Also, does \x80-\xff match ASCII codes?
Please give me a few pointers,
or point me to some relevant material I can refer to.
Many thanks.
------Solution--------------------
2E80~33FFH: CJK (Chinese, Japanese, Korean) symbols area. Contains Kangxi radicals, CJK supplementary radicals, phonetic symbols (bopomofo), Japanese kana, Korean jamo, CJK symbols and punctuation, circled or parenthesized characters and numbers, months, and Japanese kana combinations, units, era names, months, dates, times, etc.
3400~4DFFH: CJK Unified Ideographs Extension A area, a total of 6,582 CJK characters.
4E00~9FFFH: CJK Unified Ideographs area, a total of 20,902 CJK characters (4E00 through 9FA5 account for exactly 20,902 code points).
A000~A4FFH: Yi script area, containing the Yi syllables and radicals used by the Yi people of southern China.
AC00~D7FFH: Hangul syllables area, containing the syllables composed from Korean jamo.
F900~FAFFH: CJK Compatibility Ideographs area, a total of 302 CJK characters.
FB00~FFFDH: Presentation forms area, containing Latin ligatures, Hebrew, Arabic, CJK vertical punctuation, small form symbols, half-width symbols, full-width symbols and so on.
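To make the list above easier to check against real text, here is a minimal Python sketch that looks up which area a character falls into. The table is copied from the ranges above; the names AREAS and area_of and the short labels are my own shorthand, not official Unicode block names.

```python
# Ranges copied from the list above. Labels are informal paraphrases,
# not official Unicode block names.
AREAS = [
    (0x2E80, 0x33FF, "CJK symbols, radicals, kana, punctuation"),
    (0x3400, 0x4DFF, "CJK ideographs extension A"),
    (0x4E00, 0x9FFF, "CJK unified ideographs"),
    (0xA000, 0xA4FF, "Yi script"),
    (0xAC00, 0xD7FF, "Hangul syllables"),
    (0xF900, 0xFAFF, "CJK compatibility ideographs"),
    (0xFB00, 0xFFFD, "presentation forms, half-width and full-width"),
]


def area_of(ch):
    """Return the label of the area containing the single character ch, or None."""
    cp = ord(ch)
    for low, high, label in AREAS:
        if low <= cp <= high:
            return label
    return None


if __name__ == "__main__":
    # One ideograph, one hiragana, one Hangul syllable, one ASCII letter.
    for ch in "中お한A":
        print(f"U+{ord(ch):04X} {ch} -> {area_of(ch)}")
```

Running it on a character copied from a page that fails to match shows quickly whether the character sits outside the range used in the pattern.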
For example, if you need to match all Chinese, Japanese, and Korean characters, the regular expression should be ^[\u3400-\u9fff]+$
In theory that is right, but I copied some Korean text from msn.co.ko and it did not match, which was odd.
Then I copied 'お尻' from msn.co.jp, and that did not pass either.
So I widened the range to ^[\u2e80-\u9fff]+$, and then everything passed. This should be the regular expression for matching Chinese, Japanese, and Korean characters, including the traditional Chinese characters still used in Taiwan.
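The difference between the two ranges can be reproduced with a short Python sketch. The names NARROW, WIDE, and passes, and the sample strings other than 'お尻', are my own choices for illustration.

```python
import re

# The two patterns discussed above, unchanged.
NARROW = re.compile(r"^[\u3400-\u9fff]+$")
WIDE = re.compile(r"^[\u2e80-\u9fff]+$")


def passes(pattern, text):
    """Return True if the whole string consists of characters in the range."""
    return pattern.match(text) is not None


if __name__ == "__main__":
    samples = {
        "kanji only": "日本語",
        "kana plus kanji": "お尻",
        "hangul syllables": "한국어",
    }
    for name, text in samples.items():
        print(name, text, "narrow:", passes(NARROW, text), "wide:", passes(WIDE, text))
```

The kana character お is at 304A, inside 2E80~33FF but below 3400, which is why only the wider pattern accepts 'お尻'. Note that precomposed Hangul syllables live in the AC00~D7FFH area listed earlier, so in this sketch a string such as '한국어' is rejected by both patterns. If Korean text has to pass, that area probably needs to be added to the character class as well.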
The regular expression for Chinese should be ^[\u4e00-\u9fff]+$, which is very close to the ^[\u4e00-\u9fa5]+$ that is often quoted on forums.
Note that forums describe ^[\u4e00-\u9fa5]+$ as being specifically for Simplified Chinese. In fact traditional characters fall inside that range too: I ran the traditional-character form of 'People's Republic of China' through a tester and it also passed. ^[\u4e00-\u9fff]+$ of course gives the same result.
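A small Python sketch of that comparison, assuming my own sample words (汉语 in simplified form, 漢語 in traditional form) and the made-up names FORUM and FULL for the two patterns:

```python
import re

# The pattern commonly quoted on forums, and the one covering the whole area.
FORUM = re.compile(r"^[\u4e00-\u9fa5]+$")
FULL = re.compile(r"^[\u4e00-\u9fff]+$")

simplified = "汉语"
traditional = "漢語"
beyond_9fa5 = chr(0x9FA6)       # first code point after the forum range
with_punctuation = "汉语。"      # the full stop is at 3002, outside both ranges

if __name__ == "__main__":
    for text in (simplified, traditional, beyond_9fa5, with_punctuation):
        print(
            [f"U+{ord(c):04X}" for c in text],
            "forum:", FORUM.match(text) is not None,
            "full:", FULL.match(text) is not None,
        )
```

Both patterns accept the simplified and the traditional word, so neither range separates the two scripts. They only disagree on code points from 9FA6 to 9FFF, and both reject Chinese punctuation, which sits in the 2E80~33FF area.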
------Solution--------------------
Use PHP's mb_ereg_match.
------Solution--------------------
U0000 ASCII.pdf
U0a00.pdf
U0a80.pdf
U0b00.pdf
U0b80.pdf
U0c00.pdf
U0c80.pdf
U0d00.pdf
U0d80.pdf
U0e00.pdf
U0e80.pdf
U0f00.pdf
U1a00.pdf
U1b00.pdf
U1d000.pdf
U1d00.pdf
U1d80.pdf
U1d100.pdf
U1d200.pdf
U1d300.pdf
U1d360.pdf
U1d400.pdf
U1dc0.pdf
U1e00.pdf
U1f00.pdf
U1ff80.pdf
U2a00 Extended Mathematical Symbols.pdf
U02b0.pdf
U2b00.pdf
U2c00.pdf
U2c60.pdf
U2c80.pdf
U2d00.pdf
U2d30.pdf
U2d80.pdf
U2e00.pdf
U2e80.pdf
U2f00.pdf
U2f800.pdf
U2ff0.pdf
U2ff80.pdf
U3ff80.pdf
U4dc0.pdf
U4e00 Chinese.pdf
U4ff80.pdf
U5ff80.pdf
U6ff80.pdf
U07c0.pdf
U7ff80.pdf
U8ff80.pdf
U9ff80.pdf
U10a00.pdf
U10a0.pdf
U10ff80.pdf
U13a0.pdf
U16a0.pdf
U19e0.pdf
U20a0.pdf
U20d0.pdf
U25a0.pdf
U27c0.pdf
U27f0.pdf
U30A0 Japanese katakana.pdf
U31a0.pdf
U31c0.pdf
U31f0.pdf
U0080 Latin symbols.pdf
U0100.pdf
U103a0.pdf
U0180.pdf
U0250.pdf
U0300.pdf
U0370.pdf
U0400.pdf
U0500.pdf
U0530.pdf
U0590.pdf
U0600.pdf
U0700.pdf
U0750.pdf
U0780.pdf
U0900.pdf
U0980.pdf
U1000.pdf
U1100.pdf
U1200.pdf
U1380.pdf
U1400.pdf
U1680.pdf
U1700.pdf
U1720.pdf
U1740.pdf
U1760.pdf
U1780.pdf
U1800.pdf
U1900.pdf
U1950.pdf
U1980.pdf
U2000.pdf
U2070.pdf
U2100.pdf
U2150.pdf
U2190 arrows.pdf
U2200 Mathematical Symbols.pdf
U2300.pdf
U2400.pdf
U2440.pdf
U2460 enclosed sequence numbers.pdf
U2500 box drawing.pdf
U2580 blocks.pdf
U2600.pdf
U2700.pdf
U2800.pdf
U2900.pdf
U2980.pdf
U3000 Chinese punctuation.pdf
U3040 Japanese hiragana.pdf
U3100 Chinese bopomofo (old pinyin).pdf
U3130 Korean jamo.pdf
U3190.pdf
U3200 enclosed numbers and symbols.pdf
U3300 units and time.pdf
U3400.pdf
U10000.pdf
U10080.pdf
U10100.pdf
U10140.pdf
U10300.pdf
U10330.pdf
U10380.pdf
U10400.pdf
U10450.pdf
U10480.pdf
U10800.pdf
U10900.pdf
U12000.pdf
U12400.pdf
U20000.pdf
U100000.pdf
Ua000.pdf
Ua490.pdf