Programs sometimes need a regular expression that matches Chinese characters, and [\u4e00-\u9fa5]+ is the usual choice. That expression cannot cope with the wide world of "Martian script", though; it does not even include full-width punctuation. Take player names in a game: ordinary players mostly stick to Chinese characters, the artsy ones add a few special characters, and the rest write in full Martian script. At that point a more powerful regular expression is needed.
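To make the limitation concrete, here is a minimal Python sketch of the basic pattern. The helper name is_han_only and the sample names are made up for illustration; only the character range comes from the text.

```python
import re

# Basic pattern from the text: CJK Unified Ideographs, U+4E00 to U+9FA5.
HAN_ONLY = re.compile(r"[\u4e00-\u9fa5]+")


def is_han_only(name: str) -> bool:
    """Return True when every character of name lies in U+4E00..U+9FA5."""
    return HAN_ONLY.fullmatch(name) is not None


# Hypothetical sample names, written as escapes so the code points are visible.
print(is_han_only("\u73a9\u5bb6"))        # two ordinary Chinese characters -> True
print(is_han_only("\u73a9\u5bb6\uff0c"))  # trailing full-width comma U+FF0C -> False
print(is_han_only("\u3105\u73a9\u5bb6"))  # leading Bopomofo letter U+3105 -> False
```

fullmatch is used rather than search so that a single out-of-range character rejects the whole name.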
In fact, most player names in games are drawn from the CJK Unified Ideographs plus some special characters, and [\u2e80-\ufe4f]+ covers nearly all of them. According to Unicode 5.0, the relevant blocks are:
1) Standard CJK ideographs
http://www.unicode.org/Public/UNIDATA/Unihan.html
2) Full-width ASCII, full-width Chinese and English punctuation, half-width Katakana, half-width Korean letters: FF00-FFEF
http://www.unicode.org/charts/PDF/UFF00.pdf
3) CJK Radicals Supplement: 2E80-2EFF
http://www.unicode.org/charts/PDF/U2E80.pdf
4) CJK Symbols and Punctuation: 3000-303F
http://www.unicode.org/charts/PDF/U3000.pdf
5) CJK Strokes: 31C0-31EF
http://www.unicode.org/charts/PDF/U31C0.pdf
6) Kangxi Radicals: 2F00-2FDF
http://www.unicode.org/charts/PDF/U2F00.pdf
7) Ideographic Description Characters (describe the structure of a Chinese character): 2FF0-2FFF
http://www.unicode.org/charts/PDF/U2FF0.pdf
8) Bopomofo (Zhuyin phonetic symbols): 3100-312F
http://www.unicode.org/charts/PDF/U3100.pdf
9) Bopomofo Extended (for Minnan and Hakka): 31A0-31BF
http://www.unicode.org/charts/PDF/U31A0.pdf
10) Japanese Hiragana: 3040-309F
http://www.unicode.org/charts/PDF/U3040.pdf
11) Japanese Katakana: 30A0-30FF
http://www.unicode.org/charts/PDF/U30A0.pdf
12) Japanese Katakana Phonetic Extensions: 31F0-31FF
http://www.unicode.org/charts/PDF/U31F0.pdf
13) Korean Hangul syllables: AC00-D7AF
http://www.unicode.org/charts/PDF/UAC00.pdf
14) Korean Hangul Jamo (letters): 1100-11FF
http://www.unicode.org/charts/PDF/U1100.pdf
15) Korean Hangul Compatibility Jamo: 3130-318F
http://www.unicode.org/charts/PDF/U3130.pdf
16) Tai Xuan Jing Symbols: 1D300-1D35F
http://www.unicode.org/charts/PDF/U1D300.pdf
17) Yijing Hexagram Symbols (the sixty-four hexagrams): 4DC0-4DFF
http://www.unicode.org/charts/PDF/U4DC0.pdf
18) Yi Syllables: A000-A48F
http://www.unicode.org/charts/PDF/UA000.pdf
19) Yi Radicals: A490-A4CF
http://www.unicode.org/charts/PDF/UA490.pdf
20) Braille Patterns: 2800-28FF
http://www.unicode.org/charts/PDF/U2800.pdf
21) Enclosed CJK Letters and Months: 3200-32FF
http://www.unicode.org/charts/PDF/U3200.pdf
22) CJK special symbols (combined date forms): 3300-33FF
http://www.unicode.org/charts/PDF/U3300.pdf
23) Dingbats (decorative symbols, not CJK-specific): 2700-27BF
http://www.unicode.org/charts/PDF/U2700.pdf
24) Miscellaneous Symbols (not CJK-specific): 2600-26FF
http://www.unicode.org/charts/PDF/U2600.pdf
25) Vertical punctuation forms: FE10-FE1F
http://www.unicode.org/charts/PDF/UFE10.pdf
26) CJK Compatibility Forms (vertical variants, underscores, commas): FE30-FE4F
http://www.unicode.org/charts/PDF/UFE30.pdf
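
As a rough check of how far [\u2e80-\ufe4f]+ actually reaches, the Python sketch below compares a sample of the blocks listed above against that range. The helper names and the EXTENDED pattern are assumptions made for illustration and are not part of the original text; only the ranges themselves come from the list.

```python
import re

# Wider pattern quoted in the text: everything from U+2E80 to U+FE4F.
WIDE = re.compile(r"[\u2e80-\ufe4f]+")

# Hypothetical extension (not from the original text): also accept the
# full-width and half-width forms, U+FF00 to U+FFEF.
EXTENDED = re.compile(r"[\u2e80-\ufe4f\uff00-\uffef]+")

# A sample of (start, end, label) rows copied from the list above.
BLOCKS = [
    (0x2E80, 0x2EFF, "CJK radicals supplement"),
    (0x3000, 0x303F, "CJK punctuation"),
    (0x3040, 0x309F, "Hiragana"),
    (0xAC00, 0xD7AF, "Hangul syllables"),
    (0xFE30, 0xFE4F, "CJK compatibility forms"),
    (0xFF00, 0xFFEF, "Full-width and half-width forms"),
    (0x1100, 0x11FF, "Hangul letters"),
    (0x2600, 0x26FF, "Miscellaneous symbols"),
    (0x1D300, 0x1D35F, "Tai Xuan Jing symbols"),
]


def covered_by_wide(start: int, end: int) -> bool:
    """Return True when the whole block lies inside U+2E80..U+FE4F."""
    return 0x2E80 <= start and end <= 0xFE4F


def is_wide_name(name: str) -> bool:
    """Return True when every character of name matches the wide pattern."""
    return WIDE.fullmatch(name) is not None


def is_extended_name(name: str) -> bool:
    """Same check, using the hypothetical extended pattern."""
    return EXTENDED.fullmatch(name) is not None


for start, end, label in BLOCKS:
    status = "inside" if covered_by_wide(start, end) else "outside"
    print(f"U+{start:04X}-U+{end:04X} {label}: {status}")

# Hypothetical sample names, written as escapes so the code points are visible.
print(is_wide_name("\u73a9\u5bb6\u3001\u306e"))  # Han + U+3001 + Hiragana -> True
print(is_wide_name("\u73a9\u5bb6\uff0c"))        # full-width comma U+FF0C -> False
print(is_extended_name("\u73a9\u5bb6\uff0c"))    # accepted once FF00-FFEF is added -> True
```

Blocks reported as "outside" (for example FF00-FFEF, 1100-11FF, 2600-26FF and 1D300-1D35F) would have to be added to the character class explicitly if names using them should be accepted. The single wide range also spans areas that are not in the list, such as the private use area U+E000-U+F8FF, so a stricter validator may prefer to enumerate the blocks it wants.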