Summary of encoding ranges of character sets in Regular Expressions
Last Update: 2018-12-08
Source: Internet
Author: User
These patterns are especially helpful for matching the letters, punctuation marks, and special characters of the Japanese character sets.
UTF8
[\x01-\x7f]|[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}
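As a minimal sketch of how the UTF8 pattern can be applied, the snippet below compiles it as a Python bytes regex and checks whether a whole byte string is consumed by it. The names UTF8_CHAR, UTF8_TEXT and looks_like_utf8 are hypothetical and not part of the original listing. The pattern is loose (it accepts some sequences a strict UTF-8 decoder would reject), so treat the result as a hint rather than a validation.

```python
import re

# Loose UTF-8 pattern from the listing, written as a bytes regex.
UTF8_CHAR = rb"[\x01-\x7f]|[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}"

# Zero or more characters; used with fullmatch so every byte must be covered.
UTF8_TEXT = re.compile(rb"(?:" + UTF8_CHAR + rb")*")


def looks_like_utf8(data: bytes) -> bool:
    """Return True if the whole byte string is covered by the loose pattern."""
    return UTF8_TEXT.fullmatch(data) is not None
```

For example, looks_like_utf8("日本語".encode("utf-8")) is True, while the same text encoded as Shift_JIS is rejected because its lead bytes fall outside every branch of the pattern.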
UTF16
[\x00-\xd7][\xe0-\xff]|[\xd8-\xdf][\x00-\xff]{2}
JIS
[\x20-\x7e]|[\x21-\x5f]|[\x21-\x7e]{2}
SJIS
[\x20-\x7e]|[\xa1-\xdf]|([\x81-\x9f]|[\xe0-\xef])([\x40-\x7e]|[\x80-\xfc])
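The SJIS pattern matches one character at a time, so it can be used to split a Shift_JIS byte string into characters. The following is a rough sketch under that assumption; SJIS_CHAR and split_sjis are hypothetical names, and the groups are written as non-capturing so that findall returns whole matches.

```python
import re
from typing import List

# One Shift_JIS character: ASCII, halfwidth katakana, or a two-byte character.
SJIS_CHAR = re.compile(
    rb"[\x20-\x7e]|[\xa1-\xdf]|(?:[\x81-\x9f]|[\xe0-\xef])(?:[\x40-\x7e]|[\x80-\xfc])"
)


def split_sjis(data: bytes) -> List[bytes]:
    """Return the byte sequences of the individual characters found in data."""
    return SJIS_CHAR.findall(data)
```

Bytes that match no branch (for example control characters below \x20) are skipped silently by findall, so compare the total length of the result with len(data) if complete coverage matters.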
EUC_JP
[\x20-\x7e]|\x8e[\xa1-\xdf]|[\xa1-\xfe][\xa1-\xfe]|\x8f[\xa1-\xfe]{2}
EUC_JP punctuation marks and special characters
[\xa1-\xa2][\xa0-\xfe]
EUC_JP fullwidth digits
\xa3[\xb0-\xb9]
EUC_JP fullwidth uppercase English
\xa3[\xc1-\xda]
EUC_JP fullwidth lowercase English
\xa3[\xe1-\xfa]
EUC_JP fullwidth hiragana
\xa4[\xa1-\xf3]
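A small sketch of how a single-category range such as the EUC_JP hiragana pattern might be used: repeat the two-byte pattern and require it to cover the whole input. EUC_JP_HIRAGANA and is_all_hiragana_euc are hypothetical names chosen for this illustration.

```python
import re

# One or more EUC_JP hiragana characters (lead byte \xa4, trail byte \xa1-\xf3).
EUC_JP_HIRAGANA = re.compile(rb"(?:\xa4[\xa1-\xf3])+")


def is_all_hiragana_euc(data: bytes) -> bool:
    """Return True if data is a non-empty run of EUC_JP hiragana only."""
    return EUC_JP_HIRAGANA.fullmatch(data) is not None
```

Using fullmatch on the complete string keeps the two-byte units aligned from the first byte, which avoids accidental matches that straddle two neighbouring characters.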
EUC_JP fullwidth katakana (updated)
\xa3[\xb0-\xb9]|\xa3[\xc1-\xda]|\xa5[\xa1-\xf6][\xa3][\xb0-\xfa]|[\xa1][\xbc-\xbe]|[\xa1][\xdd]
EUC_JP fullwidth kanji (updated)
[\xb0-\xcf][\xa0-\xd3]|[\xd0-\xf4][\xa0-\xfe]|[\xB0-\xF3][\xA1-\xFE]|[\xF4][\xA1-\xA6]|[\xA4][\xA1-\xF3]|[\xA5][\xA1-\xF6]|[\xA1][\xBC-\xBE]
Big5
[\x01-\x7f]|[\x81-\xfe]([\x40-\x7e]|[\xa1-\xfe])
GBK
[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]
GB2312 Chinese characters
[\xb0-\xf7][\xa0-\xfe]
GB2312 halfwidth punctuation marks and special symbols
\xa1[\xa2-\xfe]
GB2312 Roman numerals and list numbering
\xa2([\xa1-\xaa]|[\xb1-\xbf]|[\xc0-\xdf]|[\xe0-\xe2]|[\xe5-\xee]|[\xf1-\xfc])
GB2312 fullwidth punctuation and fullwidth letters
\xa3[\xa1-\xfe]
GB2312 Japanese hiragana
\xa4[\xa1-\xf3]
GB2312 Japanese katakana
\xa5[\xa1-\xf6]
Supplement:
GB18030
[\x00-\x7f]|[\x81-\xfe][\x40-\xfe]|[\x81-\xfe][\x30-\x39][\x81-\xfe][\x30-\x39]
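The GB18030 pattern covers one-, two- and four-byte characters, which can be illustrated by splitting a byte string into its characters. This is only a sketch; GB18030_CHAR and split_gb18030 are hypothetical names. The two-byte branch cannot swallow the start of a four-byte character, because the second byte of a four-byte character lies in \x30-\x39, outside \x40-\xfe.

```python
import re
from typing import List

# One GB18030 character: single byte, two-byte, or four-byte form.
GB18030_CHAR = re.compile(
    rb"[\x00-\x7f]|[\x81-\xfe][\x40-\xfe]|[\x81-\xfe][\x30-\x39][\x81-\xfe][\x30-\x39]"
)


def split_gb18030(data: bytes) -> List[bytes]:
    """Return the byte sequences of the individual characters found in data."""
    return GB18030_CHAR.findall(data)
```

For the input b"a\xd6\xd0\x81\x30\x81\x30" this yields three items of 1, 2 and 4 bytes. As with any findall-based split, bytes that fit no branch are dropped without an error.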
Supplement
Japanese halfwidth space
\x20
SJIS fullwidth space
(?:\x81\x40)
SJIS fullwidth digits
(?:\x82[\x4f-\x58])
SJIS fullwidth uppercase English
(?:\x82[\x60-\x79])
SJIS fullwidth lowercase English
(?:\x82[\x81-\x9a])
SJIS fullwidth hiragana
(?:\x82[\x9f-\xf1])
SJIS fullwidth hiragana, extended
(?:\x82[\x9f-\xf1]|\x81[\x4a\x4b\x54\x55])
SJIS fullwidth katakana
(?:\x83[\x40-\x96])
SJIS fullwidth katakana, extended
(?:\x83[\x40-\x96]|\x81[\x45\x5b\x52\x53])
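The SJIS category patterns are two-byte ranges, so searching for them directly in a raw byte string can match across the boundary between two neighbouring characters. One possible approach, sketched below, is to split the text into characters first and then test each character against the basic category patterns. All names here (SJIS_CHAR, CLASSES, classify_sjis) are hypothetical.

```python
import re
from typing import List

# One Shift_JIS character: ASCII, halfwidth katakana, or a two-byte character.
SJIS_CHAR = re.compile(rb"[\x20-\x7e]|[\xa1-\xdf]|[\x81-\x9f\xe0-\xef][\x40-\x7e\x80-\xfc]")

# Category patterns taken from the SJIS entries of the listing.
CLASSES = [
    ("digit", re.compile(rb"\x82[\x4f-\x58]")),
    ("upper", re.compile(rb"\x82[\x60-\x79]")),
    ("lower", re.compile(rb"\x82[\x81-\x9a]")),
    ("hiragana", re.compile(rb"\x82[\x9f-\xf1]")),
    ("katakana", re.compile(rb"\x83[\x40-\x96]")),
]


def classify_sjis(data: bytes) -> List[str]:
    """Label each Shift_JIS character in data with its category name."""
    labels = []
    for ch in SJIS_CHAR.findall(data):
        for name, pattern in CLASSES:
            if pattern.fullmatch(ch):
                labels.append(name)
                break
        else:
            # No category pattern matched this character.
            labels.append("other")
    return labels
```

Applied to the Shift_JIS bytes of "１Ａａあア漢", this labels the characters digit, upper, lower, hiragana, katakana and other, in that order.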
EUC_JP fullwidth space
(?:\xa1\xa1)
EUC halfwidth katakana
(?:\x8e[\xa6-\xdf])