In Java, Chinese text is usually detected with a regular expression:
Pattern pattern = Pattern.compile("[\u4e00-\u9fcc]+");
System.out.println(pattern.matcher(str).find());
Or, to require that the entire string consists of Chinese characters:
System.out.println(str.matches("[\u4e00-\u9fcc]+"));
This approach misses some rare Chinese characters that fall outside the range U+4E00 to U+9FCC, and it cannot match Chinese punctuation such as “ ” , ;.
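The limitation can be seen with a few sample inputs. The following is a minimal sketch, not part of the original code: the class name RegexRangeLimits and the helper matchesRange are hypothetical, and the test strings are written as Unicode escapes so the example does not depend on the source-file encoding.

```java
import java.util.regex.Pattern;

class RegexRangeLimits {
    // Same character range as the regular expression discussed above.
    static final Pattern RANGE = Pattern.compile("[\u4e00-\u9fcc]+");

    // True only if the whole string lies inside the range.
    static boolean matchesRange(String s) {
        return RANGE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesRange("\u4E2D\u6587")); // two common characters: true
        System.out.println(matchesRange("\u3400"));       // U+3400, Extension A: false
        System.out.println(matchesRange("\uD840\uDC00")); // U+20000, Extension B: false
        System.out.println(matchesRange("\uFF0C"));       // full-width comma: false
    }
}
```

Only the first input is accepted, because the other three lie outside U+4E00 to U+9FCC.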
In Java, the Character class handles most character-related operations, and its nested classes can be used to detect Chinese.
1. Character.UnicodeBlock
Detecting Chinese characters:
private static boolean isChineseByBlock(char c) {
    Character.UnicodeBlock ub = Character.UnicodeBlock.of(c);
    // Note: Extensions B, C, D and the Compatibility Supplement lie outside the BMP,
    // so a single char never falls in them; they only match via of(int codePoint).
    if (ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
            || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
            || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B
            || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_C // JDK 1.7+
            || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_D // JDK 1.7+
            || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
            || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS_SUPPLEMENT) {
        return true;
    }
    return false;
}
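Several of the blocks listed above (Extensions B, C, D and the Compatibility Supplement) sit in the supplementary planes, where one character occupies two char values. A possible variant that works on code points is sketched below. It assumes Java 8 or later for String.codePoints(), and the names ChineseBlockCheck, isChineseCodePoint and containsChinese are hypothetical.

```java
import java.lang.Character.UnicodeBlock;

class ChineseBlockCheck {
    // Same block list as above, but applied to a full code point.
    static boolean isChineseCodePoint(int codePoint) {
        UnicodeBlock ub = UnicodeBlock.of(codePoint);
        return ub == UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
                || ub == UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
                || ub == UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B
                || ub == UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_C
                || ub == UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_D
                || ub == UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
                || ub == UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS_SUPPLEMENT;
    }

    // True if any code point in the string belongs to one of the blocks.
    static boolean containsChinese(String s) {
        return s.codePoints().anyMatch(ChineseBlockCheck::isChineseCodePoint);
    }

    public static void main(String[] args) {
        String extB = "\uD840\uDC00"; // U+20000, stored as a surrogate pair
        // Looking at the first char alone sees only a high surrogate.
        System.out.println(isChineseCodePoint(extB.charAt(0))); // false
        // Iterating by code point sees the Extension B character.
        System.out.println(containsChinese(extB));              // true
    }
}
```

Iterating with codePoints() rather than charAt() is what lets the supplementary blocks match at all.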
Detecting Chinese punctuation:
private static boolean isChinesePunctuation(char c) {
    Character.UnicodeBlock ub = Character.UnicodeBlock.of(c);
    if (ub == Character.UnicodeBlock.GENERAL_PUNCTUATION
            || ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
            || ub == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS
            || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_FORMS
            || ub == Character.UnicodeBlock.VERTICAL_FORMS) { // JDK 1.7+
        return true;
    }
    return false;
}
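To see why several blocks are needed for punctuation, it can help to look up where a few common marks live. The sketch below is illustrative only: the class PunctuationBlocks and the helper blockOf are hypothetical names. It also shows one caveat of a block-based test: HALFWIDTH_AND_FULLWIDTH_FORMS contains full-width Latin letters as well, so a check on that block accepts more than punctuation.

```java
import java.lang.Character.UnicodeBlock;

class PunctuationBlocks {
    // Thin wrapper so the lookups below read uniformly.
    static UnicodeBlock blockOf(char c) {
        return UnicodeBlock.of(c);
    }

    public static void main(String[] args) {
        System.out.println(blockOf('\u3002')); // ideographic full stop
        System.out.println(blockOf('\uFF0C')); // full-width comma
        System.out.println(blockOf('\u201C')); // left double quotation mark
        System.out.println(blockOf(','));      // ASCII comma, for comparison
        System.out.println(blockOf('\uFF21')); // full-width Latin capital A
    }
}
```

The first three marks fall into three different blocks, the ASCII comma falls into BASIC_LATIN, and the full-width letter shares a block with the full-width comma.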
2. Character.UnicodeScript (available only since JDK 1.7)
Character.UnicodeScript provides a more concise way to detect Chinese characters:
private static boolean isChineseByScript(char c) {
    Character.UnicodeScript sc = Character.UnicodeScript.of(c);
    if (sc == Character.UnicodeScript.HAN) {
        return true;
    }
    return false;
}
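The script-based test can also be applied to a whole string, one code point at a time, so that characters outside the BMP are covered as well. This is a rough sketch assuming Java 8 or later for String.codePoints() and lambdas. The names HanScriptCheck and isAllHan are hypothetical, and treating the empty string as "not Chinese" is a choice made here, not something the original specifies.

```java
import java.lang.Character.UnicodeScript;

class HanScriptCheck {
    // True if the string is non-empty and every code point has script HAN.
    static boolean isAllHan(String s) {
        return !s.isEmpty()
                && s.codePoints().allMatch(cp -> UnicodeScript.of(cp) == UnicodeScript.HAN);
    }

    public static void main(String[] args) {
        System.out.println(isAllHan("\u4E2D\u6587")); // two common characters: true
        System.out.println(isAllHan("\uD840\uDC00")); // U+20000, Extension B: true
        System.out.println(isAllHan("\u4E2D\uFF0C")); // character plus full-width comma: false
        System.out.println(isAllHan("\u3042"));       // Hiragana letter: false
    }
}
```

Punctuation such as the full-width comma has script COMMON rather than HAN, so the block-based punctuation check is still needed when punctuation should count as Chinese.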