Java: judging whether Chinese characters are garbled
Java checks whether Chinese characters are garbled
import java.util.regex.Pattern;

public class MessyCodeCheck {

    // Matches any run of whitespace (spaces, tabs, CR, LF).
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // True if c is in a CJK ideograph block or in a punctuation/width-form
    // block that commonly appears in Chinese text.
    private static boolean isChinese(char c) {
        Character.UnicodeBlock ub = Character.UnicodeBlock.of(c);
        return ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
                || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
                || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
                || ub == Character.UnicodeBlock.GENERAL_PUNCTUATION
                || ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
                || ub == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS;
    }

    // Heuristic: after stripping whitespace and punctuation, the string is
    // treated as garbled if more than 40% of the remaining characters are
    // neither letters/digits nor Chinese characters.
    public static boolean isMessyCode(String strName) {
        String after = WHITESPACE.matcher(strName).replaceAll("");
        String temp = after.replaceAll("\\p{P}", "");
        char[] ch = temp.toCharArray();
        if (ch.length == 0) {
            return false; // nothing left to judge
        }
        float count = 0;
        for (char c : ch) {
            if (!Character.isLetterOrDigit(c) && !isChinese(c)) {
                count++;
            }
        }
        return count / ch.length > 0.4;
    }
}
Java: how to judge whether a string is garbled
I think the original poster is talking about the garbled characters commonly seen on Windows, but that does not happen inside Java itself, because Java uses the Unicode character set. For details, see: zhidao.baidu.com/question/31810916.html?si=3
So text is output in whatever language it was written in, just as you can now view Japanese web pages directly without garbled characters.
The regular expression in the second reply is the range that Chinese characters occupy in the Unicode set.
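The regular expression from the second reply is not reproduced on this page. As a rough sketch of what such a check usually looks like, the class below assumes the commonly used range U+4E00 to U+9FA5 (part of the basic CJK Unified Ideographs block). The class name ChineseRangeCheck and the method containsChinese are hypothetical, chosen only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not taken from the thread.
final class ChineseRangeCheck {

    // Assumed range: U+4E00..U+9FA5, the span most often used in
    // "is this a Chinese character" regexes.
    private static final Pattern CJK = Pattern.compile("[\\u4e00-\\u9fa5]");

    // True if the string contains at least one character in the assumed range.
    static boolean containsChinese(String s) {
        return CJK.matcher(s).find();
    }

    public static void main(String[] args) {
        // Unicode escapes are used so the result does not depend on how
        // this source file is encoded; they spell the two characters of "Chinese text".
        System.out.println(containsChinese("abc\u4e2d\u6587")); // true
        System.out.println(containsChinese("abc"));             // false
    }
}
```

A range test like this only says whether Chinese characters are present. It does not by itself say whether a string is garbled, since garbled text can also decode into characters inside the same range.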
Java: how can I tell whether a string is garbled? Is there a jar package that automatically converts garbled characters back into Chinese characters, and what is it called?
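Make sure the encoding the source file is saved in matches the encoding parameter passed to the compiler (for example, javac -encoding); when the two match, there are no garbled characters.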
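The same principle applies at run time: text stays readable only when bytes are decoded with the charset they were encoded in. The sketch below is not from the thread and is not the jar asked about. It reproduces one common kind of garbling (UTF-8 bytes decoded as ISO-8859-1) and then reverses it, using only the standard library. The class name EncodingMismatchDemo and its methods garble and repair are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
final class EncodingMismatchDemo {

    // Reproduce garbling: encode as UTF-8, then decode with the wrong charset.
    static String garble(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverse it: ISO-8859-1 maps each byte to exactly one char, so the
    // original bytes can be recovered and decoded again as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters, written as escapes
        String garbled = garble(original);
        System.out.println(garbled.equals(original));         // false
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

This repair works only when the wrong charset is known and lossless, as ISO-8859-1 is. If the bad decoding step replaced bytes with a substitute character, the original text cannot be recovered this way.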