Several Ways to Calculate the Length of Mixed Chinese and English Strings in Java
In project development, you often need to validate input, especially input that mixes Chinese and English characters. To enforce a length limit, you sometimes need to calculate the length of a string that contains both.
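To see why String.length() alone may not be enough for this kind of check, here is a minimal sketch. It assumes the validation rule counts each Chinese character as two units, as the gb2312 encoding does; the class name LengthMismatchDemo and the sample string are illustrative and not taken from any real project.

```java
import java.nio.charset.Charset;

final class LengthMismatchDemo {
    // Number of UTF-16 code units, which is what String.length() reports.
    static int charCount(String s) {
        return s.length();
    }

    // Number of bytes after encoding with the named charset (assumed here: GB2312).
    static int byteCount(String s, String charsetName) {
        return s.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        // Two Chinese characters (U+4E2D, U+6587) followed by two ASCII letters.
        String sample = "\u4e2d\u6587ab";
        System.out.println(charCount(sample));           // 4
        System.out.println(byteCount(sample, "GB2312")); // 6
    }
}
```

The two counts differ because each Chinese character occupies two bytes in GB2312 but only one char in a Java String.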
This article shows several common ways to calculate that length:
import java.io.UnsupportedEncodingException;

/**
 * Length checks for strings that mix Chinese and English characters.
 *
 * @author a123demi
 */
public class EnChValidate {

    public static void main(String[] args) {
        String validateStr = "AB ababcde interface ii";
        int bytesStrLength = getBytesStrLength(validateStr);
        int chineseLength = getChineseLength(validateStr);
        int regexpLength = getRegExpLength(validateStr);
        System.out.println("getBytesLength:" + bytesStrLength
                + ", chineseLength:" + chineseLength
                + ", regexpLength:" + regexpLength);
    }

    /**
     * Encodes the string as gb2312 bytes, then decodes those bytes as
     * iso-8859-1 so that the temporary string has one character per byte.
     *
     * @param validateStr the string to measure
     * @return the number of bytes in the gb2312 encoding
     */
    public static int getBytesStrLength(String validateStr) {
        String tempStr = "";
        try {
            tempStr = new String(validateStr.getBytes("gb2312"), "iso-8859-1");
        } catch (UnsupportedEncodingException e) {
            // Thrown if the JVM does not support one of the two charsets
            e.printStackTrace();
        }
        return tempStr.length();
    }

    /**
     * Gets the length of a string, counting each Chinese character as two.
     *
     * @param validateStr the string to measure
     * @return the string length
     */
    public static int getChineseLength(String validateStr) {
        int valueLength = 0;
        // Range U+0391 to U+FFE5; besides Chinese characters it also covers
        // Greek, Cyrillic, and full-width symbols
        String chinese = "[\u0391-\uFFE5]";
        // Add 2 for each character in the range, otherwise add 1
        for (int i = 0; i < validateStr.length(); i++) {
            // Get a single character
            String temp = validateStr.substring(i, i + 1);
            // Check whether it is a Chinese character
            if (temp.matches(chinese)) {
                valueLength += 2;
            } else {
                valueLength += 1;
            }
        }
        return valueLength;
    }

    /**
     * Uses a regular expression to replace each double-byte character with
     * two placeholder characters, then returns the length of the result.
     * Regular expression matching Chinese characters: [\u4e00-\u9fa5]
     * Regular expression matching double-byte characters (including Chinese): [^\x00-\xff]
     *
     * @param validateStr the string to measure
     * @return the string length, counting each double-byte character as two
     */
    public static int getRegExpLength(String validateStr) {
        // String temp = validateStr.replaceAll("[\u4e00-\u9fa5]", "**");
        String temp = validateStr.replaceAll("[^\\x00-\\xff]", "**");
        return temp.length();
    }
}
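The methods in the listing work either on a legacy encoding or on fixed character ranges. As a complementary sketch, not part of the original listing, the same "Chinese counts as two" rule can also be written with code points and Character.UnicodeScript, which does not depend on the gb2312 charset. The class name HanAwareLength and the fitsWithin helper are hypothetical names chosen for illustration.

```java
final class HanAwareLength {
    // Counts each Han (Chinese) code point as 2 and every other code point as 1.
    static int weightedLength(String s) {
        int total = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            boolean isHan =
                    Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN;
            total += isHan ? 2 : 1;
            // Advance by 1 or 2 chars, depending on whether cp needs a surrogate pair.
            i += Character.charCount(cp);
        }
        return total;
    }

    // Hypothetical validation helper: true when the weighted length fits the limit.
    static boolean fitsWithin(String s, int maxLength) {
        return weightedLength(s) <= maxLength;
    }

    public static void main(String[] args) {
        // Two Chinese characters (U+4E2D, U+6587) followed by two ASCII letters.
        String sample = "\u4e2d\u6587ab";
        System.out.println(weightedLength(sample)); // 2 + 2 + 1 + 1 = 6
        System.out.println(fitsWithin(sample, 5));  // false
    }
}
```

Note that this variant only gives Han characters a weight of two. Full-width punctuation, for example, is not in the Han script, so it counts as one here, whereas the range-based and byte-based methods would count it as two. Which behavior is right depends on the validation rule being enforced.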