Determining whether a character is Chinese in Java

Source: Internet
Author: User
Java code
  // Requires: import java.nio.charset.Charset;
  // Returns true if any character in str encodes to a two-byte sequence in the GBK range.
  public boolean vd(String str) {
    char[] chars = str.toCharArray();
    boolean isGB2312 = false;
    for (int i = 0; i < chars.length; i++) {
      // Encode with GBK explicitly; the platform default charset may differ.
      byte[] bytes = ("" + chars[i]).getBytes(Charset.forName("GBK"));
      if (bytes.length == 2) {
        int[] ints = new int[2];
        // Mask with 0xff to read each byte as an unsigned value.
        ints[0] = bytes[0] & 0xff;
        ints[1] = bytes[1] & 0xff;
        if (ints[0] >= 0x81 && ints[0] <= 0xFE && ints[1] >= 0x40 && ints[1] <= 0xFE) {
          isGB2312 = true;
          break;
        }
      }
    }
    return isGB2312;
  }

First, import java.util.regex.Pattern and java.util.regex.Matcher.
With these two classes imported, the code is as follows:

Java code
  public boolean isNumeric(String str) {
    // Note: "[0-9]*" also matches the empty string.
    Pattern pattern = Pattern.compile("[0-9]*");
    Matcher isNum = pattern.matcher(str);
    if (!isNum.matches()) {
      return false;
    }
    return true;
  }

  // To test a single char instead, e.g. the first element of a char[] named ch:
  java.lang.Character.isDigit(ch[0])
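
The pattern above accepts the empty string, because `*` allows zero digits. The following is a minimal sketch of the difference between `[0-9]*` and `[0-9]+`; the class name `NumericCheck` and both method names are hypothetical, chosen only for illustration.

```java
import java.util.regex.Pattern;

final class NumericCheck {
    // Same pattern as the article: zero or more ASCII digits.
    private static final Pattern DIGITS_OR_EMPTY = Pattern.compile("[0-9]*");
    // Stricter variant: at least one ASCII digit.
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    static boolean isNumericOrEmpty(String s) {
        return DIGITS_OR_EMPTY.matcher(s).matches();
    }

    static boolean isNumeric(String s) {
        return DIGITS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isNumericOrEmpty("")); // true
        System.out.println(isNumeric(""));        // false
        System.out.println(isNumeric("2024"));    // true
        System.out.println(isNumeric("20a4"));    // false
    }
}
```

Which variant is appropriate depends on whether an empty input should count as numeric in your application.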

----------------- Another approach -----------------
Java code

  1. Public static void main (string [] ARGs ){
  2. Int COUNT = 0;
  3. String RegEx = "[\ u4e00-\ u9fa5]";
  4. // System. Out. println (RegEx );
  5. String STR = "Chinese fdas ";
  6. // System. Out. println (STR );
  7. Pattern P = pattern. Compile (RegEx );
  8. Matcher M = P. matcher (STR );
  9. While (M. Find ()){
  10. For (INT I = 0; I <= M. groupcount (); I ++ ){
  11. Count = count + 1;
  12. }
  13. }
  14. System. Out. println ("Total" + Count + "count ");
  15. }
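
The same regular expression can be wrapped in reusable helpers. This is a sketch only: the class name `ChineseRegex` and the methods `containsChinese` and `countChinese` are hypothetical names, and the range U+4E00 to U+9FA5 is simply the one used in the example above. Chinese characters are written as Unicode escapes so the source compiles regardless of file encoding.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ChineseRegex {
    // Compiled once and reused; the range is taken from the article's example.
    private static final Pattern CHINESE = Pattern.compile("[\\u4e00-\\u9fa5]");

    // True if at least one char of the input falls inside the range.
    static boolean containsChinese(String input) {
        return CHINESE.matcher(input).find();
    }

    // Number of chars of the input that fall inside the range.
    static int countChinese(String input) {
        Matcher m = CHINESE.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "\u4e2d\u6587fdas"; // two Chinese characters followed by "fdas"
        System.out.println(containsChinese(sample)); // true
        System.out.println(countChinese(sample));    // 2
        System.out.println(containsChinese("fdas")); // false
    }
}
```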

-------------------------------------------------------------------

Determining whether a Java string contains Chinese characters

Java's char is an unsigned 16-bit Unicode value in the range 0-65535, so it can represent 65,536 distinct values, enough for the vast majority of characters in everyday use. In practice, to meet business needs, we often want to determine whether a character is Chinese, or whether a string contains Chinese characters. The String class provides the length() method, which returns the number of characters (chars) in the string. For example:
Java code

  1. String S1 = "I am a Chinese ";
  2. String S2 = "imchinese ";
  3. String S3 = "Im Chinese ";
  4. System. Out. println (S1 + ":" + new string (S1). Length ());
  5. System. Out. println (s2 + ":" + new string (S2). Length ());
  6. System. Out. println (S3 + ":" + new string (S3). Length ());

Output:
I am a Chinese: 5
Imchinese: 9
Im Chinese: 5
As you can see, length() counts chars rather than bytes, so a Chinese character counts as one, just like an ASCII letter. The byte count is where they differ: in a double-byte encoding such as GBK, each Chinese character takes two bytes, while each ASCII character takes one.
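
To make the difference between character count and byte count concrete, here is a small sketch that prints both for three sample strings. It assumes UTF-8 rather than the platform default charset, so each of these Chinese characters takes three bytes instead of two; the class name `LengthVsBytes` and the helper `utf8Length` are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class LengthVsBytes {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String s1 = "\u6211\u662f\u4e2d\u56fd\u4eba"; // five Chinese characters
        String s2 = "imchinese";                      // nine ASCII letters
        String s3 = "im\u4e2d\u56fd\u4eba";           // two ASCII letters, three Chinese characters
        for (String s : new String[] {s1, s2, s3}) {
            System.out.println(s.length() + " chars, " + utf8Length(s) + " UTF-8 bytes");
        }
        // Prints:
        // 5 chars, 15 UTF-8 bytes
        // 9 chars, 9 UTF-8 bytes
        // 5 chars, 11 UTF-8 bytes
    }
}
```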
Based on this rule, and following a tip from a QQ contact, the problem can be solved by checking whether the string's length equals its byte length: if the two differ, the string contains a double-byte character.
Java code

  // getBytes() uses the platform default charset, so the exact byte count depends on the platform.
  System.out.println((s1.getBytes().length == s1.length()) ? "s1 has no Chinese characters" : "s1 has Chinese characters");
  System.out.println((s2.getBytes().length == s2.length()) ? "s2 has no Chinese characters" : "s2 has Chinese characters");
  System.out.println((s3.getBytes().length == s3.length()) ? "s3 has no Chinese characters" : "s3 has Chinese characters");

Output:
s1 has Chinese characters
s2 has no Chinese characters
s3 has Chinese characters
This way, we can determine whether a string contains double-byte characters. However, accurately determining whether a string contains Chinese characters is harder, because characters from many other languages also take more than one byte.
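
The limitation is easy to reproduce. The sketch below applies the byte-length comparison to a French word containing an accented letter; it pins the encoding to UTF-8 (an assumption, since the article's code uses the platform default), and the names `ByteLengthPitfall` and `hasMultiByteChar` are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class ByteLengthPitfall {
    // The byte-length heuristic from the text, with the charset fixed to UTF-8
    // so the result does not depend on the platform default.
    static boolean hasMultiByteChar(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length != s.length();
    }

    public static void main(String[] args) {
        String french = "caf\u00e9";            // "cafe" with an acute accent on the e
        String chinese = "im\u4e2d\u56fd\u4eba"; // "im" plus three Chinese characters
        System.out.println(hasMultiByteChar(french));      // true, although there is no Chinese
        System.out.println(hasMultiByteChar(chinese));     // true
        System.out.println(hasMultiByteChar("imchinese")); // false
    }
}
```

The heuristic therefore answers "is there a non-ASCII character?" rather than "is there a Chinese character?".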
Therefore, we need to go one step further and determine the encoding range of Chinese characters. The method I used was to print the characters of the char range into Notepad. The output shows that the first Chinese character is '一' and the last one is '??' (a character I do not recognize). With that, judging Chinese characters becomes much easier: we can simply compare a character's code against the range. As a final result, the Chinese characters are basically concentrated in the range [20901,], with a total of Chinese characters (the figure may be slightly low, depending on which characters you count).
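
A range comparison of this kind might look like the sketch below. It assumes the bounds U+4E00 to U+9FA5 from the regular expression earlier in this article, not the figures in the paragraph above. As an alternative, it also shows the JDK's `Character.UnicodeScript` lookup, which classifies code points by script and also recognizes Han characters outside that range. The class and method names are hypothetical.

```java
final class HanCheck {
    // Bounds taken from the regular expression used earlier in the article.
    static boolean inBasicRange(int codePoint) {
        return codePoint >= 0x4E00 && codePoint <= 0x9FA5;
    }

    // True if any code point of the string lies in the basic range.
    static boolean containsChineseByRange(String s) {
        return s.codePoints().anyMatch(HanCheck::inBasicRange);
    }

    // Alternative: ask the JDK which Unicode script each code point belongs to.
    static boolean containsHan(String s) {
        return s.codePoints()
                .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }

    public static void main(String[] args) {
        String mixed = "im\u4e2d\u56fd\u4eba"; // "im" plus three Chinese characters
        String french = "caf\u00e9";           // accented Latin text, no Chinese
        System.out.println(containsChineseByRange(mixed));  // true
        System.out.println(containsChineseByRange(french)); // false
        System.out.println(containsHan(mixed));             // true
        System.out.println(containsHan(french));            // false
    }
}
```

The script-based check needs Java 8 or later because of `String.codePoints()`. Note that the Han script also covers the ideographs used in Japanese and Korean text, so neither check identifies the language of a string.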
