Java regular expressions for matching Chinese characters

Source: Internet
Author: User
Tags: expression, html tags, lowercase, regular expression

The code is as follows

[\u4e00-\u9fa5] Chinese characters; [\ufe30-\uffa0] full-width characters


Regular expression matching Chinese characters: [\u4e00-\u9fa5]

Match double-byte characters (including Chinese characters): [^\x00-\xff]

Application: compute the length of a string (a double-byte character counts as 2, an ASCII character as 1)

The code is as follows
String.prototype.len = function () { return this.replace(/[^\x00-\xff]/g, "aa").length; };
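The snippet above is JavaScript. The same counting rule could be sketched in Java as below. This is a minimal sketch rather than part of the original: the class name DisplayLength and the method len are hypothetical, and it assumes every character outside \x00-\xff should count as 2.

```java
final class DisplayLength {
    // Replace each character outside \x00-\xff with two placeholder characters,
    // then take the length: such characters count as 2, all others as 1.
    static int len(String s) {
        return s.replaceAll("[^\\x00-\\xff]", "aa").length();
    }

    public static void main(String[] args) {
        // "ab" followed by two Chinese characters (U+4E2D and U+6587)
        System.out.println(len("ab\u4e2d\u6587"));
    }
}
```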

A regular expression that matches a blank line: \n[\s| ]*\r

Regular expression matching HTML tags: /<(.*)>.*<\/\1>|<(.*) \/>/

A regular expression matching leading and trailing whitespace: (^\s*)|(\s*$)

With these expressions in hand, the rest is straightforward.

The code is as follows

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegxChinese {
    public static void regxChinese() {
        // String to match
        String source = "<span title='5 star hotels' class='dx dx5'>";
        // Convert the string above to lowercase if needed
        // source = source.toLowerCase();
        // Regular expression for the matching string; group 1 captures the title text
        String reg_charset = "<span[^>]*?title='([0-9]*[\\s|\\S]*[\\u4e00-\\u9fa5]*)'[\\s|\\S]*class='[a-z]*[\\s|\\S]*[a-z]*[0-9]*'";
        Pattern p = Pattern.compile(reg_charset);
        Matcher m = p.matcher(source);
        while (m.find()) {
            // Print the captured title text
            System.out.println(m.group(1));
        }
    }

    public static void main(String[] args) {
        regxChinese();
    }
}
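Because [\s|\S]* matches any character, the pattern above accepts almost any title text. A tighter variant could look like the sketch below. The class name TitleExtractor, the method titles, and the sample title containing Chinese characters are all assumptions made for illustration, not taken from the original.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TitleExtractor {
    // Optional digits, optional whitespace, then one or more characters
    // in the range U+4E00..U+9FA5, all inside single quotes after title=.
    private static final Pattern TITLE =
            Pattern.compile("<span[^>]*?title='([0-9]*\\s*[\\u4e00-\\u9fa5]+)'");

    // Collects group 1 of every match found in the input.
    static List<String> titles(String html) {
        List<String> out = new ArrayList<>();
        Matcher m = TITLE.matcher(html);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical input: "5" followed by U+661F U+7EA7 U+9152 U+5E97
        String source = "<span title='5\u661f\u7ea7\u9152\u5e97' class='dx dx5'>";
        System.out.println(titles(source));
    }
}
```

With this variant a title made only of Latin letters is not captured, since at least one character from the Chinese range is required.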

Java regular expressions can match Chinese characters by range, and the literal text can also be written directly in the expression.

The code is as follows

String reg_charset = "<span[^>]*?title='([0-9]*[\\s|\\S]*star hotels)'[\\s|\\S]*class='[a-z]*[\\s|\\S]*[a-z]*[0-9]*'";
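As a sketch of writing the characters themselves into the pattern, the example below embeds four Chinese characters through Unicode escapes so that it does not depend on the source file encoding. The class name LiteralChineseMatch, the method title, and the sample text are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LiteralChineseMatch {
    // The characters U+661F U+7EA7 U+9152 U+5E97, written as Unicode escapes.
    private static final String SUFFIX = "\u661f\u7ea7\u9152\u5e97";

    // Optional digits and whitespace, then the literal characters above.
    private static final Pattern TITLE =
            Pattern.compile("<span[^>]*?title='([0-9]*\\s*" + SUFFIX + ")'");

    // Returns the captured title, or null when the expected text is absent.
    static String title(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(title("<span title='5\u661f\u7ea7\u9152\u5e97' class='dx dx5'>"));
    }
}
```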


Some common regular expression matching rules

Regular expression matching Chinese characters: [\u4e00-\u9fa5]

Commentary: Matching Chinese text can be a real headache; this expression makes it easy.
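One way to use this range in Java is sketched below. The class name ChineseRange and its two helper methods are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ChineseRange {
    private static final Pattern CHINESE = Pattern.compile("[\\u4e00-\\u9fa5]");

    // True when at least one character falls in U+4E00..U+9FA5.
    static boolean containsChinese(String s) {
        return CHINESE.matcher(s).find();
    }

    // Counts the characters that fall in U+4E00..U+9FA5.
    static int countChinese(String s) {
        int count = 0;
        Matcher m = CHINESE.matcher(s);
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // "a", U+4E2D, "b", U+6587
        System.out.println(countChinese("a\u4e2db\u6587"));
    }
}
```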

Match double-byte characters (including Chinese characters): [^\x00-\xff]

Commentary: Can be used to compute the length of a string (a double-byte character counts as 2, an ASCII character as 1).

A regular expression that matches a blank line: \n\s*\r

Commentary: Can be used to delete blank lines.
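A minimal sketch of deleting blank lines with this expression follows. It assumes the text uses CRLF (\r\n) line endings, which is what the trailing \r in the pattern relies on; the class name BlankLines and the method removeBlankLines are hypothetical.

```java
final class BlankLines {
    // Removes everything from the end of one line up to the \r that starts
    // the next non-blank line's terminator, which drops the blank lines between.
    static String removeBlankLines(String text) {
        return text.replaceAll("\\n\\s*\\r", "");
    }

    public static void main(String[] args) {
        String cleaned = removeBlankLines("first\r\n\r\n   \r\nsecond");
        System.out.println(cleaned.equals("first\r\nsecond"));
    }
}
```

Text that uses bare \n line endings is left unchanged by this pattern, so a different expression would be needed there.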

Regular expression matching HTML tags: <(\S*?)[^>]*>.*?</\1>|<.*? />

Commentary: The versions circulating online are poor. The one above still matches only some cases and remains powerless against complex nested tags.

A regular expression that matches leading and trailing whitespace: ^\s*|\s*$

Commentary: A very useful expression that can be used to delete whitespace characters (including spaces, tabs, form feeds, and so on) at the beginning and end of a line.
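Applied through String.replaceAll, the whitespace expression could be used as in the sketch below. The class name Trimmer and the method trimWhitespace are hypothetical.

```java
final class Trimmer {
    // Deletes the whitespace run at the start and the whitespace run at the end.
    static String trimWhitespace(String s) {
        return s.replaceAll("^\\s*|\\s*$", "");
    }

    public static void main(String[] args) {
        System.out.println("[" + trimWhitespace("  \thello world \n") + "]");
    }
}
```

Whitespace between words is untouched, because only the runs anchored at ^ and $ are replaced.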

Regular expression matching an email address: \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*

Commentary: Useful for form validation.
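For form validation, the expression could be applied to the whole input as sketched below. The class name EmailCheck and the method looksLikeEmail are hypothetical, and the check only tests the shape of the string, not whether the address exists.

```java
import java.util.regex.Pattern;

final class EmailCheck {
    private static final Pattern EMAIL =
            Pattern.compile("\\w+([-+.]\\w+)*@\\w+([-.]\\w+)*\\.\\w+([-.]\\w+)*");

    // matches() requires the entire input to fit the pattern,
    // so no explicit ^ and $ anchors are needed.
    static boolean looksLikeEmail(String s) {
        return EMAIL.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeEmail("user.name@example.com"));
    }
}
```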

Regular expression matching a URL: [a-zA-Z]+://[^\s]*

Commentary: The versions circulating online have very limited functionality; the one above meets basic needs.
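A sketch of pulling URLs out of free text with this expression follows. The class name UrlFinder and the method findUrls are hypothetical, and the example hosts are placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UrlFinder {
    // A scheme made of letters, then "://", then everything up to the next whitespace.
    private static final Pattern URL = Pattern.compile("[a-zA-Z]+://[^\\s]*");

    static List<String> findUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL.matcher(text);
        while (m.find()) {
            urls.add(m.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        System.out.println(findUrls("see http://example.com/a?b=1 and ftp://files.example.org now"));
    }
}
```

Because the match runs to the next whitespace, punctuation that directly follows a URL in a sentence is included in the result.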

Check whether an account name is valid (starts with a letter, 5 to 16 characters, letters, digits, and underscores allowed): ^[a-zA-Z][a-zA-Z0-9_]{4,15}$

Commentary: Useful for form validation.
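The account rule could be checked in Java as in the sketch below. The class name AccountCheck and the method isValid are hypothetical.

```java
import java.util.regex.Pattern;

final class AccountCheck {
    // One leading letter plus 4 to 15 further letters, digits, or underscores.
    private static final Pattern ACCOUNT =
            Pattern.compile("^[a-zA-Z][a-zA-Z0-9_]{4,15}$");

    static boolean isValid(String name) {
        return ACCOUNT.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("user_01"));
        System.out.println(isValid("1user"));
    }
}
```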

Match a domestic phone number: \d{3}-\d{8}|\d{4}-\d{7}

Commentary: Matches forms such as 0511-4405222 or 021-87888822.
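The two forms named in the commentary can be checked as in this sketch. The class name PhoneCheck and the method isDomesticPhone are hypothetical.

```java
import java.util.regex.Pattern;

final class PhoneCheck {
    // Either a 3-digit area code with 8 digits, or a 4-digit area code with 7 digits.
    private static final Pattern PHONE =
            Pattern.compile("\\d{3}-\\d{8}|\\d{4}-\\d{7}");

    static boolean isDomesticPhone(String s) {
        return PHONE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isDomesticPhone("0511-4405222"));
        System.out.println(isDomesticPhone("021-87888822"));
    }
}
```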

Match a Tencent QQ number: [1-9][0-9]{4,}

Commentary: Tencent QQ numbers start from 10000.

Match a Chinese postal code: [1-9]\d{5}(?!\d)

Commentary: Chinese postal codes are 6 digits.
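When searching inside longer text, the lookahead (?!\d) stops a match from being followed by another digit, but the expression alone can still match the last six digits of a longer number. The sketch below adds a lookbehind (?<!\d) to cover that case; the lookbehind is an addition made here, not part of the listed expression, and the class name ZipCodeFinder and the method firstZip are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ZipCodeFinder {
    // Six digits starting with 1-9, with no digit directly before or after.
    private static final Pattern ZIP =
            Pattern.compile("(?<!\\d)[1-9]\\d{5}(?!\\d)");

    // Returns the first postal code found, or null when there is none.
    static String firstZip(String text) {
        Matcher m = ZIP.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstZip("Beijing 100000, China"));
    }
}
```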

Match an ID card number: \d{15}|\d{18}

Commentary: China's ID card numbers are 15 or 18 digits.

Match an IP address: \d+\.\d+\.\d+\.\d+

Commentary: Useful when extracting IP addresses.
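Extraction with this expression could look like the sketch below. The class name IpFinder, the method findIps, and the sample log line are hypothetical. Note that the expression checks only the dotted shape, so values such as 999.1.1.1 are matched as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IpFinder {
    // Four runs of digits separated by literal dots; octet ranges are not checked.
    private static final Pattern IP = Pattern.compile("\\d+\\.\\d+\\.\\d+\\.\\d+");

    static List<String> findIps(String text) {
        List<String> ips = new ArrayList<>();
        Matcher m = IP.matcher(text);
        while (m.find()) {
            ips.add(m.group());
        }
        return ips;
    }

    public static void main(String[] args) {
        System.out.println(findIps("client 192.168.0.1 connected to 10.0.0.254:8080"));
    }
}
```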
