A Short Summary of Java Regular Expressions (Collected Notes)

Source: Internet
Author: User

JS regular expression basic syntax (classic): http://www.jb51.net/article/72044.htm

Many languages, including Perl, PHP, Python, JavaScript, and JScript, support regular expressions for processing text, and some text editors use them to implement advanced search-and-replace. Java is no exception. Regular expressions have outgrown any single language or system and become a widely used tool that we can apply to practical problems encountered in real development.

JDK 1.3 and earlier versions contain no regular expression classes, so to use regular expressions in Java you had to rely on a third-party library. The best known of these is Jakarta-ORO, formerly called OROMatcher, an open source package that Daniel Savarese donated to the Jakarta Project. To use it, you first create an instance of a class that implements the PatternCompiler interface, which acts as a "pattern compiler"; the implementation provided by Jakarta-ORO is Perl5Compiler, which is fully compatible with Perl 5 regular expressions. Jakarta-ORO is simple to use and efficient, and its regular expression syntax support is very complete. Its only drawback is that it is not a standard package in the JDK. A regular expression API has been available since JDK 1.4 in the java.util.regex package, and because a standard API now exists, this summary uses java.util.regex for regular expressions.

Basic knowledge of regular expressions

1.1 Period Symbol

Suppose you are playing Scrabble and want to find three-letter words that begin with the letter "t" and end with the letter "n". Suppose also that you have an English dictionary whose entire contents you can search with regular expressions. To construct this regular expression, you can use a wildcard character, the period symbol ".". The complete expression is "t.n", which matches "tan", "ten", "tin", and "ton", but also "t#n", "tpn", and even "t n", along with many other meaningless combinations. This is because the period symbol matches any single character, including spaces and tab characters; whether it also matches line breaks depends on the engine and its flags (in java.util.regex it does so only when the DOTALL flag is set).
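The behavior of the period symbol can be checked with a small sketch using java.util.regex. The class and method names here (PeriodDemo, matchesTn, matchesTnDotAll) are made up for illustration; only Pattern.matches, Pattern.compile, and the Pattern.DOTALL flag come from the standard API.

```java
import java.util.regex.Pattern;

class PeriodDemo {
    // Returns true when the whole input matches the pattern "t.n".
    static boolean matchesTn(String input) {
        return Pattern.matches("t.n", input);
    }

    // Same pattern compiled with DOTALL, so "." also matches line terminators.
    static boolean matchesTnDotAll(String input) {
        return Pattern.compile("t.n", Pattern.DOTALL).matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"tan", "t#n", "t n", "t\nn", "toon"};
        for (String s : samples) {
            // Show the line break as a visible escape when printing.
            String shown = s.replace("\n", "\\n");
            System.out.println(shown + " -> " + matchesTn(s)
                    + " (DOTALL: " + matchesTnDotAll(s) + ")");
        }
    }
}
```

Note that "toon" is rejected in both modes, because a single period stands for exactly one character.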

1.2 Square brackets Symbol

To solve the problem of the period symbol matching too broadly, you can list the meaningful characters inside square brackets ("[]"). Only the characters listed inside the brackets then take part in the match. That is, the regular expression "t[aeio]n" matches only "tan", "ten", "tin", and "ton". "toon" does not match, because a pair of square brackets matches only a single character.
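A minimal sketch of the square-bracket pattern from this section, using java.util.regex. The class name BracketDemo and its helper method are hypothetical names chosen for the example.

```java
import java.util.regex.Pattern;

class BracketDemo {
    // Returns true when the whole input matches "t[aeio]n":
    // a "t", exactly one of a/e/i/o, then an "n".
    static boolean matches(String input) {
        return Pattern.matches("t[aeio]n", input);
    }

    public static void main(String[] args) {
        String[] samples = {"tan", "ten", "tin", "ton", "tun", "t#n", "toon"};
        for (String s : samples) {
            System.out.println(s + " -> " + matches(s));
        }
    }
}
```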

1.3 "or" symbol

If you want to match "toon" in addition to all the words above, you can use the "|" operator, whose basic meaning is "or". To match "toon" as well, use the regular expression "t(a|e|i|o|oo)n". You cannot use square brackets here, because brackets match only a single character; you must use parentheses "()". Parentheses can also be used for grouping.
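The alternation can be tried out as follows. This is only a sketch: the class name AlternationDemo and the helper middlePart are invented for the example, and the helper also uses the parentheses as a capturing group to show which alternative matched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationDemo {
    private static final Pattern WORD = Pattern.compile("t(a|e|i|o|oo)n");

    // Returns the text captured by the parentheses when the whole input
    // matches, or null when it does not match.
    static String middlePart(String input) {
        Matcher m = WORD.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] samples = {"tan", "ton", "toon", "tun"};
        for (String s : samples) {
            System.out.println(s + " -> " + middlePart(s));
        }
    }
}
```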

1.4 Symbols that indicate the number of matches

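The following table shows the regular expression syntax for the number of matches:

Table 1.1 Regular Expression syntax

Symbol    Number of matches
*         0 or more
+         1 or more
?         0 or 1
{n}       exactly n
{n,m}     from n to m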

Suppose we want to search for U.S. Social Security numbers in a text file. The format of this number is 999-99-9999. The regular expression used to match it is shown in Figure 1. Inside square brackets, a hyphen ("-") has a special meaning: it denotes a range, for example from 0 to 9. Therefore, the literal hyphens in the Social Security number are preceded by the escape character "\". (Outside square brackets the hyphen is an ordinary character, so the escape is harmless but not strictly required.)

Suppose you want the hyphens to be optional when searching, so that both 999-99-9999 and 999999999 count as correctly formatted. In that case, add the "?" quantifier after each hyphen.

One format for U.S. car license plates is four digits followed by two letters. Its regular expression consists of the digit part "[0-9]{4}" followed by the letter part "[A-Z]{2}".
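The quantifier examples in this section can be sketched in Java as shown here. The exact expressions from the original figures are not available, so the patterns below are reconstructions from the prose and should be read as assumptions; the class and method names are also hypothetical.

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    // 999-99-9999 with mandatory hyphens (assumed reconstruction).
    static boolean isStrictSsn(String s) {
        return Pattern.matches("[0-9]{3}-[0-9]{2}-[0-9]{4}", s);
    }

    // Same format, but "?" makes each hyphen optional.
    static boolean isLooseSsn(String s) {
        return Pattern.matches("[0-9]{3}-?[0-9]{2}-?[0-9]{4}", s);
    }

    // Four digits followed by two uppercase letters.
    static boolean isPlate(String s) {
        return Pattern.matches("[0-9]{4}[A-Z]{2}", s);
    }

    public static void main(String[] args) {
        System.out.println(isStrictSsn("999-99-9999"));
        System.out.println(isStrictSsn("999999999"));
        System.out.println(isLooseSsn("999999999"));
        System.out.println(isPlate("1234AB"));
        System.out.println(isPlate("1234ab"));
    }
}
```

Because each "?" applies independently, the loose pattern also accepts mixed forms such as "999-999999"; requiring the hyphens to be either both present or both absent would need a different expression.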

1.5 "no" symbol

The "^" symbol is called the "no" (negation) symbol. When it is the first character inside square brackets, "^" marks the characters that you do not want to match. For example, the regular expression in Figure 4 matches all words except those that begin with the letter "X".
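A small sketch of a negated character class follows. The pattern "[^X][a-z]+" is an assumed stand-in for the expression in the missing figure, and the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class NegationDemo {
    // One character that is anything except "X", followed by one or
    // more lowercase letters (assumed example pattern).
    static boolean notStartingWithX(String word) {
        return Pattern.matches("[^X][a-z]+", word);
    }

    public static void main(String[] args) {
        String[] samples = {"Array", "Xray", "9lives"};
        for (String s : samples) {
            System.out.println(s + " -> " + notStartingWithX(s));
        }
    }
}
```

A negated class still consumes exactly one character, and it accepts any character not listed, so "9lives" passes as well; it excludes "X" rather than requiring a letter.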

1.6 Parentheses and blank symbols

The "\s" symbol is the whitespace symbol; it matches any whitespace character, including the tab character. If a date string matches successfully, how do you extract the month portion? Simply put parentheses around the month to create a group, and then extract its value with the ORO API (with java.util.regex, the equivalent is Matcher.group).
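Here is one way the group extraction might look with java.util.regex instead of ORO. The date format (month name, day, comma, four-digit year) and the sample strings are assumptions made for this sketch, as are the class and method names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MonthDemo {
    // Group 1 captures the month name; \s+ and \s* absorb the whitespace.
    private static final Pattern DATE =
            Pattern.compile("([A-Za-z]+)\\s+[0-9]{1,2},\\s*[0-9]{4}");

    // Returns the month part of a date, or null if the format is wrong.
    static String extractMonth(String date) {
        Matcher m = DATE.matcher(date);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractMonth("June 26, 1951"));
        System.out.println(extractMonth("Nov\t10,2009"));
        System.out.println(extractMonth("26 June 1951"));
    }
}
```

Group 0 always refers to the entire match, which is why the month is read from group 1.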

1.7 Other symbols

For simplicity, you can use some shortcut symbols that are created for common regular expressions. As shown in the following:

\t: tab, equivalent to \u0009
\n: line break, equivalent to \u000A
\d: a digit, equivalent to [0-9]
\D: a non-digit, equivalent to [^0-9]
\s: a whitespace character, such as a line break or a tab
\S: a non-whitespace character
\w: a word character, equivalent to [a-zA-Z_0-9]
\W: a non-word character, equivalent to [^\w]

For example, in the earlier Social Security number example, we can replace every occurrence of "[0-9]" with "\d". In a Java string literal the backslash itself must be escaped, so the shortcut is written "\\d".
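The shortcut symbols can be exercised with a short sketch. The class and method names are hypothetical; the point to notice is that each regex backslash is doubled inside a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ShorthandDemo {
    // "\\d" in Java source is the two-character regex token \d.
    static boolean isSsn(String input) {
        return Pattern.matches("\\d{3}-\\d{2}-\\d{4}", input);
    }

    // Counts runs of word characters (\w+) in the input.
    static int countWords(String input) {
        Matcher m = Pattern.compile("\\w+").matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(isSsn("123-45-6789"));
        System.out.println(isSsn("12a-45-6789"));
        System.out.println(countWords("tan ten\ttin_1"));
    }
}
```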

Here is the program I put together, for reference:

package org.luosijin.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Regular expression examples.
 * Digits inside several literals were lost in the source text; values
 * marked "assumed" below are illustrative stand-ins.
 *
 * @author Rosikin
 */
public class Regex {

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("b*g");
        Matcher matcher = pattern.matcher("bbg");
        System.out.println(matcher.matches());
        System.out.println(Pattern.matches("b*g", "bbg"));

        // Verify the ZIP code (digit count and sample value are assumed)
        System.out.println(Pattern.matches("[0-9]{6}", "200038"));
        System.out.println(Pattern.matches("\\d{6}", "200038"));

        // Verify the phone number (counts and sample value are assumed)
        System.out.println(Pattern.matches("[0-9]{3,4}\\-?[0-9]+", "02178989799"));

        getDate("Nov 10,2009"); // day and year are assumed
        charReplace();

        // Verify ID: determine whether a string is an ID number, that is,
        // whether it consists of 15 or 18 digits (both counts are assumed).
        System.out.println(Pattern.matches("\\d{15}|\\d{18}", "123456789009876"));

        getString("D:/dir/test.txt");
        // Chinese characters restored from the translated text (assumed)
        getChinese("Welcome to 江西奉新, welcome, 你!");
        validateEmail("luosijin@example.com"); // domain is a placeholder
    }

    /**
     * Date extraction: extracts the month from a date string.
     *
     * @param str the date string
     */
    public static void getDate(String str) {
        // The month is captured as group 1; the stray "|" after the group
        // in the source would have split the pattern into two alternatives.
        String regEx = "([a-zA-Z]+)\\s+[0-9]{1,2},\\s*[0-9]{4}";
        Pattern pattern = Pattern.compile(regEx);
        Matcher matcher = pattern.matcher(str);
        if (!matcher.find()) {
            System.out.println("Wrong date format!");
            return;
        }
        // Group indices start at 1, so the first group is
        // matcher.group(1), not matcher.group(0).
        System.out.println(matcher.group(1));
    }

    /**
     * Character substitution: replaces every run of one or more
     * contiguous "a" characters in a string with "A".
     */
    public static void charReplace() {
        String regex = "a+";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher("okaaaa letmeaseeaaa aa booa");
        String s = matcher.replaceAll("A");
        System.out.println(s);
    }

    /**
     * String extraction: extracts the file name from a path.
     *
     * @param str the file path
     */
    public static void getString(String str) {
        String regex = ".+/(.+)$";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(str);
        if (!matcher.find()) {
            System.out.println("File path is malformed!");
            return;
        }
        System.out.println(matcher.group(1));
    }

    /**
     * Chinese extraction: collects the Chinese characters in a string.
     *
     * @param str the input string
     */
    public static void getChinese(String str) {
        // Range U+4E00 to U+9FFF covers the common Chinese characters
        // (bounds restored; the source kept only fragments of them).
        String regex = "[\\u4e00-\\u9fff]+";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(str);
        StringBuilder sb = new StringBuilder();
        while (matcher.find()) {
            sb.append(matcher.group());
        }
        System.out.println(sb);
    }

    /**
     * Verify email.
     *
     * @param email the address to check
     */
    public static void validateEmail(String email) {
        String regex = "[0-9a-zA-Z]+@[0-9a-zA-Z]+\\.[0-9a-zA-Z]+";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(email);
        if (matcher.matches()) {
            System.out.println("This is a legitimate email");
        } else {
            System.out.println("This is an illegal email");
        }
    }
}
