Java regular expression pattern matching

Source: Internet
Author: User
Tags character classes

Java regular expressions are implemented by the Pattern and Matcher classes in the java.util.regex package. (It helps to keep the Java API documentation open while reading this article and to check the method descriptions there as you go.)
The Pattern class represents a compiled regular expression, that is, a matching pattern. Its constructor is private, so a Pattern cannot be instantiated directly; instead, create one with the simple factory method Pattern.compile(String regex).
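As a minimal sketch of the factory method described above: a Pattern is compiled once with Pattern.compile() and then hands out a Matcher for each input. The class name CompileDemo and the helper isNumber are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CompileDemo {
    // "new Pattern(...)" does not compile because the constructor is private;
    // Pattern.compile(String) is the factory method to use instead.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: true when the whole input consists of digits.
    static boolean isNumber(String input) {
        Matcher m = DIGITS.matcher(input);
        return m.matches();
    }

    public static void main(String[] args) {
        System.out.println(isNumber("2223"));   // true
        System.out.println(isNumber("2223bb")); // false
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the same expression for every input.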


Sample code

package com.yulore.regex;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexPatternTest {

    /**
     * @param args
     */
    public static void main(String[] args) {
        // search() and replace() are shown in sections 1 and 2 below.
        // search();
        // replace();
        split();
        matches();
        groupTest02();
    }

    /**
     * After a match operation with matches(), lookingAt() or find(), three
     * methods give more detailed information:
     * start() returns the index of the first character of the matched substring.
     * end() returns the index just past the last character of the matched substring.
     * group() returns the matched substring.
     */
    public static void groupTest() {
        Pattern p = Pattern.compile("\\d+");
        Matcher m = p.matcher("aaa2223bb");
        m.find(); // matches "2223"
        // m.start() returns 3
        // m.end() returns 7, the index just past the last character of "2223"
        // m.group() returns "2223"
        System.out.println("start=" + m.start() + ">>end=" + m.end() + ">>group=" + m.group());

        m = p.matcher("2223bb");
        m.lookingAt(); // matches "2223"
        // m.start() returns 0: lookingAt() only matches at the beginning of the
        // input, so start() is always 0 after a successful lookingAt()
        // m.end() returns 4
        // m.group() returns "2223"
        System.out.println("start=" + m.start() + ">>end=" + m.end() + ">>group=" + m.group());

        m = p.matcher("2223bb");
        // matches() must match the entire string. "bb" is not matched by \d+,
        // so matches() returns false here, and calling start(), end() or group()
        // after a failed match would throw IllegalStateException.
        if (m.matches()) {
            System.out.println("start=" + m.start() + ">>end=" + m.end() + ">>group=" + m.group());
        } else {
            System.out.println("matches() returned false for 2223bb");
        }
    }

    /**
     * start(), end() and group() each have an overload that takes a group index:
     * start(int i), end(int i), group(int i). These are for working with
     * capturing groups. The Matcher class also has groupCount(), which returns
     * the number of groups.
     */
    private static void groupTest02() {
        Pattern p = Pattern.compile("([a-z]+)(\\d+)");
        Matcher m = p.matcher("aaa2223bb");
        boolean flag = m.find();    // matches "aaa2223"
        int count = m.groupCount(); // returns 2 because the pattern has 2 groups
        System.out.println("flag=" + flag + ">>groupCount=" + count);
        int start = m.start(1); // returns 0, the index where group 1 starts
        int end = m.end(1);     // returns 3, the index just past the end of group 1
        System.out.println("start 1 =" + start + ">>" + "end 1 =" + end);
        start = m.start(2); // returns 3
        end = m.end(2);     // returns 7
        System.out.println("start 2 =" + start + ">>" + "end 2 =" + end);
        String group1 = m.group(1); // returns "aaa", the substring matched by group 1
        String group2 = m.group(2); // returns "2223", the substring matched by group 2
        System.out.println("group1=" + group1 + ">>" + "group2=" + group2);
    }

    /**
     * split() splits the given input sequence around matches of the pattern.
     */
    private static void split() {
        String regEx = "::";
        String str = "xd::abc::cde";
        Pattern p = Pattern.compile(regEx);
        String[] arr = p.split(str);
        // After execution, arr is {"xd", "abc", "cde"}. There is also a simpler way:
        // String[] arr = str.split("::");
        for (int i = 0; arr != null && i < arr.length; i++) {
            System.out.println(arr[i]);
        }
    }

    /**
     * matches() matches against the entire string and returns true only if
     * the whole string matches.
     */
    private static void matches() {
        boolean bol = Pattern.matches("\\d+", "2223"); // returns true
        System.out.println("bol1=" + bol);
        bol = Pattern.matches("\\d+", "2223aa"); // returns false: "aa" cannot be matched
        System.out.println("bol1=" + bol);
        bol = Pattern.matches("\\d+", "22bb23"); // returns false: "bb" cannot be matched
        System.out.println("bol1=" + bol);
    }

    /**
     * lookingAt() matches at the beginning of the string and returns true only
     * if the match starts at the first character.
     */
    private static void lookingAt() {
        Pattern p = Pattern.compile("\\d+");
        Matcher m = p.matcher("22bb23");
        boolean bool = m.lookingAt(); // returns true: \d+ matches the leading "22"
        System.out.println("bool=" + bool);
        Matcher m2 = p.matcher("aa2223");
        bool = m2.lookingAt(); // returns false: \d+ cannot match the leading "aa"
        System.out.println("bool=" + bool);
    }

    /**
     * find() searches the string; the match may be at any position.
     */
    private static void find() {
        Pattern p = Pattern.compile("\\d+");
        Matcher m = p.matcher("22bb23");
        boolean flag = m.find(); // returns true
        System.out.println("flag=" + flag);
        Matcher m2 = p.matcher("aa2223");
        flag = m2.find(); // returns true
        System.out.println("flag=" + flag);
        Matcher m3 = p.matcher("aa2223bb");
        flag = m3.find(); // returns true
        System.out.println("flag=" + flag);
        Matcher m4 = p.matcher("aabb");
        flag = m4.find(); // returns false
        System.out.println("flag=" + flag);
    }
}


1. Find the strings that match a specified regular expression

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is chosen for illustration only.
public class RegexSearchTest {

    /**
     * @param args
     */
    public static void main(String[] args) {
        String regex = "[a]+";
        String target = "Aswwaaabsvasra";
        search(regex, target);
    }

    /**
     * Find
     * @param regex  the specified regular expression
     * @param target the target string
     */
    private static void search(String regex, String target) {
        // Create a Pattern instance with the compile() factory method
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        // Create a Matcher instance with matcher()
        Matcher matcher = pattern.matcher(target);
        while (matcher.find()) { // find each substring that conforms to the pattern
            System.out.println("The match is here: " + matcher.group() + "\n"
                    + "It starts from " + matcher.start() + " to "
                    + matcher.end() + ".\n");
        }
    }
}


2. Replacement and deletion

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is chosen for illustration only.
public class RegexReplaceTest {

    /**
     * @param args
     */
    public static void main(String[] args) {
        String regex = "a";
        String target = "Aswwaaabsvasra";
        String replace = "#";
        // search(regex, target);
        replace(regex, target, replace);
    }

    /**
     * Replace/delete
     * @param regex
     * @param target
     * @param replace
     */
    private static void replace(String regex, String target, String replace) {
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(target);
        String s = m.replaceAll(replace);
        // Alternative that replaces only the first match:
        // String s = m.replaceFirst(replace);

        // Passing an empty string as the replacement deletes the matches, for example:
        // String s = m.replaceAll("");

        System.out.println("s=" + s);
    }
}
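Replacement strings can also refer to capturing groups with $1, $2 and so on. The following is a small sketch, not part of the original sample; the class name ReplaceDemo, the helper names and the input strings are made up for illustration.

```java
import java.util.regex.Pattern;

class ReplaceDemo {
    // Swaps "digits-letters" into "letters-digits" using group references.
    static String swap(String input) {
        return Pattern.compile("(\\d+)-([a-z]+)").matcher(input).replaceAll("$2-$1");
    }

    // Deletes every run of digits by replacing it with the empty string.
    static String deleteDigits(String input) {
        return Pattern.compile("\\d+").matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(swap("2223-aaa"));        // aaa-2223
        System.out.println(deleteDigits("aa22bb3")); // aabb
    }
}
```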



Summary of regular-expression constructs

Characters

x  The character x
\\  The backslash character
\0n  The character with octal value 0n (0 <= n <= 7)
\0nn  The character with octal value 0nn (0 <= n <= 7)
\0mnn  The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7)
\xhh  The character with hexadecimal value 0xhh
\uhhhh  The character with hexadecimal value 0xhhhh
\t  The tab character ('\u0009')
\n  The newline (line feed) character ('\u000A')
\r  The carriage-return character ('\u000D')
\f  The form-feed character ('\u000C')
\a  The alert (bell) character ('\u0007')
\e  The escape character ('\u001B')
\cx  The control character corresponding to x

Character classes
[abc]  a, b, or c (simple class)
[^abc]  Any character except a, b, or c (negation)
[a-zA-Z]  a through z or A through Z, inclusive (range)
[a-d[m-p]]  a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]  d, e, or f (intersection)
[a-z&&[^bc]]  a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]  a through z, and not m through p: [a-lq-z] (subtraction)
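The union, intersection and subtraction forms above can be checked with a few one-character inputs. This is a minimal sketch; the class name CharClassDemo and the helper in() are hypothetical.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: true when the one-character input belongs to the class.
    static boolean in(String charClass, String ch) {
        return Pattern.matches(charClass, ch);
    }

    public static void main(String[] args) {
        System.out.println(in("[a-d[m-p]]", "n"));    // true: union
        System.out.println(in("[a-d[m-p]]", "f"));    // false: f is in neither range
        System.out.println(in("[a-z&&[def]]", "e"));  // true: intersection
        System.out.println(in("[a-z&&[^bc]]", "b"));  // false: b is subtracted
        System.out.println(in("[a-z&&[^m-p]]", "q")); // true: q is outside m-p
    }
}
```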

Predefined character classes
.  Any character (may or may not match line terminators)
\d  A digit: [0-9]
\D  A non-digit: [^0-9]
\s  A whitespace character: [ \t\n\x0B\f\r]
\S  A non-whitespace character: [^\s]
\w  A word character: [a-zA-Z_0-9]
\W  A non-word character: [^\w]
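A short sketch of the predefined classes in use, assuming made-up inputs; the class name PredefinedClassDemo and its helper methods are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PredefinedClassDemo {
    // Splits the input around runs of whitespace (\s+).
    static String[] words(String input) {
        return Pattern.compile("\\s+").split(input);
    }

    // Removes every non-digit (\D), leaving only the digits.
    static String digitsOnly(String input) {
        return input.replaceAll("\\D", "");
    }

    // True when the whole input consists of word characters (\w).
    static boolean isWord(String input) {
        return Pattern.matches("\\w+", input);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(words("a  b\tc"))); // [a, b, c]
        System.out.println(digitsOnly("tel: 555-1234"));       // 5551234
        System.out.println(isWord("user_01"));                 // true
        System.out.println(isWord("user-01"));                 // false: '-' is not a word character
    }
}
```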



Boundary matchers
^  The beginning of a line
$  The end of a line
\b  A word boundary
\B  A non-word boundary
\A  The beginning of the input
\G  The end of the previous match
\Z  The end of the input but for the final terminator, if any
\z  The end of the input
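The boundary matchers are easiest to compare by looking at where a match starts. The sketch below uses a hypothetical helper indexOf() that returns the start index of the first match, or -1 when there is none; the class name and inputs are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryDemo {
    // Hypothetical helper: start index of the first match, or -1 if none.
    static int indexOf(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("\\bcat\\b", "concat cat")); // 7: only the standalone word
        System.out.println(indexOf("\\Bcat", "concat cat"));    // 3: "cat" inside "concat"
        System.out.println(indexOf("^cat", "concat cat"));      // -1: input does not start with "cat"
        System.out.println(indexOf("cat$", "concat cat"));      // 7: "cat" at the end
        System.out.println(indexOf("abc\\Z", "abc\n"));         // 0: \Z ignores the final terminator
        System.out.println(indexOf("abc\\z", "abc\n"));         // -1: \z requires the very end
    }
}
```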

Greedy quantifiers
X?  X, once or not at all
X*  X, zero or more times
X+  X, one or more times
X{n}  X, exactly n times
X{n,}  X, at least n times
X{n,m}  X, at least n but not more than m times

Reluctant quantifiers
X??  X, once or not at all
X*?  X, zero or more times
X+?  X, one or more times
X{n}?  X, exactly n times
X{n,}?  X, at least n times
X{n,m}?  X, at least n but not more than m times

Possessive quantifiers
X?+  X, once or not at all
X*+  X, zero or more times
X++  X, one or more times
X{n}+  X, exactly n times
X{n,}+  X, at least n times
X{n,m}+  X, at least n but not more than m times
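The three families accept the same repetition counts but differ in how much they consume and whether they give characters back. As a minimal sketch, the same input is matched with a greedy, a reluctant and a possessive form of .*foo; the class name QuantifierDemo, the helper firstMatch() and the input are chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: text of the first match, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: .* takes everything, then backs off just enough for "foo".
        System.out.println(firstMatch(".*foo", input));  // xfooxxxxxxfoo
        // Reluctant: .*? takes as little as possible before the first "foo".
        System.out.println(firstMatch(".*?foo", input)); // xfoo
        // Possessive: .*+ takes everything and never backs off, so "foo" cannot match.
        System.out.println(firstMatch(".*+foo", input)); // null
    }
}
```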

Logical operators
XY  X followed by Y
X|Y  Either X or Y
(X)  X, as a capturing group
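A brief sketch of concatenation, alternation and capturing groups, with made-up inputs; the class name LogicalOperatorDemo and the helper group() are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogicalOperatorDemo {
    // Hypothetical helper: text captured by group n when the whole input matches, else null.
    static String group(String regex, String input, int n) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(n) : null;
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("ab", "ab"));         // true: a followed by b
        System.out.println(Pattern.matches("cat|dog", "dog"));   // true: either alternative
        System.out.println(Pattern.matches("(ab)+", "ababab"));  // true: the group is repeated
        System.out.println(group("(\\d+)-(\\d+)", "10-20", 2));  // 20: second capturing group
    }
}
```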

Back references
\n  Whatever the nth capturing group matched
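A back reference matches the same text that the referenced group captured, not the group's pattern. A small sketch under assumed inputs follows; the class name BackReferenceDemo and the helper firstMatch() are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BackReferenceDemo {
    // Hypothetical helper: text of the first match, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // \1 must repeat exactly what group 1 captured.
        System.out.println(Pattern.matches("(\\d\\d)\\1", "1212")); // true
        System.out.println(Pattern.matches("(\\d\\d)\\1", "1234")); // false
        // A doubled letter, and a doubled word.
        System.out.println(firstMatch("(\\w)\\1", "hello"));                      // ll
        System.out.println(firstMatch("\\b(\\w+) \\1\\b", "this is is a test"));  // is is
    }
}
```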


Backslashes, escapes, and quoting
The backslash character ('\') introduces the escaped constructs defined in the tables above, and also quotes characters that would otherwise be interpreted as unescaped constructs. Thus the expression \\ matches a single backslash, and \{ matches a left brace.

It is an error to put a backslash before an alphabetic character that does not denote an escaped construct; such sequences are reserved for future extensions of the regular-expression language. A backslash may be used before a non-alphabetic character regardless of whether that character is part of an unescaped construct.

Backslashes in string literals in Java source code are interpreted as Unicode escapes or other character escapes, as required by the Java Language Specification. Backslashes must therefore be doubled in string literals that represent regular expressions, to protect them from interpretation by the Java compiler. For example, when interpreted as a regular expression, the string literal "\b" matches a single backspace character, while "\\b" matches a word boundary. The string literal "\(hello\)" is illegal and causes a compile-time error; to match the string (hello), the string literal "\\(hello\\)" must be used.
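The doubling rule can be checked directly. This is a minimal sketch with made-up inputs; the class name BackslashDemo and the helpers matches() and found() are hypothetical wrappers around Pattern.matches() and Matcher.find().

```java
import java.util.regex.Pattern;

class BackslashDemo {
    // Hypothetical helper: true when the regex matches the entire input.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Hypothetical helper: true when the regex matches somewhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // "\\b" reaches the regex engine as \b, a word boundary.
        System.out.println(found("\\bcat\\b", "a cat"));      // true
        // "\b" is a Java escape for the backspace character itself.
        System.out.println(found("\b", "a\bb"));              // true
        System.out.println(found("\b", "a cat"));             // false: no backspace in the input
        // Quoting parentheses, a backslash and a left brace.
        System.out.println(matches("\\(hello\\)", "(hello)")); // true
        System.out.println(matches("\\\\", "\\"));             // true: regex \\ matches one backslash
        System.out.println(matches("\\{", "{"));               // true
    }
}
```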






