java.util.regex.Matcher: the matcher class. It holds the results of matching a pattern against a string, and there may be many results.
4. Here is a simple introduction to regular expressions through a small program.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        // matches() determines whether the whole string matches the expression; "." represents any character
        p("abc".matches("..."));
        // replace every digit in the string a2389a with *; \d stands for a digit 0-9
        p("a2389a".replaceAll("\\d", "*"));
        // compile the pattern "three letters a-z" in advance to speed up matching
        Pattern p = Pattern.compile("[a-z]{3}");
        // match and put the matching result in the Matcher object
        Matcher m = p.matcher("abc");
        p(m.matches());
        // the three lines above can be replaced by
        p("abc".matches("[a-z]{3}"));
    }

    public static void p(Object o) {
        System.out.println(o);
    }
}
The following is the result.
true
a****a
true
true
Now we use some experiments to illustrate the matching rules of regular expressions. Here we use the greedy quantifiers.
.        Any character
X?       X, once or not at all
X*       X, zero or more times
X+       X, one or more times
X{n}     X, exactly n times
X{n,}    X, at least n times
X{n,m}   X, at least n but not more than m times
// A first look at . * + ?
p("a".matches("."));      // true
p("aa".matches("aa"));    // true
p("aaaa".matches("a*"));  // true
p("aaaa".matches("a+"));  // true
p("".matches("a*"));      // true
p("aaaa".matches("a?"));  // false
p("".matches("a?"));      // true
p("a".matches("a?"));     // true
p("1232435463685899".matches("\\d{3,100}")); // true
p("192.168.0.aaa".matches("\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}")); // false
p("192".matches("[0-2][0-9][0-9]")); // true
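To make "greedy" concrete, here is a minimal sketch that contrasts a greedy quantifier with its reluctant form (the same quantifier followed by ?). The class name GreedyDemo, the helper firstMatch, and the sample input are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Returns the first substring of input matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "aaaa5bbbb6"; // hypothetical sample input
        // Greedy: .{3,10} first takes as many characters as it can,
        // then gives characters back until the trailing [0-9] can match.
        System.out.println(firstMatch(".{3,10}[0-9]", input));  // aaaa5bbbb6
        // Reluctant: .{3,10}? takes as few characters as it can.
        System.out.println(firstMatch(".{3,10}?[0-9]", input)); // aaaa5
    }
}
```

The greedy form ends at the last digit it can reach, while the reluctant form stops at the first one.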
[abc]          a, b, or c (simple class)
[^abc]         Any character except a, b, or c (negation)
[a-zA-Z]       a through z or A through Z, inclusive (range)
[a-d[m-p]]     a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]   d, e, or f (intersection)
[a-z&&[^bc]]   a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]  a through z, and not m through p: [a-lq-z] (subtraction)
// Ranges
p("a".matches("[abc]"));        // true
p("a".matches("[^abc]"));       // false
p("A".matches("[a-zA-Z]"));     // true
p("A".matches("[a-z]|[A-Z]"));  // true
p("A".matches("[a-z[A-Z]]"));   // true
p("R".matches("[A-Z&&[RFG]]")); // true
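The experiments above do not exercise subtraction. The following sketch checks single characters against the union, intersection, and subtraction classes from the table; the class name CharClassDemo and the helper inClass are hypothetical.

```java
public class CharClassDemo {
    // True if the single character c is matched by the character class regex.
    static boolean inClass(String regex, char c) {
        return String.valueOf(c).matches(regex);
    }

    public static void main(String[] args) {
        // Union: a-d or m-p
        System.out.println(inClass("[a-d[m-p]]", 'n'));   // true
        System.out.println(inClass("[a-d[m-p]]", 'e'));   // false
        // Intersection: only d, e, f survive
        System.out.println(inClass("[a-z&&[def]]", 'e')); // true
        System.out.println(inClass("[a-z&&[def]]", 'a')); // false
        // Subtraction: a-z without b and c
        System.out.println(inClass("[a-z&&[^bc]]", 'b')); // false
        System.out.println(inClass("[a-z&&[^bc]]", 'd')); // true
    }
}
```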
\d  A digit: [0-9]
\D  A non-digit: [^0-9]
\s  A whitespace character: [ \t\n\x0B\f\r]
\S  A non-whitespace character: [^\s]
\w  A word character: [a-zA-Z_0-9]
\W  A non-word character: [^\w]
// Getting to know \s \w \d and \
// a repeat count is written with braces, {4}, not parentheses
p(" \n\r\t".matches("\\s{4}")); // true
p(" ".matches("\\S"));          // false
p("a_8".matches("\\w{3}"));     // true
p("abc888&^%".matches("[a-z]{1,3}\\d+[&^#%]+")); // true
p("\\".matches("\\\\"));        // true
Boundary matchers
^   The beginning of a line
$   The end of a line
\b  A word boundary
\B  A non-word boundary
\A  The beginning of the input
\G  The end of the previous match
\Z  The end of the input but for the final terminator, if any
\z  The end of the input
// Boundary matching
p("hello sir".matches("^h.*"));  // true
p("hello sir".matches(".*ir$")); // true
p("hello sir".matches("^h[a-z]{1,3}o\\b.*")); // true
p("hellosir".matches("^h[a-z]{1,3}o\\b.*"));  // false
// blank line: zero or more whitespace characters other than a line break, followed by a line break
p(" \n".matches("^[\\s&&[^\\n]]*\\n$")); // true
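One detail worth checking by experiment: by default, ^ and $ match only at the beginning and end of the whole input, and they match at every line only when the pattern is compiled with Pattern.MULTILINE. A minimal sketch, with a hypothetical class name LineAnchorDemo, helper findAll, and sample text:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineAnchorDemo {
    // Collects every substring of input that the pattern finds.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "hello sir\nhello madam"; // hypothetical two-line input
        // Default mode: ^ and $ anchor to the whole input, so no single line matches.
        System.out.println(findAll(Pattern.compile("^hello \\w+$"), text)); // []
        // MULTILINE mode: ^ and $ also match at each line break.
        System.out.println(findAll(Pattern.compile("^hello \\w+$", Pattern.MULTILINE), text)); // [hello sir, hello madam]
    }
}
```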
Method analysis
matches(): matches the entire string against the pattern
find(): finds the next substring that matches the pattern
lookingAt(): always starts from the beginning of the string, but does not need to match all of it
// Email
p("asdsfdfagf@adsdsfd.com".matches("[\\w[.-]]+@[\\w[.-]]+\\.[\\w]+")); // true

// matches() find() lookingAt()
Pattern p = Pattern.compile("\\d{3,5}");
Matcher m = p.matcher("123-34345-234-00");

// matches() tries to match the entire string 123-34345-234-00. It stops at the
// first "-", which does not match, but it does not give back the characters it consumed
p(m.matches());
// give the consumed characters back
m.reset();

// 1: with p(m.matches()) above but without m.reset(), find() starts searching at
//    ...34345-234-00, so the first and second find() locate 34345 and 234 and the
//    last two find nothing: true, true, false, false
// 2: with p(m.matches()) above followed by m.reset(), find() starts searching at
//    123-34345-234-00, so the results are true, true, true, false
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
// when nothing was found, start() and end() throw java.lang.IllegalStateException
// p(m.start() + "---" + m.end());

// lookingAt() starts from the beginning every time
p(m.lookingAt());
p(m.lookingAt());
p(m.lookingAt());
p(m.lookingAt());
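The difference between the three methods can also be seen in isolation. The sketch below uses a fresh Matcher for each call, so the results do not depend on what an earlier call consumed. The class name MatchModesDemo and the helper findAll are hypothetical; the pattern and input are the ones used above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchModesDemo {
    // Collects every substring that find() locates, in order.
    static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("\\d{3,5}");
        String s = "123-34345-234-00";
        // matches(): the whole string must match
        System.out.println(p.matcher(s).matches());   // false
        // lookingAt(): only a prefix must match
        System.out.println(p.matcher(s).lookingAt()); // true
        // find(): every matching substring, one per call
        System.out.println(findAll("\\d{3,5}", s));   // [123, 34345, 234]
    }
}
```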
String replacement: the following approach makes string replacement very flexible.
// String replacement
// Pattern.CASE_INSENSITIVE makes the match case-insensitive
Pattern p = Pattern.compile("java", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("java Java jAva ILoveJavA youHateJAVA adsdsfd");
// holds the resulting string
StringBuffer buf = new StringBuffer();
// counts the matches to tell odd from even
int i = 0;
while (m.find()) {
    i++;
    if (i % 2 == 0) {
        m.appendReplacement(buf, "java");
    } else {
        m.appendReplacement(buf, "JAVA");
    }
}
// without this call the trailing text adsdsfd would be dropped
m.appendTail(buf);
p(buf);
Result printing:
JAVA java JAVA ILovejava youHateJAVA adsdsfd
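One caveat when using appendReplacement: the replacement string treats $ and \ specially ($1 refers to group 1). If the replacement should be inserted literally, Matcher.quoteReplacement can be applied first. A minimal sketch; the class name ReplacementDemo, the helper replaceAllLiteral, and the sample strings are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplacementDemo {
    // Replaces every match of regex in input with the replacement taken literally.
    static String replaceAllLiteral(String input, String regex, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer buf = new StringBuffer();
        while (m.find()) {
            // quoteReplacement escapes $ and \ so they are not read as group references
            m.appendReplacement(buf, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(buf);
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceAllLiteral("price: 10 EUR", "\\d+ EUR", "$12")); // price: $12
    }
}
```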
Group
// Groups: use () to group
Pattern p = Pattern.compile("(\\d{3,5})([a-z]{2})");
String s = "123aa-34345bb-234cc-00";
Matcher m = p.matcher(s);
p(m.groupCount()); // 2 groups
while (m.find()) {
    p(m.group());       // digits and letters together
    // p(m.group(1));   // digits only
    // p(m.group(2));   // letters only
}
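As a self-contained sketch of reading the individual groups, the following collects group(1) and group(2) for each match of the same pattern. The class name GroupDemo, the helper splitParts, and the "|" separator are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // For each match, returns "digits|letters" built from group 1 and group 2.
    static List<String> splitParts(String input) {
        Pattern p = Pattern.compile("(\\d{3,5})([a-z]{2})");
        Matcher m = p.matcher(input);
        List<String> parts = new ArrayList<>();
        while (m.find()) {
            parts.add(m.group(1) + "|" + m.group(2));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitParts("123aa-34345bb-234cc-00")); // [123|aa, 34345|bb, 234|cc]
    }
}
```

Groups are numbered by their opening parenthesis from left to right, and group(0) is the same as group().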
II. Simple uses of regular expressions
Java regular expression applications
I. Extracting email addresses from a web page
Use a regular expression to match the text of the web page:
[\w[.-]]+@[\w[.-]]+\.[\w]+
Read the page line by line and extract the matches:
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailSpider {
    public static void main(String[] args) {
        try {
            BufferedReader br = new BufferedReader(new FileReader("C:\\emailSpider.html"));
            String line = "";
            // read the saved page line by line
            while ((line = br.readLine()) != null) {
                parse(line);
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // print every email address found in one line of the page
    private static void parse(String line) {
        Pattern p = Pattern.compile("[\\w[.-]]+@[\\w[.-]]+\\.[\\w]+");
        Matcher m = p.matcher(line);
        while (m.find()) {
            System.out.println(m.group());
        }
    }
}
Print result:
867124664@qq.com
260678675@QQ.com
806208721@qq.com
hr_1985@163.com
32575987@qq.com
qingchen0501@126.com
yingyihanxin@foxmail.com
1170382650@qq.com
1170382650@qq.com
yingyihanxin@foxmail.com
qingchen0501@126.com
32575987@qq.com
hr_1985@163.com
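The printed result lists several addresses more than once. One possible variation, not part of the original program, collects the matches into a LinkedHashSet so that each address is kept once, in order of first appearance. The class name EmailExtractDemo, the helper extract, and the sample HTML are hypothetical; the pattern is the one used above.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailExtractDemo {
    // Same pattern as in EmailSpider, compiled once and reused.
    private static final Pattern EMAIL = Pattern.compile("[\\w[.-]]+@[\\w[.-]]+\\.[\\w]+");

    // Returns the distinct addresses found in text, in order of first appearance.
    static Set<String> extract(String text) {
        Set<String> found = new LinkedHashSet<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // hypothetical page fragment; the first address appears twice
        String html = "<a href=\"mailto:alice@example.com\">alice@example.com</a> or bob.smith@example.org";
        System.out.println(extract(html)); // [alice@example.com, bob.smith@example.org]
    }
}
```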
Now that you have found so many email addresses, you could use your JavaMail knowledge to send spam emails!!!
II. Counting lines of code
import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class CodeCounter {
    static long normalLines = 0;  // ordinary code lines
    static long commentLines = 0; // comment lines
    static long whiteLines = 0;   // blank lines

    public static void main(String[] args) {
        // point this at a folder without subfolders; files in nested folders are not processed recursively
        File f = new File("E:\\Workspaces\\eclipse\\Application\\JavaMailTest\\src\\com\\java\\mail");
        File[] codeFiles = f.listFiles();
        if (codeFiles == null) {
            // listFiles() returns null when the path is not a readable directory
            return;
        }
        for (File child : codeFiles) {
            // only count .java files
            if (child.getName().matches(".*\\.java$")) {
                parse(child);
            }
        }
        System.out.println("normalLines:" + normalLines);
        System.out.println("commentLines:" + commentLines);
        System.out.println("whiteLines:" + whiteLines);
    }

    private static void parse(File f) {
        BufferedReader br = null;
        // true while we are inside a multi-line comment
        boolean comment = false;
        try {
            br = new BufferedReader(new FileReader(f));
            String line = "";
            while ((line = br.readLine()) != null) {
                // strip the whitespace in front of a comment marker such as /*
                line = line.trim();
                // blank line: readLine() has already removed the line break,
                // so the pattern is not ^[\\s&&[^\\n]]*\\n$
                if (line.matches("^[\\s&&[^\\n]]*$")) {
                    whiteLines++;
                } else if (line.startsWith("/*") && !line.endsWith("*/")) {
                    // first line of a multi-line /* ... */ comment
                    commentLines++;
                    comment = true;
                } else if (line.startsWith("/*") && line.endsWith("*/")) {
                    // single-line /* ... */ comment
                    commentLines++;
                } else if (comment) {
                    // inside a multi-line comment, up to and including the closing */
                    commentLines++;
                    if (line.endsWith("*/")) {
                        comment = false;
                    }
                } else if (line.startsWith("//")) {
                    commentLines++;
                } else {
                    normalLines++;
                }
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (br != null) {
                try {
                    br.close();
                    br = null;
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
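To try the classification rules without touching the file system, the same idea can be sketched over an in-memory list of lines. This is a variation, not the program above: the class name LineClassifierDemo and the method count are hypothetical, and unlike CodeCounter it counts a blank line inside a multi-line comment as a comment line.

```java
import java.util.Arrays;
import java.util.List;

public class LineClassifierDemo {
    // Returns {normal, comment, blank} line counts for the given source lines.
    static long[] count(List<String> lines) {
        long normal = 0, comment = 0, blank = 0;
        boolean inBlockComment = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (inBlockComment) {
                // everything up to and including the closing */ is a comment line
                comment++;
                if (line.endsWith("*/")) {
                    inBlockComment = false;
                }
            } else if (line.isEmpty()) {
                blank++;
            } else if (line.startsWith("/*")) {
                comment++;
                // a /* that is not closed on the same line opens a block comment
                if (!line.endsWith("*/")) {
                    inBlockComment = true;
                }
            } else if (line.startsWith("//")) {
                comment++;
            } else {
                normal++;
            }
        }
        return new long[] {normal, comment, blank};
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "/*", " * doc", "", " */", "// note", "int x = 1;", "   ", "/* one line */");
        System.out.println(Arrays.toString(count(lines))); // [1, 6, 1]
    }
}
```

Like the original, this only recognizes comments that start at the beginning of a line; code followed by a trailing comment still counts as a normal line.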