Regular expressions are rules used to match and manipulate strings.
1. The String class provides several methods that use regular expressions to match, split, and replace strings.
boolean matches(String regex); // returns true if the whole string matches the regular expression
String[] split(String regex); // splits the string at each match of the regular expression
String replaceAll(String regex, String replacement); // replaces every substring that matches the regular expression with the replacement string
2. The following describes common regular expressions.
(1)
String regex = "[1-9][0-9]{4,15}";
// [1-9] means this character must be a digit from 1 to 9
// [0-9] means this character may be any digit from 0 to 9
// {4,15} means the preceding element is repeated 4 to 15 times
This regular expression means that the first character must be a digit from 1 to 9, followed by digits from 0 to 9 that appear at least 4 and at most 15 times. The whole string is therefore 5 to 16 digits long.
For example:
10175 matches.
10 does not match, because [0-9] must appear at least four times and appears only once here.
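The rule in (1) can be checked with String.matches. The following is a minimal sketch; the class name NumberFormatDemo and the helper isValid are hypothetical names introduced only for illustration.

```java
// Hypothetical demo class; the names are illustrative only.
class NumberFormatDemo {
    // One digit 1-9, then 4 to 15 digits 0-9 (5 to 16 digits in total).
    static final String REGEX = "[1-9][0-9]{4,15}";

    static boolean isValid(String input) {
        // String.matches succeeds only if the whole string matches the pattern.
        return input.matches(REGEX);
    }

    public static void main(String[] args) {
        System.out.println(isValid("10175")); // true
        System.out.println(isValid("10"));    // false: only one digit after the first
        System.out.println(isValid("01234")); // false: the first digit cannot be 0
    }
}
```

Because matches tests the entire string, no ^ or $ anchors are needed in the pattern.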
(2)
[a-zA-Z0-9_]{6} means exactly 6 characters, each of which is a letter a-z or A-Z, a digit 0-9, or _
+ means one or more occurrences.
* means zero or more occurrences.
? means zero or one occurrence.
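The quantifiers in (2) can be tried out one at a time. This is a small sketch, assuming whole-string matching with String.matches; the class QuantifierDemo and the helper fullMatch are hypothetical names.

```java
// Hypothetical demo class; the names are illustrative only.
class QuantifierDemo {
    // Returns true only if the whole input matches the regular expression.
    static boolean fullMatch(String regex, String input) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("[a-zA-Z0-9_]{6}", "abc_12"));  // true: exactly 6 characters
        System.out.println(fullMatch("[a-zA-Z0-9_]{6}", "abc_123")); // false: 7 characters
        System.out.println(fullMatch("a+", ""));    // false: + needs at least one a
        System.out.println(fullMatch("a+", "aaa")); // true
        System.out.println(fullMatch("a*", ""));    // true: * allows zero occurrences
        System.out.println(fullMatch("a?", "a"));   // true
        System.out.println(fullMatch("a?", "aa"));  // false: ? allows at most one a
    }
}
```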
(3) Splitting strings based on regular expressions
String str = "SJD.ksdj.skdjf";
String regex = "\\.";
Note: In a regular expression, . is a special symbol that matches any character. To split on a literal dot, we must turn it into an ordinary character by escaping it as \.
Because \ is itself a special symbol in a Java string literal, it must be written as two \ characters, so the regular expression \. is written "\\." in source code. Likewise, an ordinary \ in a string literal is written as \\.
String[] ss = str.split(regex); // splits the original string and returns the array "SJD", "ksdj", "skdjf"
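The effect of escaping the dot can be seen by comparing both forms. This is a minimal sketch; the class SplitDemo and the helper splitOnDots are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class SplitDemo {
    // Splits on literal dots: the regex \. is written "\\." in Java source.
    static String[] splitOnDots(String input) {
        return input.split("\\.");
    }

    public static void main(String[] args) {
        String str = "SJD.ksdj.skdjf";
        System.out.println(Arrays.toString(splitOnDots(str))); // [SJD, ksdj, skdjf]

        // Unescaped, "." matches every character, so every piece is empty
        // and split drops the trailing empty strings.
        System.out.println(str.split(".").length); // 0
    }
}
```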
(4) Replacing substrings based on regular expressions
Replace every run of five or more digits in the string with #
String str = "abcd1334546lasjdfldsf2343424sdj";
String regex = "[0-9]{5,}";
String newStr = str.replaceAll(regex, "#");
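Put together as a runnable program, the replacement in (4) might look like the sketch below. The class ReplaceDemo and the helper maskLongNumbers are hypothetical names.

```java
// Hypothetical demo class; the names are illustrative only.
class ReplaceDemo {
    // Replaces every run of five or more digits with a single #.
    static String maskLongNumbers(String input) {
        return input.replaceAll("[0-9]{5,}", "#");
    }

    public static void main(String[] args) {
        System.out.println(maskLongNumbers("abcd1334546lasjdfldsf2343424sdj")); // abcd#lasjdfldsf#sdj
        System.out.println(maskLongNumbers("ab1234cd")); // ab1234cd (only four digits, left unchanged)
    }
}
```

Each run of digits collapses into one # because the quantifier {5,} is greedy and consumes the whole run in a single match.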
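(5) Obtaining the substrings that match a regular expression
Pattern p = Pattern.compile(regex); // compile the regular expression into a Pattern
Matcher m = p.matcher(str); // create a Matcher for the string to be searched
while (m.find()) // find() moves to the next match and returns false when there are no more
{
System.out.println(m.group()); // group() returns the substring matched by the last find()
}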
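The find/group loop can be wrapped in a reusable method. This is a sketch under the assumption that all matches should be collected into a list; the class FindAllDemo and the helper findAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class FindAllDemo {
    // Collects every substring of input that matches regex, in order.
    static List<String> findAll(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Runs of five or more digits in the sample string from (4).
        System.out.println(findAll("[0-9]{5,}", "abcd1334546lasjdfldsf2343424sdj")); // [1334546, 2343424]
    }
}
```

Unlike String.matches, find looks for the pattern anywhere inside the string, so it suits extraction rather than validation.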
3. Web Crawler Creation
The program below reads all the email addresses on a web page and stores them in a text file.
/* Web crawler: obtain the strings that match a regular expression from a web page; here, collect the email addresses on the page */
import java.io.*;
import java.net.*;
import java.util.regex.*;

class MailTest {
    public static void main(String[] args) throws Exception {
        getMailAddr();
    }

    public static void getMailAddr() throws Exception {
        URL url = new URL("http://bbs.csdn.net/topics/390148495");
        URLConnection con = url.openConnection();
        // 6 to 12 letters, digits, or underscores, then @, a domain name, and one or more ".xxx" suffixes
        String regex = "[a-zA-Z0-9_]{6,12}@[a-zA-Z0-9]+(\\.[a-zA-Z]+)+";
        Pattern p = Pattern.compile(regex);
        // try-with-resources closes both streams even if an exception is thrown
        try (BufferedReader bufIn = new BufferedReader(new InputStreamReader(con.getInputStream()));
             BufferedWriter bufw = new BufferedWriter(new FileWriter(new File("E://mailaddress.txt")))) {
            String str = null;
            // read the page line by line and write each matching address on its own line
            while ((str = bufIn.readLine()) != null) {
                Matcher m = p.matcher(str);
                while (m.find()) {
                    String ss = m.group();
                    bufw.write(ss, 0, ss.length());
                    bufw.newLine();
                    bufw.flush();
                }
            }
        }
    }
}
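The matching loop of the crawler can be tried without a network connection or an output file. The sketch below runs the same pattern over an in-memory string; the class MailExtractDemo, the helpers extract and extractFromText, and the sample addresses are all hypothetical and used only for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names and sample data are illustrative only.
class MailExtractDemo {
    static final Pattern MAIL =
            Pattern.compile("[a-zA-Z0-9_]{6,12}@[a-zA-Z0-9]+(\\.[a-zA-Z]+)+");

    // Reads line by line, as the crawler does, and collects every match.
    static List<String> extract(BufferedReader in) throws IOException {
        List<String> found = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) {
            Matcher m = MAIL.matcher(line);
            while (m.find()) {
                found.add(m.group());
            }
        }
        return found;
    }

    // Convenience wrapper that treats a string as the page content.
    static List<String> extractFromText(String text) {
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            return extract(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String page = "contact: zhangsan@example.com\n"
                + "<p>li_si_01@mail.example.org</p>\n"
                + "short@x.com\n"; // local part has only 5 characters, so it is skipped
        System.out.println(extractFromText(page)); // [zhangsan@example.com, li_si_01@mail.example.org]
    }
}
```

Note that this pattern is a simplification: because find searches inside each line, an address whose part before the @ is longer than 12 characters would still be matched from its last 12 characters onward.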