Regular Expressions in Java

Source: Internet
Author: User
Tags: character classes

In a data mining project, I have to use regular expressions to process a large amount of text in HTML.

I had used regular expressions in Linux (grep) before and found them an efficient way to deal with text, especially when there is a lot of it.

 

Introduction

Regular expressions are a way to describe a set of strings based on common characteristics shared by each string in the set. They can be used to search, edit, or manipulate text and data. You must learn a specific syntax to create regular expressions, one that goes beyond the normal syntax of the Java programming language. Regular expressions vary in complexity, but once you understand the basics of how they're constructed, you'll be able to decipher (or create) any regular expression.

 

The java.util.regex package

It primarily consists of three classes:

Pattern: A compiled representation of a regular expression.

Matcher: Interprets the pattern and performs match operations against an input string.

PatternSyntaxException: Indicates a syntax error in a regular expression pattern.
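
As a rough illustration of how the three classes fit together, here is a minimal sketch. The class name ThreeClassesDemo and the helper methods firstMatch and isValidRegex are made up for this example; only Pattern, Matcher, and PatternSyntaxException come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; the helper names are illustrative only.
class ThreeClassesDemo {

    // Returns the first substring of input matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);   // compiled representation of the regex
        Matcher matcher = pattern.matcher(input);   // matching engine bound to one input
        return matcher.find() ? matcher.group() : null;
    }

    // Returns true if regex compiles, false if Pattern.compile rejects its syntax.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("[0-9]+", "order 66 shipped")); // prints 66
        System.out.println(isValidRegex("[a-z"));                     // prints false
    }
}
```

Compiling the pattern once and creating a new Matcher per input is the usual arrangement when the same expression is applied to many strings.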

 

A simple regular expression program

package regextestharness;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTestHarness {
    public static void main(String[] args) {
        try {
            // One reader is enough for both prompts.
            BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

            // printf is needed for %n to be expanded to a line break.
            System.out.printf("%nEnter your regex: ");
            Pattern pattern = Pattern.compile(br.readLine());

            System.out.printf("%nEnter your text: ");
            Matcher matcher = pattern.matcher(br.readLine());

            boolean found = false;
            // Each call to find() advances to the next match in the input.
            while (matcher.find()) {
                System.out.println("I found the text \"" + matcher.group()
                        + "\" starting at index " + matcher.start()
                        + " and ending at index " + matcher.end());
                found = true;
            }
            if (!found) {
                System.out.println("No match found.");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

 

Character classes and predefined classes

Construct        Description
[abc]            a, b, or c (simple class)
[^abc]           Any character except a, b, or c (negation)
[a-zA-Z]         a through z, or A through Z, inclusive (range)
[a-d[m-p]]       a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]     d, e, or f (intersection)
[a-z&&[^bc]]     a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]    a through z, and not m through p: [a-lq-z] (subtraction)
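
The constructs in the table can be checked one at a time with Pattern.matches, which tests whether an entire string matches. The following is only a sketch; the class name CharacterClassDemo and the helper matches are invented for the example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only Pattern.matches is a library call.
class CharacterClassDemo {

    // True if the whole of text matches regex.
    static boolean matches(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        System.out.println(matches("[abc]", "b"));          // true: simple class
        System.out.println(matches("[^abc]", "b"));         // false: negation
        System.out.println(matches("[a-zA-Z]", "Q"));       // true: range
        System.out.println(matches("[a-d[m-p]]", "n"));     // true: union
        System.out.println(matches("[a-z&&[def]]", "e"));   // true: intersection
        System.out.println(matches("[a-z&&[^bc]]", "b"));   // false: subtraction
        System.out.println(matches("[a-z&&[^m-p]]", "q"));  // true: subtraction
    }
}
```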

 

Construct    Description
.            Any character (may or may not match line terminators)
\d           A digit: [0-9]
\D           A non-digit: [^0-9]
\s           A whitespace character: [ \t\n\x0B\f\r]
\S           A non-whitespace character: [^\s]
\w           A word character: [a-zA-Z_0-9]
\W           A non-word character: [^\w]
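
Inside a Java string literal each backslash has to be doubled, so \d is written "\\d". The sketch below shows a few predefined classes doing typical text clean-up; the class name PredefinedClassDemo and its helper methods are assumptions made for this example, while String.replaceAll and String.split are standard library methods that accept a regular expression.

```java
// Hypothetical demo class; helper names are illustrative only.
class PredefinedClassDemo {

    // Replaces every run of whitespace with one space and trims both ends.
    static String collapseWhitespace(String text) {
        return text.replaceAll("\\s+", " ").trim();
    }

    // Removes every non-digit character, keeping only the digits.
    static String digitsOnly(String text) {
        return text.replaceAll("\\D", "");
    }

    // Splits text on runs of non-word characters.
    static String[] words(String text) {
        return text.split("\\W+");
    }

    public static void main(String[] args) {
        System.out.println(collapseWhitespace("a \t\n b  c "));  // prints: a b c
        System.out.println(digitsOnly("Tel: 555-0199"));          // prints: 5550199
        System.out.println(String.join("|", words("hello, regex world"))); // prints: hello|regex|world
    }
}
```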

 

Quantifiers

Greedy    Reluctant    Possessive    Meaning
X?        X??          X?+           X, once or not at all
X*        X*?          X*+           X, zero or more times
X+        X+?          X++           X, one or more times
X{n}      X{n}?        X{n}+         X, exactly n times
X{n,}     X{n,}?       X{n,}+        X, at least n times
X{n,m}    X{n,m}?      X{n,m}+       X, at least n but not more than m times
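
The three columns accept the same repetition counts but differ in how much input they try to consume and whether they give any back. A small sketch, assuming a made-up class QuantifierDemo with a helper findAll that collects every match, makes the difference visible on one input string:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; findAll is an illustrative helper.
class QuantifierDemo {

    // Collects every match of regex in input, in the order found.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: .* takes everything, then backs off just enough for "foo".
        System.out.println(findAll(".*foo", input));   // [xfooxxxxxxfoo]
        // Reluctant: .*? takes as little as possible before each "foo".
        System.out.println(findAll(".*?foo", input));  // [xfoo, xxxxxxfoo]
        // Possessive: .*+ takes everything and never backs off, so "foo" cannot match.
        System.out.println(findAll(".*+foo", input));  // []
    }
}
```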

Chinese characters

[\u4e00-\u9fa5]
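
A possible way to use this range from Java is sketched below. The class name ChineseCharDemo and its two helpers are invented for the example. The range covers the commonly used CJK Unified Ideographs; it does not cover the extension blocks or Chinese punctuation, so treat it as an approximation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; helper names are illustrative only.
class ChineseCharDemo {

    // The backslashes are doubled so that the regex engine, not the Java
    // compiler, interprets the two escapes in the range.
    private static final Pattern CHINESE = Pattern.compile("[\\u4e00-\\u9fa5]");

    // True if text contains at least one character in the range.
    static boolean containsChinese(String text) {
        return CHINESE.matcher(text).find();
    }

    // Counts the characters of text that fall in the range.
    static int countChinese(String text) {
        Matcher matcher = CHINESE.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // The escaped sample contains two Chinese characters (zhong wen).
        String sample = "abc \u4e2d\u6587 def";
        System.out.println(containsChinese(sample));   // prints true
        System.out.println(countChinese(sample));      // prints 2
        System.out.println(containsChinese("abc"));    // prints false
    }
}
```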
