C# Data Structures and Algorithms Learning Series 10: Regular Expressions


A regular expression is a language for describing patterns of characters in a string. It provides descriptors for repeated characters, alternatives, and groups of characters. Regular expressions can be used to search strings or to replace parts of them. A regular expression is itself a string that defines a search pattern for other strings. Generally, the characters in a regular expression match themselves, so the regular expression "the" matches that character sequence wherever it occurs in a string. Regular expressions can also contain special characters called metacharacters, which indicate repetition, alternation, or grouping.
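The three roles of metacharacters mentioned above can be illustrated with a small sketch. The class name MetacharacterSketch and the helper MatchesWhole are made up for this illustration, and the patterns are only examples; Regex.IsMatch is the standard .NET method.

```csharp
using System;
using System.Text.RegularExpressions;

static class MetacharacterSketch
{
    // Hypothetical helper: true when the whole input is matched by the pattern.
    public static bool MatchesWhole(string input, string pattern)
    {
        // ^ and $ anchor the pattern to the start and end of the input.
        return Regex.IsMatch(input, "^(?:" + pattern + ")$");
    }

    static void Main()
    {
        // Literal characters match themselves anywhere in the string.
        Console.WriteLine(Regex.IsMatch("over the lazy dog", "the")); // True

        // Repetition: + means "one or more of the preceding character".
        Console.WriteLine(MatchesWhole("baaad", "ba+d"));  // True

        // Alternation: | means "either the left side or the right side".
        Console.WriteLine(MatchesWhole("dog", "cat|dog")); // True

        // Grouping: ( ) lets a quantifier apply to a whole sequence.
        Console.WriteLine(MatchesWhole("abab", "(ab)+"));  // True
        Console.WriteLine(MatchesWhole("abba", "(ab)+"));  // False
    }
}
```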

1. Using regular expressions. To use regular expressions, you need to import the Regex class into your program. This class is found in the System.Text.RegularExpressions namespace. Once the class is imported, you need to decide what you want to do with it. If you want to perform a match, you need the Match class. If you want to perform a replacement, you do not need the Match class; instead, use the Replace method of the Regex class.
First, let's look at how to match a word in a string. Assume that the string "the quick brown fox jumped over the lazy dog" is given, and you want to find the word "the" in it. As follows:

    using System;
    using System.Text.RegularExpressions;

    class Chapter10
    {
        static void Main()
        {
            Regex reg = new Regex("the");
            string str1 = "the quick brown fox jumped over the lazy dog";
            Match matchSet;
            int matchPos;
            // Match returns the first occurrence of the pattern in str1.
            matchSet = reg.Match(str1);
            if (matchSet.Success)
            {
                // Index is the zero-based position where the match starts.
                matchPos = matchSet.Index;
                Console.WriteLine("found match at position: " + matchPos);
            }
        }
    }

The if statement uses the Success property of the Match class to determine whether the match succeeded. If the value is true, the regular expression matched at least one substring in the string; otherwise, Success is false. There is also another way to check whether a match will succeed: you can pre-test the regular expression by passing the target string and the regular expression to the static IsMatch method. If the string contains a match for the regular expression, this method returns true; otherwise, it returns false. As follows:

 
    // Test for a match first, then retrieve it.
    if (Regex.IsMatch(str1, "the"))
    {
        Match aMatch;
        aMatch = reg.Match(str1);
    }

One problem with the Match class is that it stores only one match. In the previous example, there are two matches for the substring "the". To retrieve all of them, we can use the Matches method instead, which returns every match of the regular expression, and store the results in a MatchCollection object. As follows:

 
    using System;
    using System.Text.RegularExpressions;

    class Chapter10
    {
        static void Main()
        {
            Regex reg = new Regex("the");
            // Lowercase "the" at the start, so the case-sensitive pattern matches twice.
            string str1 = "the quick brown fox jumped over the lazy dog";
            MatchCollection matchSet;
            // Matches returns every non-overlapping match in str1.
            matchSet = reg.Matches(str1);
            if (matchSet.Count > 0)
                foreach (Match aMatch in matchSet)
                    Console.WriteLine("found a match at: " + aMatch.Index);
            Console.Read(); // keep the console window open
        }
    }
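As an alternative to MatchCollection, a single Match object can also be advanced with its NextMatch method. The following is a minimal sketch of that variation, not the approach used in the text above; the class name NextMatchSketch and the helper FindPositions are hypothetical names chosen for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class NextMatchSketch
{
    // Hypothetical helper: collects the start index of every match
    // by walking from one Match object to the next.
    public static List<int> FindPositions(string input, string pattern)
    {
        var positions = new List<int>();
        Match m = Regex.Match(input, pattern);
        while (m.Success)
        {
            positions.Add(m.Index);
            m = m.NextMatch(); // resumes the search after the current match
        }
        return positions;
    }

    static void Main()
    {
        string str1 = "the quick brown fox jumped over the lazy dog";
        foreach (int pos in FindPositions(str1, "the"))
            Console.WriteLine("found a match at: " + pos);
    }
}
```

For this input the sketch reports positions 0 and 32, the two places where the lowercase "the" occurs.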

Next we will discuss how to replace one string with another using the Replace method. Replace can be called as a class (static) method with three parameters: the target string, the substring to be replaced, and the substring to use as the replacement. As follows:

    string s = "The quick brown fox jumped over the brown dog";
    // Replace every occurrence of "brown" with "black".
    s = Regex.Replace(s, "brown", "black");
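To make the effect of Replace visible, here is a short sketch that prints the result. It also shows, as an addition not covered in the text, the instance overload that takes a count to limit how many replacements are made. The class name ReplaceSketch and the helpers ReplaceAll and ReplaceFirst are hypothetical names for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class ReplaceSketch
{
    // The static Regex.Replace replaces every match, not just the first.
    public static string ReplaceAll(string input, string pattern, string replacement)
    {
        return Regex.Replace(input, pattern, replacement);
    }

    // The instance overload Replace(input, replacement, count)
    // limits the number of replacements; here only the first match is replaced.
    public static string ReplaceFirst(string input, string pattern, string replacement)
    {
        return new Regex(pattern).Replace(input, replacement, 1);
    }

    static void Main()
    {
        string s = "The quick brown fox jumped over the brown dog";
        Console.WriteLine(ReplaceAll(s, "brown", "black"));
        // The quick black fox jumped over the black dog
        Console.WriteLine(ReplaceFirst(s, "brown", "black"));
        // The quick black fox jumped over the brown dog
    }
}
```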

2. Using quantifiers. When writing a regular expression, you often want to specify how many times something should match, such as "match exactly twice" or "match one or more times". Quantifiers let you add this information to a regular expression.

(1). + This quantifier indicates that the regular expression should match one or more occurrences of the immediately preceding character.

(2). * This quantifier indicates that the regular expression should match zero or more occurrences of the immediately preceding character.

(3). ? This quantifier indicates that the regular expression should match zero or one occurrence of the immediately preceding character.

(4). {n} This quantifier indicates that the regular expression should match an exact number of occurrences, where n is the number of occurrences to find.

(5). {n,m} This quantifier gives the minimum and maximum number of occurrences that the regular expression should match, where n is the minimum and m is the maximum.

A simple example is as follows:

    using System;
    using System.Text.RegularExpressions;

    class Chapter10
    {
        static void Main()
        {
            string[] words = new string[] { "bad", "boy", "baad", "baaad", "bear", "bend" };
            // "ba{2}d" requires exactly two a's between b and d, so only "baad" is printed.
            foreach (string word in words)
                if (Regex.IsMatch(word, "ba{2}d"))
                    Console.WriteLine(word);
        }
    }
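To compare all five quantifiers side by side, the sketch below runs each one against the same small word list. The word list, the class name QuantifierSketch, and the helper WholeMatches are assumptions made for this illustration; the patterns are anchored with ^ and $ so that each word must match in full.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class QuantifierSketch
{
    // Sample words with zero to four a's between b and d.
    static readonly string[] Words = { "bd", "bad", "baad", "baaad", "baaaad" };

    // Hypothetical helper: returns the sample words matched in full by the pattern.
    public static List<string> WholeMatches(string pattern)
    {
        var result = new List<string>();
        foreach (string word in Words)
            if (Regex.IsMatch(word, "^" + pattern + "$"))
                result.Add(word);
        return result;
    }

    static void Main()
    {
        string[] patterns = { "ba+d", "ba*d", "ba?d", "ba{2}d", "ba{2,3}d" };
        foreach (string pattern in patterns)
            Console.WriteLine(pattern + " matches: " + string.Join(", ", WholeMatches(pattern)));
    }
}
```

With this word list, ba+d matches every word except "bd", ba*d matches all five, ba?d matches "bd" and "bad", ba{2}d matches only "baad", and ba{2,3}d matches "baad" and "baaad".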

3. Using character classes. A character class contains a set of characters, any one of which may match. If you want to match both lowercase and uppercase letters, you can write the regular expression as "[A-Za-z]". If you need to include all ten digits, you can write a character class for digits as [0-9]. In addition, by placing a caret (^) at the start of a character class, you can create a reversed, or negated, character class. For example, if the character class [aeiou] represents the vowels, you can write [^aeiou] to represent the consonants or, more precisely, any non-vowel character. Combining the three character classes gives what regular expressions call word characters: [A-Za-z0-9]. There is also a shorthand for word characters: \w (in .NET, \w additionally matches the underscore). \W represents the negation of \w, that is, non-word characters (such as punctuation marks). You can also write the digit character class [0-9] as \d (note that in C# a backslash followed by another character in an ordinary string literal is interpreted as an escape sequence, so code such as \d needs to be written as the verbatim string @"\d" to tell C# that it is part of a regular expression rather than an escape code). The non-digit character class [^0-9] can be written as \D. Finally, because whitespace characters play an important role in text processing, \s represents whitespace characters and \S represents non-whitespace characters.
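The character classes described above can be tried out with a small sketch that counts how many characters of a sample string each class matches. The sample text, the class name CharacterClassSketch, and the helper Count are made up for this illustration. Note the verbatim strings (@"...") for the backslash shorthands.

```csharp
using System;
using System.Text.RegularExpressions;

static class CharacterClassSketch
{
    // Hypothetical helper: counts how many substrings of input match the pattern.
    public static int Count(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }

    static void Main()
    {
        string text = "Room 42, floor 7!";
        Console.WriteLine("vowels:      " + Count(text, "[aeiou]"));   // 4
        Console.WriteLine("letters:     " + Count(text, "[A-Za-z]"));  // 9
        Console.WriteLine("digits:      " + Count(text, @"\d"));       // 3
        Console.WriteLine("whitespace:  " + Count(text, @"\s"));       // 3
        Console.WriteLine("non-word:    " + Count(text, @"\W"));       // 5
        // Removing every non-digit leaves only the digits.
        Console.WriteLine("digits only: " + Regex.Replace(text, @"\D", "")); // 427
    }
}
```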

 

Summary: Regular expressions make most search and replace tasks quick to carry out. Becoming fluent with them, however, is mainly a matter of time and practice; with enough use, they come naturally.
