Further study of regular expressions, page 1/2

Source: Internet
Author: User
Tags: alphanumeric characters

I set out to learn regular expressions (RE) as a beginner, but I am a little lazy by nature and always look for a quick way to learn. So I asked Google again and, thanks to its divine power, found an article by Mr. Jim Hollenhorst. After reading it I thought it was really good, so I wrote up these careful notes to share with friends at move-to.net, in the hope that they help a little in learning RE. The URL of Jim Hollenhorst's article is given below; follow the link if you want to read the original.
The 30 Minute Regex Tutorial by Jim Hollenhorst
http://www.codeproject.com/useritems/regextutorial.asp
What is RE?
Presumably you have used the wildcard character "*" when searching for files. For example, to find all the Word files in the Windows directory, you might search for "*.doc", because "*" stands for any string of characters. RE does something similar, but it is far more powerful.
When writing programs, RE is mainly used to describe a specific pattern, so you can think of an RE as a pattern description. For example, "\w+" represents a non-empty string consisting of letters or digits. The .NET Framework provides a very powerful class library that makes it easy to use RE for text search and replacement, for parsing complex headers, and for validating text.
The best way to learn RE is to work through examples. Jim Hollenhorst also provides a tool, Expresso (as in a cup of coffee), to help us learn RE. The download URL is http://www.codeproject.com/useritems/regextutorial/expressosetup2_1c.zip.
Next, let's try some examples.
Some simple examples
Suppose you want to search an article for "elvis" followed by "alive". Using RE, you might go through the following steps. The text in parentheses explains each RE:
1. elvis (search for elvis)
The RE above says that the characters to search for are e, l, v, i, s, in that order. In .NET an option can be set to ignore case, so "Elvis", "ELVIS", or "eLvIs" all match RE 1. But because RE 1 only requires the characters to appear in the order elvis, "pelvis" also matches it. RE 2 improves on this.
2. \belvis\b (search for elvis as a whole word; with case ignored this also finds Elvis and ELVIS)
"\b" has a special meaning in RE: in the example above it denotes a word boundary. So \belvis\b marks the boundaries before and after elvis with \b; that is, it requires the whole word elvis.
Now suppose elvis must be followed by alive on the same line. This takes two more special characters, "." and "*". "." stands for any character except a line break, and "*" means that the preceding item is repeated any number of times. So ".*" stands for any number of characters other than line breaks. To find elvis followed by alive on the same line, use RE 3.
3. \belvis\b.*\balive\b (find elvis followed by alive, such as "elvis is alive")
Simple special characters can be combined into a powerful RE, but you will also find that the more special characters you use, the harder the RE becomes to read.
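The three Elvis patterns can be tried outside Expresso as well. The sketch below uses Python's re module rather than the .NET library the article discusses, so the option names differ (re.IGNORECASE stands in for the .NET ignore-case option), and the sample text is made up for illustration.

```python
import re

# Hypothetical sample text; each sentence sits on its own line.
text = "Elvis has left the building.\nCheck the pelvis X-ray.\nelvis is alive!"

# RE 1: plain character sequence, case ignored.
plain = re.findall(r"elvis", text, flags=re.IGNORECASE)

# RE 2: \b on both sides requires elvis to be a whole word.
whole_word = re.findall(r"\belvis\b", text, flags=re.IGNORECASE)

# RE 3: elvis, then any characters on the same line, then alive.
followed = re.search(r"\belvis\b.*\balive\b", text, flags=re.IGNORECASE)

print(plain)        # also picks up the "elvis" inside "pelvis"
print(whole_word)   # whole words only
print(followed.group(0) if followed else None)
```

RE 3 skips the first line because "." does not cross the line break, so only the last line can supply both words.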
Let's look at another example.
Matching a valid phone number
Suppose you want to collect seven-digit phone numbers in the format xxx-xxxx from a web page, where x is a digit. The RE might be written like this.
4. \b\d\d\d-\d\d\d\d (find a seven-digit phone number, such as 123-1234)
Each \d represents one digit, and "-" is an ordinary hyphen. To avoid so many repeated \d, the RE can be rewritten as RE 5.
5. \b\d{3}-\d{4} (a better way to find a seven-digit phone number, such as 123-1234)
The {3} after \d means that the preceding item is repeated three times, so \d{3} is equivalent to \d\d\d.
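As a quick check that the long and short forms behave the same, here is a small Python sketch. It assumes RE 4 is the fully spelled-out \b\d\d\d-\d\d\d\d; the page text is invented for the example.

```python
import re

long_form = re.compile(r"\b\d\d\d-\d\d\d\d")   # RE 4, every digit written out
short_form = re.compile(r"\b\d{3}-\d{4}")      # RE 5, using repetition counts

# Hypothetical page text.
page = "Call 123-1234 or 555-0199; extension 42 is not a phone number."

print(long_form.findall(page))
print(short_form.findall(page))
```

Both patterns return the same list, which is the point of the {n} shorthand.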
Expresso, a tool for learning and testing RE
Because RE is not easy to read and it is easy to make mistakes, Jim developed a tool, Expresso, to help users learn and test RE. Besides the URL given above, you can also get it from the Ultrapico website (http://www.ultrapico.com). After Expresso is installed, its Expression Library contains many of the examples Jim built for his article. You can read the article and test as you go, or try modifying the RE in an example and see the result immediately. I find it very useful; give it a try.
Basic concepts of RE in .NET
Special characters
Some characters have special meanings, such as the "\b", ".", "*", and "\d" seen earlier. "\s" represents any whitespace character, such as a space, tab, or newline. "\w" represents any letter or digit.
Let's look at some examples.
6. \ba\w*\b (find words starting with a, such as able)
This RE looks for the starting boundary of a word (\b), followed by the letter "a", then any number of letters or digits (\w*), and finally the ending boundary of the word (\b).
7. \d+ (find strings of digits)
"+" is very similar to "*", except that "+" requires the preceding item to be repeated at least once. That is, there must be at least one digit.
8. \b\w{6}\b (find words of six letters or digits, such as ab123c)
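REs 6 to 8 can be exercised with a few lines of Python. This is only a sketch with an invented sample string, and Python's \w also accepts the underscore, which the description above does not mention.

```python
import re

# Hypothetical sample string.
sample = "able ab123c banana 42 dialed 2024"

words_with_a = re.findall(r"\ba\w*\b", sample)   # RE 6: words starting with a
digit_runs = re.findall(r"\d+", sample)          # RE 7: one or more digits
six_chars = re.findall(r"\b\w{6}\b", sample)     # RE 8: exactly six word characters

print(words_with_a)
print(digit_runs)
print(six_chars)
```

"banana" is not found by RE 6 because none of its a's sits at a word boundary, while RE 7 finds the digits inside "ab123c" because it has no boundary requirement.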
The following table lists the special characters commonly used in RE.
.     Any character except a line break
\w    Any letter or digit
\s    Any whitespace character
\d    Any digit
\b    A word boundary
^     Start of the text; for example, "^The" matches "The" at the start of the text
$     End of the text; for example, "End$" matches "End" at the end of the text
The special characters "^" and "$" are used when the match must occur at the beginning or end of the text. They are especially useful for validating that input fits a certain pattern. For example, to validate a seven-digit phone number, you might use RE 9.
9. ^\d{3}-\d{4}$ (validate a seven-digit phone number)
This is the same as RE 5, except that no other characters may appear before or after the number; that is, the entire string consists only of the seven-digit phone number. In .NET, if the Multiline option is set, "^" and "$" are matched against each line, so the RE matches as long as the beginning and end of some line fit it, instead of comparing the whole text as a single string.
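The difference between validating a whole string and matching line by line can be sketched in Python, where re.MULTILINE plays the role of the .NET Multiline option. The helper name is_phone and the sample strings are made up for this example.

```python
import re

phone = re.compile(r"^\d{3}-\d{4}$")                         # RE 9
phone_per_line = re.compile(r"^\d{3}-\d{4}$", re.MULTILINE)  # RE 9, anchors per line


def is_phone(value: str) -> bool:
    """Return True when the string is just a seven-digit phone number."""
    return phone.search(value) is not None


# Hypothetical multi-line input with a phone number on its own line.
multi = "notes\n123-1234\nmore notes"

print(is_phone("123-1234"))          # the whole string is the number
print(is_phone("call 123-1234"))     # extra text in front
print(phone.search(multi))           # no match without the multiline flag
print(phone_per_line.findall(multi)) # the middle line matches on its own
```

One Python-specific caveat: "$" also matches just before a trailing newline, so re.fullmatch is the stricter choice when the input may end with one.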
Escaped characters
Sometimes you need the literal character rather than its special meaning. In that case the "\" character removes the special meaning, so "\^", "\.", and "\\" represent the literal "^", ".", and "\".
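A short Python sketch of what escaping changes, using invented file names. re.escape is a Python convenience for escaping a whole string and is not part of the article.

```python
import re

filename = "report.doc"
lookalike = "reportXdoc"   # hypothetical name with X where the dot should be

unescaped = re.compile(r"report.doc")    # "." matches any single character
escaped = re.compile(r"report\.doc")     # "\." matches only a literal dot

print(unescaped.fullmatch(lookalike) is not None)  # the bare dot accepts X
print(escaped.fullmatch(lookalike) is not None)    # the escaped dot does not
print(escaped.fullmatch(filename) is not None)

# re.escape builds a pattern that matches the given text literally.
literal = re.escape("^.\\")
print(re.fullmatch(literal, "^.\\") is not None)
```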
Repeating the preceding item
We have already seen "{3}" and "*" used to repeat the preceding character. Later we will see how to repeat an entire subexpression with the same syntax. The following table lists the ways to repeat the preceding item.
*        Repeat any number of times
+        Repeat at least once
?        Repeat zero times or once
{n}      Repeat exactly n times
{n,m}    Repeat at least n times, but no more than m times
{n,}     Repeat at least n times
Let's try some examples.
10. \b\w{5,6}\b (find words of five or six letters or digits, such as as25d and d58sdf)
11. \b\d{3}\s\d{3}-\d{4} (find a ten-digit phone number, such as 800 123-1234)
12. \d{3}-\d{2}-\d{4} (find a social security number, such as 123-45-6789)
13. ^\w* (the first word of each line, or of the entire text)
In Expresso, try RE 13 with and without the Multiline option to see the difference.
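For readers without Expresso, REs 10 to 13 can be tried in Python as sketched below. The two-line sample text is invented, and re.MULTILINE is Python's counterpart of the Multiline option.

```python
import re

# Hypothetical two-line sample text.
text = "as25d d58sdf abc\n800 123-1234 and 123-45-6789"

five_or_six = re.findall(r"\b\w{5,6}\b", text)               # RE 10
ten_digit = re.findall(r"\b\d{3}\s\d{3}-\d{4}", text)        # RE 11
ssn = re.findall(r"\d{3}-\d{2}-\d{4}", text)                 # RE 12
first_word = re.findall(r"^\w*", text)                       # RE 13, whole text
first_words = re.findall(r"^\w*", text, flags=re.MULTILINE)  # RE 13, per line

print(five_or_six)
print(ten_digit)
print(ssn)
print(first_word)
print(first_words)
```

Without the flag, "^" anchors only at the very start of the text, so RE 13 yields a single word; with it, each line contributes its first word.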
Matching characters in a range
How do you match one of a specific set of characters? This is where brackets "[]" come in handy. [aeiou] looks for the vowels "a", "e", "i", "o", and "u", and [.?!] looks for the punctuation marks ".", "?", and "!". Inside brackets, most special characters lose their special meaning and are taken literally. You can also specify a range of characters, such as "[a-z0-9]", which means any lowercase letter or any digit.
Next, let's look at a more complicated RE for finding phone numbers.
14. \(?\d{3}[) ]\s?\d{3}[- ]\d{4} (find a ten-digit phone number, such as (080) 333-1234)
Such an RE can find phone numbers in several formats, such as (080) 123-4567 or 511 254 6654. "\(?" stands for zero or one left parenthesis "(", "[) ]" looks for a right parenthesis ")" or a space, and "\s?" stands for zero or one whitespace character. However, this RE will also find a phone number like "080) 333-1234", in which the parentheses are not balanced. Alternatives, described later, can solve this problem.
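The strengths and the weakness of RE 14 are easy to see by running it over a few candidates. This Python sketch assumes RE 14 reads \(?\d{3}[) ]\s?\d{3}[- ]\d{4}, with a space inside both bracket groups; the candidate strings are chosen for illustration.

```python
import re

# RE 14, assuming both character classes contain a space.
phone = re.compile(r"\(?\d{3}[) ]\s?\d{3}[- ]\d{4}")

candidates = [
    "(080) 333-1234",   # parenthesized area code
    "511 254 6654",     # spaces only
    "080) 333-1234",    # unbalanced parenthesis, still accepted
    "080-333-1234",     # hyphen after the area code, not accepted
]

# Map each candidate to whether the whole string matches RE 14.
matches = {c: phone.fullmatch(c) is not None for c in candidates}

for candidate, ok in matches.items():
    print(f"{candidate!r}: {ok}")
```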
Negation: characters not in a specific group
Sometimes you need to find characters that are NOT in a specific character group. The following table shows how to describe this.
\W          Any character that is not a letter or digit
\S          Any character that is not whitespace
\D          Any character that is not a digit
\B          A position that is not a word boundary
[^x]        Any character other than x
[^aeiou]    Any character other than a, e, i, o, u
15. \S+ (a string that contains no whitespace characters)
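Each negated form can be checked against a short string. The Python sketch below uses invented samples; note that in Python \w includes the underscore, so \W excludes it.

```python
import re

# Hypothetical sample string.
text = "R2-D2 said: hello world!"

non_word = re.findall(r"\W", text)                   # not a letter or digit
tokens = re.findall(r"\S+", text)                    # RE 15: runs without whitespace
non_digit_runs = re.findall(r"\D+", "a1b22c")        # runs without digits
inside_word = re.findall(r"\Bel", "elvis pelvis")    # "el" not at a word start
non_vowels = re.findall(r"[^aeiou]+", "hello world") # runs without a, e, i, o, u

print(non_word)
print(tokens)
print(non_digit_runs)
print(inside_word)
print(non_vowels)
```

A negated class such as [^aeiou] matches anything outside the set, including spaces, which is why one of the runs in the last result starts with a blank.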
Alternatives
Sometimes you need to match one of several options. Here the special character "|" comes in handy. For example, you may need to find zip codes of either five digits or nine digits (with a hyphen).
16. \b\d{5}-\d{4}\b|\b\d{5}\b (find five-digit and nine-digit (hyphenated) zip codes)
When using alternatives, pay attention to their order, because the RE tries the alternatives from left to right and takes the first one that matches. In RE 16, if the alternative for five digits came first, the RE would find only five-digit zip codes. Now that you know alternatives, you can make a better version of RE 14.
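The effect of ordering shows up clearly when the two alternatives are swapped. The Python sketch below uses an invented sentence. Its last pattern, balanced_phone, is my own guess at a repaired RE 14 and not the article's version; it uses a non-capturing group, (?:...), which has not been covered here.

```python
import re

# Hypothetical sample sentence.
text = "Send it to 12345-6789 or to 54321."

nine_first = re.findall(r"\b\d{5}-\d{4}\b|\b\d{5}\b", text)   # RE 16
five_first = re.findall(r"\b\d{5}\b|\b\d{5}-\d{4}\b", text)   # alternatives swapped

print(nine_first)   # the nine-digit code is found whole
print(five_first)   # only the first five digits of it are found

# Assumed repair of RE 14: either a fully parenthesized area code or a bare one.
balanced_phone = re.compile(r"(?:\(\d{3}\)|\d{3})\s?\d{3}[- ]\d{4}")

print(balanced_phone.fullmatch("(080) 333-1234") is not None)
print(balanced_phone.fullmatch("511 254 6654") is not None)
print(balanced_phone.fullmatch("080) 333-1234") is not None)  # unbalanced
```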
