Getting started with regular expressions and grep, sed, awk (1)

Source: Internet
Author: User

Preamble:

Regular expressions have always confused me. Whenever I read a script that mixes regular expressions with awk and sed, I get completely lost. I rarely use them, so I never remember the syntax, and I had never written a summary. Now that I have some free time, let's sum things up. Along the way, this should also shore up my weak spots in Shell and JS. I plan to cover the topic in three articles.

Body:

Regular expressions come in several flavors that are largely the same. The ones I know are "basic regular expressions", "extended regular expressions", and "Perl regular expressions". This article focuses on the basic and extended flavors, which play an important role in grep, egrep, sed, and awk.

 

Before starting on regular expressions, let's review some common grep options:

-n: print the line number; -v: invert the match (select lines that do not match); -i: ignore case.
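A quick way to see what these three options do is to run them on a few lines of text. The sketch below is only an illustration: the three sample lines are made up for this example and are not taken from the downloaded file.

```shell
# Hypothetical sample text (not the real regular_express.txt).
sample='The quick fox
the lazy dog
A third line'

# -n prefixes each matching line with its line number.
with_n=$(printf '%s\n' "$sample" | grep -n 'the')

# -i ignores case, so "The" matches as well.
with_i=$(printf '%s\n' "$sample" | grep -in 'the')

# -v inverts the match: only lines WITHOUT "the" are printed.
with_v=$(printf '%s\n' "$sample" | grep -vn 'the')

printf '%s\n' "-n:" "$with_n" "-in:" "$with_i" "-vn:" "$with_v"
```

Note that with -vn the line numbers still refer to positions in the input, not to positions in the filtered output.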

Next, let's start working through the book. For more information, see VBird's Linux Private Kitchen, Basic Learning. First, download the text file we will use:

http://linux.vbird.org/linux_basic/0330regularex/regular_express.txt

The content is as follows:


First, learn "basic regular expressions"



1. Direct Matching

Example 1: find the lines containing apple and the lines containing is, respectively
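A minimal sketch of this kind of literal search follows. The sample lines are hypothetical stand-ins for the downloaded file, so the line numbers will differ from those of the real text.

```shell
# Hypothetical sample text standing in for regular_express.txt.
sample='apple is my favorite food.
I like dog.
This window is clear.'

# A plain word is already a regular expression: it matches itself.
apple_lines=$(printf '%s\n' "$sample" | grep -n 'apple')
is_lines=$(printf '%s\n' "$sample" | grep -n 'is')

printf '%s\n' "$apple_lines"
printf '%s\n' "$is_lines"
```

Keep in mind that 'is' matches anywhere in the line, including inside a longer word such as This.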


This should be the simplest way to use regular expressions.



2. Square brackets []

[] matches any one character from a set. The usage is best illustrated by examples.

Example 2: match the lines containing test or tast
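One way to write this search is t[ae]st. The sketch below uses made-up sample lines rather than the downloaded file.

```shell
# Hypothetical sample text; the third line is a deliberate non-match.
sample='I cannot finish the test.
Oh! The soup taste good.
This is a toast.'

# [ae] stands for exactly one character: either a or e.
matches=$(printf '%s\n' "$sample" | grep -n 't[ae]st')
printf '%s\n' "$matches"
```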


As the result shows, [] picks exactly one character from the set, here [ae], to match.



3. Square brackets [] combined with the hyphen -

To match text that contains a digit, we could write [0123456789], but that is tedious. This is where the hyphen comes in: for digits, you can write [0-9]. The same works for letters: [A-Z] for uppercase letters and [a-z] for lowercase letters. Ranges can also be combined in one set, such as [A-Za-z] for all uppercase and lowercase letters.

Example 3: find the lines containing digits
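A short sketch of the range form, again on hypothetical sample lines:

```shell
# Hypothetical sample text; only lines 1 and 3 contain a digit.
sample='This dress costs 3183 dollars.
I like dog.
You are the no. 1.'

# [0-9] is shorthand for [0123456789].
digits=$(printf '%s\n' "$sample" | grep -n '[0-9]')
printf '%s\n' "$digits"
```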


 


4. Square brackets [], hyphen -, and caret ^ combined

Inside [], a leading ^ negates the set. For example:

Example 4: extract the lines that contain oo not preceded by g


Why does the last line, "19:goooooogle yes!", match? The leading goo does not satisfy the pattern, but a later stretch does: in go(oo)oogle, the oo in parentheses is preceded by o rather than g, so the line matches. This is one of the difficulties of regular expressions: the expression you write may contain a bug that you have not noticed yet.
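The behaviour can be reproduced on a small scale. The sample lines below are hypothetical, and the -o option used in the second command is a GNU/BSD grep extension rather than part of POSIX, so treat this as a sketch.

```shell
# Hypothetical sample lines, modeled on the kind of text discussed above.
sample='apple is my favorite food.
google is great.
goooooogle yes!'

# Lines containing "oo" preceded by any character other than g.
lines=$(printf '%s\n' "$sample" | grep -n '[^g]oo')
printf '%s\n' "$lines"

# -o (GNU/BSD grep, not POSIX) prints only the matched text,
# showing which part of "goooooogle" satisfied the pattern.
first=$(printf '%s\n' 'goooooogle yes!' | grep -o '[^g]oo' | head -n 1)
printf '%s\n' "$first"
```

The second command shows that the match starts at an o, not at the leading g, which is why the line is selected.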

 

Example 5: match the lines that contain oo where the character before oo is not a lowercase letter
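A possible pattern for this is [^a-z]oo. In the sketch below the sample lines are hypothetical, and LC_ALL=C is set as a precaution because the meaning of a range such as a-z can depend on the locale.

```shell
# Hypothetical sample text.
sample='Football game is not use feet only.
apple is my favorite food.
oops happens.'

# [^a-z] must match one real character that is not a lowercase letter,
# so "oo" at the very start of a line does not qualify.
matches=$(printf '%s\n' "$sample" | LC_ALL=C grep -n '[^a-z]oo')
printf '%s\n' "$matches"
```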

 

As you can see, this uses [], -, and ^ together. Note: ^ means negation only inside [].

 


5. ^ and $

^ appears here again, but with a different meaning from the one above. Outside [], ^ marks the beginning of the line, and $ marks the end of the line.

Example 6: retrieve the lines that start with a given string
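The word being searched for is not shown in the example heading above, so the sketch below assumes the word the purely for illustration. The sample lines are also hypothetical.

```shell
# Hypothetical sample text; "the" appears in every line,
# but only line 1 begins with it.
sample='the symbol is represented as start.
I like the dog.
The world is the same.'

# ^ outside brackets anchors the match to the start of the line.
matches=$(printf '%s\n' "$sample" | grep -n '^the')
printf '%s\n' "$matches"
```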


 

Example 7: retrieve the lines ending with a digit or a letter
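One way to express "digit or letter at the end of the line" is [0-9A-Za-z]$. The sample lines are hypothetical, and LC_ALL=C is an added precaution so the letter ranges are interpreted as plain ASCII.

```shell
# Hypothetical sample text; line 1 ends with a dot, lines 2 and 3 do not.
sample='I like dog.
You are the no. 1
go! go! Let us go'

# $ anchors the bracket expression to the last character of the line.
matches=$(printf '%s\n' "$sample" | LC_ALL=C grep -n '[0-9A-Za-z]$')
printf '%s\n' "$matches"
```

A trailing space or a Windows-style carriage return at the end of a line would prevent the match, which is a common surprise with $.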


 

Example 8: retrieve empty lines


Use '^$' to match empty lines: the start of the line is immediately followed by the end of the line, with no space in between.
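A minimal sketch on a hypothetical three-line input whose second line is empty:

```shell
# Hypothetical input: "first line", an empty line, "third line".
empty=$(printf 'first line\n\nthird line\n' | grep -n '^$')
printf '%s\n' "$empty"

# A common companion idiom: -v drops the empty lines instead.
kept=$(printf 'first line\n\nthird line\n' | grep -v '^$')
printf '%s\n' "$kept"
```

The first command reports only the line number followed by a colon, because the matched line has no content to print.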

 


6. Dot . and asterisk *

The dot . matches exactly one arbitrary character.

The asterisk * repeats the preceding character zero or more times.


Example 9: find strings of the form g??d (exactly two characters between g and d)
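Written as a regular expression, g??d becomes g..d. The sketch below uses hypothetical sample lines.

```shell
# Hypothetical sample text; "gd" in line 3 has nothing between g and d.
sample='Oh! The soup taste good.
The word happy is the same as glad.
The gd software is a library.'

# Each dot consumes exactly one character, whatever it is.
matches=$(printf '%s\n' "$sample" | grep -n 'g..d')
printf '%s\n' "$matches"
```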


As shown in the result, the dot . stands for any single character.


Example 10: match the lines with at least two consecutive o characters.
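"At least two o characters" can be written as ooo*: two literal o characters followed by zero or more additional ones. The sample lines in this sketch are hypothetical.

```shell
# Hypothetical sample text; line 1 has only single o characters.
sample='I like dog.
Oh! The soup taste good.
goooooogle yes!'

# oo, then o repeated zero or more times.
matches=$(printf '%s\n' "$sample" | grep -n 'ooo*')
printf '%s\n' "$matches"

# By contrast, o* alone also accepts zero occurrences,
# so it matches every line; -c counts the matching lines.
all=$(printf '%s\n' "$sample" | grep -c 'o*')
printf '%s\n' "$all"
```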


Note that "*" in a regular expression means something different from the shell wildcard * we already know.

 

Example 11: match strings that start with g and end with g


You cannot use 'g*g' here, because * is not the shell wildcard: 'g*g' means zero or more g characters followed by one g. The correct expression is 'g.*g'.

So remember: * in a regular expression is different from the wildcard!
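The difference between the two patterns can be checked directly. The sample lines in this sketch are hypothetical.

```shell
# Hypothetical sample text; line 2 contains only a single g.
sample='google is great.
I like dog.
go! go! Let us go.'

# g.*g: a g, then any characters (possibly none), then another g.
right=$(printf '%s\n' "$sample" | grep -n 'g.*g')
printf '%s\n' "$right"

# g*g: zero or more g, then one g. Any line with a single g matches.
wrong=$(printf '%s\n' "$sample" | grep -n 'g*g')
printf '%s\n' "$wrong"
```

The second pattern also selects the line with dog, which has no g...g string at all.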

 


7. Escape \

What if the text we want to match contains one of these special characters itself? (VBird's Linux Private Kitchen says these characters have special meanings in the shell. I think that is incorrect, or at least misleading. Is it really about the shell? The dot in his example stands for the current directory in the shell, right? Isn't the real reason that . is a special character in regular expressions?) The answer: escape it!

To match lines ending with a dot, remember that an unescaped . matches any single character, so escape it with a backslash: '\.$'
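A small sketch of the escaped and unescaped forms side by side, on hypothetical sample lines:

```shell
# Hypothetical sample text; only line 1 ends with a literal dot.
sample='I like dog.
goooooogle yes!
go! go! Let us go'

# \. is a literal dot, anchored to the end of the line by $.
escaped=$(printf '%s\n' "$sample" | grep -n '\.$')
printf '%s\n' "$escaped"

# Without the backslash, . accepts any last character,
# so every non-empty line matches; -c counts them.
unescaped=$(printf '%s\n' "$sample" | grep -c '.$')
printf '%s\n' "$unescaped"
```

The single quotes matter: they stop the shell from interpreting the backslash before grep sees it.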


 

That's it for basic regular expressions today.


References:

VBird's Linux Private Kitchen

Linux Programming

http://www.ibm.com/developerworks/cn/education/aix/au-unixtips3/

http://www.cnblogs.com/chengmo/archive/2010/10/10/1847287.html

 
