Use of regular expressions and grep commands in Linux

Last Update:2018-12-06 Source: Internet

Author: User

Tags egrep


Regular Expression (RE)

1. What is a regular expression:

What is a regular expression? Put simply, in a Linux environment we can combine ordinary strings with a few special characters to match text, so that users can filter out the data they need.

These special characters, together with the tools that interpret them, form the backbone of regular expressions!

For example, take the /etc/rc.d/init.d directory. If you want to find which files in it contain the string mail, how do you search? Use grep with the string mail and a shell wildcard to search every file in the directory: grep 'mail' /etc/rc.d/init.d/*
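The search above can be sketched on a scratch directory rather than the real /etc/rc.d/init.d. The file names and contents below are made up for illustration, and the -l option (a standard grep option not covered in this article) is used so that only the names of matching files are printed.

```shell
# Scratch directory standing in for /etc/rc.d/init.d (hypothetical files).
dir=$(mktemp -d)
printf 'start the mail daemon\n' > "$dir/sendmail"
printf 'bring up eth0\n' > "$dir/network"

# The shell expands ./* into the list of files; grep then searches
# the contents of each one. -l prints only the names of matching files.
matches=$(cd "$dir" && grep -l 'mail' ./*)
echo "$matches"

rm -rf "$dir"
```

Without -l, grep prints each matching line prefixed with the name of the file it came from.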

2. What regular expressions do for the system administrator:

For system administrators, regular expressions are well worth learning! On a busy system, the messages generated every day are more numerous than you can imagine, and the system's log files record everything the system reports. That includes, of course, the records that show whether your system has been intruded upon. The volume of data is so large that an administrator cannot read every message every day. With regular expressions, we can process the log files and pick out only the "error" information for analysis.

3. Wide application of regular expressions:

Beyond system administration, a lot of software and configuration files support regular expressions. The most common example is the mail server. Do you often receive unwanted advertising mail? If that mail is removed on the server, clients avoid a lot of unnecessary bandwidth consumption. So how can advertising mail be removed? Almost all of it has characteristic subjects or content, so any message that contains one of those telltale strings can be caught with a regular expression and discarded. Both Sendmail and Postfix support regular expression matching, as do many other server programs and packages.

4. grep

Syntax: [root@test /root]# grep [-acinv] 'search string' filenames

Parameter description:

-a: treats binary files as text files when searching.

-c: counts the number of lines in which the 'search string' is found.

-i: ignores case, so uppercase and lowercase letters are treated as the same.

-n: also prints the line number of each match.

-v: inverts the selection, that is, prints the lines that do not contain the 'search string'.
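A minimal sketch of the options above, run against a throwaway file instead of a real log. The sample lines are invented for illustration.

```shell
# Throwaway sample file (contents are hypothetical).
log=$(mktemp)
printf 'root ran su root\nROOT login failed\nuser alice login ok\n' > "$log"

count=$(grep -c 'root' "$log")       # counts matching lines, not occurrences
count_i=$(grep -ci 'root' "$log")    # same count, ignoring case
numbered=$(grep -n 'failed' "$log")  # match prefixed with its line number
inverted=$(grep -vi 'root' "$log")   # lines without "root" in any case

printf '%s\n' "$count" "$count_i" "$numbered" "$inverted"
rm -f "$log"
```

The first line contains root twice but still counts once, because -c counts lines.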

Example:

[root@test /root]# grep 'root' /var/log/secure

Search /var/log/secure for lines that contain root.

[root@test /root]# grep -v 'root' /var/log/secure

Search for lines that do not contain root.

[root@test /root]# grep '[A-Z]ANPATH' /etc/man.config

Note: grep is a very common command whose main job is to match strings. Keep in mind that when grep searches a file for a string, it works in units of whole lines: any line containing a match is printed in full.

grep is one of the simplest regular expression search commands. By default it does not support the stricter extended regular expression syntax, but it is quite easy to use.

Example 1: Find the lines in a file that contain the string know and list their line numbers. Note that the match is case-sensitive.

[root@test /root]# grep -n 'know' regexp.txt

Example 2: Find the lines in a file that contain the * character and list their line numbers:

[root@test /root]# grep -n '\*' regexp.txt

Example 3: List every line containing know, regardless of case, with line numbers:

[root@test /root]# grep -ni 'know' regexp.txt
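The three examples can be tried on a small stand-in for regexp.txt. The file contents here are an assumption made for illustration; the article does not show the real file.

```shell
# Temporary stand-in for regexp.txt (contents are hypothetical).
f=$(mktemp)
printf 'I know it.\nKNOW your limits\n2 * 3 = 6\nnothing here\n' > "$f"

ex1=$(grep -n 'know' "$f")   # Example 1: case-sensitive, only line 1
ex2=$(grep -n '\*' "$f")     # Example 2: \* matches a literal asterisk
ex3=$(grep -ni 'know' "$f")  # Example 3: case-insensitive, lines 1 and 2

printf '%s\n' "$ex1" "$ex2" "$ex3"
rm -f "$f"
```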

Note: similar commands include egrep, awk, gawk, and sed, which will be covered later.

5. Special characters in regular expressions and the egrep command

Special characters

^word the string word must appear at the beginning of a line

word$ the string word must appear at the end of a line

. matches any single character

\ the escape character, which turns a special character into an ordinary one

? zero or one occurrence of the preceding character (extended syntax; as a shell wildcard, ? instead means any single character)

* zero or more repetitions of the preceding character

[list] any one of the characters in the list

[range] any one of the characters in the range

[^list] inverse selection, the opposite of [list]

[^range] inverse selection, the opposite of [range]

\{n\} the preceding character repeated exactly n times in a row

\{n,m\} the preceding character repeated n to m times in a row
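The characters in this list can be exercised with plain grep on a few sample words. This is a sketch only, and the words are chosen purely for illustration.

```shell
# Sample words, one per line (hypothetical data).
f=$(mktemp)
printf 'word first\nlast word\ngd\ngood\ngoood\ncat\ncut\n' > "$f"

begins=$(grep -n '^word' "$f")        # word at the start of a line
ends=$(grep -n 'word$' "$f")          # word at the end of a line
dot=$(grep -c '^c.t$' "$f")           # . is any one character: cat, cut
star=$(grep -c '^go*d$' "$f")         # o repeated zero or more times
list=$(grep -c '^c[au]t$' "$f")       # a or u in the middle
neg=$(grep '^c[^a]t$' "$f")           # anything but a in the middle
interval=$(grep '^go\{2,3\}d$' "$f")  # two or three o in a row
exact=$(grep '^go\{3\}d$' "$f")       # exactly three o in a row

printf '%s\n' "$begins" "$ends" "$dot" "$star" "$list" "$neg" "$interval" "$exact"
rm -f "$f"
```

Note that go*d also matches gd, because * allows zero repetitions.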

Note that the special characters in regular expressions are not the same as the wildcards you normally type on the command line. As a wildcard, * stands for zero or more arbitrary characters, but in a regular expression * means that the preceding character is repeated zero or more times. The two meanings are different. Don't confuse them!
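The difference can be seen by applying both meanings of * to the same set of names. The four file names below are hypothetical and exist only in a scratch directory.

```shell
# Scratch directory with four empty files (names are hypothetical).
dir=$(mktemp -d)
touch "$dir/a" "$dir/ab" "$dir/abbb" "$dir/axyzb"

# Shell wildcard: a*b is "a, then any characters, then b".
glob=$(cd "$dir" && echo a*b)

# Regular expression: ab* is "a followed by zero or more b".
regex=$(ls "$dir" | grep '^ab*$')

echo "wildcard a*b : $glob"
echo "regex ^ab*\$  :" $regex
rm -rf "$dir"
```

The wildcard picks up axyzb but not a, while the regular expression picks up a but not axyzb.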

Example: In the files under /etc, list the lines that contain any of the characters x, y, or z.

grep '[xyz]' /etc/*

Example: In the files under /etc, print the lines whose first character is in the range w to z.

grep '^[w-z]' /etc/*

6. diff: a command that checks whether two files differ

Syntax: [root@test /root]# diff file1 file2

Example: [root@test /root]# diff index.htm index.html
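A small sketch of what diff reports, using two throwaway files that reuse the names from the example. Their contents are invented for illustration.

```shell
# Two nearly identical files (contents are hypothetical).
dir=$(mktemp -d)
printf 'line one\nline two\n' > "$dir/index.htm"
printf 'line one\nline 2\n' > "$dir/index.html"

# diff exits with 0 when the files are identical and 1 when they differ.
out=$(diff "$dir/index.htm" "$dir/index.html") && status=0 || status=$?

echo "$out"
echo "exit status: $status"
rm -rf "$dir"
```

In the default output format, 2c2 means that line 2 of the first file was changed into line 2 of the second; lines starting with < come from the first file and lines starting with > from the second.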

Example: ls -l | grep '^d'

Note: To use a regular expression, enclose the pattern in single quotes (''). This keeps the shell from treating it as a file name wildcard.
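The sketch below shows the ls -l example and why the quotes matter. It assumes a scratch directory containing a file named ab, which is made up so that the unquoted pattern has something to expand to.

```shell
# Scratch directory: one subdirectory and two files (names are hypothetical).
dir=$(mktemp -d)
mkdir "$dir/sub"
touch "$dir/ab" "$dir/plain.txt"

# In ls -l output, lines for directories start with d.
dirs=$(ls -l "$dir" | grep -c '^d')

# Unquoted: the shell expands a* to the file name ab before grep runs,
# so grep actually searches for the string "ab".
unquoted=$(cd "$dir" && printf 'aaa\nab\n' | grep -c a*)

# Quoted: grep receives a* itself, which matches every line
# (zero or more a).
quoted=$(cd "$dir" && printf 'aaa\nab\n' | grep -c 'a*')

echo "directories: $dirs, unquoted: $unquoted, quoted: $quoted"
rm -rf "$dir"
```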

7. Special characters for extended regular expressions

Symbol Meaning

pattern1|pattern2 logical OR: matches either pattern

(pattern) groups a pattern

char+ matches one or more repetitions of the preceding character

char? matches zero or one occurrence of the preceding character

Example: t+ matches one or more consecutive t characters, such as t, tt, or ttt

t? matches zero or one t, that is, t or the empty string ''

'create|stream' matches either of the two patterns.
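These extended operators can be tried with grep -E, the POSIX-specified equivalent of egrep. The sample words below are invented for illustration.

```shell
# Sample words, one per line (hypothetical data).
f=$(mktemp)
printf 't\ntt\nttt\ncreate\nstream\ncolor\ncolour\n' > "$f"

plus=$(grep -Ec '^t+$' "$f")                 # one or more t: t, tt, ttt
opt=$(grep -E '^colou?r$' "$f")              # u is optional
alt=$(grep -E 'create|stream' "$f")          # either pattern
group=$(grep -Ec '^(create|stream)$' "$f")   # grouping plus anchors

printf '%s\n' "$plus" "$opt" "$alt" "$group"
rm -f "$f"
```

Plain grep treats +, ?, |, and ( ) as ordinary characters unless they are written in its own basic syntax, which is why these patterns need egrep or grep -E.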

Important Review

• Differences between shell wildcards and regular expressions:

Shell wildcards are used to match file names;

Regular expressions (RE) are mainly used to search for strings: they match the content inside files and filter out particular messages;

• because the degree of rigor differs, there are extended regular expressions that build on the basic regular expressions;

• regular expressions usually process text a "whole line" or a "whole block" at a time;

• grep and egrep are two common programs that use regular expressions. egrep can match alternative patterns and supports the more rigorous extended regular expression syntax.

This article is an English version of an article originally written in Chinese on aliyun.com.