Summary of common commands for Linux text processing

Source: Internet
Author: User
Tags: egrep

Transferred from: https://www.cnblogs.com/sheeva/p/6406285.html

Introduction

As a programmer who prefers Windows, I used to do all my text processing with graphical tools such as Notepad++. Even for something as simple as a global string replacement in a file on a Linux server, I had to download the file, edit it locally, and upload it again. A couple of days ago I bought the book "Brother Bird's Linux Private Kitchen" and finally sat down to study Linux text processing systematically. It turned out to be less difficult than I had imagined, and had I learned it earlier, the time saved would certainly have far exceeded the time spent studying.

Overview

This article covers two things:

    1. A brief review of regular expressions. If you already know them, or at least know that they come in basic and extended flavors, you can skip that part.
    2. The main body of the article: four Linux text-processing commands (grep, sed, printf, and awk).

Let's get started.

Regular Expression Review

This part is a brief refresher for readers who have already learned regular expressions. If you have not, it is best to study them from other material first and then come back to this article.

Regular expressions are divided into basic regular expressions and extended regular expressions, as follows:

Basic Regular Expressions

Pattern        Meaning
^word          Matches lines that start with "word"
word$          Matches lines that end with "word"
.              Any single character
\              Escape character
*              Zero or more repetitions of the preceding character
[abc]          One character: a, b, or c
[a-z]; [0-9]   One character from a to z; one digit from 0 to 9
[^abc]         One character other than a, b, or c
\{m,n\}        m to n repetitions of the preceding character

Extended Regular Expressions

Pattern        Meaning
+              One or more repetitions of the preceding character
?              Zero or one occurrence of the preceding character
|              Or
()             Group

Text Processing Command grep

grep searches its input line by line and prints the lines that contain a match for the given pattern.

grep usage:

grep is generally used in two ways, either searching a file directly or reading input from a pipe:

    1. grep 'word' file.txt
    2. cat file.txt | grep 'word'

Common options for grep:

Option         Meaning
-n             Prefix each output line with its line number
--color=auto   Highlight the matched keyword
-A3            Also print the three lines after each matching line
-B2            Also print the two lines before each matching line
-v             Invert the match, that is, print the lines that do not contain the keyword
-i             Ignore case when matching the keyword
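
The options above can be tried on a small throwaway file. This is only a minimal sketch: the file contents and the variable names are made up for illustration and do not come from the original article.

```shell
# Hypothetical sample file, created only for this demonstration.
tmp=$(mktemp)
printf 'apple\nBanana\ncherry\nbanana split\ndate\n' > "$tmp"

# -n: prefix each matching line with its line number
numbered=$(grep -n 'banana' "$tmp")

# -i: ignore case, so "Banana" matches as well
nocase=$(grep -i 'banana' "$tmp")

# -v: print only the lines that do NOT contain "an"
inverted=$(grep -v 'an' "$tmp")

# -A1 / -B1: also print one line after / before each match
after=$(grep -A1 'cherry' "$tmp")
before=$(grep -B1 'cherry' "$tmp")

printf '%s\n' "-n:" "$numbered" "-i:" "$nocase" "-v:" "$inverted" \
              "-A1:" "$after" "-B1:" "$before"
rm -f "$tmp"
```

Each result is captured in a variable first so that the individual options can be compared side by side.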

grep tips:

In most cases we want matched keywords highlighted (the --color=auto option), so you can add the following line to your ~/.bashrc file:

alias grep='grep --color=auto'

and then run

source ~/.bashrc

to make the configuration take effect. From then on, grep automatically runs with the --color=auto option.

grep examples:

By default grep matches using basic regular expressions. Here are some common examples for reference.

grep 't[ae]st' // find "tast" or "test"

grep '[0-9]' // find digits

grep '[^a-z]oo' // find Xoo, where X is any character that is not in a to z

grep '^the' // find lines that start with "the". Note that ^ inside [] means "not these characters", as in the previous example, while outside [] it means "starts with", as in this example.

grep '^$' // find blank lines

grep 'o\{2\}' // find two consecutive o's. Note that in a basic regular expression the braces must be escaped as \{ and \}; an unescaped { is treated as an ordinary character. This differs from most other regular expression flavors.
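
The examples above can be checked against a small sample. The sketch below is illustrative only: the sample lines are invented, LC_ALL=C is set as an assumption so that [a-z] means plain ASCII lowercase in every locale, and grep -c (count matching lines) is a standard option that the article does not cover.

```shell
export LC_ALL=C   # assumption: make [a-z] behave as plain ASCII a..z

# Hypothetical sample data; line 4 is intentionally empty.
sample=$(mktemp)
printf 'test\ntast\ntoast\n\nthe zoo\nGoo 42\nfood\n' > "$sample"

bracket=$(grep 't[ae]st' "$sample")      # lines containing "tast" or "test"
digits=$(grep '[0-9]' "$sample")         # lines containing a digit
negated=$(grep '[^a-z]oo' "$sample")     # "oo" preceded by a non-lowercase character
anchored=$(grep '^the' "$sample")        # lines starting with "the"
blank_count=$(grep -c '^$' "$sample")    # number of empty lines
double_o=$(grep 'o\{2\}' "$sample")      # lines containing "oo"

printf '%s\n' "$bracket" "$digits" "$negated" "$anchored" "$blank_count" "$double_o"
rm -f "$sample"
```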

egrep

Regular expressions are divided into basic and extended, but by default grep supports only basic regular expressions. To use an extended regular expression, use the egrep command (equivalent to grep -E).

A few examples:

egrep 'gd|good' // find "gd" or "good"

egrep 'g(la|oo)d' // find "glad" or "good"

egrep 'a(xyz)+c' // find aXc, where X is one or more repetitions of the string "xyz"
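
A quick way to see the extended operators in action is shown below. This is a sketch under stated assumptions: the word list is invented, and grep -E is used in place of egrep because the two are equivalent. The ? example is an addition based on the extended regular expression table, not one of the article's own examples.

```shell
# Hypothetical word list for trying the extended operators.
words=$(mktemp)
printf 'gd\ngood\nglad\ngold\naxyzc\naxyzxyzc\nac\n' > "$words"

alternation=$(grep -E 'gd|good' "$words")     # "gd" or "good"
grouping=$(grep -E 'g(la|oo)d' "$words")      # "glad" or "good"
one_or_more=$(grep -E 'a(xyz)+c' "$words")    # "a", one or more "xyz", then "c"
optional=$(grep -E 'a(xyz)?c' "$words")       # "a", at most one "xyz", then "c"

printf '%s\n' "$alternation" "$grouping" "$one_or_more" "$optional"
rm -f "$words"
```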

sed

sed is a powerful command that supports five kinds of operations: deleting lines, adding lines, selecting lines, replacing lines, and replacing strings.

sed is a pipeline command, so it can process piped input.

1. Deleting lines

nl /etc/passwd | sed '2d' // delete line 2

The input pipeline is omitted below.

sed '2,5d' // delete lines 2 through 5

sed '3,$d' // delete from line 3 to the last line; $ stands for the last line

sed '/^$/d' // delete empty lines
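
The delete commands are easy to verify on a few numbered lines. In this minimal sketch, the helper function input and the sample lines are hypothetical names introduced only for the demonstration.

```shell
# Hypothetical helper that prints six sample lines.
input() { printf 'l1\nl2\nl3\nl4\nl5\nl6\n'; }

del_one=$(input | sed '2d')        # removes l2
del_range=$(input | sed '2,5d')    # removes l2..l5
del_to_end=$(input | sed '3,$d')   # removes l3 through the last line
del_empty=$(printf 'a\n\nb\n\n\nc\n' | sed '/^$/d')   # removes empty lines

printf '%s\n' "$del_one" "--" "$del_range" "--" "$del_to_end" "--" "$del_empty"
```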

2. Adding lines

sed '2a drink tea' // append the line "drink tea" below line 2; a stands for append

sed '2i drink tea' // insert the line "drink tea" above line 2; i stands for insert

sed '2a a\
b\
c' // append three lines "a", "b", "c" below line 2; every line except the last must end with "\".
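
The following sketch runs the append and insert commands on three sample lines. It assumes GNU sed, which accepts the text on the same line as a or i; other sed implementations may require a different form. The helper input and the sample lines are hypothetical.

```shell
# Hypothetical helper that prints three sample lines.
input() { printf 'l1\nl2\nl3\n'; }

# Assumes GNU sed (one-line form of the a and i commands).
appended=$(input | sed '2a drink tea')   # new line goes after line 2
inserted=$(input | sed '2i drink tea')   # new line goes before line 2

# Multi-line append: each added line except the last ends with a backslash.
multi=$(input | sed '2a a\
b\
c')

printf '%s\n' "$appended" "--" "$inserted" "--" "$multi"
```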

3. Selecting lines

sed -n '5,7p' // print lines 5 through 7. The -n option is required; without it, every line is printed and lines 5 through 7 are printed twice.
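
The effect of -n can be confirmed by counting output lines. This is a small sketch with made-up input (the numbers 1 to 8); the helper name input is hypothetical.

```shell
# Hypothetical helper that prints the numbers 1..8, one per line.
input() { printf '%s\n' 1 2 3 4 5 6 7 8; }

selected=$(input | sed -n '5,7p')                      # only lines 5..7
without_n=$(input | sed '5,7p' | wc -l | tr -d ' ')    # 8 lines plus 3 duplicates

printf 'selected:\n%s\nline count without -n: %s\n' "$selected" "$without_n"
```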

4. Replacing lines

sed '2,5c no 2~5 lines' // replace lines 2 through 5 with the single line "no 2~5 lines"
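
A short check of the c command, again assuming GNU sed's one-line form and using invented sample lines:

```shell
# Hypothetical helper that prints six sample lines.
input() { printf 'l1\nl2\nl3\nl4\nl5\nl6\n'; }

# Assumes GNU sed: the whole range 2..5 is replaced by ONE line of text.
changed=$(input | sed '2,5c no 2~5 lines')

printf '%s\n' "$changed"
```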

5. Replacing strings

sed 's/string to be replaced/new string/g' // the format starts with s, and three / characters separate the string to be replaced from the new string. The trailing g replaces every occurrence on each line; without it, only the first occurrence on each line is replaced. The string to be replaced can be a regular expression.
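
The difference made by the g flag, and the use of a regular expression as the search string, can be seen in this minimal sketch. The input strings are made up for illustration.

```shell
# With g: every occurrence on the line is replaced.
all=$(echo 'foo bar foo' | sed 's/foo/baz/g')

# Without g: only the first occurrence on the line is replaced.
first_only=$(echo 'foo bar foo' | sed 's/foo/baz/')

# The search string may be a (basic) regular expression: one or more digits.
digits=$(echo 'id=123 id=45' | sed 's/[0-9][0-9]*/N/g')

printf '%s\n' "$all" "$first_only" "$digits"
```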

Writing the result directly to the file

By default sed does not modify the file; it only prints the modified content, which you can redirect with > into a new file. However, if you want to modify the original file, do not redirect with > to the original file itself, because that empties the original file. To modify the original file, use the -i option, for example:

sed -i '2d' file.txt // delete the second line directly in the original file

Modifying the original file directly is dangerous, because a mistake cannot be undone. First run the command without -i to print the modified result, and add -i only after confirming that the result is correct.
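
The preview-then-apply workflow can be sketched as follows. The directory, file name, and contents are hypothetical. As an extra safety net that the original article does not mention, this sketch uses -i.bak, which makes sed keep a copy of the original file with a .bak suffix.

```shell
# Hypothetical working directory and file, created only for this demonstration.
workdir=$(mktemp -d)
file="$workdir/file.txt"
printf 'l1\nl2\nl3\n' > "$file"

# Step 1: preview. Without -i the file is untouched; the result is only printed.
preview=$(sed '2d' "$file")

# Step 2: apply in place, keeping the original as file.txt.bak.
sed -i.bak '2d' "$file"

after=$(cat "$file")
backup=$(cat "$file.bak")

printf 'preview:\n%s\nfile now:\n%s\nbackup:\n%s\n' "$preview" "$after" "$backup"
rm -rf "$workdir"
```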

printf

printf is hard to describe in words, but easy to understand once you try it.

Save the following content as printf.txt:

Name     Chinese   English   Math    Average
dmtsai                                 77.33
VBird                                  70.00
Ken           60        90     70      73.33

First view the file with cat, which prints it exactly as saved.

Now run the printf command with a format string:

printf '%10s %10s %10s %10s %10s\n' $(cat printf.txt)

The columns in the output are now aligned, which is much easier to read than the raw cat output.

%10s means the column has a fixed width of 10 characters. Other formats are not covered here; %10s is enough for this article.

printf is not a pipeline command. To use it on a file, you must pass the file's contents as arguments through command substitution, $(cat printf.txt), as in the command above.

printf is widely used, and it appears again inside the awk commands later in this article.
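
The following sketch shows the same idea on a tiny three-column table. The data is invented for illustration and is not the article's printf.txt. Note that the unquoted $(cat ...) relies on the shell splitting the file into words, so it only suits simple data without spaces inside fields or wildcard characters.

```shell
# Hypothetical three-column table.
table=$(mktemp)
printf 'Name Score City\nalice 90 Paris\nbob 7 Rome\n' > "$table"

# printf reuses the format string for every group of three words,
# so each input row becomes one right-aligned output row.
formatted=$(printf '%10s %10s %10s\n' $(cat "$table"))

printf '%s\n' "$formatted"
rm -f "$table"
```

Every output line has the same length (three 10-character columns plus two separating spaces), which is what makes the columns line up.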

awk

awk mainly processes text by splitting each line into columns. It can also apply different processing to different lines through conditions, and even perform numerical calculations.

Again, we learn by example.

First, consider the last 5 login records, as printed by last -5.

In that output the first column is the user name and the third column is the user's IP address. To pick out just these two columns, use awk:

last -5 | awk '{print $1 "\t" $3}'

The command looks complicated, but don't worry, it is simple.

First, awk uses a fixed format: awk '{command}'. The single quotes and curly braces are part of that format.

The command itself is

print $1 "\t" $3    // by default awk splits each line into N columns on spaces and tabs; $1 is the first column and $3 is the third column.

Seen this way, it is much simpler.

The output of the last command is separated by whitespace, which awk handles by default. Now consider another example, the output of cat /etc/passwd.

The fields in each line are separated by ":", so to output the first and third columns with awk you must specify the delimiter:

cat /etc/passwd | awk -F ':' '{print $1 "\t" $3}'    // -F ':' specifies ":" as the delimiter
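
Both splitting modes can be tried without touching real system files. In this sketch the login record and the passwd-style lines are made-up sample data, not real output.

```shell
# Default splitting on spaces and tabs: pick columns 1 and 3 of a made-up login record.
default_split=$(printf 'alice pts/0 192.168.1.5 Mon\n' | awk '{print $1 "\t" $3}')

# -F ':' switches the delimiter to a colon, as needed for passwd-style lines.
colon_split=$(printf 'root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n' \
  | awk -F ':' '{print $1 "\t" $3}')

printf '%s\n' "$default_split" "$colon_split"
```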

Besides field references such as $1 and $3, the following built-in variables can also be used in awk commands:

NF: the number of columns in the current line

NR: the current line number
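
A minimal sketch of NR and NF, using invented input with a different number of columns on each line:

```shell
# For each line, print its line number (NR) and its number of columns (NF).
counts=$(printf 'a b c\nd e\nf\n' | awk '{print NR, NF}')

printf '%s\n' "$counts"
```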

Here is a comprehensive example of awk's conditions and numerical calculations. Save the following data as pay.txt:

Name    1st     2nd     3rd
VBird   23000   24000   25000
DMTsai  21000   20000   23000
bird2   43000   42000   41000

Now we want to add a "Total" column containing the sum of the values in each row.

This can be done with awk:

cat pay.txt | awk 'NR==1 {printf "%10s %10s %10s %10s %10s\n", $1,$2,$3,$4,"Total"}; NR>1 {printf "%10s %10s %10s %10s %10s\n", $1,$2,$3,$4,$2+$3+$4}'
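
To check the totals, the sketch below writes the pay data from the text to a temporary file (the temporary file and variable names are assumptions for the demonstration), runs the same two-rule awk program, and then extracts the new fifth column.

```shell
# The pay data from the text, written to a throwaway file.
pay=$(mktemp)
printf 'Name 1st 2nd 3rd\nVBird 23000 24000 25000\nDMTsai 21000 20000 23000\nbird2 43000 42000 41000\n' > "$pay"

# Rule 1 handles the header line; rule 2 handles the data lines and sums columns 2..4.
report=$(awk 'NR==1 {printf "%10s %10s %10s %10s %10s\n", $1,$2,$3,$4,"Total"}
              NR>1  {printf "%10s %10s %10s %10s %10s\n", $1,$2,$3,$4,$2+$3+$4}' "$pay")

# Pull out just the new column to inspect the sums.
totals=$(printf '%s\n' "$report" | awk '{print $5}')

printf '%s\n' "$report"
rm -f "$pay"
```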

A few points to note:

    1. With conditions, the awk format is: awk 'condition1 {command1}; condition2 {command2}'
    2. Conditions can use the following comparison operators:
      • >
      • <
      • >=
      • <=
      • ==    // note that "equal to" is two equals signs
      • !=
    3. You can calculate directly with the column values ($2, $3, $4) of a row.
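
The points above can be sketched on the same pay data. The temporary file and variable names are hypothetical, and the && operator used to combine two conditions is standard awk that the article itself does not cover.

```shell
# The pay data from the text, written to a throwaway file.
pay=$(mktemp)
printf 'Name 1st 2nd 3rd\nVBird 23000 24000 25000\nDMTsai 21000 20000 23000\nbird2 43000 42000 41000\n' > "$pay"

# >= : names whose first-month pay is at least 23000 (NR>1 skips the header line).
high=$(awk 'NR>1 && $2 >= 23000 {print $1}' "$pay")

# == : select one row by name and add up its three values.
vbird_total=$(awk '$1 == "VBird" {print $2+$3+$4}' "$pay")

# != : every row except the header.
names=$(awk '$1 != "Name" {print $1}' "$pay")

printf '%s\n' "$high" "$vbird_total" "$names"
rm -f "$pay"
```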

Summary

This article first reviewed regular expressions (basic and extended) and then introduced four common commands. To finish, here is a summary of what each command is for:

Command      Use
grep/egrep   Keyword search
sed
  1. Deleting, adding, replacing, and selecting lines
  2. Keyword substitution
printf       Formatted output of file contents
awk
  1. Splitting each line into columns by a delimiter and selecting some columns
  2. Processing different lines differently through conditions
  3. Calculating with the columns of a line
