The use of Linux sed and awk

Source: Internet
Author: User

SED usage:

Original link: http://www.cnblogs.com/dong008259/archive/2011/12/07/2279897.html

sed is a useful file-processing tool. It is itself a pipeline command and processes its input line by line; it can replace, delete, add, and select lines, among other tasks. Let us first look at how sed is used.
The sed command-line format is:
sed [-nefri] 'command' input-file

Common options:
        -n: Quiet (silent) mode. By default, sed prints every line of its input to the screen. With -n, only the lines explicitly selected by a sed action (such as p) are printed.
        -e: Give the sed action directly on the command line;
        -f: Read the sed actions from a file; -f filename runs the sed actions stored in filename;
        -r: Use extended regular expression syntax in sed actions. (The default is basic regular expression syntax.)
        -i: Modify the file that was read in place instead of printing to the screen.
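A minimal sketch of how -n and -e change what sed prints. The sample file contents and the variable names (sample, default_out, quiet_out, multi_out) are assumptions made up for this illustration.

```shell
# Build a small throwaway sample file (hypothetical contents).
sample=$(mktemp)
printf 'Hello!\nruby is me,welcome to my blog.\nend\n' > "$sample"

# Without -n, sed prints every line, and p prints line 2 a second time.
default_out=$(sed '2p' "$sample")

# With -n, only the line selected by p is printed.
quiet_out=$(sed -n '2p' "$sample")

# -e allows several actions on one command line: delete line 1, then substitute.
multi_out=$(sed -e '1d' -e 's/end/END/' "$sample")

printf '%s\n---\n%s\n---\n%s\n' "$default_out" "$quiet_out" "$multi_out"
rm -f "$sample"
```

The first result has four lines because line 2 appears twice; the second has only line 2.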

Common commands:
        a  : Append. a can be followed by text, which appears on a new line after the current line.
        c  : Change. c can be followed by text, which replaces the lines in the range n1,n2.
        d  : Delete. Because it deletes lines, d is usually not followed by anything.
        i  : Insert. i can be followed by text, which appears on a new line before the current line.
        p  : Print the selected lines. p is normally used together with sed -n.
        s  : Substitute. The s action is usually combined with regular expressions, for example 1,20s/old/new/g.
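The commands listed above can be tried side by side on a scratch file. This is only a sketch: the three-line file and the variable names are invented for the example, and the a, i, and c commands are written in the portable form (a backslash followed by the text on the next line) rather than the one-line form used elsewhere in this article.

```shell
sample=$(mktemp)
printf 'one\ntwo\nthree\n' > "$sample"

# a: append text after line 2.
appended=$(sed '2a\
after two' "$sample")

# i: insert text before line 2.
inserted=$(sed '2i\
before two' "$sample")

# c: replace lines 1 through 2 with a single line.
changed=$(sed '1,2c\
replaced' "$sample")

# d: delete the last line.
deleted=$(sed '$d' "$sample")

printf '%s\n\n%s\n\n%s\n\n%s\n' "$appended" "$inserted" "$changed" "$deleted"
rm -f "$sample"
```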

Examples: (suppose we have a file named ab)
      Delete lines
     # sed '1d' ab              # delete the first line
     # sed '$d' ab              # delete the last line
     # sed '1,2d' ab            # delete lines 1 through 2
     # sed '2,$d' ab            # delete from line 2 through the last line

Show lines
# sed -n '1p' ab      # show the first line
# sed -n '$p' ab      # show the last line
# sed -n '1,2p' ab    # show lines 1 through 2
# sed -n '2,$p' ab    # show from line 2 through the last line

Query using a pattern
# sed -n '/ruby/p' ab    # show all lines containing the keyword ruby
# sed -n '/\$/p' ab      # show all lines containing the character $; the backslash \ removes its special meaning

Add one or more lines
# cat ab
Hello!
ruby is me,welcome to my blog.
end
# sed '1a drink tea' ab      # append the string "drink tea" after the first line
Hello!
drink tea
ruby is me,welcome to my blog.
end
# sed '1,3a drink tea' ab    # append "drink tea" after each of lines 1 through 3
Hello!
drink tea
ruby is me,welcome to my blog.
drink tea
end
drink tea
# sed '1a drink tea\nor coffee' ab    # append several lines after the first line, using the line break \n
Hello!
drink tea
or coffee
ruby is me,welcome to my blog.
end

Replace one or more lines
# sed '1c Hi' ab      # replace the first line with Hi
Hi
ruby is me,welcome to my blog.
end
# sed '1,2c Hi' ab    # replace lines 1 through 2 with Hi
Hi
end

Replace part of a line
Format: sed 's/string to replace/new string/g' (the string to replace may be a regular expression)
# sed -n '/ruby/p' ab | sed 's/ruby/bird/g'    # replace ruby with bird
# sed -n '/ruby/p' ab | sed 's/ruby//g'        # delete ruby

Insert
# sed -i '$a bye' ab    # append "bye" directly after the last line of the file ab
# cat ab
Hello!
ruby is me,welcome to my blog.
end
bye

Delete a matching row

sed -i '/match string/d' filename (Note: if the match string is a shell variable, the script needs double quotes "" rather than single quotes '', as far as I recall.)
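The quoting note can be checked with a short experiment. In this sketch the variable name pattern and the sample file are hypothetical, and -i is left out so that the file is not modified; the quoting behaves the same way with -i.

```shell
sample=$(mktemp)
printf 'Hello!\nruby is me,welcome to my blog.\nend\n' > "$sample"

pattern='ruby'   # hypothetical variable holding the match string

# Double quotes: the shell expands $pattern, so sed receives /ruby/d.
expanded=$(sed "/$pattern/d" "$sample")

# Single quotes: sed receives the literal text $pattern, which matches no line here.
literal=$(sed '/$pattern/d' "$sample")

printf '%s\n---\n%s\n' "$expanded" "$literal"
rm -f "$sample"
```

If the variable may contain a slash or regular-expression metacharacters, it would need escaping before being spliced into the sed script.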

Replace a string in a matching row

sed -i '/match string/s/source string/target string/g' filename
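A small sketch of this form, in which the address /ruby/ limits the substitution to matching lines. The two-line input and the variable names are assumptions for the example, and -i is omitted so the result is printed rather than written back.

```shell
sample=$(mktemp)
printf 'ruby likes tea\nbird likes tea\n' > "$sample"

# Only lines matching /ruby/ are touched; on those lines tea becomes coffee.
result=$(sed '/ruby/s/tea/coffee/g' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```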

awk usage:

Original link: http://www.cnblogs.com/ggjucheng/archive/2013/01/13/2858470.html

Introduction

awk is a powerful text analysis tool. Compared with grep's searching and sed's editing, awk is especially strong at analyzing data and generating reports. Put simply, awk reads a file line by line, splits each line into fields using whitespace as the default separator, and then performs various kinds of analysis on the pieces.

awk comes in three versions: awk, nawk, and gawk. Unless otherwise noted, awk generally refers to gawk, the GNU version of awk.

awk takes its name from the first letters of the surnames of its creators, Alfred Aho, Peter Weinberger, and Brian Kernighan. awk is in fact a language of its own, the awk programming language, which its three creators formally defined as a "pattern scanning and processing language". It lets you write short programs that read input files, sort data, process data, perform calculations on the input, and generate reports, along with countless other tasks.

How to use
awk '{pattern + action}' {filenames}

Although the operations can be complex, the syntax is always the same: pattern is what awk looks for in the data, and action is a series of commands executed when a match is found. Curly braces ({}) need not always appear in a program, but they are used to group a series of instructions under a particular pattern. pattern is a regular expression, enclosed in slashes.

The most basic function of the awk language is to browse and extract information from files or strings according to specified rules; once the information is extracted, other text operations can be performed on it. Complete awk scripts are typically used to format the information in text files.

Typically, awk processes a file one line at a time: it receives each line of the file and then runs the corresponding commands on it.

Call awk

There are three ways to call awk:

1. Command line

awk [-F field-separator] 'commands' input-file(s)

Here commands is the actual awk program and [-F field-separator] is optional. input-file(s) is the file or files to process. In awk, each item on a line that is set apart by the field separator is called a field. If -F is not given, the default field separator is whitespace.

2. Shell-style script

Put all the awk commands in a file and make that file executable, with the awk interpreter named on the first line of the script; the script is then run by typing its name. That is, the usual first line of a shell script, #!/bin/sh, is replaced by #!/bin/awk -f (the -f makes awk read the program from the script file; the path to awk may differ between systems).

3. Separate program file

Put all the awk commands in a separate file and call:

awk -f awk-script-file input-file(s)

The -f option loads the awk program from awk-script-file; input-file(s) is the same as above.
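The three ways of calling awk can be compared on the same input. This is a sketch under assumptions: the file names data.txt, first.awk, and first_field are made up, the awk path for the script's first line is looked up with command -v rather than hard-coded, and the temporary directory is assumed to allow executing files.

```shell
workdir=$(mktemp -d)
printf 'root:x:0\ndaemon:x:1\n' > "$workdir/data.txt"

# 1. Command line: the program is given inline, with -F setting the separator.
inline_out=$(awk -F ':' '{print $1}' "$workdir/data.txt")

# 2. Separate program file loaded with -f (the separator is set inside the program).
printf '%s\n' 'BEGIN { FS = ":" }' '{ print $1 }' > "$workdir/first.awk"
file_out=$(awk -f "$workdir/first.awk" "$workdir/data.txt")

# 3. Executable script whose first line names the awk interpreter.
awk_path=$(command -v awk)
{ printf '#!%s -f\n' "$awk_path"; cat "$workdir/first.awk"; } > "$workdir/first_field"
chmod +x "$workdir/first_field"
script_out=$("$workdir/first_field" "$workdir/data.txt")

printf '%s\n---\n%s\n---\n%s\n' "$inline_out" "$file_out" "$script_out"
rm -rf "$workdir"
```

All three calls should print the same two account names.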

This chapter focuses on the command-line approach.

Getting started examples

Suppose the output of last -n 5 is as follows

# last -n 5    <== show only the first five lines
root     pts/1   192.168.1.100  Tue Feb 10 11:21   still logged in
root     pts/1   192.168.1.100  Tue Feb 10 00:46 - 02:28  (01:41)
root     pts/1   192.168.1.100  Mon Feb  9 11:41 - 18:30  (06:48)
dmtsai   pts/1   192.168.1.100  Mon Feb  9 11:41 - 11:41  (00:00)
root     tty1                   Fri Sep  5 14:09 - 14:10  (00:01)

If you only want to show the five most recently logged-in accounts

# last -n 5 | awk '{print $1}'
root
root
root
dmtsai
root

awk works as follows: it reads one record at a time, with records separated by the newline '\n'; it then splits the record into fields at the specified field separator. $0 represents the whole record, $1 the first field, and $n the nth field. The default field separator is whitespace (spaces or tabs), so $1 is the login user, $3 is the login user's IP, and so on.
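To make the field numbering concrete, the sketch below feeds a single line shaped like the last output to awk and picks out individual fields. The line is a hand-made sample rather than real output, and the variable names are illustrative.

```shell
line='dmtsai   pts/1   192.168.1.100  Mon Feb  9 11:41'

whole=$(printf '%s\n' "$line" | awk '{print $0}')   # the entire record
user=$(printf '%s\n' "$line" | awk '{print $1}')    # first field
ip=$(printf '%s\n' "$line" | awk '{print $3}')      # third field
nfields=$(printf '%s\n' "$line" | awk '{print NF}') # number of fields

printf 'user=%s ip=%s fields=%s\n' "$user" "$ip" "$nfields"
```

Runs of spaces count as a single separator, which is why the sample line splits into seven fields.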

If you only want to show the accounts in /etc/passwd

# cat /etc/passwd | awk -F ':' '{print $1}'
root
daemon
bin
sys

This is an example of awk + action: the action {print $1} is executed for every line.

-F specifies the field separator, here ':'.

If you only want to show the accounts in /etc/passwd and each account's shell, separated by a tab

# cat /etc/passwd | awk -F ':' '{print $1"\t"$7}'
root    /bin/bash
daemon  /bin/sh
bin     /bin/sh
sys     /bin/sh

If you only want to show the accounts in /etc/passwd and each account's shell, separated by a comma, with the column header name,shell added before all the rows and "blue,/bin/nosh" added as the last line.

# cat /etc/passwd | awk -F ':' 'BEGIN {print "name,shell"} {print $1","$7} END {print "blue,/bin/nosh"}'
name,shell
root,/bin/bash
daemon,/bin/sh
bin,/bin/sh
sys,/bin/sh
....
blue,/bin/nosh

The awk workflow here is: first run BEGIN; then read the file one record at a time, with records separated by the newline \n; split each record into fields at the specified field separator, so that $0 represents the whole record, $1 the first field, and $n the nth field; then run the action that corresponds to each matching pattern. awk then reads the second record, and so on, until all records have been read; finally it runs the END block.
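The order of BEGIN, the per-record action, and END can be observed by having each part print a line. This is a sketch with a made-up two-line input; the message texts are arbitrary.

```shell
trace=$(printf 'a\nb\n' | awk '
    BEGIN { print "begin: no record read yet, NR=" NR }
          { print "record " NR ": " $0 }
    END   { print "end: records read=" NR }
')

printf '%s\n' "$trace"
```

NR is 0 inside BEGIN because no input has been read at that point, and it holds the total record count inside END.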

Search /etc/passwd for all lines containing the keyword root

# awk -F: '/root/' /etc/passwd
root:x:0:0:root:/root:/bin/bash

This is an example of using a pattern: only lines that match the pattern (here, root) have the action executed (when no action is given, the default is to print the whole line).

The search supports regular expressions, for example to find lines starting with root: awk -F: '/^root/' /etc/passwd

Search /etc/passwd for all lines containing the keyword root and show the corresponding shell

# awk -F: '/root/{print $7}' /etc/passwd
/bin/bash

Here the action {print $7} is specified.

awk built-in variables

awk has many built-in variables that hold environment information; these variables can be changed. The most commonly used ones are listed below.

ARGC               number of command-line arguments
ARGV               array of command-line arguments
ENVIRON            access to system environment variables
FILENAME           name of the file awk is reading
FNR                number of records read from the current file
FS                 input field separator, equivalent to the command-line -F option
NF                 number of fields in the current record
NR                 number of records read so far
OFS                output field separator
ORS                output record separator
RS                 input record separator

In addition, the variable $0 refers to the entire record. $1 is the first field of the current line, $2 is the second field of the current line, and so on.
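Several of these variables can be seen together by reading two small files in one run. The file names one.txt and two.txt and their contents are invented for this sketch.

```shell
workdir=$(mktemp -d)
printf 'a:1\nb:2\n' > "$workdir/one.txt"
printf 'c:3\n' > "$workdir/two.txt"

# FS is set through -F, OFS is set in BEGIN; NR keeps counting across files
# while FNR restarts at 1 for each file.
report=$(cd "$workdir" && awk -F ':' 'BEGIN { OFS = "," } { print FILENAME, NR, FNR, NF, $1 }' one.txt two.txt)

printf '%s\n' "$report"
rm -rf "$workdir"
```

On the last line NR is 3 but FNR is 1, which shows the difference between the two counters.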

Report for /etc/passwd: the file name, the line number of each line, the number of columns in each line, and the full content of the line:

# awk -F ':' '{print "filename:" FILENAME ",linenumber:" NR ",columns:" NF ",linecontent:" $0}' /etc/passwd
filename:/etc/passwd,linenumber:1,columns:7,linecontent:root:x:0:0:root:/root:/bin/bash
filename:/etc/passwd,linenumber:2,columns:7,linecontent:daemon:x:1:1:daemon:/usr/sbin:/bin/sh
filename:/etc/passwd,linenumber:3,columns:7,linecontent:bin:x:2:2:bin:/bin:/bin/sh
filename:/etc/passwd,linenumber:4,columns:7,linecontent:sys:x:3:3:sys:/dev:/bin/sh

Using printf instead of print makes the code more concise and easier to read

awk -F ':' '{printf("filename:%10s,linenumber:%s,columns:%s,linecontent:%s\n",FILENAME,NR,NF,$0)}' /etc/passwd

Print and printf

awk provides both the print and printf functions for output.

The arguments to print can be variables, numbers, or strings. Strings must be enclosed in double quotes, and arguments are separated by commas. Without commas, the arguments are concatenated and cannot be told apart. The comma here plays the role of the output field separator, which by default is a space.
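The effect of the comma is easy to see on a two-field input. A minimal sketch, with an arbitrary input string and illustrative variable names:

```shell
# With a comma, print puts the output field separator (a space by default) between arguments.
with_comma=$(echo 'root x' | awk '{print $1, $2}')

# Without a comma, the two values are concatenated.
no_comma=$(echo 'root x' | awk '{print $1 $2}')

# Changing OFS changes what the comma produces.
custom_ofs=$(echo 'root x' | awk 'BEGIN { OFS = "-" } {print $1, $2}')

printf '%s\n%s\n%s\n' "$with_comma" "$no_comma" "$custom_ofs"
```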

The printf function works much like printf in C and can format strings; when the output is complex, printf is easier to use and the code is easier to understand.
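As a sketch of C-style formatting in awk, the example below left-aligns a name in an 8-character column and right-aligns a number in a 3-character column. The input lines and the chosen widths are assumptions for illustration.

```shell
formatted=$(printf 'root:0\ndaemon:1\n' | awk -F ':' '{printf("%-8s|%3d\n", $1, $2)}')

printf '%s\n' "$formatted"
```

Unlike print, printf adds no newline by itself, so the format string ends with \n.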

awk Programming

Variables and Assignments

In addition to awk's built-in variables, you can define your own variables.

The following counts the accounts in /etc/passwd

awk '{count++;print $0;} END{print "user count is", count}' /etc/passwd
root:x:0:0:root:/root:/bin/bash
......
user count is 40

count is a user-defined variable. The earlier action{} blocks contained only a single print; in fact print is just one statement, and an action{} can contain several statements, separated by semicolons.

count is not initialized here. Although it defaults to 0, the proper approach is to initialize it to 0:

awk 'BEGIN {count=0;print "[start]user count is ", count} {count=count+1;print $0;} END{print "[end]user count is ", count}' /etc/passwd
[start]user count is  0
root:x:0:0:root:/root:/bin/bash
...
[end]user count is  40

Sum the number of bytes occupied by the files in a directory

ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END{print "[end]size is", size}'
[end]size is 8657198

To show the result in MB:

ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END{print "[end]size is", size/1024/1024, "M"}'
[end]size is 8.25889 M

Note that this total does not include the contents of subdirectories.

Conditional statements

Conditional statements in awk are borrowed from C; the forms are as follows:

if (expression) {
    statement;
    statement;
    ... ...
}

if (expression) {
    statement;
} else {
    statement2;
}

if (expression) {
    statement1;
} else if (expression1) {
    statement2;
} else {
    statement3;
}
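A small runnable instance of the if / else if / else form. The three input numbers and the labels are hypothetical, chosen only so that every branch is taken once.

```shell
labels=$(printf '10\n5000\n4096\n' | awk '{
    if ($1 == 4096) {
        print $1, "directory-sized"
    } else if ($1 > 4096) {
        print $1, "large"
    } else {
        print $1, "small"
    }
}')

printf '%s\n' "$labels"
```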

Sum the number of bytes of the files in a directory, excluding entries whose size is 4096 (usually directories):

ls -l | awk 'BEGIN {size=0;print "[start]size is ", size} {if ($5!=4096) {size=size+$5;}} END{print "[end]size is", size/1024/1024, "M"}'
[end]size is 8.22339 M

Looping statements

Loop statements in awk are also borrowed from C: while, do/while, for, break, and continue are supported, with the same semantics as in C.
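The loop forms can be exercised in a BEGIN block, which needs no input file. This is a sketch: the three small computations and the variable names odd_sum, p, and n are invented purely to show each construct.

```shell
loops=$(awk 'BEGIN {
    # for with continue: add up the odd numbers from 1 to 9
    for (i = 1; i <= 9; i++) {
        if (i % 2 == 0) continue
        odd_sum += i
    }
    # while with break: double p until it exceeds 100
    p = 1
    while (1) {
        if (p > 100) break
        p *= 2
    }
    # do/while: the body runs before the condition is first tested
    n = 0
    do { n++ } while (n < 3)
    print odd_sum, p, n
}')

printf '%s\n' "$loops"
```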

Array

Because array subscripts in awk can be numbers or strings, a subscript is usually called a key. Values and keys are stored in an internal hash table of key/value pairs. Because a hash table is not stored in order, the contents of an array may not be displayed in the order you expect. Like variables, arrays are created automatically when first used, and awk automatically determines whether they hold numbers or strings. In general, arrays in awk are used to collect information from records: they can be used to compute sums, count words, track how many times a pattern is matched, and so on.
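A sketch of an array keyed by strings, counting how often each value occurs. The three input lines and the array name seen are made up for the example; the output is piped through sort because the for (key in array) order is not defined.

```shell
counts=$(printf '/bin/bash\n/bin/sh\n/bin/sh\n' | awk '
    { seen[$1]++ }                                   # the key is created on first use
    END { for (shell in seen) print shell, seen[shell] }
' | sort)

printf '%s\n' "$counts"
```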

Show the accounts in /etc/passwd

awk -F ':' 'BEGIN {count=0;} {name[count] = $1;count++;}; END{for (i = 0; i < NR; i++) print i, name[i]}' /etc/passwd
0 root
1 daemon
2 bin
3 sys
4 sync
5 games
......

Here a for loop is used to iterate over the array.
