Basic shell programming [3]-common tools awk \ sed \ sort \ uniq \ od, awkuniq

Source: Internet
Author: User

Basic shell programming [3]-common tools awk \ sed \ sort \ uniq \ od, awkuniq
Awk

Awk is a very useful thing. It is widely used in the presentation and processing of linux system analysis results. You can also use pipelines, input | awk ''| output1. first, you need to know the form of awk 'COMMAND 'file such as awk' {print $0} 'a.txt B .txt (one or more files can be followed) 2. command learning. Command is the essence of awk. Its structure is 'condition {action} condition 2 {Action 2 }...... 2.1 keyword learning: variable name meaning
Number of ARGC command line Variables
ARGV command line meta Array
FILENAME current input file name
Number of records in the current FNR File
The input field delimiter of FS. The default Delimiter is a space.
RS input record delimiter
Number of domains in the current NF record
No. of NR records so far
OFS output domain Separator
ORS output record separator 2.2 ConditionAnd Action. The conditions include: begin end, which indicates initialization and scanning. For example, $1 = "abc" $ NR = 5/^ tcp/(indicating regular expression matching) if this parameter is not specified, it indicates "full match ". From this perspective, conditions are essentially a filtering rule. Action: {print NR, NF, $1, $ NF,} {if (xxx) xxx; else xxx ;}{ for (key in array) xxx} 3. instance learning: view the number of connections established by the machine. netstat-n | awk '/^ tcp/{++ state [$ NF]} END {for (key in state) print key, "\ t", state [key]} 'check the number of bytes occupied by each connection --- apacheps aux | grep-v grep | awk'/httpd/{sum + = $6; n ++}; END {print sum/n} 'splits each row of the abc File Based on commas (,), and sorts the rows by the second column, the result is output to awk-F, '{print $1}' abc | sort-n-k 2-t:-r> abc-sort in abc-sort.

Explanation of the sort command:
-N is sorted by numbers.
-K is based on the second column
-T: separator:
-R is a flashback of git to view the changes to be submitted. git diff master HEAD -- stat | awk '{printf "% s \ n ", $1} '| grep domain | awk-F'/''{printf" % s \ n ", $ NF} '| sort batch rename ls * need to replace * | awk' {org = $0; gsub ("need to replace", "Replace "); system ("echo" org "" $0)} 'sed

Sed is used in many ways, but according to the above section, it is used to replace the most content.

Sed-I-e's/^ dubbo_provider_version =. * [^ e] $/&-pre/'/home/wuji/webroot-xxx/WEB-INF/classes/biz. properties. replace dubbo_provider_version = 1.0.0 with dubbo_provider_version = 1.0.0-pre with sed-e's/abc/def 'file.txt in properties to replace abc with def. The first part of the regular expression can be obtained in the second part. Note that we get all the content starting with dubbo, not the. * part. This is the knowledge of regular expressions. In addition, s can be extended to Example 2: Remove all html tags $ sed-e's/<[^>] *> // G' myfile.html g: if you do not add only Replace the first match, it will replace all the files that match sed-I directly instead of outputting vim on the screen. Friends who have learned vim can easily think of vim's command mode, there are also: s/abc/def writing, so a lot of linux knowledge can be passed by the class. Uniq

Uniq can remove duplicate rows or make group by statistical files file: aabbb sort file | uniq: AB sort file | uniq-c: 2 a3 B is combined with sort to bring all a together to prevent a following B. Uniq-d only displays duplicate,-c only displays non-duplicate, and the two are mutually exclusive. Uniq-dc only displays duplicates and statistics SortFunction Description: sorts text files. Syntax: sort [-bcdfimMnr] [-o <output file>] [-t <delimiter>] [+ <start column>-<End column>] [-- help] [-- verison] [file] supplementary instructions: sort sorts the content of a text file in the unit of action. Example: sort by the second letter of the first key column: $ sort-k 1.2 file.txt sort by the second letter of the first key column, if the second letter is the same, it is sorted in descending order according to the numerical value in the third column. $ Sort-k 1.2, 1.2-k 3, 3nr file.txt-k sorting field, distinguished by-t separator, starting from 0. -N is sorted by number format. If the default string method is used for comparison, the value 20 and the value 9 is larger than the value 9. -R Reverse Order-d sorting, except English letters, numbers, and spaces are processed, and other characters are ignored. -B ignores the leading space characters in each line. -U removes duplicate rows. ( You can use this to repeat) Note that the sort option is not particularly worth mentioning. The syntax format of the-k option is as follows: [FStart [. CStart] [Modifier] [, [FEnd [. CEnd] [Modifier] The syntax format can be divided into two parts by commas (,): Start and End. If the End part is not set, End is considered to be the End of the row. The Start part is also composed of three parts. The Modifier part is the option part similar to n and r we mentioned earlier. Let's focus on FStart and C. Start in Start. C. Start can also be omitted. If it is omitted, it indicates starting from the beginning of the local domain. In the previous example,-k 2 and-k 3 are examples that omit C. Start. FStart. CStart, where FStart indicates the domain used, and CStart indicates that the FStart field starts from the first character of the FStart field ". Similarly, in the End section, you can set FEnd. CEnd. If you omit. CEnd, it indicates the End to the End of the domain, that is, the last character of the local domain. Or, if you set CEnd to 0 (zero), it also indicates the end to the end of the "domain ".  Od

The od command is a tool used to analyze the file content. In most cases, you do not know the file content encoding. In this case, the most direct method is to use the od command to view the byte composition in the file. Usage: od-Ax-tcx4 file. It can BE used to analyze whether the character encoding is UTF-8, whether it is LE, and how to distinguish it. You also need to understand the rules of each encoding. For example, UTF-8 is generally used to display Chinese Characters in three bytes, while gbk is two.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.