Linux Command Text Processing (2)
Cut command
When the cut command is used to operate on the columns of a file, it can be regarded as a column editor. It is the counterpart of most "line editors", such as sed, grep, and sort, which process text one line at a time.
The main function of cut is to output one or several columns of text. For English text, a single character occupies one column, so outputting columns amounts to outputting the characters at those positions.
The main options are as follows:
-c: specifies the columns (characters) to output, either a single number or a range such as 3-5.
m@meng:~$ cat new
apple 3
Apple 7
pear 6
pear 4
banana 1
orange 11
m@meng:~$ cut -c 1-6 new
apple
Apple
pear
pear
banana
orange
-b: specifies the bytes to output from each line. It is similar to the -c option, and for English text the two give the same result, because an English letter is one byte (strictly speaking, I think "ASCII text" is more accurate than "English text").
m@meng:~$ cut -b 3 new
p
p
a
a
n
a
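The -c and -b options described above can be tried on a single line. This is only a minimal sketch: the sample line is made up for illustration and is plain ASCII, so characters and bytes coincide.

```shell
# Assumed sample input: plain ASCII, so one character is one byte.
line='banana 1'

# Characters 1-6 (a range) and byte 3 (a single position).
chars=$(printf '%s\n' "$line" | cut -c 1-6)
byte=$(printf '%s\n' "$line" | cut -b 3)

# Several positions or ranges can be combined with commas.
mixed=$(printf '%s\n' "$line" | cut -c 1,3-4)

printf '%s\n' "$chars" "$byte" "$mixed"
```

The variable names (line, chars, byte, mixed) are hypothetical and not part of the original examples.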
-f: the more powerful part of cut is processing formatted text, that is, text in which each line can be divided into several fields. Many commands, such as sort, offer something similar, but only barely: their handling of delimiters is too weak. In this regard, awk is far ahead.
The -f option specifies which fields to output. The default delimiter is a tab. In the following example it looks as if runs of spaces are recognized as well, although this is not reliable. I will look at the separator issue more closely if I have time.
m@meng:~$ cut -f 2 new
3
7
6
4
1
11
As far as I can tell, the separator between the name and the number in the file new is not a tab, yet cut split the fields correctly. This does not always work, however. For example:
m@meng:/etc/network$ sudo netstat -apn | sed '3,6 p' -n | cut -f 1
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 2899/sendmail: MTA:
tcp 0 0 127.0.0.1:953 0.0.0.0:* LISTEN 1192/named
tcp 0 0 0.0.0.0:538 0.0.0.0:* LISTEN 1251/gdomap
tcp 0 0 0.0.0.0:445 0.0.0.0:* LISTEN 672/smbd
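How does the source code handle this??? A likely explanation is that the netstat output contains no tabs, and cut -f without -s prints any line that lacks the delimiter unchanged.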
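The delimiter behaviour discussed above can be checked with a small sketch. The sample lines are made up for illustration (they are not the contents of the file new), and tr and awk are used here only as possible workarounds for space-separated input.

```shell
# Hypothetical sample lines: one tab-separated, one separated by three spaces.
tabbed=$(printf 'apple\t3\n' | cut -f 2)

# No tab in the line, so cut -f prints the whole line unchanged.
spaced=$(printf 'apple   3\n' | cut -f 2)

# One space as delimiter: every single space starts a new field,
# so with three spaces field 2 is empty.
single=$(printf 'apple   3\n' | cut -d ' ' -f 2)

# Squeeze runs of spaces first so that field 2 is the number.
squeezed=$(printf 'apple   3\n' | tr -s ' ' | cut -d ' ' -f 2)

# awk splits on any run of blanks by default.
with_awk=$(printf 'apple   3\n' | awk '{ print $2 }')

printf '[%s]\n' "$tabbed" "$spaced" "$single" "$squeezed" "$with_awk"
```

This fits the remark about awk being ahead: cut treats each delimiter character literally, while awk merges runs of blanks.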
-d: specifies the delimiter and is generally used together with -f. The delimiter can only be a single character.
-s: outputs only the lines that contain the delimiter. It overrides part of the behavior of -f: with -f alone, lines that do not contain the delimiter are output as well, and once -s is added, those lines are dropped.
--output-delimiter=str: sets the output delimiter to str. By default it is the same as the input delimiter.
m@meng:~$ cut --output-delimiter=: -f 1-2 new
apple: 3
Apple: 7
pear: 6
pear: 4
banana: 1
orange: 11
m@meng:~$ cut --output-delimiter=: -c 1-4 new
appl
Appl
pear
pear
bana
oran
Obviously, this option only takes effect between separate fields or ranges; a single range such as -c 1-4 has nothing to separate.
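The three options can be combined as in the following sketch. The input is hypothetical (colon-separated records plus one line without a colon), and --output-delimiter is assumed to be available, as it is in GNU cut.

```shell
# Hypothetical input: two colon-separated records and one line without a colon.
data=$(printf 'root:x:0\nmeng:x:1000\nno delimiter here\n')

# -d sets the input delimiter; the line without ":" is passed through whole.
plain=$(printf '%s\n' "$data" | cut -d : -f 1,3)

# -s drops lines that do not contain the delimiter.
strict=$(printf '%s\n' "$data" | cut -d : -f 1,3 -s)

# --output-delimiter (GNU cut) replaces ":" between the selected fields.
joined=$(printf '%s\n' "$data" | cut -d : -f 1,3 -s --output-delimiter=' -> ')

printf '%s\n' "$plain" '---' "$strict" '---' "$joined"
```

Unlike -d, the output delimiter may be longer than one character, which is why ' -> ' works here.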
Uniq command
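uniq detects duplicate lines in text, similar to the -u option of sort. It only compares adjacent lines, so identical lines that are not next to each other are not treated as duplicates.
-d: displays only the duplicated lines, one copy for each group.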
m@meng:~$ cat new
apple 3
apple 3
Apple 7
pear 6
pear 4
banana 1
orange 11
m@meng:~$ uniq -d new
apple 3
-c, --count: prefixes each line with the number of times it occurs.
m@meng:~$ uniq -c new
      2 apple 3
      1 Apple 7
      1 pear 6
      1 pear 4
      1 banana 1
      1 orange 11
-i: ignores case when comparing lines.
-u: outputs only the lines that are not repeated.
m@meng:~$ uniq -u new
Apple 7
pear 6
pear 4
banana 1
orange 11
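The options above can be tried together in a short sketch. The input is made up so that the two identical lines are not adjacent, which shows why uniq is usually combined with sort. The -i option is assumed to be available, as it is in GNU uniq, and LC_ALL=C is set only to make the sort order predictable.

```shell
# Hypothetical input in which the two "apple 3" lines are not adjacent.
data=$(printf 'apple 3\nApple 7\napple 3\npear 6\n')

# uniq alone finds no duplicates, because only adjacent lines are compared.
unsorted_dups=$(printf '%s\n' "$data" | uniq -d)

# Sorting first makes identical lines adjacent.
dups=$(printf '%s\n' "$data" | LC_ALL=C sort | uniq -d)
singles=$(printf '%s\n' "$data" | LC_ALL=C sort | uniq -u)

# -c prefixes counts; sed strips the leading padding for readability.
counts=$(printf '%s\n' "$data" | LC_ALL=C sort | uniq -c | sed 's/^ *//')

# -i treats lines that differ only in case as equal.
names=$(printf 'apple\nApple\nAPPLE\npear\n' | uniq -i)

printf '%s\n' "$dups" '---' "$singles" '---' "$counts" '---' "$names"
```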
These are the main options. Others, such as -s and -f, skip a given number of characters or fields before comparing lines.
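As a rough illustration of -f and -s, the sketch below uses hypothetical log lines whose first field is an id that should be left out of the comparison.

```shell
# Hypothetical log lines: a numeric id field followed by a message.
data=$(printf '01 start\n02 start\n03 stop\n')

# -f 1 skips the first blank-separated field before comparing.
by_field=$(printf '%s\n' "$data" | uniq -f 1)

# -s 3 skips the first three characters ("01 ") before comparing.
by_chars=$(printf '%s\n' "$data" | uniq -s 3)

printf '%s\n' "$by_field" '---' "$by_chars"
```

In both cases the first line of each group is the one that is kept, so "02 start" disappears from the output.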