4.1 Sorting text
4.1.1 Sorting lines
If no command-line options are given, sort compares entire records using the order defined by the current locale.
In the traditional C locale, that is ASCII order.
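As a small illustration of the locale dependence, the sketch below forces the C locale with LC_ALL=C. The word list is invented for the example.

```shell
# Sample input, invented for illustration.
words='banana
Apple
cherry
apple
Banana'

# Force the C locale so comparison is by byte value (ASCII order):
# every uppercase letter sorts before every lowercase letter.
c_sorted=$(printf '%s\n' "$words" | LC_ALL=C sort)
printf '%s\n' "$c_sorted"
```

In other locales the same input may come out with upper and lower case interleaved, so setting LC_ALL=C is a common way to get reproducible results in scripts.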
4.1.2 Sorting by field
The -k option is followed by a field number, or by a pair of field numbers separated by a comma.
Each number may be followed by a dotted character position, a modifier letter, or both.
If only one field number is given, the sort key starts at the beginning of that field
and continues to the end of the record (not the end of the field).
If a pair of field numbers is given, the sort key starts at the beginning of the first field
and ends at the end of the second field. A dot introduces a character position within a field:
-k2.4,5.6 compares from the fourth character of the second field through the sixth character of the fifth field.
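A minimal sketch of character positions inside a key follows. The records are invented for illustration; the key -k2.3,2.4n selects the third and fourth characters of the second field and compares them as numbers.

```shell
# Hypothetical records: a name, then a code whose last two characters are digits.
records='alice:ab30
bob:zz10
carol:mm20'

# -t: makes the colon the field separator.
# -k2.3,2.4n restricts the key to characters 3 through 4 of field 2, compared numerically.
by_digits=$(printf '%s\n' "$records" | LC_ALL=C sort -t: -k2.3,2.4n)
printf '%s\n' "$by_digits"
```

The letters at the start of the second field play no part in the ordering, because they lie outside the key.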
$ sort -t: -k1,1 /etc/passwd        # sort by user name
bin:x:1:1:bin:/sbin/nologin
chico:x:12501:1000:Chico Marx:/home/chico:/bin/bash
harpo:x:12502:1000:Harpo Marx:/home/harpo:/bin/ksh
...
$ sort -t: -k3,3nr /etc/passwd      # sort by UID, in reverse numeric order
The key can also be written as -k3nr,3 or as -k3,3 -n -r.
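The three spellings of the reverse numeric key can be compared on a small sample. This is only a sketch: the records below are made-up passwd-style lines rather than a real /etc/passwd.

```shell
# Hypothetical passwd-style records (7 colon-separated fields each).
sample='bin:x:1:1:bin:/bin:/sbin/nologin
chico:x:12501:1000:Chico Marx:/home/chico:/bin/bash
harpo:x:12502:1000:Harpo Marx:/home/harpo:/bin/ksh'

# Sort on field 3 only (the UID), numerically, highest first; keep just the user name.
by_uid=$(printf '%s\n' "$sample" | sort -t: -k3,3nr | cut -d: -f1)

# Equivalent spellings: modifiers on the start position, or as global options.
alt1=$(printf '%s\n' "$sample" | sort -t: -k3nr,3 | cut -d: -f1)
alt2=$(printf '%s\n' "$sample" | sort -t: -k3,3 -n -r | cut -d: -f1)

printf '%s\n' "$by_uid"
```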
4.1.3 Sorting blocks of text
Sometimes you need to sort data made up of multiline records. Take an address list as an example:
$ cat my-friends
# SORTKEY: Schlo, Hans Jurgen
Hans Jurgen Schlo
Unter den Linden 78
D-10117 Berlin
Germany
# SORTKEY: Jones, Adrian
...
Tip: use awk to recognize the paragraph breaks, and temporarily replace the line breaks inside each address with an otherwise unused character (here Ctrl-Z, shown as ^Z).
The lines that sort sees then look like this:
# SORTKEY: Schlo, Hans Jurgen^ZHans Jurgen Schlo^ZUnter den Linden 78^Z...
cat my-friends |                                     # read the address file
awk -v RS="" '{ gsub("\n", "^Z"); print }' |         # convert each address to a single line
sort -f |                                            # sort the addresses, ignoring case
awk -v ORS="\n\n" '{ gsub("^Z", "\n"); print }' |    # restore the line structure
grep -v '# SORTKEY'                                  # delete the markup lines
1. The gsub() function performs global substitution, similar to the s/x/y/g construct in sed.
2. The RS variable is the input record separator.
Input is normally separated by newlines, making each line a separate record.
RS="" is a special case: records are separated by blank lines.
3. ORS is the output record separator.
Note: the '{ action }' block is applied to each record, and RS and ORS control how records are delimited on input and output.
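The whole technique can be tried on a tiny address list. This is a sketch under stated assumptions: the two addresses are invented, and the Ctrl-Z character is written as the awk octal escape "\032" so that the script contains no literal control characters.

```shell
# Two hypothetical multiline records, separated by a blank line.
addresses='# SORTKEY: Smith, Zoe
Zoe Smith
1 Hypothetical Road

# SORTKEY: Jones, Adrian
Adrian Jones
2 Example Street'

sorted=$(printf '%s\n' "$addresses" |
  # RS="" reads blank-line-separated paragraphs; join each one with Ctrl-Z (octal 032).
  awk -v RS="" '{ gsub("\n", "\032"); print }' |
  # Each address is now one line, so sort can order them by the SORTKEY prefix.
  sort -f |
  # Turn Ctrl-Z back into newlines; ORS="\n\n" puts a blank line after each record.
  awk -v ORS="\n\n" '{ gsub("\032", "\n"); print }' |
  # Drop the markup lines that existed only to carry the sort key.
  grep -v '# SORTKEY')

printf '%s\n' "$sorted"
```

The Jones address comes out first, with a blank line still separating the two records.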
4.1.5 Sort stability
sort is not stable: records whose keys compare equal are not guaranteed to keep their input order.
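Where input order among equal keys matters, some implementations offer a way to request it. The sketch below assumes a sort that supports the -s option (GNU coreutils does; it is not required by POSIX), and uses invented data.

```shell
# Hypothetical records: a label and a numeric key; "b" and "a" share the key 2.
data='b 2
a 2
c 1'

# -s asks for a stable sort, so records with equal keys keep their input order.
# Assumption: the installed sort accepts -s (an extension, for example in GNU sort).
stable=$(printf '%s\n' "$data" | LC_ALL=C sort -s -k2,2n)
printf '%s\n' "$stable"
```

With -s, "b 2" stays ahead of "a 2" because that is how they appeared in the input; without it, the relative order of those two lines is up to the implementation.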
4.2 Removing duplicates
sort -u eliminates duplicates based on matching keys, not matching records.
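To see the difference between matching keys and matching records, consider this sketch with invented data: the two "apple" lines differ as records but share the same first field.

```shell
# Hypothetical records; the two apple lines are different records with the same key.
data='apple 1
pear 3
apple 2'

# With -k1,1 the key is the first field only, so -u keeps one line per distinct key.
unique_keys=$(printf '%s\n' "$data" | LC_ALL=C sort -u -k1,1)
kept_names=$(printf '%s\n' "$unique_keys" | cut -d' ' -f1)
key_count=$(printf '%s\n' "$unique_keys" | wc -l | tr -d ' ')

printf '%s\n' "$unique_keys"
```

Only two lines survive even though all three input records are distinct, so data in the non-key fields can be lost silently.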
uniq has three useful options:
-c: prefix each output line with the number of times it occurred.
-d: show only duplicated lines.
-u: show only lines that are not duplicated.
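A quick sketch of the three options on an invented, already sorted list follows. uniq only compares adjacent lines, so unsorted input would normally be piped through sort first.

```shell
# Hypothetical sorted input: a once, b twice, c three times.
items='a
b
b
c
c
c'

# -c: counts; awk normalizes the padding that uniq puts before each count.
counts=$(printf '%s\n' "$items" | uniq -c | awk '{ print $1, $2 }')
# -d: one copy of each line that occurs more than once.
dups=$(printf '%s\n' "$items" | uniq -d)
# -u: only the lines that occur exactly once.
singles=$(printf '%s\n' "$items" | uniq -u)

printf 'counts:\n%s\nduplicated:\n%s\nunique:\n%s\n' "$counts" "$dups" "$singles"
```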
4.3 Reformatting paragraphs
fmt -w 30        # refill paragraphs into lines at most 30 characters wide
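As a rough check of what the width option does, this sketch refills one invented sentence and measures the longest resulting line. It assumes fmt is installed; fmt is widely available but is not part of POSIX.

```shell
# Hypothetical one-line paragraph, much longer than 30 characters.
text='The quick brown fox jumps over the lazy dog and then keeps running through the forest until dusk.'

# Refill the paragraph so that no output line is wider than 30 characters.
wrapped=$(printf '%s\n' "$text" | fmt -w 30)

# Measure the result: length of the longest line and number of lines.
longest=$(printf '%s\n' "$wrapped" | awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }')
line_count=$(printf '%s\n' "$wrapped" | wc -l | tr -d ' ')

printf '%s\n' "$wrapped"
```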
4.4 Counting lines, words, and characters
By default, wc prints a one-line report giving the number of lines, words, and bytes.
Available options: -c (bytes), -l (lines), and -w (words).
$ echo Testing one two three | wc
1 4 22
$ wc /etc/passwd /etc/group
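Each option can also be used on its own to get a single number. A minimal sketch follows; tr strips the leading padding that some wc implementations print.

```shell
# Count bytes, words, and lines separately for the same input.
# echo appends a newline, which is included in the byte count.
bytes=$(echo Testing one two three | wc -c | tr -d ' ')
words=$(echo Testing one two three | wc -w | tr -d ' ')
lines=$(echo Testing one two three | wc -l | tr -d ' ')

printf 'bytes=%s words=%s lines=%s\n' "$bytes" "$words" "$lines"
```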
4.6 Extracting the first or last lines
To display the first n records of each file in a file list (n stands for the line count):
head -n n
head -n              # obsolete form, for example head -5
awk 'FNR <= n'
sed -e nq
sed nq
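With n replaced by a concrete number, the forms above can be compared on a single input stream. This sketch uses n = 3 and an invented five-line input; the obsolete head -3 form is left out because not every implementation still accepts it.

```shell
# Hypothetical input: the numbers 1 to 5, one per line.
input=$(printf '%s\n' 1 2 3 4 5)

# Three ways to keep only the first 3 lines.
with_head=$(printf '%s\n' "$input" | head -n 3)
with_awk=$(printf '%s\n' "$input" | awk 'FNR <= 3')   # FNR is the record number within the current file
with_sed=$(printf '%s\n' "$input" | sed 3q)           # print lines until line 3, then quit

printf '%s\n' "$with_head"
```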
To watch a growing system log, use tail -f, and press Ctrl-C to stop tail:
$ tail -n 25 -f /var/log/messages
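The line-count part of that command can be tried without a live log. The sketch below writes an invented five-line file to a temporary location and takes its last two lines; -f is left out because it would keep the command running until interrupted.

```shell
# Create a small stand-in for a log file (contents are invented).
log=$(mktemp)
printf 'line %s\n' 1 2 3 4 5 > "$log"

# -n 2 prints only the last two lines of the file.
last_two=$(tail -n 2 "$log")
printf '%s\n' "$last_two"

# Clean up the temporary file.
rm -f "$log"
```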