The sort and uniq commands are used often when working with files on Linux. This article describes how to use the two commands.
The sort command is very useful on Linux: it sorts the lines of a file and writes the sorted result to standard output. sort can read its input from a specified file or from stdin.
Syntax
sort (options) (parameters)
Options
-b: Ignores leading whitespace on each line;
-c: Checks whether the file is already sorted;
-d: When sorting, considers only letters, digits, and blank characters and ignores all others;
-f: When sorting, treats lowercase letters as uppercase letters;
-i: When sorting, ignores all characters except the ASCII characters between octal 040 and 176;
-m: Merges several files that are already sorted;
-M: Sorts by month, using the three-letter month abbreviation;
-n: Sorts by numeric value;
-o <output file>: Writes the sorted result to the specified file;
-r: Sorts in reverse order;
-t <delimiter>: Specifies the field separator to use when sorting;
+<start field> -<end field>: Sorts by the specified fields, from the start field up to the field before the end field. This is an obsolete form; current versions of sort use -k <start field>[,<end field>] instead.
Parameters
File: Specifies the list of files to be sorted.
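The -c, -o, and -m options are easier to follow with a small run. The following is a minimal sketch, assuming GNU sort; the file names fruits.txt and more.txt and their contents are hypothetical sample data, not part of the original example.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' banana apple cherry > fruits.txt   # hypothetical sample data

# -c only checks: exit status 0 means already sorted, non-zero means not.
if sort -c fruits.txt 2>/dev/null; then
    check_before="sorted"
else
    check_before="unsorted"
fi

# -o writes the result to a file; the output file may be the input file itself.
sort -o fruits.txt fruits.txt
sort -c fruits.txt && check_after="sorted"

# -m merges files that are each already sorted, without re-sorting them.
printf '%s\n' apricot blueberry > more.txt
merged=$(sort -m fruits.txt more.txt)
echo "$merged"
```

Because -c reports its result through the exit status, it fits naturally into an if statement in a script.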
Examples
sort treats each line of the file or text as a unit and compares lines character by character, starting from the first character, by ASCII value, then outputs them in ascending order.
$ cat sort.txt
aaa:10:1.1
ccc:30:3.3
ddd:40:4.4
bbb:20:2.2
eee:50:5.5
eee:50:5.5
$ sort sort.txt
aaa:10:1.1
bbb:20:2.2
ccc:30:3.3
ddd:40:4.4
eee:50:5.5
eee:50:5.5
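Two points mentioned earlier can be checked directly: sort reads stdin when no file is given, and comparison by ASCII value puts uppercase letters before lowercase ones. This is a small sketch; setting LC_ALL=C is an assumption added here to force byte-value comparison, because in other locales the order may differ.

```shell
# sort reads standard input when no file is given.
from_stdin=$(printf '%s\n' ccc aaa bbb | sort)
echo "$from_stdin"

# LC_ALL=C forces plain byte-value comparison, so uppercase letters
# (ASCII 65-90) sort before lowercase letters (ASCII 97-122).
byte_order=$(printf '%s\n' b A a B | LC_ALL=C sort)
echo "$byte_order"
```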
Remove duplicate lines with the -u option or with uniq:
$ cat sort.txt
aaa:10:1.1
ccc:30:3.3
ddd:40:4.4
bbb:20:2.2
eee:50:5.5
eee:50:5.5
$ sort -u sort.txt
aaa:10:1.1
bbb:20:2.2
ccc:30:3.3
ddd:40:4.4
eee:50:5.5
Or
$ uniq sort.txt
aaa:10:1.1
ccc:30:3.3
ddd:40:4.4
bbb:20:2.2
eee:50:5.5
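uniq works on the file above only because the two identical lines are next to each other: uniq compares each line with the one before it. The sketch below uses hypothetical input in which the duplicates are separated, to show the difference between uniq alone, sort piped to uniq, and sort -u.

```shell
# Hypothetical input in which the duplicate lines are not adjacent.
input=$(printf '%s\n' eee aaa eee bbb)

plain_uniq=$(printf '%s\n' "$input" | uniq)          # nothing is removed
sorted_uniq=$(printf '%s\n' "$input" | sort | uniq)  # duplicates are removed
sort_u=$(printf '%s\n' "$input" | sort -u)           # same result in one step

echo "$plain_uniq"
echo "$sorted_uniq"
```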
Using the -n, -r, -k, and -t options of sort:
$ cat sort.txt
AAA:BB:CC
aaa:30:1.6
ccc:50:3.3
ddd:20:4.2
bbb:10:2.5
eee:40:5.4
eee:60:5.1
# Sort the BB column numerically in ascending order:
$ sort -nk 2 -t: sort.txt
AAA:BB:CC
bbb:10:2.5
ddd:20:4.2
aaa:30:1.6
eee:40:5.4
ccc:50:3.3
eee:60:5.1
# Sort the CC column numerically in descending order:
$ sort -nrk 3 -t: sort.txt
eee:40:5.4
eee:60:5.1
ddd:20:4.2
ccc:50:3.3
bbb:10:2.5
aaa:30:1.6
AAA:BB:CC
# -n sorts by numeric value, -r reverses the order, -k specifies the field to sort on, and -t specifies the field delimiter, here a colon
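A key given as -k 2 runs from field 2 to the end of the line. Writing -k2,2 limits the key to that single field, which is usually what is meant. The sketch below reuses the sample data shown above; keeping the header line out of the sort with head and tail is an illustrative choice, not something the original example does.

```shell
data='AAA:BB:CC
aaa:30:1.6
ccc:50:3.3
ddd:20:4.2
bbb:10:2.5
eee:40:5.4
eee:60:5.1'

# -k2,2n: the key starts and ends at field 2 and is compared numerically.
by_second=$(printf '%s\n' "$data" | sort -t: -k2,2n)
echo "$by_second"

# Keep the header line aside and sort only the data rows,
# here by field 3 in descending numeric order.
header=$(printf '%s\n' "$data" | head -n 1)
rows=$(printf '%s\n' "$data" | tail -n +2 | sort -t: -k3,3nr)
printf '%s\n%s\n' "$header" "$rows"
```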
The uniq command reports or omits duplicate lines in a file. Because it only compares adjacent lines, it is generally used together with the sort command.
Syntax
uniq (options) (parameters)
Options
-c or --count: Prefixes each line with the number of times it occurs;
-d or --repeated: Displays only the lines that are repeated, once for each group;
-f <number of fields> or --skip-fields=<number of fields>: Skips the specified number of leading fields when comparing;
-s <number of characters> or --skip-chars=<number of characters>: Skips the specified number of leading characters when comparing;
-u or --unique: Displays only the lines that are not repeated;
-w <number of characters> or --check-chars=<number of characters>: Compares at most the specified number of characters per line.
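The three comparison options are easiest to tell apart side by side. This is a rough sketch, assuming GNU uniq (the -w option is a GNU extension); the input lines are made up for illustration.

```shell
# -w 3: compare only the first 3 characters of each line.
first_three=$(printf '%s\n' 'abc 1' 'abc 2' 'abd 3' | uniq -w 3)

# -f 1: skip the first blank-separated field before comparing.
skip_field=$(printf '%s\n' '10:01 login' '10:05 login' '10:09 logout' | uniq -f 1)

# -s 2: skip the first 2 characters before comparing.
skip_chars=$(printf '%s\n' '1-x' '2-x' '3-y' | uniq -s 2)

echo "$first_three"
echo "$skip_field"
echo "$skip_chars"
```

In each case uniq keeps the first line of a run of lines that compare equal and drops the rest.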
Example
To delete duplicate lines (uniq alone only removes adjacent duplicates, so sort first if the file is not sorted):
uniq file.txt
sort file.txt | uniq
sort -u file.txt
Show only the lines that appear exactly once:
uniq -u file.txt
sort file.txt | uniq -u
Count the number of times each line appears in the file:
sort file.txt | uniq -c
Find duplicate lines in the file:
sort file.txt | uniq -d
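A common extension of the counting example is to rank lines by how often they occur, by piping the counts into a second, numeric sort. The sketch below uses a hypothetical word list; awk appears only to print the padded count column in a fixed form.

```shell
words=$(printf '%s\n' pear apple pear fig apple pear)

# Count each distinct line, then sort the counts numerically, largest first.
ranked=$(printf '%s\n' "$words" | sort | uniq -c | sort -nr)
echo "$ranked"

# uniq -c pads the count column with spaces; awk is used here only to
# print "count word" pairs without the padding.
ranked_clean=$(printf '%s\n' "$ranked" | awk '{print $1, $2}')
echo "$ranked_clean"
```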
Using sort and uniq to find the union, intersection, and difference of two files (assuming neither file contains duplicate lines of its own)
Union: cat file1.txt file2.txt | sort | uniq > file.txt
Intersection: cat file1.txt file2.txt | sort | uniq -d > file.txt
Difference (lines found in only one of the two files): cat file1.txt file2.txt | sort | uniq -u > file.txt
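These pipelines give the expected sets only when no line is repeated inside a single file. The following sketch, with hypothetical file contents, deduplicates each file with sort -u first. The last command is an additional variation not in the original: listing the second file twice leaves only the lines that occur in the first file alone.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' a b c c > file1.txt   # hypothetical contents; "c" is repeated
printf '%s\n' b c d   > file2.txt

# Deduplicate each file first so that a line repeated inside one file
# is not mistaken for a line shared by both files.
sort -u file1.txt > s1.txt
sort -u file2.txt > s2.txt

union=$(cat s1.txt s2.txt | sort | uniq)
intersection=$(cat s1.txt s2.txt | sort | uniq -d)
only_one=$(cat s1.txt s2.txt | sort | uniq -u)

# Lines in file1 but not in file2: file2 is listed twice, so none of
# its lines can be unique in the combined input.
only_in_1=$(cat s1.txt s2.txt s2.txt | sort | uniq -u)

echo "$union"
```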