Transferred from: http://blog.csdn.net/wklken/article/details/6562098
sort, uniq, join, cut, paste, split
Command--sort
sort sorts the lines of a text file, on one or more fields, in a chosen order.
Command format:
sort -cmu -o output-file [other options] +pos1 -pos2 input-files
Options:
-c     test whether the file is already sorted
-m     merge two sorted files
-u     remove all duplicate lines
-o     name the output file for the sorted result
-b     when sorting on fields, ignore leading blanks
-n     sort the given field numerically
-t     field delimiter; use a non-blank, non-tab character to split fields
-r     reverse the sort order (invert comparisons)
+n     n is a field number; sorting starts from this field
-n     n is a field number; comparison stops before this field, generally used with +n
+m.n   m is the field number, n is the number of characters into that field at which sorting starts
Example:
1. Save the output
$ sort -o result sortfile
$ sort sortfile > result
By default sort treats one or more spaces as the field delimiter. To use any other delimiter you must pass -t: when it runs, sort first checks for -t and, if present, splits fields on that character; otherwise it splits on whitespace.
2. Test whether the file is sorted
$ sort -c sortfile
3. Use another separator
$ sort -t: sortfile
4. Reverse the order
$ sort -t: -r sortfile
5. Unique sort: remove lines duplicated in the original file
$ sort -u sortfile
6. Specify the sort key field with -k (field numbering starts at 1)
$ sort -t: -k4 sortfile
$ sort -t: -k4 -k1 sortfile
7. Specify the sort sequence (old +pos syntax; fields number from 0)
$ sort +0 -2 +3 sortfile
8. Merge two sorted files
$ sort -m sorted_file1 sorted_file2
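The field options above can be combined in one command line. A minimal sketch, using a hypothetical file names.txt (name:uid records, colon-delimited) invented for illustration:

```shell
# Hypothetical sample data: colon-delimited name:uid records
printf 'carol:30\nalice:10\nbob:20\n' > names.txt

# -t: sets the delimiter, -k2 picks the second field,
# -n sorts numerically, -r reverses the order
sort -t: -k2 -n -r names.txt

# -c only checks the order; it exits non-zero when the file is not sorted
sort -c names.txt 2>/dev/null || echo "names.txt is not sorted"
```

The first command prints carol:30, bob:20, alice:10; the -c check reports the file as unsorted because the original lines are in no particular order.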
Command--uniq
uniq removes or suppresses duplicate lines from a text file. uniq normally assumes the input is sorted, and the result is only correct for sorted input [sort -u is the uniqueness option that removes all duplicate lines].
Duplicate lines, for uniq, means lines that repeat consecutively.
Format: uniq -udc -f n input-file output-file
Options:
-u    display only lines that are not repeated
-d    display only duplicated lines, one copy of each
-c    print each line prefixed with its number of occurrences
-f n  n is a number; ignore the first n fields when comparing
1. Show only non-repeated lines
$ uniq -u sortfile
2. Extract non-repeated lines to a file
$ uniq -u sortfile result
3. Show only duplicated lines
$ uniq -d sortfile
4. Print lines with their occurrence counts
$ uniq -c sortfile
5. Ignore the specified leading fields when comparing
$ uniq -f2 parts.txt
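The consecutive-duplicates rule above is easy to see on a small file. A sketch with a hypothetical fruit.txt, already sorted so that duplicates sit next to each other:

```shell
# Hypothetical sorted input with runs of duplicates
printf 'apple\napple\nbanana\ncherry\ncherry\ncherry\n' > fruit.txt

uniq -c fruit.txt    # each run with its count: 2 apple, 1 banana, 3 cherry
uniq -d fruit.txt    # only the duplicated lines, once each: apple, cherry
uniq -u fruit.txt    # only the line that never repeats: banana
```

If the file were unsorted, the apple and cherry runs could be broken up and uniq would miss them, which is why the text says to sort first.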
Command--join
Connect lines from two sorted text files
Precondition: file1 and file2 are already sorted
Each file has some field that relates it to the other file -- a join key
Somewhat like a relational (set) join
Note: keep the number of text fields below 20 when using join
Format: join [options] input-file1 input-file2
Options:
-a n    n is a file number; also display unmatched lines from file n; -a1 shows the unmatched lines of the first file
-o n.m  n is the file number, m is the field number; -o 1.3 displays only the third field of file 1
-j n m  n is the file number, m is the field number; join on a field other than the default
-t      field delimiter; set a non-blank, non-tab separator
1. Join two files [the default join field is the first field]
$ join name.txt town.txt
2. Show unmatched lines from the first file
$ join -a1 name.txt town.txt
3. Choose the fields shown in the join result
$ join -o 1.1,2.2 name.txt town.txt
Displays the first field of the first file and the second field of the second file.
4. Set the join fields
$ join -j1 3 -j2 2 file1 file2
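The examples above can be reproduced end to end. A sketch with hypothetical contents for name.txt and town.txt (the filenames come from the text; the data is invented), both pre-sorted on the join field:

```shell
# Hypothetical sorted inputs sharing "alice" as a join key
printf 'alice 25\nbob 30\n'       > name.txt
printf 'alice london\ncarol paris\n' > town.txt

join name.txt town.txt              # joins on field 1: "alice 25 london"
join -a1 name.txt town.txt          # also prints the unmatched "bob 30" line
join -o 1.1,2.2 name.txt town.txt   # only field 1 of file 1 and field 2 of file 2
```

Note that "bob" and "carol" each appear in only one file, so they drop out of the default join; -a1 brings the file-1 orphan back.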
Command--cut
Used to cut columns or fields from a standard input or text file
You can paste the clipped text into another file
Format: cut [options] file1 file2
Options:
-c list   specify the character positions to cut
-f field  specify the fields to cut
-d        specify a delimiter other than space/tab
-c        character ranges: -c1,5-7 cuts characters 1, 5, 6, 7; -c1-50 cuts the first 50 characters
-f        field ranges: -f1,5 cuts fields 1 and 5 (two fields); -f1,10-12 cuts fields 1, 10, 11, 12 (four fields)
1. Using a field delimiter
$ cut -d: -f3 data
[root@localhost temp]# cut -d: -f1 /etc/passwd | head -5
root
bin
daemon
adm
lp
-d: tells cut to use : as the delimiter; -f1 means the first field.
2. Cut the specified fields
$ cut -d: -f1,3 data    # take the first and third field of each line
3. Cut characters
$ who -u | cut -c1-8
[root@localhost temp]# who -u
root     tty1         2011-10-19 22:09   old      2463 (:0)
root     pts/0        2011-11-04 08:48   .        7804 (192.168.0.86)
root     pts/2        2011-10-31 09:25   old     18934 (:0.0)
root     pts/3        2011-10-31 09:47   old     18934 (:0.0)
[root@localhost temp]# who -u | cut -c1-8
root
root
root
root
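Both cutting modes above can be tried without touching /etc/passwd. A sketch using a hypothetical pw.txt with two passwd-style lines invented for illustration:

```shell
# Hypothetical passwd-style sample: user:pw:uid:gid
printf 'root:x:0:0\nbin:x:1:1\n' > pw.txt

cut -d: -f1 pw.txt     # first field of each line: root, bin
cut -d: -f1,3 pw.txt   # fields 1 and 3: root:0, bin:1
cut -c1-3 pw.txt       # first three characters of each line: roo, bin
```

Note the difference: -f counts delimiter-separated fields, while -c counts raw character positions regardless of any delimiter.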
Command: paste
paste merges data from related files line by line
When the data come from two different sources, sort both files first and make sure they have the same number of lines
Format: paste -d -s file1 file2
Options:
-d  specify a delimiter other than the default tab
-s  merge each file into a single line instead of pasting line by line
file1:
1
2
file2:
A
B
1. Merge the files
$ paste file1 file2
1 A
2 B
2. Specify a delimiter
$ paste -d: file2 file1
A:1
B:2
3. Merge each file into a single row instead of pasting line by line
$ paste -s file1 file2
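The file1/file2 example above runs as shown. A sketch reproducing it, including the delimiter and serial variants:

```shell
# Recreate the file1/file2 sample from the text
printf '1\n2\n' > file1
printf 'A\nB\n' > file2

paste file1 file2       # default tab delimiter: "1<TAB>A", "2<TAB>B"
paste -d: file2 file1   # custom delimiter and reversed order: A:1, B:2
paste -s file1 file2    # -s turns each file into one row: "1<TAB>2", "A<TAB>B"
```

Swapping the file order on the command line swaps the column order in the output, which is why the -d: example prints A:1 rather than 1:A.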
Command: split
Used to cut files into small files
Format: split -output_file_size input_filename output_filename
where output_file_size is the number of lines per output file, default 1000
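A sketch of splitting a six-line file into two-line pieces; it uses the -l flag (the modern spelling of the numeric line-count option described above), and big.txt is a hypothetical input invented for the example:

```shell
# Hypothetical six-line input file
printf 'l1\nl2\nl3\nl4\nl5\nl6\n' > big.txt

# -l 2: two lines per piece; "piece_" is the output-name prefix,
# producing piece_aa, piece_ab, piece_ac
split -l 2 big.txt piece_

ls piece_*            # list the generated pieces
wc -l piece_aa        # each piece holds two lines
```

With no prefix argument, split names the pieces xaa, xab, and so on.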