Common shell file processing commands

Source: Internet
Author: User

Merge and split files:
1. sort
Command format:
sort -cmu -o output_file [other options] +pos1 +pos2 input_file
-c: check whether the file is already sorted.
-m: merge two sorted files.
-u: delete all duplicate lines.
-o: name of the output file that stores the sort result.
Other options include:
-b: ignore leading blanks when sorting by field.
-n: sort the field numerically.
-t: field separator; specify a character other than space or tab to separate fields.
-r: reverse the sort order.
+n: n is the field number at which sorting starts. [The first field is field 0.]
pos1 and pos2 take the form m.n, where m is the field number and n is the number of characters to skip before sorting starts. For example, 4.6 means that sorting uses the 5th field, starting from its 7th character.
Note: the +pos form is obsolete and may not be accepted by current sort implementations. The equivalent option is -k, which counts fields from 1 (for example, +2 becomes -k 3).

$ sort f1.txt
$ sort -t: f2.txt
For numeric fields, use
$ sort +0n f1.txt    # 0 indicates the 1st field. Any field can be used.
$ sort +2 f1.txt     # sort by the 3rd field
Remove duplicate lines when sorting:
$ sort -u f1.txt
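
The field options above can be tried with the -k form that current sort implementations accept. The following is a minimal sketch, not the article's own example: the sample file contents (name:count pairs) are made up for illustration.

```shell
# Hypothetical sample data: name:count pairs (not taken from the article).
tmp=$(mktemp -d)
printf '%s\n' 'tfj:483' 'zyc:52' 'abc:7' 'zyc:52' > "$tmp/f2.txt"

# -t: sets the separator; -k2,2n sorts numerically on the 2nd field only.
# With -k, fields are counted from 1 (the +pos form counts from 0).
by_count=$(sort -t: -k2,2n "$tmp/f2.txt")

# The r modifier reverses this key; -u keeps one line per distinct key.
unique_desc=$(sort -t: -k2,2nr -u "$tmp/f2.txt")

printf '%s\n' "$by_count"
printf '%s\n' "$unique_desc"
rm -rf "$tmp"
```

The reverse flag is attached to the key (nr) because a key that carries its own ordering letters does not inherit a separate global -r.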

$ head -N f1.txt     # N is a number: display the first N lines of the file
$ tail -N f1.txt     # display the last N lines of the file
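
As a quick illustration of head and tail, the sketch below feeds five made-up lines through each command, using the -n form of the line-count option.

```shell
# Five illustrative input lines (assumed data, not from the article).
first_two=$(printf '%s\n' a b c d e | head -n 2)   # first 2 lines
last_two=$(printf '%s\n' a b c d e | tail -n 2)    # last 2 lines

printf '%s\n' "$first_two"
printf '%s\n' "$last_two"
```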

$ sort +0nr f1.txt | head -1 | awk '{print "worst case", $1, "has been rented by", $2}'
worst case 483 has been rented by tfj

Merge files:
$ sort -m file1 file2
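
Merging only gives a sorted result when each input is already sorted, which -c can verify first. A minimal sketch, assuming two small made-up files:

```shell
# Two already-sorted sample files (contents are illustrative).
tmp=$(mktemp -d)
printf '%s\n' apple cherry > "$tmp/file1"
printf '%s\n' banana date > "$tmp/file2"

# sort -c exits with status 0 only when the input is already sorted.
if LC_ALL=C sort -c "$tmp/file1" && LC_ALL=C sort -c "$tmp/file2"; then
    # -m interleaves the sorted inputs without re-sorting them.
    merged=$(LC_ALL=C sort -m "$tmp/file1" "$tmp/file2")
fi

printf '%s\n' "$merged"
rm -rf "$tmp"
```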

Extract and sort the usernames in /etc/passwd:
$ sort -t: +0 /etc/passwd | awk -F: '{print $1}'

Sort by the last field of the IP address:
$ sort -t. +3n iplist_file     # each line holds a dotted IP address of the form nnn.nnn.nnn.nnn
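
The same sort by the last octet can be written with -k on current systems. This is a sketch under the assumption that each line contains only a dotted IPv4 address; the addresses are made up.

```shell
# Hypothetical address list; only the last octet differs.
tmp=$(mktemp -d)
printf '%s\n' '10.0.0.20' '10.0.0.3' '10.0.0.100' > "$tmp/iplist_file"

# -t. splits on dots; -k4,4n compares the 4th field as a number.
sorted_ips=$(sort -t. -k4,4n "$tmp/iplist_file")

printf '%s\n' "$sorted_ips"
rm -rf "$tmp"
```

Without the n modifier the comparison would be textual, and 10.0.0.100 would sort before 10.0.0.20.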

uniq: delete *consecutive* repeated lines
Format: uniq -udc -f n input_file [output_file]
Option description:
-u: display only lines that are not repeated.
-d: display only repeated lines, one copy of each.
-c: prefix each line with the number of times it occurs.
-f n: n is a number; the first n fields are ignored when comparing lines.
Some systems do not recognize the -f option. In this case, use -n instead (the number itself as the option).
$ more a.txt
May
May
May
Haha
May
$ uniq a.txt
May
Haha
May
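
The options can be compared on the same five lines shown above. In this sketch the awk step only normalizes the padding that uniq -c puts before each count, and the final pipeline shows the usual way to remove non-adjacent duplicates by sorting first.

```shell
# Recreate the a.txt sample from the example above.
tmp=$(mktemp -d)
printf '%s\n' May May May Haha May > "$tmp/a.txt"

# -c: count each run of identical adjacent lines (padding normalized by awk).
counts=$(uniq -c "$tmp/a.txt" | awk '{print $1, $2}')
# -d: one copy of each run that repeats.
repeated=$(uniq -d "$tmp/a.txt")
# -u: only runs of length one.
singles=$(uniq -u "$tmp/a.txt")
# Sorting first makes all duplicates adjacent, so uniq removes them all.
distinct=$(LC_ALL=C sort "$tmp/a.txt" | uniq)

printf '%s\n' "$counts" "$repeated" "$singles" "$distinct"
rm -rf "$tmp"
```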

join: join files
The two files must share a common field (similar to the join operation of a database).
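
A minimal sketch of join, assuming two made-up files that share their first field and are both sorted on it (join expects sorted input):

```shell
# Hypothetical files keyed by user name in the first field.
tmp=$(mktemp -d)
printf '%s\n' 'tfj id111' 'zyc id222' > "$tmp/ids"
printf '%s\n' 'tfj /bin/bash' 'zyc /bin/sh' > "$tmp/shells"

# By default join matches on the first field of each file and prints
# the key followed by the remaining fields of both lines.
joined=$(join "$tmp/ids" "$tmp/shells")

printf '%s\n' "$joined"
rm -rf "$tmp"
```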

cut: extract file content
The general format of cut is:
cut [options] file1 file2
The following describes the available options:
-c list: specifies the character positions to cut.
-f field: specifies the fields to cut.
-d: specifies the field separator when it is not the default (tab).
-c specifies the range to cut, as shown below:
-c1,5-7: cut the 1st character, followed by the 5th to 7th characters.
-c1-50: cut the first 50 characters.
The -f format is the same as the -c format.
-f1,5: cut the 1st field and the 5th field.
-f1,10-12: cut the 1st field, then the 10th to 12th fields.

$ more
P.Joines:Office Runner:id897
S.Round:Unix admin:id666
L.Clip:personnel chief:id982
$ cut -d: -f3
id897
id666
id982
Extract the username and shell from /etc/passwd:
$ cut -d: -f1,7 /etc/passwd
root:/bin/sh
tfj:/bin/bash
....
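
The character and field lists can be checked on a single record. This sketch uses one colon-separated line in the style of the sample data; the exact record is an assumption for illustration.

```shell
# One illustrative colon-separated record.
record='P.Joines:Office Runner:id897'

# -c1-8: the first 8 characters.
first_chars=$(printf '%s\n' "$record" | cut -c1-8)
# -c1,3-8: the 1st character, then the 3rd to 8th characters.
picked_chars=$(printf '%s\n' "$record" | cut -c1,3-8)
# -d: -f1,3: the 1st and 3rd fields, still joined by the separator.
name_and_id=$(printf '%s\n' "$record" | cut -d: -f1,3)

printf '%s\n' "$first_chars" "$picked_chars" "$name_and_id"
```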

paste: paste files together line by line
paste -d: f1 f2     # the two files do not have to have the same number of lines
-d: specifies the delimiter to use instead of the default tab.
$ more a
tfj
zyc
$ more b
id111
id222
$ paste a b
tfj	id111
zyc	id222
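
The sketch below shows the -d option and what happens when the files have different lengths. The third line of file a is an assumption added to make the lengths differ.

```shell
# File a has one more line than file b (the extra line is illustrative).
tmp=$(mktemp -d)
printf '%s\n' tfj zyc abc > "$tmp/a"
printf '%s\n' id111 id222 > "$tmp/b"

# -d: joins corresponding lines with a colon instead of a tab.
# Where b has no more lines, the delimiter is followed by nothing.
pasted=$(paste -d: "$tmp/a" "$tmp/b")

printf '%s\n' "$pasted"
rm -rf "$tmp"
```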

split: split large files
split -n file     # n is the number of lines in each output file (default 1000; also written as -l n)
The generated files are named xaa, xab, xac ... xzz.
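
A minimal sketch of splitting a file and putting it back together, assuming a made-up five-line input. The output prefix is passed explicitly so that the pieces land in a temporary directory.

```shell
# Illustrative five-line input file.
tmp=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 > "$tmp/big"

# -l 2: two lines per piece; "$tmp/x" is the output name prefix.
split -l 2 "$tmp/big" "$tmp/x"

pieces=$(cd "$tmp" && ls x*)      # names of the generated pieces
last_piece=$(cat "$tmp/xac")      # the final piece holds the leftover line

# Concatenating the pieces in name order reproduces the original file.
restored=no
cat "$tmp"/x?? | cmp -s - "$tmp/big" && restored=yes

printf '%s\n' "$pieces" "$last_piece" "$restored"
rm -rf "$tmp"
```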

tr: convert characters from stdin by replacement or deletion
tr [-cds] string1 [string2] < input_file
-c: use the complement of string1.
-d: delete the characters in string1.
-s: squeeze each run of a repeated character into one.

$ echo "aaaaaabbcccd" | tr -s "[a-z]"
abcd
Delete empty lines:
$ tr -s '\n' < file
Convert uppercase to lowercase:
$ cat file | tr "[A-Z]" "[a-z]"
All tr functions can be implemented using sed.
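
The three options, plus one sed equivalent, can be sketched as follows. The input strings are made up, and the sed line uses the y command, which maps characters one-to-one in the same way as a simple tr translation.

```shell
# -s: squeeze runs of repeated lowercase letters.
squeezed=$(echo "aaaaaabbcccd" | tr -s 'a-z')
# Translate uppercase letters to lowercase.
lowered=$(echo "HELLO World" | tr 'A-Z' 'a-z')
# -d: delete every digit.
no_digits=$(echo "id897" | tr -d '0-9')
# -cd: delete everything that is NOT a digit (including the newline).
only_digits=$(echo "id897" | tr -cd '0-9')
# sed's y command performs the same kind of character mapping as tr.
sed_lowered=$(echo "ABC" | sed 'y/ABC/abc/')

printf '%s\n' "$squeezed" "$lowered" "$no_digits" "$only_digits" "$sed_lowered"
```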
