Common Linux commands (Text Processing)

Source: Internet
Author: User
Sort command

The sort command sorts all lines in a file. It has many useful options that were originally designed for sorting files laid out like database records. In that sense, sort can be regarded as a powerful data management tool for files whose lines resemble database records.

The sort command compares the content of a file line by line. If the first characters of two lines are the same, it compares the next character of each line, and so on until a difference is found.

Syntax:

sort [option] file

Note: The sort command sorts all lines in the specified file and writes the result to standard output. If no input file is given, or "-" is used as the file name, the input is read from standard input.

Sorting is done by comparing one or more keys extracted from each input line. A sort key defines the smallest character sequence used for comparison. By default, the entire line is the key, and lines are ordered by ASCII character value (in practice, by the collating order of the current locale, which is plain byte order in the C locale).
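As a minimal sketch of the default behavior, the following reads three made-up lines from standard input and sorts them with the whole line as the key. Setting LC_ALL=C is an assumption added here to force plain byte order, so that the result does not depend on the locale of the machine running it.

```shell
# Sort lines read from standard input ("-" names standard input explicitly).
# LC_ALL=C forces byte (ASCII) order, so uppercase sorts before lowercase.
sorted=$(printf 'apply\napple\nApple\n' | LC_ALL=C sort -)
printf '%s\n' "$sorted"
```

"apple" comes before "apply" because the two lines agree on the first four characters and differ only at the fifth.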

The options for changing the default settings are as follows:

-m merges the given files, which must already be sorted, without sorting them again.

-c checks whether the given input is already sorted, without sorting it. If it is not sorted, an error message is printed and sort exits with status 1.

-u keeps only one line out of each group of lines that compare equal after sorting.

-o output-file writes the sorted output to output-file instead of standard output. If output-file is also one of the input files, sort reads that input in full (older implementations copy it to a temporary file first) before writing the result, so a file can be sorted in place.
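The four options above can be tried together in a short sketch. The file names one.txt and two.txt and their contents are hypothetical, and the script works in a temporary directory created with mktemp so that no existing files are touched.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
printf 'b\na\nb\n' > "$workdir/one.txt"
printf 'd\nc\n'    > "$workdir/two.txt"

# -c: only check the order; exit status 0 means the file is already sorted.
if LC_ALL=C sort -c "$workdir/one.txt" 2>/dev/null; then
    check_status=sorted
else
    check_status=unsorted
fi

# -u with -o: sort, drop duplicate lines, and write the result back
# over the input file.
LC_ALL=C sort -u -o "$workdir/one.txt" "$workdir/one.txt"
LC_ALL=C sort -o "$workdir/two.txt" "$workdir/two.txt"

# -m: merge two files that are each already sorted.
merged=$(LC_ALL=C sort -m "$workdir/one.txt" "$workdir/two.txt")
printf '%s\n' "$check_status" "$merged"

rm -r "$workdir"
```

Merging with -m is only meaningful once both inputs are sorted, which is why the sketch sorts each file in place first.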

The options for changing the default sorting rule are as follows:

-d sorts in dictionary order: only letters, digits, spaces, and tabs are significant in comparisons.

-f treats lowercase letters as uppercase letters, so case is ignored when comparing.

-i ignores non-printable characters.

-M compares keys as month names: "JAN" < "FEB" < ... < "DEC".

-r outputs the sorting results in reverse order.

+pos1 -pos2 specifies one or more fields as the sort key. The key runs from pos1 up to pos2 (including pos1, excluding pos2). If pos2 is not given, the key runs from pos1 to the end of the line. Field and character positions start from 0. This form is obsolete; current versions of sort express the same thing as -k pos1[,pos2], where positions start from 1 and pos2 is included in the key.

-b ignores leading blanks (spaces and tabs) when locating the sort key in each line.

-t separator uses the character separator as the field separator.
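The ordering options can be compared side by side on a small sample. This is a sketch only: the colon-separated records are invented, and the key is selected with -k, the POSIX key syntax, rather than the +pos1 -pos2 form described above, on the assumption that the sort in use no longer accepts the older form. The M modifier on the key corresponds to the -M option.

```shell
# Hypothetical colon-separated records: name:month:amount
records='alice:MAR:12
Bob:JAN:3
carol:FEB:7'

# -r: reverse the default whole-line comparison.
reversed=$(printf '%s\n' "$records" | LC_ALL=C sort -r)

# -f: fold lowercase to uppercase, so "Bob" sorts between alice and carol.
folded=$(printf '%s\n' "$records" | LC_ALL=C sort -f)

# -t and -k: use ":" as the field separator and sort on field 2 only,
# comparing that field as a month name.
by_month=$(printf '%s\n' "$records" | LC_ALL=C sort -t: -k2,2M)

printf '%s\n\n' "$reversed" "$folded" "$by_month"
```

Without -f, byte order would place "Bob" ahead of both lowercase names, which is why the -r result ends with it.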

Uniq command

Processing a file can leave duplicate lines in the output. For example, if you merge two files with the cat command and then sort the result with the sort command, the same line may appear more than once. In this case, you can use the uniq command to delete the repeated lines, leaving a single copy of each record.

Syntax:

uniq [option] input-file [output-file]

Note: This command reads the input file and compares adjacent lines. Normally, the second and subsequent copies of a repeated line are deleted; lines are compared according to the collating sequence of the character set in use. Because only adjacent lines are compared, the input is usually sorted first. The result is written to the output file. The input and output files must be different. If the input file is given as "-", input is read from standard input.
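Since uniq only compares neighboring lines, the effect of sorting first is easy to see in a small sketch. The files a.txt and b.txt and their contents are hypothetical, and a temporary directory is used so nothing existing is overwritten.

```shell
workdir=$(mktemp -d)
printf 'red\nblue\n' > "$workdir/a.txt"
printf 'blue\nred\n' > "$workdir/b.txt"

# Without sorting, only the neighboring pair of "blue" lines is collapsed;
# the two "red" lines are not adjacent, so both survive.
unsorted=$(cat "$workdir/a.txt" "$workdir/b.txt" | uniq)

# Sorting first makes equal lines adjacent, so each line survives once.
deduped=$(cat "$workdir/a.txt" "$workdir/b.txt" | LC_ALL=C sort | uniq)

printf '%s\n\n' "$unsorted" "$deduped"
rm -r "$workdir"
```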

The options of this command are as follows:

-c prefixes each output line with the number of times it appeared in the input. It can be used in place of the -u and -d options.

-d displays only the lines that are repeated.

-u displays only the lines that are not repeated.

-n (obsolete form) ignores the first n fields, together with the blanks before each field. A field is a string of non-space, non-tab characters, separated from its neighbors by tabs and spaces (fields are numbered from 0).

+n (obsolete form) ignores the first n characters; any fields to be ignored are skipped before the characters (characters are numbered from 0).

-f n is the same as -n, where n is the number of fields.

-s n is the same as +n, where n is the number of characters.
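A short sketch of these options on invented input follows. The padding that uniq -c puts in front of each count differs between implementations, so awk is used here, as an added assumption, only to normalize it to "count line". The -f and -s forms are used instead of -n and +n.

```shell
lines='apple
apple
banana
cherry
cherry
cherry'

# -c: count occurrences; awk strips the implementation-specific padding.
counted=$(printf '%s\n' "$lines" | uniq -c | awk '{print $1, $2}')

# -d: one copy of each repeated line; -u: only lines that are not repeated.
repeated=$(printf '%s\n' "$lines" | uniq -d)
singles=$(printf '%s\n' "$lines" | uniq -u)

# -f 1: ignore the first field (a hypothetical timestamp) when comparing.
by_event=$(printf '10:01 login\n10:02 login\n10:03 logout\n' | uniq -f 1)

# -s 2: ignore the first two characters when comparing.
by_suffix=$(printf 'a-foo\nb-foo\nc-bar\n' | uniq -s 2)

printf '%s\n\n' "$counted" "$repeated" "$singles" "$by_event" "$by_suffix"
```

In the last two cases uniq keeps the first line of each group, so the output shows "10:01 login" and "a-foo" rather than the later lines that compared equal to them.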