Linux sort command: Chinese manual (translation of info sort)



(1) This manual translates only the useful parts of the original. For the full content, run "info sort" yourself.

(2) In the translation, text in parentheses marked "Note" is my own addition. It is not part of the original and is meant to help with understanding.

(3) The sort command used in this article is sort (GNU coreutils) 8.22 on CentOS 7.2. Some options, such as "--debug", may not be supported on CentOS 6.

(4) If you do not yet understand how sort handles fields and how its sorting mechanism works, it is strongly recommended that you not read "man sort".

7.1 'sort': Sort text files
===========================

The sort command sorts, merges, or compares all the lines of the given files (more than one file may be given). If no input file is given, or the input file is "-", standard input is read. By default, sort writes the result to standard output.

Syntax: sort [OPTION]... [FILE]...

sort has 3 modes of operation: sort (the default), merge, and check whether the input is already sorted. The following 3 options change the mode of operation:

'-c'
'--check'
'--check=diagnose-first'
     Check whether the given file is already sorted. If it is not, print a diagnostic that contains the first out-of-order line and exit with status 1. Otherwise, exit successfully. At most one input file can be given.

'-C'
'--check=quiet'
'--check=silent'
     Like "-c", but do not print the diagnostic. Exit successfully if the file is sorted, otherwise exit with status 1. At most one input file can be given.

'-m'
'--merge'
     Merge the given files by sorting them as a group. Each input file must already be sorted. Sorting always works in place of merging; merge is provided because it is faster in the cases where it applies.

How lines are compared: sort compares each pair of keys, in the order in which the keys are given on the command line and according to the ordering options attached to each key, until a difference is found or no keys are left. If no key is given (Note: a key is what "-k" specifies), the entire line is used as the key. Finally, if all of the given keys compare equal, sort compares the entire lines as if no ordering option other than "-r" had been given (Note: that is, in ascending order of the default collation, or in descending order with "-r"). This is called the "final sort" (the "last-resort comparison" of the original manual). The "-s" option disables the final sort, so that lines whose keys all compare equal keep their original relative order.
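The effect of the final sort and of "-s" can be seen with two lines whose keys compare equal. The following is a minimal sketch, assuming GNU sort and the C locale (set through LC_ALL so that the result does not depend on the local collation). The sample lines and the variable names are made up for illustration. It also shows the quiet check mode returning status 1 for unsorted input.

```shell
# Assumption: GNU sort; the C locale makes all comparisons byte-wise.
export LC_ALL=C

# Both lines have the same numeric key (1) in field 2.
# Without -s, the final sort compares the whole lines, so "a 1" comes first.
with_final_sort=$(printf 'b 1\na 1\n' | sort -k 2,2n)

# With -s, lines whose keys compare equal keep their input order.
stable=$(printf 'b 1\na 1\n' | sort -s -k 2,2n)

# -C checks silently; capture the exit status of sort for unsorted input.
check_status=0
printf 'b\na\n' | sort -C || check_status=$?

printf '%s\n' "$with_final_sort" '---' "$stable" '---' "$check_status"
```

The status is captured with "|| check_status=$?" so that the snippet also works in scripts that run with "set -e".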
The "-u" option also disables the final sort.

Unless otherwise specified, all comparisons use the character collating sequence specified by the "LC_COLLATE" locale.

Exit status:

     0 if no error occurred
     1 if invoked with "-c" or "-C" and the input is not sorted
     2 if an error occurred

If the environment variable "TMPDIR" is set, sort uses it as the directory for temporary files instead of the default "/tmp". The "-T" option overrides the value of the environment variable.

The following options affect the ordering of the output lines. They can be specified either as global options or as part of a key. If no key is specified, the global options apply to the comparison of entire lines. Otherwise, each key inherits the global options unless the key specifies options of its own (Note: options given on a key override the global options). For portability, it is recommended that you specify global options before "-k" (or "--key").

'-b'
'--ignore-leading-blanks'
     Ignore leading blanks (spaces and tabs) when finding the key in each line. Without this option, blanks affect the character positions specified by "-k" (Note: for example, the 2nd character selected by "-k 2.2" may be a blank).

'-f'
'--ignore-case'
     Treat lowercase characters as their uppercase equivalents when comparing, so that, for example, "b" and "B" compare equal. When used with "-u" (Note: lines that compare equal are output only once), the lowercase equivalents are discarded (Note: in other words, the uppercase line is the one that is output). (There is currently no way to discard the uppercase equivalents instead, even with "-r", because "-r" only reverses the final result and does not affect which line is discarded.)

'-h'
'--human-numeric-sort'
'--sort=human-numeric'
     Sort by human-readable sizes, that is, numbers with a size suffix such as "2K" or "1G".
     With "-h", lines are compared first by sign (negative < zero < positive), then by size suffix (none < "k" = "K" < "M" < "G" < "T" ...), and finally by numeric value. It does not matter whether the suffixes stand for powers of 1000 or of 1024, provided every value is consistently scaled to the nearest suffix (Note: sort does not convert between units. For example, "1023M" sorts before "1G" simply because "M" comes before "G", and for the same reason "2000M" would also sort before "1G").

'-M'
'--month-sort'
'--sort=month'
     Sort by month name. An initial string, consisting of any amount of blanks followed by a month name abbreviation, is folded to upper case and compared in the order "JAN" < "FEB" < ... < "DEC". Invalid names compare lower than valid names.

'-n'
'--numeric-sort'
'--sort=numeric'
     Sort by numeric value. An empty number is treated as 0, and a leading "+" is not recognized as part of a number. Numeric comparison is exact; there is no rounding error.

     (Note: the difference between a numeric sort and the default collation is that the numeric comparison stops at the first character of the key that cannot be part of a number, such as a blank, a letter, or a special character. In other words, although the keys "-k 2" and "-k 2n" both extend to the end of the line, the former compares everything from the second field to the end of the line in collation order, while the latter effectively uses only the number at the start of the second field, because the character between the second and the third field ends the number.

     For an input line such as "abc 100 200" with "-k 2n", the key is "100 200", but because it contains a blank, the numeric comparison ends after "100". For "abc 100\0200" with "-k 2n", the key looks like 100200, but only 100 is compared. If another line has 110 in its 2nd field, the apparently larger 100200 sorts before 110. Test command:

          echo -e "b 100:200 200\na 110 300" | tr ':' '\0' | \
          sort -t ' ' -k 2n -k 1

     So with "-n" the comparison never continues past the end of the number and therefore does not cross a field boundary, whereas the default collation does compare across fields.)

'-r'
'--reverse'
     Reverse the result of the comparison, so that lines with greater keys appear earlier in the output. (Note: "-r" does not change how lines are compared. It reverses the sorted result, so it only affects the output after sorting is finished.)

'-k POS1[,POS2]'
'--key=POS1[,POS2]'
     Specify a sort key: the part of each line from POS1 to POS2 (if POS2 is omitted, the key ends at the end of the line). Each POS has the form "F[.C][OPTS]", where F is the number of the field and C is the number of the character within the field. Fields and characters are counted from 1. A character position of 0 in POS2 means the last character of the POS2 field. If ".C" is omitted from POS1, it defaults to 1 (the first character of the field); if it is omitted from POS2, it defaults to 0 (the last character of the field). OPTS are ordering options that override the global options, so that each key can be sorted with its own options. Keys can span multiple fields. (Note: OPTS applies to the whole key whether it is written on POS1 or on POS2, because one "-k" specifies one key. The exception is the "b" option; see below.)

     Example: to sort on the second field, use "--key=2,2" ("-k 2,2"). The "--debug" option helps you view, analyze, and confirm which part of each line is used for sorting.

'--debug'
     Show the part of each line that is used for sorting. Additional information is also given.

'-o OUTPUT-FILE'
'--output=OUTPUT-FILE'
     Write the sorted output to OUTPUT-FILE. Normally, sort reads all of the input before opening the output file, so you can safely write the result back to an input file, as in "sort -o file1 file1" and "cat file1 | sort -o file1".
     With "-m", however, sort opens the output file before reading all of the input, so a command like the following is not safe: "cat file1 | sort -m -o file1 -".

'-s'
'--stable'
     Disable the final sort. This option has no effect if no key and no global ordering option other than "-r" is specified. (Note: the final sort means that when all keys compare equal, sort falls back to comparing the entire lines in the default order, that is, ascending collation order. If no ordering option is specified, the comparison is already the default one, so the final sort changes nothing. The same is true when only "-r" is given, because the final sort also honors "-r".)

'-t SEPARATOR'
'--field-separator=SEPARATOR'
     Use the character SEPARATOR as the field separator when finding the keys in each line. By default, fields are separated by the empty string between a non-blank character and a blank character. For example, the input line " foo bar" is split by default into two fields, " foo" and " bar" (Note: the fields begin at the start of the line and at the position right after "foo").

     The field separator is not part of the fields it separates, so "sort -t ' '" splits the same line " foo bar" into 3 fields: an empty field, "foo", and "bar". However, a key that extends to the end of the line, like "-k 2", or a key that covers a range of fields, like "-k 2,3", retains the field separators between its endpoints. (Note: taking "sort -t ' '" as an example, "-k 2" actually means "foo bar": the key extends to the end of the line and the separator in the middle is retained. A key that explicitly ends at the second field stops at "foo"; any separator that lies between the start and the end of a key is still retained.)

     To specify NUL as the field separator, use the two-character string "\0", for example "sort -t '\0'".

'--parallel=N'
     Set the number of sorts run in parallel to N. By default, N is the number of available processors, limited to 8, because the performance gain diminishes beyond that.

'-u'
'--unique'
     Normally, output only the first of a run of lines that compare equal. This option also disables the final sort (Note: see above). "sort -u" and "sort | uniq" are equivalent, but the equivalence does not necessarily hold once other options are added. For example, "sort -n -u" checks the uniqueness of the leading numeric part only, whereas in "sort -n | uniq", uniq checks the uniqueness of the entire line.

'-z'
'--zero-terminated'
     Delimit lines with a zero byte ("\0") instead of a newline.

A key specified with "-k" can have option letters such as "bfhgnr" appended, in which case that key does not inherit the global options. Except for "b", every option applies to the entire key, whether it is written on POS1 or on POS2. The "b" option applies independently to POS1 or POS2 when it is written on a key, but when it is inherited from the global "-b" it applies to the entire key.

If the input lines can contain leading blanks and "-t" is not used, "-k" is usually combined with "-b" or with an option that implicitly ignores leading blanks ("g", "h", "n"). Otherwise the leading blanks can make the resulting fields very confusing.

If the field or character position specified by a POS lies beyond the end of the line or field, the key is empty. If "-b" is specified, the ".C" part is counted from the first non-blank character of the field.
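The rules for "-h" and "-n" described in the option list can be checked on small inputs. The following is a minimal sketch, assuming GNU sort in the C locale. The sample values and variable names are made up, and ":" stands in for the "\0" byte used in the note on "-n" so that the output stays printable.

```shell
# Assumption: GNU sort; the C locale keeps number parsing predictable.
export LC_ALL=C

# -h compares the size suffix before the number, without converting units,
# so 2000M sorts before 1G.
human_order=$(printf '1G\n2000M\n3K\n' | sort -h)

# -k 2n extends to the end of the line, but only the leading number counts:
# 100 (from "100:200 200") sorts before 110.
numeric_order=$(printf 'b 100:200 200\na 110 300\n' | sort -t ' ' -k 2n -k 1)

printf '%s\n' "$human_order" '---' "$numeric_order"
```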
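How fields are split, with and without "-t" and "b", can be observed through the resulting order. This is a rough sketch under the same assumptions (GNU sort, C locale); the input lines are invented for the demonstration.

```shell
# Assumption: GNU sort; in the C locale a space sorts before any letter.
export LC_ALL=C

# Default splitting: field 2 keeps its leading blanks ("  b" and " a"),
# so the line with two blanks compares lower and comes first.
default_fields=$(printf 'x  b\ny a\n' | sort -k 2,2)

# "b" on POS1 skips the leading blanks, so "a" is compared with "b".
skip_blanks=$(printf 'x  b\ny a\n' | sort -k 2b,2)

# With -t ' ', the leading space creates an empty field 1,
# so the numbers are in field 3.
with_separator=$(printf ' a 2\n b 1\n' | sort -t ' ' -k 3,3n)

# Without -t there is no field 3: both keys are empty and compare equal,
# so the final sort on the whole line decides the order.
without_separator=$(printf ' a 2\n b 1\n' | sort -k 3,3n)

printf '%s\n' "$default_fields" '---' "$skip_blanks" '---' \
  "$with_separator" '---' "$without_separator"
```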
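The difference between "sort -n -u" and "sort -n | uniq" mentioned under "-u" shows up as soon as two lines share the same leading number. A minimal sketch, assuming GNU sort and uniq in the C locale, with made-up sample lines:

```shell
# Assumption: GNU sort and uniq; sample data is illustrative only.
export LC_ALL=C

sample='1 a\n1 b\n2 c\n'

# -n -u: "1 a" and "1 b" compare equal as numbers, so only the first is kept.
sort_unique=$(printf "$sample" | sort -n -u)

# sort -n | uniq: uniq compares entire lines, so both "1 ..." lines survive.
sort_then_uniq=$(printf "$sample" | sort -n | uniq)

printf '%s\n' "$sort_unique" '---' "$sort_then_uniq"
```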
Here are some examples that illustrate the use of different options:

   * Sort in descending (reverse) numeric order.

          sort -n -r

   * Sort alphabetically, ignoring the first and second fields and the leading blanks of the third field. This uses a single key that starts at the first non-blank character of the third field and extends to the end of the line. The whole key is sorted alphabetically.

          sort -k 3b

   * Sort numerically on the second field and resolve ties by sorting alphabetically on the 3rd and 4th characters of the fifth field. Use ":" as the field delimiter.

          sort -t : -k 2,2n -k 5.3,5.4

     (Note: whenever you want to sort on one field, it is advisable to specify both its start and end positions explicitly.)

     Note that if you had written "-k 2n" instead of "-k 2,2n", the primary key would extend from the second field to the end of the line, and the secondary key "-k 5.3,5.4" would be used, in alphabetical order, only to break ties in that primary key. In the vast majority of cases, a numeric key that extends past its field is not what you want.

     Also note that the "n" option was attached to the end position of the first key. It is equivalent to write "-k 2n,2" or "-k 2n,2n". All modifiers except "b" apply to the entire key, whether they are written on POS1 or on POS2.

     (Note: because a numeric comparison stops at the end of the number, writing "-k 2n" gives the same result in practice. The following two commands, however, are different:

          sort -t : -k 2 -k 5.3,5.4n
          sort -t : -k 2,2 -k 5.3,5.4n

     Because the default collation compares across fields, the key of the first command runs from the 2nd field to the end of the line, so that entire span is compared as characters first, and only then is the secondary key compared numerically. The same applies to the following example:

          sort -t : -k 5n -k 2

     Even though the field of the primary key comes after the start of the secondary key, the secondary key still runs across the primary key's field, because it is compared as characters up to the end of the line.)

   * Sort the /etc/passwd file on the 5th field, ignoring leading blanks.
     If two lines have equal values in the 5th field, they are further sorted by the numeric UID in the 3rd field. The field delimiter is ":".

          sort -t : -k 5b,5 -k 3,3n /etc/passwd
          sort -t : -n -k 5b,5 -k 3,3 /etc/passwd
          sort -t : -b -k 5,5 -k 3,3n /etc/passwd

     The three commands above are equivalent. The first specifies that POS1 of the first key ignores leading blanks and that the second key is sorted numerically. In the other two commands, the key without options inherits the global option. The inheritance works correctly here because "-k 5b,5b" and "-k 5b,5" are equivalent.

   * Sort a set of log files, primarily by IPv4 address and secondarily by timestamp. If both the primary and the secondary keys of two lines are identical, the lines are output in the order in which they were read. The log files contain lines of roughly the following form (the keys below treat the bracketed timestamp as the 4th space-separated field):

          [01/Apr/2004:06:31:51 +0000] message 1
          [24/Apr/2004:20:17:39 +0000] message 2

     Fields are separated by exactly one space. The IPv4 addresses are sorted in dictionary order, component by component; for example, an address with the component 61 sorts before an otherwise equal address with the component 129, because 61 is less than 129.

          sort -s -t ' ' -k 4.9n -k 4.5M -k 4.2n -k 4.14,4.21 file*.log |
          sort -s -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n

     This example cannot be done with a single sort command, because the components of an IPv4 address are separated by "." while the timestamp is delimited by spaces. It therefore uses two sort commands: the first sorts by timestamp and the second sorts by IPv4 address. The first sort command uses "-k" to isolate each part of the timestamp and sorts by year, then by month, then by day, and finally by hour:minute:second. Except for the hour:minute:second key, the keys do not need an explicit end position, because the "n" and "M" options compare only a leading prefix of the key, which cannot cross a field boundary. The second sort command sorts the IPv4 addresses in dictionary order.
     The second sort command uses "-s" so that, among lines with the same primary key (the IPv4 address), the order already established by the secondary key (the timestamp) is preserved. The first sort command uses "-s" so that the combination of the two sorts is stable.
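The two-pass approach can be tried on a few sample lines. The following is a sketch only: the addresses, the "- -" filler fields, and the messages are hypothetical and chosen so that the timestamp is the 4th space-separated field, as the "-k 4.x" keys require. It assumes GNU sort in the C locale, where "-M" recognizes the English month abbreviations.

```shell
# Assumption: GNU sort, C locale; all log lines below are made-up sample data.
export LC_ALL=C

logs='10.0.0.2 - - [24/Apr/2004:20:17:39 +0000] message 2
10.0.0.1 - - [01/Apr/2004:06:31:51 +0000] message 1
10.0.0.2 - - [01/Apr/2004:06:31:51 +0000] message 3
10.0.0.2 - - [15/Mar/2004:10:00:00 +0000] message 4'

# Pass 1: order by timestamp (year, month, day, then hh:mm:ss).
# Pass 2: order by address; -s keeps the timestamp order within each address.
sorted_logs=$(printf '%s\n' "$logs" |
  sort -s -t ' ' -k 4.9n -k 4.5M -k 4.2n -k 4.14,4.21 |
  sort -s -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n)

printf '%s\n' "$sorted_logs"
```

With this input, the single line from 10.0.0.1 comes first, followed by the three lines from 10.0.0.2 in timestamp order (March, then 1 April, then 24 April).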

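(Note: because the "n" option stops at the first character that cannot be part of a number, the second sort command above may look equivalent to the following command:

     sort -s -t '.' -n -k1 -k2 -k3 -k4

This holds only if "." is not read as part of a number. In locales where "." is the decimal point, "-n" reads a key such as "4.150" as one decimal number, so the explicit end positions used above ("-k 1,1n" and so on) are the safer form.)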
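Whether the shorter command behaves like the one with explicit end positions depends on how "-n" reads the "." character. The following check is a minimal sketch, assuming GNU sort in the C locale, where "." is the decimal point; the two addresses are made up.

```shell
# Assumption: GNU sort, C locale ("." is the decimal point for -n).
export LC_ALL=C

addresses='1.10.0.0\n1.5.0.0\n'

# Explicit end positions: the second components 5 and 10 are compared
# as whole numbers, so 1.5.0.0 comes first.
explicit_ends=$(printf "$addresses" |
  sort -s -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n)

# Open-ended keys: the first key runs to the end of the line, and -n
# reads "1.10" and "1.5" as decimal numbers, so 1.10.0.0 comes first.
open_ended=$(printf "$addresses" | sort -s -t '.' -n -k 1 -k 2 -k 3 -k 4)

printf '%s\n' "$explicit_ends" '---' "$open_ended"
```

The two commands disagree on this input, which is why closing each key with an explicit end position is the more predictable choice for dotted addresses.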
