Linux text processing tools sort and uniq: usage and examples

Source: Internet
Author: User


sort: sorts input lines according to the key fields, data type options, and locale.

Syntax: sort [options] [file(s)]

Main options:

-b: ignore leading white space.
-c: check whether the input is already sorted; do not sort.
-f: fold case, so that lowercase letters are treated as uppercase (case-insensitive sort).
-m: merge several already-sorted files into one sorted output stream.
-M: sort by three-letter month abbreviation (JAN < FEB < ... < DEC).
-k: define the sort key field and sort by that field.
-n: sort by numeric value.
-o outfile: write the sorted result to the specified file.
-r: sort in reverse order, from largest to smallest.
-t char: use the single character char as the field delimiter instead of the default white space.
-u: output only unique records; of all records with the same key, only one is kept.
--help: display help.
--version: display version information.
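
The effect of a few of these options is easiest to see on a tiny input. The following is a minimal sketch, assuming a made-up list of numbers (the variable names are illustrative only); LC_ALL=C is set so that the text comparison is a plain byte comparison.

```shell
# Hypothetical sample data: one number per line, with a duplicate.
data='10
9
100
9'

# The default comparison is textual, so "10" and "100" come before "9".
text_order=$(printf '%s\n' "$data" | LC_ALL=C sort)

# -n compares by numeric value.
numeric_order=$(printf '%s\n' "$data" | LC_ALL=C sort -n)

# -n -r -u: numeric, largest first, duplicates dropped.
reverse_unique=$(printf '%s\n' "$data" | LC_ALL=C sort -n -r -u)

printf 'text:\n%s\nnumeric:\n%s\nreverse unique:\n%s\n' \
    "$text_order" "$numeric_order" "$reverse_unique"
```

The textual result is 10, 100, 9, 9, while -n gives 9, 9, 10, 100.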

1). Sort by field

Sort key field type modifiers:

Letter  Description
b       Ignore leading blanks
d       Dictionary order
f       Ignore case
g       Compare as general floating-point numbers (GNU version only)
i       Ignore unprintable characters
n       Compare as numbers
r       Reverse the sort order
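
These letters are appended to a -k key definition and apply to that key only. A minimal sketch, assuming a made-up two-column list of names and scores (the data and variable names are not from the original article):

```shell
# Hypothetical records: a name and a score, separated by a space.
records='bob 7
Alice 12
carol 7
alice 3'

# Primary key: field 2 compared as a number (n), largest first (r).
# Secondary key: field 1 compared without regard to case (f).
by_score=$(printf '%s\n' "$records" | LC_ALL=C sort -k2,2nr -k1,1f)

printf '%s\n' "$by_score"
```

The two records with score 7 are ordered by name, because the second key is only consulted when the first key compares equal.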

Instance 1: sort in traditional ASCII order

[bkjia@test ~]$ LC_ALL=C sort /etc/passwd
#gzdev1:x:829:829::/home/gzdev1:/bin/bash
#gzdev2:x:830:830::/home/gzdev2:/bin/bash
...
Meat:x:814:814::/home/Meat:/bin/bash
adm:x:3:4:adm:/var/adm:/sbin/nologin
...
bin:x:1:1:bin:/bin:/sbin/nologin
cvsroot:x:778:502::/home/cvsroot:/bin/bash
...
dbus:x:81:81:System message bus:/:/sbin/nologin
dovecot:x:99:99:dovecot:/usr/libexec/dovecot:/sbin/nologin
...
ftpuser:x:505:505::/home/ftpuser:/bin/bash
gdm:x:42:42::/var/gdm:/sbin/nologin

 

Notes:

# LC_ALL=C overrides all locale settings for this one command, so that lines are compared byte by byte in ASCII order.
# LC_ALL is an environment variable. When it is set, its value overrides all of the LC_* category variables and takes precedence over LANG; the value of LANG itself is not changed.
# "C" is the locale a program uses when no locale is configured, and "POSIX" is an alias for "C".
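
To see what byte order means in practice, here is a minimal sketch that sorts a made-up word list under LC_ALL=C (the words and the variable names are assumptions for illustration). In ASCII every uppercase letter has a smaller code than every lowercase letter, so capitalized words come first.

```shell
# Hypothetical word list mixing uppercase and lowercase initials.
words='apple
Zebra
mango
Banana'

# LC_ALL=C is set for this one command only; the shell's own
# locale variables are left untouched.
c_order=$(printf '%s\n' "$words" | LC_ALL=C sort)

printf '%s\n' "$c_order"
```

Under a natural-language locale the same input would typically be interleaved regardless of case, which is why the article forces the C locale to get reproducible output.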

 

Instance 2: sort by user name

[bkjia@test ~]$ sort -t: -k1,1 /etc/passwd
adm:x:3:4:adm:/var/adm:/sbin/nologin
avahi:x:70:70:Avahi daemon:/:/sbin/nologin
bin:x:1:1:bin:/bin:/sbin/nologin
cvsroot:x:778:502::/home/cvsroot:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
gdm:x:42:42::/var/gdm:/sbin/nologin
...
# -t sets the field delimiter to a colon, and -k1,1 sorts on the first field only (the key starts at field 1 and ends at field 1).
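
The end position in -k1,1 matters: without it, the key runs from field 1 to the end of the line, so the delimiter itself takes part in the comparison. A minimal sketch, assuming two made-up passwd-style entries (not taken from a real system):

```shell
# Hypothetical passwd-style lines; the user names share the prefix "ab".
passwd_sample='ab-c:x:501:501::/home/ab-c:/bin/bash
ab:x:500:500::/home/ab:/bin/bash'

# Key limited to field 1: "ab" sorts before "ab-c".
by_first_field=$(printf '%s\n' "$passwd_sample" |
    LC_ALL=C sort -t: -k1,1 | cut -d: -f1)

# Key from field 1 to end of line: "ab-c..." is compared with "ab:x...",
# and "-" has a smaller ASCII code than ":", so "ab-c" comes first.
by_whole_line=$(printf '%s\n' "$passwd_sample" |
    LC_ALL=C sort -t: -k1 | cut -d: -f1)

printf 'k1,1:\n%s\nk1:\n%s\n' "$by_first_field" "$by_whole_line"
```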

 

Instance 3: sort by UID in reverse order

[bkjia@test ~]$ sort -t: -k3nr /etc/passwd
[bkjia@test ~]$ sort -t: -k3nr,3 /etc/passwd
# The more precise key definition is -k3,3nr, or -k3nr,3, or -k3,3 -n -r:
# start at field 3, compare numerically in reverse order, and end at field 3.
nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
sninf_kenchoi:x:860:860::/home/sninf_kenchoi:/bin/bash
bkjia:x:859:859::/home/bkjia:/bin/bash
gz_kinma:x:857:857::/home/gz_kinma:/bin/bash
sninf_tonyhung:x:856:856::/home/sninf_tonyhung:/bin/bash
sninf_simonlau:x:855:855::/home/sninf_simonlau:/bin/bash
sninf_kenchan:x:854:854::/home/sninf_kenchan:/bin/bash
sninf_thomaschan:x:853:853::/home/sninf_thomaschan:/bin/bash
gz_jones:x:851:851::/home/gz_jonesyan:/bin/bash
...
# -t sets the field delimiter to a colon, -k selects field 3 as the sort key, n compares the key as a number, and r reverses the order.
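
The n modifier is what makes this a UID sort rather than a text sort. The sketch below, which assumes a small made-up passwd-style sample, runs the same key with and without n and keeps only the UID column for comparison.

```shell
# Hypothetical passwd-style sample (not read from the real /etc/passwd).
passwd_sample='root:x:0:0:root:/root:/bin/bash
nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
bkjia:x:859:859::/home/bkjia:/bin/bash
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin'

# Field 3 compared as a number, largest first.
uids_numeric=$(printf '%s\n' "$passwd_sample" |
    LC_ALL=C sort -t: -k3,3nr | cut -d: -f3)

# Field 3 compared as text, reversed: "859" > "8" > "65534" > "0".
uids_text=$(printf '%s\n' "$passwd_sample" |
    LC_ALL=C sort -t: -k3,3r | cut -d: -f3)

printf 'numeric:\n%s\ntext:\n%s\n' "$uids_numeric" "$uids_text"
```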

 

Instance 4: sort by GID, keeping one line per GID

[bkjia@test ~]$ sort -t: -k4n -u /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
news:x:9:13:news:/etc/news:
...
# -t sets the field delimiter to a colon, -k selects field 4 as the sort key, n compares the key as a number, and -u keeps only one line for each distinct key value.
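
With -u, uniqueness is judged on the sort key, not on the whole line, so several accounts that share a GID collapse into one line. A minimal sketch, assuming a made-up passwd-style sample in which three accounts have GID 0:

```shell
# Hypothetical passwd-style sample; root, sync and halt all have GID 0.
passwd_sample='root:x:0:0:root:/root:/bin/bash
sync:x:5:0:sync:/sbin:/bin/sync
bin:x:1:1:bin:/bin:/sbin/nologin
halt:x:7:0:halt:/sbin:/sbin/halt
daemon:x:2:2:daemon:/sbin:/sbin/nologin'

# One output line per distinct value of field 4; only the GID is kept
# here, since which of the GID 0 lines survives should not be relied on.
unique_gids=$(printf '%s\n' "$passwd_sample" |
    LC_ALL=C sort -t: -k4,4n -u | cut -d: -f4)

printf '%s\n' "$unique_gids"
```

Five input lines produce three output lines, one each for GID 0, 1 and 2.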

 

2). Sort text blocks

[bkjia@test ~]$ cat > my-friends
# SORTKEY: ma, Kin
Kin ma
Zhujiangxincheng 78
D-305 Letaijie
TaiShan

# SORTKEY: yan, Jones
Jones yan
Dongpu 68
B_602 Dongpujie
YangJiang

# SORTKEY: wu, Will
Will wu
Shangshe 36
A_205 Heguanlu
MaoMing

[bkjia@test ~]$ cat my-friends |
  awk -v RS="" '{ gsub("\n", "\032"); print }' |
  sort -f
# SORTKEY: ma, Kin^ZKin ma^ZZhujiangxincheng 78^ZD-305 Letaijie^ZTaiShan
# SORTKEY: wu, Will^ZWill wu^ZShangshe 36^ZA_205 Heguanlu^ZMaoMing
# SORTKEY: yan, Jones^ZJones yan^ZDongpu 68^ZB_602 Dongpujie^ZYangJiang
# "\032" is the octal escape for the Ctrl-Z control character, which is displayed as ^Z in the output above.

[bkjia@test ~]$ cat my-friends |                          # pipe in the address data file
  awk -v RS="" '{ gsub("\n", "\032"); print }' |         # join each address into a single line
  sort -f |                                              # sort the address data, ignoring case
  awk -v ORS="\n" '{ gsub("\032", "\n"); print }'        # restore the line structure
# Note: restoring can fail, for example when the separator is written as the two literal characters ^ and Z: gsub() then treats "^Z" as a regular expression that matches a Z at the start of the record.
# gsub() performs a global substitution, similar to the s/x/y/g construct in sed.
# SORTKEY: ma, Kin
Kin ma
Zhujiangxincheng 78
D-305 Letaijie
TaiShan
# SORTKEY: wu, Will
Will wu
Shangshe 36
A_205 Heguanlu
MaoMing
# SORTKEY: yan, Jones
Jones yan
Dongpu 68
B_602 Dongpujie
YangJiang

[bkjia@test ~]$ cat my-friends |
  awk -v RS="" '{ gsub("\n", "\032"); print }' |
  sort -f |
  awk -v ORS="\n" '{ gsub("\032", "\n"); print }' |
  grep -v '# SORTKEY'                                    # delete the marker lines
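
For experimenting without creating a file, the whole technique can be run on an in-memory sample. This is a minimal sketch under a few assumptions: the records are shortened, made-up versions of the address entries, the variable names are illustrative, and Ctrl-Z is written as the awk octal escape "\032".

```shell
# Hypothetical multi-line records, separated by blank lines.
records='# SORTKEY: yan, Jones
Jones yan
Dongpu 68

# SORTKEY: ma, Kin
Kin ma
Zhujiangxincheng 78

# SORTKEY: wu, Will
Will wu
Shangshe 36'

sorted_blocks=$(
    printf '%s\n' "$records" |
    # RS="" puts awk in paragraph mode: each blank-line-separated
    # block is one record. Join its lines with Ctrl-Z (octal 032).
    awk -v RS="" '{ gsub("\n", "\032"); print }' |
    # Each block is now one line, so sort can order whole blocks.
    LC_ALL=C sort -f |
    # Turn every Ctrl-Z back into a newline.
    awk '{ gsub("\032", "\n"); print }' |
    # Drop the marker lines that only existed to carry the sort key.
    grep -v '# SORTKEY'
)

printf '%s\n' "$sorted_blocks"
```

The blocks come out in the order ma, wu, yan, with the lines inside each block untouched. Any character that never occurs in the data could serve as the temporary separator; Ctrl-Z is simply the one the article uses.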

 

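3). Sort stability: unstable

[bkjia@test ~]$ sort -t_ -k1,1 -k2,2 <<EOF
> one_two
> one_two_three
> one_two_four
> one_two_five
> EOF
one_two
one_two_five
one_two_four
one_two_three
# All four lines have the same key fields (one and two), yet their output order differs from their input order, so sort is not stable.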
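
The reordering happens because, when all keys compare equal, sort falls back to comparing the entire line. GNU sort (and some other implementations) offers -s to disable that fallback; the sketch below assumes such a sort is installed, since -s is not part of POSIX.

```shell
# The same four lines as above; fields 1 and 2 are identical in all of them.
input='one_two
one_two_three
one_two_four
one_two_five'

# Default: ties are broken by comparing whole lines.
default_order=$(printf '%s\n' "$input" | LC_ALL=C sort -t_ -k1,1 -k2,2)

# -s (assumed available, e.g. GNU coreutils): ties keep their input order.
stable_order=$(printf '%s\n' "$input" | LC_ALL=C sort -s -t_ -k1,1 -k2,2)

printf 'default:\n%s\nstable:\n%s\n' "$default_order" "$stable_order"
```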

 

4). Removing duplicates

uniq: filters out adjacent duplicate lines

Options:
-c: prefix each line with the number of times it occurs
-d: display only duplicated lines
-u: display only lines that are not duplicated

Usage:
sort ... | uniq ...
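
The sort in front of uniq is not optional: uniq compares each line only with the line directly before it. A minimal sketch, assuming a made-up four-line input, shows the difference.

```shell
# Hypothetical input in which "one" appears three times, not all adjacent.
lines='one
two
one
one'

# Without sorting, only the two adjacent "one" lines are merged.
unsorted_result=$(printf '%s\n' "$lines" | uniq)

# After sorting, all duplicates are adjacent and collapse to one line.
sorted_result=$(printf '%s\n' "$lines" | LC_ALL=C sort | uniq)

printf 'uniq alone:\n%s\nsort | uniq:\n%s\n' "$unsorted_result" "$sorted_result"
```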

 

Instance:

[bkjia@test ~]$ cat > number
one
two
threefour
four
five
two
one
one
[bkjia@test ~]$ sort number | uniq       # sort and remove duplicate lines
five
four
one
threefour
two
[bkjia@test ~]$ sort number | uniq -c    # prefix each line with its number of occurrences
      1 five
      1 four
      3 one
      1 threefour
      2 two
[bkjia@test ~]$ sort number | uniq -d    # show duplicated lines only
one
two
[bkjia@test ~]$ sort number | uniq -u    # show non-duplicated lines only
five
four
threefour
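
A common follow-up to uniq -c is to sort its output again by the count, which gives a frequency ranking. The sketch below reuses the word list from the number file as an in-memory sample; the variable names and the final awk step (used only to normalize the leading spaces that uniq -c prints) are illustrative assumptions.

```shell
# Same words as in the "number" file above.
words='one
two
threefour
four
five
two
one
one'

ranking=$(
    printf '%s\n' "$words" |
    LC_ALL=C sort |                 # bring equal lines together
    uniq -c |                       # count each distinct line
    LC_ALL=C sort -k1,1nr -k2,2 |   # highest count first, then by word
    awk '{ print $1, $2 }'          # strip the padding added by uniq -c
)

printf '%s\n' "$ranking"
```

The most frequent word ends up on the first line, here "3 one".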
