From: http://www.ibm.com/developerworks/cn/linux/l-tip-prompt/l-tiptex6/
Duplicate lines usually cause no problems, but sometimes they do. When that happens, you don't have to spend an afternoon writing a filter for them: the uniq command is the handy tool you need. Learn how it can save you time and effort.
After sorting a file, you may find that some lines are duplicated. Sometimes this duplicate information is not needed, and you can remove it to save disk space. The lines do not have to be sorted, but remember that uniq compares lines as it reads them and collapses only runs of two or more consecutive identical lines, keeping one copy. The following example shows how this works in practice:
Listing 1. Removing duplicate lines with uniq
    $ cat happybirthday.txt
    happy birthday to you!
    happy birthday to you!
    happy birthday dear tux!
    happy birthday to you!
    $ sort happybirthday.txt
    happy birthday dear tux!
    happy birthday to you!
    happy birthday to you!
    happy birthday to you!
    $ sort happybirthday.txt | uniq
    happy birthday dear tux!
    happy birthday to you!
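To illustrate why sorting matters, the following sketch runs uniq directly on the unsorted file. It is not part of the original listing; the temporary directory and the variable names `workdir` and `unsorted` are assumptions made for this example only.

```shell
# Recreate the sample file from Listing 1 in a temporary directory
# (workdir is a hypothetical name used only in this sketch).
workdir=$(mktemp -d)
printf '%s\n' \
  'happy birthday to you!' \
  'happy birthday to you!' \
  'happy birthday dear tux!' \
  'happy birthday to you!' > "$workdir/happybirthday.txt"

# Without sort, only the adjacent pair (lines 1 and 2) collapses.
# The last line survives because a different line comes before it.
unsorted=$(uniq "$workdir/happybirthday.txt")
printf '%s\n' "$unsorted"

rm -rf "$workdir"
```

This should print three lines rather than two, because the final "happy birthday to you!" is not adjacent to the earlier copies.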
Warning: Do not use uniq or any other tool to remove duplicate lines from files that contain financial or other important data. In such files, a repeated line almost always represents another transaction for the same amount, and removing it would cause a lot of trouble for the accounting department. Never do this!
More information about uniq
This series of articles introduces the text utilities and supplements the information found in the man pages and info pages. To learn more, open a new terminal window and enter man uniq or info uniq, or open a new browser window and view the uniq manual page at gnu.org.
What if you want to make your work easier by displaying only the unique lines or only the repeated lines? You can use the -u (unique) and -d (repeated) options to do this, for example:
Listing 2. Using the -u and -d options
    $ sort happybirthday.txt | uniq -u
    happy birthday dear tux!
    $ sort happybirthday.txt | uniq -d
    happy birthday to you!
You can also use the -c option to get some statistics from uniq:
Listing 3. Using the -c option
    $ sort happybirthday.txt | uniq -uc
    1 happy birthday dear tux!
    $ sort happybirthday.txt | uniq -dc
    3 happy birthday to you!
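The -c option also works on its own, without -u or -d, in which case every distinct line is prefixed with its count. A common pattern, sketched here as an addition to the original article, pipes the result through sort -rn to list the most frequent lines first. The names `countdir` and `counts` are hypothetical.

```shell
# Recreate the sample file from Listing 1
# (countdir is a hypothetical name used only in this sketch).
countdir=$(mktemp -d)
printf '%s\n' \
  'happy birthday to you!' \
  'happy birthday to you!' \
  'happy birthday dear tux!' \
  'happy birthday to you!' > "$countdir/happybirthday.txt"

# sort groups identical lines, uniq -c prefixes each distinct line
# with its count, and sort -rn orders the result by count, highest first.
counts=$(sort "$countdir/happybirthday.txt" | uniq -c | sort -rn)
printf '%s\n' "$counts"

rm -rf "$countdir"
```

The exact padding in front of each count depends on the uniq implementation, so scripts that parse this output are safer splitting on whitespace than relying on fixed columns.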
Comparing whole lines is useful, but that is not all uniq can do. Particularly handy is the -f option, followed by the number of fields to skip. This helps when you look at system logs: typically some entries are repeated many times, which makes the log hard to read. Plain uniq cannot do the job, because each entry begins with a different timestamp. But if you tell it to skip the timestamp fields, your log suddenly becomes much more manageable. Try uniq -f 3 /var/log/messages.
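The effect of -f can be tried without touching a real log. The sketch below uses a few made-up log lines; the host name, messages, and the variable names `log` and `deduped` are all assumptions for illustration. It assumes the timestamp occupies the first three blank-separated fields (month, day, time).

```shell
# Hypothetical log excerpt: three entries differ only in their timestamp.
log=$(mktemp)
printf '%s\n' \
  'Jan 15 10:00:01 myhost kernel: eth0: link down' \
  'Jan 15 10:00:02 myhost kernel: eth0: link down' \
  'Jan 15 10:00:05 myhost kernel: eth0: link down' \
  'Jan 15 10:01:12 myhost sshd[812]: session opened' > "$log"

# Skip the first three fields (month, day, time) before comparing.
# uniq keeps the first line of each run of matching entries.
deduped=$(uniq -f 3 "$log")
printf '%s\n' "$deduped"

rm -f "$log"
```

With this input, the three "link down" entries should collapse into the first one, leaving two lines in total.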
Another option, -s, works like -f but skips a given number of characters instead. You can use -f and -s together: uniq skips the fields first and then the characters. And what if you want to compare only a set number of leading characters? Try -w.
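As a rough sketch of the -s and -w options, the commands below use made-up input lines that are not from the original article. The -w option is assumed here to be the GNU coreutils one, and some other uniq implementations may not provide it. The variable names `skipped` and `limited` are hypothetical.

```shell
# -s 4 skips the first four characters (a hypothetical "NNN " ID prefix),
# so the first two lines compare as equal.
skipped=$(printf '%s\n' '001 apple' '002 apple' '003 pear' | uniq -s 4)
printf '%s\n' "$skipped"

# -w 5 compares only the first five characters of each line
# (assumes GNU uniq), so both "apple ..." lines count as duplicates.
limited=$(printf '%s\n' 'apple pie' 'apple tart' 'pear cake' | uniq -w 5)
printf '%s\n' "$limited"
```

In both cases uniq prints the first line of each matching run, so the first command should keep "001 apple" and "003 pear", and the second should keep "apple pie" and "pear cake".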