Understand the sort and uniq commands (including uniq's -u and -d options):

cat a b | sort | uniq > c       # c is a union b
cat a b | sort | uniq -d > c    # c is a intersect b
cat a b b | sort | uniq -u > c  # c is the set difference a - b
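A concrete sketch of the three set operations above (assuming files a and b each contain one item per line with no internal duplicates; the file names and contents are illustrative):

```shell
# Sample input files (hypothetical contents)
printf '1\n2\n3\n' > a
printf '2\n3\n4\n' > b

# Union: every line that appears in a or in b
cat a b | sort | uniq > c_union       # 1 2 3 4

# Intersection: lines present in both files occur twice, so -d keeps exactly them
cat a b | sort | uniq -d > c_inter    # 2 3

# Difference a - b: listing b twice guarantees none of b's lines stay unique
cat a b b | sort | uniq -u > c_diff   # 1
```

Listing b twice in the difference is the trick: any line from b appears at least twice in the combined stream, so `-u` can only keep lines that are in a alone.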
uniq can check for repeated occurrences of lines in a text file.
Syntax
uniq [-cdu] [-f fields] [-s chars] [-w chars] [input file] [output file]
Parameters:
-c or --count   Prefix each line with the number of times it occurs.
-d or --repeated   Display only the lines that appear more than once.
-f fields or --skip-fields=fields   Skip the given number of leading fields when comparing lines.
-s chars or --skip-chars=chars   Skip the given number of leading characters when comparing lines.
-u or --unique   Display only the lines that appear exactly once.
-w chars or --check-chars=chars   Compare no more than the given number of characters per line.
--help   Display help.
--version   Display version information.
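A small demonstration of the -c, -d, and -u options on an already-sorted, made-up file:

```shell
# fruits.txt is already sorted, so all duplicates are adjacent
printf 'apple\napple\nbanana\ncherry\ncherry\ncherry\n' > fruits.txt

uniq -c fruits.txt   # each line prefixed with its count: 2 apple, 1 banana, 3 cherry
uniq -d fruits.txt   # only the repeated lines: apple, cherry
uniq -u fruits.txt   # only the line occurring exactly once: banana
```

Note how -d and -u partition the distinct lines: every distinct line is reported by exactly one of the two options.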
Seven examples of the uniq command: the uniq command in Linux can be used to process repeated lines in text files. This tutorial explains some of the most common uses of the uniq command, which may be helpful to you. The following file, test, will be used as the test file...
Source: http://www.justwinit.cn/post/3671/
Note: you can use the uniq command to delete adjacent duplicate lines. However, if a text file contains duplicate lines that are not adjacent, uniq alone cannot delete them; you need to combine it with the sort command:

sort file | uniq

The equivalent sort command is:

sort -u file

Counting lines after deduplication:

sort needsort.txt | uniq | wc
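The note above can be checked with a quick sketch (the file name needsort.txt is taken from the note; its contents here are made up):

```shell
printf 'b\na\nb\na\nc\n' > needsort.txt

sort needsort.txt | uniq          # a b c -- duplicates removed after sorting
sort -u needsort.txt              # a b c -- equivalent one-command form
sort needsort.txt | uniq | wc -l  # 3 distinct lines remain
```

`sort -u` saves a process, but the `sort | uniq` form is needed when you also want uniq-specific options such as `-c` or `-d`.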
Usage
uniq [options] file
Description: this command reads the input file and compares adjacent lines. Under normal circumstances, the second and later copies of a repeated line are deleted; line comparison is based on the sort sequence of the character set used. The result of this command is written to the output file. The input file and output file must be different. If the input file is represented by "-", it is read from standard input.
Basic text processing with the sort command: the -u parameter of sort is only valid with respect to the key fields; sort -u will discard records whose key fields are identical even if the other parts differ. To put it simply, Linux shell scripting uses the existing tools and commands in Linux in a certain way (process control and so on)...
uniq removes duplicate lines or counts them
Main options
-u (unique), -d (repeated), -c (count), -f (number of leading fields to skip; fields are separated by blanks/tabs), -s (like -f, but counted in characters)
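The field-skipping options can be illustrated with a hypothetical two-column file; -f 1 tells uniq to ignore the first field when comparing lines:

```shell
printf '1 apple\n2 apple\n3 banana\n' > items.txt

# Compare lines while ignoring the first (numeric) field:
# "1 apple" and "2 apple" now count as duplicates, and the first one is kept
uniq -f 1 items.txt   # keeps "1 apple" and "3 banana"

# -s works the same way but skips a number of characters instead of fields
uniq -s 2 items.txt   # same result here: comparison starts at "apple"/"banana"
```

When duplicates are detected this way, uniq keeps the first line of each group, so the surviving line carries the first group's leading field.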
1. Print each identical record only once (the input must be sorted):
sort a.txt | uniq
or
sort -u a.txt -o b.txt

[root@m165 root]# cat a.txt
A B 2
A B 4
A B 2
A d 4
A B 4
Count duplicate lines:
[root@m165 root]# sort a.txt | uniq -c
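Running that count against the sample a.txt shown above (a sketch; the shell prompt is omitted and the file is recreated inline):

```shell
printf 'A B 2\nA B 4\nA B 2\nA d 4\nA B 4\n' > a.txt

# Sort first so duplicates become adjacent, then count each distinct line
sort a.txt | uniq -c
#  2 A B 2
#  2 A B 4
#  1 A d 4
```

The count column shows that "A B 2" and "A B 4" each occur twice while "A d 4" occurs once, matching the five input lines.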
I believe that for file operations under Linux you often use the sort and uniq commands; the following describes the use of these two commands systematically. The sort command is very useful in Linux: it sorts a file and writes the sorted result to standard output. sort can get its input either from a specific file or from stdin.

Syntax: sort (options) (parameters)

Options:
-b: ignore the whitespace characters at the beginning of each line;
-c: check whether the file is already sorted
Sort
The sort command sorts the lines in the files specified by the file parameter and writes the results to the standard output. If the file parameter specifies multiple files, the sort command concatenates the files and sorts them as one file.
Sort syntax
[root@www ~]# sort [-fbMnrtuk] [file or stdin]
Options and parameters:
-f : ignore case differences, so that a and A are treated as the same for sorting;
-b : ignore the leading-space part of each line;
-M : sort by the name of the month, e.g. JAN, DEC;
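A few of these sort options in action (made-up data; LC_ALL=C is set on the month sort so collation is predictable):

```shell
printf '10\n2\n33\n4\n' > nums.txt

sort nums.txt    # lexicographic: 10 2 33 4 (compares character by character)
sort -n nums.txt # numeric: 2 4 10 33
sort -nr nums.txt # numeric, reversed: 33 10 4 2

printf 'Mar\nJan\nFeb\n' | LC_ALL=C sort -M  # month order: Jan Feb Mar
printf 'B\na\nb\nA\n' | sort -f              # case folded: the a's sort together, then the b's
```

The lexicographic vs. numeric contrast is the classic pitfall: without -n, "10" sorts before "2" because '1' < '2' as characters.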
1. The uniq command
uniq - report or omit repeated lines
Description: uniq performs a uniqueness check on the specified ASCII file or standard input to determine which lines in the text file are repeated. It is often used for system troubleshooting and log analysis.
Command format: uniq [OPTION]... [File1 [File2]]
uniq removes duplicate lines from the sorted text file File1, outputting to standard output
The uniq command (one Linux command a day) is used to remove repeated lines from a file's output (it does not change the original file). uniq --help can be used to view the command's parameters. uniq file1 shows the content of file1 with repeated adjacent lines displayed only once. uniq -c file1 shows the content of file1 with each line prefixed by the number of times it occurs.
uniq command introduction: this command reads the input file and compares adjacent lines.
1. Command format: uniq [OPTION]... [INPUT [OUTPUT]]
2. Command function: the second and later copies of a repeated line are deleted; line comparison follows the sort order of the character set in use. The result of the command is written to the output file. The input file and output file must be different. If the input file is represented by "-", it is read from standard input.
Reprinted from: http://blog.51yip.com/shell/1022.html
Instance details: removing duplicate lines under Linux with the uniq command.
1. What is uniq for? The duplicate lines in a text file are basically not what we want, so we want to get rid of them. There are other commands in Linux that can remove duplicate lines, but I think uniq is a convenient one. When using uniq, pay attention to the following points.
Linux Shell learning: how to use the uniq command. Introduction to the function of the uniq command: display unique lines, showing repeated lines only once! Next we will explain through practical examples. [Keywords] Linux Shell uniq
Check the content of the test.txt file and you can see the repeated lines in the file.
root@hexu.org ~ # cat test.txt
Linux sort, uniq, awk, and head complete the access-log statistics and sorting task.
During development we often collect access logs. The URLs in an access log are massive and many of them are duplicated. Taking the URL as an example: count the top 5 URLs that appear most frequently and sort them by number of occurrences in descending order.

Linux command:

cat url.log | sort | uniq -c | sort -n -r -k 1 -t ' ' | awk -F '//' '{print $}'

Now let's analyze the meaning of this combination of commands.
0) Sample access log
1) cat T1.log indicates that the contents of the file are written to standard output.
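A runnable sketch of the same idea (the sample log is invented; I use awk '{print $2}' to strip the count column, rather than the -F '//' variant from the excerpt, whose field index was lost in the source):

```shell
# Hypothetical access log, one URL per line
printf '%s\n' \
  http://a.com/x http://b.com/y http://a.com/x \
  http://c.com/z http://a.com/x http://b.com/y > url.log

# Count each distinct URL, sort counts in descending numeric order, keep the top 5
sort url.log | uniq -c | sort -rn -k 1 | head -5

# Same pipeline, but print only the URLs without the counts
sort url.log | uniq -c | sort -rn -k 1 | head -5 | awk '{print $2}'
```

The steps: sort groups identical URLs, uniq -c collapses each group to "count URL", sort -rn -k 1 orders by the count column descending, and head -5 keeps the top entries.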
1. uniq - report or ignore duplicate lines. By default, only adjacent identical lines are merged. With no parameters, it is most commonly combined with the sort command to count the number of repeated lines.
NAME: uniq - report or omit repeated lines
SYNOPSIS: uniq [OPTION]... [INPUT [OUTPUT]]
Common parameters: -c, --count # counts occurrences and prints the count of each repeated line in front of the line
Example 1.1: statistics of
1. What is uniq for?
The duplicate lines in a text file are basically not what we want, so we need to get rid of them. Linux has other commands to remove duplicate lines, but I think uniq is a more convenient one. When using uniq, pay attention to the following two points: 1. When manipulating text, it is typically used in combination with the sort command, because uniq only removes adjacent duplicate lines.
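The adjacency point is easy to demonstrate with a tiny made-up file: uniq alone misses duplicates that are not next to each other, while sort | uniq catches them:

```shell
printf 'dog\ncat\ndog\n' > pets.txt

uniq pets.txt          # dog cat dog -- the two dog lines are not adjacent, so both survive
sort pets.txt | uniq   # cat dog     -- sorting makes them adjacent first
```

This is why almost every uniq recipe in this page begins with a sort.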