In short, this technique applies to the following scenario.
Suppose we have the following text, one item per line:
cccc
aaaa
bbbb
dddd
bbbb
cccc
aaaa
Now we need to remove the duplicates. That alone is very simple: sort -u does it. But suppose I also want to keep the text in its original order. For example, aaaa appears twice; I only want to remove the second aaaa, and because the first aaaa comes before bbbb, it should still come before bbbb after deduplication. So the output I expect is
cccc
aaaa
bbbb
dddd
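To see why plain sort -u is not enough here, the following is a small sketch. It assumes a scratch file created with mktemp in place of a real input file; the names demo_in and sorted_unique are made up for this illustration.

```shell
# Create a scratch file holding the sample text, one item per line.
demo_in=$(mktemp)
printf '%s\n' cccc aaaa bbbb dddd bbbb cccc aaaa > "$demo_in"

# sort -u removes the duplicates, but it also reorders the lines.
# LC_ALL=C pins the collation so the result does not depend on the locale.
sorted_unique=$(LC_ALL=C sort -u "$demo_in")
printf '%s\n' "$sorted_unique"
# prints aaaa, bbbb, cccc, dddd (one per line): the original order is lost

rm -f "$demo_in"
```

The duplicates are gone, but cccc no longer comes first, which is exactly what we want to avoid.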
Of course, the problem itself is not hard, and it is easy to solve in C++ or Python. But there is no need for a sledgehammer to crack a nut: if shell commands can do the job, they are always our first choice. The answer is given at the end; first, here is how I came to think about this problem.
We sometimes want to add our own directory to the PATH environment variable, usually by writing it in the ~/.bashrc file. For example, to add $HOME/bin:
export PATH=$HOME/bin:$PATH
This adds $HOME/bin to PATH and puts it at the front, so that it is searched first. After we run source ~/.bashrc, the $HOME/bin directory is in PATH. Suppose we add another directory next time, such as
export PATH=$HOME/local/bin:$HOME/bin:$PATH
After we run source ~/.bashrc again, $HOME/bin actually appears twice in PATH. This does not affect usage, but for someone with obsessive-compulsive tendencies it is intolerable. So the problem becomes: remove the duplicate entries from PATH while keeping the original order, that is, whichever entry came first must still come first after deduplication. The order matters because the shell looks for a command by searching the PATH entries starting from the first one.
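The duplication can be reproduced without touching the real PATH. The sketch below simulates sourcing ~/.bashrc twice on a scratch variable; demo_home, demo_path, and the starting value /usr/bin:/bin are assumptions made up for the demo.

```shell
# Simulate sourcing ~/.bashrc twice, using a scratch variable instead of PATH.
# The home directory and the starting value are made up for this demo.
demo_home=/home/me
demo_path=/usr/bin:/bin

# First version of ~/.bashrc: prepend the bin directory.
demo_path=$demo_home/bin:$demo_path
# Edited ~/.bashrc, sourced again in the same shell.
demo_path=$demo_home/local/bin:$demo_home/bin:$demo_path

printf '%s\n' "$demo_path"

# List the entries that occur more than once.
duplicates=$(printf '%s\n' "$demo_path" | tr ':' '\n' | LC_ALL=C sort | uniq -d)
printf 'duplicated: %s\n' "$duplicates"
```

The second assignment prepends to a value that already contains the bin directory, so that entry ends up in the list twice.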
Enough background; here is the final result. Taking the data from the beginning of the article as an example, and assuming the input file is in.txt, the command is as follows
cat -n in.txt | sort -k2,2 -k1,1n | uniq -f1 | sort -k1,1n | cut -f2-
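These are all very simple shell commands; here is a brief explanation of each
cat -n in.txt : print the text with a line number prepended to each line, separated by \t
sort -k2,2 -k1,1n : sort the input; the primary key is the second field, the secondary key is the first field, compared numerically
uniq -f1 : ignore the first field when deduplicating, but still include it in the output; of each run of duplicates the first line is kept, which is the one with the smallest line number
sort -k1,1n : sort the input; the key is the first field, compared numerically
cut -f2- : print the second field and everything after it; the default delimiter is \t
Note that without -t, sort splits fields on whitespace, so -k2,2 covers only the first word of each line. If the lines can contain spaces, use -k2 instead so that the key runs to the end of the line.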
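The effect of each stage can be followed by running them one at a time. This is a sketch on a scratch copy of the sample data; the variable names (stage_in, numbered, grouped, and so on) are only for illustration, and LC_ALL=C is added to pin the sort order.

```shell
# Scratch copy of the sample input (stands in for in.txt).
stage_in=$(mktemp)
printf '%s\n' cccc aaaa bbbb dddd bbbb cccc aaaa > "$stage_in"

# Stage 1: prefix every line with its line number and a tab.
numbered=$(cat -n "$stage_in")

# Stage 2: bring equal lines together; within a group, lowest line number first.
# Line numbers now come out in the order 2 7 3 5 1 6 4.
grouped=$(printf '%s\n' "$numbered" | LC_ALL=C sort -k2,2 -k1,1n)

# Stage 3: keep only the first line of each group (line numbers 2 3 1 4).
first_only=$(printf '%s\n' "$grouped" | uniq -f1)

# Stage 4: restore the original order by sorting on the line number.
restored=$(printf '%s\n' "$first_only" | LC_ALL=C sort -k1,1n)

# Stage 5: drop the line numbers again.
result=$(printf '%s\n' "$restored" | cut -f2-)
printf '%s\n' "$result"
# prints cccc, aaaa, bbbb, dddd (one per line)

rm -f "$stage_in"
```

Printing numbered, grouped, first_only, and restored in turn shows how the line number travels with each line and is used to undo the sort.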
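You can start with the first command and add the others one at a time to see the actual output of each stage, which makes it easier to understand. How do we deal with the duplicate paths in $PATH? It works just like the previous example; we only need tr before and after to convert between colons and newlines. The final sed strips the trailing colon that the last tr leaves behind, because an empty entry in PATH is treated as the current directory.
export PATH=$HOME/local/bin:$HOME/bin:$PATH
export PATH=$(echo "$PATH" | tr ':' '\n' | cat -n | sort -k2,2 -k1,1n | uniq -f1 | sort -k1,1n | cut -f2- | tr '\n' ':' | sed 's/:$//')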
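For repeated use, the pipeline can be wrapped in a function. The following is a sketch, not a definitive implementation: dedupe_path is a hypothetical helper name, and it departs from the one-liner in two places that are my own choices. It uses sort -k2 so that entries containing spaces are compared in full, and it joins the lines with paste -sd: - so that no trailing colon is produced.

```shell
# dedupe_path (hypothetical helper): takes a colon-separated list as $1 and
# prints it with later duplicates removed, keeping each first occurrence.
# sort -k2 makes the key run to the end of the line, so spaces are safe;
# paste -sd: - joins the remaining lines with ':' and adds no trailing colon.
dedupe_path() {
    printf '%s\n' "$1" |
        tr ':' '\n' |
        cat -n |
        LC_ALL=C sort -k2 -k1,1n |
        uniq -f1 |
        LC_ALL=C sort -k1,1n |
        cut -f2- |
        paste -sd: -
}

# Try it on a made-up list rather than on the real PATH.
dedupe_path '/home/me/local/bin:/home/me/bin:/home/me/bin:/usr/bin:/bin'
# prints /home/me/local/bin:/home/me/bin:/usr/bin:/bin
```

In ~/.bashrc this could then be used as export PATH=$(dedupe_path "$PATH") after the line that prepends the new directories.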
In fact, managing PATH this way has a problem. Suppose we have already executed the commands above and now want to remove $HOME/bin from PATH; merely changing the lines as follows is not enough
export PATH=$HOME/local/bin:$PATH
export PATH=$(echo "$PATH" | tr ':' '\n' | cat -n | sort -k2,2 -k1,1n | uniq -f1 | sort -k1,1n | cut -f2- | tr '\n' ':' | sed 's/:$//')
Since $HOME/bin has already been added to $PATH, doing this does not delete it. Perhaps the best approach is to know exactly which paths are needed and specify them all explicitly, rather than appending to the existing value.
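A minimal sketch of the explicit approach follows. The variable name explicit_path and the system directories /usr/local/bin, /usr/bin, and /bin are assumptions; list whatever your own system actually needs.

```shell
# Build the complete list explicitly instead of prepending to the old value.
# The system directories below are assumptions; adjust them for your machine.
explicit_path=$HOME/local/bin:/usr/local/bin:/usr/bin:/bin

# Nothing is inherited from the previous value, so leaving $HOME/bin out of
# the list really removes it, however many times the file is sourced.
printf '%s\n' "$explicit_path"
```

In ~/.bashrc the last step would be export PATH=$explicit_path. The trade-off is that entries added elsewhere, for example by system-wide profile scripts, are dropped unless they are listed here as well.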
Shell command tips: deduplicating text while keeping the original order