In simple terms, this technique addresses the following scenario.
Suppose we have the following text:
cccc
aaaa
bbbb
dddd
bbbb
cccc
aaaa
Now the duplicate lines need to be removed. That alone is simple: sort -u can do it. But suppose I want to keep the original order of the text. For example, there are two aaaa lines and I only want to remove the second one; the first aaaa comes before bbbb, and after deduplication it should still come before it. So the expected output is:
cccc
aaaa
bbbb
dddd
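For contrast, here is a small sketch of what sort -u does to the same sample lines. The variable name sorted is only illustrative. The duplicates disappear, but the lines come back in sorted order rather than in their original order.

```shell
# Feed the sample lines to sort -u: duplicates are removed,
# but the result is sorted alphabetically, so the original order is lost.
sorted=$(printf '%s\n' cccc aaaa bbbb dddd bbbb cccc aaaa | sort -u)

printf '%s\n' "$sorted"
# prints:
# aaaa
# bbbb
# cccc
# dddd
```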
Of course, the problem itself is not difficult, and it is easy to solve in C++ or Python. But, as the saying goes, there is no need to use a sledgehammer to crack a nut: if shell commands can solve it, they are always our first choice. The answer is given at the end; first, here is how I came to think about this problem.
We sometimes want to add our own directory to the PATH environment variable, which is usually done in the ~/.bashrc file. For example, to add the directory $HOME/bin:
export PATH=$HOME/bin:$PATH
This prepends $HOME/bin to PATH so that it is searched first. When we run source ~/.bashrc, the $HOME/bin directory is added to PATH. Now suppose that next time we add another directory, for example:
export PATH=$HOME/local/bin:$HOME/bin:$PATH
When we run source ~/.bashrc again, PATH actually ends up containing $HOME/bin twice. This does not affect usage, but for anyone with a touch of obsessive tidiness it is unbearable. So the problem becomes: remove the duplicate entries from $PATH while keeping the original order, so that whichever entry came first before deduplication still comes first afterwards. The order matters because the shell looks for a command starting from the first directory in PATH.
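The duplication is easy to reproduce without touching the real environment. The following is only a sketch: DEMO_PATH is a stand-in for PATH, and /home/alice is a made-up home directory used so the result is predictable.

```shell
# DEMO_PATH stands in for PATH; demo_home is a hypothetical home directory.
demo_home=/home/alice
DEMO_PATH=/usr/bin:/bin

# First version of the rc line: prepend the bin directory.
DEMO_PATH=$demo_home/bin:$DEMO_PATH

# Edited version of the rc line, sourced again in the same session.
DEMO_PATH=$demo_home/local/bin:$demo_home/bin:$DEMO_PATH

# Count how many entries are exactly $demo_home/bin.
count=$(printf '%s\n' "$DEMO_PATH" | tr ':' '\n' | grep -cxF "$demo_home/bin")

echo "$DEMO_PATH"
echo "$demo_home/bin appears $count times"
# prints:
# /home/alice/local/bin:/home/alice/bin:/home/alice/bin:/usr/bin:/bin
# /home/alice/bin appears 2 times
```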
Okay, enough background; here is the final result. Taking the data at the beginning of the article as an example, and assuming the input file is in.txt, the command is as follows:
cat -n in.txt | sort -k2,2 -k1,1n | uniq -f1 | sort -k1,1n | cut -f2-
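To try the command end to end, the sample data can be written to a file first. This is a minimal sketch that assumes a throwaway directory created with mktemp; the variable names tmpdir and result are only illustrative.

```shell
# Recreate the sample input in a temporary directory (hypothetical location).
tmpdir=$(mktemp -d)
printf '%s\n' cccc aaaa bbbb dddd bbbb cccc aaaa > "$tmpdir/in.txt"

# Number the lines, bring duplicates together, keep the first of each group,
# restore the original order, then drop the line numbers.
result=$(cat -n "$tmpdir/in.txt" | sort -k2,2 -k1,1n | uniq -f1 | sort -k1,1n | cut -f2-)

printf '%s\n' "$result"
# prints:
# cccc
# aaaa
# bbbb
# dddd

rm -rf "$tmpdir"
```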
These are all very simple shell commands; here is a brief explanation of each:
cat -n in.txt: outputs the text with a line number in front of each line, separated from it by a tab (\t)
sort -k2,2 -k1,1n: sorts the input; the primary key is the second field, and the secondary key is the first field, compared numerically
uniq -f1: skips the first field when comparing adjacent lines and removes the duplicates; the output still contains the first field
sort -k1,1n: sorts the input by the first field, compared numerically
cut -f2-: outputs the second field and everything after it; the default delimiter is a tab (\t)
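The five stages described above can be captured one at a time to see what each contributes. This is a sketch using the article's sample data; the variable names stage1 through stage5 are only illustrative.

```shell
input=$(printf '%s\n' cccc aaaa bbbb dddd bbbb cccc aaaa)

# Each stage consumes the output of the previous one.
stage1=$(printf '%s\n' "$input"  | cat -n)             # add line numbers
stage2=$(printf '%s\n' "$stage1" | sort -k2,2 -k1,1n)  # group equal lines, earliest first
stage3=$(printf '%s\n' "$stage2" | uniq -f1)           # keep the first line of each group
stage4=$(printf '%s\n' "$stage3" | sort -k1,1n)        # back to original order
stage5=$(printf '%s\n' "$stage4" | cut -f2-)           # strip the line numbers

echo "== cat -n ==";            printf '%s\n' "$stage1"
echo "== sort -k2,2 -k1,1n =="; printf '%s\n' "$stage2"
echo "== uniq -f1 ==";          printf '%s\n' "$stage3"
echo "== sort -k1,1n ==";       printf '%s\n' "$stage4"
echo "== cut -f2- ==";          printf '%s\n' "$stage5"
```

After the uniq -f1 stage the surviving line numbers are 2, 3, 1 and 4 (for aaaa, bbbb, cccc and dddd), which is why the final numeric sort puts cccc back in front.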
You can start with the first command and then add the others one at a time to see the actual output of each stage, which makes the pipeline easier to understand. As for the duplicate entries in $PATH from the earlier example, just use tr to do the conversion:
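export PATH=$HOME/local/bin:$HOME/bin:$PATH
# The final sed strips the trailing ':' left by the last tr; an empty PATH entry would mean the current directory.
export PATH=$(echo "$PATH" | tr ':' '\n' | cat -n | sort -k2,2 -k1,1n | uniq -f1 | sort -k1,1n | cut -f2- | tr '\n' ':' | sed 's/:$//')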
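The same pipeline can be wrapped in a small function so that it works on any colon-separated list. The sketch below rests on several assumptions: dedup_path is a hypothetical name, the /home/alice directories are made up, and the last stage joins the lines with paste -sd: rather than a final tr, so that no trailing colon is left behind.

```shell
# dedup_path: hypothetical helper. Prints its colon-separated argument with
# later duplicates removed and first-occurrence order preserved.
dedup_path() {
    printf '%s\n' "$1" | tr ':' '\n' | cat -n |
        sort -k2,2 -k1,1n | uniq -f1 | sort -k1,1n | cut -f2- |
        paste -sd: -
}

demo=/home/alice/local/bin:/home/alice/bin:/usr/bin:/home/alice/bin:/bin:/usr/bin
dedup_path "$demo"
# prints /home/alice/local/bin:/home/alice/bin:/usr/bin:/bin
```

In ~/.bashrc this could be used as PATH=$(dedup_path "$PATH"). Because sort and uniq split fields on blanks, directory names containing spaces or tabs may not be deduplicated reliably by this approach.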
In fact, there is still a problem with PATH. After running the commands above, if we want to remove $HOME/bin from PATH, merely changing the lines as follows is not enough:
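export PATH=$HOME/local/bin:$PATH
# The final sed strips the trailing ':' left by the last tr; an empty PATH entry would mean the current directory.
export PATH=$(echo "$PATH" | tr ':' '\n' | cat -n | sort -k2,2 -k1,1n | uniq -f1 | sort -k1,1n | cut -f2- | tr '\n' ':' | sed 's/:$//')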
Because $HOME/bin has already been added to $PATH, this does not delete it. Perhaps the best approach is to know exactly which directories are needed and set PATH explicitly, rather than appending to the existing value.
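One way to read "set PATH explicitly" is sketched below. The system directory list in BASE_PATH is an assumption and will differ between machines, and NEW_PATH and set_path are illustrative names used so the sketch does not modify the real PATH.

```shell
# BASE_PATH is an assumed list of system directories; adjust it for your system.
BASE_PATH=/usr/local/bin:/usr/bin:/bin

# Build the value from scratch instead of from the current $PATH.
# Removing a directory is then just a matter of deleting it from this line.
set_path() {
    NEW_PATH=$HOME/local/bin:$BASE_PATH
}

# Running it twice (like sourcing ~/.bashrc twice) gives the same value.
set_path; first=$NEW_PATH
set_path; second=$NEW_PATH

echo "$first"
echo "$second"
```

In ~/.bashrc the equivalent line would be export PATH=$HOME/local/bin:$BASE_PATH. Since nothing is inherited from the previous value, no deduplication step is needed.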