First, eight basic methods of extracting substrings in the shell:
1. Use the # operator. It deletes everything from the left end of the string up to and including the first occurrence of substr, keeping the characters to its right.
Usage: ${var#*substr}
For example, with "//" as substr, this deletes the first "//" and everything to its left.
2. Use the ## operator. It deletes everything from the left end of the string up to and including the last occurrence of substr, keeping the characters to its right.
Usage: ${var##*substr}
For example, with "/" as substr, this deletes the last "/" and everything to its left.
3. Use the % operator. Working from the right end of the string, it deletes the first occurrence of substr it meets (the rightmost one) and everything to its right, keeping the characters to its left.
Usage: ${var%substr*}
For example, with "/" as substr, this deletes the rightmost "/" and everything to its right.
4. Use the %% operator. Working from the right end of the string, it deletes the last occurrence of substr it meets (the leftmost one) and everything to its right, keeping the characters to its left.
Usage: ${var%%substr*}
For example, with "/" as substr, this deletes the leftmost "/" and everything to its right.
5. Extract a given number of characters, starting at a given position from the left.
Usage: ${var:start:len}
For example, in ${var:0:5}, 0 means start at the first character on the left and 5 is the number of characters to take.
6. Extract from a given position from the left through the end of the string.
Usage: ${var:start}
For example, in ${var:7}, 7 means start at the 8th character from the left, since positions count from 0.
7. Extract a given number of characters, starting at a given position from the right.
Usage: ${var:0-start:len}
For example, in ${var:0-10:6}, 0-10 means start at the 10th character from the right and 6 is the number of characters to take.
8. Extract from a given position from the right through the end of the string.
Usage: ${var:0-start}
For example, in ${var:0-4}, 0-4 means start at the 4th character from the right.
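The eight methods above can be tried side by side. This is a minimal sketch that assumes bash (methods 5 to 8 are not available in a plain POSIX sh); the variable url and its value are hypothetical and chosen only so that the offsets mentioned above produce readable results.

```shell
#!/usr/bin/env bash
# Hypothetical sample value; any string containing "//" and "/" works.
url="http://www.example.com/docs/readme.txt"

m1=${url#*//}      # 1. delete through the first "//"
m2=${url##*/}      # 2. delete through the last "/"
m3=${url%/*}       # 3. delete from the rightmost "/" to the end
m4=${url%%/*}      # 4. delete from the leftmost "/" to the end
m5=${url:0:5}      # 5. 5 characters starting at position 0
m6=${url:7}        # 6. from position 7 (the 8th character) to the end
m7=${url:0-10:6}   # 7. 6 characters, starting 10 from the right
m8=${url:0-4}      # 8. the last 4 characters

printf '%s\n' "$m1" "$m2" "$m3" "$m4" "$m5" "$m6" "$m7" "$m8"
```

With this sample value the eight results are www.example.com/docs/readme.txt, readme.txt, http://www.example.com/docs, http:, http:, www.example.com/docs/readme.txt, readme and .txt.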
Second, using cut to extract strings
The cut command accepts three ways of specifying positions:
1. Bytes, with the -b option;
2. Characters, with the -c option;
3. Fields, with the -f option.
Third, using -b (bytes)
(1) Positioning by byte
To extract the 3rd byte of each line, run who | cut -b 3.
Note: the number after -b selects which byte to extract. The space between -b and 3 is optional (-b3 also works), but including it is recommended.
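Because the output of who differs from machine to machine, the sketch below uses two made-up lines in its place. The names and dates are hypothetical; printf pads the user name to 8 columns so that the byte positions are predictable.

```shell
# Hypothetical lines standing in for the output of who.
sample=$(printf '%-8s %-5s %s\n' alice pts/0 '2024-01-08 11:07' \
                                 carol pts/1 '2024-01-08 11:12')

# Take the 3rd byte of every line.
third=$(printf '%s\n' "$sample" | cut -b 3)
printf '%s\n' "$third"
```

For these two lines the result is "i" and "r", the third bytes of "alice" and "carol".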
(2) Still positioning by byte, to extract the 3rd, 4th, 5th and 12th bytes, run who | cut -b 3-5,12.
-b accepts ranges such as 3-5, and multiple positions are separated by commas.
Note, however, that cut sorts all the positions given after -b into ascending order before extracting them, so you cannot change the output order by listing the positions in a different order.
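The following sketch illustrates both points, again with hypothetical stand-in lines rather than real who output: a list that mixes a range with a single position, and the same list written in a different order.

```shell
# Hypothetical lines standing in for the output of who.
sample=$(printf '%-8s %-5s %s\n' alice pts/0 '2024-01-08 11:07' \
                                 carol pts/1 '2024-01-08 11:12')

# Bytes 3 to 5 plus byte 12 of every line.
picked=$(printf '%s\n' "$sample" | cut -b 3-5,12)

# The same positions listed in a different order give the same output.
reordered=$(printf '%s\n' "$sample" | cut -b 12,3-5)

printf '%s\n' "$picked" "$reordered"
```

Both commands yield "ices" and "rols" here, which shows that the order of the list has no effect on the order of the output.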
(3) More tricks similar to 3-5
Note: -3 means from the first byte through the third byte, and 3- means from the third byte through the end of the line. Both ranges include the third byte. Even so, who | cut -b -3,3- prints each whole line exactly once; the overlapping third byte is not printed twice.
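A short sketch of the open-ended ranges, using the same kind of hypothetical stand-in lines for who output:

```shell
# Hypothetical lines standing in for the output of who.
sample=$(printf '%-8s %-5s %s\n' alice pts/0 '2024-01-08 11:07' \
                                 carol pts/1 '2024-01-08 11:12')

head3=$(printf '%s\n' "$sample" | cut -b -3)     # bytes 1 through 3
from3=$(printf '%s\n' "$sample" | cut -b 3-)     # byte 3 through end of line
whole=$(printf '%s\n' "$sample" | cut -b -3,3-)  # overlapping ranges

printf '%s\n' "$head3" "$from3" "$whole"
```

Here head3 is "ali" and "car", from3 starts with "ice" and "rol", and whole is identical to the input lines.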
Fourth, using -c (characters)
At first glance -c seems no different from -b. That is only because the who example is a poor test: its output consists entirely of single-byte characters, so -b and -c give the same result. With Chinese text the difference shows.
With -c, cut counts in characters and the output is intact, whereas -b counts only in bytes (8 bits each), so it can split a multibyte character and produce garbled output. For multibyte characters you can add the -n option to -b, which tells cut not to split multibyte characters. How well -c and -n handle multibyte text depends on the cut implementation and the current locale.
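The byte side of this comparison can be checked with a small sketch. It assumes the script is saved as UTF-8, where each of the three sample characters occupies three bytes; the sample word is hypothetical. No expected output is given for the -c line, because some cut implementations still treat -c as bytes.

```shell
word='星期三'   # three characters; nine bytes when encoded as UTF-8

# Count the bytes; the arithmetic expansion strips any padding from wc.
nbytes=$(( $(printf '%s' "$word" | wc -c) ))

# The first three bytes make up exactly the first character.
by_bytes=$(printf '%s\n' "$word" | cut -b 1-3)

# The first character, on implementations and locales that support it.
by_chars=$(printf '%s\n' "$word" | cut -c 1)

printf 'bytes: %d\n' "$nbytes"
printf '%s\n' "$by_bytes"
```

Asking for a single byte, such as cut -b 2, would return only a fragment of a character, which is the garbled output described above.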
Fifth, what about -f (fields)?
(1) Field extraction exists because -b and -c can only extract from text in a fixed-width format; they are no help with text that has no fixed layout, and fields are.
Take /etc/passwd: unlike the output of who, it has no fixed-width layout, and its entries vary in length. What every line does have is the colon, which separates each item from the next.
cut provides exactly this kind of extraction: set the delimiter, then say which field to extract.
For example, head -n 5 /etc/passwd | cut -d : -f 1 takes the first five lines of /etc/passwd, uses -d to set the delimiter to a colon, and uses -f to take the first field.
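Since the contents of /etc/passwd vary between systems, the sketch below runs the same cut options on a few hypothetical lines in passwd format.

```shell
# Hypothetical entries in /etc/passwd format.
passwd_sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin'

# Colon as the delimiter, first field only: the login names.
names=$(printf '%s\n' "$passwd_sample" | cut -d : -f 1)
printf '%s\n' "$names"
```

This prints root, daemon and bin, one per line.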
(2) With -f you can also use forms such as 3-5 or 4-, just as with -b.
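A minimal sketch of field ranges, using one hypothetical line in passwd format:

```shell
# Hypothetical entry in /etc/passwd format (seven colon-separated fields).
line='root:x:0:0:root:/root:/bin/bash'

some=$(printf '%s\n' "$line" | cut -d : -f 1,3-5)   # field 1 and fields 3 to 5
rest=$(printf '%s\n' "$line" | cut -d : -f 4-)      # field 4 through the last

printf '%s\n' "$some" "$rest"
```

The selected fields are printed joined by the same delimiter, so some is root:0:0:root and rest is 0:root:/root:/bin/bash.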
(3) What if the text contains spaces and tabs?
Tabs can be hard to tell apart from spaces by eye, but there is a way to see whether a gap consists of several spaces or of a single tab.
A tab is displayed as \t, while a space is displayed as is, so this method distinguishes the two.
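The command the article used for this check did not survive. One common tool that behaves as described is sed -n l, which prints each line in an unambiguous form; treating it as the intended command is an assumption. The input strings below are hypothetical.

```shell
# sed -n l shows a tab as \t and marks the end of each line with $.
with_tab=$(printf 'name\tvalue\n' | sed -n l)
with_spaces=$(printf 'name    value\n' | sed -n l)

printf '%s\n' "$with_tab" "$with_spaces"
```

The first line is shown as name\tvalue$ and the second keeps its four spaces: name    value$.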
Which character should you give cut -d for tabs or spaces? The default delimiter for cut is the tab character, so when the fields are separated by tabs you can omit -d and use -f directly.
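A small sketch of both cases, with hypothetical input. For spaces, a quoted single space is passed to -d; note that cut treats every delimiter separately, so two spaces in a row enclose an empty field.

```shell
# Tab-separated input: no -d needed, tab is the default delimiter.
tab_field=$(printf 'a\tb\tc\n' | cut -f 2)

# Space-separated input: pass a quoted space to -d.
space_field=$(printf 'a b c\n' | cut -d ' ' -f 2)

# Two consecutive spaces: field 2 is the empty string between them.
empty_field=$(printf 'a  b\n' | cut -d ' ' -f 2)

printf '[%s] [%s] [%s]\n' "$tab_field" "$space_field" "$empty_field"
```

This prints [b] [b] [].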
This article is from the "GREEN" blog; please keep this source link: http://green906.blog.51cto.com/10697569/1791108
Methods of extracting substrings in the shell and basic usage of cut