Shell scripting: the cut command
Directory:
1.1 option description
1.2 filter by field
1.3 Use --complement
1.4 split by byte or character
1.5 Use --output-delimiter
1.6 specify the range in cut
1.1 option description
The cut command splits each line into columns according to a specified delimiter. Its weakness is that it handles runs of repeated delimiters poorly, so it is often combined with the squeeze option of tr (tr -s).
-b
: Filter by byte;
-n
: Used with the "-b" option; tells cut not to split multi-byte characters;
-c
: Filter by characters;
-f
: Filter by fields;
-d
: Specifies the field delimiter. If -d is not specified, the default field delimiter is TAB. It can only be used together with the "-f" option.
-s
: Avoid printing rows that do not contain delimiters;
--complement
: Complement the selection, that is, output the bytes, characters, or fields that were not selected;
--output-delimiter
: Specifies the output delimiter. The default is the input delimiter.
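As a quick illustration of the default TAB delimiter versus an explicit -d, here is a minimal sketch. The inputs are made up for this example and are not taken from abc.sh.

```shell
# Hypothetical tab-separated input: with no -d, cut splits on TAB.
tabbed=$(printf 'a\tb\tc\n' | cut -f2)

# The same kind of selection on colon-separated input needs an explicit -d.
colon=$(printf 'x:y:z\n' | cut -d: -f1,3)

printf '%s\n' "$tabbed" "$colon"
```

Note that the selected fields are joined with the input delimiter on output, which is why the second result still contains a colon.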
Assume /tmp/abc.sh has the content shown below. Note that the columns in the 2nd to 5th rows are not consistently separated by a single space: some are separated by repeated spaces and some by only one. In other words, the text is not very regular. The last line contains no spaces at all.
[root@xuexi tmp]# cat abc.sh
NO Name SubjectID Mark remarks
1 longshuai 001 56 fail
2 gaoxiaofang 001 60 pass
3 zhangsan 001 50 fail
4 lisi 001 80 pass
5 wangwu 001 90 pass
djakldj;lajd;sla
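To follow along, a file of this shape could be created as sketched below. The exact number of spaces between columns is an assumption made for illustration (the original spacing is not recoverable from the listing), and a scratch directory from mktemp is used instead of /tmp/abc.sh so nothing gets overwritten.

```shell
# Scratch directory (hypothetical location, chosen by mktemp).
workdir=$(mktemp -d)

# Assumed spacing: rows 2-6 mix single and repeated spaces,
# and the last line contains no space at all.
cat > "$workdir/abc.sh" <<'EOF'
NO Name SubjectID Mark remarks
1 longshuai 001  56 fail
2 gaoxiaofang  001 60 pass
3 zhangsan  001 50 fail
4 lisi 001   80 pass
5 wangwu   001  90 pass
djakldj;lajd;sla
EOF

# Count all lines, and the lines that contain at least one run of two spaces.
total=$(wc -l < "$workdir/abc.sh" | tr -d ' ')
irregular=$(grep -c '  ' "$workdir/abc.sh")
printf 'lines: %s, lines with repeated spaces: %s\n' "$total" "$irregular"

rm -r "$workdir"
```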
The following is an example of cut.
1.2 filter by field
abc.sh has five fields. Extract the second field (the Name column) and the 4th field (the Mark column), using a space as the delimiter.
[root@xuexi tmp]# cut -d" " -f2,4 abc.sh
Name 001 50
djakldj;lajd;sla
The output is a mess. The reason is that the space delimiter is repeated in some rows, and cut treats every single space as a field boundary, so consecutive spaces produce empty fields. To get the correct result, the repeated spaces have to be removed first.
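The effect of a repeated delimiter can be seen on a single line. This is a minimal sketch with a made-up input line, not a row copied from abc.sh.

```shell
# Hypothetical input with two spaces between "1" and "lisi".
line='1  lisi 80'

# Every space starts a new field, so field 2 is the empty string
# between the two adjacent spaces, and "lisi" is actually field 3.
second=$(printf '%s\n' "$line" | cut -d' ' -f2)
third=$(printf '%s\n' "$line" | cut -d' ' -f3)

printf 'field 2: [%s]\nfield 3: [%s]\n' "$second" "$third"
```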
You can use the tr tool to compress consecutive characters.
[root@xuexi tmp]# cat abc.sh | tr -s " " | cut -d " " -f2,4
Name Mark
longshuai 56
gaoxiaofang 60
zhangsan 50
lisi 80
wangwu 90
djakldj;lajd;sla
However, the last row contains no delimiter, so cut prints it whole. To suppress such lines, keep the tr -s step and add -s.
[root@xuexi tmp]# tr -s " " < abc.sh | cut -d" " -f2,4 -s
Name Mark
longshuai 56
gaoxiaofang 60
zhangsan 50
lisi 80
wangwu 90
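The difference that -s makes is easiest to see side by side. The three-line input below is a hypothetical stand-in for abc.sh.

```shell
# Hypothetical input: two space-separated lines and one line without any space.
sample=$(printf 'NO Name Mark\n1 lisi 80\nno-spaces-here\n')

# Without -s, a line that contains no delimiter is printed unchanged.
without_s=$(printf '%s\n' "$sample" | cut -d' ' -f2)

# With -s, such lines are dropped from the output.
with_s=$(printf '%s\n' "$sample" | cut -d' ' -s -f2)

printf 'without -s:\n%s\n\nwith -s:\n%s\n' "$without_s" "$with_s"
```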
1.3 Use --complement
Output all fields except the 2nd and 4th fields.
[root@xuexi tmp]# tr -s " " < abc.sh | cut -d" " -f2,4 -s --complement
NO SubjectID remarks
1 001 fail
2 001 pass
3 001 fail
4 001 pass
5 001 pass
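The following sketch puts a selection and its complement next to each other on one already-squeezed row. The row is illustrative, and --complement is assumed to be available, which is the case for GNU cut but not guaranteed by POSIX.

```shell
# Hypothetical row with single spaces between fields.
row='1 longshuai 001 56 fail'

# Fields 2 and 4 ...
selected=$(printf '%s\n' "$row" | cut -d' ' -f2,4)

# ... and everything except fields 2 and 4 (GNU extension).
rest=$(printf '%s\n' "$row" | cut -d' ' -f2,4 --complement)

printf '%s\n' "$selected" "$rest"
```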
1.4 split by byte or character
English letters and Arabic numerals are single-byte characters, while a Chinese character takes two bytes or more depending on the encoding (three bytes in UTF-8).
Use -b to select by byte, and use -c to select by character.
Note: -d cannot be specified when selecting by byte or character, because -d only applies to fields.
[root@xuexi tmp]# cut -b1-3 abc.sh   # select bytes 1-3 of each line
NO 
1 l
2 g
3 z
4 l
5 w
dja
Selecting by byte can produce garbled output when the selected bytes split a Chinese character.
[root@xuexi tmp]# cut -b20 abc.sh
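Therefore, the "-b" option should be combined with the "-n" option, which tells cut not to split multi-byte characters and so avoids the garbled output. Whether "-n" has any effect depends on the cut build: upstream GNU coreutils has documented it as a no-op, while some distributions ship a multi-byte-aware cut.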
[root@xuexi tmp]# cut -n -b20 abc.sh
a is not 0 and
You can also select by character.
[root@xuexi tmp]# cut -c20 abc.sh
a is not 0 and
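To make the byte-versus-character distinction concrete, here is a minimal sketch that assumes UTF-8 encoding. The sample string is hypothetical; the Chinese character 中 (U+4E2D) is written as octal escapes so the result does not depend on how the script file itself is encoded, and LC_ALL=C is set so that cut works strictly on bytes.

```shell
# "a", then the three UTF-8 bytes E4 B8 AD of one Chinese character, then "b".
text=$(printf 'a\344\270\255b')

# Three characters, but five bytes.
byte_count=$(printf '%s' "$text" | wc -c | tr -d ' ')

# Bytes 2-4 cover the whole three-byte character, so it survives intact.
whole=$(printf '%s\n' "$text" | LC_ALL=C cut -b2-4)

# Bytes 1-2 stop in the middle of the character: the output is "a" plus one
# stray byte (plus the newline), which a terminal shows as garbage.
partial_bytes=$(printf '%s\n' "$text" | LC_ALL=C cut -b1-2 | wc -c | tr -d ' ')

printf 'bytes in text: %s, bytes in partial output: %s\n' "$byte_count" "$partial_bytes"
```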
1.5 Use --output-delimiter
Use "--output-delimiter" to specify the output delimiter.
When -b or -c selects multiple ranges, you can use --output-delimiter to separate them. Otherwise, the ranges are joined together with nothing in between.
[root@xuexi tmp]# cut -b3-5,6-8 abc.sh   # the ranges are joined together
 Name 
longsh
gaoxia
zhangs
lisi 0
wangwu
akldj;
[root@xuexi tmp]# cut -b3-5,6-8 abc.sh --output-delimiter ","   # separate the ranges with a comma
 Na,me 
lon,gsh
gao,xia
zha,ngs
lis,i 0
wan,gwu
akl,dj;
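The same idea on a single line, as a small sketch. The input row is illustrative, and --output-delimiter is assumed to be available (it is a GNU cut option). The last command shows that the option also replaces the delimiter when selecting fields.

```shell
# Hypothetical row: bytes 3-5 are "lon", bytes 6-8 are "gsh".
line='1 longshuai 001'

# Two byte ranges with no output delimiter are simply concatenated.
joined=$(printf '%s\n' "$line" | cut -b3-5,6-8)

# With --output-delimiter, a comma is placed between the two ranges.
split=$(printf '%s\n' "$line" | cut -b3-5,6-8 --output-delimiter=,)

# In field mode the output delimiter replaces the input delimiter.
by_field=$(printf '%s\n' "$line" | cut -d' ' -f1,3 --output-delimiter=:)

printf '%s\n' "$joined" "$split" "$by_field"
```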
1.6 specify the range in cut
You can use "N-", "N-M", and "-M" to select, respectively, everything from the Nth character (or byte or field) to the end of the line, the Nth through the Mth, and everything from the first through the Mth. Note that both boundaries, N and M, are included.
[root@xuexi tmp]# tr -s " " < abc.sh | cut -d" " -f3- -s   # output the third field and everything after it
SubjectID Mark remarks
001 56 fail
001 60 pass
001 50 fail
001 80 pass
001 90 pass
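The three range forms can be compared on one short line. This is a sketch with a made-up six-field input.

```shell
# Hypothetical row with six single-letter fields.
row='a b c d e f'

# N-  : from field 3 to the end of the line.
from_third=$(printf '%s\n' "$row" | cut -d' ' -f3-)

# N-M : fields 2 through 4, both boundaries included.
middle=$(printf '%s\n' "$row" | cut -d' ' -f2-4)

# -M  : from the first field through field 3.
up_to_third=$(printf '%s\n' "$row" | cut -d' ' -f-3)

printf '%s\n' "$from_third" "$middle" "$up_to_third"
```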
When ranges overlap, the overlapping part is output only once. For example, -f3-5,4-6 produces the same output as -f3-6.
[root@xuexi tmp]# tr -s " " < abc.sh | cut -d" " -f3-5,4-6 -s   # overlapping ranges
SubjectID Mark remarks
001 56 fail
001 60 pass
001 50 fail
001 80 pass
001 90 pass
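A minimal check of the overlap rule, using a hypothetical seven-field line so that the upper bound of the range actually matters:

```shell
# Hypothetical row with seven fields.
row='a b c d e f g'

# Fields 4 and 5 are named in both ranges but appear only once in the output.
overlapping=$(printf '%s\n' "$row" | cut -d' ' -f3-5,4-6)
merged=$(printf '%s\n' "$row" | cut -d' ' -f3-6)

printf '%s\n' "$overlapping" "$merged"
```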
If the ranges are listed out of order, cut still outputs the fields in the order in which they appear in the input (ascending order). For example, -f4-6,2 is equivalent to -f2,4-6.
[root@xuexi tmp]# tr -s " " < abc.sh | cut -d" " -f4-6,2 -s
Name Mark remarks
longshuai 56 fail
gaoxiaofang 60 pass
zhangsan 50 fail
lisi 80 pass
wangwu 90 pass
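One practical consequence is that cut cannot be used to reorder columns. The sketch below, on a made-up input line, shows that listing field 2 last does not move it to the end.

```shell
# Hypothetical row with six fields.
row='a b c d e f'

# Field 2 is listed last but is still printed first.
unordered=$(printf '%s\n' "$row" | cut -d' ' -f4-6,2)
ordered=$(printf '%s\n' "$row" | cut -d' ' -f2,4-6)

printf '%s\n' "$unordered" "$ordered"
```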
Back to series article outline: http://www.cnblogs.com/f-ck-need-u/p/7048359.html