Shell script: the cut command

Source: Internet
Author: User

Contents:
1.1 option description
1.2 filter by field
1.3 Use --complement
1.4 split by byte or character
1.5 Use --output-delimiter
1.6 specify the range in cut

1.1 option description

The cut command splits each line into columns on a specified delimiter and prints the selected columns. Its weakness is that it treats every occurrence of the delimiter as a field boundary, so runs of repeated delimiters are hard to handle. It is therefore often combined with the squeeze option of tr (tr -s).

-b: Select by byte.
-n: Used with the "-b" option; do not split multi-byte characters.
-c: Select by character.
-f: Select by field.
-d: Specify the field delimiter. It can only be used with the "-f" option. If -d is not specified, the default field delimiter is TAB.
-s: Do not print lines that contain no delimiter.
--complement: Invert the selection, that is, output everything except the selected bytes, characters, or fields.
--output-delimiter: Specify the output delimiter. The default is the input delimiter.
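As a quick illustration of the three selection modes, here is a minimal sketch. The input records and variable names are made up for the example and are not taken from abc.sh.

```shell
# Hypothetical one-line record with TAB-separated columns.
record=$(printf 'id\tname\tmark')

# -f with no -d: fields are split on TAB, the default delimiter.
second_field=$(printf '%s\n' "$record" | cut -f2)

# -c selects character positions; no delimiter is involved.
first_two_chars=$(printf '%s\n' "$record" | cut -c1-2)

# -d changes the field delimiter and is only meaningful together with -f.
colon_field=$(printf 'root:x:0\n' | cut -d: -f3)

printf '%s\n' "$second_field" "$first_two_chars" "$colon_field"
# prints name, id and 0, one per line
```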

Assume /tmp/abc.sh has the content shown below. Note that the columns in rows 2 to 5 are not all separated by a single space: some are separated by several spaces and some by just one, so the text is not regular. The last line contains no spaces at all.

[root@xuexi tmp]# cat abc.sh
NO Name SubjectID Mark remarks
1 longshuai 001 56 fail
2 gaoxiaofang 001 60 pass
3 zhangsan 001 50 fail
4 lisi 001 80 pass
5 wangwu 001 90 pass
djakldj;lajd;sla

The following is an example of cut.

1.2 filter by field

abc.sh has five fields. Select the second field (the Name column) and the fourth field (the Mark column), using a space as the delimiter.

[root@xuexi tmp]# cut -d" " -f2,4 abc.sh
Name 001 50
djakldj;lajd;sla

The output is a mess. The reason is that the space delimiter is repeated between some columns, and cut treats every single space as a field boundary, so each run of spaces produces empty fields. To get the correct result, the repeated spaces have to be removed first.

You can use tr -s to squeeze each run of a repeated character into a single one.

[root@xuexi tmp]# cat abc.sh | tr -s " " | cut -d " " -f2,4
Name Mark
longshuai 56
gaoxiaofang 60
zhangsan 50
lisi 80
wangwu 90
djakldj;lajd;sla
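The effect of squeezing can be seen on a single line. The sketch below uses a made-up row with doubled spaces (the exact spacing in abc.sh may differ) and compares cut with and without tr -s.

```shell
# Hypothetical row: two spaces after "1" and after "001".
line='1  longshuai 001  56 fail'

# Without squeezing, each extra space creates an empty field,
# so field 2 is empty and field 4 is "001".
raw=$(printf '%s\n' "$line" | cut -d' ' -f2,4)

# With tr -s, runs of spaces collapse to one and the fields line up.
squeezed=$(printf '%s\n' "$line" | tr -s ' ' | cut -d' ' -f2,4)

printf 'raw: [%s]\nsqueezed: [%s]\n' "$raw" "$squeezed"
# prints:
# raw: [ 001]
# squeezed: [longshuai 56]
```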

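However, the last line of the output contains no delimiter and is still printed. Use -s to suppress such lines. (The commands from here on run cut directly on abc.sh; they produce the output shown only if the repeated spaces in the file have already been squeezed.)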

[root@xuexi tmp]# cut -d" " -f2,4 abc.sh -s
Name Mark
longshuai 56
gaoxiaofang 60
zhangsan 50
lisi 80
wangwu 90
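A minimal sketch of what -s changes, using two made-up lines in place of the file: one with delimiters and one without.

```shell
# Hypothetical input: the second line contains no space at all.
sample() { printf '1 longshuai 001 56\ndjakldj;lajd;sla\n'; }

# Without -s, a line with no delimiter is passed through unchanged.
without_s=$(sample | cut -d' ' -f2,4)

# With -s, that line is dropped from the output.
with_s=$(sample | cut -d' ' -s -f2,4)

printf '%s\n---\n%s\n' "$without_s" "$with_s"
```

Placing -s before the file name, as done here with the other options, is the more portable spelling; accepting options after the file name is a GNU convenience.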

1.3 Use --complement

Output all fields except the 2nd and 4th fields.

[root@xuexi tmp]# cut -d" " -f2,4 abc.sh -s --complement
NO SubjectID remarks
1 001 fail
2 001 pass
3 001 fail
4 001 pass
5 001 pass
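The selection and its complement can be compared side by side on one row. This sketch assumes GNU cut, since --complement is a GNU extension; the row is illustrative.

```shell
# Hypothetical single-space-separated row.
row='1 longshuai 001 56 fail'

# Fields 2 and 4 only.
kept=$(printf '%s\n' "$row" | cut -d' ' -f2,4)

# Everything except fields 2 and 4.
rest=$(printf '%s\n' "$row" | cut -d' ' -f2,4 --complement)

printf '%s\n%s\n' "$kept" "$rest"
# prints:
# longshuai 56
# 1 001 fail
```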

1.4 split by byte or character

English letters and Arabic numerals are single-byte characters, while a Chinese character occupies two bytes (for example in GBK) or three bytes (in UTF-8).

Use -b to select by byte, and use -c to select by character.

Note: -d cannot be specified when selecting by byte or character, because -d only defines how fields are split.

[root@xuexi tmp]# cut -b1-3 abc.sh    # select bytes 1-3 of each line
NO
1 l
2 g
3 z
4 l
5 w
dja

If the selected byte falls inside a multi-byte character, such as a Chinese character, the output is garbled.

[root@xuexi tmp]# cut -b20 abc.sh  
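The garbling comes from cutting through the middle of a multi-byte character. The sketch below, which assumes UTF-8 encoding and uses a single Chinese character written as octal escapes, counts the bytes involved instead of printing them.

```shell
# UTF-8 bytes of one Chinese character (U+4E0D), written as octal escapes.
han=$(printf '\344\270\215')

# Count the bytes that make up the character.
total_bytes=$(printf '%s' "$han" | wc -c | tr -d ' ')

# Cut only the first byte: what remains is an incomplete character.
partial_bytes=$(printf '%s\n' "$han" | cut -b1 | tr -d '\n' | wc -c | tr -d ' ')

# Cut all three bytes: the character comes back whole.
whole=$(printf '%s\n' "$han" | cut -b1-3)

printf 'total=%s partial=%s\n' "$total_bytes" "$partial_bytes"
# prints: total=3 partial=1
```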

Therefore, the "-b" option should be combined with the "-n" option, which stops "-b" from splitting multi-byte characters and producing garbled output. With -n, each output line is a complete character rather than a fragment. Support for -n depends on the cut implementation; upstream GNU coreutils documents it as a no-op.

[root@xuexi tmp]# cut -n -b20 abc.sh

You can also select by character.

[root@xuexi tmp]# cut -c20 abc.sh

1.5 Use --output-delimiter

Use "--output-delimiter" to specify the output separator.

When you select several segments with -b or -c, --output-delimiter lets you mark the boundaries between them. Without it, the segments are spliced together.

[root@xuexi tmp]# cut -b3-5,6-8 abc.sh    # the segments are spliced together
 Name
longsh
gaoxia
zhangs
lisi 0
wangwu
akldj;
[root@xuexi tmp]# cut -b3-5,6-8 abc.sh --output-delimiter ","    # separate the segments with commas
 Na,me
lon,gsh
gao,xia
zha,ngs
lis,i 0
wan,gwu
akl,dj;
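The same option also works with fields, where it replaces the input delimiter on output. A small sketch, assuming GNU cut (--output-delimiter is a GNU extension) and made-up input strings:

```shell
# Hypothetical eight-letter string for the byte ranges.
text='abcdefgh'

# Two byte ranges with no output delimiter are simply joined.
joined=$(printf '%s\n' "$text" | cut -b1-2,5-6)

# With --output-delimiter, the boundary between the ranges is marked.
marked=$(printf '%s\n' "$text" | cut -b1-2,5-6 --output-delimiter=',')

# With -f, the output delimiter replaces the input delimiter.
fields=$(printf 'a:b:c\n' | cut -d: -f1,3 --output-delimiter=' ')

printf '%s\n%s\n%s\n' "$joined" "$marked" "$fields"
# prints:
# abef
# ab,ef
# a c
```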

1.6 specify the range in cut

You can use "N-", "N-M", and "-M" to select, respectively, everything from the Nth character (or byte or field) to the end of the line, the Nth through the Mth, and everything from the start of the line through the Mth. Both N and M are included in the selection.

[root@xuexi tmp]# cut -d" " -f3- abc.sh -s    # output the third field and everything after it
SubjectID Mark remarks
001 56 fail
001 60 pass
001 50 fail
001 80 pass
001 90 pass
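The three range forms can be tried on one short row. The row and variable names below are illustrative only.

```shell
# Hypothetical row with six single-letter fields.
row='a b c d e f'

# N-  : from field 3 to the end of the line.
from_third=$(printf '%s\n' "$row" | cut -d' ' -f3-)

# -M  : from the first field through field 2.
up_to_second=$(printf '%s\n' "$row" | cut -d' ' -f-2)

# N-M : fields 2 through 4, both ends included.
second_to_fourth=$(printf '%s\n' "$row" | cut -d' ' -f2-4)

printf '%s\n%s\n%s\n' "$from_third" "$up_to_second" "$second_to_fourth"
# prints:
# c d e f
# a b
# b c d
```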

When ranges overlap, nothing is output twice. For example, -f3-5,4-6 produces the same output as -f3-6.

[root@xuexi tmp]# cut -d" " -f3-5,4-6 abc.sh -s    # overlapping ranges
SubjectID Mark remarks
001 56 fail
001 60 pass
001 50 fail
001 80 pass
001 90 pass

If the ranges are given out of order, cut still outputs the selected fields in their original, ascending order. For example, -f4-6,2 is equivalent to -f2,4-6.

[root@xuexi tmp]# cut -d" " -f4-6,2 abc.sh -s
Name Mark remarks
longshuai 56 fail
gaoxiaofang 60 pass
zhangsan 50 fail
lisi 80 pass
wangwu 90 pass
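Both rules, no duplicates for overlapping ranges and no reordering for unordered ones, can be checked on a single made-up row:

```shell
# Hypothetical row with six single-letter fields.
row='a b c d e f'
pick() { printf '%s\n' "$row" | cut -d' ' -f"$1"; }

# Overlapping ranges: fields 4 and 5 are printed once, not twice.
overlap=$(pick 3-5,4-6)
merged=$(pick 3-6)

# Out-of-order list: output still follows the order of the input line.
unordered=$(pick 4-6,2)
ordered=$(pick 2,4-6)

printf '%s\n%s\n' "$overlap" "$unordered"
# prints:
# c d e f
# b d e f
```

Because cut never reorders fields, a different tool is needed when the output columns must appear in a different order than in the input.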

Back to series article outline: http://www.cnblogs.com/f-ck-need-u/p/7048359.html
