Linux commands: cut

Source: Internet
Author: User

Introduction to the cut command:

cut is a selection command: it analyzes a piece of data and extracts the parts we want. Selection generally works line by line rather than on the input as a whole.

1. Command format:

cut -b list [-n] [file]

cut -c list [file]

cut -f list [-d delim] [file]

2. Command function
cut extracts bytes, characters, or fields from each line of a file and writes them to standard output.
If no file parameter is specified, cut reads standard input. One of the -b, -c, or -f flags must be specified.
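
The two input modes described above can be tried with a short sketch. The sample file, its contents, and the variable names are made up for illustration; `mktemp` is assumed to be available.

```shell
# Throwaway sample file; its name and contents are hypothetical.
tmpfile=$(mktemp)
printf 'alpha:beta\ngamma:delta\n' > "$tmpfile"

# With a file operand, cut reads the named file.
from_file=$(cut -d : -f 2 "$tmpfile")

# Without a file operand, cut reads standard input.
from_stdin=$(cut -d : -f 2 < "$tmpfile")

# Without any of -b, -c, or -f, cut is expected to refuse to run.
if cut "$tmpfile" >/dev/null 2>&1; then no_list_ok=yes; else no_list_ok=no; fi

printf '%s\n' "$from_file" "$from_stdin" "accepted without a list: $no_list_ok"
rm -f "$tmpfile"
```

Both invocations should print the same second field of each line; only the way the data reaches cut differs.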


3. Main parameters
-b: Select by byte position. (bytes)

These byte positions ignore multibyte character boundaries unless the -n flag is also specified.
-c: Select by character position. (characters)

-d: Set a custom field delimiter; the default is tab.
-f: Select by field, usually together with -d, to specify which fields to display. (fields)

-n: Do not split multibyte characters. Used only with the -b flag.

If the last byte of a character falls within the range given by the list parameter of the -b flag, the character is written out; otherwise the character is excluded.

4. Command examples:

Example one: select by byte position, displaying the 3rd and the 14th byte of each line

# who

root     pts/0        2016-10-07 09:00 (10.*.*.*)
root     pts/1        2016-10-07 09:37 (10.*.*.*)

# who | cut -b 3
o
o

# who | cut -b 14
0
1
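
Because the output of who differs from machine to machine, the following sketch feeds cut two fixed lines instead. The variable `sample` is a stand-in for the who output and is an assumption of this example, not part of the original session.

```shell
# Fixed sample lines standing in for the output of `who` (hypothetical data).
sample='root     pts/0        2016-10-07 09:00 (10.*.*.*)
root     pts/1        2016-10-07 09:37 (10.*.*.*)'

# Byte 3 of each line is the second "o" of "root".
third=$(printf '%s\n' "$sample" | cut -b 3)

# Byte 14 of each line is the terminal number after "pts/".
fourteenth=$(printf '%s\n' "$sample" | cut -b 14)

printf '%s\n' "$third" "$fourteenth"
```

The positions only line up because who pads its columns to a fixed width, which is why byte selection suits this kind of output.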


Example two:

1. Select by byte position, displaying bytes 1 through 4 and byte 14 together

# who | cut -b 1-4,14
root0
root1


2. cut sorts the listed positions in ascending order before extracting, so writing the list in a different order does not reorder the output

# who | cut -b 10,1-4,14
rootp0
rootp1
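
A small sketch of the ordering rule, again on fixed sample lines (the `sample` variable is hypothetical). The last command uses awk, a different tool, only to show one way to get the pieces in a chosen order when that is really needed.

```shell
# Fixed sample lines standing in for the output of `who` (hypothetical data).
sample='root     pts/0        2016-10-07 09:00 (10.*.*.*)
root     pts/1        2016-10-07 09:37 (10.*.*.*)'

# The same positions listed in two different orders.
shuffled=$(printf '%s\n' "$sample" | cut -b 10,1-4,14)
sorted=$(printf '%s\n' "$sample" | cut -b 1-4,10,14)

# awk's substr() emits the pieces in whatever order they are written.
reordered=$(printf '%s\n' "$sample" |
  awk '{ print substr($0, 10, 1) substr($0, 1, 4) substr($0, 14, 1) }')

printf '%s\n' "$shuffled" "$sorted" "$reordered"
```

The first two results are identical, while the awk variant puts byte 10 first.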

3. -3 means from the first byte through the third byte

# who | cut -b -3
roo
roo

4. 3- means from the third byte to the end of the line

# who | cut -b 3-
ot     pts/0        2016-10-07 09:00 (10.*.*.*)
ot     pts/1        2016-10-07 09:37 (10.*.*.*)

The ranges -3 and 3- overlap at byte 3, but each selected byte is written only once, so the whole line comes out unchanged:

# who | cut -b -3,3-
root     pts/0        2016-10-07 09:00 (10.*.*.*)
root     pts/1        2016-10-07 09:37 (10.*.*.*)
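
The range forms can be compared side by side on a short string. The value of `line` is an arbitrary sample chosen so that each byte is easy to identify.

```shell
line='abcdefgh'   # arbitrary sample; byte N is the Nth letter

head3=$(printf '%s\n' "$line" | cut -b -3)     # bytes 1 through 3
from3=$(printf '%s\n' "$line" | cut -b 3-)     # byte 3 to end of line
middle=$(printf '%s\n' "$line" | cut -b 3-5)   # bytes 3 through 5
both=$(printf '%s\n' "$line" | cut -b -3,3-)   # overlapping ranges

printf '%s\n' "$head3" "$from3" "$middle" "$both"
```

The last result equals the input: byte 3 belongs to both ranges but appears once.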

5. Everything so far consists of single-byte characters, so -b and -c give the same result.

With Chinese text, -b counts naively in bytes (8 bits each), so the output is garbled. The three lines of cut_ch.txt are the Chinese words for Monday, Tuesday, and Wednesday.

# cat cut_ch.txt
星期一
星期二
星期三
# cut -b 3 cut_ch.txt
(garbled output)

# cut -c 3 cut_ch.txt
一
二
三

This assumes a cut that is multibyte-aware in the current locale; some implementations treat -c the same as -b.

When you encounter multibyte characters, you can use the -n option, which tells cut not to split multibyte characters. For example:

# cat cut_ch.txt | cut -nb
星
星
星
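
To see why a single byte position produces garbled output, it helps to count the bytes. The sketch below assumes the text is stored as UTF-8, where each of these Chinese characters occupies 3 bytes; in another encoding the numbers differ. It uses only -b, which works on raw bytes regardless of locale.

```shell
word='星期一'   # "Monday"; assumed to be UTF-8 encoded (3 bytes per character)

# Three characters take nine bytes.
byte_count=$(printf '%s' "$word" | wc -c | tr -d ' ')

# Selecting whole 3-byte groups yields complete characters.
first_char=$(printf '%s\n' "$word" | cut -b 1-3)
third_char=$(printf '%s\n' "$word" | cut -b 7-9)

# Selecting byte 3 alone yields one stray byte plus the newline: a fragment
# of a character, which a terminal shows as garbage.
fragment_size=$(printf '%s\n' "$word" | cut -b 3 | wc -c | tr -d ' ')

printf '%s\n' "$byte_count" "$first_char" "$third_char" "$fragment_size"
```

Byte ranges that happen to cover whole characters work, but the caller has to know the encoding, which is exactly the bookkeeping that -c and -n are meant to take over.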


6. Fields: set a delimiter, then choose which fields to extract


Why extract by field? The -b and -c options described above can only extract information from fixed-width data and are helpless with variable-width data. This is where fields come in handy.

If you look at the /etc/passwd file, you will find that it is not aligned in columns like the who output; the entries vary in length. However, the colon plays an important role on every line: it separates the items.


Take the first 3 lines of /etc/passwd as an example:

# cat /etc/passwd | head -n 3
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin


Use -d to set the delimiter to a colon and -f to select the first field, which lists the user names.

# cat /etc/passwd | head -n 3 | cut -d : -f 1
root
bin
daemon


With -f you can also use list forms such as 3-5 or 4-:

# cat /etc/passwd | head -n 3 | cut -d : -f 1,3-5,7
root:0:0:root:/bin/bash
bin:1:1:bin:/sbin/nologin
daemon:2:2:daemon:/sbin/nologin
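
Since /etc/passwd differs between systems, the field examples can be reproduced on a fixed copy of the three lines shown above. The variable `passwd_sample` is introduced here purely for illustration.

```shell
# Fixed copy of three passwd-style lines (sample data, not read from the system).
passwd_sample='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin'

# Field 1: the user names.
names=$(printf '%s\n' "$passwd_sample" | cut -d : -f 1)

# A mixed list: single fields and a closed range.
picked=$(printf '%s\n' "$passwd_sample" | cut -d : -f 1,3-5,7)

# An open range: field 6 through the last field.
tail_fields=$(printf '%s\n' "$passwd_sample" | cut -d : -f 6-)

printf '%s\n' "$names" "$picked" "$tail_fields"
```

When several fields are selected, cut joins them with the same delimiter it split on, which is why the colons reappear in the output.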

7. Identifying tabs: the following method shows whether a gap consists of several spaces or of a tab.

# cat tab_space.txt
This is tab     finish.
This is several space      finish.


# sed -n l tab_space.txt    (the character after sed -n is a lowercase L)

This is tab\tfinish.$
This is several space      finish.$


A tab is displayed as \t, while a space is displayed as is.
Tabs and spaces can be told apart this way.
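
The check can be reproduced without a file by generating the two lines with printf. The sample strings mirror tab_space.txt above; the exact number of spaces in the second one is an assumption of this sketch.

```shell
# One line with a real tab, one with a run of four spaces (sample data).
tab_line=$(printf 'This is tab\tfinish.\n' | sed -n l)
space_line=$(printf 'This is several space    finish.\n' | sed -n l)

# sed's l command writes a tab as \t and marks the end of each line with $.
printf '%s\n' "$tab_line" "$space_line"
```

The trailing $ also makes trailing spaces at the end of a line visible, which is useful before choosing a delimiter for cut.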


8. Tabs and spaces as delimiters

The default delimiter for cut is the tab character, in which case the -d option can be omitted entirely;

cut -d ' ' sets a single space as the delimiter:


# cat tab_space.txt | cut -d ' ' -f 1    (exactly one space between the single quotes; only one character is allowed)

This
This
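
One consequence of the single-character delimiter is worth a quick sketch: every space counts as a separator, so a run of spaces produces empty fields. The sample strings are made up, and squeezing the run with tr first is one common workaround rather than something the original example does.

```shell
# Default delimiter: the tab, so -d can be left out.
tab_field=$(printf 'This is tab\tfinish.\n' | cut -f 2)

# Three spaces before "finish." create two empty fields (5 and 6).
spaced='This is several space   finish.'
empty_field=$(printf '%s\n' "$spaced" | cut -d ' ' -f 5)

# Squeezing repeated spaces into one makes "finish." field 5.
squeezed_field=$(printf '%s\n' "$spaced" | tr -s ' ' | cut -d ' ' -f 5)

printf '[%s]\n' "$tab_field" "$empty_field" "$squeezed_field"
```

The brackets in the final printf are there only to make the empty field visible.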

