Introduction to the cut command:
cut is a selection command: it analyzes a piece of data and extracts the parts we want. Selection is generally done line by line, not on the input as a whole.
1. Command format:
cut [-bn] [file]
cut [-c] [file]
cut [-df] [file]
2. Command function
cut extracts bytes, characters, or fields from each line of the file and writes them to standard output.
If you do not specify a file parameter, cut reads standard input. One of the -b, -c, or -f flags must be specified.
3. Main parameters
-b: Select by byte position. (bytes)
These byte positions ignore multibyte character boundaries unless the -n flag is also specified.
-c: Select by character position. (characters)
-d: Custom field delimiter; the default is a tab.
-f: Used with -d to specify which fields to display. (fields)
-n: Do not split multibyte characters. Used only with the -b flag.
If the last byte of a character falls within the range given by the list parameter of the -b flag,
the character is written out; otherwise the character is excluded.
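The three selection modes can be compared on a single line of input. This is only a minimal sketch: the sample line and the variable names are made up for illustration and do not come from the examples that follow.

```shell
# Hypothetical sample line; its fields are separated by colons.
line='alice:x:1000:users'

# -b selects by byte position.
by_byte=$(printf '%s\n' "$line" | cut -b 1-5)

# -c selects by character position (same result as -b for single-byte text).
by_char=$(printf '%s\n' "$line" | cut -c 1-5)

# -f selects fields; -d sets the delimiter (the default would be a tab).
by_field=$(printf '%s\n' "$line" | cut -d : -f 1)

printf '%s\n' "$by_byte" "$by_char" "$by_field"
```

All three commands should print alice here, because the first field happens to be five single-byte characters long.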
4. Command examples:
Example one: position by byte, displaying the 3rd and the 14th byte respectively
# who
root     pts/0        2016-10-07 09:00 (10.*.*.*)
root     pts/1        2016-10-07 09:37 (10.*.*.*)
# who | cut -b 3
o
o
# who | cut -b 14
0
1
Example two:
1, Byte positioning, displaying bytes 1 to 4 and the 14th byte at the same time
# who | cut -b 1-4,14
root0
root1
2, cut sorts the positions from smallest to largest before extracting. Listing them in a different order does not change the order of the output.
# who | cut -b 10,1-4,14
rootp0
rootp1
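The ordering rule can be checked without relying on who output. The sketch below uses a made-up ten-letter line and hypothetical variable names, and passes the same positions to cut in two different orders.

```shell
# Hypothetical sample line: ten single-byte letters.
line='abcdefghij'

# Positions given in ascending order.
ordered=$(printf '%s\n' "$line" | cut -b 1-4,10)

# The same positions with byte 10 listed first.
shuffled=$(printf '%s\n' "$line" | cut -b 10,1-4)

printf '%s\n' "$ordered" "$shuffled"
```

Both commands should print abcdj, since cut emits the selected bytes in input order regardless of how the list is written.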
3, -3 means from the first byte to the third byte
# who | cut -b -3
roo
roo
4, 3- means from the third byte to the end of the line
# who | cut -b 3-
ot     pts/0        2016-10-07 09:00 (10.*.*.*)
ot     pts/1        2016-10-07 09:37 (10.*.*.*)
# who | cut -b -3,3-
root     pts/0        2016-10-07 09:00 (10.*.*.*)
root     pts/1        2016-10-07 09:37 (10.*.*.*)
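The open-ended ranges can be tried on a short fixed string as well. This is an illustrative sketch with a made-up sample line and hypothetical variable names; it also shows that a byte covered by two overlapping ranges is written only once.

```shell
# Hypothetical sample line: ten single-byte letters.
line='abcdefghij'

# -3 selects from the first byte through the third.
head_part=$(printf '%s\n' "$line" | cut -b -3)

# 3- selects from the third byte through the end of the line.
tail_part=$(printf '%s\n' "$line" | cut -b 3-)

# The two ranges overlap at byte 3, which is still output only once.
whole=$(printf '%s\n' "$line" | cut -b -3,3-)

printf '%s\n' "$head_part" "$tail_part" "$whole"
```

The expected results are abc, cdefghij, and the unchanged line abcdefghij.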
5, The characters above are all single-byte, so there is no difference between -b and -c.
If you extract Chinese text, -b naively counts in bytes (8 bits each) and the output is garbled. The file cut_ch.txt below holds the Chinese words for Monday, Tuesday, and Wednesday.
# cat cut_ch.txt
星期一
星期二
星期三
# cut -b 3 cut_ch.txt
(garbled output)
# cut -c 3 cut_ch.txt
一
二
三
When you encounter multibyte characters, you can use the -n option, which tells cut not to split multibyte characters. For example:
# cat cut_ch.txt | cut -nb
星
星
星
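Why a single byte position garbles Chinese text can be seen by counting bytes. The sketch below assumes UTF-8 encoding, in which 星 occupies three bytes. The variable names are hypothetical, and the character is written as octal escapes so the result does not depend on how the script itself is encoded.

```shell
# 星 in UTF-8 is the three bytes E6 98 9F (octal 346 230 237).
star=$(printf '\346\230\237')

# wc -c counts bytes, not characters; tr strips padding some wc versions add.
byte_count=$(printf '%s' "$star" | wc -c | tr -d ' ')

# Selecting all three bytes keeps the character intact.
whole=$(printf '%s\n' "$star" | cut -b 1-3)

# Selecting only byte 3 leaves one stray byte plus the newline: 2 bytes.
partial_bytes=$(printf '%s\n' "$star" | cut -b 3 | wc -c | tr -d ' ')

printf '%s %s %s\n' "$byte_count" "$whole" "$partial_bytes"
```

A lone byte from the middle of a multibyte sequence is not a valid character on its own, which is why the terminal shows it as garbage.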
6, Fields: set a delimiter, then choose which fields to extract
Why extract by field? Because the -b and -c options described above can only extract information from fixed-width input and are helpless with variable-width data. This is where fields come in handy.
If you look at the /etc/passwd file, you will find that, unlike the who output, its items are not aligned at fixed positions. However, the colon plays a very important role in each line of the file: it separates the items.
Take the first 3 lines of /etc/passwd as an example:
# cat /etc/passwd | head -n 3
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
Use -d to set the delimiter to a colon and -f to choose the field to list. Field 1 lists the user names.
# cat /etc/passwd | head -n 3 | cut -d : -f 1
root
bin
daemon
When setting -f, you can also use forms such as 3-5 or 4-:
# cat /etc/passwd | head -n 3 | cut -d : -f 1,3-5,7
root:0:0:root:/bin/bash
bin:1:1:bin:/sbin/nologin
daemon:2:2:daemon:/sbin/nologin
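Field selection can be reproduced without reading the real /etc/passwd. The sketch below feeds the root line shown above to cut as a fixed string. The variable names are hypothetical, and the open-ended list 6- is added only to illustrate that form.

```shell
# The root entry from the listing above, used as fixed sample input.
sample='root:x:0:0:root:/root:/bin/bash'

# A single field: the user name.
name=$(printf '%s\n' "$sample" | cut -d : -f 1)

# A mixed list: field 1, fields 3 through 5, and field 7.
subset=$(printf '%s\n' "$sample" | cut -d : -f 1,3-5,7)

# An open-ended list: field 6 through the last field.
rest=$(printf '%s\n' "$sample" | cut -d : -f 6-)

printf '%s\n' "$name" "$subset" "$rest"
```

The selected fields are joined with the same delimiter in the output, so the second result is root:0:0:root:/bin/bash and the third is /root:/bin/bash.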
7, Telling tabs from spaces: the following method shows whether a gap consists of several spaces or a tab.
# cat tab_space.txt
This is tab     finish.
This is several space      finish.
# sed -n l tab_space.txt    (the character after sed -n is a lowercase L)
This is tab\tfinish.$
This is several space      finish.$
A tab is displayed as the \t symbol, while spaces are displayed as they are.
Tabs and spaces can be told apart by this method.
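The same check works on input generated on the fly, which avoids creating a file first. This is a small sketch with made-up sample strings and hypothetical variable names. printf turns \t into a real tab, and sed -n l prints it back in visible form.

```shell
# One line containing a real tab, one containing three plain spaces.
tab_view=$(printf 'a\tb\n' | sed -n l)
space_view=$(printf 'a   b\n' | sed -n l)

printf '%s\n' "$tab_view" "$space_view"
```

The first result is the literal text a\tb$ and the second is a   b$, where the trailing $ marks the end of each line.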
8. Tabs and spaces as delimiters
The default delimiter for cut is the tab character, so for tab-separated input the -d option can be omitted entirely;
cut -d ' ' sets a single space as the delimiter.
# cat tab_space.txt | cut -d ' ' -f 1    (exactly one space between the single quotes, no more)
This
This
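One consequence of the single-character delimiter is worth trying out: cut does not merge runs of spaces, so every space starts a new field. The sketch below uses made-up sample strings and hypothetical variable names to compare the default tab delimiter, a space delimiter, and a line with two consecutive spaces.

```shell
# A line with single spaces and one real tab (printf expands \t).
first_word=$(printf 'This is tab\tfinish.\n' | cut -d ' ' -f 1)

# With no -d, the tab is the delimiter, so field 2 is the text after it.
after_tab=$(printf 'This is tab\tfinish.\n' | cut -f 2)

# Two consecutive spaces enclose an empty field 2; "b" is field 3.
empty_field=$(printf 'a  b\n' | cut -d ' ' -f 2)
third_field=$(printf 'a  b\n' | cut -d ' ' -f 3)

printf '[%s] [%s] [%s] [%s]\n' "$first_word" "$after_tab" "$empty_field" "$third_field"
```

For input padded with a varying number of spaces, such as the who output, the byte positions used earlier are therefore usually easier to work with than space-delimited fields.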