The cut command for Linux file operations

Source: Internet
Author: User


I. The cut command

1. cut is a command for extracting parts of each line. First, look at the options listed in the cut man page:

    -b, --bytes=LIST
            select only these bytes
    -c, --characters=LIST
            select only these characters
    -d, --delimiter=DELIM
            use DELIM instead of TAB for field delimiter
    -f, --fields=LIST
            select only these fields; also print any line that contains no
            delimiter character, unless the -s option is specified
    -n
            with -b: don't split multibyte characters
    --complement
            complement the set of selected bytes, characters or fields
    -s, --only-delimited
            do not print lines not containing delimiters
    --output-delimiter=STRING
            use STRING as the output delimiter; the default is to use the
            input delimiter

2. Parameter explanation

-b: Select by bytes. Byte positions ignore multibyte character boundaries unless -n is also specified. What is a multibyte character? A Chinese character, for example, occupies two bytes (depending on the encoding, possibly 2 to 4 bytes).

-n: Used only with -b; do not split multibyte characters.

Examples:

    [test2]# more test.txt
    abc
    123
    ijk
    mnq
    [test2]# cut -b 1 test.txt
    a
    1
    i
    m
    [test2]# more test2.txt    # this file is UTF-8 encoded; one Chinese character occupies 3 bytes
    123456789
    中文字符
    一二三
    # without -n, only the 6th byte is printed, so the Chinese characters cannot be displayed
    [test2]# cut -b 6 test2.txt
    6
    # with -n the characters are not split: a character is printed when the 6th byte falls on a complete character
    [test2]# cut -nb 6 test2.txt
    6
    # the 4th byte does not fall on a complete character, so nothing is printed for the Chinese lines
    [test2]# cut -nb 4 test2.txt
    4
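The byte counts behind these examples can be checked directly. The following is a minimal sketch, assuming a POSIX shell and a script saved as UTF-8; the variable names char_bytes and first_char are made up for illustration.

```shell
# wc -c counts bytes, not characters, so it shows how wide one
# UTF-8 encoded Chinese character is. tr strips padding that some
# wc implementations add around the number.
char_bytes=$(printf '%s' '中' | wc -c | tr -d ' ')
printf '%s\n' "$char_bytes"

# Bytes 1-3 are exactly the first character, so it survives intact.
# Selecting a single byte out of the three would split the character.
first_char=$(printf '%s\n' '中文' | cut -b 1-3)
printf '%s\n' "$first_char"
```

Whether -n and -c respect character boundaries depends on the cut build. The GNU coreutils manual describes -c as the same as -b and -n as a no-op in the upstream version, so the outputs shown in this article assume a build with multibyte support.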

-c: Select by characters. When every character occupies one byte, the result is the same as with -b, but it differs for multibyte characters.

Examples:

    [test2]# cut -b 3 test.txt
    c
    3
    k
    q
    [test2]# cut -c 3 test.txt
    c
    3
    k
    q
    [test2]# cut -b 3 test2.txt
    3
    [test2]# cut -c 3 test2.txt
    3
    字
    三


-d: Custom delimiter; the default is the tab character.

-f: Select fields. Fields are split on the delimiter, so -f is generally used together with -d. Because the delimiter character can be chosen freely, this combination is the most flexible, which makes -f and -d the most commonly used options.

For example, the passwd file separates its fields with a colon ":".

    [test2]# cat /etc/passwd | head -n 5
    root:x:0:0:root:/root:/bin/bash
    bin:x:1:1:bin:/bin:/sbin/nologin
    daemon:x:2:2:daemon:/sbin:/sbin/nologin
    adm:x:3:4:adm:/var/adm:/sbin/nologin
    lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
    # print only the 5th field after splitting on ":"
    [test2]# cat /etc/passwd | head -n 5 | cut -d : -f 5
    root
    bin
    daemon
    adm
    lp
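To try the -d and -f combination without reading the real /etc/passwd, a couple of sample lines can be piped into cut. This is a minimal sketch assuming a POSIX shell; the variable names sample, fifth and shells are made up for illustration.

```shell
# Two passwd-style lines used as illustrative input.
sample='root:x:0:0:root:/root:/bin/bash
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin'

# Split each line on ":" and keep only the 5th field.
fifth=$(printf '%s\n' "$sample" | cut -d : -f 5)
printf '%s\n' "$fifth"

# The same split, keeping the 7th field (the login shell).
shells=$(printf '%s\n' "$sample" | cut -d : -f 7)
printf '%s\n' "$shells"
```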

-s: Do not print lines that contain no delimiter. Suppose the lines of a file are split on "-"; with -s, a line that has no "-" is not printed.

    [test2]# more test6.txt
    1-2-3-4
    a,b,c,d
    e-f-g-h
    h:i:j:k
    i-ii-iii-iv-v
    [test2]# cut -d '-' -f 1- test6.txt
    1-2-3-4
    a,b,c,d
    e-f-g-h
    h:i:j:k
    i-ii-iii-iv-v
    [test2]# cut -d '-' -f 1- -s test6.txt
    1-2-3-4
    e-f-g-h
    i-ii-iii-iv-v
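The effect of -s is easiest to see when the same input is run with and without it. This is a minimal sketch assuming a POSIX shell; the variable names lines, without_s and with_s are made up for illustration.

```shell
# Illustrative input: two lines contain "-", one does not.
lines='1-2-3-4
a,b,c,d
e-f-g-h'

# Without -s, the line that has no "-" is passed through unchanged.
without_s=$(printf '%s\n' "$lines" | cut -d '-' -f 1)

# With -s, that line is suppressed.
with_s=$(printf '%s\n' "$lines" | cut -d '-' -f 1 -s)

printf 'without -s:\n%s\n' "$without_s"
printf 'with -s:\n%s\n' "$with_s"
```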

As you can see, -f splits each line into fields on the specified character, and you can then select the fields you want.

The examples so far each selected a single byte, character or field. In fact, cut can also select several at once. A few examples make this clear without a long explanation.

    [test2]# cat test.txt
    abc
    123
    ijk
    mnq
    [test2]# cat test2.txt
    123456789
    中文字符
    一二三
    # the 1st and 3rd bytes
    [test2]# cut -b 1,3 test.txt
    ac
    13
    ik
    mq
    # 1-3 selects the 1st through the 3rd; -2 is the same as 1-2, and 2- means from the 2nd to the last. Note that positions start at 1.
    [test2]# cut -c 1-3 test2.txt
    123
    中文字
    一二三
    [test2]# ll | cut -d ' ' -f 1,10 | sed '1d'    # show only the permissions
    -rw-r--r--. test2.txt
    -rw-r--r--. test3.txt
    -rw-r--r--. test.txt
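The list forms accepted by -b, -c and -f (N, N-M, -M and N-) can be compared side by side on one short string. This is a minimal sketch assuming a POSIX shell and ASCII input; all variable names are made up for illustration.

```shell
word='abcdef'

first_and_third=$(printf '%s\n' "$word" | cut -c 1,3)   # positions 1 and 3
one_to_three=$(printf '%s\n' "$word" | cut -c 1-3)      # positions 1 through 3
up_to_two=$(printf '%s\n' "$word" | cut -c -2)          # same as 1-2
from_fourth=$(printf '%s\n' "$word" | cut -c 4-)        # position 4 to end of line

printf '%s\n' "$first_and_third" "$one_to_three" "$up_to_two" "$from_fourth"
```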

As mentioned above, the -d option defaults to the tab character, but some files contain a mix of tabs and spaces.

    [test2]# more test3.txt
    id      姓名    年龄
    001     张三    18
    002    李四    19
    003     王五    20
    [test2]# cut -d ' ' -f 2-3 test3.txt
    id      姓名    年龄
    001     张三    18
     
     
    [test2]# cut -f 2-3 test3.txt
    姓名    年龄
    张三    18
    002    李四    19
    王五    20

Neither result is what we wanted: the goal was to show the name and age columns. Let's look at how the file is actually stored to see why we got these results.

    [test2]# sed -n l test3.txt
    id\t\345\247\223\345\220\215\t\345\271\264\351\276\204$
    001\t\345\274\240\344\270\211\t18$
    002    \346\235\216\345\233\233    19$
    003\t\347\216\213\344\272\224    20$

You can see that the fields of the first and second lines are joined by tabs (\t), the third line is separated by spaces, and the fourth line uses both a tab and spaces.

In the first case, splitting on a space, the first and second lines contain no spaces, so they are not split at all and are printed whole. The third line splits into many fields because every single space is a delimiter; fields 2-3 are empty, so only a space is printed. The fourth line behaves the same way. Note that there are several spaces in a row in the middle.

In the second case, splitting on tabs, the first and second lines print as expected. The third line has no tab, so it is not split and is printed whole. The fourth line splits into 2 fields, so selecting fields 2-3 prints everything after the tab.
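One common workaround for input that mixes tabs and spaces, which is not part of cut itself, is to normalize the whitespace with tr before cutting. The sketch below assumes a POSIX shell and that no field contains a space of its own; the rows are ASCII stand-ins for the file discussed here, and the variable names rows and name_age are made up for illustration.

```shell
# Illustrative rows: tab-separated, space-separated, and mixed.
rows=$(printf 'id\tname\tage\n001\tzhangsan\t18\n002    lisi    19\n003\twangwu    20\n')

# tr turns every space into a tab and, because of -s, squeezes each
# run of tabs into a single tab, so every row ends up with exactly
# one tab between columns. cut can then use its default delimiter.
name_age=$(printf '%s\n' "$rows" | tr -s ' ' '\t' | cut -f 2-3)
printf '%s\n' "$name_age"
```

This approach would also merge genuinely empty columns, so it only suits data where every column has a value.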

Here is how splitting on spaces proceeds:

    [test2]# more test5.txt
    1 6
    [test2]# cut -d ' ' -f 2 test5.txt
    3
    [test2]# cut -d ' ' -f 3 test5.txt
    4
    [test2]# cut -d ' ' -f 4 test5.txt
    5
    [test2]# cut -d ' ' -f 5 test5.txt
    6

You can think of each delimiter as an imaginary slash: the slash itself is not printed, and there may be nothing between two slashes, which gives an empty field.
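The empty fields that appear between consecutive delimiters can be made visible by printing each field inside brackets. This is a minimal sketch assuming a POSIX shell; the input line and the variable names are made up for illustration.

```shell
# Illustrative line: "a", one space, "b", three spaces, "c".
line='a b   c'

f2=$(printf '%s\n' "$line" | cut -d ' ' -f 2)   # the text between the 1st and 2nd space
f3=$(printf '%s\n' "$line" | cut -d ' ' -f 3)   # empty: two spaces sit side by side
f4=$(printf '%s\n' "$line" | cut -d ' ' -f 4)   # empty for the same reason
f5=$(printf '%s\n' "$line" | cut -d ' ' -f 5)   # the text after the last space

printf '[%s] [%s] [%s] [%s]\n' "$f2" "$f3" "$f4" "$f5"
```

Four spaces produce five fields, and two of them are empty because cut never merges adjacent delimiters.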

This imaginary "slash" can also be printed, and it can be replaced with a custom string by using --output-delimiter=STRING.

    [test2]# cut -d '-' -f 1- -s test6.txt --output-delimiter='$'
    1$2$3$4
    e$f$g$h
    i$ii$iii$iv$v

In the output, the original delimiter is replaced with the new one.
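The output delimiter does not have to be a single character. The sketch below assumes GNU cut, since --output-delimiter is a GNU option that other cut implementations may not accept; the variable names line and joined are made up for illustration.

```shell
line='i-ii-iii-iv-v'

# Select field 1 and fields 3 onward, then join the selected
# fields with " | " instead of the input delimiter "-".
joined=$(printf '%s\n' "$line" | cut -d '-' -f 1,3- --output-delimiter=' | ')
printf '%s\n' "$joined"
```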

We have also seen that cut's handling of whitespace is far from ideal. It has only a few options, so it is enough to remember them.


This article is from the "Ding classmate 1990" blog, please be sure to keep this source http://dingtongxue1990.blog.51cto.com/4959501/1696201
