Splitting and merging large files using Linux commands
Linux's split command is a good choice when you need to split a large file. It takes several options and can split a file either by line count or by size in bytes.
The syntax of the split command is as follows:
split [--help] [--version] [-a N] [-b SIZE] [-C SIZE] [-d] [-l NUMBER] [file to split] [output file name prefix]
The parameters are described as follows:
-a, --suffix-length=N: use suffixes of length N (2 by default)
-b, --bytes=SIZE: put SIZE bytes in each output file
-C, --line-bytes=SIZE: put at most SIZE bytes of whole lines in each output file
-d, --numeric-suffixes: use numeric suffixes instead of alphabetic suffixes
-l, --lines=NUMBER: put NUMBER lines in each output file
--help: display help information
--version: display version information
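The difference between -b and -C is easiest to see on a tiny file. The following is a minimal sketch, assuming GNU coreutils; the file sample.txt and the prefixes bytes_ and lines_ are made up for illustration and do not appear in the examples that follow.

```shell
# Work in a scratch directory so the demo does not touch existing files.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical sample file: 4 lines of 10 bytes each (9 characters plus newline).
printf 'line-0001\nline-0002\nline-0003\nline-0004\n' > sample.txt

# -b cuts at exact byte offsets, so a line can be broken across two pieces.
# 40 bytes at 15 bytes per piece gives 3 pieces (15, 15 and 10 bytes).
split -b 15 sample.txt bytes_

# -C keeps lines whole: only one 10-byte line fits in a 15-byte budget,
# so each line ends up in its own piece.
split -C 15 sample.txt lines_

# Show the size of every piece.
wc -c bytes_* lines_*
```

With -b the second piece starts in the middle of a line, which is usually not what you want for text files such as logs or CSV data; -C avoids that at the cost of slightly uneven piece sizes.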
The following are examples:
1. Split the splitTest.txt file into multiple files of at most 20 MB each (the last file may be smaller). Command:
$ split -b 20m splitTest.txt
$ ls
splitTest.txt xaa xab xac
2. Split the splitTest.txt file into multiple files of at most 20 MB each, and specify the prefix of the output files. Command:
$ split -b 20m splitTest.txt split
$ ls
splitaa splitab splitac splitTest.txt
3. Split the splitTest.txt file into multiple files of at most 500,000 lines each. Command:
$ wc -l splitTest.txt
1502216 splitTest.txt
$ split -l 500000 splitTest.txt split
$ ls
splitaa splitab splitac splitad splitTest.txt
4. Split the splitTest.txt file into multiple files of at most 500,000 lines each, using three-digit numeric suffixes for the output files. Command:
$ wc -l splitTest.txt
1502216 splitTest.txt
$ split -l 500000 -d -a 3 splitTest.txt split
$ ls
split000 split001 split002 split003 splitTest.txt
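The number of output files in the line-based examples follows from a ceiling division of the total line count by the lines per file. As a rough sketch, the arithmetic can be done directly in the shell; the variable names are illustrative only.

```shell
# Line count reported by `wc -l` in the examples above.
total_lines=1502216
# Value passed to `split -l`.
lines_per_piece=500000

# Ceiling division using integer arithmetic.
pieces=$(( (total_lines + lines_per_piece - 1) / lines_per_piece ))
# Lines left over for the final piece.
last_piece=$(( total_lines - (pieces - 1) * lines_per_piece ))

echo "pieces: $pieces, lines in last piece: $last_piece"
# prints "pieces: 4, lines in last piece: 2216"
```

This matches the four files listed by ls: three full files of 500,000 lines and a fourth holding the remaining 2,216 lines.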
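You can use the cat command to merge the split files back into a single file. The shell expands the wildcard in sorted order, so the pieces are concatenated in their original sequence: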
$ cat split0* > original.txt
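After merging, it is worth checking that the result is identical to the original file. The sketch below runs the whole round trip on a small generated file, assuming GNU coreutils; the 2,500-line test file and the 1,000-line piece size are stand-ins chosen to keep the demo fast, not values from the examples above.

```shell
# Work in a scratch directory so the demo does not touch existing files.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical stand-in for splitTest.txt: 2500 numbered lines.
seq 1 2500 > splitTest.txt

# Split into pieces of 1000 lines with three-digit numeric suffixes.
split -l 1000 -d -a 3 splitTest.txt split

# The wildcard expands in sorted order: split000 split001 split002.
cat split0* > original.txt

# cmp -s is silent and exits with status 0 only if the files are byte-for-byte identical.
if cmp -s splitTest.txt original.txt; then
    result="identical"
else
    result="different"
fi
echo "$result"
```

For very large files, comparing checksums (for example with sha256sum) on both sides of a transfer serves the same purpose when the original and the merged copy are not on the same machine.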