Handling seq output in a shell script

Source: Internet
Author: User
Background: a shell script is used to extract content from a set of files whose names are generated from consecutive serial numbers. There are nearly 400 million files.
The original script is as follows:
#!/bin/sh
# str1=""

# filecount=`ls -l /root/gjj | wc -l | awk '{print $1}'`
# echo $filecount

# $1 and $2 are the first and last serial numbers to process
for n in `seq $1 $2`
do
    filename="/windows_gjj/${n}.txt"
    echo "$filename"

    # Convert line endings, then trim the file down to the lines of interest
    dos2unix "$filename"
    sed -i '1,76d' "$filename"
    sed -i '41,$d' "$filename"
    # Strip markup, leading whitespace and empty lines
    sed -i 's/<.*">//g' "$filename"
    sed -i 's/<.*>//g' "$filename"
    sed -i 's/^[[:space:]]*//g' "$filename"
    sed -i '/^$/d' "$filename"
    # sed -i 's/;//g' "$filename"

    # cat "$filename" >> /tmp/all_gjj.log

    # Number of lines left in the file
    flag=`grep "" "$filename" | wc -l | awk '{print $1}'`

    if [ "$flag" -ne 10 ]; then
        # Append (>>) so that earlier files are not overwritten
        cat "$filename" >> /tmp/all_gjj.log
        echo "********************************************************************************************" >> /tmp/all_gjj.log
        lcount=`wc -l "$filename" | awk '{print $1}'`
        str1=""
        for i in `seq 1 10`
        do
            # Drop one line, keep the next one, then drop it as well
            sed -i '1d' "$filename"
            str=`head -n 1 "$filename"`
            echo "$str" >> /tmp.log
            str1="${str1}${str}|"
            echo "$str1"
            sed -i '1d' "$filename"
        done

        echo "$str1" >> /root/gjj.txt
    fi

done

In the script, $1 and $2 are the starting and ending serial numbers.
At first the script extracted the file content without problems. However, once the serial number in the file name reached seven digits, the following errors appeared:

[root@ALL ~]# sh tiqu.sh 2908637 2908640
/windows_gjj/2.90864e000006.txt
dos2unix: converting file /windows_gjj/2.90864e000006.txt to UNIX format ...
dos2unix: problems converting file /windows_gjj/2.90864e000006.txt
sed: unable to read /windows_gjj/2.90864e000006.txt: no file or directory
sed: unable to read /windows_gjj/2.90864e000006.txt: no file or directory
sed: unable to read /windows_gjj/2.90864e000006.txt: no file or directory
sed: unable to read /windows_gjj/2.90864e000006.txt: no file or directory
sed: unable to read /windows_gjj/2.90864e000006.txt: no file or directory
sed: unable to read /windows_gjj/2.90864e000006.txt: no file or directory
grep: /windows_gjj/2.90864e000006.txt: the file or directory does not exist.
cat: /windows_gjj/2.90864e000006.txt: the file or directory does not exist.
wc: /windows_gjj/2.90864e000006.txt: the file or directory does not exist.
sed: unable to read /windows_gjj/2.90864e000006.txt: no file or directory
head: unable to open "/windows_gjj/2.90864e000006.txt" to read data: No such file or directory
|
sed: unable to read /windows_gjj/2.90864e000006.txt: no file or directory
sed: unable to read /windows_gjj/2.90864e000006.txt: no file or directory
head: unable to open "/windows_gjj/2.90864e000006.txt" to read data: No such file or directory
|
sed: unable to read /windows_gjj/2.90864e000006.txt: no file or directory
sed: unable to read /windows_gjj/2.90864e000006.txt: no file or directory
head: unable to open "/windows_gjj/2.90864e000006.txt" to read data: No such file or directory

Analysis: the cause is that seq, as invoked by the script on this system, prints a 7-digit number in exponential notation (only six significant digits are kept). The generated file name therefore does not match any real file, and every command that follows fails.
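The effect can be reproduced without seq. The sketch below assumes the number is being formatted like C's "%g" conversion, which keeps six significant digits; format_g is a hypothetical helper name used only for illustration, and awk stands in for whatever formatting seq does internally. The transcript above spells the exponent slightly differently, but the mantissa 2.90864 is the same.

```shell
#!/bin/sh
# format_g NUMBER: print NUMBER using the "%g" conversion (6 significant digits).
# Hypothetical helper for illustration; not part of the original script.
format_g() {
    awk -v n="$1" 'BEGIN { printf "%g\n", n }'
}

format_g 908636     # six digits still fit, so the number is printed as-is
format_g 2908637    # seven digits do not fit, so exponential notation is used
```

Note that four different serial numbers (2908637 to 2908640) would all collapse to the same rounded value, which is why the loop keeps hitting one non-existent file name.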

Solution: the goal is to have the 7-digit serial number printed as a plain integer. In my tests, the -f, -w and other seq options did not achieve this on their own, so a compromise was adopted: the highest digit is written as a literal character in the script, and only the last six digits are generated by seq. The -w option is used to keep the generated numbers at the same width (padded with leading zeros). The script is then called with the last six digits of the serial numbers. The repaired script is as follows:
#!/bin/sh
# str1=""

# filecount=`ls -l /root/gjj | wc -l | awk '{print $1}'`
# echo $filecount

# $1 and $2 are the last six digits of the first and last serial numbers;
# -w pads the numbers with leading zeros to equal width
for n in `seq -w $1 $2`
do
    # Prepend the highest digit as a literal character
    n="2${n}"
    filename="/windows_gjj/${n}.txt"
    echo "$filename"

    # Convert line endings, then trim the file down to the lines of interest
    dos2unix "$filename"
    sed -i '1,76d' "$filename"
    sed -i '41,$d' "$filename"
    # Strip markup, leading whitespace and empty lines
    sed -i 's/<.*">//g' "$filename"
    sed -i 's/<.*>//g' "$filename"
    sed -i 's/^[[:space:]]*//g' "$filename"
    sed -i '/^$/d' "$filename"
    # sed -i 's/;//g' "$filename"

    # cat "$filename" >> /tmp/all_gjj.log

    # Number of lines left in the file
    flag=`grep "" "$filename" | wc -l | awk '{print $1}'`

    if [ "$flag" -ne 10 ]; then
        # Append (>>) so that earlier files are not overwritten
        cat "$filename" >> /tmp/all_gjj.log
        echo "********************************************************************************************" >> /tmp/all_gjj.log
        lcount=`wc -l "$filename" | awk '{print $1}'`
        str1=""
        for i in `seq 1 10`
        do
            # Drop one line, keep the next one, then drop it as well
            sed -i '1d' "$filename"
            str=`head -n 1 "$filename"`
            echo "$str" >> /tmp.log
            str1="${str1}${str}|"
            echo "$str1"
            sed -i '1d' "$filename"
        done

        echo "$str1" >> /root/gjj.txt
    fi

done

[root@ALL ~]# sh tiqu.sh 908636 908640
/windows_gjj/2908636.txt
dos2unix: converting file /windows_gjj/2908636.txt to UNIX format ...
|
/windows_gjj/2908637.txt
dos2unix: converting file /windows_gjj/2908637.txt to UNIX format ...
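The prefix trick can be isolated into a small sketch. It assumes a seq that supports -w (GNU and BSD seq both do); list_names is a hypothetical helper name, and the prefix is passed as an argument here instead of being hard-coded as "2".

```shell
#!/bin/sh
# list_names PREFIX FIRST LAST: print one file name per serial number.
# PREFIX is the literal leading digit; FIRST and LAST are the remaining digits.
# Hypothetical helper for illustration; /windows_gjj is the directory used in the article.
list_names() {
    prefix=$1
    # -w pads every number with leading zeros to the width of the widest one
    for n in $(seq -w "$2" "$3"); do
        echo "/windows_gjj/${prefix}${n}.txt"
    done
}

list_names 2 908636 908638
```

The padding from -w only matters when the range crosses a width boundary: with 99999 as the first number and 100001 as the last, seq -w emits 099999 first, so the prefixed name still has seven digits.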

With this change, the script achieves the expected result.
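An alternative not used in the article, shown here only as a sketch, is to avoid seq for the outer loop and count with the shell's own integer arithmetic, which never switches to exponential notation. each_serial is a hypothetical helper name; the full 7-digit serial numbers can then be passed directly.

```shell
#!/bin/sh
# each_serial FIRST LAST: print every integer from FIRST to LAST, one per line,
# using POSIX shell arithmetic instead of seq.
# Hypothetical helper for illustration; not part of the original script.
each_serial() {
    n=$1
    while [ "$n" -le "$2" ]; do
        echo "$n"
        n=$((n + 1))
    done
}

for n in $(each_serial 2908637 2908640); do
    echo "/windows_gjj/${n}.txt"
done
```

For a very large range, a plain while loop around the processing commands would avoid building the whole list in memory first.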