Background:
At work, two remote machine rooms need to transfer data between them. The files are named consistently: they sit in one directory and share a common prefix followed by a number, such as /path/from/file.{1..100}. The dedicated line between the rooms caps the speed of a single scp process, for example at 100mb/s at most, and if you start 100 scp processes at once you run into SSH's concurrent connection limit.
So the number of concurrent transfers has to be controlled: it must stay below SSH's connection limit while still nearly saturating a single network card, so that the transfer finishes as soon as possible (assuming the bandwidth of the line is much larger than that of a single network card).
Implementation
I already knew that a named pipe created with mkfifo can be used to control concurrency. Now it is time to actually implement it.
If you are not familiar with mkfifo, you can refer to this link first; its author explains it in great detail, so I will not reinvent the wheel.
Here is the code with some explanation. Given the per-process bandwidth cap described above, consider running 9 transfers concurrently. The code is as follows:
 1 #!/bin/bash
 2
 3 your_func()
 4 {   # Use your cmd or func instead of sleep here. Don't end with background (&)
 5     date +%s
 6     echo "scp hostname:/home/user/path/from/file.$1 remote_host:/home/user/path/to/"
 7     sleep 2
 8 }
 9
10 concurrent()
11 {   # from $1 to $2 (including $1 and $2 themselves), run $3 cmds concurrently
12     start=$1 && end=$2 && cur_num=$3
13
14     # ff_file, which is opened by fd 4, will really be removed after the script stops
15     mkfifo ./fifo.$$ && exec 4<> ./fifo.$$ && rm -f ./fifo.$$
16
17     # initial fifo: write $cur_num lines to $ff_file
18     for ((i = $start; i < $cur_num + $start; i++)); do
19         echo "init time add $i" >&4
20     done
21
22     for ((i = $start; i <= $end; i++)); do
23         read -u 4   # read from the mkfifo file
24         {   # REPLY is the variable set by read
25             echo -e "-- current loop: [cmd id: $i ; fifo id: $REPLY ]"
26
27             your_func $i
28             echo "real time add $(($i + $cur_num))" 1>&4 # write to $ff_file
29         } &   # & sends each process in {} to the background
30     done
31     wait    # wait until all concurrent cmds in {} have finished
32 }
33
34 concurrent 0 8 3
The call above uses a concurrency of 3 and runs the command 9 times, for IDs 0 through 8, which makes the result easy to follow. The output is as follows:
$ bash concurrent.sh
-- current loop: [cmd id: 0 ; fifo id: init time add 0 ]
-- current loop: [cmd id: 1 ; fifo id: init time add 1 ]
-- current loop: [cmd id: 2 ; fifo id: init time add 2 ]
1453518400
1453518400
scp hostname:/home/user/path/from/file.0 remote_host:/home/user/path/to/
scp hostname:/home/user/path/from/file.2 remote_host:/home/user/path/to/
1453518400
scp hostname:/home/user/path/from/file.1 remote_host:/home/user/path/to/
-- current loop: [cmd id: 3 ; fifo id: real time add 3 ]
-- current loop: [cmd id: 4 ; fifo id: real time add 5 ]
-- current loop: [cmd id: 5 ; fifo id: real time add 4 ]
1453518402
scp hostname:/home/user/path/from/file.3 remote_host:/home/user/path/to/
1453518402
1453518402
scp hostname:/home/user/path/from/file.5 remote_host:/home/user/path/to/
scp hostname:/home/user/path/from/file.4 remote_host:/home/user/path/to/
-- current loop: [cmd id: 6 ; fifo id: real time add 6 ]
-- current loop: [cmd id: 7 ; fifo id: real time add 7 ]
-- current loop: [cmd id: 8 ; fifo id: real time add 8 ]
1453518404
scp hostname:/home/user/path/from/file.6 remote_host:/home/user/path/to/
1453518404
1453518404
scp hostname:/home/user/path/from/file.7 remote_host:/home/user/path/to/
scp hostname:/home/user/path/from/file.8 remote_host:/home/user/path/to/
From the date output you can see that three commands run concurrently in each 2-second interval.
Overall process
Let N be the concurrency level. The script first writes N lines into the FIFO (they may even be empty). Each line read from the FIFO then starts one call to your_func, so after N reads the FIFO is empty and the next read blocks. At that point exactly N calls are running concurrently (calls 1 to N).
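The blocking behaviour described above can be checked in isolation. The following is only a minimal sketch, not part of the original script: tmpdir, n, got and blocked are names made up for illustration, and read is given a one-second timeout (-t 1) so that the demonstration ends instead of blocking forever on the empty FIFO.

```shell
#!/bin/bash
# Sketch: a FIFO preloaded with N tokens allows N reads, then blocks.
# tmpdir, n, got and blocked are illustrative names, not from the article's script.
tmpdir=$(mktemp -d)
mkfifo "$tmpdir/fifo" && exec 4<>"$tmpdir/fifo" && rm -rf "$tmpdir"

n=3
for ((i = 0; i < n; i++)); do
    echo "token $i" >&4                  # preload N tokens
done

got=0
for ((i = 0; i < n; i++)); do
    read -u 4 -t 1 && got=$((got + 1))   # each successful read consumes one token
done

# The FIFO is now empty. A plain read would block here;
# -t 1 makes read give up after one second so the sketch can finish.
if read -u 4 -t 1; then blocked=no; else blocked=yes; fi

exec 4>&-                                # close the descriptor
echo "tokens read: $got, next read blocked: $blocked"
```

In the real script the read has no timeout, which is exactly what makes the main loop pause until a running job hands its token back.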
When any one of the concurrently running your_func calls finishes, its block goes on to execute the next statement:
echo "real time add $(($i + $cur_num))" 1>&4
This writes a new line into the FIFO, so the blocked read for call N+1 succeeds and that call starts executing inside the { } block. In this way the blocking read on the FIFO provides the concurrency control.
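To convince yourself that the token hand-back really caps the number of running jobs, something like the following can be used. It is a rough sketch under my own assumptions: work, limit, jobs_total and peak are hypothetical names, sleep 0.2 stands in for the real command, and each job samples how many marker files exist while it is running.

```shell
#!/bin/bash
# Sketch: check that no more than $limit jobs ever run at the same time.
# work, limit, jobs_total and peak are illustrative names.
work=$(mktemp -d)
mkfifo "$work/fifo" && exec 4<>"$work/fifo" && rm -f "$work/fifo"
mkdir "$work/running"

limit=3
jobs_total=9
for ((i = 0; i < limit; i++)); do
    echo >&4                                          # preload tokens (empty lines are enough)
done

for ((i = 0; i < jobs_total; i++)); do
    read -u 4                                         # take a token, or block until one comes back
    {
        touch "$work/running/$i"                      # mark this job as running
        ls "$work/running" | wc -l >> "$work/counts"  # sample the number of running jobs
        sleep 0.2                                     # stand-in for the real command
        rm -f "$work/running/$i"                      # job is done
        echo >&4                                      # hand the token back
    } &
done
wait

peak=$(sort -n "$work/counts" | tail -n 1 | tr -d ' ')
exec 4>&-
rm -rf "$work"
echo "jobs: $jobs_total, limit: $limit, peak concurrency: $peak"
```

Because a job removes its marker before returning its token, the sampled peak should never exceed the limit.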
Note that when the concurrency is high, the processes are subject to scheduling order even if each one just sleeps for the same number of seconds. They therefore do not necessarily finish in the order in which they were started; a process that started later may finish first.
As a result, the two numbers in an output line like the one below are not necessarily equal. The higher the concurrency, the larger the difference can be.
-- current loop: [cmd id: 8 ; fifo id: real time add 9 ]
Custom functions
Now modify the custom function your_func. The function really needs only one line to do the job.
your_func()
{   # Use your cmd or func instead of sleep here. Don't end with background (&)
    date +%s
    scp hostname:/home/user/path/from/file.$1 remote_host:/home/user/path/to/
}
Note that the scp command must not have & appended to send it to the background, because the caller one level up already runs the whole block in the background.
Next, let's explain line 15 of the script, the mkfifo line inside the concurrent function.
exec digit<> filename
This form of the command is rarely seen in everyday use, especially the '<>' operator. Since I did not understand it, let's check the system help.
man bash    # search for 'exec ' (exec followed by a space)
Searching man bash for exec followed by a space finds the description of the exec builtin. Note that running man exec directly lands in the Linux Programmer's Manual instead, on the page "execl, execlp, execle, execv, execvp, execvpe - execute a file", which describes that family of library functions. The redirection section of man bash describes [n]<>word as opening the file named by word for both reading and writing on file descriptor n.
Also note that no space is allowed inside 4<>; the digit and the operator must be written together. Spaces may be added before the word (the file name).
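The effect of the '<>' redirection is easy to try on its own. Below is a small sketch with made-up names (pipe_dir, msg). It relies on the Linux behaviour that opening a FIFO for reading and writing returns immediately; as far as I know POSIX leaves that case undefined, so treat it as an assumption on other systems.

```shell
#!/bin/bash
# Sketch: one descriptor opened for both reading and writing with "<>".
# pipe_dir and msg are illustrative names.
pipe_dir=$(mktemp -d)
mkfifo "$pipe_dir/fifo"

# "4<>" is written without a space after the digit; a space before the word is fine.
exec 4<> "$pipe_dir/fifo"

echo "hello through fd 4" >&4     # write through descriptor 4
read -u 4 msg                     # read it back through the same descriptor
echo "read back: $msg"

exec 4>&-                         # close descriptor 4
rm -rf "$pipe_dir"
```

With a space after the digit (exec 4 <> file), bash would instead treat 4 as the command to execute, which is not what is wanted here.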
rm file
mkfifo first creates the pipe file, and exec then binds it to file descriptor 4. You may wonder what the rm that follows is for. Once the file is bound to the descriptor, the kernel has already opened it through the open system call. Running rm at this point removes the file's directory entry, but the script still holds the open descriptor and can keep using the pipe.
If you have run into the following situation, you will understand at once. When nginx logs are not rotated, access.log keeps growing. If you then simply rm access.log, the file disappears, yet df shows that no disk space has been freed. That is because rm removed only the name while the file is still open: the fd in the file descriptor table of the nginx process control block holds a file pointer to the entry in the kernel's file table, which points to the in-memory v-node table entry, which in turn points to the file's actual storage blocks. So nginx can keep writing its log, and the disk keeps being written to. Only after a restart or reload, when the process rereads its configuration and reopens the file, does it find that the file no longer exists and create a new one. Only then is the old inode released, and df shows that the available space has increased.
If this is unclear, refer to Figure 3.1 in APUE and think the explanation through.
Therefore the rm in the mkfifo line does not affect the rest of the script. When the script ends, the system reclaims all of its file descriptors.
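The nginx situation can be reproduced in miniature with a regular file. This is only an illustrative sketch: dir, log, exists and write_ok are names I made up, and descriptor 5 plays the role of the log file that the daemon keeps open.

```shell
#!/bin/bash
# Sketch: a file removed with rm stays writable through an already open descriptor.
# dir, log, exists and write_ok are illustrative names.
dir=$(mktemp -d)
log="$dir/access.log"

exec 5>"$log"                 # open the file for writing, like a daemon holding its log
echo "first line" >&5

rm -f "$log"                  # the name is gone from the directory
if [ -e "$log" ]; then exists=yes; else exists=no; fi

# Writing through the still-open descriptor keeps working.
if echo "second line" >&5; then write_ok=yes; else write_ok=no; fi

exec 5>&-                     # closing the last descriptor lets the kernel free the storage
rmdir "$dir"
echo "path exists: $exists, write after rm succeeded: $write_ok"
```

The script in this article uses the same property: the FIFO's name is cleaned up immediately, while descriptor 4 stays usable until the script exits.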
Initialization
Lines 18-20 initialize the pipe. There are two ways to write to the pipe:
# style 1
for ((i = $start; i < $cur_num + $start; i++)); do
    echo "init time add $i" >&4
done

# style 2
for ((i = $start; i < $cur_num + $start; i++)); do
    echo "init time add $i"
done >&4
The difference is whether '>&4' follows the echo statement or done. Both work: the former redirects only the echo statement, the latter redirects the entire for loop.
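To see that both placements put the same kind of lines into the pipe, here is a small self-contained sketch. The names d and lines are hypothetical, and the reads use a one-second timeout only so the sketch cannot hang.

```shell
#!/bin/bash
# Sketch: the same redirection attached to one command versus a whole loop.
# d and lines are illustrative names.
d=$(mktemp -d)
mkfifo "$d/fifo" && exec 4<>"$d/fifo" && rm -rf "$d"

for ((i = 0; i < 2; i++)); do
    echo "style1 $i" >&4       # redirection is set up again for every echo
done

for ((i = 0; i < 2; i++)); do
    echo "style2 $i"
done >&4                       # redirection is set up once for the whole loop

lines=()
for ((i = 0; i < 4; i++)); do
    read -u 4 -t 1 && lines+=("$REPLY")   # collect what arrived in the pipe
done
exec 4>&-
printf '%s\n' "${lines[@]}"
```

Either way four lines end up in the FIFO, in the order they were written.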
Similarly, in the next for loop the read command can be written in two ways:
# style 1
for ((i = $start; i <= $end; i++)); do
    read -u 4
    {
        your_func $i
        echo "real time add $(($i + $cur_num))" 1>&4 # write to $ff_file
    } &
done

# style 2
for ((i = $start; i <= $end; i++)); do
    read
    {
        your_func $i
        echo "real time add $(($i + $cur_num))" 1>&4 # write to $ff_file
    } &
done <&4
About REPLY
Now a word about the REPLY variable. It holds what the read command read from the FIFO in the loop above. The script as a whole does not actually depend on it, but it is worth explaining in passing.
Because the FIFO can be both written and read, the echo produces output like this:
-- current loop: [cmd id: 7 ; fifo id: real time add 7 ]
How do you find out about REPLY? Time to consult man again, looking for the arguments of read. man read turns out to be the wrong page; a closer look shows that read is a bash builtin.
man bash
# search 'Shell Variables'

REPLY  Set to the line of input read by the read builtin command when no arguments are supplied.
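A quick way to see REPLY in action without any FIFO is to feed read from a here-string. This is just a sketch; the variable name token is made up for the comparison case.

```shell
#!/bin/bash
# Sketch: read with no variable name stores the whole line in REPLY.
read <<< "real time add 7"
echo "REPLY is: [$REPLY]"

# With a name given, the line goes to that variable instead.
# token is an illustrative name.
read token <<< "init time add 0"
echo "token is: [$token]"
```

This is why the script can print $REPLY right after read -u 4 without ever assigning it.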
"Mkfifo" creates named pipes in the Shell to control concurrent execution of multiple processes