Linux Shell: Multi-process Concurrency and Controlling the Number of Concurrent Processes


1. Basic Knowledge Preparation

1.1. Linux Background Processes

UNIX is a multitasking system that allows multiple users to run several programs at the same time. The shell metacharacter & provides a way to run a program in the background when it does not require keyboard input. When a command is followed by &, it is sent to the Linux background, and the terminal can immediately accept the next command.
For example:

sh a.sh &
sh b.sh &
sh c.sh &

These three commands are sent to the Linux background at essentially the same time, so the three scripts execute concurrently.
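As a minimal sketch of this pattern (the sleep commands are hypothetical stand-ins for real scripts like a.sh), wait can be combined with & so that the parent shell resumes only after all background jobs have finished:

```shell
start=$SECONDS

# Stand-ins for a.sh / b.sh / c.sh: three 1-second tasks,
# all sent to the Linux background with &.
sleep 1 &
sleep 1 &
sleep 1 &

# wait blocks until every background child has exited.
wait
elapsed=$((SECONDS - start))

# Because the three sleeps ran concurrently, this takes ~1s, not ~3s.
echo "finished in about ${elapsed}s"
```

Without the trailing &, the same three commands would run one after another and take about three seconds in total.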

1.2. Linux File Descriptors

A file descriptor (abbreviated FD) is formally a non-negative integer. In practice it is an index into a per-process table, maintained by the kernel, that records the files the process has opened. When a program opens an existing file or creates a new one, the kernel returns a file descriptor to the process. Every UNIX process starts with three standard file descriptors corresponding to three streams:

File descriptor   Name
0                 Standard input
1                 Standard output
2                 Standard error

Each file descriptor corresponds to an open file; different file descriptors can refer to the same open file, and the same file can be opened by different processes, or opened multiple times by the same process.

/proc/PID/fd lists the file descriptors owned by the process with that PID. For example:

#!/bin/bash
source /etc/profile

# $$ is the PID of the current process
PID=$$

# view the file descriptors of the current process
ls -l /proc/$PID/fd
echo "-------------------"

# bind file descriptor 1 to the file tempfd1
( [ -e ./tempfd1 ] || touch ./tempfd1 ) && exec 1<> ./tempfd1

# view the file descriptors again (this output now goes into tempfd1)
ls -l /proc/$PID/fd
echo "-------------------"

A sample run:

[ouyangyewei@localhost learn_linux]$ sh learn_redirect.sh
total 0
lrwx------. 1 ouyangyewei ouyangyewei ... 0 -> /dev/pts/0
lrwx------. 1 ouyangyewei ouyangyewei ... 1 -> /dev/pts/0
lrwx------. 1 ouyangyewei ouyangyewei ... 2 -> /dev/pts/0
lr-x------. 1 ouyangyewei ouyangyewei ... 255 -> /home/ouyangyewei/workspace/learn_linux/learn_redirect.sh
-------------------
[ouyangyewei@localhost learn_linux]$ cat tempfd1
total 0
lrwx------. 1 ouyangyewei ouyangyewei ... 0 -> /dev/pts/0
lrwx------. 1 ouyangyewei ouyangyewei ... 1 -> /home/ouyangyewei/workspace/learn_linux/tempfd1
lrwx------. 1 ouyangyewei ouyangyewei ... 2 -> /dev/pts/0
lr-x------. 1 ouyangyewei ouyangyewei ... 255 -> /home/ouyangyewei/workspace/learn_linux/learn_redirect.sh
-------------------

In the example above, the statement exec 1<>./tempfd1 binds file descriptor 1 to the file tempfd1. From then on, file descriptor 1 points to tempfd1, so the standard output of every subsequent command is redirected into tempfd1; that is why the second listing of the process's file descriptors appears in the file rather than on the terminal.
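A minimal, self-contained variation of the same technique (the file name and descriptor number are arbitrary choices for this sketch): instead of rebinding descriptor 1, it opens a scratch file on descriptor 3 with exec, so the terminal keeps its normal standard output:

```shell
tmpfile=$(mktemp)        # scratch file; the name is arbitrary

exec 3<> "$tmpfile"      # bind fd 3 to the file for both reading and writing
echo "hello fd 3" >&3    # write through the descriptor, not the file name
exec 3>&-                # close fd 3 again

content=$(cat "$tmpfile")
echo "$content"
rm -f "$tmpfile"
```

While fd 3 is open, ls -l /proc/$$/fd would show an extra entry pointing at the scratch file, just like descriptor 1 pointed at tempfd1 above.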

1.3. Linux Pipelines

In Unix and Unix-like operating systems, a pipeline is a chain of processes connected through their standard input and output, so that the output of each process feeds directly into the input of the next.

Linux pipes come in two kinds:

    • Anonymous pipes
    • Named pipes

Pipes have an important property: if there is no data in the pipe, a read on the pipe blocks until data is written into it; likewise, a write on a pipe with no reader blocks until a reader appears.

1.3.1. Anonymous Pipelines

On the command line of Unix and Unix-like systems, the ASCII vertical bar | is the anonymous-pipe operator. An anonymous pipe is backed by two ordinary, anonymous, open file descriptors: a read-only end and a write-only end, which is what allows processes to be connected through it.

For example:

cat file | less

To execute the command above, the shell creates two processes that run cat and less respectively, and connects them with a pipe.

It is worth noting how the two processes attach to the pipe: the writing process cat connects its standard output (file descriptor 1) to the write end of the pipe, while the reading process less connects its standard input (file descriptor 0) to the read end. Neither process is aware of the pipe's existence; each simply reads from or writes to its standard file descriptors. The shell does all of the wiring.
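The same wiring applies to any pair of commands, neither of which knows about the pipe. A trivial check, with printf and wc standing in for cat and less:

```shell
# printf writes only to its stdout (fd 1); wc reads only its stdin (fd 0).
# The shell connects fd 1 of the left command to fd 0 of the right one.
lines=$(printf 'one\ntwo\nthree\n' | wc -l)
echo "$lines"
```

Neither printf nor wc takes any pipe-related argument; replacing either command with another would leave the wiring unchanged.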

1.3.2. Named Pipes (FIFO, First In First Out)

Named pipes are also called FIFOs. Semantically, a FIFO behaves much like an anonymous pipe, but several differences are worth noting:

    • In the file system, a FIFO has a name and exists as a special device file;
    • Any processes can share data through a FIFO;
    • Data flow through a FIFO blocks unless the FIFO has both a reading and a writing process;
    • Anonymous pipes are created automatically by the shell and exist only in the kernel, whereas a FIFO is created explicitly by a program (for example with the mkfifo command) and exists in the file system;
    • An anonymous pipe end is a one-way byte stream, while a FIFO special file can be opened for reading, for writing, or for both;
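A small sketch of these properties (the paths are generated with mktemp and are arbitrary): a background reader and a foreground writer attach to the same FIFO, and neither proceeds until the other end is present:

```shell
fifo=$(mktemp -u)          # a fresh, unused path for the FIFO
mkfifo "$fifo"             # the FIFO now exists as a file-system entry

# Opening a FIFO for reading blocks until a writer appears (and vice
# versa), so start the reader in the background first.
cat "$fifo" > "$fifo.out" &
echo "via fifo" > "$fifo"  # unblocks the reader; EOF when the writer closes
wait                       # wait for the background cat to finish

received=$(cat "$fifo.out")
echo "$received"
rm -f "$fifo" "$fifo.out"
```

Running the echo line first, with no reader attached, would simply hang, which is the blocking behavior described in the third bullet above.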

For example, a FIFO can be used to implement a single-server, multi-client application.
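A compressed sketch of that idea (all names here are invented for illustration): a "server" holds one read-write descriptor on a request FIFO, so it never sees a spurious end-of-file between clients, and each "client" request is just a redirection into the FIFO:

```shell
fifo=$(mktemp -u)
mkfifo "$fifo"
exec 4<> "$fifo"      # server side: fd 4 reads and writes the FIFO
rm -f "$fifo"         # the open descriptor keeps the pipe alive

# Three "client" requests (written sequentially here for simplicity):
echo "request A" >&4
echo "request B" >&4
echo "request C" >&4

# The server consumes the requests in first-in, first-out order:
read -u4 first
read -u4 second
read -u4 third
exec 4>&-             # close the descriptor when done
echo "$first / $second / $third"
```

Holding the FIFO open read-write on one descriptor is the same trick the concurrency-control script in section 2 relies on.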

With this background in place, we can now discuss how to control the number of concurrent processes when running many tasks in the Linux background.

2. Multi-process concurrency control of Linux

Xiao A recently needs to produce the KPI data report for the whole of 2015. He has already written a production script, but it can only produce the KPI data for one specified day. Assume a single run of the script takes 5 minutes. Then:
* If the runs are executed sequentially in a loop, the total time is 5 * 365 = 1825 minutes, roughly 30 hours
* If all 365 tasks are thrown into the Linux background at once, the system cannot withstand the load

Since all 365 tasks cannot be put into the Linux background at once, can we automatically keep exactly n tasks running in the background at any one time? Of course we can.

#!/bin/bash
source /etc/profile

# -----------------------------
tempfifo=$$.fifo        # $$ is the PID of the current script
begin_date=$1           # start date
end_date=$2             # end date

if [ $# -eq 2 ]; then
    if [ "$begin_date" \> "$end_date" ]; then
        echo "error! $begin_date is greater than $end_date"
        exit 1;
    fi
else
    echo "error! not enough params."
    echo "sample: sh loop_kpi 2015-12-01 2015-12-07"
    exit 2;
fi

# -----------------------------
trap "exec 1000>&-;exec 1000<&-;exit 0" 2
mkfifo $tempfifo
exec 1000<>$tempfifo
rm -rf $tempfifo

for ((i=1; i<=8; i++))
do
    echo >&1000
done

while [ "$begin_date" != "$end_date" ]
do
    read -u1000
    {
        echo $begin_date
        hive -f kpi_report.sql --hivevar date=$begin_date
        echo >&1000
    } &
    begin_date=$(date -d "+1 day $begin_date" +"%Y-%m-%d")
done

wait
echo "done!!!!!!!!!!"
  • The parameter block at the top of the script: it is invoked as, for example, sh loop_kpi_report.sh 2015-01-01 2015-12-01
    • $1 is the first argument to the script, here 2015-01-01
    • $2 is the second argument, here 2015-12-01
    • $# is the number of arguments, here 2
    • The test [ "$begin_date" \> "$end_date" ] compares the two dates; inside [ ] the > operator must be escaped as \>
  • The trap line: if the script receives an interrupt (Ctrl+C, signal 2) while running, it closes both directions of file descriptor 1000 and exits cleanly
    • exec 1000>&- closes the write side of file descriptor 1000
    • exec 1000<&- closes the read side of file descriptor 1000
    • trap installs this handler for the interrupt signal
  • The mkfifo / exec / rm sequence:
    • mkfifo $tempfifo creates the named pipe
    • exec 1000<>$tempfifo binds file descriptor 1000 to the FIFO; < would bind it for reading only, > for writing only, and <> binds it for both, so every operation on descriptor 1000 becomes an operation on the pipe file $tempfifo
    • rm -rf $tempfifo then deletes the pipe file. One may ask: why not operate on the pipe file directly instead of going through a descriptor? This is not superfluous: an important property of a pipe is that a read and a write must both be present; if one side is missing, the other side blocks. Binding a single read-write descriptor to the FIFO supplies both sides at once, and the open descriptor keeps the pipe usable after the file name is removed
  • The for loop writes 8 blank lines into file descriptor 1000. This 8 is the number of tasks we allow to run concurrently in the background. Why blank lines rather than any other content? Because reads from the pipe are line-oriented, so each line serves as one token
  • The while loop over the dates:
    • read -u1000 reads one line from descriptor 1000, i.e. consumes one blank line from the pipe; every read leaves one fewer token in the pipe
    • The { ... } & group runs the hive job for one date in the Linux background
    • When a background task finishes, its final echo >&1000 writes one blank line back to descriptor 1000. This is the key: each read -u1000 removes a token, so once 8 tasks are running in the background the pipe holds no readable blank lines, and the next read -u1000 blocks until a running task returns its token
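Putting these pieces together, here is a stripped-down, runnable variant of the script, with sleep standing in for the hive job and a counter file instead of KPI reports (the file names, the task count of 20, and the limit of 8 are arbitrary choices for this sketch):

```shell
#!/bin/bash
tempfifo=$(mktemp -u).fifo
mkfifo "$tempfifo"
exec 1000<> "$tempfifo"    # bind fd 1000 to the FIFO, read-write
rm -f "$tempfifo"          # the open descriptor keeps the pipe usable

# Seed 8 tokens: at most 8 tasks may run concurrently.
for ((i = 1; i <= 8; i++)); do
    echo >&1000
done

outfile=$(mktemp)
for ((task = 1; task <= 20; task++)); do
    read -u1000            # take a token; blocks while 8 tasks are running
    {
        sleep 0.1                    # stand-in for the real hive job
        echo "task $task" >> "$outfile"
        echo >&1000                  # give the token back
    } &
done

wait                       # let the remaining background tasks finish
completed=$(wc -l < "$outfile")
echo "completed $completed tasks"
```

All 20 tasks complete, but never more than 8 at the same time; raising or lowering the seed count in the first loop changes the degree of concurrency without touching anything else.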
