Shell multi-process concurrency and Concurrency Control

Source: Internet
Author: User

1. Basic Knowledge Preparation

1.1. Linux background processes

Unix is a multitasking system that allows multiple users to run multiple programs at the same time. The shell metacharacter & provides a way to run programs that do not need keyboard input in the background. Append & to a command and the command is sent to the Linux background for execution, while the terminal stays free to accept the next command.
For example:

sh a.sh &
sh b.sh &
sh c.sh &

These three commands are all sent to the Linux background and execute concurrently.
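The following is a minimal sketch of the same idea that can be run without a.sh, b.sh and c.sh. It assumes sleep as a stand-in for the real scripts; the variable names are illustrative only. It also shows $! (the PID of the most recent background job) and wait, which the later sections rely on.

```shell
#!/bin/bash
# Three stand-in tasks; "sleep 1" plays the role of a.sh, b.sh and c.sh.
start=$(date +%s)

sleep 1 &        # first job, sent to the background
pid_a=$!         # $! holds the PID of the most recent background job
sleep 1 &
pid_b=$!
sleep 1 &
pid_c=$!

# wait blocks until the listed background jobs have exited.
wait "$pid_a" "$pid_b" "$pid_c"

elapsed=$(( $(date +%s) - start ))
echo "three 1-second jobs finished in about ${elapsed}s"
```

Because the three jobs overlap, the elapsed time stays close to the runtime of one job rather than the sum of all three.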

1.2. Linux file descriptors

A file descriptor (fd) is a non-negative integer. It is an index into the table of open files that the kernel maintains for each process. When a program opens an existing file or creates a new one, the kernel returns a file descriptor to the process. Every Unix process starts with three standard file descriptors, which correspond to three different streams:

File descriptor    Name
0                  Standard input
1                  Standard output
2                  Standard error

Each file descriptor corresponds to one open file, but different file descriptors can refer to the same open file. The same file can be opened by different processes, and it can also be opened several times by the same process.
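As a small illustration of several descriptors referring to one file, the sketch below opens the same scratch file on two descriptors. The descriptor numbers 3 and 4 and the name demo_file are arbitrary choices for this example, not anything prescribed by the article.

```shell
#!/bin/bash
demo_file=$(mktemp)     # hypothetical scratch file for the demonstration

exec 3>"$demo_file"     # open fd 3 for writing (truncates the file)
exec 4>>"$demo_file"    # open fd 4 on the same file, in append mode

echo "written through fd 3" >&3
echo "written through fd 4" >&4

exec 3>&- 4>&-          # close both descriptors

line_count=$(wc -l < "$demo_file")
cat "$demo_file"
rm -f "$demo_file"
```

Both lines end up in the same file, even though they were written through different descriptors.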

The directory /proc/PID/fd lists all file descriptors of the process PID, for example:

#!/bin/bash
source /etc/profile;

# $$ is the PID of the current process
PID=$$

# view the file descriptors of the current process
# (ls -l is used instead of the ll alias so the script does not depend on alias settings)
ls -l /proc/$PID/fd
echo "-------------------"; echo

# bind file descriptor 1 to the file tempfd1, creating the file first if needed
([ -e ./tempfd1 ] || touch ./tempfd1) && exec 1<>./tempfd1

# view the file descriptors of the current process again
ls -l /proc/$PID/fd
echo "-------------------"; echo;

[ouyangyewei@localhost learn_linux]$ sh learn_redirect.sh
total 0
lrwx------. 1 ouyangyewei ouyangyewei 64 Jan  4 22:17 0 -> /dev/pts/0
lrwx------. 1 ouyangyewei ouyangyewei 64 Jan  4 22:17 1 -> /dev/pts/0
lrwx------. 1 ouyangyewei ouyangyewei 64 Jan  4 22:17 2 -> /dev/pts/0
lr-x------. 1 ouyangyewei ouyangyewei 64 Jan  4 22:17 255 -> /home/ouyangyewei/workspace/learn_linux/learn_redirect.sh
-------------------

[ouyangyewei@localhost learn_linux]$ cat tempfd1
total 0
lrwx------. 1 ouyangyewei ouyangyewei 64 Jan  4 22:17 0 -> /dev/pts/0
lrwx------. 1 ouyangyewei ouyangyewei 64 Jan  4 22:17 1 -> /home/ouyangyewei/workspace/learn_linux/tempfd1
lrwx------. 1 ouyangyewei ouyangyewei 64 Jan  4 22:17 2 -> /dev/pts/0
lr-x------. 1 ouyangyewei ouyangyewei 64 Jan  4 22:17 255 -> /home/ouyangyewei/workspace/learn_linux/learn_redirect.sh
-------------------

In the example above, the exec line binds file descriptor 1 to the file tempfd1. After the binding, file descriptor 1 points to tempfd1, so standard output is redirected to that file.
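A script often needs to undo such a redirection later. One common pattern, sketched here under the assumption that descriptor 3 is free, is to save the original standard output on a spare descriptor first and restore it afterwards. The names log_file and captured are illustrative.

```shell
#!/bin/bash
log_file=$(mktemp)      # hypothetical scratch file

exec 3>&1               # save the current stdout on fd 3
exec 1>"$log_file"      # from here on, stdout goes to the file
echo "this line lands in the file"
exec 1>&3 3>&-          # restore stdout and close the spare descriptor

echo "this line is back on the original stdout"
captured=$(cat "$log_file")
rm -f "$log_file"
```

Only the line written between the two exec redirections is stored in the file.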

1.3. Linux pipes

In Unix and Unix-like operating systems, a pipeline is a set of processes chained together by their standard input and output, so that the output of each process is fed directly to the next process as its input.

There are two types of Linux pipes:

  • Anonymous pipe
  • Named pipe

A pipe has one notable property: if the pipe holds no data, a read from it blocks until data has been written into the pipe. Similarly, a write to the pipe blocks if nothing reads the data out (for a FIFO, even opening it for writing waits until a reader has opened it).
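The blocking read can be observed with a short experiment. In this sketch the writer deliberately sleeps for two seconds before producing a line (an arbitrary delay chosen for the demonstration), and the reader simply waits on the empty pipe.

```shell
#!/bin/bash
start=$(date +%s)

# The writer sleeps before producing a line; the reader's "read" blocks
# on the empty pipe until that line arrives.
received=$( { sleep 2; echo "data"; } | { read -r line; echo "$line"; } )

waited=$(( $(date +%s) - start ))
echo "reader got '$received' after about ${waited}s"
```

The reader does not fail or return early while the pipe is empty; it resumes as soon as the writer delivers a line.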

1.3.1. Anonymous pipe

On a Unix or Unix-like command line, the ASCII vertical bar | is the anonymous pipe operator. The two ends of an anonymous pipe are two ordinary, anonymous, open file descriptors: one read-only end and one write-only end. Because the pipe has no name, other processes cannot connect to it.

For example:

cat file | less

To execute the command above, the shell creates two processes that run cat and less respectively. Both processes are connected to the pipe: the standard output of the writing process cat (file descriptor 1) is connected to the write end of the pipe, and the standard input of the reading process less (file descriptor 0) is connected to the read end. The two processes are in fact unaware of the pipe. They only read from and write to their standard file descriptors; the shell does the wiring.
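The wiring can be checked from inside a pipeline. The sketch below uses printf and sort in place of cat and less so that it runs non-interactively, and then asks bash whether descriptor 0 of the reading side is a pipe. The variable names are illustrative.

```shell
#!/bin/bash
# printf writes to its stdout (fd 1); sort reads from its stdin (fd 0).
# Neither program contains any pipe-specific code.
sorted=$(printf '%s\n' pear apple fig | sort)
echo "$sorted"

# Inside a pipeline, fd 0 of the reading process is a pipe rather than a terminal.
kind=$(echo | { if [ -p /dev/stdin ]; then echo pipe; else echo other; fi; })
echo "stdin of the reader is a: $kind"
```

The -p test is true when the file is a pipe, which is what the shell has attached to the reader's standard input.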

1.3.2. Named Pipe (FIFO, First In First Out)

Named pipes are also called FIFOs. Semantically, a FIFO is similar to an anonymous pipe, but note the following:

  • In the file system, a FIFO has a name and exists as a special file;
  • Any process can share data through a FIFO;
  • Unless both a reading process and a writing process are attached to the FIFO, the data flow through it is blocked;
  • Anonymous pipes are created automatically by the shell and exist only in the kernel, whereas a FIFO is created in the file system by a program (for example, the mkfifo command);
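  • Both kinds carry a unidirectional byte stream: data written at one end is read at the other. Unlike the ends of an anonymous pipe, however, a FIFO can be opened for both reading and writing by a single process on Linux, which the script in section 2 relies on.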

For example, a FIFO can be used to implement a single-server, multi-client application.
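A minimal sketch of that arrangement follows. The "server" is just cat collecting lines from the FIFO, and the three "clients" are background echo commands; the file names, the use of descriptor 3 and the message text are all assumptions made for this illustration.

```shell
#!/bin/bash
fifo_dir=$(mktemp -d)             # hypothetical scratch directory
fifo="$fifo_dir/requests.fifo"    # hypothetical FIFO name
mkfifo "$fifo"

# "Server": collect every line that arrives on the FIFO until all writers close it.
cat "$fifo" > "$fifo_dir/served.txt" &
server_pid=$!

# Keep one write end open in this script so the server does not see
# end-of-file between two clients.
exec 3>"$fifo"

# "Clients": independent processes that only need to know the FIFO's name.
client_pids=()
for i in 1 2 3; do
    echo "request from client $i" > "$fifo" &
    client_pids+=("$!")
done
wait "${client_pids[@]}"

exec 3>&-                         # last writer closes: the server reads end-of-file
wait "$server_pid"

served_count=$(wc -l < "$fifo_dir/served.txt")
sort "$fifo_dir/served.txt"
rm -rf "$fifo_dir"
```

The clients share nothing with the server except the path of the FIFO, which is the practical difference from an anonymous pipe.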

With this background in place, we can turn to the main question: how to control the number of concurrent processes when running many processes concurrently on Linux.

2. Linux multi-process concurrency control

Recently, Mr. A needed to generate the KPI data reports for the whole of 2015. He has the production script ready, but it can only produce the KPI data for one specified day per run. Assume that one run of the production script takes 5 minutes:

  • If the runs are executed one after another in a loop, the job takes 5 * 365 = 1825 minutes, about 30 hours;
  • If all runs are put in the Linux background at once, the system cannot cope with 365 simultaneous background tasks!
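The estimate can be reproduced, and extended to a limited number of concurrent runs, with shell arithmetic. The figure of 8 concurrent runs is an assumption for this sketch, and the parallel estimate is idealized: it assumes the machine runs 8 jobs at once without any of them slowing down.

```shell
#!/bin/bash
tasks=365          # one run per day of the year
minutes_each=5     # assumed runtime of one run, as in the text
slots=8            # assumed number of concurrent runs

serial_minutes=$(( tasks * minutes_each ))
# Integer ceiling division: number of "waves" of up to $slots runs.
waves=$(( (tasks + slots - 1) / slots ))
parallel_minutes=$(( waves * minutes_each ))

echo "serial:   $serial_minutes minutes (about $(( serial_minutes / 60 )) hours)"
echo "parallel: $parallel_minutes minutes (about $(( parallel_minutes / 60 )) hours)"
```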

Since 365 tasks cannot all be put in the Linux background at once, can N tasks at a time be put in the background automatically and run concurrently? Of course.

#!/bin/bash
source /etc/profile;

# -----------------------------

tempfifo=$$.fifo        # $$ is the PID of the current script
begin_date=$1           # start date
end_date=$2             # end date

if [ $# -eq 2 ]
then
    # string comparison; \> is escaped so it is not taken as a redirection
    if [ "$begin_date" \> "$end_date" ]
    then
        echo "Error! $begin_date is greater than $end_date"
        exit 1;
    fi
else
    echo "Error! Not enough params."
    echo "Sample: sh loop_kpi 2015-12-01 2015-7"
    exit 2;
fi

# -----------------------------

# on Ctrl+C (signal 2), close file descriptor 1000 and exit
trap "exec 1000>&-; exec 1000<&-; exit 0" 2
mkfifo "$tempfifo"
exec 1000<>"$tempfifo"
rm -rf "$tempfifo"

# put 8 empty lines (tokens) into the pipe: at most 8 concurrent tasks
for ((i = 1; i <= 8; i++))
do
    echo >&1000
done

while [ "$begin_date" != "$end_date" ]
do
    # take one token; blocks while 8 tasks are already running
    read -u1000
    {
        echo $begin_date
        hive -f kpi_report.sql --hivevar date=$begin_date
        # give the token back when the task is finished
        echo >&1000
    } &

    begin_date=$(date -d "+1 day $begin_date" +"%Y-%m-%d")
done

wait
echo "done!!!!!!!!!!"

The script works as follows:

  • The argument check at the top, taking sh loop_kpi_report.sh 2015-01-01 2015-12-01 as an example:
    • $1 is the first argument passed to the script, here 2015-01-01
    • $2 is the second argument passed to the script, here 2015-12-01
    • $# is the number of arguments passed to the script, here 2
    • The inner if compares the two input dates; \> is the escaped form of >, which the shell would otherwise treat as a redirection
  • The trap line: if the script is interrupted with Ctrl+C while it is running, file descriptor 1000 is closed and the script exits normally.
    • exec 1000>&- closes file descriptor 1000, written in the output form
    • exec 1000<&- closes file descriptor 1000, written in the input form
    • trap is the command that catches signals; 2 is the number of the interrupt signal
  • The mkfifo, exec and rm lines:
    • mkfifo creates the FIFO file
    • exec 1000<>$tempfifo binds file descriptor 1000 to the FIFO. < binds for reading, > binds for writing, and <> binds for both, so every operation on file descriptor 1000 is equivalent to an operation on $tempfifo
    • rm removes the FIFO's name from the file system; the descriptor that is already open keeps working
    • One may ask: why not use the FIFO file directly? That does not work. An important property of a pipe is that a reader and a writer must exist at the same time; if one of them is missing, the other blocks. Binding the file descriptor for both reading and writing solves this problem.
  • The for loop writes to file descriptor 1000: it writes eight empty lines. This 8 is the number of concurrent background processes we want to allow. Why write empty lines instead of other characters? Because read takes data from the pipe one line at a time.
  • The while loop:
    • read -u1000 reads one line from the pipe, that is, one empty line. Each read leaves one empty line fewer in the pipe.
    • The block between { and } is followed by &, which means the block is executed in the Linux background.
    • The last command in the block, echo >&1000, writes an empty line back to file descriptor 1000 after the background task has finished. This is the key: each read -u1000 removes one empty line from the pipe, so once eight tasks are in the background there are no empty lines left to read, and read -u1000 waits until a task finishes and puts a line back.
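The same token technique can be tried without hive. The sketch below is a scaled-down variant under stated assumptions: sleep stands in for the real task, descriptor 9 replaces 1000, and the limit of 3 jobs, the total of 7 jobs and all file names are arbitrary choices for the demonstration. Each job records how many jobs are running when it starts, so the limit can be verified afterwards.

```shell
#!/bin/bash
max_jobs=3                       # assumed concurrency limit for the demo
total_jobs=7                     # assumed number of tasks
work_dir=$(mktemp -d)            # hypothetical scratch directory
fifo="$work_dir/tokens.fifo"

mkfifo "$fifo"
exec 9<>"$fifo"                  # open read-write so the open itself does not block
rm -f "$fifo"                    # the open descriptor stays valid after unlinking

for ((i = 0; i < max_jobs; i++)); do
    echo >&9                     # one empty line per token
done

for ((job = 1; job <= total_jobs; job++)); do
    read -r -u 9                 # take a token; blocks while none are free
    {
        touch "$work_dir/running.$job"
        # Record how many jobs are running right now, including this one.
        ls "$work_dir" | grep -c '^running\.' > "$work_dir/seen.$job"
        sleep 1                  # stand-in for the real work
        rm -f "$work_dir/running.$job"
        echo >&9                 # hand the token back
    } &
done
wait                             # let the last jobs finish
exec 9>&-                        # close the token descriptor

peak=$(cat "$work_dir"/seen.* | sort -n | tail -n 1)
finished=$(ls "$work_dir" | grep -c '^seen\.')
rm -rf "$work_dir"
echo "ran $finished jobs, at most $peak at a time"
```

Since a job removes its marker before returning its token, the recorded peak cannot exceed max_jobs. Changing max_jobs is the only adjustment needed to allow more or fewer concurrent tasks.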
