1.2. Linux file descriptor

A file descriptor (fd) is, in form, a non-negative integer. It is an index into the table of open files that the kernel maintains for each process. When a program opens an existing file or creates a new one, the kernel returns a file descriptor to the process. Every Unix process has three standard file descriptors, which correspond to three different streams:
File descriptor    Name
0                  Standard input
1                  Standard output
2                  Standard error

Each file descriptor corresponds to one open file, but different file descriptors can refer to the same open file. The same file can be opened by different processes, and it can also be opened several times by the same process.
The directory /proc/PID/fd lists all file descriptors of the process whose ID is PID. For example:
    #!/bin/bash
    source /etc/profile;

    # $$ is the PID of the current process
    PID=$$

    # List the file descriptors of the current process
    ls -l /proc/$PID/fd
    echo "-------------------"; echo

    # Bind file descriptor 1 to the file tempfd1 (create the file if it is missing)
    ([ -e ./tempfd1 ] || touch ./tempfd1) && exec 1<>./tempfd1

    # List the file descriptors of the current process again
    ls -l /proc/$PID/fd
    echo "-------------------"; echo;
    [ouyangyewei@localhost learn_linux]$ sh learn_redirect.sh
    total 0
    lrwx------. 1 ouyangyewei ouyangyewei 64 Jan 4 22:17 0 -> /dev/pts/0
    lrwx------. 1 ouyangyewei ouyangyewei 64 Jan 4 22:17 1 -> /dev/pts/0
    lrwx------. 1 ouyangyewei ouyangyewei 64 Jan 4 22:17 2 -> /dev/pts/0
    lr-x------. 1 ouyangyewei ouyangyewei 64 Jan 4 22:17 255 -> /home/ouyangyewei/workspace/learn_linux/learn_redirect.sh
    -------------------

    [ouyangyewei@localhost learn_linux]$ cat tempfd1
    total 0
    lrwx------. 1 ouyangyewei ouyangyewei 64 Jan 4 22:17 0 -> /dev/pts/0
    lrwx------. 1 ouyangyewei ouyangyewei 64 Jan 4 22:17 1 -> /home/ouyangyewei/workspace/learn_linux/tempfd1
    lrwx------. 1 ouyangyewei ouyangyewei 64 Jan 4 22:17 2 -> /dev/pts/0
    lr-x------. 1 ouyangyewei ouyangyewei 64 Jan 4 22:17 255 -> /home/ouyangyewei/workspace/learn_linux/learn_redirect.sh
    -------------------
In the example above, the exec line binds file descriptor 1 to the file tempfd1. After the binding, file descriptor 1 points to tempfd1, so standard output is redirected to that file. This is why the second listing appears in tempfd1 and not on the terminal.
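The script above never points file descriptor 1 back at the terminal. As a minimal sketch of one way to do that, assuming bash, the example below saves standard output on a spare descriptor before rebinding it and restores it afterwards. The descriptor number 3 and the temporary file are choices made for this illustration, not part of the original script.

```shell
#!/bin/bash
# Hypothetical temporary file standing in for the article's tempfd1
tmpfile=$(mktemp)

exec 3>&1              # keep a copy of the current standard output on fd 3
exec 1>"$tmpfile"      # bind fd 1 to the file; output now goes to the file
echo "written while fd 1 points to the file"
exec 1>&3 3>&-         # point fd 1 back at the saved output, then close fd 3

captured=$(cat "$tmpfile")
echo "captured: $captured"
rm -f "$tmpfile"
```

Only the first echo lands in the file; the last one is printed wherever standard output pointed before the rebinding.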
1.3. Linux pipeline

In Unix and Unix-like operating systems, a pipeline is a set of processes chained together by their standard input and output, so that the output of each process is used directly as the input of the next.
There are two types of Linux pipes:
- Anonymous pipe
- Named pipe
A pipe has one notable property. If the pipe contains no data, an operation that reads from it blocks until data is written into the pipe, and only then returns the data. Similarly, a writer blocks when nothing reads from the pipe: opening a FIFO for writing waits until a reader opens it, and a write to a full pipe waits until data has been read.
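The blocking read can be observed with a small experiment. This is only a sketch, assuming bash and using the shell's | operator; the one-second delay and the message text are arbitrary choices for the illustration.

```shell
#!/bin/bash
start=$SECONDS
# The writer sleeps before producing data, so the reader's read has to wait.
result=$( { sleep 1; echo "late data"; } | { read -r line; echo "got: $line"; } )
elapsed=$((SECONDS - start))
echo "$result (read returned after about ${elapsed}s)"
```

The read on the right-hand side does not fail on the empty pipe; it waits until the left-hand side writes its line.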
1.3.1. Anonymous pipe

On Unix and Unix-like command lines, the ASCII vertical bar | is the anonymous pipe operator. The two ends of an anonymous pipe are two ordinary, anonymous, open file descriptors: one read-only end and one write-only end. Because the pipe has no name, other processes cannot connect to it.
For example:
cat file | less
To execute this command, the shell creates two processes, one running cat and one running less. Both processes are connected to the pipe: the standard output (fd 1) of the writing process, cat, is connected to the write end of the pipe, and the standard input (fd 0) of the reading process, less, is connected to the read end. The two processes themselves are not aware that the pipe exists. They simply write to and read from their standard file descriptors; the shell does all the work of connecting them.
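On Linux, the connection can be checked through /proc. In the sketch below, which assumes a Linux system with /proc mounted and the readlink command available, each side of a pipeline reports where its own standard descriptor points. The variable names are made up for the illustration.

```shell
#!/bin/bash
# Writer side: readlink's fd 1 is the write end of the pipe leading to cat.
writer_stdout=$(readlink /proc/self/fd/1 | cat)

# Reader side: readlink's fd 0 is the read end of the pipe coming from echo.
reader_stdin=$(echo | readlink /proc/self/fd/0)

echo "writer fd 1 -> $writer_stdout"
echo "reader fd 0 -> $reader_stdin"
```

Both values should have the form pipe:[inode], where the inode number varies from run to run.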
1.3.2. Named pipe (FIFO, First In First Out)

A named pipe is also called a FIFO. Semantically, a FIFO is similar to an anonymous pipe, with the following differences worth noting:
- In the file system, a FIFO has a name and exists as a special file;
- Any process can share data through a FIFO;
- Unless both a reading and a writing process are attached to the FIFO, its data flow blocks;
- An anonymous pipe is created automatically by the shell and exists only in the kernel, while a FIFO is created in the file system by a program (for example, the mkfifo command);
- An anonymous pipe is a unidirectional byte stream with one write end and one read end; a FIFO is a byte stream as well, and on Linux it can additionally be opened for both reading and writing at once (POSIX leaves this undefined);
For example, a FIFO can be used to implement an application with a single server and multiple clients.
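The basic FIFO round trip, one writer and one reader meeting through a name in the file system, can be sketched as follows. It assumes bash and the mkfifo command; the scratch directory and the name demo.fifo are hypothetical and exist only for this illustration.

```shell
#!/bin/bash
fifo_dir=$(mktemp -d)              # hypothetical scratch directory
fifo="$fifo_dir/demo.fifo"         # hypothetical FIFO name
mkfifo "$fifo"

# Writer: opening the FIFO for writing blocks until a reader opens it.
echo "hello through the FIFO" >"$fifo" &

# Reader: any process that knows the path could do this.
read -r line <"$fifo"
wait

echo "received: $line"
rm -rf "$fifo_dir"
```

The writer runs in the background because it would otherwise block forever, waiting for a reader that has not started yet.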
With this background in place, we can now turn to the main topic: how to control the number of concurrent processes when running many processes on Linux.
2. Linux multi-process concurrency control

Recently, Mr. A needed to generate the KPI data report for the whole of 2015. He has a production script ready, but each run can only produce the KPI data for one specified day. Assume that one run takes 5 minutes:
- Run in a loop, one day after another, the job takes 5 * 365 = 1825 minutes, which is about 30 hours.
- Launched all at once in the background, 365 simultaneous tasks are more than the system can bear!
Since 365 tasks cannot all be put in the background at once, can N tasks at a time be put in the background automatically and run concurrently? Of course.
    #!/bin/bash
    source /etc/profile;

    # -----------------------------

    tempfifo=$$.fifo        # $$ is the PID of the running script
    begin_date=$1           # start date
    end_date=$2             # end date

    if [ $# -eq 2 ]
    then
        # \> compares the two strings; the backslash stops the shell
        # from treating > as a redirection
        if [ "$begin_date" \> "$end_date" ]
        then
            echo "Error! $begin_date is greater than $end_date"
            exit 1;
        fi
    else
        echo "Error! Not enough params."
        echo "Sample: sh loop_kpi_report.sh 2015-01-01 2015-12-01"
        exit 2;
    fi

    # -----------------------------

    # On Ctrl+C (signal 2), close file descriptor 1000 and exit
    trap "exec 1000>&-;exec 1000<&-;exit 0" 2
    mkfifo $tempfifo
    exec 1000<>$tempfifo    # bind fd 1000 to the FIFO for reading and writing
    rm -rf $tempfifo

    # Write 8 empty lines: one per allowed background task
    for ((i=1; i<=8; i++))
    do
        echo >&1000
    done

    while [ "$begin_date" != "$end_date" ]
    do
        read -u1000         # take one line; blocks when none is left
        {
            echo $begin_date
            hive -f kpi_report.sql --hivevar date=$begin_date
            echo >&1000     # give the line back when the task is done
        } &

        begin_date=`date -d "+1 day $begin_date" +"%Y-%m-%d"`
    done

    wait
    echo "done!!!!!!!!!!"
- Argument handling (the assignments from $1 and $2 and the if block that follows). Suppose the script is run as:
  sh loop_kpi_report.sh 2015-01-01 2015-12-01
  - $1 is the first argument passed to the script, here 2015-01-01
  - $2 is the second argument passed to the script, here 2015-12-01
  - $# is the number of arguments passed to the script, here 2
  - The inner if compares the two input dates. \> is an escaped >, so the shell does not treat it as a redirection; inside [ ] it compares the two strings lexicographically, which orders dates written as YYYY-MM-DD correctly
- The trap line: if the running script is interrupted with Ctrl+C (signal 2, SIGINT), file descriptor 1000 is closed for both writing and reading and the script exits normally.
  - exec 1000>&-; closes file descriptor 1000 for writing
  - exec 1000<&-; closes file descriptor 1000 for reading
  - trap is the command that catches signals
- The mkfifo, exec and rm lines:
  - mkfifo creates a pipe file (a FIFO)
  - exec 1000<>$tempfifo binds file descriptor 1000 to the FIFO. < binds for reading, > binds for writing, and <> binds for both. From then on, every operation on file descriptor 1000 is equivalent to an operation on $tempfifo
  - rm removes the FIFO's name from the file system; the pipe itself stays usable through the open file descriptor 1000
  - A natural question: why not use the pipe file directly? The binding is not superfluous. An important property of a pipe is that a reader and a writer must exist at the same time; if one is missing, the other blocks. Binding the file descriptor for both reading and writing solves exactly this problem
- The for loop writes to file descriptor 1000: it writes 8 empty lines, where 8 is the number of background processes allowed to run concurrently. Why empty lines and not some other characters? Because read consumes the pipe one line at a time
- The while loop:
  - read -u1000 reads one line from the pipe, here an empty line. Each read leaves one fewer empty line in the pipe
  - Note the & after the { ... } block: it runs the block as a background process
  - The final echo >&1000 inside the block writes an empty line back to file descriptor 1000 once the background task has finished. This is the key: every read -u1000 takes one empty line out of the pipe, so once 8 tasks are running in the background there is no line left to read, and read -u1000 waits until a finished task returns its line
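The same mechanism can be tried without Hive. The following is a minimal sketch under stated assumptions, not the production script: sleep 1 stands in for the real task, the limit is 3 in place of 8, and file descriptor 9, the scratch directory and all variable names are choices made for this illustration. It assumes bash on Linux, where a FIFO can be opened with <>.

```shell
#!/bin/bash
max_jobs=3                      # hypothetical limit (the article uses 8)
total_tasks=6                   # hypothetical number of tasks
workdir=$(mktemp -d)            # hypothetical scratch directory
fifo="$workdir/tokens.fifo"
log="$workdir/finished.log"

mkfifo "$fifo"
exec 9<>"$fifo"                 # fd 9 for this sketch; the article uses 1000
rm -f "$fifo"                   # the open descriptor keeps the pipe alive

# One empty line per task that may run at the same time
for ((i = 1; i <= max_jobs; i++)); do
    echo >&9
done

start=$SECONDS
for ((task = 1; task <= total_tasks; task++)); do
    read -r -u 9                # take a line; blocks while max_jobs tasks run
    {
        sleep 1                 # stand-in for the real work
        echo "task $task" >>"$log"
        echo >&9                # give the line back
    } &
done
wait
elapsed=$((SECONDS - start))

exec 9>&-                       # close the descriptor
finished=$(wc -l <"$log" | tr -d ' ')
echo "finished $finished tasks in about ${elapsed}s"
rm -rf "$workdir"
```

With 6 one-second tasks and a limit of 3, the run takes at least 2 seconds: the fourth task cannot start until one of the first three has returned its line.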
3. References
- Unix Power Tools
- UNIX programming manual
- UNIX pipeline: https://zh.wikipedia.org/wiki/%E7%AE%A1%E9%81%93_(Unix)