PHP5.5 A better new feature is the implementation of support for generators and collaborative programs. For generators, PHP documentation and various other blog posts (like this one or this one) have been explained in great detail. Collaboration programs are less concerned about, so the collaborative process, although it has a very powerful function but also difficult to be known, it is difficult to explain.
This article guides you through the use of collaborative programs to implement task scheduling, through an example to achieve the understanding of technology. I will make a simple background introduction in the previous sections. If you already have a better foundation, you can skip to the "Collaborative multitasking" section.
Build Device
The basic idea of the builder is also a function, the return value of this function is output sequentially, instead of returning only a single value. Or, in other words, the generator makes it easier for you to implement the iterator interface. Here's a simple explanation by implementing a xrange function:
Copy Code code as follows:
<?php
function xrange ($start, $end, $step = 1) {
for ($i = $start; $i <= $end; $i = = $step) {
Yield $i;
}
}
foreach (xrange (1, 1000000) as $num) {
echo $num, "\ n";
}
The above xrange () function provides the same functionality as the PHP built-in function range (). But the difference is that the range () function returns an array that contains a group of values ranging from 1 to 1 million (note: Check the manual). The xrange () function returns an iterator that sequentially outputs the values, and does not actually compute in the form of an array.
The advantages of this approach are obvious. It allows you to handle large data sets without having to load them into memory at once. You can even handle an infinitely large stream of data.
Of course, this function can be implemented differently through the generator, but it can be implemented by inheriting the iterator interface. It would be more convenient to implement it by using a generator without having to implement the 5 methods in the iterator interface.
Generator is an interruptible function
It is important to understand the synergy from the builder and how it works internally: The generator is an interruptible function in which the yield form the interrupt point.
Immediately following the example above, if you call Xrange (1,1000000), the code in the Xrange () function does not really run. Instead, PHP simply returns an instance of the generator class that implements the iterator interface:
Copy Code code as follows:
<?php
$range = xrange (1, 1000000);
Var_dump ($range); Object (Generator) #1
Var_dump ($range instanceof iterator); BOOL (TRUE)
You invoke an iterator method on an object once, where the code runs once. For example, if you call $range->rewind (), then the code in Xrange () runs to the place where the control flow appears yield for the first time. In this case, this means that when the $i= $start, the yield $i run. The value passed to the yield statement is obtained using $range->current ().
In order to continue executing the code in the generator, you must invoke the $range->next () method. This will start the generator again until the yield statement appears. Therefore, successive calls to the next () and current () methods will enable you to get all the values from the generator until a yield statement is not present at some point. For Xrange (), this situation occurs when $i exceeds $end. In this case, the control flow will reach the end of the function, so no code will be executed. Once this happens, the Vaild () method returns False, at which point the iteration ends.
co-process
The main thing the coprocessor adds to the above function is the ability to send the data to the generator. This translates the generator to the caller's one-way communication into two-way communication between the two.
Pass the data to the coprocessor by calling the generator's send () method instead of its next () method. The following logger () coprocessor is an example of how this communication works:
Copy Code code as follows:
<?php
function Logger ($fileName) {
$fileHandle = fopen ($fileName, ' a ');
while (true) {
Fwrite ($fileHandle, yield. "\ n");
}
}
$logger = Logger (__dir__. '/log ');
$logger->send (' Foo ');
$logger->send (' Bar ')
As you can see, here yield is not used as a statement, but as an expression. That is, it has a return value. The return value of the yield is the value passed to the Send () method. In this example, yield will first return "Foo" and then return to "Bar".
In the example above yield only as a receiver. It is possible to mix two uses, that is, to receive and to send. Examples of how communications are received and sent are as follows:
Copy Code code as follows:
<?php
Function Gen () {
$ret = (yield ' yield1 ');
Var_dump ($ret);
$ret = (yield ' yield2 ');
var_dump ($ret);
}
$gen = Gen ();
Var_dump ($gen->current ()); String (6) "Yield1"
Var_dump ($gen->send (' Ret1 ')); String (4) "Ret1" (the Var_dump in Gen)
String (6) "Yield2" (the Var_dump of the->send () return value)
Var_dump ($gen->send (' Ret2 ')); String (4) "Ret2" (again from within Gen)
NULL (The return value of->send ())
It's a bit difficult to understand the exact order of the output right away, so make sure you know why you're exporting it in this way. I would like to highlight two points: 1th, the use of parentheses on both sides of the yield expression is not accidental. For technical reasons (although I have considered adding an exception to the assignment, as in Python), parentheses are required. 2nd, you may have noticed that rewind () was not invoked before calling current (). If this is done, the rewind operation has been implicitly performed.
Multi-Task collaboration
If you read the logger () example above, then do you think "Why should I use a coprocessor for two-way communication?" Why can't I just use the usual classes? "You are perfectly right to ask. The above example illustrates the basic usage, but the context does not really show the advantages of using a coprocessor. This is the reason for listing a number of examples of a synergistic process. As mentioned in the previous introduction, the coprocessor is a very powerful concept, but such applications are rare and often complex. It's hard to give some simple and real examples.
In this article, what I decided to do is to use the coprocessor to achieve multitasking. The problem we try to solve is that you want to run multiple tasks (or "programs") concurrently. However, the processor can only run one task at a time (the goal of this article is not to consider multi-core). So the processor needs to switch between different tasks, and always let each task run "a little while".
Multi-task collaboration The term "collaboration" illustrates how to switch: it requires that currently running tasks automatically pass control back to the scheduler, so that it can run other tasks. This is in contrast to preemption multitasking: A scheduler can interrupt a task that runs for a period of time, whether it likes it or not. Collaborative multitasking is used in earlier versions of Windows (WINDOWS95) and Mac OS, but they later switch to using preemptive multitasking. The reason is quite clear: if you rely on the program to automatically return control, then the bad behavior of the software will easily occupy the entire CPU for itself, not shared with other tasks.
This time you should understand the connection between the coprocessor and the Task Scheduler: The yield directive provides a way to interrupt the task itself, and then passes control to the scheduler. Therefore, the coprocessor can run several other tasks. Further, yield can be used to communicate between tasks and dispatchers.
Our goal is to use a more lightweight wrapper for "tasks" with a process function:
Copy Code code as follows:
<?php
Class Task {
protected $taskId;
protected $coroutine;
protected $sendValue = null;
protected $beforeFirstYield = true;
Public function __construct ($taskId, generator $coroutine) {
$this->taskid = $taskId;
$this->coroutine = $coroutine;
}
Public Function GetTaskID () {
return $this->taskid;
}
Public Function Setsendvalue ($sendValue) {
$this->sendvalue = $sendValue;
}
Public Function run () {
if ($this->beforefirstyield) {
$this->beforefirstyield = false;
return $this->coroutine->current ();
} else {
$retval = $this->coroutine->send ($this->sendvalue);
$this->sendvalue = null;
return $retval;
}
}
Public Function isfinished () {
Return! $this->coroutine->valid ();
}
}
One task is to mark a coprocessor with a task ID. Using the Setsendvalue () method, you can specify which values will be sent to the next recovery (after which you will understand that we need this). The run () function does not do anything except to invoke a synergistic program of the Send () method. To understand why adding beforefirstyieldflag, you need to consider the following code fragment:
Copy Code code as follows:
<?php
Function Gen () {
Yield ' foo ';
Yield ' bar ';
}
$gen = Gen ();
Var_dump ($gen->send (' something '));
As the Send () happens before the yield there is a implicit rewind () call,
So what really happens are this:
$gen->rewind ();
Var_dump ($gen->send (' something '));
The rewind () would advance to the the-yield (and ignore its value), the Send () would
Advance to the second yield (and dump its value). Thus we loose the yielded value!
By adding beforefirstyieldcondition we can determine that the value of the yield is returned.
The scheduler now has to do a little bit more than a multitasking loop before running multiple tasks:
Copy Code code as follows:
<?php
Class Scheduler {
protected $maxTaskId = 0;
protected $TASKMAP = []; TaskId => Task
protected $taskQueue;
Public Function __construct () {
$this->taskqueue = new Splqueue ();
}
Public function NewTask (generator $coroutine) {
$tid = + + $this->maxtaskid;
$task = new Task ($tid, $coroutine);
$this->taskmap[$tid] = $task;
$this->schedule ($task);
return $tid;
}
Public Function Schedule (Task $task) {
$this->taskqueue->enqueue ($task);
}
Public Function run () {
while (! $this->taskqueue->isempty ()) {
$task = $this->taskqueue->dequeue ();
$task->run ();
if ($task->isfinished ()) {
unset ($this->taskmap[$task->gettaskid ()]);
} else {
$this->schedule ($task);
}
}
}
}
The NewTask () method creates a new task with the next idle task ID, and then puts the task in the task map array. Then it realizes the task scheduling by putting the task into the task queue. Then the run () method scans the task queue and runs the task. If a task is finished, it will be removed from the queue, or it will be scheduled again at the end of the queue.
Let's look at the following scheduler with two simple (and meaningless) tasks:
Copy Code code as follows:
<?php
function Task1 () {
for ($i = 1; $i <= + + $i) {
echo "This is Task 1 iteration $i. \ n";
Yield
}
}
function Task2 () {
for ($i = 1; $i <= 5; + + $i) {
echo "This is Task 2 iteration $i. \ n";
Yield
}
}
$scheduler = new Scheduler;
$scheduler->newtask (Task1 ());
$scheduler->newtask (Task2 ());
$scheduler->run ();
Both tasks echo only one piece of information and then use yield to pass control back to the scheduler. The output results are as follows:
Copy Code code as follows:
This is task 1 iteration 1.
This is task 2 iteration 1.
This is Task 1 iteration 2.
This is Task 2 iteration 2.
This is Task 1 Iteration 3.
This is Task 2 iteration 3.
This is Task 1 iteration 4.
This is Task 2 iteration 4.
This is Task 1 iteration 5.
This is Task 2 iteration 5.
This is Task 1 iteration 6.
This is Task 1 iteration 7.
This is Task 1 iteration 8.
This is Task 1 Iteration 9.
This is Task 1 iteration 10.
The output is indeed as we expected: for the first five iterations, the two tasks are run alternately, and after the second task, only the first task continues to run.
communicating with the scheduler
Now that the scheduler is running, we move on to the next item in the timeline: the communication between the task and the scheduler. We will use processes to communicate in the same way as the operating system session: System calls. The reason we need a system call is that the operating system is at a different level of authority than the process. So in order to perform privileged operations (such as killing another process), you have to somehow pass control back to the kernel so that the kernel can do what it says. Again, this behavior is implemented internally through the use of interrupt directives. Used to be generic int directives, now using more specific and faster syscall/sysenter directives.
Our Task Scheduler system will reflect this design: instead of simply passing the scheduler to the task (so long as it allows it to do whatever it wants), we will communicate with the system call by passing information to the yield expression. Here yield is a break, and a way to pass information to the scheduler (and to pass information from the scheduler).
To illustrate the system call, I'll make a small encapsulation of the callable system call:
Copy Code code as follows:
<?php
Class Systemcall {
protected $callback;
Public function __construct (callable $callback) {
$this->callback = $callback;
}
Public Function __invoke (Task $task, Scheduler $scheduler) {
$callback = $this->callback; Can ' t call it directly in PHP:/
Return $callback ($task, $scheduler);
}
}
It will run like any other callable (using _invoke), but it requires the scheduler to pass the calling task and itself to the function. To solve this problem we have to slightly modify the scheduler's Run method:
Copy Code code as follows:
<?php
Public Function run () {
while (! $this->taskqueue->isempty ()) {
$task = $this->taskqueue->dequeue ();
$retval = $task->run ();
if ($retval instanceof Systemcall) {
$retval ($task, $this);
Continue
}
if ($task->isfinished ()) {
unset ($this->taskmap[$task->gettaskid ()]);
} else {
$this->schedule ($task);
}
}
}
The first system call did nothing but return the task ID:
Copy Code code as follows:
<?php
function GetTaskID () {
return new Systemcall (function (Task $task, Scheduler $scheduler) {
$task->setsendvalue ($task->gettaskid ());
$scheduler->schedule ($task);
});
}
This function does set the task ID to the next value to be sent and dispatches the task again. Because a system call is used, the scheduler does not automatically invoke the task, and we need to manually schedule the task (you will see why this is done later). To use this new system call, we'll rewrite the previous example:
Copy Code code as follows:
<?php
function Task ($max) {
$tid = (yield gettaskid ()); <--here ' s the syscall!
for ($i = 1; $i <= $max; + + $i) {
echo "This is task $tid iteration $i. \ n";
Yield
}
}
$scheduler = new Scheduler;
$scheduler->newtask (Task (10));
$scheduler->newtask (Task (5));
$scheduler->run ();
This code will give the same output as the previous example. Note that the system call runs as normal as any other call, but adds a yield in advance. More than two system calls are required to create new tasks and then kill them:
Copy Code code as follows:
<?php
Function NewTask (generator $coroutine) {
return new Systemcall (
function (Task $task, Scheduler $scheduler) use ($coroutine) {
$task->setsendvalue ($scheduler->newtask ($coroutine));
$scheduler->schedule ($task);
}
);
}
function Killtask ($tid) {
return new Systemcall (
function (Task $task, Scheduler $scheduler) use ($tid) {
$task->setsendvalue ($scheduler->killtask ($tid));
$scheduler->schedule ($task);
}
);
}
The Killtask function needs to add a method to the scheduler:
Copy Code code as follows:
<?php
Public Function Killtask ($tid) {
if (!isset ($this->taskmap[$tid])) {
return false;
}
unset ($this->taskmap[$tid]);
This is a bit ugly and could being optimized so it does not have to walk the queue,
But assuming that killing tasks are rather rare I won ' t bother with it now
foreach ($this->taskqueue as $i => $task) {
if ($task->gettaskid () = = $tid) {
unset ($this->taskqueue[$i]);
Break
}
}
return true;
}
Current 1/2 page
12 Next read the full text