Analysis of PHP's multi-tasking coroutine processing-PHP Tutorial-php.cn

This article mainly introduces the multi-tasking coroutine processing of PHP, which has a certain reference value. Now I share it with everyone. Friends in need can refer to it

So, let’s get started!

Analysis of PHPs multi-tasking coroutine processing

This is what we are going to discuss in this article. But we'll start with a simpler and more familiar example.

Everything starts with an array

We can use arrays through simple traversal:

$array = ["foo", "bar", "baz"];
 
foreach ($array as $key => $value) {
    print "item: " . $key . "|" . $value . "\n";
}
 
for ($i = 0; $i <p>This is the basic implementation that we rely on for daily coding. You can get the key name and key value of each element by traversing the array. </p><p>Of course, if we want to be able to know when an array can be used. PHP provides a convenient built-in function: </p><pre class="brush:php;toolbar:false">print is_array($array) ? "yes" : "no"; // yes

Copy after login

Array-like processing

Sometimes, we need to traverse some data in the same way, but they are not array types. For example, processing the DOMDocument class:

$document = new DOMDocument();
$document->loadXML("<p></p>");

$elements = $document->getElementsByTagName("p");
print_r($elements); // DOMNodeList Object ( [length] => 1 )

Copy after login

This is obviously not an array, but it has a length attribute. Can we traverse it like an array? We can determine whether it implements the following special interface:

print ($elements instanceof Traversable) ? "yes" : "no"; // yes

Copy after login

This is really useful. It will not cause us to trigger an error when iterating over non-traversable data. We only need to test before processing.

However, this will raise another question: Can we make custom classes also have this function? The answer is yes! The first implementation method looks like this:

class MyTraversable implements Traversable
{
    //  在这里编码...
}

Copy after login

If we execute this class, we will see an error message:

PHP Fatal error: Class MyTraversable must implement interface Traversable as part of either Iterator or IteratorAggregate

Iterator(Iterator)

We cannot directly implement Traversable, but we can try the second option:

class MyTraversable implements Iterator
{
    //  在这里编码...
}

Copy after login

This interface requires We implement 5 methods. Let's improve our iterator:

class MyTraversable implements Iterator
{
    protected $data;

    protected $index = 0;

    public function __construct($data)
    {
        $this->data = $data;
    }

    public function current()
    {
        return $this->data[$this->index];
    }

    public function next()
    {
        return $this->data[$this->index++];
    }

    public function key()
    {
        return $this->index;
    }

    public function rewind()
    {
        $this->index = 0;
    }

    public function valid()
    {
        return $this->index data);
    }
}

Copy after login

We need to pay attention to a few things here:

We need to store the $data passed in by the constructor method Array so that we can get its elements from it later.
An internal index (or pointer) is also required to track the current or next element.
rewind() Only resets the index property, so current() and next() to work properly.
Key names are not limited to numeric types! Array indexing is used here to keep the example simple.

We can run this code like this:

$iterator = new MyIterator(["foo", "bar", "baz"]);
 
foreach ($iterator as $key => $value) {
    print "item: " . $key . "|" . $value . "\n";
}

Copy after login

This may seem like too much work, but it is possible to use it like an arrayforeach A concise implementation of the /for function.

IteratorAggregate (aggregation iterator)

Do you still remember the Traversable exception thrown by the second interface? Let's look at an implementation that is faster than implementing the Iterator interface:

class MyIteratorAggregate implements IteratorAggregate
{
    protected $data;

    public function __construct($data)
    {
        $this->data = $data;
    }

    public function getIterator()
    {
        return new ArrayIterator($this->data);
    }
}

Copy after login

We're cheating here. Rather than implementing a complete Iterator, we decorate it through ArrayIterator(). However, this simplifies the code a lot compared to implementing the complete Iterator.

Analysis of PHPs multi-tasking coroutine processing

Brother, don’t worry! First let's compare some code. First, we read each line of data from the file without using a generator:

$content = file_get_contents(__FILE__);

$lines = explode("\n", $content);

foreach ($lines as $i => $line) {
    print $i . ". " . $line . "\n";
}

Copy after login

This code reads the file itself and then prints out the line number and code of each line. So why don't we use generators!

function lines($file) {
    $handle = fopen($file, 'r');

    while (!feof($handle)) {
        yield trim(fgets($handle));
    }

    fclose($handle);
}

foreach (lines(__FILE__) as $i => $line) {
    print $i . ". " . $line . "\n";
}

Copy after login

I know this seems more complicated. Good, but that's because we're not using the file_get_contents() function. A generator looks like a function, but it stops running every time it gets the yield keyword.

Generators look a bit like iterators:

print_r(lines(__FILE__)); // Generator Object ( )

Copy after login

Although it's not an iterator, it's a Generator. What methods are defined internally?

print_r(get_class_methods(lines(__FILE__)));
 
// Array
// (
//     [0] => rewind
//     [1] => valid
//     [2] => current
//     [3] => key
//     [4] => next
//     [5] => send
//     [6] => throw
//     [7] => __wakeup
// )

Copy after login

If you read a large file and then use memory_get_peak_usage(), you will notice that the generator code will use a fixed amount of memory, no matter how big the file is. It progresses one line at a time. Instead, use the file_get_contents() function to read the entire file, which will use more memory. This is where generators give us an advantage when iterating on things like this!

Send (send data)

Can send data to the generator. Take a look at the following generator:

<?php $generator = call_user_func(function() {
    yield "foo";
});

print $generator->current() . "\n"; // foo

Copy after login

注意这里我们如何在 call_user_func() 函数中封装生成器函数的？这里仅仅是一个简单的函数定义，然后立即调用它获取一个新的生成器实例...

我们已经见过 yield 的用法。我们可以通过扩展这个生成器来接收数据：

$generator = call_user_func(function() {
    $input = (yield "foo");

    print "inside: " . $input . "\n";
});

print $generator->current() . "\n";

$generator->send("bar");

Copy after login

数据通过 yield 关键字传入和返回。首先，执行 current() 代码直到遇到 yield，返回 foo。send() 将输出传入到生成器打印输入的位置。你需要习惯这种用法。

抛出异常（Throw）

由于我们需要同这些函数进行交互，可能希望将异常推送到生成器中。这样这些函数就可以自行处理异常。

看看下面这个示例：

$multiply = function($x, $y) {
    yield $x * $y;
};

print $multiply(5, 6)->current(); // 30

Copy after login

现在让我们将它封装到另一个函数中：

$calculate = function ($op, $x, $y) use ($multiply) {
    if ($op === 'multiply') {
        $generator = $multiply($x, $y);

        return $generator->current();
    }
};

print $calculate("multiply", 5, 6); // 30

Copy after login

这里我们通过一个普通闭包将乘法生成器封装起来。现在让我们验证无效参数：

$calculate = function ($op, $x, $y) use ($multiply) {

    if ($op === "multiply") {
        $generator = $multiply($x, $y);

        if (!is_numeric($x) || !is_numeric($y)) {
            throw new InvalidArgumentException();
        }

        return $generator->current();
    }
};

print $calculate('multiply', 5, 'foo'); // PHP Fatal error...

Copy after login

如果我们希望能够通过生成器处理异常？我们怎样才能将异常传入生成器呢！

$multiply = function ($x, $y) {
    try {
        yield $x * $y;
    } catch (InvalidArgumentException $exception) {
        print "ERRORS!";
    }
};

$calculate = function ($op, $x, $y) use ($multiply) {

    if ($op === "multiply") {
        $generator = $multiply($x, $y);

        if (!is_numeric($x) || !is_numeric($y)) {
            $generator->throw(new InvalidArgumentException());
        }

        return $generator->current();
    }
};
print $calculate('multiply', 5, 'foo'); // PHP Fatal error...

Copy after login

棒呆了！我们不仅可以像迭代器一样使用生成器。还可以通过它们发送数据并抛出异常。它们是可中断和可恢复的函数。有些语言把这些函数叫做……

Analysis of PHPs multi-tasking coroutine processing

我们可以使用协程（coroutines）来构建异步代码。让我们来创建一个简单的任务调度程序。首先我们需要一个 Task 类：

class Task
{
    protected $generator;

    public function __construct(Generator $generator)
    {
        $this->generator = $generator;
    }

    public function run()
    {
        $this->generator->next();
    }

    public function finished()
    {
        return !$this->generator->valid();
    }
}

Copy after login

Task 是普通生成器的装饰器。我们将生成器赋值给它的成员变量以供后续使用，然后实现一个简单的 run() 和 finished() 方法。run() 方法用于执行任务，finished() 方法用于让调度程序知道何时终止运行。

然后我们需要一个 Scheduler 类：

class Scheduler
{
    protected $queue;

    public function __construct()
    {
        $this->queue = new SplQueue();
    }

    public function enqueue(Task $task)
    {
        $this->queue->enqueue($task);
    }

    pulic function run()
    {
        while (!$this->queue->isEmpty()) {
            $task = $this->queue->dequeue();
            $task->run();

            if (!$task->finished()) {
                $this->queue->enqueue($task);
            }
        }
    }
}

Copy after login

Scheduler 用于维护一个待执行的任务队列。run() 会弹出队列中的所有任务并执行它，直到运行完整个队列任务。如果某个任务没有执行完毕，当这个任务本次运行完成后，我们将再次入列。

SplQueue 对于这个示例来讲再合适不过了。它是一种 FIFO（先进先出：fist in first out）数据结构，能够确保每个任务都能够获取足够的处理时间。

我们可以像这样运行这段代码：

$scheduler = new Scheduler();

$task1 = new Task(call_user_func(function() {
    for ($i = 0; $i enqueue($task1);
$scheduler->enqueue($task2);

$scheduler->run();

Copy after login

运行时，我们将看到如下执行结果：

task 1: 0
task 1: 1
task 2: 0
task 2: 1
task 1: 2
task 2: 2
task 2: 3
task 2: 4
task 2: 5

Copy after login

这几乎就是我们想要的执行结果。不过有个问题发生在首次运行每个任务时，它们都执行了两次。我们可以对 Task 类稍作修改来修复这个问题：

class Task
{
    protected $generator;

    protected $run = false;

    public function __construct(Generator $generator)
    {
        $this->generator = $generator;
    }

    public function run()
    {
        if ($this->run) {
            $this->generator->next();
        } else {
            $this->generator->current();
        }

        $this->run = true;
    }

    public function finished()
    {
        return !$this->generator->valid();
    }
}

Copy after login

我们需要调整首次 run() 方法调用，从生成器当前有效的指针读取运行。后续调用可以从下一个指针读取运行...

Analysis of PHPs multi-tasking coroutine processing

有些人基于这个思路实现了一些超赞的类库。我们来看看其中的两个...

RecoilPHP

RecoilPHP 是一套基于协程的类库，它最令人印象深刻的是用于 ReactPHP 内核。可以将事件循环在 RecoilPHP 和 RecoilPHP 之间进行交换，而你的程序无需架构上的调整。

我们来看一下 ReactPHP 异步 DNS 解决方案：

function resolve($domain, $resolver) {
    $resolver
        ->resolve($domain)
        ->then(function ($ip) use ($domain) {
            print "domain: " . $domain . "\n";
            print "ip: " . $ip . "\n";
        }, function ($error) {            
            print $error . "\n";
        })
}

function run()
{
    $loop = React\EventLoop\Factory::create();
 
    $factory = new React\Dns\Resolver\Factory();
 
    $resolver = $factory->create("8.8.8.8", $loop);
 
    resolve("silverstripe.org", $resolver);
    resolve("wordpress.org", $resolver);
    resolve("wardrobecms.com", $resolver);
    resolve("pagekit.com", $resolver);
 
    $loop->run();
}
 
run();

Copy after login

resolve() 接收域名和 DNS 解析器，并使用 ReactPHP 执行标准的 DNS 查找。不用太过纠结与 resolve() 函数内部。重要的是这个函数不是生成器，而是一个函数！

run() 创建一个 ReactPHP 事件循环，DNS 解析器（这里是个工厂实例）解析若干域名。同样，这个也不是一个生成器。

想知道 RecoilPHP 到底有何不同？还希望掌握更多细节！

use Recoil\Recoil;
 
function resolve($domain, $resolver)
{
    try {
        $ip = (yield $resolver->resolve($domain));
 
        print "domain: " . $domain . "\n";
        print "ip: " . $ip . "\n";
    } catch (Exception $exception) {
        print $exception->getMessage() . "\n";
    }
}
 
function run()
{
    $loop = (yield Recoil::eventLoop());
 
    $factory = new React\Dns\Resolver\Factory();
 
    $resolver = $factory->create("8.8.8.8", $loop);
 
    yield [
        resolve("silverstripe.org", $resolver),
        resolve("wordpress.org", $resolver),
        resolve("wardrobecms.com", $resolver),
        resolve("pagekit.com", $resolver),
    ];
}
 
Recoil::run("run");

Copy after login

通过将它集成到 ReactPHP 来完成一些令人称奇的工作。每次运行 resolve() 时，RecoilPHP 会管理由 $resoler->resolve() 返回的 promise 对象，然后将数据发送给生成器。此时我们就像在编写同步代码一样。与我们在其他一步模型中使用回调代码不同，这里只有一个指令列表。

RecoilPHP 知道它应该管理一个有执行 run() 函数时返回的 yield 数组。RoceilPHP 还支持基于协程的数据库（PDO）和日志库。

IcicleIO

IcicleIO 为了一全新的方案实现 ReactPHP 一样的目标，而仅仅使用协程功能。相比 ReactPHP 它仅包含极少的组件。但是，核心的异步流、服务器、Socket、事件循环特性一个不落。

让我们看一个 socket 服务器示例：

use Icicle\Coroutine\Coroutine;
use Icicle\Loop\Loop;
use Icicle\Socket\Client\ClientInterface;
use Icicle\Socket\Server\ServerInterface;
use Icicle\Socket\Server\ServerFactory;
 
$factory = new ServerFactory();
 
$coroutine = Coroutine::call(function (ServerInterface $server) {
    $clients = new SplObjectStorage();
     
    $handler = Coroutine::async(
        function (ClientInterface $client) use (&$clients) {
            $clients->attach($client);
             
            $host = $client->getRemoteAddress();
            $port = $client->getRemotePort();
             
            $name = $host . ":" . $port;
             
            try {
                foreach ($clients as $stream) {
                    if ($client !== $stream) {
                        $stream->write($name . "connected.\n");
                    }
                }
 
                yield $client->write("Welcome " . $name . "!\n");
                 
                while ($client->isReadable()) {
                    $data = trim(yield $client->read());
                     
                    if ("/exit" === $data) {
                        yield $client->end("Goodbye!\n");
                    } else {
                        $message = $name . ":" . $data . "\n";
                        
                        foreach ($clients as $stream) {
                            if ($client !== $stream) {
                                $stream->write($message);
                            }
                        }
                    }
                }
            } catch (Exception $exception) {
                $client->close($exception);
            } finally {
                $clients->detach($client);
                foreach ($clients as $stream) {
                    $stream->write($name . "disconnected.\n");
                }
            }
        }
    );
     
    while ($server->isOpen()) {
        $handler(yield $server->accept());
    }
}, $factory->create("127.0.0.1", 6000));
 
Loop::run();

Copy after login

据我所知，这段代码所做的事情如下: