PHP introduced generators in version 5.5, but the feature did not attract much attention. The official migration guide from PHP 5.4.x to PHP 5.5.x presents them as a simple way to implement iterators. But beyond that, in what scenarios can generators be used?
Generators are implemented with the yield keyword. They provide a simple way to implement iterators, without the overhead or complexity of writing a class that implements the Iterator interface.
The documentation provides a simple example of such an iterator:
function xrange($start, $limit, $step = 1) {
    for ($i = $start; $i <= $limit; $i += $step) {
        yield $i;
    }
}
Let's compare it with a version that builds and returns a complete array instead of using a generator:
function xrange($start, $limit, $step = 1) {
    $elements = [];
    for ($i = $start; $i <= $limit; $i += $step) {
        $elements[] = $i;
    }
    return $elements;
}
Both versions of the function support iterating over all elements with foreach:
foreach (xrange(1, 100) as $i) {
    print $i . PHP_EOL;
}
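One practical difference between the two versions is that the generator produces values only as they are requested, while the array version has to build every element before returning. The following is a minimal sketch of that behavior; the huge upper bound and the early break are chosen only for illustration.

```php
<?php

// Generator version from the text: values are produced one at a time.
function xrange($start, $limit, $step = 1) {
    for ($i = $start; $i <= $limit; $i += $step) {
        yield $i;
    }
}

$firstThree = [];

// The upper bound is huge, but only three values are ever computed,
// because the loop stops asking for more after the third one.
foreach (xrange(1, PHP_INT_MAX) as $i) {
    $firstThree[] = $i;
    if (count($firstThree) === 3) {
        break;
    }
}

print implode(', ', $firstThree) . PHP_EOL; // 1, 2, 3
```

The array-based version could not be called with the same arguments, since it would try to allocate every element up front.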
So besides a shorter function definition, what else do we gain? What exactly does yield do? Why can the first function return data even though it has no return statement?
Let's start with the return value. Generators are a very special kind of function in PHP. When a function contains yield, it is no longer an ordinary function: it always returns a Generator instance. Generator implements the Iterator interface, which is why it can be traversed with foreach.
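This can be checked directly. The short sketch below reuses the xrange() function from the documentation example and inspects what a call to it returns.

```php
<?php

function xrange($start, $limit, $step = 1) {
    for ($i = $start; $i <= $limit; $i += $step) {
        yield $i;
    }
}

// Calling the function does not run its body; it returns a Generator object.
$generator = xrange(1, 3);

var_dump($generator instanceof Generator);   // bool(true)
var_dump($generator instanceof Iterator);    // bool(true)
var_dump($generator instanceof Traversable); // bool(true)
```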
Next, I use the methods of the Iterator interface to rewrite the previous foreach loop. You can view the results at 3v4l.org.
$generator = xrange(1, 100);
while ($generator->valid()) {
    print $generator->current() . PHP_EOL;
    $generator->next();
}
As this shows, a generator is more than an ordinary function. Let's write a new example to better understand what happens inside a generator.
function foobar() {
    print 'foobar - start' . PHP_EOL;
    for ($i = 0; $i < 5; $i++) {
        print 'foobar - yielding...' . PHP_EOL;
        yield $i;
        print 'foobar - continued...' . PHP_EOL;
    }
    print 'foobar - end' . PHP_EOL;
}

$generator = foobar();
print 'Generator created' . PHP_EOL;
while ($generator->valid()) {
    print "Getting current value from the generator..." . PHP_EOL;
    print $generator->current() . PHP_EOL;
    $generator->next();
}
The script prints the following:
Generator created
foobar - start
foobar - yielding...
Getting current value from the generator...
0
foobar - continued...
foobar - yielding...
Getting current value from the generator...
1
foobar - continued...
foobar - yielding...
Getting current value from the generator...
2
foobar - continued...
foobar - yielding...
Getting current value from the generator...
3
foobar - continued...
foobar - yielding...
Getting current value from the generator...
4
foobar - continued...
foobar - end
Huh? Why is "Generator created" printed first? Because the generator does nothing until it is used. In the example above, it is the call to $generator->valid() that starts executing the generator. The generator runs until the first yield and then returns control to the caller, $generator->valid(). When $generator->next() is called, execution resumes and stops again at the next yield, and so on until there are no more yields. We now have interruptible functions that can be paused and resumed at any yield. This makes it possible to write lazy functions that do their work only when the client asks for it.
For example, you can create a function that reads all users from the GitHub API. The API is paginated, but you can hide that detail and fetch the next page only when it is needed. You can use yield to hand out each user from the current page; once all users on the current page have been consumed, the function fetches the next page.
class GitHubClient {
    // get() is not shown here; it is expected to perform the HTTP request
    // and return the decoded response for the given URI.
    function getUsers(): Iterator {
        $uri = '/users';
        do {
            $response = $this->get($uri);
            foreach ($response->items as $user) {
                yield $user;
            }
            $uri = $response->nextUri;
        } while ($uri !== null);
    }
}
The client can iterate over all users or stop the traversal at any time.
Yes, you are right: everything explained so far can be found in the PHP documentation. But using generators only as iterators does not tap even half of their power. Generators also provide the send() and throw() methods, which are not part of the Iterator interface. We talked earlier about the ability to pause and resume generator execution. To resume a generator, you can use not only Generator::next() but also Generator::send() and Generator::throw().
Generator::send() lets you specify the value that the yield expression evaluates to, while Generator::throw() lets you throw an exception from the yield. With these methods we can not only get data from a generator, but also send new data into it.
Let's look at a logger example taken from "Cooperative multitasking using coroutines" (highly recommended reading).
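The logger listing itself does not appear at this point in the text. A minimal sketch of such a logger, modeled on the description that follows it, might look like this; the log file location and the messages are assumptions made for illustration.

```php
<?php

function logger($fileName) {
    $fileHandle = fopen($fileName, 'a');
    while (true) {
        // yield is used as an expression: it evaluates to whatever
        // the caller passes to Generator::send().
        $message = yield;
        fwrite($fileHandle, $message . "\n");
    }
}

// Hypothetical log file in the system temp directory.
$logFile = sys_get_temp_dir() . '/generator-logger-example.log';
file_put_contents($logFile, ''); // start with an empty file

$logger = logger($logFile);
$logger->send('Foo');
$logger->send('Bar');

print file_get_contents($logFile);
```

The first send() call starts the generator, runs it up to the first yield, and then delivers 'Foo' as the value of that yield.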
yield is used as an expression here. When we send data, it is returned from yield and then passed as an argument to fwrite().
To be honest, this example is not useful in real projects. It only demonstrates how Generator::send() works, and merely being able to send data does not achieve much. The same thing could be done with a class and an ordinary method.
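Generator::throw(), mentioned earlier alongside send(), resumes the generator in a similar way, except that the yield expression raises an exception instead of evaluating to a value. The following sketch uses a hypothetical worker() generator and made-up job names to show both calls side by side.

```php
<?php

// Hypothetical generator used only to demonstrate send() and throw().
function worker() {
    while (true) {
        try {
            $job = yield;
            print "Processing {$job}" . PHP_EOL;
        } catch (InvalidArgumentException $e) {
            // Inside the generator, the exception appears to be thrown
            // by the yield expression itself.
            print 'Caught: ' . $e->getMessage() . PHP_EOL;
        }
    }
}

$worker = worker();
$worker->send('job 1');                                   // Processing job 1
$worker->throw(new InvalidArgumentException('bad job'));  // Caught: bad job
$worker->send('job 2');                                   // Processing job 2
```

Because the generator catches the exception, it keeps running and can accept further values. An exception it does not catch would propagate out of the throw() call to the caller.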
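The fun of generators comes from using yield to produce data that a "generator runner" then acts on before resuming the generator. "Coroutines" and "stateful streaming parsers" are examples of this. Before covering them, let's take a quick look at how to return data from a generator, which we have not touched on yet. Since PHP 5.5 we can use a return; statement inside a generator, but it cannot return a value. The only purpose of return; is to end the generator's execution.
Since PHP 7.0, however, return values are supported. This may seem a little odd when a generator is used for iteration, but it is very useful in other scenarios such as coroutines. For example, when running a generator we can act on its return value without operating on the generator directly. The next section explains how the return statement is used in coroutines.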
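As a small illustration of generator return values, the sketch below uses a hypothetical sumUpTo() generator and reads its result with Generator::getReturn(), which is available from PHP 7.0.

```php
<?php

// Requires PHP 7.0 or later.
// Hypothetical generator: yields 1..$limit and returns their sum.
function sumUpTo($limit) {
    $sum = 0;
    for ($i = 1; $i <= $limit; $i++) {
        $sum += $i;
        yield $i;
    }
    return $sum;
}

$generator = sumUpTo(4);
foreach ($generator as $value) {
    print $value . PHP_EOL;
}

// The return value is only available once the generator has finished.
print 'Sum: ' . $generator->getReturn() . PHP_EOL; // Sum: 10
```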
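Amp is a framework for asynchronous programming in PHP. It supports coroutines that yield promises, which are essentially placeholders for results that are not yet available. The "generator runner" is the Coroutine class. It subscribes to the yielded promise and resumes the generator once the result is available. If the operation fails, it throws an exception into the generator. You can find the implementation details in the amphp/amp repository. In Amp, a Coroutine is itself a Promise. If the coroutine throws an uncaught exception, the coroutine fails. If it succeeds, it resolves with a value. That value looks just like the return value of an ordinary function, except that it lives in an asynchronous execution context. This is why generators need return values, and why the feature was added in PHP 7.0. Treating the last yielded value as the return value is possible, but it is not a good solution.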
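To make the idea of a generator runner more concrete, here is a deliberately simplified sketch. It is not Amp's Coroutine class and involves no promises or event loop: the hypothetical run() function assumes the generator yields plain callables, executes each one synchronously, sends the result back in, and throws any failure into the generator.

```php
<?php

// Hypothetical runner, for illustration only; this is not Amp's Coroutine.
function run(Generator $coroutine) {
    while ($coroutine->valid()) {
        $task = $coroutine->current();
        try {
            $result = $task();
        } catch (Throwable $e) {
            // Resume the generator by raising the failure at the yield.
            $coroutine->throw($e);
            continue;
        }
        // Resume the generator; $result becomes the value of the yield.
        $coroutine->send($result);
    }
    // PHP 7.0+: the generator's return value is the coroutine's result.
    return $coroutine->getReturn();
}

$result = run((function () {
    $a = yield function () { return 20; };
    try {
        yield function () { throw new RuntimeException('task failed'); };
    } catch (RuntimeException $e) {
        $b = 22; // fall back to a default when the task fails
    }
    return $a + $b;
})());

print $result . PHP_EOL; // 42
```

The code inside the generator reads like ordinary sequential code with try/catch and return, while the runner decides when and how each yielded task is carried out.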
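Amp lets you write non-blocking code as if it were blocking code, while other non-blocking events can still be processed in the same process. One use case is making several HTTP requests to one or more third-party APIs concurrently, but it is not limited to that. Thanks to the event loop, many I/O operations of any kind can be handled at the same time, not just HTTP requests.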
use Amp\Loop;

Loop::run(function () {
    $uris = [
        "https://google.com/",
        "https://github.com/",
        "https://stackoverflow.com/",
    ];

    $client = new Amp\Artax\DefaultClient;

    // Start all requests without waiting; each call returns a promise.
    $promises = [];
    foreach ($uris as $uri) {
        $promises[$uri] = $client->request($uri);
    }

    // Pause the coroutine until all responses are available.
    $responses = yield $promises;

    foreach ($responses as $uri => $response) {
        print $uri . " - " . $response->getStatus() . PHP_EOL;
    }
});
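Asynchronous coroutines are not the only case where yield has a value on its right-hand side as well as one on its left-hand side. The other is the parsers mentioned earlier.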
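// Parser is assumed here to be Amp\Parser\Parser from the amphp/parser package.
$parser = new Parser((function () {
    while (true) {
        // Yield the delimiter to wait for; receive the data read up to it.
        $line = yield "\r\n";
        if (trim($line) === "") {
            continue;
        }
        print "New item: {$line}" . PHP_EOL;
    }
})());

for ($i = 0; $i < 100; $i++) {
    $parser->push("bar\r");
    $parser->push("\nfoo");
}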
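The parser buffers all input until it receives \r\n. Generator-based parsers like this do not make simple protocols (such as newline-delimited ones) much easier to handle, but they are useful for complex parsers, such as the one Aerys uses to parse HTTP requests on the server.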
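The buffering behavior can be sketched with a small stand-in runner. The TinyParser class below is hypothetical and much simpler than a real parser class; it assumes the generator always yields a string delimiter and only illustrates how pushed data is buffered and handed to the generator once the delimiter arrives.

```php
<?php

// Hypothetical, simplified runner for delimiter-based generator parsers.
class TinyParser {
    private $generator;
    private $buffer = '';

    public function __construct(Generator $generator) {
        $this->generator = $generator;
    }

    public function push($data) {
        $this->buffer .= $data;
        // current() is the delimiter the generator is waiting for.
        while ($this->generator->valid()) {
            $delimiter = $this->generator->current();
            $position = strpos($this->buffer, $delimiter);
            if ($position === false) {
                return; // not enough data yet, keep buffering
            }
            $chunk = substr($this->buffer, 0, $position);
            $this->buffer = substr($this->buffer, $position + strlen($delimiter));
            // Resume the generator; $chunk becomes the value of the yield.
            $this->generator->send($chunk);
        }
    }
}

$lines = [];
$parser = new TinyParser((function () use (&$lines) {
    while (true) {
        $line = yield "\r\n";
        if (trim($line) === '') {
            continue;
        }
        $lines[] = $line;
    }
})());

$parser->push("bar\r");  // delimiter incomplete, nothing delivered yet
$parser->push("\nfoo");  // completes "bar\r\n", so "bar" is delivered
$parser->push("\r\n");   // completes "foo\r\n", so "foo" is delivered

print implode(', ', $lines) . PHP_EOL; // bar, foo
```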