An async map function
Laravel has an excellent Collection class that offers many useful operations. The class is also macroable. This means that you can add functions to it at runtime by calling macro on it and passing a name and a closure. In our projects we tend to code up the same macros over and over again. That's why we have put those macros in a package called laravel-collection-macros so we, and the community, can reuse them. In this post I'd like to talk a bit about a new macro that we added today called parallelMap.
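To make the idea of a macroable class a bit more concrete, here is a minimal sketch in plain PHP. MiniCollection and the double macro are hypothetical names made up for this illustration; this is not Laravel's actual Collection, only a simplified stand-in that assumes PHP 7.4 or newer.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a macroable collection; NOT Laravel's real class.
final class MiniCollection
{
    /** @var array<string, Closure> macros registered at runtime, keyed by name */
    private static array $macros = [];

    /** @var array the wrapped items */
    public array $items;

    public function __construct(array $items)
    {
        $this->items = $items;
    }

    // Register a closure under a method name.
    public static function macro(string $name, Closure $macro): void
    {
        self::$macros[$name] = $macro;
    }

    // Called by PHP when an undefined method is invoked on an instance.
    public function __call(string $method, array $arguments)
    {
        if (!isset(self::$macros[$method])) {
            throw new BadMethodCallException("Method {$method} does not exist.");
        }

        // Bind the closure so that $this inside it refers to this instance.
        $bound = Closure::bind(self::$macros[$method], $this, self::class);

        return $bound(...$arguments);
    }
}

// Add a "double" method at runtime (hypothetical macro for illustration).
MiniCollection::macro('double', function (): MiniCollection {
    return new MiniCollection(array_map(fn ($n) => $n * 2, $this->items));
});

$doubled = (new MiniCollection([1, 2, 3]))->double();
print_r($doubled->items);
```

The key point is that the closure is bound to the instance before it is called, which is why a macro can use $this as if it were a regular method.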
parallelMap is identical to map, but each item in the collection will be processed in parallel. Let's take a look at an example:
$pageSources = collect($urls)->parallelMap(function ($url) {
    return file_get_contents($url);
});
The content of the given $urls will be fetched at the same time. This will be much faster than fetching the content of one URL after the other. Cool stuff!
Here's another piece of code taken from our tests:
/** @test */
public function it_can_perform_async_map_operations()
{
    $this->startStopWatch();

    $collection = Collection::make([1, 2, 3, 4, 5])->parallelMap(function (int $number) {
        sleep(1);

        return $number * 10;
    });

    $this->assertTookLessThanSeconds(2);

    $this->assertEquals([10, 20, 30, 40, 50], $collection->toArray());
}
You're probably wondering how this magic works. Well, the hard part is done inside Amp's new package called parallel-functions. Here's a short description of what it does, taken from their docs:
amphp/parallel-functions is a simplifying layer on top of amphp/parallel. It allows parallel code execution by leveraging threads or processes, depending on the installed extensions. All data sent to / received from the child processes / threads must be serializable using PHP’s serialize() function.
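The serialization requirement in that description is easy to overlook, so here is a small sketch in plain PHP of what does and does not survive serialize(). The $payload array is made-up example data. The sketch only illustrates the constraint on the data being passed around; it says nothing about how the package transports the callback itself.

```php
<?php
declare(strict_types=1);

// Plain data (scalars and arrays) survives a serialize()/unserialize() round trip.
$payload = ['url' => 'https://example.com', 'retries' => 3];
$roundTripped = unserialize(serialize($payload));

// A value that holds a closure cannot pass through plain serialize():
// PHP throws an Exception instead of producing a string.
$closureSerializable = true;
try {
    serialize(['callback' => function () {
        return 42;
    }]);
} catch (Exception $e) {
    $closureSerializable = false;
}

var_dump($roundTripped === $payload, $closureSerializable);
```

In practice this means the items in the collection and the values returned from the callback should be plain data, not things like closures or open connections.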
Here's an example, again taken from their docs, on how you can use the package directly:
use Amp\Promise;
use function Amp\ParallelFunctions\parallelMap;

$values = Promise\wait(parallelMap([1, 2, 3], function ($time) {
    \sleep($time); // a blocking function call, might also do blocking I/O here

    return $time * $time;
}));
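To get a feel for what "parallel execution in child processes" means, here is a deliberately naive sketch in plain PHP. It is not how Amp implements things. naiveParallelMap is a hypothetical helper written for this post; it assumes PHP 7.4 or newer, run from the command line, and it takes the worker as a string of PHP code because the point is only to show data travelling to and from child processes via serialize().

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: runs $workerCode once per item, each in its own
 * `php -r` child process. Input and output travel through serialize().
 */
function naiveParallelMap(array $items, string $workerCode): array
{
    $processes = [];

    // Start every child first so they all run at the same time.
    foreach ($items as $key => $item) {
        $spec = [0 => ['pipe', 'r'], 1 => ['pipe', 'w']];
        $process = proc_open([PHP_BINARY, '-r', $workerCode], $spec, $pipes);

        if ($process === false) {
            throw new RuntimeException('Could not start a child process.');
        }

        // Send the serialized item to the child's stdin, then close it.
        fwrite($pipes[0], serialize($item));
        fclose($pipes[0]);

        $processes[$key] = [$process, $pipes[1]];
    }

    // Collect the results; reading blocks until each child has finished.
    $results = [];
    foreach ($processes as $key => [$process, $stdout]) {
        $results[$key] = unserialize(stream_get_contents($stdout));
        fclose($stdout);
        proc_close($process);
    }

    return $results;
}

// The worker reads one serialized item, "works" for a second, and writes back.
$worker = '$n = unserialize(stream_get_contents(STDIN)); sleep(1); echo serialize($n * 10);';

$start = microtime(true);
$results = naiveParallelMap([1, 2, 3], $worker);
$elapsed = microtime(true) - $start;

print_r($results);
printf("Took %.2f seconds\n", $elapsed);
```

Because all three children sleep at the same time, the total run time should be closer to one second than to three on most machines. Starting a PHP process per item is expensive, though, which is a large part of the overhead you pay for this kind of parallelism.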
The parallelMap macro in our package simply uses their magic. Here's the definition of the macro:
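use Illuminate\Support\Collection;
use function Amp\ParallelFunctions\parallelMap;
use function Amp\Promise\wait;

Collection::macro('parallelMap', function (callable $callback): Collection {
    $promises = parallelMap($this->items, $callback);
    // Block until every item has been processed, then replace the items in place.
    $this->items = wait($promises);
    return $this;
});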
Be aware that you shouldn't use parallelMap if the work done in the closure is very simple. Because the work is handed off to child processes or threads, using parallelMap adds quite a bit of overhead and is memory intensive. Don't use this for small operations or on a large collection.
Thanks to Niklas Keller for Amp and the wonderful amphp/parallel-functions package.