I was working on a Drupal 8 project where I had to iterate over a very large dataset during a cron run. Building that entire dataset as an array would have cost far too much memory. Generators came to the rescue.
A little theory
According to PHP.net: “A generator function looks just like a normal function, except that instead of returning a value, a generator yields as many values as it needs to.
When a generator function is called, it returns an object that can be iterated over. When you iterate over that object (for instance, via a foreach loop), PHP will call the generator function each time it needs a value, then saves the state of the generator when the generator yields a value so that it can be resumed when the next value is required.
Once there are no more values to be yielded, then the generator function can simply return, and the calling code continues just as if an array has run out of values.”
If we consider the theory above, we can easily understand that: “A generator allows you to write code that uses foreach to iterate over a set of data without needing to build an array in memory...” (from PHP.net).
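To make that concrete, here is a minimal sketch of a generator. The function name count_to() is made up for illustration; the point is only that the loop consumes one value at a time and no million-element array is ever built.

```php
<?php
// Hypothetical helper for illustration: yields numbers one at a time
// instead of building the whole list in memory first.
function count_to(int $limit): Generator {
  for ($i = 1; $i <= $limit; $i++) {
    // Execution pauses here and resumes when the next value is requested.
    yield $i;
  }
}

$sum = 0;
foreach (count_to(1000000) as $number) {
  $sum += $number;
}
echo $sum, PHP_EOL;
```

Calling count_to() does not run the loop; it only returns a Generator object, and the body advances each time foreach asks for the next value.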
With this in mind, I’ll try to explain why generators are awesome, using some use cases taken from an application I worked on recently.
As I mentioned in the beginning, I needed to iterate through a huge dataset of approximately 100,000 nodes in a cron run. For each node, I wanted to compare the current time with the node’s scheduled publishing date. If that date had passed, the node needed to be switched to ‘Published’.
Iterating through a large dataset
For this use case, I am retrieving the dataset from a database query using the following function:
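The sketch below is one way such a function might look. It is a stand-in that builds rows in plain PHP rather than querying Drupal's database, so that it runs on its own; the function name get_scheduled_nodes(), the nid and publish_on keys, and the generated timestamps are all assumptions for illustration.

```php
<?php
// Stand-in for the database query: returns every scheduled node as a row.
// The function name, row keys and generated data are assumptions, not the
// project's real code.
function get_scheduled_nodes(int $count, int $now): array {
  $rows = [];
  for ($nid = 1; $nid <= $count; $nid++) {
    // Even nids are scheduled one hour in the past, odd nids one hour ahead.
    $offset = ($nid % 2 === 0) ? -3600 : 3600;
    $rows[] = ['nid' => $nid, 'publish_on' => $now + $offset];
  }
  return $rows;
}

$nodes = get_scheduled_nodes(6, time());
echo count($nodes), PHP_EOL;
```

Note that the function returns a complete array, so every row exists in memory before the caller sees the first one.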
Traditionally, I would write something like this to iterate through the dataset, check which nodes are eligible, and set their status to published:
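A minimal sketch of that traditional approach, assuming each row carries a nid and a publish_on timestamp (both key names are hypothetical). Three inline rows stand in for the query result, and the actual node update is reduced to a placeholder.

```php
<?php
// Hypothetical rows standing in for the query result. In the real cron run
// this array would hold all of the roughly 100,000 rows at once.
$now = time();
$nodes = [
  ['nid' => 1, 'publish_on' => $now - 3600],
  ['nid' => 2, 'publish_on' => $now + 3600],
  ['nid' => 3, 'publish_on' => $now - 60],
];

$published = [];
foreach ($nodes as $node) {
  // Publish only the nodes whose scheduled date has passed.
  if ($node['publish_on'] <= $now) {
    // Placeholder for loading the node and setting its status to published.
    $published[] = $node['nid'];
  }
}
echo implode(', ', $published), PHP_EOL; // prints "1, 3"
```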
The problem here is easy to see: the more nodes I have, the more memory $nodes consumes.
A solution could be to create an iterator that would iterate through the $nodes and return the ones that are eligible. But we would have to create a new class just for that, and iterators are a bit tedious to write. Luckily for us, since PHP 5.5, we can use generators!
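Here is a rough sketch of how that could look with generators. The names scheduled_rows() and eligible_nids() and the row keys are assumptions for illustration; scheduled_rows() stands in for fetching rows one at a time from a query result, so the example runs without a database.

```php
<?php
// Stand-in for fetching rows one at a time from a query result. The name,
// row keys and generated data are assumptions for illustration.
function scheduled_rows(int $count, int $now): Generator {
  for ($nid = 1; $nid <= $count; $nid++) {
    // Even nids are scheduled one hour in the past, odd nids one hour ahead.
    $offset = ($nid % 2 === 0) ? -3600 : 3600;
    yield ['nid' => $nid, 'publish_on' => $now + $offset];
  }
}

// Yields only the nids whose scheduled date has passed; no array is built.
function eligible_nids(iterable $rows, int $now): Generator {
  foreach ($rows as $row) {
    if ($row['publish_on'] <= $now) {
      yield $row['nid'];
    }
  }
}

$now = time();
foreach (eligible_nids(scheduled_rows(6, $now), $now) as $nid) {
  echo $nid, PHP_EOL; // prints 2, 4 and 6 on separate lines
}
```

Because both functions yield, only one row is held at any moment. The memory saving depends on the row source being lazy as well; passing a fully loaded array into eligible_nids() would still work, but the array itself would remain in memory.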
Now, we just need to refactor our cron function:
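As a sketch of the refactored cron body, assuming a hypothetical publish_due_nodes() function: it accepts any iterable, so it can consume a generator without knowing how the rows are produced. The inline generator below is a stand-in for the database-backed one, and the node update is again a placeholder.

```php
<?php
// Hypothetical cron body: loops over whatever iterable it is given and
// returns how many nodes it handled.
function publish_due_nodes(iterable $eligible_nids): int {
  $count = 0;
  foreach ($eligible_nids as $nid) {
    // Placeholder for loading node $nid and setting its status to published.
    $count++;
  }
  return $count;
}

// Inline generator standing in for the database-backed one: even nids are
// due, odd nids are still scheduled in the future.
$eligible = (function () {
  $now = time();
  for ($nid = 1; $nid <= 100000; $nid++) {
    $publish_on = $now + (($nid % 2 === 0) ? -3600 : 3600);
    if ($publish_on <= $now) {
      yield $nid;
    }
  }
})();

echo publish_due_nodes($eligible), PHP_EOL; // prints 50000
```

The calling code looks almost identical to the array version; the only change is where the values come from.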
By introducing generators, PHP has given us an extraordinary tool. Generators let us process large datasets one item at a time, which keeps memory usage low without the overhead of writing a custom iterator class.