Batch processing of events
A code-optimization strategy for avoiding chatty event topics.
If your shop has a trigger event that fires incessantly (looking at you, products/update), your affected tasks are likely good candidates for batch processing optimization. This usually results from some combination of high-volume operations, many tasks subscribing to the same event topic, and tasks that update the same object types whose update events they listen for.
Since the aforementioned products/update event is typically the primary culprit in many shops, this technique will concentrate on mitigating its usage. However, this technique could be applied to any other objects that have an excessive amount of update events.
Original automation criteria:
Task should listen for product creation
Task should listen for product updates
Task will update the product in some way
~10 thousand active products
~500 orders / day
A 3rd party integration updates the products in bulk at irregular schedules
These updates often include information relevant to this automation
A single task developed with these criteria would very likely have no problem efficiently processing the expected volume of products/update events generated by sales, the 3rd party integration, and its own updates to products (provided that techniques like Preventing action loops and Writing a high-quality task are adhered to).
Over time, though, more tasks may be added that operate with similar criteria, sales volume hopefully increases, the product catalog might expand, and additional apps and integrations will likely make their own product updates.
This can lead to the dreaded jammed queue.

The main strategy for avoiding this traffic jam is to refactor the task(s) that are affected and/or have some culpability.
Some good questions to answer before refactoring a task:
What level of immediacy is actually needed by this task for processing updated products? (i.e. what is the longest acceptable interval between scheduled task runs?)
How many products on average would be updated in this interval?
If a task can get away with a daily scheduled run to process all recently updated products, then using bulk operations might be a good idea. With this approach the task could optionally continue to listen on products/create if that is useful (i.e. this specific task would do useful work on a newly created product).
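To put rough numbers on the two questions above, a back-of-the-envelope estimate can help. The sketch below is plain Python rather than Mechanic task code, and the figure of 2,000 product updates per day is a hypothetical input, not a measurement from any shop.

```python
def updates_per_run(updates_per_day, interval_minutes):
    """Estimate how many updated products each scheduled run would see.

    Assumes updates are spread evenly across the day, which bulk
    integrations running at irregular times will not respect.
    """
    runs_per_day = 24 * 60 / interval_minutes
    return updates_per_day / runs_per_day


# Hypothetical volume: 2,000 product updates per day.
every_ten_minutes = updates_per_run(2000, 10)
once_a_day = updates_per_run(2000, 24 * 60)
```

An even spread is the optimistic case; a bulk update from an integration can push a single interval far above the average, so the longest acceptable interval should be chosen with those bursts in mind.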
Task scenario
Let's instead assume that a high level of immediacy is desired for this exercise, and go with the most frequent 10-minute scheduler option. With this approach there generally isn't a need to include a products/create event, due to the frequency of scheduled task runs.
Below is how a skeleton task might look for this scenario prior to refactoring. Note that this task already has a manually triggered event that includes paginated querying of up to 25 thousand products. Manually triggered events are typically used for initial task setup, or when massive bulk changes are expected across the product catalog (and the automation will proactively be disabled for that duration).
Refactoring the task
To convert the above task code to a much more queue-friendly version, the following steps would be taken:
Remove the products/create and products/update listener block
Convert the manual trigger block to an if statement and add a contains check for any mechanic/scheduler event
Add Mechanic cache checking and setting using the last task run time
(Optionally) Add task configuration to allow manual runs to process all active products, or some subset larger than the amount in a typical interval
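The event-routing part of these steps can be sketched as follows. This is an illustrative Python outline, not Mechanic Liquid: the function name, the option name, and the returned mode labels are hypothetical, and the topic strings are assumptions based on the event names mentioned in this article.

```python
# Assumed topic strings; check them against the task's actual subscriptions.
MANUAL_TOPIC = "mechanic/user/trigger"
SCHEDULER_FRAGMENT = "mechanic/scheduler/"


def run_mode(topic, process_all_on_manual=False):
    """Decide what a run should do based on the incoming event topic.

    Returns "recent" to review only recently updated products, "all" to
    review the whole catalog, or None when the event is not one this
    task should act on.
    """
    # A "contains" style check, so any scheduler interval qualifies.
    if SCHEDULER_FRAGMENT in topic:
        return "recent"
    if topic == MANUAL_TOPIC:
        return "all" if process_all_on_manual else "recent"
    # products/create and products/update are no longer handled at all.
    return None
```

Keeping the full-catalog mode behind an explicit option means a routine manual run costs no more than a scheduled one.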
The caching step is the key to this technique. It allows the task to review a significantly smaller count of products on each (frequent) run. In fact, many runs might have no work to do, and that is but a blip in the stream of queue happiness.
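A minimal sketch of the idea, again in plain Python rather than Mechanic Liquid: a dictionary stands in for the Mechanic cache, and the cache key, function names, and 10-minute fallback are hypothetical choices made for illustration.

```python
from datetime import datetime, timedelta, timezone

CACHE_KEY = "last_run_at"  # hypothetical cache key


def next_cutoff(cache, now, fallback=timedelta(minutes=10)):
    """Return the time to look back to, then record `now` for the next run.

    On the first run nothing is cached yet, so fall back to one interval.
    """
    stored = cache.get(CACHE_KEY)
    cutoff = datetime.fromisoformat(stored) if stored else now - fallback
    cache[CACHE_KEY] = now.isoformat()
    return cutoff


def recently_updated(products, cutoff):
    """Keep only the products updated at or after the cutoff."""
    return [
        p for p in products
        if datetime.fromisoformat(p["updated_at"]) >= cutoff
    ]


# Usage with made-up data: only product 1 changed since the cutoff.
cache = {}
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
cutoff = next_cutoff(cache, now)
products = [
    {"id": 1, "updated_at": "2024-01-01T11:55:00+00:00"},
    {"id": 2, "updated_at": "2024-01-01T09:00:00+00:00"},
]
to_review = recently_updated(products, cutoff)
```

In a real task the cutoff would more likely be passed into the product query itself, so that unchanged products are never fetched in the first place; the in-memory filter here only demonstrates the comparison.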
The refactored task is below, which includes an option to query all products on manual runs. Importantly, when that option is enabled, this task will not schedule events, to avoid potential race conditions between task runs.
When adding a manual run option that can process a very large number of products, you should consider disabling the scheduled events, to avoid potential race conditions between task runs.