Monitoring Mechanic
Mechanic is frequently used for business-critical operations. The Mechanic team closely monitors the platform (and shares status information at status.mechanic.dev), but it can also be useful to set up task-level monitoring tailored to the visibility needs of each business.
Mechanic is a highly available system with built-in redundancies. It processes data asynchronously, using queues. This means that failures generally result in a delay, rather than in lost data.
Mechanic's platform status is available at status.mechanic.dev. The system metrics area contains realtime data indicating how quickly Mechanic is processing runs, as well as information about observed delays in events coming from Shopify. (These can be useful in diagnosing whether an emergent issue is caused by Mechanic, or if it's a problem on Shopify's end.)
Mechanic's status page also supports alert subscriptions, via email, SMS, Slack, and webhooks. If Mechanic is an important part of your operations, we strongly recommend subscribing to status alerts.
Mechanic does not have native alerting for task or action runs that return errors.
To monitor actions, subscribe to the mechanic/actions/perform event, which re-invokes a task with the results of each action run. Use this opportunity to inspect the status of the action, responding accordingly. To learn more, see Responding to action results.
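As a minimal sketch of this pattern, the task below assumes it is subscribed to mechanic/actions/perform in addition to whatever topics generate its actions. It also assumes that the re-invoked run exposes an `action` variable with `action.type`, `action.run.ok`, and `action.run.error`; see Responding to action results for the authoritative shape. The recipient address is a placeholder.

```liquid
{% comment %}
  Assumption: this task also subscribes to mechanic/actions/perform.
  Assumption: action.type, action.run.ok, and action.run.error are available here.
{% endcomment %}
{% if event.topic == "mechanic/actions/perform" %}
  {% comment %}
    Skip failed email actions, so that a failing alert email
    cannot cause this task to keep alerting about itself.
  {% endcomment %}
  {% if action.run.ok == false and action.type != "email" %}
    {% log failed_action_type: action.type, error: action.run.error %}

    {% action "email" %}
      {
        "to": "alerts@example.com",
        "subject": "Mechanic action failed",
        "body": {{ "An action of type " | append: action.type | append: " failed. See the task's run log for details." | json }}
      }
    {% endaction %}
  {% endif %}
{% endif %}
```

Logging the error before sending the email keeps the full error payload in the run log, while the email body stays a plain string.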
To monitor tasks, use the HTTP action to ping a service like Cronitor when critical tasks run, configuring that service to send alerts should the pings ever miss their schedule.
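A heartbeat ping of this kind might look like the sketch below, placed at the end of a critical task's code. The URL is a placeholder: the API key and monitor key are hypothetical, and the exact ping URL format should be taken from the monitoring service's own documentation.

```liquid
{% comment %}
  ... the task's critical work goes above this point ...
{% endcomment %}

{% comment %}
  Heartbeat: tell the external monitor that this task ran.
  The URL below is a placeholder, not a working endpoint.
{% endcomment %}
{% action "http" %}
  {
    "method": "get",
    "url": "https://cronitor.link/p/YOUR_API_KEY/YOUR_MONITOR_KEY"
  }
{% endaction %}
```

Because the ping is sent only when the task actually runs, a missed ping on the monitoring side indicates that the task did not run on schedule.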
It's generally preferable to use an external service for this sort of thing, rather than using Mechanic to monitor itself. Still, monitoring tasks are viable, by using a scheduler event to check on an expiring flag in the Mechanic cache. This way, by setting that flag during sensitive task runs (using the Cache action), a sort of dead man's switch can be created: if the scheduled run ever finds that the flag is not present, that task could then send an email (or post to Slack, or whatever's useful).
When run manually, this task sets an auto-expiring flag in the Mechanic cache, set to expire in ten minutes. Separately, this task checks for the presence of that flag every ten minutes. If the flag is missing (indicating that the task was not run manually in the last ten minutes), the task sends an email.
Subscriptions
mechanic/scheduler/10min
mechanic/user/trigger
Code
{% if event.topic contains "mechanic/scheduler" %}
  {% if cache["monitor-10min"] %}
    {% log ok: true, cache: cache["monitor-10min"] %}
  {% else %}
    {% action "email" %}
      {
        "to": "[email protected]",
        "subject": "Monitor triggered!",
        "body": "Problem!"
      }
    {% endaction %}
  {% endif %}
{% elsif event.topic == "mechanic/user/trigger" %}
  {% assign now = "now" | date: "%Y-%m-%d %H:%M %p" %}
  {% action "cache", "setex", "monitor-10min", 600, now %}
{% endif %}
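In practice, the flag would be set by the sensitive task itself rather than by a manual run. As a sketch, and assuming the cache is shared between tasks in the same store, the lines below could be added to the task being monitored. They reuse the "monitor-10min" key and the 600-second expiry from the example above; both are illustrative and should match whatever the monitoring task checks.

```liquid
{% comment %}
  Added to the sensitive task being monitored.
  Refreshes the flag that the monitoring task looks for.
{% endcomment %}
{% assign now = "now" | date: "%Y-%m-%d %H:%M" %}
{% action "cache", "setex", "monitor-10min", 600, now %}
```

The expiry should be at least as long as the longest expected gap between runs of the sensitive task, so that normal quiet periods do not trigger the alert.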