An automation that fails silently for 3 days is worse than no automation. Here's the monitoring stack we put on every Nexora-built workflow.
The 3 layers of monitoring
- Error tracking (Sentry): catches exceptions and groups them
- Metrics + logs (Datadog or Grafana Cloud): tracks run frequency, latency, success rate
- Heartbeats (Better Stack or Healthchecks.io): alerts if a cron didn't fire
Sentry: 10-minute setup
```python
import sentry_sdk
sentry_sdk.init(dsn="https://abc@sentry.io/123", traces_sample_rate=0.1)
try:
    run_automation()
except Exception as e:
    sentry_sdk.capture_exception(e)
    raise
```
Sentry's free tier covers 5K events/month — enough for most automation workloads.
Datadog: real-time metrics
```python
from datadog import statsd

# elapsed_ms: the run's wall-clock duration in milliseconds, measured by the caller
statsd.increment("nexora.automation.run", tags=["job:invoice_sync"])
statsd.timing("nexora.automation.duration", elapsed_ms, tags=["job:invoice_sync"])
```
Datadog is expensive ($31/host/month). For cost-conscious teams, Grafana Cloud's free tier covers the same basics: metrics, logs, and alerting.
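The timing call above needs an `elapsed_ms` value from somewhere. Here is a minimal sketch of one way to measure it, using only the standard library. The `timed` helper and the `sink` callback are our own illustrative names, not part of any vendor SDK; the sink stands in for whatever metrics client you use.

```python
import time
from contextlib import contextmanager
from typing import Callable, Iterator, List

# Hypothetical shape for a metric sink: (metric_name, value_ms, tags).
TimingSink = Callable[[str, float, List[str]], None]


@contextmanager
def timed(metric: str, tags: List[str], sink: TimingSink) -> Iterator[None]:
    """Measure the wall-clock duration of the wrapped block in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs on success and on failure, so slow failing runs are recorded too.
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        sink(metric, elapsed_ms, tags)
```

With the Datadog client, the sink could be as small as `lambda m, v, t: statsd.timing(m, v, tags=t)`, and the job body goes inside `with timed("nexora.automation.duration", ["job:invoice_sync"], sink):`.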
Heartbeats: the most underrated layer
Healthchecks.io (free) gives you a unique URL per cron. Your automation pings it after every successful run. If the ping doesn't arrive within the expected interval → email/Slack alert.
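```shell
# -fsS: fail on HTTP errors, stay quiet otherwise; -m 10: give up after 10 s; --retry 5: retry transient failures
curl -fsS -m 10 --retry 5 -o /dev/null https://hc-ping.com/your-uuid
```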
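Add this as the last step of every cron job, chained with && (or in a script that uses set -e) so the ping is only sent when the job actually succeeds. Done.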
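If the job itself is a Python script, the same ping can live in the code instead of the crontab. The sketch below is one possible shape, not a canonical client: `send_ping` and `run_with_heartbeat` are hypothetical helper names, and the `/fail` suffix assumes the Healthchecks.io convention of signalling a failed run by appending `/fail` to the ping URL.

```python
import urllib.request
from typing import Callable


def send_ping(url: str, timeout_s: float = 10.0) -> None:
    """Best-effort GET to a ping URL; monitoring should never crash the job."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s):
            pass
    except OSError:
        # URLError and socket timeouts are OSError subclasses; ignore them.
        pass


def run_with_heartbeat(
    job: Callable[[], None],
    ping_url: str,
    ping: Callable[[str], None] = send_ping,
) -> None:
    """Run the job; ping on success, report a failure and re-raise on error."""
    try:
        job()
    except Exception:
        ping(ping_url + "/fail")  # assumed failure endpoint, see lead-in
        raise
    ping(ping_url)
```

Passing `ping` as a parameter keeps the wrapper testable without network access. In production you would call `run_with_heartbeat(run_automation, "https://hc-ping.com/your-uuid")`.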
What we monitor on every Nexora gig
- Run counts (per-day, per-week trend)
- Success rate (% non-error runs)
- Duration P50, P95, P99
- Last successful run timestamp
- Error rate by error type
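To make the list above concrete, here is a rough sketch of how those numbers could be derived from raw run records. The `Run` record and `summarize` function are illustrative assumptions, not a schema we prescribe, and the percentile uses the simple nearest-rank method; a metrics backend will usually compute these for you.

```python
import math
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Run:
    finished_at: datetime
    duration_ms: float
    error_type: Optional[str] = None  # None means the run succeeded


def percentile(sorted_values: list, pct: float) -> float:
    """Nearest-rank percentile of an ascending, non-empty list."""
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(runs: list) -> dict:
    """Compute the monitored fields from a non-empty list of Run records."""
    durations = sorted(r.duration_ms for r in runs)
    ok = [r for r in runs if r.error_type is None]
    return {
        "runs_per_day": Counter(r.finished_at.date() for r in runs),
        "success_rate": len(ok) / len(runs),
        "p50_ms": percentile(durations, 50),
        "p95_ms": percentile(durations, 95),
        "p99_ms": percentile(durations, 99),
        # None when no run has succeeded yet.
        "last_success": max((r.finished_at for r in ok), default=None),
        "errors_by_type": Counter(r.error_type for r in runs if r.error_type),
    }
```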
Alert thresholds
- Success rate < 95% over 1 hour → page
- No run in 2x expected interval → page
- P95 duration > 3x normal → notify
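The thresholds above translate almost directly into code. The sketch below is illustrative only: `evaluate_alerts` and its parameters are hypothetical names, and `baseline_p95_ms` is our stand-in for whatever you treat as "normal" duration for the job. In practice these rules usually live in your monitoring tool's alert configuration rather than in application code.

```python
from datetime import datetime, timedelta
from typing import Optional


def evaluate_alerts(
    success_rate_1h: float,
    last_run_at: Optional[datetime],
    expected_interval: timedelta,
    p95_ms: float,
    baseline_p95_ms: float,
    now: datetime,
) -> list:
    """Return (severity, reason) tuples for every threshold that is breached."""
    alerts = []
    if success_rate_1h < 0.95:
        alerts.append(("page", "success rate below 95% over the last hour"))
    # A job that has never run is treated the same as an overdue one.
    if last_run_at is None or now - last_run_at > 2 * expected_interval:
        alerts.append(("page", "no run within 2x the expected interval"))
    if p95_ms > 3 * baseline_p95_ms:
        alerts.append(("notify", "P95 duration above 3x baseline"))
    return alerts
```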