When to use
"Workflow fails", "Errors", "Retry", "Try/catch", "Silent failures", "Alerts".
Working instructions
1. Why it matters
A workflow without error handling fails silently; silent failures lose data, and lost data costs you the user's trust.
2. Types of Errors
A. Transient (temporary)
- Network timeout.
- API rate limit (429).
- Service temporarily down (503).
- Solution: Retry with backoff.
B. Persistent (permanent)
- Bad data (validation fail).
- Authentication broken.
- Missing required field.
- Solution: Alert + Manual fix.
C. Catastrophic
- Tool itself down.
- Account suspended.
- Solution: Fallback to manual / Alternative tool.
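The three categories can be sketched as a small classifier that decides which solution applies. This is a minimal illustration, not a complete rule set: the `classify` function, its parameters, and the exact set of status codes treated as transient are assumptions built from the codes this document mentions.

```python
from enum import Enum


class ErrorKind(Enum):
    TRANSIENT = "transient"        # retry with backoff
    PERSISTENT = "persistent"      # alert + manual fix
    CATASTROPHIC = "catastrophic"  # fall back to manual / alternative tool


# Assumption: only the status codes named in this document count as transient.
TRANSIENT_STATUSES = {429, 500, 502, 503}


def classify(status_code=None, timed_out=False, tool_unavailable=False):
    """Map a failed step to one of the three categories (hypothetical helper)."""
    if tool_unavailable:
        return ErrorKind.CATASTROPHIC
    if timed_out or status_code in TRANSIENT_STATUSES:
        return ErrorKind.TRANSIENT
    # Everything else (bad data, broken auth, missing fields) would fail
    # again with the same input, so it is not retried blindly.
    return ErrorKind.PERSISTENT
```

For example, `classify(status_code=429)` yields `ErrorKind.TRANSIENT`, while `classify(status_code=400)` yields `ErrorKind.PERSISTENT`.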
3. Retry Strategy — Exponential Backoff
Attempt 1: immediate
Attempt 2: wait 1 second
Attempt 3: wait 5 seconds
Attempt 4: wait 30 seconds
Attempt 5: wait 5 minutes
After attempt 5 fails: give up and alert
Why
- Transient errors usually resolve quickly.
- Backoff avoids hammering a service that is still recovering.
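The schedule above can be sketched as a retry wrapper. The delays are taken from the list in this section; the names `TransientError`, `run_with_retry`, and `on_give_up` are hypothetical, and the `sleep` parameter is injectable only so the behaviour can be checked without real waiting.

```python
import time

# Waits before attempts 2..5, in seconds, copied from the schedule above.
DELAYS = [1, 5, 30, 300]


class TransientError(Exception):
    """Hypothetical marker for errors that are worth retrying."""


def run_with_retry(operation, delays=DELAYS, sleep=time.sleep, on_give_up=None):
    """Call operation(); on TransientError wait and retry, then give up."""
    last_error = None
    for attempt in range(len(delays) + 1):
        try:
            return operation()
        except TransientError as exc:
            last_error = exc
            if attempt < len(delays):
                sleep(delays[attempt])
    if on_give_up is not None:
        on_give_up(last_error)  # e.g. send an alert
    raise last_error
```

Only errors marked as transient are retried; anything else propagates immediately, which matches the rule that persistent errors should not be retried.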
4. Tool-Specific Implementation
Zapier
- Autoreplay (paid plans): retries a failed step automatically, up to 5 times with increasing delays.
- Manual replay: from Zap History.
- Custom alerts: separate Zap that runs on failure.
Make.com
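- Per-module error handlers (built in):
  - Resume: substitute a fallback output and continue.
  - Commit: stop the scenario and keep the changes made so far.
  - Rollback: stop the scenario and revert changes where the module supports it.
  - Break: store the run as an incomplete execution so it can be retried later.
- Custom error route → Slack/Email alert.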
n8n
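- Error workflows: a separate workflow, started by the Error Trigger node, runs when a workflow fails.
- Per-node settings: Retry On Fail, and On Error to continue instead of stopping the workflow.
- Code node with try/catch.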
5. Common Error Patterns + Solutions
A. Rate Limit (429)
- Strategy: Read the `Retry-After` header.
- Wait that long, then retry.
- Long-term: Reduce request frequency.
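Reading `Retry-After` takes a little care, because the header may carry either a number of seconds or an HTTP date. A minimal sketch using only the standard library follows; the function name, the 60-second default, and the one-hour cap are assumptions, not values from any specific API.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None, default=60.0, cap=3600.0):
    """Return how long to wait, in seconds, for a Retry-After header value."""
    if not header_value:
        return default
    value = header_value.strip()
    if value.isdigit():
        wait = float(value)  # delay-seconds form, e.g. "120"
    else:
        try:
            target = parsedate_to_datetime(value)  # HTTP-date form
        except (TypeError, ValueError):
            return default  # unparseable header: fall back to the default
        if target.tzinfo is None:
            target = target.replace(tzinfo=timezone.utc)
        now = now or datetime.now(timezone.utc)
        wait = (target - now).total_seconds()
    # Never wait a negative time, and never longer than the cap.
    return min(max(wait, 0.0), cap)
```

The cap keeps a misbehaving server from stalling the workflow for hours; records that exceed it are better handled by the alerting and queueing steps.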
B. Auth Token Expired (401)
- Strategy: Refresh token (OAuth).
- Retry with new token.
- Long-term: Auto-refresh before expiry.
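Auto-refresh before expiry can be sketched as a small token cache. This is an illustration under assumptions: `fetch_token` is a hypothetical callable that performs the OAuth refresh and returns a `(token, lifetime_seconds)` pair, and the 60-second safety margin is arbitrary.

```python
import time


class TokenCache:
    """Keeps an access token and refreshes it shortly before it expires."""

    def __init__(self, fetch_token, margin_seconds=60, clock=time.time):
        # fetch_token is hypothetical: it returns (token, lifetime_seconds).
        self._fetch_token = fetch_token
        self._margin = margin_seconds
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a token, refreshing first if it is missing or near expiry."""
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self.refresh()
        return self._token

    def refresh(self):
        """Force a refresh, e.g. after the API answered 401."""
        token, lifetime = self._fetch_token()
        self._token = token
        self._expires_at = self._clock() + lifetime
        return token
```

Calling `get()` before every request covers the long-term fix; calling `refresh()` and retrying once covers the 401 case.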
C. Validation Error (400)
- Don't retry — same data will fail.
- Log the bad data.
- Alert for manual review.
D. Server Error (500/502/503)
- Retry with exponential backoff.
- Max retries: 5.
- After that: alert and queue the record for manual handling.
E. Network Timeout
- Retry with backoff.
- Increase the timeout if timeouts keep recurring.
6. Alerting
Channels
- Slack — most common.
- Email.
- PagerDuty for on-call (critical).
- SMS (very urgent).
Alert Content
- ✅ What workflow failed.
- ✅ When (timestamp).
- ✅ Error message.
- ✅ Affected record/data.
- ✅ Link to retry / debug.
Anti-pattern
- ❌ Alert on every minor error → alert fatigue, and alerts get ignored.
- ✅ Alert on aggregate (5+ failures in 10 min).
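Aggregate alerting can be sketched as a sliding-window counter. The defaults mirror the "5+ failures in 10 min" rule above; the class name and the choice to reset the window after firing are assumptions.

```python
from collections import deque


class FailureWindow:
    """Signals an alert once failures inside a sliding window reach a threshold."""

    def __init__(self, threshold=5, window_seconds=600):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures = deque()

    def record(self, timestamp):
        """Record a failure at `timestamp` (seconds); return True if an alert is due."""
        self._failures.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop failures that have aged out of the window.
        while self._failures and self._failures[0] <= cutoff:
            self._failures.popleft()
        if len(self._failures) >= self.threshold:
            self._failures.clear()  # start fresh so one burst yields one alert
            return True
        return False
```

Five failures one minute apart trigger a single alert, while the same five failures spread over an hour trigger none.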
7. Dead Letter Queue (DLQ)
Concept
- Failed messages → special "queue" for review.
- Don't lose data.
- Manual replay possible.
Implementation
- Storage: Airtable / Google Sheets / Database.
- Schema: Original payload, error, timestamp, status.
- Process: Daily review.
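The schema above (original payload, error, timestamp, status) can be sketched with SQLite from the standard library standing in for the database option. The table name, column names, and helper functions are hypothetical.

```python
import json
import sqlite3
from datetime import datetime, timezone


def open_dlq(path=":memory:"):
    """Open (or create) the dead letter table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS dead_letters (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               payload TEXT NOT NULL,
               error TEXT NOT NULL,
               failed_at TEXT NOT NULL,
               status TEXT NOT NULL DEFAULT 'pending'
           )"""
    )
    return conn


def park(conn, payload, error):
    """Store a failed message with its error and a UTC timestamp."""
    cur = conn.execute(
        "INSERT INTO dead_letters (payload, error, failed_at) VALUES (?, ?, ?)",
        (json.dumps(payload, ensure_ascii=False), str(error),
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return cur.lastrowid


def pending(conn):
    """List messages that still await review or replay."""
    rows = conn.execute(
        "SELECT id, payload, error FROM dead_letters "
        "WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    return [(row_id, json.loads(payload), error) for row_id, payload, error in rows]


def mark_replayed(conn, row_id):
    """Flag a message as replayed so it leaves the review list."""
    conn.execute("UPDATE dead_letters SET status = 'replayed' WHERE id = ?", (row_id,))
    conn.commit()
```

The review process then amounts to walking `pending(conn)`, fixing or replaying each entry, and calling `mark_replayed`.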
8. Idempotency — Critical
Why
- Retries = same operation runs multiple times.
- If not idempotent → duplicates.
Patterns
- Use unique IDs in destination.
- Check before insert ("If exists, update; else create").
- Idempotency keys (Stripe, others support).
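Two of these patterns can be sketched briefly: deriving a stable idempotency key from the payload, and the "if exists, update; else create" check. Here a plain dict stands in for the destination system, and both function names are hypothetical; a real API that supports idempotency keys defines its own header or field for them.

```python
import hashlib
import json


def idempotency_key(payload):
    """Derive a stable key from the payload, so a retry produces the same key."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def upsert(store, record_id, fields):
    """'If exists, update; else create.' `store` is a dict standing in for the destination."""
    if record_id in store:
        store[record_id].update(fields)
        return "updated"
    store[record_id] = dict(fields)
    return "created"
```

Running `upsert` twice with the same ID leaves exactly one record, which is the property that makes retries safe.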
9. Fallback Paths
Concept
- Plan B if primary fails.
Examples
- Primary: Send an SMS via Twilio. Fallback: Send an email via SendGrid.
- Primary: Real-time API call. Fallback: Queue for batch.
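Both examples follow the same shape: try the primary, then the fallback, and queue the work if neither succeeds. A minimal sketch, where `primary` and `fallback` are hypothetical callables (for instance thin wrappers around two providers) and `queue` is any list-like store:

```python
def send_with_fallback(message, primary, fallback, queue):
    """Try primary, then fallback; if both fail, queue the message for later."""
    errors = []
    for name, send in (("primary", primary), ("fallback", fallback)):
        try:
            send(message)
            return name
        except Exception as exc:  # broad on purpose in this sketch
            errors.append(f"{name}: {exc}")
    # Neither path worked: keep the message and both errors for batch handling.
    queue.append({"message": message, "errors": errors})
    return "queued"
```

The return value says which path handled the message, which is worth logging so that heavy fallback use shows up in monitoring.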
10. Best Practices
- Every workflow has error handler — non-negotiable.
- Test failure paths in dev.
- Monitor error rates — alert if spike.
- Review the DLQ on a fixed schedule: daily for critical workflows, weekly at minimum.
- Document what each error means + how to fix.
11. Israel Specifics
- Hebrew error messages: make sure they render correctly in alerts (UTF-8 encoding, right-to-left text).
- Time zones: log timestamps in both Israel time and UTC.
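Both points can be sketched with the standard library. The function names and the alert fields are assumptions; `Asia/Jerusalem` is the IANA zone name for Israel, and the sketch falls back to UTC only when the time zone database is not installed on the host.

```python
import json
from datetime import datetime, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def dual_timestamp(moment):
    """Render an aware datetime as 'UTC | Israel'; UTC only if tz data is missing."""
    utc_text = moment.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    try:
        local = moment.astimezone(ZoneInfo("Asia/Jerusalem"))
    except ZoneInfoNotFoundError:
        return utc_text
    return f"{utc_text} | {local.strftime('%Y-%m-%d %H:%M:%S %Z')}"


def alert_body(workflow, error_message, moment):
    """Build a JSON alert body (hypothetical fields) that keeps Hebrew readable."""
    # ensure_ascii=False writes Hebrew characters as-is instead of \uXXXX escapes.
    return json.dumps(
        {"workflow": workflow, "error": error_message, "when": dual_timestamp(moment)},
        ensure_ascii=False,
    )
```

Using the zone name rather than a fixed offset matters because Israel observes daylight saving time, so the offset from UTC changes during the year.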
12. End with a recommendation.
Required input
| Item | Description |
|---|---|
| Tool | Zapier/Make/n8n |
| Workflow criticality | Low/Med/Critical |
| Volume | events/day |
| Current handling | what exists today |
Expected output
| Component | Description |
|---|---|
| Error categories | what to expect |
| Retry strategy | per category |
| Alerting plan | channels + threshold |
| DLQ design | if relevant |
| Fallback paths | if critical |
| Recommendation | one action |
Red flags
- 🚨 No error handling — silent failures.
- 🚨 Retry without backoff — hammers recovering service.
- 🚨 Alert on every error — alert fatigue.
- 🚨 No idempotency — retries = duplicates.
Important notes
- Make's error handlers = best in class for non-developers.
- Logs are gold — keep at least 30 days.
- Postmortems for major failures — learning.
Example prompts
Critical workflow in Make. Design the error handling.
A Zapier workflow fails 5 times a week. How do I diagnose it?
DLQ in n8n. Build it.
© 2026 Automation Expert Pro | Version 1.0.0