When to use
"ETL", "Data pipeline", "BigQuery", "Snowflake", "Fivetran", "Reverse ETL", "Data warehouse".
Working instructions
1. ETL vs ELT vs Reverse ETL
ETL (Old)
- Extract → Transform → Load.
- Transform before loading.
- Slower to iterate on and harder to debug, since untransformed data never reaches the warehouse.
ELT (Modern)
- Extract → Load → Transform.
- Raw data to warehouse, transform later.
- Better for cloud warehouses (compute is cheap).
- Easier debugging.
Reverse ETL
- Warehouse → Operational tools (CRM, marketing).
- "Activate" your warehouse data.
- Tools: Hightouch, Census.
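The ELT and Reverse ETL flows above can be sketched in a few lines. This is a minimal illustration only: it assumes an in-memory SQLite database standing in for the warehouse, and the table names, sample rows, and the `reverse_etl_payload` helper are all hypothetical.

```python
import sqlite3

# Hypothetical raw rows, as an ingestion tool might land them (untyped strings).
RAW_ORDERS = [
    ("1", "alice@example.com", "120.50", "paid"),
    ("2", "bob@example.com", "80.00", "refunded"),
    ("3", "alice@example.com", "40.00", "paid"),
]


def run_elt(conn: sqlite3.Connection) -> list[tuple[str, float]]:
    """Load raw rows untouched, then transform inside the warehouse with SQL."""
    # Extract + Load: raw data lands as-is, so it can be re-transformed later.
    conn.execute(
        "CREATE TABLE raw_orders (id TEXT, email TEXT, amount TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", RAW_ORDERS)

    # Transform: business logic runs in the warehouse, after loading.
    conn.execute(
        """
        CREATE TABLE customer_revenue AS
        SELECT email, SUM(CAST(amount AS REAL)) AS revenue
        FROM raw_orders
        WHERE status = 'paid'
        GROUP BY email
        """
    )
    return conn.execute(
        "SELECT email, revenue FROM customer_revenue ORDER BY email"
    ).fetchall()


def reverse_etl_payload(rows: list[tuple[str, float]]) -> list[dict]:
    """Shape modeled warehouse rows into records an operational tool could accept."""
    return [{"email": email, "lifetime_revenue": revenue} for email, revenue in rows]


conn = sqlite3.connect(":memory:")
modeled = run_elt(conn)
payload = reverse_etl_payload(modeled)
print(payload)
```

Because `raw_orders` is kept intact, a wrong transformation can be fixed and re-run without re-extracting from the source, which is the debugging advantage ELT is credited with above.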
2. Stack 2026
| Layer | Tools |
|---|---|
| Sources | SaaS apps, DBs, APIs |
| Ingestion (EL) | Fivetran, Airbyte, Stitch |
| Warehouse | BigQuery, Snowflake, Redshift, Databricks |
| Transform | dbt (Standard), SQL |
| Reverse ETL | Hightouch, Census |
| BI | Looker, Tableau, Mode, Metabase |
3. Fivetran vs Airbyte
| | Fivetran | Airbyte |
|---|---|---|
| Type | Managed SaaS | Open-source + Cloud |
| Connectors | 500+ | 350+ |
| Pricing | Expensive ($) | Free (self-host) or $ |
| Setup | Easiest | Easy |
| Custom connectors | ⚠️ Limited | ✅ Open SDK |
| Best for | Big budget | Small-mid budget |
4. dbt — Transformation Standard
What
- SQL-based transformation framework.
- Version-controlled (Git).
- Testing, documentation built-in.
Why
- Replaces fragile SQL scripts.
- Enables data team workflow.
- The industry standard since around 2020.
Pricing
- dbt Core — free (self-managed).
- dbt Cloud — $0-100/seat/m.
5. Sample Data Pipeline
1. Sources:
- HubSpot (CRM)
- Stripe (payments)
- GA4 (web analytics)
- PostgreSQL (app DB)
2. EL via Fivetran (every 6 hours):
- HubSpot → BigQuery raw schema
- Stripe → BigQuery raw
- GA4 → BigQuery raw
- Postgres → BigQuery raw
3. Transform via dbt:
- Stage tables (cleanup)
- Mart tables (business logic)
- Customer 360 view
- Daily metrics
4. Reverse ETL via Hightouch:
- Customer 360 → HubSpot (enriched contacts)
- Health scores → CRM
- Revenue data → Slack channel
5. BI via Looker:
- Executive dashboard
- Marketing dashboard
- Sales pipeline
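The sample pipeline above is effectively a dependency graph: loads first, then staging, marts, and finally the consumers. A rough sketch of deriving a valid run order with the standard-library `graphlib` module follows; the step names are illustrative labels, not identifiers from any of the tools mentioned.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on (names are illustrative).
PIPELINE = {
    "load_hubspot": set(),
    "load_stripe": set(),
    "load_ga4": set(),
    "load_postgres": set(),
    "stage_tables": {"load_hubspot", "load_stripe", "load_ga4", "load_postgres"},
    "mart_tables": {"stage_tables"},
    "customer_360": {"mart_tables"},
    "daily_metrics": {"mart_tables"},
    "sync_to_hubspot": {"customer_360"},
    "looker_dashboards": {"customer_360", "daily_metrics"},
}


def execution_order(dag: dict[str, set[str]]) -> list[str]:
    """Return the steps in an order where every dependency runs first."""
    # TopologicalSorter raises graphlib.CycleError if the graph has a cycle.
    return list(TopologicalSorter(dag).static_order())


order = execution_order(PIPELINE)
for position, step in enumerate(order, start=1):
    print(position, step)
```

Writing the dependencies down explicitly makes it easier to see which downstream steps are affected when one load fails.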
6. Costs
Small (Startup, < $5M ARR)
- Fivetran: ~$300-500/m.
- BigQuery: ~$100-300/m.
- dbt Core: free.
- Hightouch: $0-300/m.
- Total: $400-1,100/m.
Mid ($5-50M ARR)
- Fivetran: $1K-5K/m.
- Snowflake: $1K-10K/m.
- dbt Cloud: $300-1K/m.
- Hightouch: $300-1K/m.
- Total: $3K-17K/m.
Large ($50M+ ARR)
- $20K-200K+/m total.
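The tier totals are sums of the per-tool ranges. A small sketch of that arithmetic, using only the Small and Mid figures listed above (the `MonthlyRange` type and `total_cost` function are names introduced here for illustration):

```python
from typing import NamedTuple


class MonthlyRange(NamedTuple):
    low: int   # USD per month, low end
    high: int  # USD per month, high end


# Figures copied from the Small and Mid tiers above.
SMALL_STACK = {
    "Fivetran": MonthlyRange(300, 500),
    "BigQuery": MonthlyRange(100, 300),
    "dbt Core": MonthlyRange(0, 0),
    "Hightouch": MonthlyRange(0, 300),
}
MID_STACK = {
    "Fivetran": MonthlyRange(1_000, 5_000),
    "Snowflake": MonthlyRange(1_000, 10_000),
    "dbt Cloud": MonthlyRange(300, 1_000),
    "Hightouch": MonthlyRange(300, 1_000),
}


def total_cost(stack: dict[str, MonthlyRange]) -> MonthlyRange:
    """Sum the low ends and the high ends of every tool in the stack."""
    return MonthlyRange(
        low=sum(item.low for item in stack.values()),
        high=sum(item.high for item in stack.values()),
    )


small_total = total_cost(SMALL_STACK)
mid_total = total_cost(MID_STACK)
print(f"Small: ${small_total.low:,}-{small_total.high:,}/m")  # Small: $400-1,100/m
print(f"Mid: ${mid_total.low:,}-{mid_total.high:,}/m")        # Mid: $2,600-17,000/m
```

The Mid figures sum to about $2.6K at the low end, which the estimate above rounds to $3K.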
7. Scheduling
- Fivetran: sync intervals from 1 hour to 24 hours.
- dbt: Cron schedule (typically nightly + hourly for hot tables).
- Hightouch: Real-time / hourly.
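To reason about how these cadences interact, a simple "which jobs are due" check can help. The sketch below is not how Fivetran, dbt, or Hightouch schedule internally; the job names and the specific intervals are assumptions picked from the ranges above.

```python
from datetime import datetime, timedelta

# Hypothetical intervals chosen from the ranges above.
SCHEDULES = {
    "fivetran_sync": timedelta(hours=6),
    "dbt_hot_tables": timedelta(hours=1),
    "dbt_nightly": timedelta(hours=24),
    "hightouch_sync": timedelta(hours=1),
}


def due_jobs(last_run: dict[str, datetime], now: datetime) -> list[str]:
    """Return the jobs whose interval has fully elapsed since their last run."""
    return sorted(
        job
        for job, interval in SCHEDULES.items()
        if now - last_run[job] >= interval
    )


now = datetime(2026, 1, 1, 12, 0)
last_run = {
    "fivetran_sync": datetime(2026, 1, 1, 7, 0),     # 5 hours ago
    "dbt_hot_tables": datetime(2026, 1, 1, 11, 0),   # 1 hour ago
    "dbt_nightly": datetime(2025, 12, 31, 12, 0),    # 24 hours ago
    "hightouch_sync": datetime(2026, 1, 1, 11, 30),  # 30 minutes ago
}
due = due_jobs(last_run, now)
print(due)  # ['dbt_hot_tables', 'dbt_nightly']
```

A check like this also shows why a reverse ETL sync that runs more often than the upstream transform gains little: it re-sends data that has not changed.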
8. Monitoring
What to Monitor
- Sync failures (Fivetran).
- dbt test failures (data quality).
- Schema changes (source app updated).
- Cost spikes.
Tools
- Datafold — data observability.
- Monte Carlo — data observability.
- Built-in Fivetran/dbt alerts.
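Two of the checks above, schema changes and data quality tests, are easy to illustrate. The following sketch mimics the intent of dbt's `not_null` and `unique` generic tests and adds a basic schema comparison; it is plain Python over sample rows, not dbt itself, and every function name and sample value here is hypothetical.

```python
from collections import Counter


def schema_drift(expected: dict[str, str], actual: dict[str, str]) -> dict[str, list[str]]:
    """Compare column-name-to-type mappings and report what changed."""
    shared = set(expected) & set(actual)
    return {
        "added": sorted(set(actual) - set(expected)),
        "removed": sorted(set(expected) - set(actual)),
        "retyped": sorted(col for col in shared if expected[col] != actual[col]),
    }


def not_null_failures(rows: list[dict], column: str) -> list[dict]:
    """Return the rows where the column is missing or None."""
    return [row for row in rows if row.get(column) is None]


def unique_failures(rows: list[dict], column: str) -> list:
    """Return the values that appear more than once in the column."""
    counts = Counter(row[column] for row in rows)
    return sorted(value for value, count in counts.items() if count > 1)


expected_schema = {"id": "INTEGER", "email": "STRING", "amount": "FLOAT"}
actual_schema = {"id": "INTEGER", "email": "STRING", "amount": "STRING", "coupon": "STRING"}
drift = schema_drift(expected_schema, actual_schema)

sample_rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "b@example.com"},
]
null_rows = not_null_failures(sample_rows, "email")
duplicate_ids = unique_failures(sample_rows, "id")

print(drift)
print(null_rows, duplicate_ids)
```

In practice any non-empty result would feed an alert rather than a print statement.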
9. Common Pitfalls
- ❌ No data quality tests (dbt tests skipped).
- ❌ Schema changes break downstream models: alerting is needed.
- ❌ Cost runaway: BigQuery / Snowflake compute.
- ❌ No documentation: analysts can't use the data.
- ❌ Reverse ETL without governance: wrong data ends up in the CRM.
10. Israel Specifics
- Hebrew columns: use UTF-8 throughout the pipeline.
- Israeli SaaS sources: Fivetran and Airbyte may lack connectors for them, so plan for custom connectors.
- GDPR and Israel's Protection of Privacy Law: handle PII carefully.
- Israeli BI tools: monday.com's BI capabilities are emerging.
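For the UTF-8 point, a quick round-trip check on a file with Hebrew headers is a cheap way to confirm that no step re-encodes the data. This is a minimal sketch with made-up column names and values, using only the standard `csv` and `io` modules.

```python
import csv
import io


def write_csv_utf8(rows: list[dict], fieldnames: list[str]) -> bytes:
    """Serialize rows to CSV and encode the result explicitly as UTF-8."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


def read_csv_utf8(data: bytes) -> list[dict]:
    """Decode UTF-8 bytes and parse them back into rows."""
    text = data.decode("utf-8")
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Hypothetical export with Hebrew column names ("name", "city").
hebrew_rows = [{"שם": "דנה", "עיר": "תל אביב"}]
encoded = write_csv_utf8(hebrew_rows, ["שם", "עיר"])
decoded = read_csv_utf8(encoded)
print(decoded == hebrew_rows)  # True
```

Stating the encoding explicitly at every boundary avoids depending on a platform default, which is a common source of garbled Hebrew text.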
11. I will close with a recommendation.
Required input
| Item | Description |
|---|---|
| Sources | Which apps |
| Volume | rows/day |
| Budget | $/m |
| Team | analysts available? |
| Use case | BI / ML / Activation |
Expected output
| Component | Description |
|---|---|
| Stack recommendation | Fivetran/dbt/etc |
| Cost estimate | monthly |
| Pipeline design | high-level |
| Monitoring plan | alerts |
| Recommendation | A single action to take |
Red flags
- 🚨 Real-time when not needed — costs explode.
- 🚨 No tests — bad data downstream.
- 🚨 Reverse ETL without privacy — PII leak.
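One way to reduce the PII risk in reverse ETL is to allowlist the fields that may leave the warehouse and hash the sensitive ones that must. The sketch below assumes hypothetical field names and is not a Hightouch or Census feature. Note that hashing is pseudonymization rather than anonymization, so it complements a privacy review and does not replace one.

```python
import hashlib

# Hypothetical policy: which fields may be synced, and which of those get hashed.
ALLOWED_FIELDS = {"customer_id", "email", "health_score", "plan"}
HASHED_FIELDS = {"email"}


def prepare_for_sync(record: dict) -> dict:
    """Drop fields outside the allowlist and hash the sensitive ones that remain."""
    safe = {}
    for key, value in record.items():
        if key not in ALLOWED_FIELDS:
            continue  # anything not explicitly allowed never leaves the warehouse
        if key in HASHED_FIELDS and value is not None:
            normalized = str(value).strip().lower()
            safe[key] = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        else:
            safe[key] = value
    return safe


record = {
    "customer_id": 42,
    "email": " Dana@Example.com ",
    "national_id": "000000000",
    "health_score": 87,
}
safe_record = prepare_for_sync(record)
print(sorted(safe_record))  # ['customer_id', 'email', 'health_score']
```

An allowlist fails closed: a new column added upstream is not synced until someone approves it.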
Important notes
- Data warehouse first, then activate via reverse ETL.
- dbt is the standard: invest in learning it.
- Pre-built connectors > custom — time = money.
Sample prompts
SaaS B2B, 5 sources, want unified view. Stack?
Fivetran or Airbyte for a startup?
What is a Hightouch use case for marketing?
© 2026 Automation Expert Pro | Version 1.0.0