Snapshot
| Detail | Value |
|---|---|
| Client | B2B SaaS company, $6M ARR, 85 employees |
| Industry | SaaS / Project Management |
| Engagement | Data Assessment → Data Pipeline Build → Ongoing Support |
| Team | 1 data engineer, 1 backend developer, tech lead oversight |
| Duration | 7 weeks (1-week assessment + 6-week build) |
| Tech Stack | Fivetran, Snowflake, dbt, Metabase, Airflow, Python |
| Key Result | 56 hours/month of manual reporting eliminated; month-end close from 5 days to 2 days |
The Challenge
Our client builds a project management SaaS platform used by 1,200+ companies. At $6M ARR and 40% year-over-year growth, the company was generating data across 7 systems — but couldn’t answer basic business questions without 2 analysts spending days in spreadsheets.
The data reality was frustratingly common. Customer data lived in HubSpot CRM. Subscription and payment data lived in Stripe. Product usage data lived in their PostgreSQL production database. Support data lived in Intercom. Marketing data lived in Google Analytics and LinkedIn Ads. Financial data lived in QuickBooks. Employee data lived in Gusto.
Answering “what is our Net Revenue Retention?” required an analyst to export data from Stripe (subscription changes), HubSpot (customer segmentation), and the product database (usage patterns), paste them into a Google Sheet with VLOOKUPs, manually reconcile customer IDs across systems (which used different identifiers), and produce a number that nobody fully trusted.
The CEO’s request was specific: “I want a dashboard I can open Monday morning that shows me ARR, NRR, churn, expansion, and usage trends — updated automatically, trustworthy, no analyst intervention.”
The VP of Finance had a parallel need: month-end close took 5 days because financial reporting required manual data aggregation from Stripe, QuickBooks, and HubSpot. She wanted automated revenue recognition, subscription analytics, and cohort analysis without spreadsheet gymnastics.
Our Approach
Week 1: Data Assessment ($4,000)
We mapped every data source, every reporting need, and every manual process.
Data sources audited:
HubSpot CRM (contacts, companies, deals), Stripe (subscriptions, invoices, payments, refunds), PostgreSQL production DB (users, workspaces, feature usage events), Intercom (conversations, tags, resolution data), Google Analytics (traffic, acquisition), QuickBooks (GL, AP, AR), and Gusto (employee count, department structure).
Reporting needs identified (prioritized):
| Report | Current Process | Time Cost | Frequency |
|---|---|---|---|
| ARR/MRR dashboard | Manual Stripe export + spreadsheet | 4 hours | Weekly |
| Net Revenue Retention | Multi-source spreadsheet reconciliation | 8 hours | Monthly |
| Customer health score | Not calculated (no cross-system view) | N/A | Needed |
| Monthly financial close pack | Manual aggregation from 4 systems | 20 hours | Monthly |
| Product usage analytics | Production DB queries by engineer | 6 hours | Weekly |
| Support volume and resolution | Intercom export + spreadsheet | 3 hours | Weekly |
| Marketing attribution | GA + HubSpot manual correlation | 4 hours | Monthly |
Assessment deliverable: A prioritized data architecture plan: ingest all 7 sources into a Snowflake data warehouse, build dbt transformation models for key metrics, and deploy self-serve dashboards in Metabase.
Week 2–3: Data Infrastructure
Data ingestion (Fivetran):
Connected all 7 source systems to Fivetran with automated sync schedules. Fivetran handles extraction, schema detection, and incremental loading — replicating source data into Snowflake without custom ETL code.
| Source | Sync Frequency | Rationale |
|---|---|---|
| Stripe | Every 6 hours | Revenue data needs near-real-time for dashboards |
| HubSpot | Every 6 hours | CRM data changes throughout the day |
| PostgreSQL (product DB) | Every 1 hour | Usage data powers real-time product analytics |
| Intercom | Every 6 hours | Support metrics need same-day visibility |
| Google Analytics | Daily | Traffic data is analyzed daily, not hourly |
| QuickBooks | Daily | Financial data changes during business hours |
| Gusto | Weekly | Employee data changes infrequently |
Snowflake warehouse — three-layer architecture:
- •Raw layer: Direct replica of source system data. Never modified. This is the audit trail.
- •Staging layer: Cleaned, standardized, and deduplicated data. Customer IDs unified across systems (HubSpot company_id → Stripe customer_id → product DB workspace_id).
- •Marts layer: Business-ready tables by domain: finance mart (MRR, ARR, churn, expansion), customer mart (health scores, segments, lifecycle), product mart (usage metrics, feature adoption), support mart (ticket volume, resolution times, CSAT).
Week 4–5: dbt Transformation Models
The dbt models are where raw data becomes business intelligence. We built 34 models organized into four marts:
Finance mart (key models):
- •mrr_by_customer: Calculates MRR per customer by analyzing Stripe subscription states, handling upgrades, downgrades, and mid-month changes with proper proration logic.
- •arr_waterfall: Decomposes ARR changes into new business, expansion, contraction, and churn — the standard SaaS metrics waterfall.
- •net_revenue_retention: Cohort-based NRR calculation — “of the customers we had 12 months ago, how much are they paying now?”
Customer mart (key models):
- •customer_health_score: A composite score (0–100) combining product usage frequency, feature adoption breadth, support ticket velocity, billing status, and engagement trend.
- •customer_segments: Rule-based segmentation — Enterprise (>50 seats), Mid-Market (10–50 seats), SMB (<10 seats) — combined with behavioral segments (Power User, At-Risk, New, Growing, Stable).
Product mart (key models):
- •feature_adoption: For each core feature, what percentage of active workspaces used it in the last 30 days. Answers “which features drive retention?”
- •workspace_engagement: Daily/weekly/monthly active user calculations with trend analysis. Identifies workspaces with declining engagement before they churn.
All models include data quality tests: dbt tests validate uniqueness, not-null constraints, referential integrity, and accepted values. If a source system sends corrupt data, the pipeline fails loudly rather than producing wrong metrics.
Week 6–7: Dashboards and Rollout
Metabase dashboards — 6 self-serve dashboards with drill-down capability:
- •Executive Dashboard — ARR, MRR, NRR, customer count, churn rate, expansion rate. Updated every 6 hours. The “Monday morning” dashboard the CEO requested.
- •Financial Close Dashboard — Revenue by category, subscription analytics, cohort analysis, and reconciliation checks. Replaces 20 hours of manual month-end work.
- •Customer Health Dashboard — All customers ranked by health score with color coding (green/yellow/red). Customer success team uses this for proactive outreach.
- •Product Analytics Dashboard — DAU/WAU/MAU, feature adoption rates, engagement trends by customer segment.
- •Support Analytics Dashboard — Ticket volume, resolution times, ticket categories, CSAT trends.
- •Marketing Analytics Dashboard — Traffic by source, lead conversion rates, cost per lead, and marketing-attributed pipeline.
User training: 2-hour training session with the analytics, finance, and leadership teams covering how to read the dashboards, apply filters, create saved questions, and request new metrics.
The Results
| Metric | Before | After | Change |
|---|---|---|---|
| Hours/month on manual reporting | 56+ hours | 4 hours (review only) | -93% |
| Month-end financial close | 5 days | 2 days | -60% |
| Time to answer a business question | 4–8 hours (manual analysis) | 30 seconds (dashboard) | -99% |
| Data sources unified | 0 (siloed) | 7 (all in Snowflake) | |
| Customer health visibility | None | Real-time scoring for all 1,200 customers | |
| Trusted metrics | Disputed (“my spreadsheet says different”) | Single source of truth |
Downstream business impact (first 6 months):
The customer health score dashboard enabled proactive outreach to at-risk accounts. The customer success team identified 47 accounts trending toward churn and intervened — retaining 31 of them (66% save rate). At the company’s average contract value, this preserved approximately $186,000 in ARR.
The financial close time reduction freed the VP of Finance to focus on financial planning and analysis rather than data aggregation — directly supporting the company’s fundraising preparation.
The product analytics dashboard revealed that one underused feature (time tracking) had a 2.4x higher retention correlation than any other feature. The product team prioritized improving time tracking UX, which became the top feature in the next release cycle.
Client Quote
“Before Gigabit, every Monday started with ‘does anyone have the latest numbers?’ Now every Monday starts with ‘here’s what the numbers are telling us.’ That shift — from gathering data to acting on data — is worth more than the time savings alone.”
— CEO, [Client]
Investment Summary
| Phase | Investment | Timeline |
|---|---|---|
| Data Assessment | $4,000 | 1 week |
| Data infrastructure (Fivetran + Snowflake + dbt) | $28,000 | 4 weeks |
| Dashboards + training | $12,000 | 2 weeks |
| Total build | $44,000 | 7 weeks |
| Monthly platform costs (Fivetran + Snowflake + Metabase) | $850/month | Ongoing |
| Ongoing data engineering support | $2,000/month | Retainer |
Drowning In Spreadsheets And Manual Reporting?
We build modern data stacks that turn your scattered data into trusted, automated business intelligence. From assessment to live dashboards in 6–8 weeks.