Metrics & Aggregations

LakeSentry pre-computes metrics from the ledger into aggregated tables that power dashboards, reports, and insights. This page explains how the metric system works, what types of metrics exist, and how data accuracy is maintained.

Querying raw ledger data for every dashboard load would be slow. A question like “what’s my daily cost by team for the last 90 days?” requires joining usage line items with attribution rules, organizational hierarchy, and pricing data — potentially millions of rows.

Pre-aggregated metrics compute these joins once and store the results in dedicated tables. Dashboard queries then read from these tables directly, making page loads fast regardless of data volume.
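
As a rough sketch of why this matters (the table and column names here are illustrative, not LakeSentry's actual schema), the 90-day question above becomes a single scan over one pre-aggregated table instead of a multi-way join:

```sql
-- Hypothetical pre-aggregated table; real table and column names may differ.
-- "Daily cost by team for the last 90 days", read straight from the summary.
SELECT
  usage_date,
  team,
  SUM(cost_usd) AS daily_cost_usd
FROM metrics.cost_attribution_summary
WHERE usage_date >= date_sub(current_date(), 90)
GROUP BY usage_date, team
ORDER BY usage_date, team;
```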

LakeSentry computes metrics across several categories:

Cost metrics track spend patterns over time, broken down by attribution dimensions.

| Metric | What it measures | Refresh |
| --- | --- | --- |
| Cost attribution summary | Daily cost by attribution path, team, department, org unit | Daily (60-day lookback) |
| Pipeline spend daily | Daily cost per DLT pipeline | Daily |
| Team attribution daily | Daily cost per team with breakdown | Daily |
| User activity daily | Daily cost and activity per user | Daily |
| Weekend spend weekly | Weekend vs. weekday spend per workspace | Weekly |
| Weekly spend | Week-over-week spend trends with percentage changes | Daily |

The cost attribution summary is the most important cost metric — it’s the source of truth for chargeback reports, cost explorer views, and budget tracking. It aggregates costs along all attribution dimensions: workspace, date, attribution path, org unit, department, team, principal, shared bucket, and project.
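
Because the summary already carries every attribution dimension, a chargeback rollup reduces to a GROUP BY over it. A sketch, assuming illustrative table and column names rather than the actual schema:

```sql
-- Monthly chargeback by department and team for the last three full-ish months.
-- metrics.cost_attribution_summary and its columns are illustrative.
SELECT
  trunc(usage_date, 'MONTH') AS billing_month,
  department,
  team,
  SUM(cost_usd)              AS attributed_cost_usd
FROM metrics.cost_attribution_summary
WHERE usage_date >= trunc(add_months(current_date(), -3), 'MONTH')
GROUP BY trunc(usage_date, 'MONTH'), department, team
ORDER BY billing_month, department, team;
```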

Utilization metrics track how efficiently compute resources are being used.

| Metric | What it measures | Refresh |
| --- | --- | --- |
| Cluster utilization daily | CPU, memory, idle time per cluster per day | Daily (14-day lookback) |
| Warehouse timeline minute | SQL warehouse state changes at minute granularity | Hourly |

Cluster utilization tracks:

  • Average and P95 CPU/memory usage
  • Idle minutes (time with no activity)
  • Whether auto-termination is enabled
  • Recommended worker count (median-based sizing)
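
A sketch of how these per-cluster daily figures might be derived from minute-level utilization samples. The source table, column names, and the idle heuristic are assumptions, not LakeSentry's actual implementation:

```sql
-- Illustrative daily rollup of per-cluster utilization samples.
-- raw.cluster_utilization_samples and its columns are hypothetical.
SELECT
  cluster_id,
  CAST(sample_time AS DATE)                    AS usage_date,
  AVG(cpu_pct)                                 AS avg_cpu_pct,
  percentile_approx(cpu_pct, 0.95)             AS p95_cpu_pct,
  AVG(mem_pct)                                 AS avg_mem_pct,
  percentile_approx(mem_pct, 0.95)             AS p95_mem_pct,
  SUM(CASE WHEN cpu_pct < 1 THEN 1 ELSE 0 END) AS idle_minutes  -- assumes one sample per minute
FROM raw.cluster_utilization_samples
WHERE sample_time >= date_sub(current_date(), 14)  -- matches the 14-day lookback
GROUP BY cluster_id, CAST(sample_time AS DATE);
```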

Performance metrics track job execution, query performance, and serving efficiency.

| Metric | What it measures | Refresh |
| --- | --- | --- |
| Work unit run cost | Per-run cost with percentile flagging | Daily (90-day lookback) |
| Job run diagnostics | Per-run diagnostic flags (duration, cost, failures) | Daily |
| Query daily aggregate | Query count, duration, error rates by warehouse | Daily |
| Query fact | Individual query metrics with cost allocation and percentile ranking | Daily |
| Spill analysis daily | Disk spill events by user per day | Daily |
| Cold start daily | Warehouse cold start latency metrics | Daily |
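
As an illustration of the percentile flagging mentioned for work unit run cost, here is a sketch using window functions. The table and column names are assumptions, not LakeSentry's actual schema:

```sql
-- Flag runs whose cost ranks above the 95th percentile for the same work unit
-- over the 90-day lookback window. Names are illustrative.
WITH ranked AS (
  SELECT
    work_unit_id,
    run_id,
    run_date,
    run_cost_usd,
    PERCENT_RANK() OVER (
      PARTITION BY work_unit_id
      ORDER BY run_cost_usd
    ) AS cost_percentile
  FROM metrics.work_unit_run_cost
  WHERE run_date >= date_sub(current_date(), 90)
)
SELECT
  work_unit_id,
  run_id,
  run_date,
  run_cost_usd,
  cost_percentile >= 0.95 AS is_cost_outlier
FROM ranked;
```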

Data platform metrics track platform health and efficiency.

| Metric | What it measures | Refresh |
| --- | --- | --- |
| Pruning effectiveness daily | File pruning success rate per table | Daily |
| Scanzilla daily | Queries reading excessive data relative to output | Daily |
| Lineage utilization daily | Table reference patterns and access frequency | Daily |
| Token efficiency daily | LLM token usage in model serving | Daily |
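
To make one of these concrete, a "scanzilla"-style check might compare bytes scanned to rows returned per query. The table, columns, and thresholds below are assumptions, not LakeSentry's actual definition:

```sql
-- Illustrative check for queries that scan far more data than they return.
SELECT
  query_id,
  user_name,
  bytes_scanned,
  rows_returned,
  bytes_scanned / NULLIF(rows_returned, 0) AS bytes_per_returned_row
FROM raw.query_history                                      -- hypothetical query history table
WHERE CAST(start_time AS DATE) = date_sub(current_date(), 1)
  AND bytes_scanned > 10 * 1024 * 1024 * 1024               -- scanned more than ~10 GiB
  AND bytes_scanned / NULLIF(rows_returned, 0) > 1000000    -- and more than ~1 MB per returned row
ORDER BY bytes_per_returned_row DESC;
```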

Velocity and serving metrics track growth and change patterns across resources.

| Metric | What it measures | Refresh |
| --- | --- | --- |
| Entity velocity | Growth/decline detection for workspaces, work units (jobs/pipelines), warehouses | Daily (30-day window) |
| Serving endpoint daily | Model serving endpoint cost and traffic | Daily |
| Serving requester daily | Traffic breakdown by requester per serving endpoint | Daily |
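
To make the velocity idea concrete, here is one way a 30-day comparison could be expressed. The table, columns, and half-window split are assumptions rather than LakeSentry's actual entity-velocity logic:

```sql
-- Compare each entity's spend in the most recent 15 days against the 15 days
-- before that, within a 30-day window. All names are illustrative.
SELECT
  entity_type,                                   -- e.g. workspace, work unit, warehouse
  entity_id,
  SUM(CASE WHEN usage_date >= date_sub(current_date(), 15)
           THEN cost_usd ELSE 0 END) AS recent_cost_usd,
  SUM(CASE WHEN usage_date <  date_sub(current_date(), 15)
           THEN cost_usd ELSE 0 END) AS prior_cost_usd,
  SUM(CASE WHEN usage_date >= date_sub(current_date(), 15) THEN cost_usd ELSE 0 END)
    / NULLIF(SUM(CASE WHEN usage_date < date_sub(current_date(), 15) THEN cost_usd ELSE 0 END), 0)
    AS growth_ratio
FROM metrics.entity_daily_cost                   -- hypothetical per-entity daily cost table
WHERE usage_date >= date_sub(current_date(), 30)
GROUP BY entity_type, entity_id;
```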

Metrics are refreshed on different schedules depending on how frequently the underlying data changes and how time-sensitive the metric is:

| Schedule | When it runs | Metric types |
| --- | --- | --- |
| Hourly | Every hour | Warehouse timeline |
| Daily | Once per day | Cost attribution, utilization, query metrics, run diagnostics, entity velocity, serving metrics, weekly spend trends |
| Weekly | Once per week | Weekend spend |

Each metric has a configured lookback window that determines how much historical data is recomputed on each refresh. This handles late-arriving data and corrections:

| Metric | Lookback | Why |
| --- | --- | --- |
| Cost attribution summary | 60 days | Attribution rules may change, requiring recalculation |
| Cluster utilization | 14 days | Handles late-arriving utilization data |
| Work unit run cost | 90 days | Long lookback for accurate percentile computation |
| Entity velocity | 30 days | Growth trends need a 30-day comparison window |

Metrics use different strategies for updating data:

  • Delete and insert by window — Delete all data for the lookback window, then recompute and insert. Used for metrics where the entire window may change (like cost attribution after a rule change).
  • Upsert — Insert new records or update existing ones. Used for metrics where most data is stable and only new data needs adding.
  • Full refresh — Truncate and recompute the entire table. Used rarely, for small lookup tables.
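
For concreteness, the first two strategies look roughly like this in Databricks/Delta-style SQL. The table names, columns, and the 60-day window below are illustrative, not LakeSentry's actual implementation:

```sql
-- 1) Delete and insert by window: wipe the lookback window, then recompute it.
DELETE FROM metrics.cost_attribution_summary
WHERE usage_date >= date_sub(current_date(), 60);

INSERT INTO metrics.cost_attribution_summary
SELECT usage_date, workspace_id, team, SUM(cost_usd) AS cost_usd
FROM ledger.usage_attributed                 -- hypothetical attributed-ledger view
WHERE usage_date >= date_sub(current_date(), 60)
GROUP BY usage_date, workspace_id, team;

-- 2) Upsert: insert new rows, update existing ones at the metric's grain.
MERGE INTO metrics.query_daily_aggregate AS t
USING staging.query_daily_aggregate AS s
  ON  t.warehouse_id = s.warehouse_id
  AND t.usage_date   = s.usage_date
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```

Running either statement pair twice over the same window produces the same result, which is what makes the refreshes idempotent.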

Each metric is defined by a YAML specification file that serves as the authoritative definition of:

  • What data sources the metric reads from
  • What columns it produces
  • What grain (unique key) the metric is computed at
  • The refresh schedule and lookback window
  • The refresh strategy

These specifications are the “golden source” — the metric implementation in SQL must match the spec. This approach ensures metrics are documented, testable, and consistent.
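
Because each spec declares a grain, one natural consistency test is to assert that the metric table contains at most one row per key at that grain. A sketch, using illustrative grain columns rather than the real spec:

```sql
-- Grain check: any rows returned indicate duplicate keys at the declared grain.
-- The grain columns (usage_date, workspace_id, team) are illustrative.
SELECT usage_date, workspace_id, team, COUNT(*) AS row_count
FROM metrics.cost_attribution_summary
GROUP BY usage_date, workspace_id, team
HAVING COUNT(*) > 1;
```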

Different parts of LakeSentry read from different metric tables:

| Feature | Primary metric source |
| --- | --- |
| Overview dashboard | Cost attribution summary, entity velocity |
| Cost Explorer | Cost attribution summary, weekly spend |
| Work unit detail | Work unit run cost, job run diagnostics |
| Cluster detail | Cluster utilization daily |
| SQL analysis | Query daily aggregate, spill analysis, scanzilla |
| Budgets | Cost attribution summary (for actual spend tracking) |
| Insights | Multiple metrics depending on insight type |

LakeSentry maintains data accuracy through several mechanisms:

  • Immutable raw layer — Original data from Databricks is never modified. Metrics can always be traced back to source.
  • Idempotent computation — Running a metric refresh twice produces identical results. There are no race conditions or ordering dependencies within a metric.
  • Lookback recomputation — Each refresh recomputes a window of historical data, catching any corrections or late-arriving records.
  • Dependency ordering — Metrics that depend on other metrics are computed in the correct order. Cost attribution runs before the attribution summary metric is refreshed.

If you ever notice a discrepancy, LakeSentry’s pipeline design means the fix is straightforward: recompute the metric from ledger data. No data is lost because the raw layer is append-only.