Metrics & Aggregations
LakeSentry pre-computes metrics from the ledger into aggregated tables that power dashboards, reports, and insights. This page explains how the metric system works, what types of metrics exist, and how data accuracy is maintained.
Why pre-aggregated metrics?
Querying raw ledger data for every dashboard load would be slow. A question like “what’s my daily cost by team for the last 90 days?” requires joining usage line items with attribution rules, organizational hierarchy, and pricing data — potentially millions of rows.
Pre-aggregated metrics compute these joins once and store the results in dedicated tables. Dashboard queries then read from these tables directly, making page loads fast regardless of data volume.
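As a rough illustration, the raw form of that question looks something like the first query below, while the pre-aggregated form is a single small scan. All table and column names here are hypothetical, not LakeSentry’s actual schema; the point is the number of joins the raw query implies.

```sql
-- Hypothetical raw query for "daily cost by team, last 90 days".
-- Table and column names are illustrative assumptions.
SELECT
  u.usage_date,
  o.team_name,
  SUM(u.usage_quantity * p.unit_price) AS daily_cost
FROM raw_usage_line_items u
JOIN attribution_rules    a ON u.cluster_id = a.cluster_id
JOIN org_hierarchy        o ON a.team_id    = o.team_id
JOIN pricing              p ON u.sku        = p.sku
                           AND u.usage_date BETWEEN p.valid_from AND p.valid_to
WHERE u.usage_date >= date_sub(current_date(), 90)
GROUP BY u.usage_date, o.team_name;

-- With the joins pre-computed into an aggregate table, the same dashboard
-- question becomes a trivial read of one table (again, names are illustrative):
SELECT usage_date, team_name, daily_cost
FROM cost_by_team_daily
WHERE usage_date >= date_sub(current_date(), 90);
```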
Metric categories
LakeSentry computes metrics across several categories:
Cost metrics
Track spend patterns over time, broken down by attribution dimensions.
| Metric | What it measures | Refresh |
|---|---|---|
| Cost attribution summary | Daily cost by attribution path, team, department, org unit | Daily (60-day lookback) |
| Pipeline spend daily | Daily cost per DLT pipeline | Daily |
| Team attribution daily | Daily cost per team with breakdown | Daily |
| User activity daily | Daily cost and activity per user | Daily |
| Weekend spend weekly | Weekend vs. weekday spend per workspace | Weekly |
| Weekly spend | Week-over-week spend trends with percentage changes | Daily |
The cost attribution summary is the most important cost metric — it’s the source of truth for chargeback reports, cost explorer views, and budget tracking. It aggregates costs along all attribution dimensions: workspace, date, attribution path, org unit, department, team, principal, shared bucket, and project.
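Because the summary already carries every attribution dimension, a chargeback view reduces to a single group-by over it. The query below is a sketch; the table and column names are assumptions based on the dimensions listed above, not the actual schema.

```sql
-- Sketch of a month-to-date chargeback read against the cost attribution summary.
-- Table and column names are assumptions.
SELECT
  org_unit,
  department,
  team,
  SUM(attributed_cost) AS month_to_date_cost
FROM cost_attribution_summary
WHERE usage_date >= trunc(current_date(), 'MONTH')
GROUP BY org_unit, department, team
ORDER BY month_to_date_cost DESC;
```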
Utilization metrics
Track how efficiently compute resources are being used.
| Metric | What it measures | Refresh |
|---|---|---|
| Cluster utilization daily | CPU, memory, idle time per cluster per day | Daily (14-day lookback) |
| Warehouse timeline minute | SQL warehouse state changes at minute granularity | Hourly |
Cluster utilization tracks:
- Average and P95 CPU/memory usage
- Idle minutes (time with no activity)
- Whether auto-termination is enabled
- Recommended worker count (median-based sizing)
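A per-cluster, per-day rollup along these lines could be sketched as follows. The source table, column names, and idle heuristic are illustrative assumptions, not the actual implementation.

```sql
-- Illustrative rollup of per-minute utilization samples into daily cluster metrics.
-- Source table, columns, and the idle heuristic are assumptions.
SELECT
  cluster_id,
  usage_date,
  AVG(cpu_pct)                                              AS avg_cpu_pct,
  PERCENTILE(cpu_pct, 0.95)                                 AS p95_cpu_pct,
  AVG(mem_pct)                                              AS avg_mem_pct,
  PERCENTILE(mem_pct, 0.95)                                 AS p95_mem_pct,
  SUM(CASE WHEN cpu_pct < 1 THEN sample_minutes ELSE 0 END) AS idle_minutes,
  BOOL_OR(autotermination_minutes > 0)                      AS auto_termination_enabled,
  CAST(PERCENTILE(active_workers, 0.5) AS INT)              AS recommended_worker_count  -- median-based sizing
FROM cluster_utilization_samples
WHERE usage_date >= date_sub(current_date(), 14)            -- 14-day lookback
GROUP BY cluster_id, usage_date;
```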
Performance metrics
Track job execution, query performance, and serving efficiency.
| Metric | What it measures | Refresh |
|---|---|---|
| Work unit run cost | Per-run cost with percentile flagging | Daily (90-day lookback) |
| Job run diagnostics | Per-run diagnostic flags (duration, cost, failures) | Daily |
| Query daily aggregate | Query count, duration, error rates by warehouse | Daily |
| Query fact | Individual query metrics with cost allocation and percentile ranking | Daily |
| Spill analysis daily | Disk spill events by user per day | Daily |
| Cold start daily | Warehouse cold start latency metrics | Daily |
Data quality metrics
Track data platform health and efficiency.
| Metric | What it measures | Refresh |
|---|---|---|
| Pruning effectiveness daily | File pruning success rate per table | Daily |
| Scanzilla daily | Queries reading excessive data relative to output | Daily |
| Lineage utilization daily | Table reference patterns and access frequency | Daily |
| Token efficiency daily | LLM token usage in model serving | Daily |
Entity metrics
Track growth and change patterns across resources.
| Metric | What it measures | Refresh |
|---|---|---|
| Entity velocity | Growth/decline detection for workspaces, work units (jobs/pipelines), warehouses | Daily (30-day window) |
| Serving endpoint daily | Model serving endpoint cost and traffic | Daily |
| Serving requester daily | Traffic breakdown by requester per serving endpoint | Daily |
Refresh cadence
Metrics are refreshed on different schedules depending on how frequently the underlying data changes and how time-sensitive the metric is:
| Schedule | When it runs | Metric types |
|---|---|---|
| Hourly | Every hour | Warehouse timeline |
| Daily | Once per day | Cost attribution, utilization, query metrics, run diagnostics, entity velocity, serving metrics, weekly spend trends |
| Weekly | Once per week | Weekend spend |
Lookback windows
Each metric has a configured lookback window that determines how much historical data is recomputed on each refresh. This handles late-arriving data and corrections:
| Metric | Lookback | Why |
|---|---|---|
| Cost attribution summary | 60 days | Attribution rules may change, requiring recalculation |
| Cluster utilization | 14 days | Handles late-arriving utilization data |
| Work unit run cost | 90 days | Long lookback for accurate percentile computation |
| Entity velocity | 30 days | Growth trends need a 30-day comparison window |
Refresh strategies
Metrics use different strategies for updating data:
- Delete and insert by window — Delete all data for the lookback window, then recompute and insert. Used for metrics where the entire window may change (like cost attribution after a rule change).
- Upsert — Insert new records or update existing ones. Used for metrics where most data is stable and only new data needs adding.
- Full refresh — Truncate and recompute the entire table. Used rarely, for small lookup tables.
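In Delta Lake terms, the first two strategies could look roughly like the statements below. The table names, columns, and the 60-day window are illustrative; the actual SQL lives in the metric implementations.

```sql
-- Delete and insert by window (sketch): clear the lookback window, then recompute it.
DELETE FROM cost_attribution_summary
WHERE usage_date >= date_sub(current_date(), 60);

INSERT INTO cost_attribution_summary (usage_date, workspace_id, team, attributed_cost)
SELECT usage_date, workspace_id, team, SUM(cost) AS attributed_cost
FROM ledger_usage
WHERE usage_date >= date_sub(current_date(), 60)
GROUP BY usage_date, workspace_id, team;

-- Upsert (sketch): merge freshly computed rows on the metric's grain.
MERGE INTO cluster_utilization_daily AS t
USING cluster_utilization_staging    AS s
  ON t.cluster_id = s.cluster_id AND t.usage_date = s.usage_date
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```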
Golden source specifications
Each metric is defined by a YAML specification file that serves as the authoritative definition of:
- What data sources the metric reads from
- What columns it produces
- What grain (unique key) the metric is computed at
- The refresh schedule and lookback window
- The refresh strategy
These specifications are the “golden source” — the metric implementation in SQL must match the spec. This approach ensures metrics are documented, testable, and consistent.
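The spec format itself isn’t reproduced here, but one consequence of declaring a grain is that it can be tested mechanically. A check along these lines (with an assumed grain and table name) should return zero rows if the metric honors its spec:

```sql
-- Grain check sketch: assuming the spec declares (workspace_id, usage_date, team)
-- as the unique key for team attribution, any duplicates indicate a spec violation.
SELECT workspace_id, usage_date, team, COUNT(*) AS duplicate_rows
FROM team_attribution_daily
GROUP BY workspace_id, usage_date, team
HAVING COUNT(*) > 1;
```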
How metrics affect what you see
Different parts of LakeSentry read from different metric tables:
| Feature | Primary metric source |
|---|---|
| Overview dashboard | Cost attribution summary, entity velocity |
| Cost Explorer | Cost attribution summary, weekly spend |
| Work unit detail | Work unit run cost, job run diagnostics |
| Cluster detail | Cluster utilization daily |
| SQL analysis | Query daily aggregate, spill analysis, Scanzilla |
| Budgets | Cost attribution summary (for actual spend tracking) |
| Insights | Multiple metrics depending on insight type |
Data accuracy
LakeSentry maintains data accuracy through several mechanisms:
- Immutable raw layer — Original data from Databricks is never modified. Metrics can always be traced back to source.
- Idempotent computation — Running a metric refresh twice produces identical results. There are no race conditions or ordering dependencies within a metric.
- Lookback recomputation — Each refresh recomputes a window of historical data, catching any corrections or late-arriving records.
- Dependency ordering — Metrics that depend on other metrics are computed in the correct order. Cost attribution runs before the attribution summary metric is refreshed.
If you ever notice a discrepancy, LakeSentry’s pipeline design means the fix is straightforward: recompute the metric from ledger data. No data is lost because the raw layer is append-only.
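If you want to verify a number yourself, a reconciliation query in the spirit of the pipeline is straightforward: recompute a slice from the raw layer and diff it against the metric table. The sketch below uses assumed table and column names.

```sql
-- Reconciliation sketch (assumed tables/columns): recompute yesterday from the
-- raw ledger and report any rows that disagree with the stored metric.
WITH recomputed AS (
  SELECT usage_date, team, SUM(cost) AS attributed_cost
  FROM ledger_usage
  WHERE usage_date = date_sub(current_date(), 1)
  GROUP BY usage_date, team
),
stored AS (
  SELECT usage_date, team, attributed_cost
  FROM cost_attribution_summary
  WHERE usage_date = date_sub(current_date(), 1)
)
SELECT r.team AS recomputed_team, r.attributed_cost AS recomputed_cost,
       s.team AS stored_team,     s.attributed_cost AS stored_cost
FROM recomputed r
FULL OUTER JOIN stored s
  ON r.usage_date = s.usage_date AND r.team = s.team
WHERE NOT (r.attributed_cost <=> s.attributed_cost);
```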
Next steps
- How LakeSentry Works — The full data pipeline that feeds metrics
- Cost Attribution & Confidence Tiers — How the cost attribution summary is computed
- Overview Dashboard — Where metrics power your daily cost view