
What is LakeSentry

LakeSentry is a Databricks cost investigation and workload optimization platform. It helps platform teams answer “why did our bill spike?” and safely reduce waste — without risking production stability.

LakeSentry connects to your Databricks account, ingests data from system tables, and transforms it into an understandable cost model. From there, it surfaces insights about waste and anomalies, and can execute optimization actions with safety guardrails.

| Capability | What it means |
| --- | --- |
| Cost visibility | See where money goes — by workspace, team, job, warehouse, or SKU |
| Attribution | Connect costs to owners using rules, tags, and identity mapping |
| Investigation | Drill down from an anomaly to root cause in a few clicks |
| Automation | Terminate idle clusters, resize warehouses — only after earning trust |

Finance asks about a $50K increase. Your team manually queries system tables, cross-references billing exports, and pieces together the story.

With LakeSentry, you get time-range comparison, cost breakdown by dimension, and anomaly detection — answering the question in minutes instead of hours.
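For intuition, the underlying comparison is straightforward SQL over Databricks system tables. Here is a minimal sketch of a week-over-week breakdown by SKU, assuming a Databricks notebook or job where `spark` is predefined; the date ranges are placeholders, and LakeSentry's actual queries are its own:

```python
# Sketch: week-over-week DBU comparison by SKU, straight from system tables.
# Assumes a Databricks notebook/job where `spark` is predefined.
baseline = ("2024-05-06", "2024-05-12")  # placeholder "normal" week
spike = ("2024-05-13", "2024-05-19")     # placeholder week of the increase

df = spark.sql(f"""
    SELECT sku_name,
           SUM(CASE WHEN usage_date BETWEEN '{baseline[0]}' AND '{baseline[1]}'
                    THEN usage_quantity ELSE 0 END) AS baseline_dbus,
           SUM(CASE WHEN usage_date BETWEEN '{spike[0]}' AND '{spike[1]}'
                    THEN usage_quantity ELSE 0 END) AS spike_dbus
    FROM system.billing.usage
    WHERE usage_date BETWEEN '{baseline[0]}' AND '{spike[1]}'
    GROUP BY sku_name
    ORDER BY spike_dbus DESC
""")
df.show(truncate=False)
```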

Shared clusters, cross-team jobs, no clear ownership. Chargeback reports are guesswork.

LakeSentry provides attribution rules with confidence tiers (exact, strong, estimated, unattributed). It’s transparent about what it can and can’t attribute, so your chargeback numbers hold up under scrutiny.
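To make the tiers concrete, here is a hypothetical sketch of tiered attribution logic. The rule order, helper names, and fallbacks are illustrative, not LakeSentry's implementation (though `custom_tags` and `identity_metadata` are real fields on `system.billing.usage`):

```python
# Hypothetical sketch of tiered attribution: each rule maps a usage record
# to an owner, and the tier records how direct the evidence was.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Attribution:
    owner: Optional[str]
    tier: str  # "exact" | "strong" | "estimated" | "unattributed"

def attribute(record: dict) -> Attribution:
    # Exact: an explicit cost-center tag on the resource.
    if tag := record.get("custom_tags", {}).get("cost_center"):
        return Attribution(owner=tag, tier="exact")
    # Strong: the identity that ran the workload maps to a known team.
    if creator := record.get("identity_metadata", {}).get("run_as"):
        return Attribution(owner=f"team-of:{creator}", tier="strong")
    # Estimated: fall back to a configured workspace default, if any.
    if ws_owner := record.get("workspace_default_owner"):
        return Attribution(owner=ws_owner, tier="estimated")
    # Honest default: report unattributed rather than guessing.
    return Attribution(owner=None, tier="unattributed")
```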

“We’re wasting money on idle resources”


Clusters running 24/7 for jobs that run once a day. Warehouses oversized for actual query load.

LakeSentry detects waste and suggests actions with estimated savings. You review what would be saved before approving execution.
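As a back-of-the-envelope illustration of how such an estimate can be built (the 5% threshold and the formula are assumptions, not LakeSentry's model):

```python
# Hypothetical estimate: hours where a cluster was up but nearly unused,
# multiplied by its hourly cost, approximate the savings from terminating it.
IDLE_CPU_THRESHOLD = 0.05  # assumed: <5% average CPU counts as idle

def estimated_monthly_savings(hourly_samples: list[dict], hourly_cost_usd: float) -> float:
    idle_hours = sum(1 for s in hourly_samples if s["avg_cpu_util"] < IDLE_CPU_THRESHOLD)
    return idle_hours * hourly_cost_usd

# e.g. a cluster idle 18 hours/day at $3.50/hour wastes ~$1,900/month
samples = [{"avg_cpu_util": 0.02}] * (18 * 30) + [{"avg_cpu_util": 0.60}] * (6 * 30)
print(f"${estimated_monthly_savings(samples, 3.50):,.0f}")
```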

“I don’t trust automation with production infrastructure”


Previous automation tools caused outages or unexpected behavior.

LakeSentry runs read-only by default. All optimization actions require explicit approval. You can enable autopilot for selected safe actions — with guardrails, rate limits, and a kill switch.
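Those guardrails map naturally onto a policy object. A hypothetical sketch; the field names are illustrative, not LakeSentry's configuration schema:

```python
# Hypothetical autopilot policy: which actions may run unattended,
# how fast, and how to stop everything instantly.
autopilot_policy = {
    "enabled_actions": ["terminate_idle_cluster"],   # opt in per action type
    "require_approval": ["resize_warehouse"],        # everything else stays manual
    "rate_limit": {"max_actions_per_hour": 5},       # cap the blast radius
    "excluded_tags": {"env": "production"},          # never touch tagged resources
    "kill_switch": False,                            # flip True to halt all automation
}
```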

You manage Databricks infrastructure and own the bill. You need forensic investigation tools that help you trace cost back to specific jobs, clusters, and users — not executive summary charts.

You handle chargeback and showback reporting. You need attribution you can trust and rules you can configure, not opaque algorithms you can’t explain to stakeholders.

You run training jobs, experiments, and ML pipelines. You need visibility into compute spend per experiment and serving endpoint so you can optimize within your budget.

LakeSentry follows a three-step flow:

  1. Connect — Add a read-only service principal and connect your Databricks account. Takes minutes, not days.
  2. Collect — LakeSentry ingests system tables on a schedule to build a normalized cost ledger (sketched after this list).
  3. Act safely — Review insights, approve changes — or enable autopilot for selected safe actions with guardrails.
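The Collect step amounts to periodically snapshotting system tables into a normalized ledger. A minimal sketch, assuming a Databricks notebook or job where `spark` is available; the `lakesentry.ledger.daily_usage` table name is hypothetical:

```python
# Minimal sketch of the Collect step: snapshot billable usage into a
# normalized ledger table, run on a schedule by a Databricks job.
# The target table name is hypothetical.
spark.sql("""
    CREATE TABLE IF NOT EXISTS lakesentry.ledger.daily_usage AS
    SELECT usage_date,
           workspace_id,
           sku_name,
           SUM(usage_quantity) AS dbus
    FROM system.billing.usage
    GROUP BY usage_date, workspace_id, sku_name
""")
```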

For the detailed setup process, see the Quick Start Guide.

LakeSentry is built around a few key design decisions:

  • Conservative attribution — LakeSentry shows “unattributed” rather than guessing wrong. Confidence tiers (exact, strong, estimated, unattributed) tell you how much to trust each number.
  • Trust-building automation — All actions require manual approval before execution. You opt-in to escalating automation tiers as you build confidence.
  • Financial forensics, not real-time ops — Designed for “why did this happen?” rather than “what’s happening right now?” Time-range selectors, drill-down paths, and historical trends.
  • Low noise, high signal — Significance scoring instead of alert storms. Every insight is worth reading.
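For intuition on significance scoring, one common approach is to measure deviation against historical variability. This plain z-score sketch is an assumption, not LakeSentry's actual method:

```python
import statistics

def significance(history: list[float], today: float) -> float:
    """Deviation of today's cost from its trailing baseline, in standard
    deviations. Higher scores surface; small wobbles stay quiet."""
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history) or 1e-9  # guard against zero variance
    return abs(today - mean) / stdev

# A $500 day against a flat ~$300 baseline scores far above daily noise.
print(significance([300, 310, 295, 305, 300, 298, 302], 500))
```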

LakeSentry reads from Databricks system tables — billing, compute, jobs, queries, serving, and access metadata. It never accesses your business data, notebooks, or query results; the one exception in the current version is SQL query text, which is read to improve insight quality.

| Data source | What it provides |
| --- | --- |
| `system.billing.*` | Billable usage and list prices |
| `system.compute.*` | Cluster and warehouse configuration and utilization |
| `system.lakeflow.*` | Job and pipeline definitions and run history |
| `system.query.history` | SQL statements on warehouses and serverless |
| `system.serving.*` | Model serving endpoints and usage |
| `system.access.*` | Workspace metadata, lineage, and network events |
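To see how two of these sources combine, the standard pattern joins billable usage to the list price in effect at usage time. A sketch, assuming a context where `spark` is available; verify the join conditions against your own workspace's schema before relying on the numbers:

```python
# Sketch: estimated list cost per workspace per day, joining usage to
# the price that was in effect when the usage occurred.
cost = spark.sql("""
    SELECT u.usage_date,
           u.workspace_id,
           SUM(u.usage_quantity * p.pricing.default) AS list_cost
    FROM system.billing.usage u
    JOIN system.billing.list_prices p
      ON u.sku_name = p.sku_name
     AND u.usage_start_time >= p.price_start_time
     AND (p.price_end_time IS NULL OR u.usage_start_time < p.price_end_time)
    GROUP BY u.usage_date, u.workspace_id
""")
cost.show()
```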