FAQ

Answers to common questions about LakeSentry from platform engineering, FinOps, and data teams.

LakeSentry is a Databricks cost observability and optimization platform. It connects to your Databricks account, ingests system table data, and provides cost breakdowns, anomaly detection, waste identification, and safe optimization actions — all through a single dashboard. See What is LakeSentry for a full overview.

How is this different from Databricks’ built-in cost tools?

Built-in tools show raw usage data per workspace. LakeSentry normalizes cost across all your workspaces, detects anomalies and waste automatically, and gives you one-click actions to fix what it finds — no SQL or manual analysis required.

Key differences:

  • Cross-workspace visibility — See all workspaces and regions in one view
  • Automatic detection — Anomalies, waste, and optimization opportunities surfaced without manual queries
  • Attribution — Costs attributed to teams, projects, and owners with confidence tiers
  • Safe actions — Approve and execute optimizations with guardrails and audit trail

How does LakeSentry compare to other optimization platforms?

LakeSentry is read-only by default, charges no per-workspace fees, and focuses on safe automation with approval workflows and guardrails. Most alternatives require broad write access from day one or charge per workspace, which penalizes teams with many environments.

LakeSentry serves three primary audiences:

  • Platform engineering teams — Pinpoint runaway jobs, warehouse drift, and regressions before they impact reliability
  • FinOps teams — Get normalized cost visibility across workspaces with evidence-backed optimization priorities
  • Data and ML teams — See where experiments and serving endpoints burn budget, then act with clear guardrails

Yes. Unity Catalog is required for reliable access to the system tables in the system catalog that LakeSentry reads. If your Databricks account doesn’t have Unity Catalog enabled, you’ll need to enable it before connecting. See Connecting Your Databricks Account for prerequisites.

All plans include unlimited workspaces with no per-workspace fees. You connect at the account level, then add region connectors for each region where you operate.

We operate in multiple regions — does that matter?

Yes. For each region you operate in, create a Region Connector and deploy a collector. LakeSentry shows connector health per region and continues monitoring wherever connectors are healthy.

Most teams complete the initial setup in under 30 minutes:

  1. Create a service principal and grant permissions (10–15 minutes)
  2. Create an account connector in LakeSentry (2 minutes)
  3. Deploy the collector in Databricks (10–15 minutes)

After the first collector run, data begins appearing in dashboards within 15–30 minutes. See the Quick Start Guide for the full walkthrough.

LakeSentry requires a Databricks service principal with SELECT access to system tables. The minimum required tables cover billing, compute, and workload data. Optional tables (MLflow, serving, storage) unlock additional features.

The full permission requirements are documented in Account & Connector Setup.
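As a rough illustration of what SELECT access for a service principal involves, the sketch below generates Unity Catalog GRANT statements for a list of tables. The helper name, the principal name, and the two example tables are assumptions chosen for illustration; they are not LakeSentry's required table list, which is documented in Account & Connector Setup.

```python
def grant_statements(principal, tables):
    """Build GRANT statements giving `principal` read access to `tables`.

    `tables` are fully qualified names (catalog.schema.table).
    Hypothetical helper for illustration only.
    """
    catalogs = sorted({t.split(".")[0] for t in tables})
    schemas = sorted({".".join(t.split(".")[:2]) for t in tables})

    stmts = []
    # Reading a table also requires usage rights on its catalog and schema.
    for catalog in catalogs:
        stmts.append(f"GRANT USE CATALOG ON CATALOG {catalog} TO `{principal}`;")
    for schema in schemas:
        stmts.append(f"GRANT USE SCHEMA ON SCHEMA {schema} TO `{principal}`;")
    for table in tables:
        stmts.append(f"GRANT SELECT ON TABLE {table} TO `{principal}`;")
    return stmts


if __name__ == "__main__":
    # Example tables only; substitute the list from the setup guide.
    example_tables = ["system.billing.usage", "system.compute.clusters"]
    for stmt in grant_statements("lakesentry-sp", example_tables):
        print(stmt)
```

The generated statements can be reviewed before an account admin runs them in a Databricks SQL editor.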

Yes. LakeSentry can run fully read-only for reporting and detection. Write permissions are only required if you choose to execute actions (manual approvals or autopilot). See Action Plans & Automation for details on the safety model.

No. LakeSentry queries Databricks system tables for usage and cost metadata only. It never accesses your business data, notebooks, or query results. The system tables contain billing records, compute metadata, job run history, and similar operational data.

In the current release, yes. LakeSentry stores raw query text because it improves insight quality, for example by identifying which queries contribute most to warehouse costs. Self-serve controls for query text handling are in development; if storing query text is a concern in the meantime, contact us to discuss deployment options.

LakeSentry runs read-only by default. Automation is opt-in and layered:

  • Tier 2 (Manual recommendations) — LakeSentry suggests changes with instructions. You execute them yourself in Databricks.
  • Tier 1 (Approval required) — LakeSentry can execute changes, but an admin must explicitly approve each action.
  • Tier 0 (Autopilot) — Selected safe actions run automatically, governed by allowlists, denylists, cooldowns, rate limits, and a kill switch.

Every action is logged in the Audit Log. See Action Plans & Automation for the complete safety model.
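To make the Tier 0 guardrails concrete, here is a minimal sketch of how an allowlist, denylist, cooldown, rate limit, and kill switch could be combined into a single pre-execution check. The class, field names, check order, and the one-hour rate window are assumptions for illustration, not LakeSentry's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Guardrails:
    allowlist: set               # action types permitted to run automatically
    denylist: set                # targets that must never be touched
    cooldown_s: float            # minimum seconds between actions on one target
    max_actions_per_hour: int    # global rate limit (window size is an assumption)
    kill_switch: bool = False
    last_run: dict = field(default_factory=dict)  # target -> last action time
    recent: list = field(default_factory=list)    # times of executed actions

    def may_run(self, action_type, target, now):
        """Return (allowed, reason) for a proposed automated action."""
        if self.kill_switch:
            return False, "kill switch engaged"
        if target in self.denylist:
            return False, "target is denylisted"
        if action_type not in self.allowlist:
            return False, "action type not allowlisted"
        last = self.last_run.get(target)
        if last is not None and now - last < self.cooldown_s:
            return False, "cooldown active"
        in_window = [t for t in self.recent if now - t < 3600]
        if len(in_window) >= self.max_actions_per_hour:
            return False, "rate limit reached"
        return True, "ok"

    def record(self, target, now):
        """Remember an executed action for cooldown and rate-limit checks."""
        self.last_run[target] = now
        self.recent.append(now)
```

Checking the kill switch first means that engaging it blocks every action regardless of the other settings, which matches its role as the last line of defense.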

Yes. The kill switch immediately halts all in-progress and pending automated actions across all workspaces. It’s accessible from the Actions page header and the global navigation bar. See Insights & Actions for details.

LakeSentry offers Free, Standard, and Pro tiers. All plans include unlimited workspaces and the full feature set — the differences are user count, history depth, and advanced capabilities. See lakesentry.io/pricing for current plan details and pricing.

Yes. LakeSentry uses Databricks list prices by default and supports optional DBU price overrides so your dashboards reflect your contract reality. Configure price overrides in Settings.
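The effect of a price override can be sketched as a simple lookup: use the override for a SKU when one exists, otherwise fall back to the list price. The function name, SKU name, and prices below are hypothetical and only illustrate the arithmetic.

```python
from decimal import Decimal


def effective_cost(dbus, sku, list_prices, overrides=None):
    """Cost in USD for `dbus` DBUs of `sku`, preferring an override price."""
    price = (overrides or {}).get(sku, list_prices[sku])
    return dbus * price


# Hypothetical SKU and prices, for illustration only.
list_prices = {"EXAMPLE_JOBS_SKU": Decimal("0.15")}
overrides = {"EXAMPLE_JOBS_SKU": Decimal("0.12")}

at_list = effective_cost(Decimal("1000"), "EXAMPLE_JOBS_SKU", list_prices)
at_contract = effective_cost(Decimal("1000"), "EXAMPLE_JOBS_SKU", list_prices, overrides)
print(at_list, at_contract)
```

With these example numbers, the same 1,000 DBUs cost 150 USD at list price and 120 USD at the override price, which is the kind of gap that appears when dashboards use list prices but the contract is discounted.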

The Free plan is free forever with no credit card required. It includes unlimited workspaces, 1 user, and 3 months of history — enough to evaluate the platform with your full Databricks environment.

Why don’t my costs match the Databricks console?

This is the most common question we receive. Differences are usually caused by:

  • Time zone differences — LakeSentry uses UTC for all calculations
  • Cost model — LakeSentry uses list prices by default; your Databricks console may show negotiated pricing
  • Scope — LakeSentry only shows costs for connected workspaces and regions

See Cost Discrepancies for a detailed breakdown of common causes.
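The time zone effect is easy to reproduce. The sketch below assigns one usage event to a calendar day twice: once in UTC and once in a fixed UTC-8 offset standing in for a local console setting. The function name, the event timestamp, and the offset are assumptions for illustration.

```python
from datetime import date, datetime, timedelta, timezone


def billing_day(ts: datetime, tz: timezone) -> date:
    """Calendar day that a usage timestamp falls on in the given time zone."""
    return ts.astimezone(tz).date()


# Hypothetical usage event at 02:30 UTC on 1 March 2024.
event = datetime(2024, 3, 1, 2, 30, tzinfo=timezone.utc)
local_tz = timezone(timedelta(hours=-8))  # fixed offset, ignores daylight saving

utc_day = billing_day(event, timezone.utc)  # 2024-03-01
local_day = billing_day(event, local_tz)    # 2024-02-29
print(utc_day, local_day)
```

The same event lands on different days, and here in different months, so daily and monthly totals can differ between a UTC-based view and a local-time view even when the underlying usage is identical.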

Unattributed cost is spend that couldn’t be assigned to a specific team, project, or owner. This happens when resources lack tags, ownership signals, or matching attribution rules.

LakeSentry uses a tiered attribution model with four confidence levels — Exact, Strong, Estimated, and Unattributed. See Cost Attribution & Confidence Tiers for how to reduce unattributed costs.

How much history you can see depends on two factors:

  • Databricks system table retention — Billing data typically goes back 30–90 days. Compute and query history retention varies.
  • LakeSentry plan limits — Free retains 3 months, Standard retains 12 months, Pro retains unlimited history.

LakeSentry captures whatever historical data is available in system tables on the first collector run. Connect early to maximize your historical data.

LakeSentry uses Z-score statistical analysis to compare recent cost values against a historical baseline. When a value deviates significantly from the norm, it’s flagged as an anomaly with evidence (baseline, recent average, cost delta, Z-score, and confidence level).

Anomaly detection requires at least 5 data points to establish a baseline and applies minimum thresholds ($10 baseline, $50 delta) to avoid noise. See Anomaly Detection for the full methodology.
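A minimal sketch of this kind of Z-score check is shown below. The minimum data points and dollar thresholds come from the description above; the Z-score cutoff of 3.0, the use of the sample standard deviation, and all function and field names are assumptions, since the exact formula is documented in Anomaly Detection rather than here.

```python
import math
from statistics import mean, stdev

MIN_POINTS = 5          # minimum baseline data points
MIN_BASELINE_USD = 10   # minimum baseline cost
MIN_DELTA_USD = 50      # minimum cost delta
Z_THRESHOLD = 3.0       # assumption: the cutoff is not stated in this FAQ


def detect_anomaly(history, recent, z_threshold=Z_THRESHOLD):
    """Return an evidence dict if `recent` costs are anomalous, else None."""
    if len(history) < MIN_POINTS or not recent:
        return None  # not enough data to establish a baseline

    baseline = mean(history)
    recent_avg = mean(recent)
    delta = recent_avg - baseline

    # Minimum thresholds suppress noise from tiny or low-value series.
    if baseline < MIN_BASELINE_USD or abs(delta) < MIN_DELTA_USD:
        return None

    spread = stdev(history)
    # A perfectly flat baseline has zero spread; treat any large delta as extreme.
    z = delta / spread if spread > 0 else math.copysign(math.inf, delta)
    if abs(z) < z_threshold:
        return None

    return {
        "baseline": baseline,
        "recent_average": recent_avg,
        "cost_delta": delta,
        "z_score": z,
    }
```

For example, a baseline of daily costs near 100 USD followed by recent days near 185 USD would be flagged, while a jump to 120 USD would not, because the 20 USD delta is below the 50 USD minimum.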

What types of waste does LakeSentry detect?

LakeSentry identifies several types of waste:

  • Idle clusters — Running clusters with no active workloads
  • Overprovisioned workers — Clusters with consistently low utilization
  • Zombie endpoints — Serving endpoints with no traffic
  • Expensive queries — Queries consuming disproportionate resources
  • Data retention waste — Storage costs for unused or rarely accessed data

See Waste Detection & Insights for details on each waste type.

Yes. For ML workloads, LakeSentry tracks:

  • MLflow experiments — Cost attribution per experiment and run. See MLflow.
  • Model serving endpoints — Cost tracking and traffic-to-cost correlation. See Model Serving.
  • Training jobs — Cost per training run via the standard Work Units tracking.

Does LakeSentry support budgets and alerts?

Yes. You can create budgets at the workspace, organization unit, department, or team level with threshold alerts. See Budgets and Organizational Hierarchy & Budgets for details.

Where do I start if something isn’t working?

Start with the Common Issues page, which has a quick reference table mapping symptoms to solutions. For collector-specific problems, see Collector Issues. For data latency questions, see Data Freshness & Pipeline Status.

  1. Gather relevant details (error messages, connector status, timestamps).
  2. Contact LakeSentry support through the app — click the help icon in the bottom-right corner.
  3. For urgent issues, email support directly with the details listed in the escalation checklist.