FAQ

Answers to common questions about LakeSentry from platform engineering, FinOps, and data teams.

General

What is LakeSentry?

LakeSentry is a Databricks cost observability and optimization platform. It connects to your Databricks account, ingests system table data, and provides cost breakdowns, anomaly detection, waste identification, and safe optimization actions — all through a single dashboard. See What is LakeSentry for a full overview.

How is this different from Databricks’ built-in cost tools?

Built-in tools show raw usage data per workspace. LakeSentry normalizes cost across all your workspaces, detects anomalies and waste automatically, and gives you one-click actions to fix what it finds — no SQL or manual analysis required.

Key differences:

Cross-workspace visibility — See all workspaces and regions in one view
Automatic detection — Anomalies, waste, and optimization opportunities surfaced without manual queries
Attribution — Costs attributed to teams, projects, and owners with confidence tiers
Safe actions — Approve and execute optimizations with guardrails and audit trail

How does LakeSentry compare to other optimization platforms?

LakeSentry is read-only by default, charges no per-workspace fees, and focuses on safe automation with approval workflows and guardrails. Most alternatives require broad write access from day one or charge per workspace, which penalizes teams with many environments.

Who is LakeSentry for?

LakeSentry serves three primary audiences:

Platform engineering teams — Pinpoint runaway jobs, warehouse drift, and regressions before they impact reliability
FinOps teams — Get normalized cost visibility across workspaces with evidence-backed optimization priorities
Data and ML teams — See where experiments and serving endpoints burn budget, then act with clear guardrails

Setup and requirements

Does LakeSentry require Unity Catalog?

Yes. Unity Catalog is required for reliable access to the system tables in the system catalog that LakeSentry reads. If your Databricks account doesn’t have Unity Catalog enabled, you’ll need to enable it before connecting. See Connecting Your Databricks Account for prerequisites.

How many workspaces can I connect?

All plans include unlimited workspaces with no per-workspace fees. You connect at the account level, then add region connectors for each region where you operate.

We operate in multiple regions — does that matter?

Yes. For each region you operate in, create a Region Connector and deploy a collector. LakeSentry shows connector health per region and continues monitoring wherever connectors are healthy.

How long does setup take?

Most teams complete the initial setup in about 30 minutes:

Create a service principal and grant permissions (10–15 minutes)
Create an account connector in LakeSentry (2 minutes)
Deploy the collector in Databricks (10–15 minutes)

After the first collector run, data begins appearing in dashboards within 15–30 minutes. See the Quick Start Guide for the full walkthrough.

What permissions does LakeSentry need?

LakeSentry requires a Databricks service principal with SELECT access to system tables. The minimum required tables cover billing, compute, and workload data. Optional tables (MLflow, serving, storage) unlock additional features.

The full permission requirements are documented in Account & Connector Setup.
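
As a rough illustration of what read-only access to system tables can look like, the sketch below generates Unity Catalog GRANT statements for a service principal. The table list and the principal identifier are assumptions chosen for the example; the authoritative list of required and optional tables is the one in Account & Connector Setup.

```python
def grant_statements(principal, tables):
    """Build read-only GRANT statements for fully qualified table names.

    `principal` and `tables` are illustrative inputs, not LakeSentry's
    documented requirements.
    """
    catalogs, schemas = [], []
    for table in tables:
        catalog, schema, _name = table.split(".")
        if catalog not in catalogs:
            catalogs.append(catalog)
        qualified_schema = f"{catalog}.{schema}"
        if qualified_schema not in schemas:
            schemas.append(qualified_schema)

    # Reading a table also requires usage rights on its catalog and schema.
    statements = [f"GRANT USE CATALOG ON CATALOG {c} TO `{principal}`" for c in catalogs]
    statements += [f"GRANT USE SCHEMA ON SCHEMA {s} TO `{principal}`" for s in schemas]
    statements += [f"GRANT SELECT ON TABLE {t} TO `{principal}`" for t in tables]
    return statements


# Hypothetical principal ID and an assumed subset of system tables.
example = grant_statements(
    "00000000-0000-0000-0000-000000000000",
    ["system.billing.usage", "system.compute.clusters"],
)
for statement in example:
    print(statement + ";")
```

Only SELECT and usage privileges appear here, which matches the read-only posture described in this FAQ.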

Security and data access

Can we run read-only?

Yes. LakeSentry can run fully read-only for reporting and detection. Write permissions are required only if you choose to have LakeSentry execute actions (approval-required or autopilot). See Action Plans & Automation for details on the safety model.

Do you access our business data?

No. LakeSentry queries Databricks system tables for usage and cost metadata only. It never accesses your business data, notebooks, or query results. The system tables contain billing records, compute metadata, job run history, and similar operational data.

Do you store query text?

In the current release, yes. LakeSentry stores raw query text because it improves insight quality — for example, identifying which queries contribute most to warehouse costs. If this is a concern, contact us to discuss deployment options while self-serve controls for query text handling are being developed.

How safe is automation?

LakeSentry runs read-only by default. Automation is opt-in and layered:

Tier 2 (Manual recommendations) — LakeSentry suggests changes with instructions. You execute them yourself in Databricks.
Tier 1 (Approval required) — LakeSentry can execute changes, but an admin must explicitly approve each action.
Tier 0 (Autopilot) — Selected safe actions run automatically, governed by allowlists, denylists, cooldowns, rate limits, and a kill switch.

Every action is logged in the Audit Log. See Action Plans & Automation for the complete safety model.
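
To make the autopilot guardrails concrete, here is a minimal sketch of how a kill switch, denylist, allowlist, rate limit, and cooldown could be checked before an automated action runs. The class, field names, and check order are hypothetical and are not LakeSentry's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Guardrails:
    """Hypothetical guardrail set for automated actions."""
    allowlist: set            # action types permitted to run automatically
    denylist: set             # resource IDs that must never be touched
    cooldown_seconds: float   # minimum gap between actions on one resource
    max_actions_per_hour: int
    kill_switch: bool = False
    history: list = field(default_factory=list)  # (timestamp, resource_id)

    def check(self, action_type, resource_id, now):
        """Return (allowed, reason) for a proposed action at time `now` (seconds)."""
        if self.kill_switch:
            return False, "kill switch engaged"
        if resource_id in self.denylist:
            return False, "resource is denylisted"
        if action_type not in self.allowlist:
            return False, "action type is not allowlisted"
        recent = [t for t, _ in self.history if now - t < 3600]
        if len(recent) >= self.max_actions_per_hour:
            return False, "rate limit reached"
        last = max((t for t, r in self.history if r == resource_id), default=None)
        if last is not None and now - last < self.cooldown_seconds:
            return False, "resource is in cooldown"
        return True, "ok"

    def record(self, resource_id, now):
        """Remember an executed action so later checks see it."""
        self.history.append((now, resource_id))
```

In this sketch the kill switch is evaluated first, so engaging it blocks every action regardless of the other settings.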

Is there a kill switch?

Yes. The kill switch immediately halts all in-progress and pending automated actions across all workspaces. It’s accessible from the Actions page header and the global navigation bar. See Insights & Actions for details.

Pricing and plans

How does pricing work?

LakeSentry offers Free, Standard, and Pro tiers. All plans include unlimited workspaces and the full feature set — the differences are user count, history depth, and advanced capabilities. See lakesentry.io/pricing for current plan details and pricing.

Can we customize DBU prices?

Yes. LakeSentry uses Databricks list prices by default and supports optional DBU price overrides so your dashboards reflect your contract reality. Configure price overrides in Settings.
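
The effect of a price override can be sketched as a lookup that falls back to list price when no override exists. The SKU names and per-DBU prices below are placeholders chosen for the example, not real Databricks list prices, and the function names are hypothetical.

```python
from decimal import Decimal

# Placeholder list prices per DBU (illustrative numbers only).
LIST_PRICES = {
    "JOBS_COMPUTE": Decimal("0.15"),
    "SQL_COMPUTE": Decimal("0.22"),
}


def price_per_dbu(sku, overrides=None):
    """Use the contract override for a SKU if one exists, else the list price."""
    overrides = overrides or {}
    return overrides.get(sku, LIST_PRICES[sku])


def total_cost(usage, overrides=None):
    """Sum DBUs times the effective price over (sku, dbus) rows."""
    return sum(
        (Decimal(dbus) * price_per_dbu(sku, overrides) for sku, dbus in usage),
        Decimal("0"),
    )


usage = [("JOBS_COMPUTE", 100), ("SQL_COMPUTE", 50)]
list_cost = total_cost(usage)
contract_cost = total_cost(usage, {"JOBS_COMPUTE": Decimal("0.10")})
print(list_cost, contract_cost)
```

SKUs without an override keep their list price, so a partial override table is enough to reflect a negotiated rate on a single SKU.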

Is there a free trial?

The Free plan is free forever with no credit card required. It includes unlimited workspaces, 1 user, and 3 months of history — enough to evaluate the platform with your full Databricks environment.

Cost data and accuracy

Why don’t my costs match the Databricks console?

This is the most common question we receive. Differences are usually caused by:

Time zone differences — LakeSentry uses UTC for all calculations
Cost model — LakeSentry uses list prices by default; your Databricks console may show negotiated pricing
Scope — LakeSentry only shows costs for connected workspaces and regions

See Cost Discrepancies for a detailed breakdown of common causes.
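
The time zone effect is easy to reproduce: the same usage records land in different daily buckets depending on the zone used to define a day. The sketch below uses made-up records and a fixed UTC-8 offset as a stand-in for a local reporting zone.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def daily_totals(records, tz):
    """Sum cost per calendar day, where the day boundary is defined in `tz`."""
    totals = defaultdict(float)
    for timestamp, cost in records:
        day = timestamp.astimezone(tz).date().isoformat()
        totals[day] += cost
    return dict(totals)


# Made-up usage records with UTC timestamps.
records = [
    (datetime(2024, 3, 1, 23, 30, tzinfo=timezone.utc), 40.0),
    (datetime(2024, 3, 2, 2, 0, tzinfo=timezone.utc), 60.0),
]

utc_view = daily_totals(records, timezone.utc)
local_view = daily_totals(records, timezone(timedelta(hours=-8)))
print(utc_view)
print(local_view)
```

In UTC the two records fall on different days, while at UTC-8 both fall on March 1, so daily totals differ even though the overall spend is identical.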

What does “unattributed” cost mean?

Unattributed cost is spend that couldn’t be assigned to a specific team, project, or owner. This happens when resources lack tags, ownership signals, or matching attribution rules.

LakeSentry uses a tiered attribution model with four confidence levels — Exact, Strong, Estimated, and Unattributed. See Cost Attribution & Confidence Tiers for how to reduce unattributed costs.

How far back does historical data go?

This depends on two factors:

Databricks system table retention — Billing data typically goes back 30–90 days. Compute and query history retention varies.
LakeSentry plan limits — Free retains 3 months, Standard retains 12 months, Pro retains unlimited history.

LakeSentry captures whatever historical data is available in system tables on the first collector run. Connect early to maximize your historical data.

Features and capabilities

How does anomaly detection work?

LakeSentry uses Z-score statistical analysis to compare recent cost values against a historical baseline. When a value deviates significantly from the norm, it’s flagged as an anomaly with evidence (baseline, recent average, cost delta, Z-score, and confidence level).

Anomaly detection requires at least 5 data points to establish a baseline and applies minimum thresholds ($10 baseline, $50 delta) to avoid noise. See Anomaly Detection for the full methodology.
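
A minimal sketch of this kind of check follows, assuming a sample standard deviation for the baseline spread and a Z-score cutoff of 3. The cutoff, the function name, and the handling of a flat baseline are assumptions; only the 5-point minimum and the $10 and $50 thresholds come from the description above.

```python
from statistics import mean, stdev

MIN_POINTS = 5        # minimum baseline size (from the text above)
MIN_BASELINE = 10.0   # dollars (from the text above)
MIN_DELTA = 50.0      # dollars (from the text above)
Z_THRESHOLD = 3.0     # assumption: the text does not state a cutoff


def detect_anomaly(baseline_costs, recent_costs, z_threshold=Z_THRESHOLD):
    """Return an evidence dict when recent spend deviates from the baseline, else None."""
    if len(baseline_costs) < MIN_POINTS or not recent_costs:
        return None  # not enough history to form a baseline

    baseline = mean(baseline_costs)
    spread = stdev(baseline_costs)  # sample standard deviation
    recent_avg = mean(recent_costs)
    delta = recent_avg - baseline

    # Noise filters: ignore tiny baselines and small dollar movements.
    if baseline < MIN_BASELINE or abs(delta) < MIN_DELTA:
        return None
    if spread == 0:
        return None  # z-score is undefined for a perfectly flat baseline

    z = delta / spread
    if abs(z) < z_threshold:
        return None
    return {
        "baseline": baseline,
        "recent_average": recent_avg,
        "cost_delta": delta,
        "z_score": z,
    }


evidence = detect_anomaly([100, 102, 98, 101, 99], [180, 190])
print(evidence)
```

With a baseline near $100 per day and recent days averaging $185, the example clears both dollar thresholds and the assumed cutoff, so it returns an evidence record. A recent average of $120 would be dropped by the $50 delta filter.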

What types of waste does LakeSentry detect?

LakeSentry identifies several types of waste:

Idle clusters — Running clusters with no active workloads
Overprovisioned workers — Clusters with consistently low utilization
Zombie endpoints — Serving endpoints with no traffic
Expensive queries — Queries consuming disproportionate resources
Data retention waste — Storage costs for unused or rarely accessed data

See Waste Detection & Insights for details on each waste type.

Can LakeSentry track ML and AI workloads?

Yes. LakeSentry tracks:

MLflow experiments — Cost attribution per experiment and run. See MLflow.
Model serving endpoints — Cost tracking and traffic-to-cost correlation. See Model Serving.
Training jobs — Cost per training run via the standard Work Units tracking.

Does LakeSentry support budgets and alerts?

Yes. You can create budgets at the workspace, organization unit, department, or team level with threshold alerts. See Budgets and Organizational Hierarchy & Budgets for details.

Troubleshooting

Where do I start if something isn’t working?

Start with the Common Issues page, which has a quick reference table mapping symptoms to solutions. For collector-specific problems, see Collector Issues. For data latency questions, see Data Freshness & Pipeline Status.

How do I contact support?

Gather relevant details (error messages, connector status, timestamps).
Contact LakeSentry support through the app — click the help icon in the bottom-right corner.
For urgent issues, email support directly with the details listed in the escalation checklist.

Next steps

What is LakeSentry — Product overview and value proposition
Quick Start Guide — Get up and running in minutes
Common Issues — Solutions to frequent problems