Storage

The Storage page tracks costs from Databricks’ predictive optimization system: the automated operations that compact, vacuum, analyze, and cluster your Delta tables. These system-initiated operations consume DBUs and accrue cost, often with little visibility into what triggered them or what they cost.

The top of the page shows headline metrics:

| Metric | What it shows |
| --- | --- |
| Total Cost | Aggregate estimated cost of predictive optimization operations for the selected period |
| Operations | Total number of optimization operations performed |
| DBUs Used | Total DBU consumption across all operations |
| Failed | Count of operations that did not complete successfully |
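
These headline figures can be approximated straight from the underlying system table. Below is a minimal PySpark sketch, assuming the documented columns (`operation_status`, `usage_quantity`) and a placeholder DBU rate; it is an illustration, not LakeSentry's actual computation.

```python
from pyspark.sql import functions as F

# `spark` is the ambient SparkSession in a Databricks notebook.
# Assumptions: usage_quantity is in DBUs, operation_status is
# 'SUCCESSFUL' or 'FAILED', and $0.50/DBU is a placeholder rate.
DBU_RATE_USD = 0.50

ops = spark.table("system.storage.predictive_optimization_operations_history")

summary = ops.agg(
    F.count("*").alias("operations"),
    F.sum("usage_quantity").alias("dbus_used"),
    F.sum(F.when(F.col("operation_status") == "FAILED", 1).otherwise(0)).alias("failed"),
).withColumn("total_cost_usd", F.col("dbus_used") * DBU_RATE_USD)

summary.show()
```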

Two bar charts provide a visual breakdown:

  • Operations by Type — How many operations of each type were performed
  • Cost by Operation Type — How much each operation type cost

The Overview tab shows a daily operations table with the following columns:

| Column | What it shows |
| --- | --- |
| Date | Day the operations were performed |
| Workspace | Which Databricks workspace |
| Operation | Operation type (COMPACTION, VACUUM, ANALYZE, CLUSTERING) |
| Count | Number of operations on that day |
| Success | Count of successful operations |
| Failed | Count of failed operations |
| DBUs | Total DBU consumption |
| Est. Cost | Estimated cost in USD |

The table is sorted by estimated cost (descending) by default.
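
A rough reconstruction of this rollup as a Spark SQL query run from Python; the column names and the $0.50/DBU rate are assumptions, not LakeSentry's actual implementation:

```python
# `spark` is the ambient Databricks notebook session.
daily = spark.sql("""
    SELECT
        DATE(start_time)  AS date,
        workspace_id      AS workspace,
        operation_type    AS operation,
        COUNT(*)          AS op_count,
        SUM(CASE WHEN operation_status = 'SUCCESSFUL' THEN 1 ELSE 0 END) AS success,
        SUM(CASE WHEN operation_status = 'FAILED' THEN 1 ELSE 0 END)     AS failed,
        SUM(usage_quantity)        AS dbus,
        SUM(usage_quantity) * 0.50 AS est_cost_usd  -- placeholder rate
    FROM system.storage.predictive_optimization_operations_history
    GROUP BY DATE(start_time), workspace_id, operation_type
    ORDER BY est_cost_usd DESC
""")
daily.show()
```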

The By Table tab shows optimization activity grouped by individual table:

| Column | What it shows |
| --- | --- |
| Date | Day the operations were performed |
| Table | Fully qualified table name (catalog.schema.table) |
| Workspace | Which Databricks workspace |
| Total Ops | Total number of operations on this table |
| Compact | Number of compaction operations |
| Vacuum | Number of vacuum operations |
| Analyze | Number of analyze operations |
| Cluster | Number of clustering operations |
| Est. Cost | Estimated cost in USD |

This view helps identify which tables are most expensive to maintain through predictive optimization.
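
A sketch of the same per-table pivot using the DataFrame API, under the schema and rate assumptions noted above:

```python
from pyspark.sql import functions as F

# `spark` is the ambient Databricks session; column names assume the
# documented predictive optimization history schema.
ops = spark.table("system.storage.predictive_optimization_operations_history")

def op_count(op_type: str):
    """Conditional count of rows matching one operation type."""
    return F.sum(F.when(F.col("operation_type") == op_type, 1).otherwise(0))

by_table = (
    ops.withColumn("full_table_name",
                   F.concat_ws(".", "catalog_name", "schema_name", "table_name"))
    .groupBy(F.to_date("start_time").alias("date"), "full_table_name", "workspace_id")
    .agg(
        F.count("*").alias("total_ops"),
        op_count("COMPACTION").alias("compact"),
        op_count("VACUUM").alias("vacuum"),
        op_count("ANALYZE").alias("analyze"),
        op_count("CLUSTERING").alias("cluster"),
        (F.sum("usage_quantity") * 0.50).alias("est_cost_usd"),  # placeholder rate
    )
    .orderBy(F.desc("est_cost_usd"))
)
by_table.show(truncate=False)
```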

LakeSentry tracks operations performed by Databricks’ predictive optimization system:

| Operation type | What it does |
| --- | --- |
| COMPACTION | Compacts small files into larger ones for better read performance, reducing the number of files scanned during queries. |
| VACUUM | Removes old file versions that are no longer needed, reclaiming storage left behind by Delta operations. |
| ANALYZE | Collects table statistics to improve query planning and keep statistics from going stale. |
| CLUSTERING | Re-clusters data by frequently filtered columns, improving data skipping and reducing bytes scanned. Includes AUTO_CLUSTERING_COLUMN_SELECTION operations. |

Cost is estimated by multiplying DBU consumption by a standard DBU rate. The raw data comes from the Databricks `system.storage.predictive_optimization_operations_history` system table.
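
As a hypothetical worked example of that estimation (the rate here is illustrative, not an official Databricks price):

```python
# 12.5 DBUs consumed by an operation, at an assumed flat $0.50/DBU.
dbus_used = 12.5
dbu_rate_usd = 0.50  # placeholder; real rates vary by SKU and cloud
est_cost_usd = dbus_used * dbu_rate_usd
print(f"${est_cost_usd:.2f}")  # -> $6.25
```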

The Storage page respects the global workspace filter and time range selector. Operations can also be filtered by operation type via the API.
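
As an illustration only, an operation-type filter might be passed as a query parameter, as in the request below. The endpoint and parameter names are hypothetical; consult the actual LakeSentry API reference.

```python
import requests

# Hypothetical endpoint and query parameters, for illustration only.
resp = requests.get(
    "https://lakesentry.example.com/api/storage/operations",
    params={"operation_type": "VACUUM"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```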

Databricks storage costs appear in billing data under several usage types:

| Usage type | What it covers |
| --- | --- |
| STORAGE_SPACE | Cloud storage for Delta tables and volumes |
| NETWORK_BYTE | Network transfer costs for cross-region reads |

These usage types are visible in the Cost Explorer compute types breakdown, not on the Storage page itself. The Storage page focuses specifically on predictive optimization operation costs (which are DBU-based), not on the underlying cloud storage charges.
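
If you want to inspect those charges directly, they can be queried from the billing system table. The sketch below assumes `system.billing.usage` exposes a `usage_type` column carrying these values; verify against your schema version.

```python
# `spark` is the ambient Databricks session. Assumes system.billing.usage
# has a usage_type column with values like 'STORAGE_SPACE' and 'NETWORK_BYTE'.
storage_usage = spark.sql("""
    SELECT usage_date,
           usage_type,
           usage_unit,
           SUM(usage_quantity) AS quantity
    FROM system.billing.usage
    WHERE usage_type IN ('STORAGE_SPACE', 'NETWORK_BYTE')
    GROUP BY usage_date, usage_type, usage_unit
    ORDER BY usage_date DESC
""")
storage_usage.show()
```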

Table-level cost attribution (which tables cost the most to write to, which are unused) is available in the Cost Explorer Tables tab, not on this page. That view shows cost attributed through work units that write to each table, along with unused table detection.

Identifying expensive optimization operations

  1. Check the Cost by Operation Type chart to see which operation types drive the most spend.
  2. Switch to the By Table tab and sort by Est. Cost (descending).
  3. Look for tables with disproportionately high optimization costs relative to their value (a query sketch follows this list).
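
Outside the UI, the same ranking can be approximated with a query like the sketch below (same hedged schema and placeholder rate as earlier):

```python
# `spark` is the ambient Databricks session.
# Rank tables by estimated predictive optimization cost.
expensive = spark.sql("""
    SELECT concat_ws('.', catalog_name, schema_name, table_name) AS full_table_name,
           SUM(usage_quantity)        AS dbus,
           SUM(usage_quantity) * 0.50 AS est_cost_usd  -- placeholder rate
    FROM system.storage.predictive_optimization_operations_history
    GROUP BY 1
    ORDER BY est_cost_usd DESC
    LIMIT 20
""")
expensive.show(truncate=False)
```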

Investigating failed operations

  1. Check the Failed metric in the page header. Any non-zero value warrants investigation.
  2. In the Overview tab, look for rows with high failed counts.
  3. Failed operations may indicate table configuration issues or resource constraints (a query sketch follows this list).
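
To pull the failing rows themselves for inspection, a hedged sketch (the 'FAILED' status value is an assumption about the system table):

```python
# `spark` is the ambient Databricks session.
failures = spark.sql("""
    SELECT start_time,
           workspace_id,
           operation_type,
           concat_ws('.', catalog_name, schema_name, table_name) AS full_table_name
    FROM system.storage.predictive_optimization_operations_history
    WHERE operation_status = 'FAILED'
    ORDER BY start_time DESC
    LIMIT 50
""")
failures.show(truncate=False)
```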

Reviewing optimization frequency

  1. Switch to the By Table tab.
  2. Look for tables with unusually high operation counts; frequent compaction may indicate a write pattern that produces many small files (a counting sketch follows this list).
  3. Review CLUSTERING operations on frequently queried tables to verify they are improving query performance (cross-reference with SQL Analysis).
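
To put numbers behind "unusually high", count operations per table per day; same schema assumptions as above:

```python
# `spark` is the ambient Databricks session. Spikes in COMPACTION
# counts often point to small-file-heavy write patterns.
frequency = spark.sql("""
    SELECT DATE(start_time) AS date,
           concat_ws('.', catalog_name, schema_name, table_name) AS full_table_name,
           operation_type,
           COUNT(*) AS ops
    FROM system.storage.predictive_optimization_operations_history
    GROUP BY 1, 2, 3
    ORDER BY ops DESC
""")
frequency.show(truncate=False)
```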