Tag Governance

The Tag Governance page helps you enforce consistent tagging across your Databricks resources. Tags are key-value labels attached to clusters, warehouses, jobs, and other resources — they’re the foundation for cost attribution, chargeback, and organizational reporting. When tags are inconsistent or missing, attribution suffers and cost reports become unreliable.

Tags serve multiple purposes in LakeSentry:

  • Cost attribution — Pattern rules match resources by tag values to assign costs to teams
  • Filtering — Tags are available as filters throughout the Cost Explorer, Insights, and other pages
  • Reporting — Tags enable grouping costs by project, environment, cost center, or any custom dimension
  • Compliance — Tag policies ensure resources are properly labeled for governance

Without consistent tags, a significant portion of your costs may end up as “unattributed” — see Cost Attribution & Confidence Tiers for how this affects reporting quality.

The Tag Governance page is organized into three tabs: Quality Dashboard, Policies, and Tag Hygiene.

The page header shows three summary stats:

| Metric | What it shows |
| --- | --- |
| Violations | Total count of tag violations across all workspaces |
| Cost at Risk | Total 30-day cost of resources with at least one violation |
| Duplicate Clusters | Number of unresolved tag key clusters (similar keys that may need consolidation) |

The Quality Dashboard tab shows a summary of tag violations across all workspaces, with five stat cards:

| Stat | What it shows |
| --- | --- |
| Quality Score | Percentage of resources with zero violations |
| Missing Required | Count of missing-required-tag violations |
| Invalid Value | Count of invalid-value violations |
| Misspelled Key | Count of misspelled-key violations |
| Orphaned Tag | Count of orphaned-tag violations |

Below the stats, a violations table lists individual violations with filtering by violation type and resource type:

| Column | What it shows |
| --- | --- |
| Type | Violation type badge (Missing Required, Invalid Value, Misspelled Key, Orphaned Tag) |
| Resource | Resource name and type |
| Tag | The tag key (and value, if applicable) |
| 30-Day Cost | Cost of the violating resource, split by DBU and cloud costs |

The table defaults to sorting by cost descending, so the most expensive violations appear first. Use the Refresh button to trigger an on-demand quality recomputation.
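
The relationship between the Quality Score, Cost at Risk, and the table's default ordering can be sketched roughly as follows. This is an illustration only: the `Resource` record, its field names, and the sample data are assumptions, not LakeSentry's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical record; field names are illustrative, not LakeSentry's schema."""
    name: str
    cost_30d_usd: float
    violations: list[str] = field(default_factory=list)


def quality_score(resources: list[Resource]) -> float:
    """Percentage of resources with zero violations."""
    if not resources:
        return 100.0  # assumption: an empty inventory counts as fully compliant
    clean = sum(1 for r in resources if not r.violations)
    return 100.0 * clean / len(resources)


def cost_at_risk(resources: list[Resource]) -> float:
    """Total 30-day cost of resources with at least one violation."""
    return sum(r.cost_30d_usd for r in resources if r.violations)


def by_cost_descending(resources: list[Resource]) -> list[Resource]:
    """Violating resources, most expensive first (the table's default order)."""
    violating = [r for r in resources if r.violations]
    return sorted(violating, key=lambda r: r.cost_30d_usd, reverse=True)


# Made-up sample data for illustration.
resources = [
    Resource("etl-cluster", 1200.0, ["missing_required"]),
    Resource("bi-warehouse", 800.0),
    Resource("adhoc-cluster", 300.0, ["invalid_value", "orphaned_tag"]),
    Resource("nightly-job", 100.0),
]
print(quality_score(resources))  # 50.0
print(cost_at_risk(resources))   # 1500.0
```

In this sketch a resource with several violations contributes its cost to Cost at Risk only once, which matches the "at least one violation" wording above.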

Tag policies define which tags should be present on which resource types. They help you enforce organizational tagging standards.

  1. Click Create Policy on the Policies tab.
  2. Name — A human-readable name for the policy (e.g., “Production Cluster Tags”).
  3. Description (optional) — Details about the policy’s purpose.
  4. Resource type — Which resource type the policy applies to: cluster, warehouse, job, pipeline, endpoint, app, or all resource types.
  5. Required tag keys — Tag keys that must be present on matching resources.
  6. Allowed values (optional) — Restrict specific tag keys to a set of permitted values.
  7. Canonical key mappings (optional) — Map common misspellings or variant tag keys to their canonical form.
  8. Cost threshold (optional) — Minimum 30-day cost in USD; resources below this threshold are excluded from violation checks.

Workspace scoping is available via the API but not in the UI form — policies created through the UI apply to all workspaces.
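
To make the form fields concrete, here is a minimal sketch of how a policy could be modeled in memory. The `TagPolicy` class, its field names, the `"all"` sentinel, and the `applies_to` helper are all hypothetical; they mirror the fields listed above but do not represent LakeSentry's API or storage format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TagPolicy:
    """Hypothetical in-memory shape of a policy; names are illustrative."""
    name: str
    resource_type: str = "all"  # cluster, warehouse, job, pipeline, endpoint, app, or all
    description: str = ""
    required_keys: list[str] = field(default_factory=list)
    allowed_values: dict[str, set[str]] = field(default_factory=dict)
    canonical_keys: dict[str, str] = field(default_factory=dict)  # variant -> canonical
    cost_threshold_usd: Optional[float] = None
    active: bool = True

    def applies_to(self, resource_type: str, cost_30d_usd: float) -> bool:
        """Whether a resource would be checked against this policy."""
        if not self.active:
            return False  # assumption: inactive policies are not evaluated
        if self.resource_type not in ("all", resource_type):
            return False
        if self.cost_threshold_usd is not None and cost_30d_usd < self.cost_threshold_usd:
            return False  # below the threshold: excluded from violation checks
        return True


policy = TagPolicy(
    name="Production Cluster Tags",
    resource_type="cluster",
    required_keys=["team", "environment"],
    allowed_values={"environment": {"prod", "staging", "dev"}},
    canonical_keys={"Team": "team", "env": "environment"},
    cost_threshold_usd=50.0,
)
print(policy.applies_to("cluster", 120.0))    # True
print(policy.applies_to("cluster", 10.0))     # False
print(policy.applies_to("warehouse", 120.0))  # False
```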

Each policy is displayed as a card showing its name, description, resource type, the number of required keys, constrained keys, and canonical key mappings. Each card includes:

  • An active/inactive toggle to enable or disable the policy without deleting it
  • Edit and Delete buttons for managing the policy
  • The cost threshold (if configured)

Deleting a policy also removes all associated violations.

The quality worker detects four types of tag violations:

| Violation type | What it means |
| --- | --- |
| Missing required | A resource lacks a tag key that a policy lists as required |
| Invalid value | A tag value is not in the policy's allowed values list for that key |
| Misspelled key | A tag key is similar to, but not identical to, a known key from policies or attribution rules, detected via Levenshtein distance |
| Orphaned tag | A tag key is not referenced in any active policy or attribution rule |

The quality worker runs daily at 2:00 AM UTC and can also be triggered on-demand from the Quality Dashboard.
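
The four checks can be sketched roughly as below. This is not the quality worker's implementation: the function names, the `max_distance` threshold of 2, and the choice to treat "misspelled" and "orphaned" as mutually exclusive are assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution
            ))
        prev = curr
    return prev[-1]


def find_violations(tags, required_keys, allowed_values, known_keys, max_distance=2):
    """Return (violation_type, tag_key) pairs for one resource's tags."""
    violations = []
    for key in required_keys:
        if key not in tags:
            violations.append(("missing_required", key))
    for key, value in tags.items():
        if key in allowed_values and value not in allowed_values[key]:
            violations.append(("invalid_value", key))
        if key in known_keys:
            continue
        # Unknown key: close to a known key means misspelled, otherwise orphaned.
        # Assumption: the two categories are treated as mutually exclusive here.
        if any(levenshtein(key, k) <= max_distance for k in known_keys):
            violations.append(("misspelled_key", key))
        else:
            violations.append(("orphaned_tag", key))
    return violations


# Made-up sample input for illustration.
tags = {"taem": "data-eng", "environment": "prd", "owner_email": "a@example.com"}
result = find_violations(
    tags,
    required_keys=["team", "environment"],
    allowed_values={"environment": {"prod", "staging", "dev"}},
    known_keys={"team", "environment", "cost_center"},
)
for violation in result:
    print(violation)
```

Here `taem` is two edits away from `team`, so it is reported as misspelled, and the resource is also reported as missing the required `team` key.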

The Tag Hygiene tab groups similar tag keys into clusters using three-layer matching: exact normalized match (case-insensitive, ignoring hyphens and underscores), Levenshtein distance, and token Jaccard similarity. For example, Team, team, and TEAM would be grouped into a single cluster.
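
A minimal sketch of three-layer matching is shown below. The thresholds (`max_distance=1`, `min_jaccard=0.5`), the tokenization rule, and the greedy grouping strategy are assumptions for illustration; the document does not specify the values or the clustering algorithm LakeSentry uses.

```python
import re


def normalize(key: str) -> str:
    """Layer 1: lowercase and drop hyphens and underscores."""
    return key.lower().replace("-", "").replace("_", "")


def levenshtein(a: str, b: str) -> int:
    """Layer 2: edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def tokens(key: str) -> set[str]:
    """Split on hyphens, underscores, whitespace, and camelCase boundaries."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", key)
    return {t.lower() for t in re.split(r"[-_\s]+", spaced) if t}


def jaccard(a: set[str], b: set[str]) -> float:
    """Layer 3: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar(a: str, b: str, max_distance: int = 1, min_jaccard: float = 0.5) -> bool:
    na, nb = normalize(a), normalize(b)
    if na == nb:
        return True
    if levenshtein(na, nb) <= max_distance:
        return True
    return jaccard(tokens(a), tokens(b)) >= min_jaccard


def cluster_keys(keys: list[str]) -> list[list[str]]:
    """Greedy grouping: join the first cluster containing a similar member."""
    clusters: list[list[str]] = []
    for key in keys:
        for cluster in clusters:
            if any(similar(key, member) for member in cluster):
                cluster.append(key)
                break
        else:
            clusters.append([key])
    return clusters


keys = ["Team", "team", "TEAM", "cost_center", "cost-center",
        "cost_centers", "center_cost", "environment"]
clusters = cluster_keys(keys)
for cluster in clusters:
    print(cluster)
```

With these sample keys, `cost-center` joins `cost_center` through the normalized match, `cost_centers` through edit distance, and `center_cost` through token overlap, which is why all three layers are useful together.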

Each cluster card shows the variant keys with their resource counts and costs. You can resolve a cluster by selecting one of the variant keys as canonical and clicking Set as canonical. Resolving a cluster creates a tag policy with canonical key mappings and links it to the cluster.
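
Conceptually, resolving a cluster produces a variant-to-canonical mapping. The sketch below shows one way to picture that mapping and how it could be read when evaluating a resource's tags. Both helper functions are hypothetical, and whether LakeSentry rewrites tag keys or only interprets them this way is not stated here.

```python
def canonical_mappings(variants: list[str], canonical: str) -> dict[str, str]:
    """Map every non-canonical variant in a cluster to the chosen canonical key."""
    if canonical not in variants:
        raise ValueError(f"{canonical!r} is not one of the cluster's variants")
    return {v: canonical for v in variants if v != canonical}


def apply_mappings(tags: dict[str, str], mappings: dict[str, str]) -> dict[str, str]:
    """View a resource's tags with variant keys replaced by canonical ones.

    If a resource carries several variants of the same key, the last one
    encountered wins in this sketch.
    """
    return {mappings.get(key, key): value for key, value in tags.items()}


mappings = canonical_mappings(["Team", "team", "TEAM"], canonical="team")
print(mappings)  # {'Team': 'team', 'TEAM': 'team'}
print(apply_mappings({"TEAM": "data-eng", "environment": "prod"}, mappings))
# {'team': 'data-eng', 'environment': 'prod'}
```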

The tab shows summary stats for unresolved clusters: how many exist, how many resources are affected, and the total cost impact. You can toggle between viewing only unresolved clusters or all clusters (including resolved ones).

These findings help you clean up tag sprawl and improve the reliability of tag-based attribution.

To improve cost attribution using existing tags:

  1. Look at the Cost Explorer Attribution tab to see how much spend is unattributed.
  2. Go to Attribution Rules and open the Tags tab to review well-populated tags (like team or cost_center).
  3. Map tag values to teams, departments, or org units using the tag-to-team mapping feature on that tab.
  4. Monitor the attribution coverage improvement over the following days.
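
The effect of a tag-to-team mapping on attribution coverage can be illustrated with a small sketch. The resource dictionaries, the `"unattributed"` bucket name, and both functions are assumptions for illustration and do not reflect how LakeSentry stores or computes attribution.

```python
def attribute_costs(resources, tag_key, value_to_team):
    """Sum 30-day cost per team, based on the value of one tag key."""
    totals = {}
    for resource in resources:
        value = resource["tags"].get(tag_key)
        team = value_to_team.get(value, "unattributed")
        totals[team] = totals.get(team, 0.0) + resource["cost_30d_usd"]
    return totals


def coverage(totals):
    """Percentage of total cost that landed on a named team."""
    total = sum(totals.values())
    attributed = total - totals.get("unattributed", 0.0)
    return 100.0 * attributed / total if total else 0.0


# Made-up sample data for illustration.
resources = [
    {"name": "etl-cluster", "tags": {"team": "data-eng"}, "cost_30d_usd": 600.0},
    {"name": "bi-warehouse", "tags": {"team": "analytics"}, "cost_30d_usd": 300.0},
    {"name": "adhoc-cluster", "tags": {}, "cost_30d_usd": 100.0},
]
value_to_team = {"data-eng": "Data Engineering", "analytics": "Analytics"}

totals = attribute_costs(resources, "team", value_to_team)
print(totals)            # {'Data Engineering': 600.0, 'Analytics': 300.0, 'unattributed': 100.0}
print(coverage(totals))  # 90.0
```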

To enforce tagging standards:

  1. Create tag policies on the Policies tab for required tags (e.g., team on all clusters, environment on all warehouses).
  2. Review the Quality Dashboard to identify violations.
  3. The violations table defaults to sorting by cost descending — prioritize the most expensive violations for remediation.
  4. Work with resource owners to add missing tags or fix invalid values.
  5. Monitor the quality score over time.

To clean up inconsistent tag keys:

  1. Open the Tag Hygiene tab to review tag key clusters with casing and naming inconsistencies.
  2. For each cluster, select the preferred canonical key and click Set as canonical to create a policy with canonical key mappings.
  3. Standardize on a single casing convention (lowercase with hyphens is common).
  4. Review orphaned tag violations on the Quality Dashboard and update resources to use standard tag keys.
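
If you standardize on lowercase with hyphens, a small helper like the one below can convert existing keys when you retag resources. It is a sketch under the assumption that keys vary only in casing, separators, and camelCase; the function name is hypothetical and not part of LakeSentry.

```python
import re


def to_kebab_case(key: str) -> str:
    """Convert a tag key to lowercase words separated by single hyphens."""
    # Insert a hyphen at camelCase boundaries, e.g. costCenter -> cost-Center.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", key.strip())
    # Collapse runs of spaces, underscores, and hyphens into one hyphen.
    return re.sub(r"[\s_\-]+", "-", spaced).strip("-").lower()


for key in ["Cost_Center", "costCenter", "TEAM", " Business Unit ", "cost__center"]:
    print(key.strip(), "->", to_kebab_case(key))
```

Running it against the variant keys shown on a cluster card is a quick way to preview what the canonical form would look like before you choose one.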