Tag Governance
The Tag Governance page helps you enforce consistent tagging across your Databricks resources. Tags are key-value labels attached to clusters, warehouses, jobs, and other resources — they’re the foundation for cost attribution, chargeback, and organizational reporting. When tags are inconsistent or missing, attribution suffers and cost reports become unreliable.
Why tags matter
Tags serve multiple purposes in LakeSentry:
- Cost attribution — Pattern rules match resources by tag values to assign costs to teams
- Filtering — Tags are available as filters throughout the Cost Explorer, Insights, and other pages
- Reporting — Tags enable grouping costs by project, environment, cost center, or any custom dimension
- Compliance — Tag policies ensure resources are properly labeled for governance
Without consistent tags, a significant portion of your costs may end up as “unattributed” — see Cost Attribution & Confidence Tiers for how this affects reporting quality.
Page overview
The Tag Governance page is organized into three tabs: Quality Dashboard, Policies, and Tag Hygiene.
The page header shows three summary stats:
| Metric | What it shows |
|---|---|
| Violations | Total count of tag violations across all workspaces |
| Cost at Risk | Total 30-day cost of resources with at least one violation |
| Duplicate Clusters | Number of unresolved tag key clusters (similar keys that may need consolidation) |
Quality Dashboard
The Quality Dashboard tab shows a summary of tag violations across all workspaces, with five stat cards:
| Stat | What it shows |
|---|---|
| Quality Score | Percentage of resources with zero violations |
| Missing Required | Count of missing-required-tag violations |
| Invalid Value | Count of invalid-value violations |
| Misspelled Key | Count of misspelled-key violations |
| Orphaned Tag | Count of orphaned-tag violations |
Below the stats, a violations table lists individual violations with filtering by violation type and resource type:
| Column | What it shows |
|---|---|
| Type | Violation type badge (Missing Required, Invalid Value, Misspelled Key, Orphaned Tag) |
| Resource | Resource name and type |
| Tag | The tag key (and value, if applicable) |
| 30-Day Cost | Cost of the violating resource, split by DBU and cloud costs |
The table defaults to sorting by cost descending, so the most expensive violations appear first. Use the Refresh button to trigger an on-demand quality recomputation.
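The arithmetic behind the Quality Score, Cost at Risk, and the default sort order can be sketched in a few lines. This is a minimal illustration, not LakeSentry's implementation: the `Resource` record, its field names, and the sample figures are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    # Hypothetical record; field names are illustrative, not LakeSentry's schema.
    name: str
    cost_30d: float   # 30-day cost in USD (DBU plus cloud)
    violations: int   # number of open tag violations on this resource


def quality_score(resources):
    """Percentage (0-100) of resources with zero violations."""
    if not resources:
        return 100.0  # assumption: an empty inventory has nothing to violate
    clean = sum(1 for r in resources if r.violations == 0)
    return 100.0 * clean / len(resources)


def cost_at_risk(resources):
    """Total 30-day cost of resources with at least one violation."""
    return sum(r.cost_30d for r in resources if r.violations > 0)


def by_cost_descending(resources):
    """Resources with violations, most expensive first."""
    flagged = [r for r in resources if r.violations > 0]
    return sorted(flagged, key=lambda r: r.cost_30d, reverse=True)


# Invented sample data.
resources = [
    Resource("etl-cluster", 1200.0, 2),
    Resource("bi-warehouse", 800.0, 0),
    Resource("adhoc-cluster", 300.0, 1),
    Resource("nightly-job", 100.0, 0),
]

print(quality_score(resources))
print(cost_at_risk(resources))
print([r.name for r in by_cost_descending(resources)])
```

With this sample data, two of four resources are clean, so the score is 50%, and the two violating resources together account for 1,500 USD of 30-day cost.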
Tag policies
Tag policies define which tags should be present on which resource types. They help you enforce organizational tagging standards.
Creating a policy
Click Create Policy on the Policies tab, then fill in the form fields:
- Name — A human-readable name for the policy (e.g., “Production Cluster Tags”).
- Description (optional) — Details about the policy’s purpose.
- Resource type — Which resource type the policy applies to: cluster, warehouse, job, pipeline, endpoint, app, or all resource types.
- Required tag keys — Tag keys that must be present on matching resources.
- Allowed values (optional) — Restrict specific tag keys to a set of permitted values.
- Canonical key mappings (optional) — Map common misspellings or variant tag keys to their canonical form.
- Cost threshold (optional) — Minimum 30-day cost in USD; resources below this threshold are excluded from violation checks.
Workspace scoping is available via the API but not in the UI form — policies created through the UI apply to all workspaces.
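To make the form fields concrete, the sketch below models a policy as a plain data structure and shows how the resource type and cost threshold could decide whether a resource is checked at all. The `TagPolicy` class, its attribute names, and the `applies_to` method are hypothetical and only mirror the fields described above; they are not the LakeSentry API schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TagPolicy:
    # Hypothetical shape mirroring the form fields; not the real API schema.
    name: str
    resource_type: str                                   # "cluster", "warehouse", ..., or "all"
    required_keys: List[str] = field(default_factory=list)
    allowed_values: Dict[str, List[str]] = field(default_factory=dict)  # key -> permitted values
    canonical_keys: Dict[str, str] = field(default_factory=dict)        # variant key -> canonical key
    cost_threshold: Optional[float] = None               # minimum 30-day cost in USD
    description: str = ""
    active: bool = True

    def applies_to(self, resource_type: str, cost_30d: float) -> bool:
        """True if a resource of this type and cost is subject to the policy."""
        if not self.active:
            return False
        if self.resource_type not in ("all", resource_type):
            return False
        # Resources below the threshold are excluded from violation checks.
        if self.cost_threshold is not None and cost_30d < self.cost_threshold:
            return False
        return True


policy = TagPolicy(
    name="Production Cluster Tags",
    resource_type="cluster",
    required_keys=["team", "environment"],
    allowed_values={"environment": ["prod", "staging", "dev"]},
    canonical_keys={"Team": "team"},
    cost_threshold=50.0,
)

print(policy.applies_to("cluster", 120.0))    # matching type, above threshold
print(policy.applies_to("cluster", 10.0))     # below the cost threshold
print(policy.applies_to("warehouse", 500.0))  # different resource type
```

Treating a cost exactly equal to the threshold as in scope follows from "resources below this threshold are excluded"; the rest of the behavior is an assumption of the sketch.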
Policy cards
Each policy is displayed as a card showing its name, description, resource type, and the number of required keys, constrained keys, and canonical key mappings. Each card also includes:
- An active/inactive toggle to enable or disable the policy without deleting it
- Edit and Delete buttons for managing the policy
- The cost threshold (if configured)
Deleting a policy also removes all associated violations.
Violation types
The quality worker detects four types of tag violations:
| Violation type | What it means |
|---|---|
| Missing required | A resource lacks a tag key that a policy lists as required |
| Invalid value | A tag value is not in the policy’s allowed values list for that key |
| Misspelled key | A tag key is close to, but does not exactly match, a known key from policies or attribution rules; closeness is measured by Levenshtein distance |
| Orphaned tag | A tag key is not referenced in any active policy or attribution rule |
The quality worker runs daily at 2:00 AM UTC and can also be triggered on-demand from the Quality Dashboard.
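The four checks can be illustrated with a small function that classifies one resource's tags. This is a sketch under stated assumptions, not the quality worker's code: the function names, the `max_distance` cutoff of 2, and the choice to report an unknown key as misspelled rather than orphaned when it is close to a known key are all illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]


def find_violations(tags, required_keys, allowed_values, known_keys, max_distance=2):
    """Return (violation_type, key) pairs for one resource's tags.

    known_keys stands in for every key referenced by an active policy or
    attribution rule. max_distance is an assumed cutoff, not a documented one.
    """
    violations = []
    for key in required_keys:
        if key not in tags:
            violations.append(("missing_required", key))
    for key, value in tags.items():
        if key in allowed_values and value not in allowed_values[key]:
            violations.append(("invalid_value", key))
        if key in known_keys:
            continue
        # Unknown key: either a near miss of a known key, or unrelated to all of them.
        near = any(levenshtein(key, k) <= max_distance for k in known_keys)
        violations.append(("misspelled_key" if near else "orphaned_tag", key))
    return violations


# Invented sample data.
tags = {"tema": "data-eng", "environment": "production", "owner_slack": "#data"}
required_keys = ["team", "environment"]
allowed_values = {"environment": {"prod", "staging", "dev"}}
known_keys = {"team", "environment", "cost_center"}

for violation in find_violations(tags, required_keys, allowed_values, known_keys):
    print(violation)
```

On the sample resource this reports one violation of each type: `team` is missing, `tema` is a near miss of `team`, `production` is not an allowed value for `environment`, and `owner_slack` is not referenced anywhere.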
Tag Hygiene
The Tag Hygiene tab groups similar tag keys into clusters using three-layer matching: exact normalized match (case-insensitive, ignoring hyphens and underscores), Levenshtein distance, and token Jaccard similarity. For example, Team, team, and TEAM would be grouped into a single cluster.
Each cluster card shows the variant keys with their resource counts and costs. You can resolve a cluster by selecting one of the variant keys as canonical and clicking Set as canonical. Resolving a cluster creates a tag policy with canonical key mappings and links it to the cluster.
The tab shows summary stats for unresolved clusters: how many exist, how many resources are affected, and the total cost impact. You can toggle between viewing only unresolved clusters or all clusters (including resolved ones).
These findings help you clean up tag sprawl and improve the reliability of tag-based attribution.
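The three matching layers described in this section can be sketched as follows. The thresholds (`max_distance=1`, `min_jaccard=0.5`), the tokenization rule, the decision to run Levenshtein on normalized keys, and the single-link grouping are assumptions chosen for illustration; the source does not state the values or the grouping strategy LakeSentry uses.

```python
import re


def normalize(key):
    """Layer 1: lowercase and drop hyphens and underscores."""
    return key.lower().replace("-", "").replace("_", "")


def levenshtein(a, b):
    """Layer 2 helper: dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def tokens(key):
    """Layer 3 helper: split on hyphens, underscores, and camelCase boundaries."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", key)
    return {t.lower() for t in re.split(r"[-_\s]+", spaced) if t}


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar(k1, k2, max_distance=1, min_jaccard=0.5):
    """Apply the three layers in order; thresholds are assumed values."""
    n1, n2 = normalize(k1), normalize(k2)
    if n1 == n2:
        return True
    if levenshtein(n1, n2) <= max_distance:
        return True
    return jaccard(tokens(k1), tokens(k2)) >= min_jaccard


def cluster_keys(keys):
    """Group keys so that similar keys share a cluster (single-link, union-find)."""
    keys = list(keys)
    parent = {k: k for k in keys}

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    for i, a in enumerate(keys):
        for b in keys[i + 1:]:
            if similar(a, b):
                parent[find(a)] = find(b)
    groups = {}
    for k in keys:
        groups.setdefault(find(k), []).append(k)
    # Keys with no similar partner are not reported as clusters.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


keys = ["Team", "team", "TEAM", "cost_center", "cost-center", "center_cost",
        "environment", "enviroment", "owner"]
for cluster in cluster_keys(keys):
    print(cluster)
```

On this sample, the three casings of `team` match at the first layer, `enviroment` joins `environment` at the second, and `center_cost` joins the `cost_center` variants at the third because their token sets are identical. `owner` has no similar partner and is left out.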
Common workflows
Improving attribution coverage
- Look at the Cost Explorer Attribution tab to see how much spend is unattributed.
- Go to Attribution Rules and open the Tags tab to review well-populated tags (like `team` or `cost_center`).
- Map tag values to teams, departments, or org units using the tag-to-team mapping feature on that tab.
- Monitor the attribution coverage improvement over the following days.
Enforcing tagging standards
- Create tag policies on the Policies tab for required tags (e.g., `team` on all clusters, `environment` on all warehouses).
- Review the Quality Dashboard to identify violations.
- The violations table defaults to sorting by cost descending — prioritize the most expensive violations for remediation.
- Work with resource owners to add missing tags or fix invalid values.
- Monitor the quality score over time.
Cleaning up tag sprawl
- Open the Tag Hygiene tab to review tag key clusters with casing and naming inconsistencies.
- For each cluster, select the preferred canonical key and click Set as canonical to create a policy with canonical key mappings.
- Standardize on a single casing convention (lowercase with hyphens is common).
- Review orphaned tag violations on the Quality Dashboard and update resources to use standard tag keys.
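Once canonical keys are chosen, updating a resource comes down to renaming its variant keys. The helper below is a hypothetical sketch of that rename on a plain tag dictionary; it does not call any Databricks or LakeSentry API, and keeping the existing canonical value when a variant disagrees with it is an assumed conflict rule.

```python
def canonicalize_tags(tags, canonical_keys):
    """Rename variant keys to their canonical form.

    Returns (new_tags, conflicts). A conflict is recorded when a variant maps
    to a canonical key that already holds a different value; the existing
    value is kept in that case (an assumed rule for this sketch).
    """
    new_tags = {}
    conflicts = []
    # Copy keys that have no mapping first so existing canonical keys take precedence.
    for key, value in tags.items():
        if key not in canonical_keys:
            new_tags[key] = value
    for key, value in tags.items():
        if key in canonical_keys:
            target = canonical_keys[key]
            if target in new_tags and new_tags[target] != value:
                conflicts.append((key, target))
            else:
                new_tags[target] = value
    return new_tags, conflicts


# Invented mapping and sample tags.
canonical_keys = {"Team": "team", "TEAM": "team", "cost-center": "cost_center"}

clean, conflicts = canonicalize_tags(
    {"Team": "data-eng", "cost-center": "cc-104", "environment": "prod"},
    canonical_keys,
)
print(clean, conflicts)

clashing, conflicts = canonicalize_tags({"TEAM": "ml", "team": "data-eng"}, canonical_keys)
print(clashing, conflicts)
```

Returning conflicts instead of silently overwriting makes it easier to hand ambiguous cases to the resource owner for a decision.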
Next steps
- Attribution Rules — Create rules that use tags for cost attribution
- Cost Attribution & Confidence Tiers — How tags feed into the attribution model
- Cost Explorer — See the impact of better tagging on attribution quality