MLflow
The MLflow page tracks your machine learning experiment activity in Databricks. It aggregates MLflow experiments and runs with their metrics, giving ML teams visibility into experiment activity, success rates, run durations, and user engagement.
Experiment list
The Experiments tab shows MLflow experiments with their aggregated run statistics:
| Column | What it shows |
|---|---|
| Experiment | Experiment name and ID |
| Runs | Total number of runs, with a badge showing how many are currently running |
| Success | Number of successful runs, with a badge showing failed runs if any |
| Users | Number of distinct users who ran experiments |
| Avg Duration | Average run duration across all runs |
| Last Run | When the most recent run completed |
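If you want to reproduce these per-experiment aggregates outside the UI, a rough equivalent can be computed with the MLflow Python API. This is a minimal sketch, assuming an MLflow 2.x tracking client and that the run owner is recorded in the standard mlflow.user tag:

```python
import mlflow

# Pull all runs across all experiments into a pandas DataFrame (MLflow 2.x).
runs = mlflow.search_runs(search_all_experiments=True)

# end_time is NaT for runs that are still in progress.
runs["duration_s"] = (runs["end_time"] - runs["start_time"]).dt.total_seconds()

per_experiment = runs.groupby("experiment_id").agg(
    total_runs=("run_id", "count"),
    running=("status", lambda s: (s == "RUNNING").sum()),
    successful=("status", lambda s: (s == "FINISHED").sum()),
    failed=("status", lambda s: (s == "FAILED").sum()),
    users=("tags.mlflow.user", "nunique"),
    avg_duration_s=("duration_s", "mean"),
    last_run=("end_time", "max"),
)
print(per_experiment.sort_values("total_runs", ascending=False))
```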
Sorting
- Runs (descending) — Find the most active experiments
- Experiment (ascending) — Browse experiments alphabetically
- Success (descending) — Find the most successful experiments
Headline stats
The page header shows aggregate statistics across all experiments:
| Metric | What it shows |
|---|---|
| Experiments | Total number of MLflow experiments |
| Total Runs | Total number of runs across all experiments |
| Successful | Number of runs that completed successfully |
| Failed | Number of runs that failed |
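The same run data can be rolled up into these headline totals; a small sketch, again assuming MLflow 2.x:

```python
import mlflow
from mlflow.tracking import MlflowClient

runs = mlflow.search_runs(search_all_experiments=True)

headline = {
    # search_experiments() counts all experiments, including ones with no runs yet.
    "experiments": len(MlflowClient().search_experiments()),
    "total_runs": len(runs),
    "successful": int((runs["status"] == "FINISHED").sum()),
    "failed": int((runs["status"] == "FAILED").sum()),
}
print(headline)
```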
Runs by experiment chart
A bar chart showing the top 10 experiments by run count. Use this to quickly identify which experiments have the most activity.
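A comparable chart can be produced directly from the run data; a sketch using pandas and matplotlib (experiment IDs are plotted here; joining against MlflowClient().search_experiments() would give names):

```python
import mlflow
import matplotlib.pyplot as plt

runs = mlflow.search_runs(search_all_experiments=True)

# Top 10 experiments by run count, roughly mirroring this chart.
top10 = runs["experiment_id"].value_counts().head(10)
top10.plot.bar(xlabel="Experiment ID", ylabel="Runs")
plt.tight_layout()
plt.show()
```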
Runs tab
The Runs tab shows individual MLflow runs across all experiments:
| Column | What it shows |
|---|---|
| Run | Run name and ID |
| Status | FINISHED, FAILED, or RUNNING |
| User | Who initiated the run |
| Duration | How long the run took |
| Metrics | Number of distinct metrics logged in the run |
| Started | When the run started |
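Equivalent per-run details, including the distinct-metric count, are available through the tracking client; a minimal sketch, assuming MLflow 2.x:

```python
from mlflow.tracking import MlflowClient

client = MlflowClient()

for exp in client.search_experiments():
    for run in client.search_runs([exp.experiment_id], max_results=100):
        duration_s = (
            (run.info.end_time - run.info.start_time) / 1000.0
            if run.info.end_time
            else None  # still running
        )
        print(
            run.info.run_name,                  # run name
            run.info.run_id,
            run.info.status,                    # FINISHED, FAILED, RUNNING, ...
            run.data.tags.get("mlflow.user"),   # who initiated the run
            duration_s,
            len(run.data.metrics),              # distinct metric keys logged
            run.info.start_time,                # epoch milliseconds
        )
```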
Daily Activity tab
The Daily Activity tab shows per-day aggregated MLflow activity for a selected time range:
| Column | What it shows |
|---|---|
| Date | The date of recorded activity |
| Experiment | Experiment name or ID |
| Runs | Number of completed runs, with failed count if any |
| Users | Number of distinct users active that day |
| Avg Duration | Average run duration for that day |
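A per-day rollup like this can be reproduced with a date-keyed groupby; a sketch, assuming start_time comes back as a timezone-aware timestamp and users are recorded in the mlflow.user tag:

```python
import mlflow

runs = mlflow.search_runs(search_all_experiments=True)
runs["date"] = runs["start_time"].dt.date
runs["duration_s"] = (runs["end_time"] - runs["start_time"]).dt.total_seconds()

daily = runs.groupby(["date", "experiment_id"]).agg(
    runs=("run_id", "count"),
    failed=("status", lambda s: (s == "FAILED").sum()),
    users=("tags.mlflow.user", "nunique"),
    avg_duration_s=("duration_s", "mean"),
)
print(daily.sort_index(ascending=False))
```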
Filtering
| Filter | Options |
|---|---|
| Time range | Analysis period (applies to Daily Activity tab) |
Work unit integration
LakeSentry maps MLflow data into the work unit model. Each MLflow experiment becomes a work unit of type mlflow_experiment, and each MLflow run becomes a work unit run. This enables MLflow experiments and runs to participate in the same attribution and tracking framework as jobs and pipelines.
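The exact schema of the work unit model isn't documented on this page; purely as an illustration of the mapping described above, it might look something like the following (all type and field names here are hypothetical):

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical types for illustration only; LakeSentry's actual work unit
# schema is not defined on this page.
@dataclass
class WorkUnit:
    work_unit_id: str
    work_unit_type: str   # e.g. "mlflow_experiment", alongside jobs and pipelines
    name: str

@dataclass
class WorkUnitRun:
    work_unit_id: str
    run_id: str
    status: str
    duration_s: Optional[float]

def experiment_to_work_unit(experiment) -> WorkUnit:
    """Map an mlflow.entities.Experiment onto the work unit shape (illustrative)."""
    return WorkUnit(
        work_unit_id=experiment.experiment_id,
        work_unit_type="mlflow_experiment",
        name=experiment.name,
    )
```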
Common workflows
Finding high-activity experiments
- Sort the experiment list by Runs (descending).
- Look at the top experiments and their success/failure ratios.
- Experiments with high failure rates may benefit from investigation.
- Check average duration to spot experiments with unusually long runs.
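The same ranking can be cross-checked outside the UI; a minimal sketch that sorts experiments by run count and shows how often their runs fail, assuming MLflow 2.x:

```python
import mlflow

runs = mlflow.search_runs(search_all_experiments=True)

by_exp = runs.groupby("experiment_id")["status"].agg(
    total="count",
    failed=lambda s: (s == "FAILED").sum(),
)
by_exp["failure_rate"] = by_exp["failed"] / by_exp["total"]

# Most active experiments first, with their failure rates.
print(by_exp.sort_values("total", ascending=False).head(10))
```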
Investigating ML activity trends
- Switch to the Daily Activity tab and set the time range to the last 30 or 90 days.
- Look for experiments with increasing daily run counts.
- Use the Users column to understand whether activity is from one user or a team.
- Cross-reference with Compute to see if the underlying clusters are efficiently utilized.
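To examine the same trend programmatically, here is a sketch that counts daily runs per experiment over the last 30 days (assuming start_time is returned as a timezone-aware timestamp):

```python
import datetime as dt
import mlflow

cutoff = dt.datetime.now(dt.timezone.utc) - dt.timedelta(days=30)

runs = mlflow.search_runs(search_all_experiments=True)
runs = runs[runs["start_time"] >= cutoff].copy()
runs["date"] = runs["start_time"].dt.date

# One row per experiment, one column per day; scan for rising counts.
daily_counts = runs.groupby(["experiment_id", "date"]).size().unstack(fill_value=0)
print(daily_counts)
```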
Monitoring experiment health
- Check the page header stats for the overall Failed count.
- Sort the experiment list by Success to see failure ratios.
- Switch to the Runs tab to find specific failed runs and their users.
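Failed runs and the users behind them can also be pulled directly from the tracking server; a minimal sketch using MLflow's standard search filter syntax:

```python
from mlflow.tracking import MlflowClient

client = MlflowClient()

for exp in client.search_experiments():
    failed_runs = client.search_runs(
        [exp.experiment_id],
        filter_string="attributes.status = 'FAILED'",
        max_results=20,
    )
    for run in failed_runs:
        print(exp.name, run.info.run_name, run.data.tags.get("mlflow.user"))
```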
Next steps
- Work Units (Jobs & Pipelines) — Job-level tracking
- Compute (Clusters & Warehouses) — Cluster utilization for training workloads
- Model Serving — Tracking for deployed model endpoints
- Cost Attribution & Confidence Tiers — How costs are attributed across workloads