Expanse treats a Databricks workspace as a databricks compute. The Expanse daemon starts from the job bootstrap on the driver, so Databricks Jobs and Mosaic AI Training runs are captured without any changes to notebooks, Python entrypoints, JARs, or training scripts.

What gets captured

For each Databricks run, Expanse records:
  • Workspace, job, run, task, and cluster identifiers.
  • Submitter, owner, tags, and job parameters.
  • Driver and worker resource shape, including CPU, memory, GPU, and instance type where available.
  • Runtime, status transitions, exit state, logs, and error context.
  • Live utilisation metrics for CPU, memory, GPU, and I/O where the cluster exposes them.
Those records appear in the Console alongside SLURM, Kubernetes, SkyPilot, and other computes. They also feed expanse analyse, expanse diagnose, and the intelligence layer.
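
To make the list above concrete, here is a minimal sketch of how one captured run could be pictured as a record. Every class and field name below is a hypothetical illustration derived from the bullets above, not Expanse's actual schema, and the values in the example are placeholders.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ResourceShape:
    """Resource shape of one driver or worker node (hypothetical field names)."""
    cpus: int
    memory_gb: float
    gpus: int = 0
    instance_type: Optional[str] = None  # populated only where available


@dataclass
class DatabricksRunRecord:
    """Illustrative shape of one captured run; not Expanse's real schema."""
    workspace_id: str
    job_id: str
    run_id: str
    task_key: str
    cluster_id: str
    submitter: str
    owner: str
    tags: dict[str, str] = field(default_factory=dict)
    job_parameters: dict[str, str] = field(default_factory=dict)
    driver: Optional[ResourceShape] = None
    workers: list[ResourceShape] = field(default_factory=list)
    status_transitions: list[str] = field(default_factory=list)
    exit_state: Optional[str] = None

    def total_gpus(self) -> int:
        """Sum GPUs across the driver (if known) and all workers."""
        driver_gpus = self.driver.gpus if self.driver else 0
        return driver_gpus + sum(w.gpus for w in self.workers)


# Placeholder values for illustration only.
record = DatabricksRunRecord(
    workspace_id="ws-example",
    job_id="101",
    run_id="5001",
    task_key="train",
    cluster_id="cluster-example",
    submitter="alice@example.com",
    owner="ml-platform",
    driver=ResourceShape(cpus=8, memory_gb=61.0),
    workers=[ResourceShape(cpus=32, memory_gb=244.0, gpus=4)] * 2,
)
print(record.total_gpus())
```

Optional fields reflect the "where available" wording above: resource details that a cluster does not expose are simply left unset.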

Register Databricks

Register a compute and choose databricks when prompted:
expanse compute register
The CLI prints the Databricks bootstrap command and the key material to store in the workspace. Install that bootstrap wherever Databricks starts user code, typically through a cluster policy, init script, or job setup step.
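
As one possible illustration of the init script route, the sketch below adds a workspace-file init script to a Databricks cluster spec through the `init_scripts` field. It assumes the bootstrap printed by the CLI has already been saved as a script in the workspace; the path `/Shared/expanse/bootstrap.sh`, the helper name, and the cluster values are hypothetical. The key material is deliberately left out and should be stored as the CLI output instructs.

```python
import json

# Hypothetical workspace path holding the bootstrap printed by
# `expanse compute register`; choose whatever location suits your workspace.
BOOTSTRAP_SCRIPT_PATH = "/Shared/expanse/bootstrap.sh"


def with_expanse_init_script(new_cluster: dict,
                             script_path: str = BOOTSTRAP_SCRIPT_PATH) -> dict:
    """Return a copy of a cluster spec with the bootstrap as an init script."""
    spec = dict(new_cluster)
    scripts = list(spec.get("init_scripts", []))
    entry = {"workspace": {"destination": script_path}}
    if entry not in scripts:  # avoid adding the same script twice
        scripts.append(entry)
    spec["init_scripts"] = scripts
    return spec


# Example cluster values are placeholders, not recommendations.
job_cluster = with_expanse_init_script({
    "spark_version": "15.4.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 2,
})
print(json.dumps(job_cluster, indent=2))
```

The resulting dictionary can serve as the `new_cluster` block of a job definition. A cluster policy or a job setup step would work equally well if that matches how your workspace manages clusters.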

Supported Databricks surfaces

  • Databricks Jobs: single-task and multi-task jobs, job clusters, and all-purpose clusters used by jobs.
  • Mosaic AI Training: training runs launched from Databricks-managed training workflows.

Verify capture

Start any Databricks Job or Mosaic AI Training run after the bootstrap is installed. Within a minute of the run starting, both the databricks compute and the run itself appear in the Console at console.expanse.sh.

Next steps

  • Computes: How Databricks maps to the Expanse compute model.
  • Telemetry: What Expanse captures before, during, and after each run.