Expanse treats a cloud account, project, subscription, or workspace as a cloud-batch compute. The daemon observes managed batch and training services through provider control planes, so jobs launched through native cloud APIs appear beside the rest of your Expanse telemetry.
What gets captured
For each managed batch or training job, Expanse records:
- Job definition, image, command, entrypoint, queue, pool, region, and project context.
- Submitter, service account, tags, labels, and parameters.
- Requested CPU, memory, GPU, accelerator, and machine shape.
- Status transitions, runtime, exit status, retry state, logs, and failure context.
- Live utilisation metrics where the provider exposes them.
These records feed expanse analyse, expanse diagnose, and the intelligence layer.
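To make the captured fields concrete, they can be pictured as one record per job. The sketch below is illustrative only: the class names, field names, and types are assumptions chosen to mirror the list above, not Expanse's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ResourceRequest:
    """Resources the job asked for (hypothetical field names)."""
    cpu: Optional[float] = None          # vCPUs
    memory_mib: Optional[int] = None
    gpu: Optional[int] = None
    accelerator: Optional[str] = None    # accelerator type name, if any
    machine_shape: Optional[str] = None  # instance or VM type


@dataclass
class StatusTransition:
    """One status change reported by the provider."""
    status: str
    at: str  # ISO 8601 timestamp


@dataclass
class CloudBatchJobRecord:
    """Illustrative shape only; not Expanse's actual schema."""
    # Definition and placement
    job_definition: str
    image: str
    command: list[str]
    queue: str
    region: str
    project: str
    # Ownership and metadata
    submitter: str
    service_account: Optional[str] = None
    labels: dict[str, str] = field(default_factory=dict)
    parameters: dict[str, str] = field(default_factory=dict)
    # Requested resources
    resources: ResourceRequest = field(default_factory=ResourceRequest)
    # Lifecycle
    transitions: list[StatusTransition] = field(default_factory=list)
    exit_status: Optional[int] = None
    retries: int = 0

    def current_status(self) -> Optional[str]:
        """Return the most recent status, or None if nothing was reported yet."""
        return self.transitions[-1].status if self.transitions else None
```

Keeping status transitions as an ordered list, rather than a single status field, preserves the history needed to compute runtime and to see where a retry or failure occurred.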
Supported services
- AWS Batch.
- AWS SageMaker Training Jobs.
- Azure Batch.
- Azure ML Jobs.
- Google Batch.
- Vertex AI Custom Training.
Register Cloud Batch
Register a compute and choose cloud-batch when prompted.
Typical boundaries
| Provider | Register one compute for |
|---|---|
| AWS | An account and region running AWS Batch or AWS SageMaker Training Jobs. |
| Azure | A subscription, resource group, or workspace running Azure Batch or Azure ML Jobs. |
| Google Cloud | A project and region running Google Batch or Vertex AI Custom Training. |
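One way to reason about these boundaries is to group jobs by the attributes that define a compute for each provider. The sketch below is a hypothetical illustration: the dictionary keys, the `BOUNDARY_KEYS` mapping, and both helper functions are invented for this example and are not part of Expanse. For Azure it assumes the subscription is the boundary, although the table also allows a resource group or workspace.

```python
from collections import defaultdict

# Hypothetical mapping from provider to the job attributes that define one
# compute, following the "Typical boundaries" table. Azure could equally be
# scoped by resource group or workspace; subscription is assumed here.
BOUNDARY_KEYS = {
    "aws": ("account", "region"),
    "azure": ("subscription",),
    "gcp": ("project", "region"),
}


def boundary_of(job: dict) -> tuple:
    """Return the (provider, scope...) tuple that a job falls under."""
    provider = job["provider"]
    return (provider, *(job[key] for key in BOUNDARY_KEYS[provider]))


def group_by_compute(jobs: list[dict]) -> dict[tuple, list[str]]:
    """Group job names by the compute boundary they belong to."""
    groups: dict[tuple, list[str]] = defaultdict(list)
    for job in jobs:
        groups[boundary_of(job)].append(job["name"])
    return dict(groups)
```

Under these assumptions, two AWS jobs in the same account but different regions fall under two separate computes, which matches the "account and region" row in the table.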
Verify capture
Submit a managed batch or training job after the daemon starts. Within a minute of the provider reporting the job, the cloud-batch compute and the job appear in console.expanse.sh.
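Any job submitted through the provider's own API should be enough to check capture. As one possible example, the sketch below submits an AWS Batch job with boto3's `submit_job` call. The helper names and the default job name are assumptions made for illustration, and the queue and job definition must already exist in the registered account and region. Running it also assumes that boto3 is installed and AWS credentials are configured.

```python
def build_submit_request(
    queue: str, definition: str, name: str = "expanse-capture-check"
) -> dict:
    """Build keyword arguments for the AWS Batch SubmitJob call."""
    return {
        "jobName": name,              # hypothetical default name
        "jobQueue": queue,            # an existing job queue
        "jobDefinition": definition,  # an existing job definition
    }


def submit_check_job(queue: str, definition: str, region: str) -> str:
    """Submit the job and return the job ID reported by AWS Batch."""
    import boto3  # imported lazily so the request builder works without it

    client = boto3.client("batch", region_name=region)
    response = client.submit_job(**build_submit_request(queue, definition))
    return response["jobId"]
```

Calling `submit_check_job` with your own queue name, job definition, and region returns a job ID that you can then look for in console.expanse.sh.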
Next steps
- Computes: How cloud batch maps to the Expanse compute model.
- Telemetry: What Expanse captures before, during, and after each job.