Expanse treats a YARN estate as a yarn compute. Install the daemon on an edge node, gateway, or ResourceManager-adjacent host that can reach the scheduler APIs, and Expanse captures the applications submitted to Apache Hadoop YARN.

What gets captured

For each YARN application, Expanse records:
  • Application ID, application name, queue, user, tags, and submission time.
  • Framework context for Spark, Hadoop MapReduce, Tez, Hive, and other YARN applications.
  • Requested memory, vcores, containers, and accelerator context where available.
  • Application attempts, container state, runtime, final status, and exit diagnostics.
  • Live CPU, memory, GPU, and I/O metrics where the cluster exposes them.
Those records appear in the Console and feed expanse analyse, expanse diagnose, and the intelligence layer.
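
Many of the fields listed above have direct counterparts in the ResourceManager REST API (/ws/v1/cluster/apps). The Python sketch below is illustrative only: it assumes that API as the data source and maps one application object into a hypothetical YarnAppRecord. The record type, its field names, and record_from_rm_app are assumptions made for this example; they do not describe Expanse's internal schema or how the daemon actually collects data.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass
class YarnAppRecord:
    """Hypothetical record shape for illustration; not Expanse's schema."""
    app_id: str
    name: str
    queue: str
    user: str
    tags: List[str]
    submitted_at: datetime
    framework: str
    allocated_mb: int
    allocated_vcores: int
    running_containers: int
    state: str
    final_status: str
    elapsed_ms: int
    diagnostics: str


def record_from_rm_app(app: Dict[str, Any]) -> YarnAppRecord:
    """Map one 'app' object from the RM REST API into a YarnAppRecord."""
    # The RM returns tags as a single comma-separated string.
    tags = [t for t in app.get("applicationTags", "").split(",") if t]
    # startedTime is milliseconds since the epoch; used here as a
    # stand-in for submission time (an assumption of this sketch).
    submitted = datetime.fromtimestamp(app["startedTime"] / 1000, tz=timezone.utc)
    return YarnAppRecord(
        app_id=app["id"],
        name=app.get("name", ""),
        queue=app.get("queue", ""),
        user=app.get("user", ""),
        tags=tags,
        submitted_at=submitted,
        framework=app.get("applicationType", ""),
        # The RM reports -1 for these once an application has finished.
        allocated_mb=app.get("allocatedMB", -1),
        allocated_vcores=app.get("allocatedVCores", -1),
        running_containers=app.get("runningContainers", -1),
        state=app.get("state", ""),
        final_status=app.get("finalStatus", ""),
        elapsed_ms=app.get("elapsedTime", 0),
        diagnostics=app.get("diagnostics", ""),
    )


# Made-up sample payload in the shape the RM REST API returns.
SAMPLE_APP = {
    "id": "application_1700000000000_0001",
    "user": "alice",
    "name": "nightly-etl",
    "queue": "default",
    "state": "FINISHED",
    "finalStatus": "SUCCEEDED",
    "applicationType": "SPARK",
    "applicationTags": "team-data,nightly",
    "startedTime": 1700000000000,
    "elapsedTime": 125000,
    "allocatedMB": -1,
    "allocatedVCores": -1,
    "runningContainers": -1,
    "diagnostics": "",
}

if __name__ == "__main__":
    print(record_from_rm_app(SAMPLE_APP))
```

Accelerator context and live GPU or I/O metrics are left out of the sketch because the applications endpoint does not report them in these fields; how they are gathered depends on the distribution.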

Supported YARN platforms

  • Apache Hadoop YARN.
  • AWS EMR.
  • Google Dataproc.
  • Azure HDInsight.
  • Cloudera CDP.

Register YARN

Register a compute and choose yarn when prompted:
expanse compute register
The CLI prints the daemon install command and the scheduler endpoint configuration. Run the install command on a host that can reach both the ResourceManager and the metrics endpoints used by your distribution.
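
Before installing, it can help to confirm that the chosen host really reaches the ResourceManager. The Python sketch below is a minimal pre-check, assuming an unsecured ResourceManager web endpoint that serves the standard /ws/v1/cluster/info resource (port 8088 is the stock Hadoop default). The function names and the example hostname are hypothetical, this is not part of the Expanse CLI, and Kerberos- or TLS-protected clusters would need additional handling.

```python
import json
from typing import Any, Dict
from urllib.request import urlopen


def cluster_info_url(rm_base_url: str) -> str:
    """Build the cluster-info URL from an RM base URL such as http://host:8088."""
    return rm_base_url.rstrip("/") + "/ws/v1/cluster/info"


def rm_is_ready(payload: Dict[str, Any]) -> bool:
    """Return True if the payload describes a started, active ResourceManager."""
    info = payload.get("clusterInfo", {})
    # haState is treated as ACTIVE when the field is absent (sketch assumption).
    return info.get("state") == "STARTED" and info.get("haState", "ACTIVE") == "ACTIVE"


def check_resource_manager(rm_base_url: str, timeout: float = 5.0) -> bool:
    """Fetch cluster info over HTTP and report whether the RM looks ready.

    Raises urllib.error.URLError if the host cannot reach the endpoint.
    """
    with urlopen(cluster_info_url(rm_base_url), timeout=timeout) as resp:
        return rm_is_ready(json.load(resp))


if __name__ == "__main__":
    # Hypothetical hostname; replace with your ResourceManager address.
    print(check_resource_manager("http://rm.example.internal:8088"))
```

A connection error from check_resource_manager indicates a network or firewall problem on the candidate host, which is worth resolving before running the daemon install command.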

Verify capture

Submit any YARN application after the daemon starts. Within a minute of the application entering the scheduler, the yarn compute and the application appear in console.expanse.sh.

Next steps

  • Computes: How YARN maps to the Expanse compute model.
  • Telemetry: What Expanse captures before, during, and after each application.