# Remote Build Execution
Every Workflows deployment includes a remote cluster compliant with the Bazel Remote Execution Protocol v2, so remote build execution (RBE) is available with minimal additional configuration. RBE allows specially tailored runners to be created for individual jobs and configured throughout the build tree. It also allows jobs to be parallelized effectively: pending actions can unblock many parallel runners at once without repeating work. As a result, cost can be managed by provisioning large workers only for large actions and spinning them down readily, while smaller actions continue on smaller, less powerful workers. Remote execution supports both arm64 and amd64 architectures simultaneously, so a remote execution cluster can be provisioned to support cross-compilation, cross-publishing, or any other cross-platform use case with minimal additional configuration.
For configuration of the remote cache, see remote cache. For configuration of an external remote cluster, see external remote.
## Enable remote execution
By default, remote execution is not enabled in Workflows remote resource clusters, because remote execution requires special attention to sandboxing and reproducibility beyond what may be required on an ephemeral runner or a user's machine. That said, remote execution can be enabled simply by adding the following to the Workflows remote cluster configuration.
From within your Aspect Workflows module definition, add the following code:
**AWS**

```hcl
remote = {
  remote_execution = {
    executors = {
      default = {
        image = "ghcr.io/catthehacker/ubuntu:act-22.04@sha256:5f9c35c25db1d51a8ddaae5c0ba8d3c163c5e9a4a6cc97acd409ac7eae239448"
      }
    }
  }
}
```
**GCP**

Set up a node pool for remote executors under `k8s_cluster` and configure an executor type under `remote.remote_execution`. Below, the `default` executors use the `default` node type, although you may have several executors with different Docker images using the same node pool.
```hcl
remote = {
  remote_execution = {
    executors = {
      default = {
        image       = "ghcr.io/catthehacker/ubuntu:act-22.04@sha256:5f9c35c25db1d51a8ddaae5c0ba8d3c163c5e9a4a6cc97acd409ac7eae239448"
        node_type   = "default"
        min_scaling = 0
        max_scaling = 5
        concurrency = 4 # number of vCPUs on the node type (c2d-standard-4 below)
      }
    }
  }
}

k8s_cluster = {
  remote_exec_nodes = {
    default = {
      min_count    = 0
      max_count    = 5
      machine_type = "c2d-standard-4"
      num_ssds     = 1
    }
  }
}
```
This configuration spins up a new set of runners that pick up work using the specified Docker image. Scaling and provisioning are otherwise handled automatically, and the endpoint is the same as the one for the remote cache. To use this in Bazel, a new platform needs to be added that looks something like the following.
In a BUILD file, e.g. `<WORKSPACE root>/platforms/BUILD.bazel`:
```python
platform(
    name = "my_remote_platform",
    exec_properties = {
        "OSFamily": "linux",
        "container-image": "docker://<your Docker image>",
    },
)
```
Then, the `.bazelrc` file needs to be updated to point to the new platform for a given configuration, e.g. `--config rbe`:

```
build:rbe --host_platform=@<WORKSPACE name>//<path to platform>:my_remote_platform
build:rbe --extra_execution_platforms=@<WORKSPACE name>//<path to platform>:my_remote_platform
build:rbe --jobs=<maximum number of concurrent actions>
build:rbe --remote_timeout=3600
```
Note: the above is simply an example. Please adjust as needed to a given use case.
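With a configuration like the above in place, builds and tests opt in to remote execution by passing the config flag on the command line; the target pattern below is illustrative:

```
bazel test --config=rbe //...
```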
Once completed, jobs should run on the remote executors seamlessly, provided the configuration points to the right place. Some build failures may initially be encountered that did not occur before; as stated previously, remote execution requires greater attention to detail in the structure of the build tree.
Further configuration of the remote executors, including but not limited to direct provisioning of the underlying compute, can be found in the Workflows configuration documentation.
## Adding multiple platforms
Remote execution in Bazel is essentially a train station. Work (passengers) comes in from the client and expects to be routed to a worker (destination). The train, in this case, is called a platform: it dictates how work is routed to the underlying compute. The section above showed how to create a platform for a single-track topology. However, it may be desirable to route specific targets to specific underlying compute. Metaphorically, that means more trains traveling on dedicated tracks, along with additional destinations. This adds up to needing more platforms and more workers, and this section covers how to create both in Bazel and in Terraform.
### What is a platform?
A platform, in simple terms, is a JSON-like structure of key/value properties. It is, above all, a disambiguator that uniquely tells Bazel where work should go. In the train metaphor, this is like the train's number. The previous section gave the minimal definition of a platform in Bazel:
```python
platform(
    name = "my_remote_platform",
    exec_properties = {
        "OSFamily": "linux",
        "container-image": "docker://<your Docker image>",
    },
)
```
However, the definition of `exec_properties` (which makes up the structure of the platform) is expandable. Additional keys can be added to further disambiguate from other platforms. For instance, to add a platform where the underlying Docker image is known to run on a worker with a specific host configuration, that property can be added:
```python
platform(
    name = "my_remote_platform_large",  # NOTE: this is a new name
    exec_properties = {
        "OSFamily": "linux",
        "container-image": "docker://<your Docker image>",
        "host_large": "true",  # NOTE: this is the new property
    },
)
```
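The same pattern extends to the cross-platform support mentioned in the introduction. As a sketch (the platform name and image placeholder below are illustrative, not part of the Workflows configuration), an arm64 platform could pair standard `@platforms` constraint values with its own `exec_properties`:

```python
platform(
    name = "my_remote_platform_arm64",  # illustrative name
    # Constraint values let Bazel toolchain resolution distinguish this
    # execution platform from its amd64 counterpart.
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:arm64",
    ],
    exec_properties = {
        "OSFamily": "linux",
        "container-image": "docker://<your arm64 Docker image>",
    },
)
```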
## Adding workers for platforms in Terraform
When a new platform is added, a dedicated worker must be created to support it. This can be done by adding to the mapping in Terraform, specifying `additional_platform_properties` in the same section where the `image` is specified. For the example above, the configuration is as follows:
```hcl
remote = {
  remote_execution = {
    executors = {
      host_large = {
        image = "ghcr.io/catthehacker/ubuntu:act-22.04@sha256:5f9c35c25db1d51a8ddaae5c0ba8d3c163c5e9a4a6cc97acd409ac7eae239448"
        additional_platform_properties = {
          host_large = "true"
        }
        # ... any additional config
      }
    }
  }
}
```
Note above that the key for this new platform in the `executors` map is no longer `default`. `default` is not a rigid requirement; it too is just a name. Feel free to customize it as needed; the executor name is used solely for metrics and scaling, and does not impact user-facing configuration.
## Bazel configuration
The BUILD file hosting the original/default platform can be updated to include any and all new platforms. The full set of remote execution platforms must be listed in `extra_execution_platforms` in the `.bazelrc` file:
```
build:rbe --extra_execution_platforms=@<WORKSPACE name>//<path to platform>:my_remote_platform,@<WORKSPACE name>//<path to platform>:my_remote_platform_large
```
The `extra_execution_platforms` flag may only be set once; later instances override earlier flag settings. See https://bazel.build/reference/command-line-reference#flag--extra_execution_platforms for more info.
To constrain a build or test target to a specific execution platform, and therefore run that target on a desired remote executor group, set the `exec_properties` attribute on the target to match the desired execution platform. For example, to constrain a test target to run on the `host_large` executors defined above, `exec_properties` on the test target could be set as follows:
```python
sh_test(
    name = "large_test",
    srcs = ["large_test.sh"],
    exec_properties = {
        "host_large": "true",
    },
)
```
See https://bazel.build/reference/be/common-definitions#common-attributes for more info on `exec_properties`, which is a built-in attribute on all build and test rules.
Unconstrained targets, without `exec_properties` set, will default to running on the first platform specified in `extra_execution_platforms`, which should also match the `host_platform` flag when multiple `extra_execution_platforms` are specified.
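For instance, keeping the default platform first and mirroring it in `--host_platform` (workspace name and paths are placeholders, as above) might look like:

```
build:rbe --host_platform=@<WORKSPACE name>//<path to platform>:my_remote_platform
build:rbe --extra_execution_platforms=@<WORKSPACE name>//<path to platform>:my_remote_platform,@<WORKSPACE name>//<path to platform>:my_remote_platform_large
```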