# Remote Build Execution
Every Workflows deployment includes a remote cluster compliant with the Bazel Remote Execution Protocol v2, so remote build execution (RBE) is available with minimal additional configuration. RBE allows specially tailored runners to be created for individual jobs and configured throughout the build tree. It also allows jobs to be parallelized effectively: pending actions can unblock many parallel runners at once without repeating work. As a result, cost can be managed by provisioning large workers only for large actions and spinning them down readily, while smaller actions continue on smaller, less powerful workers. Remote execution supports both arm64 and amd64 architectures simultaneously, so a remote execution cluster can be provisioned to support cross-compilation, cross-publishing, or any other cross-platform use case with minimal additional configuration.
For configuration of the remote cache, see remote cache. For configuration of an external remote cluster, see external remote.
## Enable remote execution
By default, remote execution is not enabled in Workflows remote resource clusters, because remote execution requires special attention to sandboxing and reproducibility beyond what may be required on an ephemeral runner or a user's machine. That said, remote execution can be enabled simply by adding the following to the Workflows remote cluster configuration.
From within your Aspect Workflows module definition, add the following code:
**AWS**

```hcl
remote = {
  remote_execution = {
    executors = {
      default = {
        image = "ghcr.io/catthehacker/ubuntu:act-22.04@sha256:5f9c35c25db1d51a8ddaae5c0ba8d3c163c5e9a4a6cc97acd409ac7eae239448"
      }
    }
  }
}
```
**GCP**

Set up a node pool for remote executors under `k8s_cluster` and configure an executor type under `remote.remote_execution`. Below, the `default` executors use the `default` node type, although you may have several executors with different Docker images using the same node pool.
```hcl
remote = {
  remote_execution = {
    executors = {
      default = {
        image       = "ghcr.io/catthehacker/ubuntu:act-22.04@sha256:5f9c35c25db1d51a8ddaae5c0ba8d3c163c5e9a4a6cc97acd409ac7eae239448"
        node_type   = "default"
        min_scaling = 0
        max_scaling = 5
        concurrency = 4 # number of vCPUs on the node type (c2d-standard-4 below)
      }
    }
  }
}

k8s_cluster = {
  remote_exec_nodes = {
    default = {
      min_count    = 0
      max_count    = 5
      machine_type = "c2d-standard-4"
      num_ssds     = 1
    }
  }
}
```
This configuration spins up a new set of runners that pick up work using the specified Docker image. Scaling and provisioning are otherwise handled automatically, and the endpoint is the same as the one for the remote cache. To use this in Bazel, a new platform needs to be added that looks something like the following.
In a BUILD file, e.g. `<WORKSPACE root>/platforms/BUILD.bazel`:
```python
platform(
    name = "my_remote_platform",
    exec_properties = {
        "OSFamily": "linux",
        "container-image": "docker://<your Docker image>",
    },
)
```
Then, the `.bazelrc` file needs to be updated to point to the new platform for a given configuration, e.g. `--config rbe`:

```
build:rbe --host_platform=@<WORKSPACE name>//<path to platform>:my_remote_platform
build:rbe --extra_execution_platforms=@<WORKSPACE name>//<path to platform>:my_remote_platform
build:rbe --jobs=<maximum number of concurrent actions>
build:rbe --remote_timeout=3600
```
Note: the above is simply an example. Please adjust as needed to a given use case.
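With a configuration like the above in place, builds and tests opt in to remote execution by passing the config flag on the command line; the target pattern below is illustrative:

```
bazel test --config=rbe //...
```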
Once completed, jobs should run on the remote executors seamlessly, provided the configuration points to the right place. Some build failures may initially be encountered that did not occur before; as stated previously, remote execution requires greater attention to detail in the structure of the build tree.
Further configuration of the remote executors, including but not limited to direct provisioning of the underlying compute, can be found in the Workflows configuration documentation.
## Adding multiple platforms
Remote execution in Bazel is essentially a train station. Work (passengers) comes in from the client and expects to be routed to a worker (destination). The train, in this case, is called a platform: it dictates how work is routed to the underlying compute. The section above showed how to create a platform for a single-track topology. However, it may be desirable to route specific targets to specific underlying compute. Metaphorically, that means more trains traveling on dedicated tracks, along with additional destinations. This adds up to needing more platforms and more workers, and this section covers how to create both in Bazel and in Terraform.
### What is a platform?
A platform, in simple terms, is a JSON-like structure of key/value properties. It is, above all, a disambiguator that uniquely tells Bazel where work should go. In the train metaphor, this is like the train's number. The previous section gave the minimal definition of a platform in Bazel:
```python
platform(
    name = "my_remote_platform",
    exec_properties = {
        "OSFamily": "linux",
        "container-image": "docker://<your Docker image>",
    },
)
```
However, the definition of `exec_properties` (which makes up the structure of the platform) is expandable. Additional keys can be added to further disambiguate from other platforms. For instance, to add a platform where the underlying Docker image is known to run on a worker with a specific host configuration, that property can be added:
```python
platform(
    name = "my_remote_platform_large",  # NOTE: this is a new name
    exec_properties = {
        "OSFamily": "linux",
        "container-image": "docker://<your Docker image>",
        "host_large": "true",  # NOTE: this is the new property
    },
)
```
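The same pattern extends to the cross-platform support mentioned in the introduction. As a sketch (the platform name and image placeholder below are illustrative, not part of the Workflows configuration), an arm64 platform could pair standard `@platforms` constraint values with its own `exec_properties`:

```python
platform(
    name = "my_remote_platform_arm64",  # illustrative name
    # Constraint values let Bazel toolchain resolution distinguish this
    # execution platform from its amd64 counterpart.
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:arm64",
    ],
    exec_properties = {
        "OSFamily": "linux",
        "container-image": "docker://<your arm64 Docker image>",
    },
)
```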
## Adding workers for platforms in Terraform
When a new platform is added, a dedicated worker must be created to support it. This can be done by adding to the mapping in Terraform, specifying `additional_platform_properties` in the same section where the `image` is specified. For the example above, the configuration is as follows:
```hcl
remote = {
  remote_execution = {
    executors = {
      host_large = {
        image = "ghcr.io/catthehacker/ubuntu:act-22.04@sha256:5f9c35c25db1d51a8ddaae5c0ba8d3c163c5e9a4a6cc97acd409ac7eae239448"
        additional_platform_properties = {
          host_large = "true"
        }
        # ... any additional config
      }
    }
  }
}
```
Note above that the key for this new platform in the `executors` map is no longer `default`. `default` is not a rigid requirement; it too is just a name. Feel free to customize it as needed; the executor name is used solely for metrics and scaling, and does not impact user-facing configuration.
## Bazel configuration
The BUILD file hosting the original/default platform can be updated to include any and all new platforms. The full set of remote execution platforms must be listed in `extra_execution_platforms` in the `.bazelrc` file:
```
build:rbe --extra_execution_platforms=@<WORKSPACE name>//<path to platform>:my_remote_platform,@<WORKSPACE name>//<path to platform>:my_remote_platform_large
```
The `extra_execution_platforms` flag may only be set once; later instances override earlier flag settings. See https://bazel.build/reference/command-line-reference#flag--extra_execution_platforms for more info.
To constrain a build or test target to a specific execution platform, and therefore run that target on a desired remote executor group, set the `exec_properties` attribute on the target to match the desired execution platform. For example, to constrain a test target to run on the `host_large` executors defined above, `exec_properties` on the test target could be set as follows:
```python
sh_test(
    name = "large_test",
    srcs = ["large_test.sh"],
    exec_properties = {
        "host_large": "true",
    },
)
```
See https://bazel.build/reference/be/common-definitions#common-attributes for more info on `exec_properties`, which is a built-in attribute on all build and test rules.
Unconstrained targets, without `exec_properties` set, will default to running on the first platform specified in `extra_execution_platforms`, which should also match the `host_platform` flag when multiple `extra_execution_platforms` are specified.
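For instance, keeping the default platform first and mirroring it in `--host_platform` (workspace name and paths are placeholders, as above) might look like:

```
build:rbe --host_platform=@<WORKSPACE name>//<path to platform>:my_remote_platform
build:rbe --extra_execution_platforms=@<WORKSPACE name>//<path to platform>:my_remote_platform,@<WORKSPACE name>//<path to platform>:my_remote_platform_large
```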