workflows
Requirements
Name | Version |
---|---|
terraform | >= 1.9.0 |
aws | >= 4.67.0, < 6.0.0 |
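
The version constraints above can be pinned in the calling configuration. This is a minimal sketch; the `hashicorp/aws` provider source is an assumption, since the table lists only the provider name and version range.

```hcl
terraform {
  # Matches the "terraform" row of the Requirements table.
  required_version = ">= 1.9.0"

  required_providers {
    aws = {
      # Assumed source address; the table only names the provider "aws".
      source  = "hashicorp/aws"
      # Matches the "aws" row of the Requirements table.
      version = ">= 4.67.0, < 6.0.0"
    }
  }
}
```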
Providers
Name | Version |
---|---|
aws | >= 4.67.0, < 6.0.0 |
Modules
Name | Source | Version |
---|---|---|
alerting | ./alerting | n/a |
api | ./api | n/a |
bessie | ./services/bessie | n/a |
bk | ./bk | n/a |
cci | ./cci | n/a |
configuration | ./utils/configuration_properties | n/a |
core | ./core | n/a |
dashboards | ./alerting/alarms/cloudwatch_dashboards | n/a |
delivery | ./delivery | n/a |
external_remote | ./remote | n/a |
gh_api_token_secret | ./utils/aws_secret | n/a |
gha | ./gha | n/a |
gl | ./gl | n/a |
grafana | ./telemetry/grafana | n/a |
kilgore | ./services/kilgore | n/a |
logging | ./logging | n/a |
remote | ./remote | n/a |
telemetry | ./telemetry | n/a |
token_auth | ./services/token_auth | n/a |
warming | ./warming | n/a |
webapp | ./webapp | n/a |
Resources
Name | Type |
---|---|
aws_s3_object.runner_otel_config | resource |
aws_caller_identity.default | data source |
aws_ecr_authorization_token.token | data source |
aws_partition.default | data source |
aws_region.default | data source |
Inputs
Name | Description | Type | Default | Required |
---|---|---|---|---|
account_id | Account ID of the AWS Account where CloudWatch alarms reside | string | null | no |
aspect_artifacts_bucket | S3 bucket where Aspect delivers Workflows assets | string | n/a | yes |
aw_grafana_org_role | The role to assign to users in the Aspect Grafana organization | string | "Viewer" | no |
bk_runner_groups | Mapping of Buildkite runner group name to settings for that runner group | map(object({ # Common settings for all CI hosts agent_idle_timeout_min = number max_runners = number min_runners = optional(number, 0) min_free_runners = optional(number, 0) policy_documents = optional(map(object({ json : string })), {}) policies = optional(map(string), {}) queue = string resource_type = string scale_out_factor = optional(number, 1) scaling_polling_frequency = optional(number, 1) # polls per minute reaper_sleep_minutes = optional(number, 10) security_groups = optional(map(string), {}) warming = optional(bool, false) warming_set = optional(string, "default") exclude_oncall_alerts = optional(list(string), []) tags = optional(map(string), {}) build_logs_bucket = optional(string, "BUCKET_PLACEHOLDER") })) | {} | no |
cci_runner_groups | Mapping of CircleCI runner group name to settings for that runner group | map(object({ # Common settings for all CI hosts agent_idle_timeout_min = number max_runners = number min_runners = optional(number, 0) min_free_runners = optional(number, 0) policy_documents = optional(map(object({ json : string })), {}) policies = optional(map(string), {}) resource_type = string scale_out_factor = optional(number, 1) scaling_polling_frequency = optional(number, 1) reaper_sleep_minutes = optional(number, 10) security_groups = optional(map(string), {}) warming = optional(bool, false) warming_set = optional(string, "default") exclude_oncall_alerts = optional(list(string), []) tags = optional(map(string), {}) # Settings specific to CircleCI circleci_api_url = optional(string, "https://circleci.com") circleci_runner_api_url = optional(string, "https://runner.circleci.com") job_max_run_time_min = optional(number, 360) })) | {} | no |
cost_allocation_tag | (deprecated) The tag name used for cost tagging | string | "CreatedBy" | no |
cost_allocation_tag_value | (deprecated) The value of the cost tag | string | null | no |
create_security_groups | Whether to create security groups automatically for all resources. Set to false to use the values set in security_group_ids. | bool | true | no |
create_vpc_endpoints | Whether to create VPC endpoints automatically. | bool | true | no |
customer_id | Unique, human-readable customer identifier provided by Aspect | string | n/a | yes |
delivery_enabled | Whether delivery infrastructure is enabled for Aspect Workflows | bool | true | no |
deployment_id | Unique, human-readable deployment identifier provided by Aspect | string | "default" | no |
external_remote | Configuration for the externalized Bazel remote endpoint (cache and execution), specifically the ALB. | object({ debug_tools = optional(bool, false) dns = object({ hosted_zone_id = optional(string, null) hosted_zone_name = optional(string, null) }) storage = object({ # Number of shards for the remote cache storage service num_shards = optional(number, 3) instance_type = optional(string) instance_image_id = optional(string) mirror = optional(bool) }) frontend = optional(object({ cpu = optional(number, 1024) memory = optional(number, 2048) max_scaling = optional(number, 20) min_scaling = optional(number) }), { cpu = 1024 memory = 2048 max_scaling = 20 }) downloader = optional(object({ frontend = optional(object({ cpu = optional(number, 1024) memory = optional(number, 2048) max_scaling = optional(number, 5) min_scaling = optional(number, 1) }), { cpu = 1024 memory = 2048 max_scaling = 10 min_scaling = 1 }) })) oidc = optional(object({ issuer = string auth_endpoint = string token_endpoint = string user_info_endpoint = string client_id = string client_secret = string session_timeout_seconds = optional(number, null) }), null) token = optional(object({ admin_users = list(string) prefix = string header = string cognito_user_pool_arn = string cognito_user_pool_id = string cognito_client_id = string amazon_verified_permissions_policy_store_id = string amazon_verified_permissions_policy_store_arn = string clusters = map(object({ permissions = object({ read = bool write = bool execute = bool }) })) eventbridge_scheduler_group_arn = string eventbridge_scheduler_group_name = string kms_key_arn = string kms_key_id = string }), null) remote_execution = optional(object({ executors = map(object({ platform = optional(string) image = string additional_platform_properties = optional(map(string), {}) workers = optional(list(object({ scaling = optional(object({ minimum = optional(number) maximum = optional(number) warm = optional(number) fast = optional(object({ target = optional(number) size = optional(number) cooldown = optional(number) })) policy = optional(object({ target = optional(number, 100) scale_in_cooldown = optional(number, 300) scale_out_cooldown = optional(number, 60) }), { target = 100 scale_in_cooldown = 300 scale_out_cooldown = 60 }) rules = optional(map(object({ schedule = string timezone = optional(string) minimum = optional(number) maximum = optional(number) }))) }), {}) isolated_actions = optional(object({ cpu = optional(number, 1024) memory = optional(number) })) network = optional(bool, true) max_concurrency = optional(number) max_download_concurrency = optional(number, 1000000) max_upload_concurrency = optional(number, 1000000) ec2 = optional(object({ instance_type = optional(string) instance_image = optional(string) docker_group = optional(string) })) ecs = optional(object({ architecture = optional(string) cpu = optional(number, 1024) memory = optional(number, 2048) })) runner = optional(object({ kill_processes = optional(bool) set_tmp_dir_env_variable = optional(bool) clean_tmp_directories = optional(list(string)) })) docker_user = optional(string, null) })), [{}]) })) use_size_class_cache = optional(bool, false) worker_cloudwatch_agent = optional(bool, true) storage = optional(object({ num_shards = optional(number) instance_type = optional(string) instance_image_id = optional(string) upstream_fallback = optional(bool, true) mirror = optional(bool) })) })) }) | null | no |
gha_runner_groups | Mapping of GitHub Actions runner group name to settings for that runner group | map(object({ # Common settings for all CI hosts agent_idle_timeout_min = number max_runners = number min_runners = optional(number, 0) min_free_runners = optional(number, 0) policy_documents = optional(map(object({ json : string })), {}) policies = optional(map(string), {}) queue = string resource_type = string scale_out_factor = optional(number, 1) scaling_polling_frequency = optional(number, 1) reaper_sleep_minutes = optional(number, 10) security_groups = optional(map(string), {}) warming = optional(bool, false) warming_set = optional(string, "default") exclude_oncall_alerts = optional(list(string), []) tags = optional(map(string), {}) # Settings specific to GitHub Actions gh_repo = string gha_workflow_ids = optional(list(string), []) })) | {} | no |
gl_runner_groups | Mapping of GitLab runner group name to settings for that runner group | map(object({ # Common settings for all CI hosts agent_idle_timeout_min = number max_runners = number min_runners = optional(number, 0) min_free_runners = optional(number, 0) policy_documents = optional(map(object({ json : string })), {}) policies = optional(map(string), {}) queue = string resource_type = string scale_out_factor = optional(number, 1) scaling_polling_frequency = optional(number, 1) reaper_sleep_minutes = optional(number, 10) security_groups = optional(map(string), {}) warming = optional(bool, false) warming_set = optional(string, "default") exclude_oncall_alerts = optional(list(string), []) tags = optional(map(string), {}) # Settings specific to GitLab gitlab_url = optional(string, "https://gitlab.com") project_id = string })) | {} | no |
hosts | CI host configuration options | list(string) | n/a | yes |
internal_use_only | For Aspect internal development use only | object({ # Whether to disable deletion protection on resources disable_deletion_protection = optional(bool, false) # Custom install bucket install_bucket = optional(object({ id = string arn = string bucket = string }), null) # Use dev service images at the specified tag dev_image_tag = optional(string, "unspecified") }) | {} | no |
partition | The partition to configure services in, if not commercial | string | null | no |
region | The default region to set up services in | string | null | no |
remote | Configuration for the Bazel remote endpoint (cache and execution), specifically the ALB. | object({ debug_tools = optional(bool, false) storage = object({ # Number of shards for the remote cache storage service num_shards = optional(number, 3) instance_type = optional(string) instance_image_id = optional(string) mirror = optional(bool) }) frontend = optional(object({ cpu = optional(number, 1024) memory = optional(number, 2048) max_scaling = optional(number, 20) min_scaling = optional(number) }), { cpu = 1024 memory = 2048 max_scaling = 20 }) downloader = optional(object({ frontend = optional(object({ cpu = optional(number, 1024) memory = optional(number, 2048) max_scaling = optional(number, 5) min_scaling = optional(number, 1) }), { cpu = 1024 memory = 2048 max_scaling = 10 min_scaling = 1 }) })) remote_execution = optional(object({ executors = map(object({ platform = optional(string) image = string additional_platform_properties = optional(map(string), {}) workers = optional(list(object({ scaling = optional(object({ minimum = optional(number) maximum = optional(number) warm = optional(number) fast = optional(object({ target = optional(number) size = optional(number) cooldown = optional(number) })) policy = optional(object({ target = optional(number, 100) scale_in_cooldown = optional(number, 300) scale_out_cooldown = optional(number, 60) }), { target = 100 scale_in_cooldown = 300 scale_out_cooldown = 60 }) rules = optional(map(object({ schedule = string timezone = optional(string) minimum = optional(number) maximum = optional(number) }))) }), {}) isolated_actions = optional(object({ cpu = optional(number, 1024) memory = optional(number) })) network = optional(bool, true) max_concurrency = optional(number) max_download_concurrency = optional(number, 1000000) max_upload_concurrency = optional(number, 1000000) ec2 = optional(object({ instance_type = optional(string) instance_image = optional(string) docker_group = optional(string) })) ecs = optional(object({ architecture = optional(string) cpu = optional(number, 1024) memory = optional(number, 2048) })) runner = optional(object({ kill_processes = optional(bool) set_tmp_dir_env_variable = optional(bool) clean_tmp_directories = optional(list(string)) })) docker_user = optional(string, null) })), [{}]) })) use_size_class_cache = optional(bool, false) worker_cloudwatch_agent = optional(bool, true) storage = optional(object({ num_shards = optional(number) instance_type = optional(string) instance_image_id = optional(string) upstream_fallback = optional(bool, true) mirror = optional(bool) })) })) }) | n/a | yes |
repository_urls | The repository URLs for the Docker images used by this module. Meant to be used in concert with the ecr_images submodule. | map(string) | { "adot_exporter": "public.ecr.aws/aws-observability/aws-otel-collector", "alert_manager": "quay.io/prometheus/alertmanager:v0.27.0", "aws_cli": "public.ecr.aws/aws-cli/aws-cli", "bash": "public.ecr.aws/docker/library/bash", "bb_browser": "ghcr.io/buildbarn/bb-browser:20240613T055327Z-f0fbe96", "bb_remote_asset": "ghcr.io/buildbarn/bb-remote-asset:20241014t200011z-5a41232", "bb_replicator": "ghcr.io/buildbarn/bb-replicator:20241120T153311Z-7c63b92", "bb_runner_installer": "ghcr.io/buildbarn/bb-runner-installer:20240716t044555z-9850e82", "bb_scheduler": "ghcr.io/buildbarn/bb-scheduler:20240716t044555z-9850e82", "bb_storage": "ghcr.io/buildbarn/bb-storage:20250226t091209z-85aafcb", "bb_worker": "ghcr.io/buildbarn/bb-worker:20240716t044555z-9850e82", "busybox": "public.ecr.aws/docker/library/busybox", "curl_jq": "registry.gitlab.com/gitlab-ci-utils/curl-jq:3.0.0", "grafana": "docker.io/grafana/grafana-enterprise:11.6.0-ubuntu", "otel_collector_contrib": "ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-contrib:0.115.1", "prometheus": "quay.io/prometheus/prometheus:v2.52.0" } | no |
resource_types | Mapping of resource type name to settings for that type. Reference the name of a resource type in resource_type fields. | map(object({ # The ID of the AMI to use for this resource image_id = string # A list of instance types that are acceptable in the ASG instance_types = list(string) # The size of the root EBS volume in GB root_volume_size_gb = optional(number, 64) # Tags to apply to this resource tags = optional(map(string), {}) # Defines if spot instances should be used for this resource use_spot = optional(bool, false) # When using spot instances, allows further customization over the spot vs on-demand allocation instance_policy = optional(object({ on_demand_base_capacity = optional(number, 0) on_demand_percentage_above_base_capacity = optional(number, 0) spot_allocation_strategy = optional(string, "price-capacity-optimized") spot_max_price = optional(string, "") spot_instance_pools = optional(number, 2) }), {}) })) | {} | no |
scaling_function_memory_mb | The amount of memory to assign to scaling functions in MB | number | 512 | no |
security_group_ids | Optional security group ID substitutions for Workflows resources. | map(any({ remote = { adot = " frontend = " scheduler = " storage = " asg = " asg_worker = " alb = " vpce = " browser = " downloader = " } external_remote = { adot = " frontend = " scheduler = " storage = " asg = " asg_worker = " alb = " vpce = " browser = " downloader = " } workflows_services = { bessie = " bessie-rds = " kilgore = " otel = " prometheus = " alert-manager = " alb = " } delivery = { default = " } bk = { scaler = " } cci = { scaler = " } gha = { scaler = " } gl = { scaler = " } })) | {} | no |
support | Set of properties that allow Aspect to provide oncall support for Workflows | object({ # If true, alerts generated by Workflows will be reported back to Aspect. # Depending on the severity of the alert, this may result in an oncall # engineer being paged depending on the level of support included with this # Workflows install. alert_aspect = optional(bool, true) # A set of secret IDs that can be overridden if required. secrets = optional(object({ # Override the secret ID used for fetching the PagerDuty routing key from Aspect's AWS account. aspect_pagerduty_routing_key_id = optional(string) # Override the secret ID used for fetching the Slack token from Aspect's AWS account. aspect_slack_token_secret_id = optional(string) # Override the KMS ID used to decrypt the support secrets from Aspect. aspect_support_secret_kms_id = optional(string) }), {}) # Role ARN that allows support level access for Aspect. support_role_name = optional(string, null) # Role ARN that allows extended support access for Aspect. # This role will have write access to various areas of Workflows infrastructure, # however it can only be assumed by a subset of Aspect oncall engineers. operator_role_name = optional(string, null) # Add policies that allow access to CI infrastructure instances via SSM enable_ssm_access = optional(bool, false) }) | n/a | yes |
tags | Tags to add to every resource Aspect Workflows creates | map(string) | {} | no |
telemetry | Configuration options for Workflows telemetry | object({ # Configuration for where Workflows telemetry data gets exported destinations = optional(object({ # Which exporters to set up. honeycomb = optional(object({ # Honeycomb dataset to set for exports to this destination. dataset = optional(string) # Honeycomb Team secret reference, used for authentication. team_secret = object({ id = string arn = string }) })) datadog = optional(object({ # Datadog agent ingest site. site = string # Datadog API key secret reference, used for authentication. key_secret = object({ id = string arn = string }) })) generic_otlp = optional(object({ # Endpoint to export telemetry to. endpoint = string })) }), {}) }) | {} | no |
vpc_id | ID of the VPC in which to deploy | string | n/a | yes |
vpc_subnets | List of subnet IDs to use for VM infrastructure | list(string) | n/a | yes |
vpc_subnets_public | List of subnet IDs to use for public facing VM infrastructure | list(string) | [] | no |
warming_sets | Mapping of warming set to settings for that set | map(object({ additional_paths = optional(string) })) | {} | no |
webapp | n/a | object({ dns = object({ hosted_zone_id = optional(string, null) hosted_zone_name = optional(string, null) hosted_zone_record_name = optional(string, "app") }) oidc = optional(object({ issuer = string auth_endpoint = string token_endpoint = string user_info_endpoint = string client_id = string client_secret = string session_timeout_seconds = optional(number, null) }), null) }) | null | no |
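
As an illustration of how the required inputs fit together, the sketch below calls the module with only the inputs marked `yes` in the Required column, plus one resource type and one GitHub Actions runner group. Every value is a placeholder, the `source` path is hypothetical, and `hosts = ["gha"]` assumes that host identifiers match the CI submodule names (`bk`, `cci`, `gha`, `gl`), which the table does not state.

```hcl
module "workflows" {
  # Hypothetical path; point this at wherever the module is vendored.
  source = "./workflows"

  # Required identifiers (placeholder values).
  customer_id             = "example-customer"
  aspect_artifacts_bucket = "example-aspect-artifacts"

  # Required networking inputs (placeholder IDs).
  vpc_id      = "vpc-0123456789abcdef0"
  vpc_subnets = ["subnet-0123456789abcdef0"]

  # Assumption: host identifiers mirror the CI submodule names.
  hosts = ["gha"]

  # All attributes of `support` are optional, so an empty object
  # accepts the documented defaults (for example alert_aspect = true).
  support = {}

  # `remote.storage` is required, but its attributes are all optional
  # (num_shards defaults to 3).
  remote = {
    storage = {}
  }

  # A resource type that runner groups refer to by name.
  resource_types = {
    default = {
      image_id       = "ami-0123456789abcdef0" # placeholder AMI ID
      instance_types = ["c5.2xlarge"]          # placeholder instance type
    }
  }

  # One GitHub Actions runner group using the resource type above.
  gha_runner_groups = {
    default = {
      agent_idle_timeout_min = 5
      max_runners            = 10
      queue                  = "aspect-default"           # placeholder
      resource_type          = "default"                  # key in resource_types
      gh_repo                = "example-org/example-repo" # placeholder
    }
  }
}
```

Fields left out of the runner group, such as `min_runners` or `warming`, fall back to the defaults listed in the `gha_runner_groups` type.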
Outputs
Name | Description |
---|---|
alarms_sns_topic_arn | SNS topic ARN that provides notifications of all Workflows alarms |
api_secret_id | Secret for the Aspect API |
bk_agent_token_secret_ids | Mapping of Buildkite runner name to Buildkite agent token secret ID |
bk_api_token_secret_ids | Mapping of Buildkite runner name to Buildkite API token secret ID |
bk_git_ssh_key_secret_ids | Mapping of Buildkite runner name to SSH key secret ID |
buildkite_agent_hooks_buckets | Name of the bucket for storing custom Buildkite Agent hooks |
cost_allocation_tag | (deprecated) Name of the cost allocation tag to use |
cost_allocation_tag_value | (deprecated) The value of the cost allocation tag |
external_remote_cache_endpoint | The endpoint of the Internet-facing remote cache, if enabled. |
external_remote_certificate | The ACM certificate in use by the external remote cluster, if present. |
external_remote_load_balancer | The load balancer configuration for the external remote cluster, if present. |
gha_lambda_webhook_secret_ids | Mapping of GitHub Actions runner name and repo key to the IDs of the secrets containing the webhook token that the scaling lambda will use to verify the event came from GitHub |
gha_runner_secret_ids | Optional mapping of GitHub Actions runner name and repo key to runner secret ID |
gha_secret_ids | Mapping of GitHub Actions runner name and repo key to secret ID |
github_token_secret_id | Secret ID for a GitHub token used for making read-only calls to GitHub during a build |
gl_secret_ids | Mapping of GitLab runner name and repo key to secret ID |
grafana_ecs_task_exec_role | The IAM role ARN for the ECS task execution |
grafana_ecs_task_role | The IAM role ARN for the ECS task |
internal_remote_cache_certificate | The CA certificate for the VPC-facing remote cache. |
internal_remote_cache_endpoint | The endpoint of the VPC-facing remote cache. |
internal_remote_load_balancer | The load balancer configuration for the internal remote cluster, if present. |
managed_prometheus_endpoint | The Amazon Managed Prometheus (AMP) endpoint |
runner_secret_ids | Mapping of CircleCI runner name to secret ID |
security_group_rules | Security group rules for the Workflows module |
tags | Tags that will be added on all resources |
telemetry_config_bucket | S3 bucket where OTEL and Grafana config files live |
warming_management_policies | n/a |
webapp_certificate | The ACM certificate in use by the webapp, if present. |
webapp_load_balancer | The load balancer configuration for the webapp, if present. |
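
Outputs can be consumed from the calling configuration like any other module attribute. The sketch below assumes the module is instantiated as `module "workflows"`; the email address and the resource and output names are hypothetical.

```hcl
# Subscribe an email address to the SNS topic that carries all Workflows alarms.
resource "aws_sns_topic_subscription" "workflows_alarms_email" {
  topic_arn = module.workflows.alarms_sns_topic_arn
  protocol  = "email"
  endpoint  = "oncall@example.com" # placeholder address
}

# Re-export the VPC-facing remote cache endpoint for other configurations.
output "workflows_internal_cache_endpoint" {
  value = module.workflows.internal_remote_cache_endpoint
}
```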