Functional Testing#
Purpose#
Functional testing validates that solutions behave correctly by executing scafctl commands against them and asserting on the output. Solution authors define test cases inline in their solution spec. The scafctl test functional command discovers tests, sets up isolated sandboxes, runs builtin and user-defined tests, and reports results.
This is the primary mechanism for validating solutions in CI and during development.
Implementation Status#
| Feature | Status | Notes |
|---|---|---|
| test functional CLI command | ✅ Done | pkg/cmd/scafctl/test/functional.go |
| test list CLI subcommand | ✅ Done | pkg/cmd/scafctl/test/list.go |
| Test spec types | ✅ Done | pkg/solution/soltesting/types.go |
| Builtin tests | ✅ Done | Parse, resolve, render, lint in builtins.go |
| Command-based test execution | ✅ Done | In-process cobra execution via CommandBuilder |
| CEL assertions | ✅ Done | pkg/solution/soltesting/assertions.go |
| Regex assertions | ✅ Done | |
| Contains assertions | ✅ Done | |
| Negation assertions | ✅ Done | notContains, notRegex |
| Golden file snapshots | ✅ Done | pkg/solution/soltesting/snapshot.go |
| Init scripts (exec provider) | ✅ Done | InitStep with exec provider schema |
| Test file includes | ✅ Done | TestInclude discovery source in bundler |
| Temp directory sandbox | ✅ Done | pkg/solution/soltesting/sandbox.go |
| JUnit XML reporting | ✅ Done | pkg/solution/soltesting/junit.go |
| Compose support for tests | ✅ Done | mergeTests() and mergeTestConfig() in compose |
| Parallel test execution | ✅ Done | Semaphore-based concurrency control |
| CEL assertion diagnostics | ✅ Done | pkg/solution/soltesting/diagnostics.go |
| Suite-level setup | ✅ Done | testing.config.setup with base sandbox copy |
| Test tags and filtering | ✅ Done | --tag flag, tags field on test cases |
| Per-test environment variables | ✅ Done | env field on test cases |
| Cleanup steps | ✅ Done | cleanup field, runs even on failure |
| Test inheritance (extends) | ✅ Done | pkg/solution/soltesting/inheritance.go |
| Assertion target (stderr) | ✅ Done | target field: stdout, stderr, combined |
| File assertions (__files) | ✅ Done | Diff-based sandbox file change detection |
| Fail-fast (per-solution) | ✅ Done | --fail-fast stops remaining tests per solution |
| Test name validation | ✅ Done | Enforced in TestCase.Validate() |
| Selective builtin skip | ✅ Done | SkipBuiltinsValue with custom unmarshal |
| In-process command execution | ✅ Done | Root() with *RootOptions, CommandBuilder |
| Concurrency control | ✅ Done | -j flag, --sequential as sugar for -j 1 |
| Conditional skip (CEL skip expression) | ✅ Done | CEL-based runtime skip evaluation |
| Test retries | ✅ Done | retries field for flaky test resilience |
| Suite-level cleanup | ✅ Done | testing.config.cleanup for teardown after all tests |
| File size guard | ✅ Done | Cap files[].content at 10MB to prevent OOM |
| In-process execution safety | ✅ Done | Root() accepts *RootOptions, no package-level state |
| Unused template lint warning | ✅ Done | unused-template lint rule |
| Solution filtering (--solution) | ✅ Done | Glob-based solution name filtering |
| --filter solution/test format | ✅ Done | --filter "solution/test-name" glob support |
| --dry-run flag | ✅ Done | Validate test definitions without executing |
| Suite-level env | ✅ Done | testing.config.env shared across all tests |
| Binary file content guard | ✅ Done | Non-UTF-8 files get content set to "<binary file>" |
| Test execution ordering | ✅ Done | Alphabetical by name; builtins first |
| Field max limits | ✅ Done | assertions: 100, files: 50, tags: 20, extends depth: 10 |
| Glob zero-match error | ✅ Done | Test files globs matching zero files produce error |
| Environment precedence chain | ✅ Done | process → testing.config.env → TestCase.env → InitStep.env |
| TestCase.Validate() | ✅ Done | Comprehensive test case validation method |
| Extends non-existent error | ✅ Done | extends referencing non-existent test names is a parse-time error |
| Tests per solution limit | ✅ Done | Max 500 tests per solution |
| Watch mode (--watch) | ✅ Done | fsnotify-based file watcher in soltesting/watch.go |
| Auto-generated tests (-o test) | ✅ Done | pkg/solution/soltesting/generate.go; wired into render solution, run resolver, run solution |
Test Definition#
Location#
Tests are defined under spec.testing.cases in the solution YAML. Like resolvers, tests support the compose mechanism for splitting into separate files.
Inline Example#
apiVersion: scafctl.io/v1
kind: Solution
metadata:
  name: terraform-scaffold
spec:
  resolvers:
    environment:
      type: string
      resolve:
        with:
          - provider: parameter
            inputs:
              key: env
          - provider: static
            inputs:
              value: dev
      validate:
        with:
          - provider: validation
            inputs:
              expression: '__self in ["dev", "staging", "prod"]'
              message: "Invalid environment"
  workflow:
    actions:
      render-main:
        provider: template
        inputs:
          template:
            tmpl: "main.tf.tmpl"
          output: "{{environment}}/main.tf"
  testing:
    cases:
      _base-render:
        description: "Base render test template"
        command: [render, solution]
        assertions:
          - expression: 'size(__output.actions) >= 1'
      renders-dev-defaults:
        description: "Default environment renders dev configuration"
        extends: [_base-render]
        tags: [smoke, render]
        assertions:
          - expression: 'size(__output.actions) == 1'
            message: "Should produce exactly one action"
          - contains: "dev/main.tf"
      renders-prod-with-resolver-run:
        description: "Run resolver with prod override"
        command: [run, resolver]
        args: ["-r", "env=prod"]
        tags: [resolver]
        assertions:
          - expression: '__output.environment == "prod"'
          - regex: '"environment":\s*"prod"'
      render-prod-override:
        description: "Render with prod override produces correct paths"
        extends: [_base-render]
        args: ["-r", "env=prod"]
        tags: [render]
        assertions:
          - expression: '__output.actions["render-main"].inputs.output == "prod/main.tf"'
      rejects-invalid-env:
        description: "Invalid environment fails validation"
        command: [run, resolver]
        args: ["-r", "env=invalid"]
        expectFailure: true
        tags: [validation]
        assertions:
          - contains: "Invalid environment"
          - notContains: "panic"
          - contains: "validation"
            target: stderr
      passes-lint:
        description: "Solution passes lint with no errors"
        command: [lint]
        tags: [lint]
        assertions:
          - expression: '__output.errorCount == 0'
      snapshot-action-graph:
        description: "Action graph matches golden file"
        command: [render, solution]
        args: ["-r", "env=dev"]
        tags: [snapshot]
        snapshot: "testdata/expected-render.json"
      renders-with-setup:
        description: "Render with custom setup and cleanup"
        command: [render, solution]
        env:
          CUSTOM_VAR: "test-value"
        init:
          - command: "mkdir -p templates"
        cleanup:
          - command: "echo 'cleanup complete'"
        assertions:
          - expression: 'size(__output.actions) >= 1'
          - expression: '__files["dev/main.tf"].exists'
      temporarily-disabled:
        description: "This test is skipped during development"
        skip: true
        skipReason: "Waiting on upstream provider fix"
        command: [render, solution]
        assertions:
          - expression: 'size(__output.actions) == 1'

Composed into Separate Files#
# solution.yaml
apiVersion: scafctl.io/v1
kind: Solution
metadata:
  name: terraform-scaffold
compose:
  - resolvers/environment.yaml
  - tests/rendering.yaml
  - tests/validation.yaml
spec:
  workflow:
    actions:
      render-main:
        provider: template
        inputs:
          template:
            tmpl: "main.tf.tmpl"
          output: "{{environment}}/main.tf"

# tests/rendering.yaml
spec:
  testing:
    cases:
      renders-dev-defaults:
        description: "Default environment renders dev configuration"
        command: [render, solution]
        tags: [smoke, render]
        assertions:
          - expression: 'size(__output.actions) == 1'
      renders-prod-override:
        description: "Render with prod override produces correct paths"
        command: [render, solution]
        args: ["-r", "env=prod"]
        tags: [render]
        assertions:
          - expression: '__output.actions["render-main"].inputs.output == "prod/main.tf"'

# tests/validation.yaml
spec:
  testing:
    cases:
      rejects-invalid-env:
        description: "Invalid environment fails validation"
        command: [run, resolver]
        args: ["-r", "env=invalid"]
        expectFailure: true
        tags: [validation]
        assertions:
          - contains: "Invalid environment"

Test Case Spec#
Each test case is a named entry under spec.testing.cases. Test names must match ^[a-zA-Z0-9][a-zA-Z0-9_-]*$ (letters, numbers, hyphens, underscores; must start with a letter or number). Names starting with _ are test templates — they are not executed directly but can be inherited via extends.
cases:
  <test-name>:
    description: <string>
    command: <list[string]>
    args: <list[string]>
    extends: <list[string]>
    tags: <list[string]>
    env: <map[string, string]>
    files: <list[string]>
    init: <list[InitStep]>
    cleanup: <list[InitStep]>
    assertions: <list[Assertion]>
    snapshot: <string>
    injectFile: <bool>
    expectFailure: <bool>
    exitCode: <int>
    timeout: <string> # Go duration format, e.g., "30s", "2m"
    skip: <bool | Expression>
    skipReason: <string>
    retries: <int>

The maximum number of tests per solution is 500.
Field Details#
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| description | string | Yes | — | Human-readable test description |
| command | []string | No | [render, solution] | scafctl subcommand as an array (e.g., [render, solution], [run, resolver], [lint]). By default the runner auto-injects -f <sandbox-path> — set injectFile: false to disable |
| args | []string | No | [] | Additional CLI flags appended after the command. -f must never be included here — use injectFile to control file injection |
| extends | []string | No | [] | Names of test templates to inherit from. Applied left-to-right; this test’s fields override inherited values. See Test Inheritance |
| tags | []string | No | [] | Tags for categorization and filtering. Use --tag to run only tests with matching tags. Max 20 tags per test |
| env | map[string]string | No | {} | Environment variables set for this test’s init, command, and cleanup steps. Merged with process environment. See Environment Precedence |
| files | []string | No | [] | Relative paths or globs for files required by this test. Supports ** recursive globs. Globs are resolved at sandbox setup time; zero-match globs produce a test error. Max 50 entries |
| init | []InitStep | No | [] | Setup steps executed sequentially before the command |
| cleanup | []InitStep | No | [] | Teardown steps executed after the command, even on failure. See Cleanup Steps |
| assertions | []Assertion | Conditional | — | Required unless snapshot is set. All assertions are evaluated regardless of prior failures. Max 100 assertions per test |
| snapshot | string | No | — | Relative path to a golden file for normalized comparison |
| injectFile | bool | No | true | When true (default), the runner auto-injects -f <sandbox-solution-path>. Set to false for commands that don’t accept -f (e.g., config get, auth status) or for catalog solution tests that use --catalog instead. -f must never appear in args regardless of this setting |
| expectFailure | bool | No | false | When true, the test passes if the command exits non-zero |
| exitCode | int | No | — | Exact expected exit code. Mutually exclusive with expectFailure — setting both is a validation error |
| timeout | string | No | "30s" | Per-test timeout as a Go duration string (e.g., "30s", "2m", "1m30s"). Parsed via a custom Duration type with string-based YAML/JSON marshalling |
| skip | bool or Expression | No | false | Skip this test. When true, unconditionally skip. When a CEL expression string, evaluated at discovery time — if true, the test is skipped. Context variables: os (GOOS), arch (GOARCH), env (environment variables map), subprocess (bool). Example: 'os == "windows"' |
| skipReason | string | No | — | Human-readable reason for skipping. Shown in test output |
| retries | int | No | 0 | Number of retry attempts for a failing test. The test passes if any attempt succeeds. Retry count shown in output: PASS (retry 2/3) |
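The retries semantics can be illustrated with a small sketch. The attempt function below is a hypothetical stand-in for executing one full test case; this is not the runner's actual code.

```go
package main

import "fmt"

// runWithRetries runs attempt() up to 1+retries times and reports
// success if any attempt passes — mirroring the documented behavior
// that the test passes if any attempt succeeds.
func runWithRetries(retries int, attempt func(try int) bool) (passed bool, tries int) {
	for try := 1; try <= retries+1; try++ {
		if attempt(try) {
			return true, try
		}
	}
	return false, retries + 1
}

func main() {
	// A flaky test that fails twice, then passes on the third attempt.
	fails := 2
	passed, tries := runWithRetries(3, func(try int) bool {
		return try > fails
	})
	if passed && tries > 1 {
		// Matches the documented output style: PASS (retry 2/3)
		fmt.Printf("PASS (retry %d/%d)\n", tries-1, 3)
	} else if passed {
		fmt.Println("PASS")
	} else {
		fmt.Println("FAIL")
	}
}
```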
Test Configuration#
Solution-level test configuration is defined under spec.testing.config:
spec:
  testing:
    config:
      skipBuiltins: true
      env:
        SCAFCTL_CONFIG_DIR: "$SCAFCTL_SANDBOX_DIR"
      setup:
        - command: "scafctl config set defaults.environment staging"
        - command: "mkdir -p templates"
      cleanup:
        - command: "echo 'suite teardown complete'"

skipBuiltins accepts either a boolean or a list of builtin test names for selective skipping:
# Skip all builtins
config:
  skipBuiltins: true

# Skip only specific builtins
config:
  skipBuiltins:
    - resolve-defaults
    - render-defaults

| Field | Type | Default | Description |
|---|---|---|---|
| skipBuiltins | bool or []string | false | Disable builtin tests. true disables all; a list of names disables only those builtins (e.g., ["resolve-defaults"]) |
| env | map[string]string | {} | Suite-level environment variables applied to all tests. Merged with process environment. Individual test env fields override on key conflict. See Environment Precedence |
| setup | []InitStep | [] | Suite-level setup steps. Run once, then the resulting sandbox is copied per-test |
| cleanup | []InitStep | [] | Suite-level teardown steps. Run once after all tests for the solution complete, even on failure. Symmetric with setup |
Suite-Level Setup#
When testing.config.setup is defined:
- Create a base sandbox and copy the solution + bundle files
- Run setup steps sequentially in the base sandbox
- For each test case, copy the prepared base sandbox to an isolated per-test sandbox
- Run per-test init steps, then the command
This avoids duplicating the same init steps across every test. If any setup step fails, all tests for that solution report as error.
Compose Merge Semantics#
When testing.config appears in multiple compose files:
| Field | Merge Behavior |
|---|---|
| skipBuiltins (bool) | true wins — if any compose file sets skipBuiltins: true, all builtins are skipped |
| skipBuiltins (list) | Unioned (deduplicated) across all compose files |
| setup | Appended in compose-file order (first file’s steps run first). This is a new merge strategy distinct from the existing reject-duplicates and union patterns |
| cleanup | Appended in compose-file order (first file’s steps run first) |
| env | Merged map; last compose file wins on key conflict |
Compose-file order affects testing.config.setup, testing.config.cleanup, and testing.config.env merge ordering but does not affect test execution order. Tests from all compose files are merged into a single map and executed alphabetically (see Test Execution Ordering).
spec.testing.cases entries are merged by name using the reject-duplicates strategy (same as resolvers and actions). If two compose files define a test with the same name, the compose merge fails with an error.
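The reject-duplicates merge can be sketched as follows. TestCase here is a hypothetical stand-in for the real type, and mergeCases is illustrative — not the actual mergeTests() implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// TestCase is a minimal stand-in for the real test case type.
type TestCase struct{ Description string }

// mergeCases merges per-compose-file case maps using the documented
// reject-duplicates strategy: a test name defined in two files is an error.
func mergeCases(parts ...map[string]TestCase) (map[string]TestCase, error) {
	merged := map[string]TestCase{}
	for _, part := range parts {
		for name, tc := range part {
			if _, dup := merged[name]; dup {
				return nil, fmt.Errorf("duplicate test case %q across compose files", name)
			}
			merged[name] = tc
		}
	}
	return merged, nil
}

func main() {
	a := map[string]TestCase{"renders-dev": {"dev"}}
	b := map[string]TestCase{"renders-prod": {"prod"}}
	merged, err := mergeCases(a, b)
	if err != nil {
		panic(err)
	}
	names := make([]string, 0, len(merged))
	for n := range merged {
		names = append(names, n)
	}
	sort.Strings(names) // tests execute alphabetically by name
	fmt.Println(names)

	// Defining the same name twice is rejected, not silently overwritten.
	if _, err := mergeCases(a, a); err == nil {
		panic("expected duplicate error")
	}
}
```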
composePart struct: The composePart struct in pkg/solution/bundler/compose.go has a Testing *soltesting.TestSuite field to parse test-related sections from compose files.

Note: SkipBuiltinsValue requires both UnmarshalYAML and MarshalYAML implementations to survive the deepCopySolutionYAML round-trip used in compose.
Init Scripts#
Tests can define setup steps that run before the test command. Init steps execute sequentially in the sandbox directory. Init uses the exec provider’s input schema, giving access to all execution options.
cases:
  renders-with-custom-config:
    description: "Renders with custom configuration"
    init:
      - command: "mkdir -p templates && echo '# Generated' > templates/main.tf.tmpl"
      - command: "scafctl config set defaults.environment staging"
        env:
          SCAFCTL_CONFIG_DIR: "$SCAFCTL_SANDBOX_DIR"
      - command: "echo 'setting up test data'"
        shell: bash
        timeout: 10
        workingDir: "templates"

InitStep#
Init steps accept the same fields as the exec provider:
| Field | Type | Required | Description |
|---|---|---|---|
| command | string | Yes | Command to execute. Supports POSIX shell syntax (pipes, redirections, variables) |
| args | []string | No | Additional arguments, automatically shell-quoted |
| stdin | string | No | Standard input to provide to the command |
| workingDir | string | No | Working directory (relative to sandbox root) |
| env | map[string]string | No | Environment variables merged with the parent process |
| timeout | int | No | Timeout in seconds (default: 30) |
| shell | string | No | Shell interpreter: auto (default), sh, bash, pwsh, cmd |
Init failures cause the test to report as error (not fail). Stdout/stderr from init steps are captured and included in verbose output for debugging.
Environment Variables#
The test runner automatically injects the following environment variables into every init step and test command:
| Variable | Description |
|---|---|
| SCAFCTL_SANDBOX_DIR | Absolute path to the sandbox directory |
These are standard process environment variables — no custom template syntax.
Environment Precedence#
Environment variables are resolved in the following precedence order (highest wins):
| Priority | Source | Description |
|---|---|---|
| 1 (lowest) | Process environment | Inherited from the parent process |
| 2 | testing.config.env | Suite-level env applied to all tests |
| 3 | TestCase.env | Per-test env overrides suite-level on key conflict |
| 4 (highest) | InitStep.env | Per-step env overrides all others on key conflict |
Each level merges with the previous — keys not overridden are preserved. The SCAFCTL_SANDBOX_DIR variable is always injected by the runner and cannot be overridden.
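The precedence chain can be sketched as a layered map merge. resolveEnv is a hypothetical helper, not the runner's actual function:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// resolveEnv layers environment maps in the documented precedence order:
// process env (lowest) → suite env → test env → step env (highest).
// SCAFCTL_SANDBOX_DIR is injected last so it cannot be overridden.
func resolveEnv(sandboxDir string, layers ...map[string]string) map[string]string {
	merged := map[string]string{}
	for _, kv := range os.Environ() { // lowest priority: process env
		if i := strings.IndexByte(kv, '='); i > 0 {
			merged[kv[:i]] = kv[i+1:]
		}
	}
	for _, layer := range layers { // suite → test → step, in order
		for k, v := range layer {
			merged[k] = v
		}
	}
	merged["SCAFCTL_SANDBOX_DIR"] = sandboxDir // runner-injected, always wins
	return merged
}

func main() {
	suite := map[string]string{"MODE": "suite", "SHARED": "yes"}
	test := map[string]string{"MODE": "test"}
	step := map[string]string{"MODE": "step", "SCAFCTL_SANDBOX_DIR": "ignored"}
	env := resolveEnv("/tmp/sandbox-123", suite, test, step)
	fmt.Println(env["MODE"], env["SHARED"], env["SCAFCTL_SANDBOX_DIR"])
}
```

Keys not overridden at a higher level pass through unchanged, which is why suite-level SHARED survives the per-test and per-step layers above.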
Test Files#
Tests can declare additional files required for execution. These files are copied into the sandbox alongside the solution.
spec:
  testing:
    cases:
      renders-with-custom-template:
        description: "Renders with a test-specific template"
        files:
          - testdata/custom-main.tf.tmpl
          - testdata/variables.json
        command: [render, solution]
        assertions:
          - expression: 'size(__output.actions) >= 1'
  bundle:
    include:
      - testdata/**

How Files Work#
| Phase | Behavior |
|---|---|
| Development | files paths are resolved relative to the solution directory. The runner copies them into the sandbox before init/command execution |
| Build | scafctl build auto-discovers files referenced in spec.testing.cases[*].files and includes them in the bundle artifact as a TestInclude discovery source |
| Lint | scafctl lint produces an error if test files are not covered by bundle.include patterns. Tests must work from remote catalog artifacts |
| Bundle extraction | Test files are extracted alongside solution files when a bundled solution is unpacked |
Files are copied into the sandbox maintaining their relative directory structure. Path traversal above the solution root (..) is rejected. Symlinks are not supported and are rejected.
Glob Resolution#
Globs in the files field (e.g., testdata/**/*.json) are resolved at sandbox setup time — after suite-level setup completes but before per-test init steps run. This means:
- Globs are expanded against the solution source directory (or the suite-level base sandbox if testing.config.setup is defined)
- Zero-match globs produce a test error, not a silent no-op. This catches typos and missing test data early
- Glob patterns are validated at lint time — scafctl lint warns on syntactically invalid glob patterns
- Resolved paths are logged in verbose output for debugging
Test Inheritance#
Tests can inherit from template tests using the extends field. This reduces duplication when many tests share common configuration.
Template Tests#
Test names starting with _ are templates — they are not executed directly and do not appear in test output. They exist only to be inherited by other tests.
scafctl lint produces a warning for templates that are never referenced by any extends field. Template-only compose files (containing only _-prefixed tests) are valid.
Extends Rules#
- extends accepts a list of test template names, applied left-to-right
- The extending test’s fields override inherited values
- Circular extends chains are detected and rejected
- Templates can extend other templates
- Extends chain depth is limited to 10 levels. Chains deeper than 10 are rejected with a validation error
- Referencing a non-existent test name in extends is a validation error at parse time. scafctl lint also reports this as an error
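The cycle, depth, and missing-name rules can be sketched as a depth-first walk. resolveExtends is illustrative, not the actual inheritance.go code:

```go
package main

import (
	"errors"
	"fmt"
)

const maxExtendsDepth = 10

// resolveExtends walks an extends chain depth-first, rejecting cycles,
// missing template names, and chains deeper than 10 levels. It returns
// the linearized order: bases left-to-right, then the extending test.
func resolveExtends(cases map[string][]string, name string, depth int, seen map[string]bool) ([]string, error) {
	if depth > maxExtendsDepth {
		return nil, fmt.Errorf("extends chain exceeds max depth %d", maxExtendsDepth)
	}
	if seen[name] {
		return nil, errors.New("circular extends chain involving " + name)
	}
	bases, ok := cases[name]
	if !ok {
		return nil, fmt.Errorf("extends references non-existent test %q", name)
	}
	seen[name] = true
	defer delete(seen, name) // allow diamond-shaped (non-cyclic) reuse
	var order []string
	for _, base := range bases {
		sub, err := resolveExtends(cases, base, depth+1, seen)
		if err != nil {
			return nil, err
		}
		order = append(order, sub...)
	}
	return append(order, name), nil
}

func main() {
	cases := map[string][]string{
		"_base-render": nil,
		"_base-prod":   nil,
		"render-prod":  {"_base-render", "_base-prod"},
	}
	order, err := resolveExtends(cases, "render-prod", 0, map[string]bool{})
	fmt.Println(order, err)

	cycle := map[string][]string{"a": {"b"}, "b": {"a"}}
	_, err = resolveExtends(cycle, "a", 0, map[string]bool{})
	fmt.Println(err != nil) // cycle detected
}
```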
Field Merge Strategy#
| Field | Merge Behavior |
|---|---|
| command | Child wins if set |
| args | Appended (base args first, then child args) |
| assertions | Appended (base assertions first, then child assertions) |
| files | Appended (deduplicated) |
| init | Base init steps prepended before child init steps |
| cleanup | Base cleanup steps appended after child cleanup steps |
| tags | Appended (deduplicated) |
| env | Merged map (child values override base on key conflict) |
| description | Child wins if set |
| timeout | Child wins if set |
| expectFailure | Child wins if set |
| exitCode | Child wins if set |
| skip | Child wins if set |
| injectFile | Child wins if set |
| snapshot | Child wins if set |
| retries | Child wins if set |
Example#
cases:
  _base-render:
    description: "Base render test"
    command: [render, solution]
    tags: [render]
    assertions:
      - expression: 'size(__output.actions) >= 1'
  _base-prod:
    description: "Base prod test"
    args: ["-r", "env=prod"]
    tags: [prod]
  render-prod:
    description: "Render prod configuration"
    extends: [_base-render, _base-prod]
    assertions:
      - expression: '__output.actions["render-main"].inputs.output == "prod/main.tf"'

The resolved render-prod test inherits:

- command: [render, solution] from _base-render
- args: ["-r", "env=prod"] from _base-prod
- tags: [render, prod] merged from both bases
- assertions: both assertions (one from _base-render, one from render-prod itself)
- description: "Render prod configuration" overridden by the child
Cleanup Steps#
Tests can define cleanup steps that run after the test command, even if the command or assertions fail. Cleanup uses the same InitStep schema as init.
cases:
  renders-with-temp-state:
    description: "Render with temporary state file"
    command: [render, solution]
    init:
      - command: "echo '{\"key\": \"value\"}' > state.json"
    cleanup:
      - command: "echo 'cleanup complete'"
    assertions:
      - expression: 'size(__output.actions) >= 1'

Cleanup steps:
- Execute sequentially in the sandbox directory
- Run even when the test command fails, init fails, or assertions fail
- Cleanup failures are logged in verbose output but do not change the test status (the original pass/fail/error result is preserved)
- Have access to the same environment variables as init steps (SCAFCTL_SANDBOX_DIR, per-test env)
Test Tags#
Tests can be tagged for categorization and selective execution.
cases:
  renders-dev:
    description: "Render dev config"
    command: [render, solution]
    tags: [smoke, render, fast]
    assertions:
      - expression: 'size(__output.actions) >= 1'

Filter tests by tag using the --tag flag:
# Run only tests tagged "smoke"
scafctl test functional -f solution.yaml --tag smoke

# Combine with name filter
scafctl test functional -f solution.yaml --tag render --filter "*prod*"

# Filter by solution and tag
scafctl test functional --tests-path ./solutions/ --solution "terraform-*" --tag smoke

# Filter with solution/test-name format
scafctl test functional --tests-path ./solutions/ --filter "terraform-*/render-*"

A test matches the --tag filter if it has any of the specified tags. Tags inherited via extends are included in the match.
When --tag, --filter, and --solution are combined, they are ANDed: a test must match the solution filter AND the name filter AND have a matching tag.
Test Name Validation#
Test names must match the pattern ^[a-zA-Z0-9][a-zA-Z0-9_-]*$:
- Must start with a letter or digit
- May contain letters, digits, hyphens (-), and underscores (_)
- Template names starting with _ are the exception — they must match ^_[a-zA-Z0-9][a-zA-Z0-9_-]*$
This constraint ensures compatibility with JUnit XML output, CLI --filter glob matching, and compose file merge keys.
Invalid names are rejected during YAML parsing and surfaced as lint errors.
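A sketch of the check in Go — the two patterns are taken directly from the rules above; validateTestName is a hypothetical helper, not the actual TestCase.Validate() code:

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	testNameRe     = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9_-]*$`)
	templateNameRe = regexp.MustCompile(`^_[a-zA-Z0-9][a-zA-Z0-9_-]*$`)
)

// validateTestName applies both patterns: regular test names start with
// a letter or digit; names prefixed with a single _ are templates.
func validateTestName(name string) error {
	if testNameRe.MatchString(name) || templateNameRe.MatchString(name) {
		return nil
	}
	return fmt.Errorf("invalid test name %q", name)
}

func main() {
	for _, name := range []string{"renders-dev", "_base-render", "-bad", "_", "has space"} {
		fmt.Println(name, validateTestName(name) == nil)
	}
}
```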
Builtin Tests#
Every solution automatically receives builtin tests unless testing.config.skipBuiltins is set. Builtins validate baseline correctness without requiring explicit test definitions.
| Builtin Test | Command | Passes When |
|---|---|---|
| builtin:parse | (internal) | Solution YAML parses without errors |
| builtin:lint | lint | No lint errors (warnings allowed) |
| builtin:resolve-defaults | run resolver | All resolvers resolve with default values |
| builtin:render-defaults | render solution | Render succeeds with default values |
Builtins run before user-defined tests. By default, if a builtin fails, user-defined tests still run (they are independent). Use --fail-fast to stop remaining tests for that solution on first failure.
Selective skipping is supported via testing.config.skipBuiltins — set to true to skip all, or provide a list of specific builtin names (without the builtin: prefix) to skip only those.
Assertions#
Each assertion has exactly one of expression, regex, contains, notRegex, or notContains, plus an optional message and target. Exactly one assertion type must be set — this is enforced via runtime validation after YAML unmarshal.
All assertions in a test are always evaluated, regardless of whether earlier assertions fail. This ensures the user sees all problems at once rather than fixing them one at a time.
| Field | Type | Description |
|---|---|---|
| expression | Expression | CEL expression evaluating to bool. Runs against structured output context |
| regex | string | Regex pattern that must match somewhere in the target text |
| contains | string | Substring that must appear in the target text |
| notRegex | string | Regex pattern that must NOT match anywhere in the target text |
| notContains | string | Substring that must NOT appear in the target text |
| target | string | Text to match against: stdout (default), stderr, or combined (stdout + stderr). Only applies to regex, contains, notRegex, notContains. CEL expressions access both via context variables |
| message | string | Custom failure message (optional). If omitted, the assertion itself is shown |
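The text-matching assertion kinds can be sketched as follows. CEL expression assertions are omitted, and the Assertion struct and check function here are illustrative, not the actual assertions.go types:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Assertion holds the text-matching assertion kinds from the table above.
type Assertion struct {
	Contains, NotContains, Regex, NotRegex, Target, Message string
}

// check evaluates one assertion against captured output, honoring the
// target field (stdout by default, stderr, or combined).
func check(a Assertion, stdout, stderr string) error {
	text := stdout
	switch a.Target {
	case "stderr":
		text = stderr
	case "combined":
		text = stdout + "\n" + stderr
	}
	switch {
	case a.Contains != "" && !strings.Contains(text, a.Contains):
		return fmt.Errorf("missing substring %q", a.Contains)
	case a.NotContains != "" && strings.Contains(text, a.NotContains):
		return fmt.Errorf("forbidden substring %q present", a.NotContains)
	case a.Regex != "" && !regexp.MustCompile(a.Regex).MatchString(text):
		return fmt.Errorf("pattern %q did not match", a.Regex)
	case a.NotRegex != "" && regexp.MustCompile(a.NotRegex).MatchString(text):
		return fmt.Errorf("forbidden pattern %q matched", a.NotRegex)
	}
	return nil
}

func main() {
	stdout, stderr := `{"environment": "prod"}`, "warning: deprecated field"
	checks := []Assertion{
		{Regex: `"environment":\s*"prod"`},
		{Contains: "deprecated", Target: "stderr"},
		{NotContains: "panic", Target: "combined"},
	}
	// All assertions are evaluated even if an earlier one fails.
	for _, a := range checks {
		fmt.Println(check(a, stdout, stderr) == nil)
	}
}
```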
Target Field#
The target field controls which output stream regex, contains, notRegex, and notContains assertions match against:
assertions:
  # Matches against stdout (default)
  - contains: "rendered successfully"
  # Matches against stderr
  - contains: "warning: deprecated field"
    target: stderr
  # Matches against combined stdout + stderr
  - notContains: "panic"
    target: combined
  # CEL expressions always have access to both via context variables
  - expression: '__stderr.contains("warning") && __stdout.contains("success")'

The target field has no effect on expression assertions — CEL expressions access __stdout, __stderr, __exitCode, __output, and __files as separate context variables.
Assertion Context#
How Output Is Captured#
When a test executes a scafctl command:
- The runner captures stdout, stderr, and exit code
- If stdout is valid JSON, it is parsed into the __output variable. Otherwise __output is nil
- The tester is responsible for passing -o json in args when structured output is needed
CEL Context Variables#
| Variable | Type | Always Available | Description |
|---|---|---|---|
| __stdout | string | Yes | Raw stdout text |
| __stderr | string | Yes | Raw stderr text |
| __exitCode | int | Yes | Process exit code |
| __output | map[string, any] | When -o json is passed in args | Parsed JSON output. nil when stdout is not valid JSON. CEL expressions referencing __output when nil cause the test to report as error (not fail) with the diagnostic: “variable ‘__output’ is nil — this command does not support structured output or -o json was not specified”. This is a configuration issue, not an assertion failure |
| __files | map[string, FileInfo] | Yes | Files created or modified in the sandbox during command execution. Key is relative path. Each FileInfo has exists (bool) and content (string) |
The output variable structure depends on the command:
| Command | __output structure |
|---|---|
| render solution | Action graph: __output.actions, each with provider, inputs, dependsOn, when |
| run resolver | Resolver map: __output.<resolverName> = resolved value |
| run solution | Execution result: __output.status, __output.actions, __output.duration |
| lint | Lint result: __output.findings, __output.errorCount, __output.warnCount |
| snapshot diff | Diff result: __output.added, __output.removed, __output.modified |
Note: This table is non-exhaustive. For commands not listed, __output follows the command’s -o json schema. Use verbose mode (-v) to inspect the raw JSON structure for any command.
File Assertions (__files variable)#
The __files variable exposes files that were created or modified in the sandbox during command execution. The runner snapshots all file paths and modification times before the command runs, then diffs after execution. Only new or changed files appear in __files.
assertions:
  # Check that a file was created
  - expression: '__files["prod/main.tf"].exists'
  # Check file content
  - expression: '__files["prod/main.tf"].content.contains("resource")'
  # Check number of generated files
  - expression: 'size(__files) == 3'

Each entry in __files is keyed by the relative path from the sandbox root and has:
- exists (bool): always true for entries in the map (present for consistency)
- content (string): the full file content as a string
Size guard: Files larger than 10MB have their content set to "<file too large>" and a warning is emitted in verbose output. This prevents OOM on solutions that generate large binary artifacts.
Binary file guard: Files with non-UTF-8 content (binary files) have their content set to "<binary file>" and a warning is emitted in verbose output. The exists field is still true. Use exists checks rather than content checks for binary outputs.
Regex and Contains Context#
regex, contains, notRegex, and notContains assertions match against the stream specified by the target field (default: stdout). When target is combined, stdout and stderr are concatenated with a newline separator. This is useful for:
- Commands that don’t support -o json (e.g., explain solution)
- Quick substring checks without CEL overhead
- Pattern matching on formatted output
- Ensuring sensitive values or panic traces don’t appear in output
CEL Assertion Diagnostics#
When a CEL assertion fails, the runner evaluates sub-expressions to provide actionable diagnostics rather than just “expected true, got false”:
✗ expression: size(__output.actions) == 3
  size(__output.actions) = 5
  Expected 3, got 5

✗ expression: __output.actions["render-main"].inputs.output == "prod/main.tf"
  __output.actions["render-main"].inputs.output = "dev/main.tf"
  Expected "prod/main.tf", got "dev/main.tf"

The runner inspects comparison expressions (==, !=, <, >, in) and evaluates both sides independently to surface actual vs expected values.
Snapshot Assertions#
When snapshot is set, the test runner:
- Executes the command and captures stdout
- Normalizes the output through a fixed pipeline:
  - Sort JSON map keys deterministically
  - Replace ISO-8601 timestamps (2006-01-02T15:04:05Z patterns) with <TIMESTAMP>
  - Replace UUIDs ([0-9a-f]{8}-[0-9a-f]{4}-...) with <UUID>
  - Replace absolute paths matching the sandbox directory with <SANDBOX>
- Compares against the golden file at the specified path (relative to solution directory)
- On mismatch, displays a unified diff showing expected vs actual
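The scrubbing passes can be sketched with plain regexp replacements. normalizeSnapshot is illustrative (JSON key sorting omitted), not the actual snapshot.go code:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	timestampRe = regexp.MustCompile(`\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z`)
	uuidRe      = regexp.MustCompile(`[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}`)
)

// normalizeSnapshot applies the documented scrubbing passes so that
// nondeterministic values compare equal against the golden file.
func normalizeSnapshot(out, sandboxDir string) string {
	out = timestampRe.ReplaceAllString(out, "<TIMESTAMP>")
	out = uuidRe.ReplaceAllString(out, "<UUID>")
	return strings.ReplaceAll(out, sandboxDir, "<SANDBOX>")
}

func main() {
	raw := `{"id": "123e4567-e89b-12d3-a456-426614174000", "at": "2024-05-01T12:00:00Z", "dir": "/tmp/sb-1/out"}`
	fmt.Println(normalizeSnapshot(raw, "/tmp/sb-1"))
}
```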
Future: A snapshotScrubbers field on the test case for custom regex replacements (e.g., replacing dynamic API keys or version strings). Not included in v1.
Snapshots can be used alongside other assertions — all must pass.
Snapshot with expectFailure#
When snapshot is combined with expectFailure: true, the snapshot captures stdout from the failing command. The snapshot comparison runs after the exit code check passes (i.e., after confirming the command did fail as expected). If the exit code check fails (command unexpectedly succeeds), the snapshot comparison is skipped and the test reports as fail.
Snapshot Diff Output#
When a snapshot doesn’t match, the failure output shows a unified diff:
✗ snapshot: testdata/expected-render.json
--- expected
+++ actual
@@ -3,7 +3,7 @@
"render-main": {
"provider": "template",
"inputs": {
- "output": "dev/main.tf"
+ "output": "staging/main.tf"
}
}

Updating Snapshots#
scafctl test functional -f solution.yaml --update-snapshots

This re-runs all tests with snapshot fields and overwrites the golden files with actual output. Use a glob to selectively update:

scafctl test functional -f solution.yaml --update-snapshots --filter "snapshot-*"

Execution Model#
In-Process Command Execution#
The test runner executes scafctl commands in-process by invoking the cobra command tree directly, rather than shelling out to a scafctl binary. This is faster, avoids requiring a built binary on PATH, and simplifies output capture.
Root() accepts a *RootOptions struct that enables isolated, concurrent invocations:
opts := &scafctl.RootOptions{
IOStreams: terminal.NewIOStreams(nil, &stdout, &stderr, false),
ExitFunc: func(code int) { panic(&exitcode.ExitError{Code: code}) },
ConfigPath: "",
}
cli := scafctl.Root(opts)
cli.SetArgs([]string{"render", "solution", "-f", sandboxPath, "-o", "json"})
err := cli.Execute()

Each call to Root() creates its own cliParams, flag bindings, and writer — no package-level mutable state. This means multiple test goroutines can construct and execute cobra trees fully in parallel without data races or mutex serialization.
The runner:
- Constructs the cobra root command using Root(opts) with custom IOStreams (backed by bytes.Buffer) and a custom ExitFunc
- Sets the command array as args (e.g., [render, solution] → cobra traversal)
- Injects -f <sandbox-solution-path> by default. Set injectFile: false on the test case to disable (e.g., for catalog solution tests). The runner always errors if -f appears in the test's args, regardless of injectFile
- Captures stdout/stderr via the IOStreams buffers passed in RootOptions
- Uses RootOptions.ExitFunc to intercept os.Exit calls and convert them to *exitcode.ExitError values, preventing the test runner from terminating
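The ExitFunc interception relies on the panic/recover idiom: the injected exit function panics with an exit-code error, and a deferred recover converts that panic back into an ordinary error. A minimal self-contained sketch (ExitError stands in for the doc's *exitcode.ExitError; runInProcess is a hypothetical stand-in for the runner's execute step):

```go
package main

import (
	"errors"
	"fmt"
)

// ExitError mirrors *exitcode.ExitError: it carries the code the in-process
// command tried to pass to os.Exit.
type ExitError struct{ Code int }

func (e *ExitError) Error() string { return fmt.Sprintf("exit code %d", e.Code) }

// runInProcess injects an exit function that panics with *ExitError; the
// deferred recover turns that panic into a returned error so the test
// runner's process keeps going.
func runInProcess(cmd func(exit func(int))) (err error) {
	defer func() {
		if r := recover(); r != nil {
			if e, ok := r.(*ExitError); ok {
				err = e
				return
			}
			panic(r) // not our sentinel: re-raise
		}
	}()
	exit := func(code int) { panic(&ExitError{Code: code}) }
	cmd(exit)
	return nil
}

func main() {
	err := runInProcess(func(exit func(int)) { exit(3) })
	var ee *ExitError
	if errors.As(err, &ee) {
		fmt.Println("intercepted exit code:", ee.Code)
	}
	// → intercepted exit code: 3
}
```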
Sandbox#
Each test runs in an isolated temporary directory:
- Copy the solution file and its bundle files to a temp directory
- Copy test files into the sandbox (maintaining relative paths). Symlinks are rejected
- Snapshot all file paths and modification times (for the __files diff)
- Inject per-test env variables and SCAFCTL_SANDBOX_DIR
- Run init steps in the sandbox
- Execute the scafctl command in-process
- Diff sandbox files against the snapshot to populate __files
- Capture output and run assertions
- Run cleanup steps (even on failure)
- Clean up the temp directory (unless --keep-sandbox is set)
This ensures init scripts cannot modify source files.
Test Execution Ordering#
Tests execute in a deterministic order:
- Builtin tests run first, in alphabetical order (builtin:lint, builtin:parse, builtin:render-defaults, builtin:resolve-defaults)
- User-defined tests run next, in alphabetical order by test name
- Template tests (names starting with _) are never executed
With -j > 1 (default), tests run in parallel and may complete in any order, but result reporting is always alphabetical. With --sequential (-j 1), both execution and reporting follow alphabetical order.
This is consistent with how the codebase handles action ordering within execution phases (sort.Strings). Tests should be independent — alphabetical ordering exposes hidden ordering dependencies.
Discovery#
The test runner discovers tests in two ways:
- Single solution: scafctl test functional -f solution.yaml — runs tests defined in that solution
- Directory scan: scafctl test functional --tests-path path/to/solutions/ — recursively discovers all solution files and runs their spec.testing.cases

Solutions with no spec.testing.cases still run builtin tests (unless skipBuiltins is set).
Test templates (names starting with _) are resolved via extends but never executed directly.
Execution Flow#
For each test case:
1. Resolve extends chains and merge inherited fields
2. Validate the test name against ^[a-zA-Z0-9][a-zA-Z0-9_-]*$
3. If skip: true → status skip, stop
4. If skip is a CEL expression → evaluate with os, arch, env, subprocess context. If true → status skip with the expression as reason, stop
5. Create a temp sandbox and copy solution + bundle files + test files
6. Snapshot the sandbox file list and modification times
7. Run init steps sequentially; if any fails → status error, run cleanup, stop
8. Build the command: construct the cobra tree with <command> <args...>. If injectFile is true (default), prepend -f <sandbox-solution>
9. If the test's args include -o json or --output json, the runner passes them through. The tester is responsible for including -o json in args when structured output is needed
10. Inject SCAFCTL_SANDBOX_DIR and per-test env environment variables
11. Execute in-process with a timeout; capture stdout, stderr, and exit code via RootOptions.ExitFunc
12. Diff sandbox files against the snapshot → populate the __files context variable
13. Parse JSON stdout if available → populate the __output context variable (nil if stdout is not valid JSON)
14. Check the exit code against exitCode or expectFailure
15. If snapshot is set → run the snapshot comparison (show a unified diff on mismatch)
16. Run all assertions (CEL against parsed output, regex/contains against the target stream). All assertions always run regardless of prior failures
17. Run cleanup steps (even on failure or error)
18. All checks pass → pass; any check fails → fail
19. If fail and retries > 0 → re-run from step 5 up to retries times. Each retry creates a fresh sandbox (re-copies solution + bundle files from the suite-level base sandbox if testing.config.setup is present, otherwise from source). Init steps re-run on each retry. If any retry passes → pass (retry N/M). Retry attempts are shown in verbose output
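A hypothetical test case exercising most of the fields above (values and names are illustrative, not from a real solution):

```yaml
renders-prod-override:
  description: "Prod resolver value overrides the default output path"
  command: [render, solution]
  args: ["-r", "env=prod", "-o", "json"]
  env:
    TF_IN_AUTOMATION: "1"
  init:
    - command: mkdir
      args: ["-p", "templates"]
  timeout: 10s
  retries: 1
  assertions:
    - expression: '__output.actions["render-main"].inputs.output == "prod/main.tf"'
      message: "Prod output path expected"
    - contains: "prod/main.tf"
```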
Parallelism#
Test cases run in parallel by default, limited by the -j / --concurrency flag (default: runtime.NumCPU()). Each test has its own sandbox and its own Root() invocation with isolated state — no shared mutable state, no mutex serialization needed.
Use --sequential (sugar for -j 1) to disable parallel execution (useful for debugging).
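The semaphore-based concurrency control reduces to a buffered channel capping in-flight tests, with results written into a pre-sized slice so reporting can stay alphabetical regardless of completion order. A minimal sketch (runParallel is illustrative, not the runner's actual API):

```go
package main

import (
	"fmt"
	"sync"
)

// runParallel runs each test in its own goroutine, limited by a buffered
// channel acting as a counting semaphore. Each result lands at its input
// index, so output order is independent of completion order.
func runParallel(tests []string, limit int, run func(string) string) []string {
	sem := make(chan struct{}, limit)
	results := make([]string, len(tests))
	var wg sync.WaitGroup
	for i, name := range tests {
		wg.Add(1)
		go func(i int, name string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			results[i] = run(name)
		}(i, name)
	}
	wg.Wait()
	return results
}

func main() {
	tests := []string{"builtin:lint", "builtin:parse", "renders-dev-defaults"}
	out := runParallel(tests, 2, func(n string) string { return n + ": pass" })
	fmt.Println(out)
}
```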
Fail-Fast#
Use --fail-fast to stop executing remaining tests for the current solution on first failure. Tests for other solutions continue to run. This is useful for quick feedback during debugging.
Without --fail-fast, all tests for all solutions execute and all failures are reported.
Timeouts#
| Level | Default | Flag |
|---|---|---|
| Per-test | 30s | --test-timeout |
| Global | 5m | --timeout |
| Per-test override | — | timeout field in test spec |
CLI Interface#
scafctl test functional#
Discovers and runs functional tests.
scafctl test functional [flags]

Flags#
| Flag | Type | Default | Description |
|---|---|---|---|
-f, --file | string | — | Path to a single solution file |
--tests-path | string | — | Directory to scan for solution files |
-o, --output | string | table | Output format: table, json, yaml, quiet |
--report-file | string | — | Write JUnit XML report to this path |
--update-snapshots | bool | false | Update golden files instead of comparing. Combine with --filter for selective updates |
--sequential | bool | false | Disable parallel test execution (sugar for -j 1) |
-j, --concurrency | int | runtime.NumCPU() | Maximum number of tests to run in parallel |
--skip-builtins | bool | false | Skip builtin tests for all solutions |
--test-timeout | duration | 30s | Per-test timeout |
--timeout | duration | 5m | Global timeout for all tests |
--filter | []string | — | Run only tests matching this name pattern (glob via doublestar.Match). Supports two formats: (1) test name only (e.g., "render-*") — matches against the test name, (2) solution/test-name format (e.g., "terraform-*/render-*") — matches against both solution name and test name. When no / is present, matches test name only (backward-compatible). Builtin tests are matched with their full name including builtin: prefix. Multiple --filter flags allowed; a test runs if it matches any filter. Registered via StringArrayVar per project convention |
--tag | []string | — | Run only tests with these tags. Multiple --tag flags allowed (e.g., --tag smoke --tag render). A test matches if it has any of the specified tags. Registered via StringArrayVar per project convention |
--solution | []string | — | Run only tests from solutions matching this name pattern (glob via doublestar.Match). Multiple --solution flags allowed; a solution is included if it matches any pattern. When combined with --filter and --tag, all filters are ANDed: a test must match the solution filter AND the name filter AND have a matching tag |
--dry-run | bool | false | Validate test definitions, resolve extends chains, and report discovery results without executing any tests. Useful for CI preflight checks. Exits 0 if valid, exitcode.InvalidInput (3) if invalid |
--fail-fast | bool | false | Stop remaining tests for the current solution on first failure. Other solutions continue |
-v, --verbose | bool | false | Show full command, init output, and raw stdout/stderr |
--keep-sandbox | bool | false | Preserve sandbox directories for failed tests |
--no-color | bool | false | Disable colored output |
-q, --quiet | bool | false | Only output failures |
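The two --filter formats can be sketched with stdlib glob matching (the real implementation uses doublestar.Match; path.Match stands in here and only supports single-level globs, so this is an approximation):

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchesFilter implements the two --filter shapes: "test-name" matches the
// test name only; "solution/test-name" matches both parts.
func matchesFilter(filter, solution, test string) bool {
	if sol, pat, found := strings.Cut(filter, "/"); found {
		okSol, _ := path.Match(sol, solution)
		okTest, _ := path.Match(pat, test)
		return okSol && okTest
	}
	ok, _ := path.Match(filter, test)
	return ok
}

func main() {
	fmt.Println(matchesFilter("render-*", "terraform-scaffold", "render-dev"))            // true
	fmt.Println(matchesFilter("terraform-*/render-*", "terraform-scaffold", "render-dev")) // true
	fmt.Println(matchesFilter("other-*/render-*", "terraform-scaffold", "render-dev"))     // false
}
```

A test runs if it matches any of the provided filters, which maps to OR-ing matchesFilter across the flag values.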
Exit Codes#
| Code | Constant | Meaning |
|---|---|---|
| 0 | exitcode.Success | All tests passed |
| 11 | exitcode.TestFailed (new) | One or more tests failed |
| 3 | exitcode.InvalidInput | Configuration or usage error |
scafctl test list#
Lists all tests without executing them.
scafctl test list [flags]

Flags#
| Flag | Type | Default | Description |
|---|---|---|---|
-f, --file | string | — | Path to a single solution file |
--tests-path | string | — | Directory to scan for solution files |
-o, --output | string | table | Output format: table, json, yaml, quiet |
--include-builtins | bool | false | Include builtin tests in the listing |
--tag | []string | — | Filter to tests with these tags. Multiple --tag flags allowed |
--solution | []string | — | Filter to solutions matching this name pattern (glob). Multiple --solution flags allowed |
--filter | []string | — | Filter to tests matching this name pattern (glob). Supports solution/test-name format. Multiple --filter flags allowed |
Example Output#
SOLUTION TEST COMMAND TAGS SKIP
terraform-scaffold renders-dev-defaults render solution smoke,render -
terraform-scaffold renders-prod-override render solution render -
terraform-scaffold rejects-invalid-env run resolver validation -
terraform-scaffold temporarily-disabled render solution Waiting on upstream provider fix

Output#
Interactive Progress (default, TTY)#
When running on a TTY terminal, tests show animated spinners that resolve to
final status lines as each test completes. Completed tests scroll upward
(via mpb’s PopCompletedMode) while active tests animate at the bottom.
After all tests complete, only failure details and a summary line are printed — the per-test table is not repeated since the progress lines already showed each result. This eliminates the previous duplication where every test appeared twice with mismatched durations.
All durations shown in progress lines use TestResult.Duration (the pure
execution time measured inside the runner), ensuring consistency between
progress output and structured (JSON/YAML) output.
✓ terraform-scaffold :: builtin:parse pass 1ms
✓ terraform-scaffold :: builtin:lint pass 45ms
✓ terraform-scaffold :: renders-dev-defaults pass 12ms
✗ terraform-scaffold :: renders-prod-override fail 15ms
✓ terraform-scaffold :: rejects-invalid-env pass 8ms
⊘ terraform-scaffold :: temporarily-disabled skip
Failures:
terraform-scaffold/renders-prod-override: exit code mismatch
[exitCode] expected 1 got 0: exit code assertion failed
5 passed, 1 failed, 0 errors, 1 skipped (121ms)

Table (non-TTY / --no-progress)#
When progress output is disabled (piped output, --no-progress flag, or
non-table output format), the full per-test table is shown as the only view:
SOLUTION TEST STATUS DURATION
terraform-scaffold builtin:parse PASS 1ms
terraform-scaffold builtin:lint PASS 45ms
terraform-scaffold builtin:resolve-defaults PASS 18ms
terraform-scaffold builtin:render-defaults PASS 22ms
terraform-scaffold renders-dev-defaults PASS 12ms
terraform-scaffold renders-prod-override PASS 15ms
terraform-scaffold rejects-invalid-env PASS 8ms
terraform-scaffold temporarily-disabled SKIP -
7 passed, 0 failed, 0 errors, 1 skipped (121ms)

In verbose mode (-v), passing tests show assertion counts:
SOLUTION TEST STATUS DURATION
terraform-scaffold renders-dev-defaults PASS (2/2) 12ms
terraform-scaffold renders-prod-override PASS (3/3) 15ms
terraform-scaffold rejects-invalid-env PASS (2/2) 8ms

Failing tests show which assertions failed:
terraform-scaffold renders-dev-defaults FAIL (1/3) 14ms

Error Output#
SOLUTION TEST STATUS DURATION
terraform-scaffold renders-with-setup ERROR 3ms
Init [1/2] failed:
$ mkdir -p /restricted/path
mkdir: permission denied
(exit 1)
Init step failure is an error, not an assertion failure.
terraform-scaffold renders-dev-defaults PASS 12ms
1 passed, 0 failed, 1 error, 0 skipped (15ms)

Failure Output#
SOLUTION TEST STATUS DURATION
terraform-scaffold renders-dev-defaults FAIL 14ms
✗ expression: size(__output.actions) == 1
size(__output.actions) = 3
Expected 1, got 3
Message: Should produce exactly one action
✗ contains: "dev/main.tf"
Substring not found in stdout
terraform-scaffold renders-prod-override PASS 15ms
1 passed, 1 failed, 0 errors, 0 skipped (29ms)

Verbose Failure Output (-v)#
SOLUTION TEST STATUS DURATION
terraform-scaffold renders-dev-defaults FAIL (1/2) 14ms
Command: scafctl render solution -f /tmp/scafctl-test-abc123/solution.yaml -o json
Sandbox: /tmp/scafctl-test-abc123/
Init [1/1]:
$ mkdir -p templates
(exit 0)
Stdout:
{"actions":{"render-main":{"provider":"template","inputs":{"output":"staging/main.tf"}},...}}
Stderr:
(empty)
Exit code: 0
✗ expression: size(__output.actions) == 1
size(__output.actions) = 3
Expected 1, got 3
    Message: Should produce exactly one action

JSON (-o json)#
{
"results": [
{
"solution": "terraform-scaffold",
"test": "renders-dev-defaults",
"status": "pass",
"duration": "12ms",
"command": "render solution",
"assertions": [
{ "type": "expression", "value": "size(__output.actions) == 1", "passed": true },
{ "type": "contains", "value": "dev/main.tf", "passed": true }
]
},
{
"solution": "terraform-scaffold",
"test": "temporarily-disabled",
"status": "skip",
"skipReason": "Waiting on upstream provider fix"
}
],
"summary": { "passed": 3, "failed": 0, "errors": 0, "skipped": 1, "duration": "35ms" }
}

JUnit XML (--report-file)#
Written to the specified path alongside normal terminal output. One <testsuite> per solution, one <testcase> per test. Skipped tests emit <skipped message="reason"/>. Failed assertions use <failure> with diagnostic output. Infrastructure/setup errors use <error> (distinct from <failure>) to differentiate assertion failures from environment issues.
Example <error> element:
<testcase name="renders-with-setup" classname="terraform-scaffold" time="0.003">
<error message="init step 1 failed: exit code 1">
$ mkdir -p /restricted/path
mkdir: permission denied
</error>
</testcase>

Build Integration#
Bundler Discovery#
scafctl build and the bundler’s DiscoverFiles() must scan spec.testing.cases[*].files entries as an additional discovery source. These are tagged as TestInclude to distinguish them from StaticAnalysis and ExplicitInclude sources.
This ensures test files are included in the bundle artifact and available when tests run from a remote catalog.
Lint Rule#
scafctl lint produces an error when files referenced in spec.testing.cases[*].files are not covered by bundle.include patterns. Tests must work when the solution is fetched from a remote catalog, so all test files must be bundled.
Go Types#
Package: pkg/solution/soltesting#
// TestCase defines a single functional test for a solution.
type TestCase struct {
Name string `json:"name" yaml:"name" doc:"Test name (auto-set from map key)"`
Description string `json:"description" yaml:"description" doc:"Human-readable test description"`
Command []string `json:"command,omitempty" yaml:"command,omitempty" doc:"scafctl subcommand as array" example:"[render, solution]"`
Args []string `json:"args,omitempty" yaml:"args,omitempty" doc:"Additional CLI flags. -f is always auto-injected by the runner"`
Extends []string `json:"extends,omitempty" yaml:"extends,omitempty" doc:"Names of test templates to inherit from" maxItems:"10"`
Tags []string `json:"tags,omitempty" yaml:"tags,omitempty" doc:"Tags for categorization and --tag filtering" maxItems:"20"`
Env map[string]string `json:"env,omitempty" yaml:"env,omitempty" doc:"Per-test environment variables"`
Files []string `json:"files,omitempty" yaml:"files,omitempty" doc:"Relative paths or globs for test files" maxItems:"50"`
Init []InitStep `json:"init,omitempty" yaml:"init,omitempty" doc:"Setup steps run before the command"`
Cleanup []InitStep `json:"cleanup,omitempty" yaml:"cleanup,omitempty" doc:"Teardown steps run after the command, even on failure"`
Assertions []Assertion `json:"assertions,omitempty" yaml:"assertions,omitempty" doc:"Output assertions. All are evaluated regardless of prior failures" maxItems:"100"`
Snapshot string `json:"snapshot,omitempty" yaml:"snapshot,omitempty" doc:"Golden file path for normalized comparison"`
InjectFile *bool `json:"injectFile,omitempty" yaml:"injectFile,omitempty" doc:"Auto-inject -f sandbox path. Default true. Set false for catalog tests"`
ExpectFailure bool `json:"expectFailure,omitempty" yaml:"expectFailure,omitempty" doc:"Pass if command exits non-zero"`
ExitCode *int `json:"exitCode,omitempty" yaml:"exitCode,omitempty" doc:"Exact expected exit code. Mutually exclusive with expectFailure"`
Timeout *Duration `json:"timeout,omitempty" yaml:"timeout,omitempty" doc:"Per-test timeout as Go duration string" example:"30s"`
Skip SkipValue `json:"skip,omitempty" yaml:"skip,omitempty" doc:"Skip this test: true for unconditional, or a CEL expression string for conditional skip"`
SkipReason string `json:"skipReason,omitempty" yaml:"skipReason,omitempty" doc:"Human-readable skip reason"`
Retries int `json:"retries,omitempty" yaml:"retries,omitempty" doc:"Number of retry attempts for failing tests" maximum:"10"`
}
// Duration is a time.Duration with string-based YAML/JSON marshalling.
// Supports Go duration strings like "30s", "2m", "1m30s".
// Implements both UnmarshalYAML/MarshalYAML and UnmarshalJSON/MarshalJSON.
type Duration struct {
time.Duration
}
// IsTemplate returns true if this test is a template (name starts with _).
func (tc *TestCase) IsTemplate() bool {
return strings.HasPrefix(tc.Name, "_")
}
// Validate performs comprehensive validation of a TestCase.
// Checks:
// - command is non-empty (unless template or inherited via extends)
// - exitCode and expectFailure are not both set (mutual exclusion)
// - snapshot or assertions — at least one must be present (unless template)
// - template names match ^_[a-zA-Z0-9][a-zA-Z0-9_-]*$
// - non-template names match ^[a-zA-Z0-9][a-zA-Z0-9_-]*$
// - args does not contain "-f" or "--file"
// - retries is 0–10
// - assertions count ≤ 100, files count ≤ 50, tags count ≤ 20
// - extends depth ≤ 10 (enforced during inheritance resolution)
func (tc *TestCase) Validate() error { /* ... */ }
// Max limits enforced by Validate().
const (
MaxAssertionsPerTest = 100
MaxFilesPerTest = 50
MaxTagsPerTest = 20
MaxExtendsDepth = 10
MaxTestsPerSolution = 500
MaxRetries = 10
)
// TestConfig holds solution-level test configuration.
type TestConfig struct {
SkipBuiltins SkipBuiltinsValue `json:"skipBuiltins,omitempty" yaml:"skipBuiltins,omitempty" doc:"Disable builtins: true for all, or list of specific names"`
Env map[string]string `json:"env,omitempty" yaml:"env,omitempty" doc:"Suite-level environment variables applied to all tests"`
Setup []InitStep `json:"setup,omitempty" yaml:"setup,omitempty" doc:"Suite-level setup steps. Run once, copied per-test"`
Cleanup []InitStep `json:"cleanup,omitempty" yaml:"cleanup,omitempty" doc:"Suite-level teardown steps. Run once after all tests complete, even on failure"`
}
// SkipBuiltinsValue supports both bool and []string via custom UnmarshalYAML.
// When bool: true skips all builtins, false skips none.
// When []string: skips only the named builtins (without "builtin:" prefix).
// Both UnmarshalYAML and MarshalYAML are required to survive
// the deepCopySolution YAML round-trip used in compose.
type SkipBuiltinsValue struct {
All bool // true = skip all builtins
Names []string // specific builtin names to skip
}
// InitStep is a setup/cleanup command.
// Uses the same input schema as the exec provider.
type InitStep struct {
Command string `json:"command" yaml:"command" doc:"Command to execute" maxLength:"1000"`
Args []string `json:"args,omitempty" yaml:"args,omitempty" doc:"Additional arguments, auto shell-quoted" maxItems:"100"`
Stdin string `json:"stdin,omitempty" yaml:"stdin,omitempty" doc:"Standard input"`
WorkingDir string `json:"workingDir,omitempty" yaml:"workingDir,omitempty" doc:"Working directory relative to sandbox root"`
Env map[string]string `json:"env,omitempty" yaml:"env,omitempty" doc:"Environment variables merged with parent process"`
Timeout int `json:"timeout,omitempty" yaml:"timeout,omitempty" doc:"Timeout in seconds" maximum:"3600"`
Shell string `json:"shell,omitempty" yaml:"shell,omitempty" doc:"Shell interpreter" pattern:"^(auto|sh|bash|pwsh|cmd)$"`
}
// Assertion validates command output.
// Exactly one of Expression, Regex, Contains, NotRegex, or NotContains must be set.
// Enforced via Validate() after YAML unmarshal.
type Assertion struct {
Expression celexp.Expression `json:"expression,omitempty" yaml:"expression,omitempty" doc:"CEL expression evaluating to bool"`
Regex string `json:"regex,omitempty" yaml:"regex,omitempty" doc:"Regex pattern that must match"`
Contains string `json:"contains,omitempty" yaml:"contains,omitempty" doc:"Substring that must appear"`
NotRegex string `json:"notRegex,omitempty" yaml:"notRegex,omitempty" doc:"Regex pattern that must NOT match"`
NotContains string `json:"notContains,omitempty" yaml:"notContains,omitempty" doc:"Substring that must NOT appear"`
Target string `json:"target,omitempty" yaml:"target,omitempty" doc:"Match target: stdout (default), stderr, combined" pattern:"^(stdout|stderr|combined)$"`
Message string `json:"message,omitempty" yaml:"message,omitempty" doc:"Custom failure message"`
}
// Validate checks that exactly one assertion type is set and target is valid.
func (a *Assertion) Validate() error { /* ... */ }
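The exactly-one check plus the target pattern can be sketched as follows (Expression is simplified to a plain string here; the real Validate works against celexp.Expression and may report richer errors):

```go
package main

import (
	"errors"
	"fmt"
)

// Assertion carries only the fields the exactly-one check needs.
type Assertion struct {
	Expression, Regex, Contains, NotRegex, NotContains, Target string
}

// Validate enforces that exactly one assertion type is set and that target
// is one of stdout (default), stderr, or combined.
func (a *Assertion) Validate() error {
	set := 0
	for _, v := range []string{a.Expression, a.Regex, a.Contains, a.NotRegex, a.NotContains} {
		if v != "" {
			set++
		}
	}
	if set != 1 {
		return fmt.Errorf("exactly one assertion type must be set, got %d", set)
	}
	switch a.Target {
	case "", "stdout", "stderr", "combined":
		return nil
	default:
		return errors.New("target must be stdout, stderr, or combined")
	}
}

func main() {
	ok := Assertion{Contains: "dev/main.tf"}
	bad := Assertion{Regex: "a", Contains: "b"}
	fmt.Println(ok.Validate(), bad.Validate())
}
```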
// FileInfo represents a file created or modified in the sandbox.
type FileInfo struct {
Exists bool `json:"exists"`
Content string `json:"content"`
}
// CommandOutput is the assertion context passed to CEL expressions.
type CommandOutput struct {
Stdout string `json:"stdout"`
Stderr string `json:"stderr"`
ExitCode int `json:"exitCode"`
Output map[string]any `json:"output,omitempty"`
Files map[string]FileInfo `json:"files"`
}
// TestResult captures the outcome of a single test.
type TestResult struct {
Solution string `json:"solution"`
Test string `json:"test"`
Status Status `json:"status"`
Duration time.Duration `json:"duration"`
Command string `json:"command"`
AssertionResults []AssertionResult `json:"assertions,omitempty"`
SkipReason string `json:"skipReason,omitempty"`
Error error `json:"-"`
SandboxPath string `json:"sandboxPath,omitempty"`
}
// AssertionResult captures the outcome of a single assertion.
type AssertionResult struct {
Type string `json:"type"`
Value string `json:"value"`
Passed bool `json:"passed"`
Message string `json:"message,omitempty"`
Expected any `json:"expected,omitempty"`
Actual any `json:"actual,omitempty"`
}
// Status represents the outcome of a test.
type Status string
const (
StatusPass Status = "pass"
StatusFail Status = "fail"
StatusSkip Status = "skip"
StatusError Status = "error"
)

Additions to Existing Types#
- pkg/solution/spec.go: adds a Testing *soltesting.TestSuite field on Spec. HasTests() and HasTestConfig() delegate to Testing.HasCases() and Testing.HasConfig()
- pkg/solution/bundler/compose.go: composePart gains a Testing *soltesting.TestSuite field. Compose merges spec.testing.cases (by name, rejecting duplicates) and spec.testing.config (skipBuiltins: true wins for bool / union for lists; env: merged map, last file wins on conflict; setup/cleanup: appended in compose-file order)
- pkg/solution/bundler/discover.go: adds the TestInclude discovery source; scans spec.testing.cases[*].files entries
Files to Create/Modify#
| Action | Path | Description |
|---|---|---|
| Create | pkg/solution/soltesting/types.go | Test spec types, SkipBuiltinsValue with custom unmarshal |
| Create | pkg/solution/soltesting/runner.go | In-process cobra execution, sandbox orchestration, parallel command invocation |
| Create | pkg/solution/soltesting/context.go | Build CEL assertion context from command output |
| Create | pkg/solution/soltesting/assertions.go | CEL, regex, contains, negation assertion evaluation with target |
| Create | pkg/solution/soltesting/diagnostics.go | CEL sub-expression evaluation for failure diagnostics |
| Create | pkg/solution/soltesting/builtins.go | Builtin test definitions and execution |
| Create | pkg/solution/soltesting/sandbox.go | Temp directory creation, file copying, file diff for __files |
| Create | pkg/solution/soltesting/snapshot.go | Golden file comparison, update, unified diff output |
| Create | pkg/solution/soltesting/inheritance.go | extends resolution, merge logic, circular detection |
| Create | pkg/solution/soltesting/discovery.go | Test discovery, --filter/--tag filtering, template exclusion |
| Create | pkg/solution/soltesting/reporter.go | kvx result formatting |
| Create | pkg/solution/soltesting/junit.go | JUnit XML report writer (<failure> vs <error>) |
| Create | pkg/solution/soltesting/runner_test.go | Runner unit tests |
| Create | pkg/solution/soltesting/assertions_test.go | Assertion unit tests (including target field) |
| Create | pkg/solution/soltesting/snapshot_test.go | Snapshot comparison tests |
| Create | pkg/solution/soltesting/diagnostics_test.go | CEL diagnostics tests |
| Create | pkg/solution/soltesting/inheritance_test.go | Extends chain, merge rules, circular detection tests |
| Create | pkg/solution/soltesting/discovery_test.go | Glob filter, tag filter, template exclusion tests |
| Create | pkg/cmd/scafctl/test/test.go | test parent command |
| Create | pkg/cmd/scafctl/test/functional.go | test functional command |
| Create | pkg/cmd/scafctl/test/list.go | test list command |
| Modify | pkg/cmd/scafctl/root.go | Register test command |
| Modify | pkg/solution/spec.go | Add Testing *soltesting.TestSuite field |
| Modify | pkg/solution/bundler/compose.go | Merge spec.testing.cases and spec.testing.config in compose |
| Modify | pkg/solution/bundler/discover.go | Add TestInclude discovery source |
| Modify | pkg/cmd/scafctl/lint/ | Add lint rule for unbundled test files (error), invalid test names, and unused templates (warning) |
| Modify | pkg/exitcode/exitcode.go | Add TestFailed = 11 constant |
| Create | tests/integration/solutions/ | Functional test fixtures |
| Create | docs/design/functional-testing.md | This design doc |
| Modify | docs/design/testing.md | Reference this design doc from section 5 |
| Create | docs/tutorials/functional-testing.md | Tutorial for solution authors (see Tutorial Outline below) |
| Create | examples/solutions/tested-solution/ | Example solution with inline tests |
Verification#
- Unit tests: go test ./pkg/solution/soltesting/...
- Race detection: go test -race ./pkg/solution/soltesting/... to verify no data races in parallel execution
- CLI integration tests: add test functional and test list to tests/integration/cli_test.go
- Self-hosted: scafctl test functional --tests-path tests/integration/solutions
- Taskfile: task integration passes
- Lint: golangci-lint run --fix
- Build integration: scafctl build on a solution with test files produces a bundle that includes them
- Concurrency: run with -j 1 and default concurrency to validate both paths
- YAML round-trip: unit test verifying SkipBuiltinsValue survives the deepCopySolution YAML marshal/unmarshal
Future Enhancements#
Auto-Generated Tests (-o test)#
✅ Implemented. An output type for commands that support -o. When used, scafctl captures the command and its arguments, executes it, and generates a complete test definition with assertions derived from the actual output.
scafctl render solution -f solution.yaml -r env=prod -o test

Would output:
render-solution-env-prod:
description: "Auto-generated test for: render solution -r env=prod"
command: [render, solution]
args: ["-r", "env=prod", "-o", "json"]
tags: [generated]
assertions:
- expression: 'size(__output) == 3'
message: __output should have 3 keys
- expression: 'size(__output["actions"]) == 2'
message: __output["actions"] should have 2 keys
  snapshot: "testdata/render-solution-env-prod.json"

Implementation:
- pkg/solution/soltesting/generate.go — Generate(), DeriveTestName(), GenerateToYAML(), deriveAssertions()
- pkg/terminal/kvx/output.go — OutputFormatTest constant added to BaseOutputFormats()
- pkg/cmd/scafctl/render/solution.go — wired via writeTestOutput(), --test-name flag
- pkg/cmd/scafctl/run/common.go — generateTestOutput() shared helper, --test-name flag on all run subcommands
- pkg/cmd/scafctl/run/resolver.go — intercepts -o test before writeResolverOutput
- pkg/cmd/scafctl/run/solution.go — writeActionTestOutput(), buildActionOutputData() extracted helper
Behavior:
- Assertions are derived by walking the output up to depth 2: size() for maps/arrays, literal equality for strings/numbers/bools
- __execution metadata is excluded from assertion derivation (too volatile) but included in the snapshot for normalization
- The snapshot is written to testdata/<name>.json beside the solution file (or testdata/ relative to CWD when using stdin)
- -o json is appended to the generated test args automatically when not already present
- Use --test-name to override the derived test name
Catalog Regression Testing (scafctl pipeline)#
A future command that executes functional tests across solutions in a remote catalog. This enables the scafctl team to validate that changes to scafctl don’t break existing solutions.
scafctl pipeline test --catalog https://catalog.example.com --solutions "terraform-*"

Would fetch matching solutions, extract bundled test files, run test functional against each, and report aggregate results.
This is the primary use case for requiring test files to be bundled and why scafctl lint errors on unbundled test files.
Test Scaffolding (scafctl test init)#
✅ Implemented. Generates a starter test suite for an existing solution by analyzing its structure:
scafctl test init -f solution.yaml

Parses the solution, identifies resolvers with defaults, validation rules, and workflow actions, then outputs skeleton test YAML to stdout. Unlike -o test (which captures actual output), test init generates a starting point before you run anything.
Generated test categories:
- Smoke tests: resolve-defaults, render-defaults, lint
- Per-resolver tests: resolver-<name> with non-null assertions
- Validation failure tests: resolver-<name>-invalid with expectFailure: true
- Per-action tests: action-<name> with provider tags
Implementation:
- pkg/solution/soltesting/scaffold.go — scaffolding logic (Scaffold(), ScaffoldInput, ScaffoldToYAML())
- pkg/cmd/scafctl/test/init.go — CLI subcommand
Watch Mode (--watch)#
Re-runs tests when solution files change:
scafctl test functional -f solution.yaml --watch

Monitors the solution file, its compose files, and their parent directories for changes. Uses fsnotify with a 300ms debounce to collapse rapid successive writes into a single re-run. Clears the screen on TTY terminals before each re-run. Combines with --tag and --filter for scoped watch runs. Exits cleanly on Ctrl-C. Implementation in pkg/solution/soltesting/watch.go.
Tutorial Outline#
The planned tutorial at docs/tutorials/functional-testing.md should cover:
- Writing your first test — minimal solution with one test case, scafctl test functional invocation, reading output
- Assertions deep dive — CEL expressions vs regex/contains, the target field, output variable structure per command, negation assertions
- Test inheritance — _-prefixed templates, extends chains, merge behavior for args/assertions/tags
- Snapshots — golden file workflow, --update-snapshots, normalization pipeline
- CI integration — JUnit XML reporting with --report-file, exit codes, --fail-fast patterns
- Advanced features — init/cleanup steps, test files, conditional skip expressions, retries, suite-level setup
The example solution at examples/solutions/tested-solution/ should include:
- A solution with 2-3 resolvers and a template action
- 3-4 inline tests covering: CEL expression assertion, contains/regex assertion, `expectFailure` for validation, and a snapshot test
- A `testdata/` directory with a golden file
- A `bundle.include` covering the test files
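As a rough sketch of what one of those inline tests could look like — the assertion and snapshot field names here are illustrative; the authoritative schema is in `pkg/solution/soltesting/types.go`:

```yaml
# Hypothetical inline test mixing text assertions and a golden file.
testing:
  cases:
    render-output:
      command: [render, solution]
      assertions:
        - contains: "resource"
          target: stdout        # stdout (default), stderr, or combined
        - notContains: "ERROR"
          target: stderr
      snapshot: testdata/render-output.golden   # compared after normalization
```

The golden file under `testdata/` would be created or refreshed with `--update-snapshots`, and `bundle.include` must list it so the test survives bundling.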
Decisions#
- Command-based tests over render-only: tests can exercise any scafctl subcommand, making the framework a general-purpose solution validation tool
- Command as array: `command: [render, solution]` instead of a string, for unambiguous parsing and consistency with args
- In-process execution over subprocess: the runner invokes the cobra command tree directly using `Root()`. Faster, no built binary dependency, simpler output capture
- Auto-inject `-f` by default: the runner injects `-f <sandbox-solution-path>` unless `injectFile: false`. `-f` must never appear in `args` regardless of `injectFile`. Disabled for catalog solution tests where no local file is needed
- Custom `exitFunc`: uses `writer.WithExitFunc()` to intercept `os.Exit` calls during in-process execution, converting them to `*exitcode.ExitError` values
- Tests in `spec.testing.cases` with compose support: follows existing split-file pattern, keeps tests colocated with the solution
- Temp directory sandbox: init scripts can modify files safely without affecting source. Symlinks rejected
- Five assertion types (CEL, regex, contains, notRegex, notContains): CEL for structured assertions; text matching for quick checks; negation for safety
- Assertion `target` field: `stdout` (default), `stderr`, or `combined` for text assertions. Cleaner than separate `stderrContains`/`stderrRegex` fields
- All assertions always evaluated: failures don't short-circuit. User sees all problems at once
- Structured + raw output: the tester is responsible for passing `-o json` in `args` when structured output is needed; the runner always provides raw `__stdout`/`__stderr`
- `__files` via diff: snapshot sandbox files before command, diff after, expose only new/modified files in CEL context as `map[string]FileInfo`
- Environment variables over Go templates: `SCAFCTL_SANDBOX_DIR` — no custom template engine, natural for shell commands
- Per-test `env`: additional environment variables set for init, command, and cleanup
- Builtins on by default: baseline correctness without boilerplate
- Selective builtin skip: `skipBuiltins` accepts `bool` (all) or `[]string` (specific names) via custom `UnmarshalYAML`
- Init uses exec provider schema: one input model, consistent with the rest of scafctl
- Cleanup steps: `cleanup` field runs even on failure, like `finally` in actions. Cleanup failures are logged but don't change test status
- Test inheritance: multi-extends via `extends: [base1, base2]`, applied left-to-right. Template tests prefixed with `_` are not executed
- Test tags: `tags` field for categorization, `--tag` flag for filtering. Match if test has any specified tag
- Test name validation: must match `^[a-zA-Z0-9][a-zA-Z0-9_-]*$` for JUnit/CLI compatibility. Templates match `^_[a-zA-Z0-9][a-zA-Z0-9_-]*$`
- Test files in bundle: `scafctl build` auto-discovers test file references; `scafctl lint` errors if not in `bundle.include`. Required for remote catalog testing
- Parallel by default: each test has its own sandbox. `--sequential` opt-out for debugging
- Fail-fast per-solution: `--fail-fast` stops remaining tests for the current solution on first failure. Other solutions continue
- kvx + JUnit XML reporting: kvx for consistency; JUnit for CI integration. JUnit distinguishes `<failure>` (assertion) from `<error>` (setup/infrastructure)
- `expectFailure` + `exitCode`: simple inversion vs specific code
- CEL assertion diagnostics: sub-expression evaluation to show actual vs expected
- Normalized snapshots: always normalize (strip timestamps, sort map keys). Not configurable
- Unified diff for snapshots: actionable mismatch output. Selective updates via `--update-snapshots --filter`
- Suite-level setup: run once, copy per-test. Avoids init duplication
- `--keep-sandbox`: preserve failed test directories for manual inspection
- Skip + skipReason: standard test framework feature for development workflow
- Lint error (not warning) for unbundled test files: catalog regression testing requires bundled tests
- Lint warning for unused templates: templates defined but never referenced via `extends` are likely dead code
- `Root()` isolation via `RootOptions`: `Root()` accepts a `*RootOptions` struct and creates all state locally — no package-level mutable variables. Each concurrent test invocation gets its own `cliParams`, flag bindings, `ioStreams`, and writer. This eliminates data races without requiring mutex serialization
- Exit codes: new `TestFailed = 11` constant rather than reusing `ValidationFailed = 2`, which has different semantics
- `--tag`, `--filter`, and `--solution` as `[]string`: registered via `StringArrayVar` per project convention (not `StringSliceVar`, which uses CSV parsing). Multiple flags allowed; OR logic within each flag type, AND logic between them (test must match solution filter AND name filter AND tag filter)
- `--filter` glob library: `doublestar.Match` — already a project dependency in `pkg/solution/bundler/discover.go`
- No auto-inject `-o json`: the tester is responsible for passing `-o json` in `args` when structured output is needed. The runner parses stdout as JSON when possible and populates `__output`
- `__output` nil when unsupported: diagnostic error rather than empty map to prevent silent assertion failures
- Concurrency control: `-j N` flag with `--sequential` as sugar for `-j 1` — standard test runner pattern
- File size guard: 10MB cap on `files[].content` to prevent OOM without blocking tests
- Conditional skip via CEL: `skip` field accepts either `true` or a CEL expression string, evaluated at discovery time with `os`, `arch`, `env`, `subprocess` context
- Test retries: `retries` field for flaky test resilience, capped at 10 attempts
- Suite-level cleanup: `testing.config.cleanup` runs after all tests, symmetric with `testing.config.setup`
- Compose `testing.config` merge: `setup`/`cleanup` steps appended in compose-file order (new merge strategy); `skipBuiltins` uses `true`-wins for bool, union for lists
- `SkipBuiltinsValue` round-trip: requires both `UnmarshalYAML` and `MarshalYAML` for compose `deepCopySolution` compatibility
- Snapshot normalization pipeline: fixed set of scrubbers (timestamps, UUIDs, sandbox paths, sorted keys). Custom scrubbers deferred to future enhancement
- Alphabetical test ordering over YAML definition order: consistent with how actions use `sort.Strings` within execution phases. No new infrastructure (ordered maps) needed. Tests should be independent — alphabetical ordering exposes hidden ordering dependencies
- `--filter` supports `solution/test-name` format: when the filter contains `/`, match against `solution-name/test-name`. When no `/`, match test name only (backward-compatible). Enables scoping in multi-solution runs
- `--solution` flag for solution-level filtering: glob-based, ANDed with `--filter` and `--tag`. Simpler than always using `solution/test-name` format when you only care about the solution
- `--dry-run` flag over a separate `test validate` subcommand: simpler, one less command. Validates definitions and reports discovery without executing
- Suite-level `env` (`testing.config.env`): avoids repeating environment variables on every test case. Precedence: process → `testing.config.env` → `TestCase.env` → `InitStep.env`
- Fresh sandbox per retry: each retry creates a new sandbox to ensure side-effect isolation. Init steps re-run on each attempt. Suite-level base sandbox is re-copied, not re-run
- Binary file content guard: non-UTF-8 files get `content` set to `"<binary file>"`, parallel to the size guard. Prevents garbled CEL string comparisons
- `output` nil → test `error` (not `fail`): referencing `output` when the command doesn't support `-o json` is a configuration issue, not an assertion failure
- Extends non-existent → validation error: referencing a test name that doesn't exist in `extends` is caught at parse time, not silently ignored
- Extends chain depth limit of 10: prevents stack overflow and extremely complex inheritance. Deep chains indicate a design problem
- Max field limits: assertions (100), files (50), tags (20), tests per solution (500). Prevents accidentally expensive test suites while remaining generous for real-world use
- Glob zero-match → test `error`: catches typos and missing test data early rather than silently proceeding without files
- Snapshot captures stdout even with `expectFailure`: snapshot comparison runs after exit code check. Enables golden-file testing of error output
- `exitCode` and `expectFailure` mutual exclusion: both set is a `Validate()` error. `exitCode` is strictly more expressive
- `TestCase.Validate()` method: comprehensive validation covering name format, field limits, mutual exclusion, and `args` content. Catches errors early at parse time
- Assertion count in verbose output: `PASS (4/4)` / `FAIL (2/4)` gives quick visibility into test thoroughness without requiring `-o json`
- Duration as string type: `"30s"` YAML format with custom marshal/unmarshal. More human-readable than integer seconds and more explicit than Go's default nanosecond marshalling
- Environment precedence chain documented: process → `testing.config.env` → `TestCase.env` → `InitStep.env`. Each level merges with the previous; only conflicting keys are overridden
- Compose test ordering independent of file order: tests from all compose files execute alphabetically, not in compose-file order. Only `testing.config.setup`/`cleanup`/`env` follow compose-file ordering
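Pulling several of these decisions together, a suite using templates, tags, conditional skip, retries, and cleanup might look like this. This is a sketch, not a schema reference: the case-map layout and the exec step shape are assumptions based on the decisions above; `pkg/solution/soltesting/types.go` is authoritative.

```yaml
# Illustrative suite exercising inheritance, env precedence, skip, and cleanup.
testing:
  config:
    setup:
      - exec:
          command: ["sh", "-c", "echo seed > seed.txt"]  # runs once; base sandbox copied per test
    env:
      REGION: us-east-1            # suite-level env, overridable per test
  cases:
    _base:                         # `_` prefix: template only, never executed
      command: [render, solution]
      tags: [render]
    render-linux-only:
      extends: [_base]             # multi-extends applied left-to-right
      skip: 'os != "linux"'        # CEL, evaluated at discovery time
      retries: 2                   # fresh sandbox and re-run init per attempt
      env:
        REGION: eu-west-1          # overrides testing.config.env for this test
      cleanup:
        - exec:
            command: ["sh", "-c", "rm -f scratch.txt"]   # runs even on failure
```

Note how the decisions compose: `render-linux-only` inherits its `command` and `render` tag from `_base`, while its `env` entry wins over the suite-level `REGION` per the documented precedence chain.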