Workflow Engine
How multi-agent workflows are orchestrated — step execution modes, retry logic, variable passing, PII boundaries, and memory isolation.
Workflow Execution Overview
A workflow is a named sequence of steps where each step routes a task to a specific agent. The engine supports sequential pipelines, parallel fan-out, conditional branching, and iterative loops.
Step Execution Modes
Sequential (default)
Steps execute one after another. Each step receives the previous step's output as {{input}}.
FanOut
Consecutive FanOut steps execute in parallel. All start with the same input (the last sequential step's output).
Implementation: consecutive FanOut steps are gathered into a batch and executed concurrently via futures::future::join_all. If any step in the batch fails, the entire workflow fails; fan-out has no partial completion.
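The all-or-nothing behavior of a fan-out batch can be sketched without an async runtime. This is a minimal illustration rather than the engine's code: std::thread::scope stands in for futures::future::join_all, and the Step alias and run_fan_out function are hypothetical names.

```rust
use std::thread;

/// Hypothetical step signature: takes the shared input and returns either
/// an output or an error message.
type Step = fn(&str) -> Result<String, String>;

/// Runs every step in the batch against the same input, waits for all of
/// them, then fails the whole batch if any single step failed.
/// std::thread::scope is a stand-in for futures::future::join_all.
fn run_fan_out(steps: &[Step], input: &str) -> Result<Vec<String>, String> {
    let results: Vec<Result<String, String>> = thread::scope(|scope| {
        // Start every step before joining any of them so they overlap.
        let handles: Vec<_> = steps
            .iter()
            .map(|step| scope.spawn(move || step(input)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().unwrap_or_else(|_| Err("step panicked".to_string())))
            .collect()
    });
    // Collecting into Result<Vec<_>, _> yields the first error in step order
    // and discards the successful outputs: no partial completion.
    results.into_iter().collect()
}
```

Outputs come back in step order, which is what a later Collect step would need in order to join them deterministically.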
Collect
Aggregates all preceding FanOut step outputs into a single input for the next step. Outputs are joined with \n\n---\n\n separators.
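As a small sketch of the aggregation, assuming the fan-out outputs are already available as strings in step order (collect_outputs and COLLECT_SEPARATOR are hypothetical names):

```rust
/// Separator placed between fan-out outputs, as described above.
const COLLECT_SEPARATOR: &str = "\n\n---\n\n";

/// Joins the outputs of the preceding FanOut steps into one input string.
fn collect_outputs(fan_out_outputs: &[String]) -> String {
    fan_out_outputs.join(COLLECT_SEPARATOR)
}
```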
Conditional
A step that only executes if the previous step's output contains a specific string (case-insensitive match).
{
"mode": {"Conditional": {"condition": "needs_review"}},
"prompt_template": "Review this: {{input}}"
}
If the condition is not met, the step is skipped and execution continues with the next step.
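The condition check itself can be sketched as a case-insensitive substring test. Lowercasing both sides is an assumption about how the match is implemented, and condition_met is a hypothetical name.

```rust
/// Returns true when the step should run: the previous step's output
/// contains the condition string, ignoring case.
/// Assumption: case-insensitivity is achieved by lowercasing both sides.
fn condition_met(previous_output: &str, condition: &str) -> bool {
    previous_output
        .to_lowercase()
        .contains(&condition.to_lowercase())
}
```

With the configuration above, an output such as "Status: NEEDS_REVIEW" would trigger the review step, while "approved" would skip it.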
Loop
Repeats a step until:
- The output contains the until string (case-insensitive), OR
- max_iterations is reached
{
"mode": {"Loop": {"max_iterations": 5, "until": "DONE"}},
"prompt_template": "Refine this draft: {{input}}"
}
Each iteration's output becomes the next iteration's input. Step results are logged as "step_name (iter N)".
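The loop semantics can be sketched as follows. This is an illustration under assumptions: run_loop is a hypothetical helper, the step is modeled as a closure, and iteration numbers in the log are assumed to start at 1.

```rust
/// Repeats `step` until its output contains `until` (case-insensitive) or
/// `max_iterations` is reached. Returns the final output plus one log label
/// per iteration. Assumption: iteration numbering starts at 1.
fn run_loop<F>(
    step_name: &str,
    mut step: F,
    input: &str,
    max_iterations: u32,
    until: &str,
) -> Result<(String, Vec<String>), String>
where
    F: FnMut(&str) -> Result<String, String>,
{
    let until_lower = until.to_lowercase();
    let mut current = input.to_string();
    let mut log = Vec::new();
    for iter in 1..=max_iterations {
        // Each iteration's output becomes the next iteration's input.
        current = step(&current)?;
        log.push(format!("{} (iter {})", step_name, iter));
        if current.to_lowercase().contains(&until_lower) {
            break;
        }
    }
    Ok((current, log))
}
```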
Error Handling
Each step declares an error_mode that controls behavior on failure:
ErrorMode::Fail (default)
The step must succeed. On failure or timeout, the entire workflow is marked Failed and execution stops.
ErrorMode::Skip
On failure or timeout, the step is skipped and execution continues with the next step. A warning is logged.
ErrorMode::Retry
The step is retried up to max_retries times. If every attempt fails, the workflow fails.
On each retry:
- The same prompt is sent to the same agent
- A warning is logged with the attempt number and error
- No backoff delay between retries (immediate retry)
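The three error modes can be sketched as one dispatch function. The enum layouts, StepOutcome, and run_with_error_mode are hypothetical, and the sketch assumes that max_retries counts retries after one initial attempt; the text above does not say whether the first attempt is included.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ErrorMode {
    Fail,
    Skip,
    Retry { max_retries: u32 },
}

#[derive(Debug, PartialEq)]
enum StepOutcome {
    Completed(String),
    Skipped,
    WorkflowFailed(String),
}

/// Runs a step under the given error mode.
/// Assumption: Retry makes one initial attempt plus up to `max_retries` retries.
fn run_with_error_mode<F>(mut step: F, mode: ErrorMode) -> StepOutcome
where
    F: FnMut() -> Result<String, String>,
{
    match mode {
        ErrorMode::Fail => match step() {
            Ok(output) => StepOutcome::Completed(output),
            Err(e) => StepOutcome::WorkflowFailed(e),
        },
        ErrorMode::Skip => match step() {
            Ok(output) => StepOutcome::Completed(output),
            Err(e) => {
                eprintln!("warning: step skipped: {}", e);
                StepOutcome::Skipped
            }
        },
        ErrorMode::Retry { max_retries } => {
            let mut last_error = String::new();
            for attempt in 0..=max_retries {
                match step() {
                    Ok(output) => return StepOutcome::Completed(output),
                    Err(e) => {
                        eprintln!("warning: attempt {} failed: {}", attempt + 1, e);
                        last_error = e;
                    }
                }
                // No sleep here: the next attempt starts immediately.
            }
            StepOutcome::WorkflowFailed(last_error)
        }
    }
}
```

Because retries are immediate, this mode suits transient failures that clear quickly; a rate-limited upstream would likely exhaust all attempts at once.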
Template Variable System
Steps can store their output in named variables and reference them later:
{
"steps": [
{
"name": "extract",
"prompt_template": "Extract key data from: {{input}}",
"output_var": "extracted_data"
},
{
"name": "validate",
"prompt_template": "Validate: {{extracted_data}}",
"output_var": "validation_result"
},
{
"name": "decide",
"prompt_template": "Given data: {{extracted_data}} and validation: {{validation_result}}, make a decision"
}
]
}
Variable expansion:
- {{input}}: the current step input (previous step's output for sequential, or initial input for the first step)
- {{var_name}}: any previously stored output_var
- Variables persist for the entire workflow run
- Undefined variables are left as-is (not replaced)
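The expansion rules above can be sketched as a single left-to-right pass. expand_template is a hypothetical name, and the sketch assumes that substituted values are not themselves re-expanded.

```rust
use std::collections::HashMap;

/// Replaces each {{name}} placeholder with its value from `vars`.
/// Placeholders with no matching variable are copied through unchanged.
/// Assumption: substituted values are not scanned again for placeholders.
fn expand_template(template: &str, vars: &HashMap<String, String>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        out.push_str(&rest[..start]);
        let after_open = &rest[start + 2..];
        match after_open.find("}}") {
            Some(end) => {
                let name = &after_open[..end];
                match vars.get(name) {
                    Some(value) => out.push_str(value),
                    None => {
                        // Undefined variable: leave the placeholder as-is.
                        out.push_str("{{");
                        out.push_str(name);
                        out.push_str("}}");
                    }
                }
                rest = &after_open[end + 2..];
            }
            None => {
                // Unterminated placeholder: keep the remainder verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

In this model the engine would insert {{input}} and every output_var into the same map before expanding each step's prompt_template.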
PII Boundary Enforcement
At each step boundary, data passes through the target agent's PII pipeline:
How It Works
- Detection: the target agent's PII detector scans the outbound data for PII spans (SSNs, policy numbers, names, etc.)
- Tokenization: detected spans are replaced with opaque tokens (e.g., [PII_SSN_001]). A token map records the original values.
- Delivery: the tokenized text is sent to the agent
- Hydration: if needed downstream, tokens can be restored to original values via the hydrator
Each agent has its own PII pipeline instance, configured based on its trust policy. The current default is a no-op pipeline (Phase 1) — PII passes through unchanged. The architecture supports pluggable detection engines for Phase 2.
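To make the tokenize and hydrate steps concrete, here is a deliberately naive sketch of what a non-no-op pipeline could do. It is not the engine's detector: it recognizes only SSN-shaped spans (DDD-DD-DDDD) by byte scanning, and is_ssn_at, tokenize, and hydrate are hypothetical names.

```rust
use std::collections::HashMap;

/// Returns true if `bytes` starts with an SSN-shaped span: DDD-DD-DDDD.
fn is_ssn_at(bytes: &[u8]) -> bool {
    bytes.len() >= 11
        && bytes[..11].iter().enumerate().all(|(i, b)| match i {
            3 | 6 => *b == b'-',
            _ => b.is_ascii_digit(),
        })
}

/// Replaces each SSN-shaped span with an opaque token and records
/// token -> original value in the returned token map.
fn tokenize(text: &str) -> (String, HashMap<String, String>) {
    let bytes = text.as_bytes();
    let mut out = String::new();
    let mut token_map = HashMap::new();
    let mut i = 0;
    let mut plain_start = 0;
    while i < bytes.len() {
        if is_ssn_at(&bytes[i..]) {
            // A match starts on an ASCII digit, so `i` is a valid char boundary.
            out.push_str(&text[plain_start..i]);
            let token = format!("[PII_SSN_{:03}]", token_map.len() + 1);
            out.push_str(&token);
            token_map.insert(token, text[i..i + 11].to_string());
            i += 11;
            plain_start = i;
        } else {
            i += 1;
        }
    }
    out.push_str(&text[plain_start..]);
    (out, token_map)
}

/// Restores original values by substituting each token back into the text.
fn hydrate(text: &str, token_map: &HashMap<String, String>) -> String {
    let mut out = text.to_string();
    for (token, original) in token_map {
        out = out.replace(token.as_str(), original);
    }
    out
}
```

Keeping the token map outside the text sent to the agent is what makes the tokens opaque: the receiving agent sees only the placeholder, and hydration requires access to the map.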
Memory Isolation
Each agent participating in a workflow maintains its own separate memory scope:
- The ScopedMemory wrapper validates caller == scope_owner on every operation
- Cross-agent memory access returns a MemoryViolation error
- Data passes between agents only through step outputs (with PII tokenization)
- This is enforced at the memory layer, not the workflow layer — even if an agent somehow obtains another agent's ID, the memory access is denied
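The ownership check can be sketched as a wrapper that refuses any operation whose caller is not the scope owner. Only the names ScopedMemory, MemoryViolation, and AgentId come from the text above; the field layout, the MemoryError enum, and the method signatures are assumptions.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct AgentId(String);

#[derive(Debug, PartialEq)]
enum MemoryError {
    MemoryViolation { caller: AgentId, scope_owner: AgentId },
}

/// Hypothetical layout: one key/value store owned by exactly one agent.
struct ScopedMemory {
    scope_owner: AgentId,
    entries: HashMap<String, String>,
}

impl ScopedMemory {
    fn new(scope_owner: AgentId) -> Self {
        ScopedMemory { scope_owner, entries: HashMap::new() }
    }

    /// Validates caller == scope_owner; every operation calls this first.
    fn check(&self, caller: &AgentId) -> Result<(), MemoryError> {
        if *caller == self.scope_owner {
            Ok(())
        } else {
            Err(MemoryError::MemoryViolation {
                caller: caller.clone(),
                scope_owner: self.scope_owner.clone(),
            })
        }
    }

    fn put(&mut self, caller: &AgentId, key: &str, value: &str) -> Result<(), MemoryError> {
        self.check(caller)?;
        self.entries.insert(key.to_string(), value.to_string());
        Ok(())
    }

    fn get(&self, caller: &AgentId, key: &str) -> Result<Option<&str>, MemoryError> {
        self.check(caller)?;
        Ok(self.entries.get(key).map(String::as_str))
    }
}
```

Because the check lives inside the wrapper, knowing another agent's ID is not enough: the caller identity passed to each operation must match the owner of that scope.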
Validation at Step Boundaries
Before each step executes, the engine validates:
- Target agent has its own metadata entry (owns a memory scope)
- Source and target agents are distinct (prevents self-referential memory confusion)
Timeout Enforcement
Each step has a configurable timeout (1–3600 seconds, default 120):
- The step execution is wrapped in tokio::time::timeout
- On timeout: behavior depends on error_mode (Fail, Skip, or Retry)
- FanOut steps each have their own individual timeouts running in parallel
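The timeout rules can be sketched with the standard library alone. Two assumptions apply: out-of-range values are rejected rather than clamped (the text above does not say which), and a worker thread with recv_timeout stands in for tokio::time::timeout. resolve_timeout and run_with_timeout are hypothetical names.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

const MIN_TIMEOUT_SECS: u64 = 1;
const MAX_TIMEOUT_SECS: u64 = 3600;
const DEFAULT_TIMEOUT_SECS: u64 = 120;

/// Resolves a step's timeout: the default when unset, an error when out of range.
/// Assumption: invalid values are rejected rather than clamped.
fn resolve_timeout(configured_secs: Option<u64>) -> Result<Duration, String> {
    let secs = configured_secs.unwrap_or(DEFAULT_TIMEOUT_SECS);
    if (MIN_TIMEOUT_SECS..=MAX_TIMEOUT_SECS).contains(&secs) {
        Ok(Duration::from_secs(secs))
    } else {
        Err(format!(
            "timeout must be {}-{} seconds, got {}",
            MIN_TIMEOUT_SECS, MAX_TIMEOUT_SECS, secs
        ))
    }
}

/// Runs `step` on a worker thread and gives up waiting after `limit`.
/// Unlike tokio::time::timeout, which drops the pending future, the worker
/// thread here keeps running in the background after a timeout.
fn run_with_timeout<F>(step: F, limit: Duration) -> Result<String, String>
where
    F: FnOnce() -> String + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that error.
        let _ = tx.send(step());
    });
    match rx.recv_timeout(limit) {
        Ok(output) => Ok(output),
        Err(mpsc::RecvTimeoutError::Timeout) => Err("timeout".to_string()),
        Err(mpsc::RecvTimeoutError::Disconnected) => Err("step panicked".to_string()),
    }
}
```

The Err returned on timeout is what the step's error_mode would then act on, in the same way as any other step failure.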
Workflow Run Lifecycle
Run Retention
The engine retains up to 200 workflow runs. When that limit is exceeded, the oldest completed or failed runs are evicted first, ordered by started_at.
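A sketch of this retention policy, assuming runs that are still in progress are never evicted (the text names only completed and failed runs as eviction candidates). RunState, WorkflowRun, and evict_excess are hypothetical names, and started_at is modeled as a plain integer timestamp.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum RunState {
    Running,
    Completed,
    Failed,
}

#[derive(Debug, Clone, PartialEq)]
struct WorkflowRun {
    id: u32,
    state: RunState,
    started_at: u64, // e.g. seconds since the Unix epoch
}

const MAX_RETAINED_RUNS: usize = 200;

/// Evicts the oldest finished runs (smallest started_at first) until at most
/// `max` runs remain. Running runs are never evicted in this sketch.
fn evict_excess(runs: &mut Vec<WorkflowRun>, max: usize) {
    while runs.len() > max {
        let oldest_finished = runs
            .iter()
            .enumerate()
            .filter(|(_, run)| run.state != RunState::Running)
            .min_by_key(|(_, run)| run.started_at)
            .map(|(index, _)| index);
        match oldest_finished {
            Some(index) => {
                runs.remove(index);
            }
            // Only running runs remain, so nothing can be evicted.
            None => break,
        }
    }
}
```

Calling evict_excess(&mut runs, MAX_RETAINED_RUNS) after each new run is recorded would keep the store within the limit.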
Agent Resolution
Steps reference agents by ID or name:
{"ById": "550e8400-e29b-41d4-a716-446655440000"}
{"ByName": "claims-intake-agent"}
Resolution happens at run start:
- ById: validates the UUID exists in the agent registry
- ByName: scans the registry for the first agent matching the name
If an agent is not found, the workflow fails before executing any steps.
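Resolution at run start can be sketched as a lookup over an in-memory registry. The ById and ByName variants mirror the JSON above; AgentRef, AgentEntry, the slice-based registry, and both function names are assumptions made for illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
enum AgentRef {
    ById(String),
    ByName(String),
}

#[derive(Debug, Clone, PartialEq)]
struct AgentEntry {
    id: String,
    name: String,
}

/// Resolves one reference: an exact ID match, or the first entry whose
/// name matches.
fn resolve_agent<'a>(
    registry: &'a [AgentEntry],
    agent_ref: &AgentRef,
) -> Result<&'a AgentEntry, String> {
    let found = match agent_ref {
        AgentRef::ById(id) => registry.iter().find(|agent| &agent.id == id),
        AgentRef::ByName(name) => registry.iter().find(|agent| &agent.name == name),
    };
    found.ok_or_else(|| format!("agent not found: {:?}", agent_ref))
}

/// Resolves every step's agent up front, so a missing agent fails the run
/// before any step executes.
fn resolve_all<'a>(
    registry: &'a [AgentEntry],
    refs: &[AgentRef],
) -> Result<Vec<&'a AgentEntry>, String> {
    refs.iter().map(|r| resolve_agent(registry, r)).collect()
}
```

Because ByName returns the first match, two agents sharing a name resolve according to registry order; ById avoids that ambiguity.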
Invocation Source Metadata
Each workflow step is invoked through the unified invocation layer with InvocationSource::Workflow metadata:
InvocationSource::Workflow {
workflow_id: "a1b2c3d4-...",
step_index: 2,
upstream_agent_id: Some(AgentId("550e8400-..."))
}
This metadata flows through rate limiting, telemetry, and audit — each step invocation is separately tracked.