Agent Execution Model

How a message flows from entry point, across the Hoziron↔OpenFang boundary, through the LLM tool-use loop, and back — including the mediated ingress/egress seams, memory recall, session management, and error recovery. Grounded in crates/platform/hoziron-core/src/agent/ and crates/platform/hoziron-core/src/invocation/, the kernel dispatch() seam (ADR-036), and the outbound mediation seam (ADR-042/047).

Message Flow Overview

Every agent turn is core-mediated end to end (ADR-061): the kernel's raw send_message* family is not a callable turn-origin from outside hoziron-core, so this sequence is the only path a turn can take, regardless of whether it originated from the API, a channel, a cron tick, or a workflow step.

The kernel.dispatch(ExecutionRequest) call is the ADR-036 dispatch seam: core resolves and hands the kernel a concrete model_target, tool allowlist, and already-tokenized prompt as plain parameters. There is no shared-state manifest mutation and no Drop-guard restore pattern — the pre-fork implementation's RoutingDispatchGuard is gone (grep-proven zero references).
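
As a rough illustration of "plain parameters, no shared state", the seam could look like the sketch below. Only the names `ExecutionRequest`, `dispatch`, and `model_target` come from the text above; the field types, the `Kernel` stand-in, and the string return value are assumptions made for the example, not the real signatures.

```rust
/// Hypothetical shape of the request handed across the dispatch seam.
/// Field types are assumptions; only the concepts come from the text.
#[derive(Debug, Clone, PartialEq)]
pub struct ExecutionRequest {
    /// Concrete model target, already resolved by core.
    pub model_target: String,
    /// Tool names the kernel may offer to the model for this turn.
    pub tool_allowlist: Vec<String>,
    /// Prompt that core has already tokenized before the handoff.
    pub prompt: String,
}

/// Stand-in for the kernel side of the seam (hypothetical).
pub struct Kernel;

impl Kernel {
    /// Takes the request by value: everything the turn needs arrives as
    /// arguments, so there is no shared state to mutate or restore.
    pub fn dispatch(&self, req: ExecutionRequest) -> String {
        format!(
            "model={} tools={} prompt_len={}",
            req.model_target,
            req.tool_allowlist.len(),
            req.prompt.len()
        )
    }
}
```

Because the request is moved into `dispatch`, two concurrent turns for the same agent cannot observe each other's model or tool selection in this sketch.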

Pre-Execution Validation

Before the agent loop begins, several checks are enforced:

Agent state — must be Running. Any other state returns an InvalidState error.
Message length — maximum 128,000 characters. Exceeding the limit returns SizeLimitExceeded.
Tool allowlist resolution — the equipped competency's required skills are resolved to concrete tool names at execution time (not just at equip time). This ensures changes to skills are reflected immediately.
Model resolution — the platform resolves which provider and model to use:
- Apply defaults if agent has empty provider/model_id
- Resolve via model catalog (alias expansion)
- Apply complexity routing (optional)
- Check circuit breaker state
- Find alternative provider if primary is tripped
- Pass the resolved model/provider to the kernel as the model_target on the ExecutionRequest (the kernel manifest is not mutated; see ADR-036)
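
The first two checks can be sketched as follows. This is a minimal sketch, not the crate's code: the `Suspended` and `Terminated` state names and the `PreflightError` type are hypothetical, while `Running`, `InvalidState`, `SizeLimitExceeded`, and the 128,000-character limit come from the list above.

```rust
/// Agent lifecycle states. Only `Running` is named in the text;
/// the other variants are placeholders.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AgentState {
    Running,
    Suspended,
    Terminated,
}

#[derive(Debug, PartialEq)]
pub enum PreflightError {
    InvalidState,
    SizeLimitExceeded,
}

pub const MAX_MESSAGE_CHARS: usize = 128_000;

/// Runs the state and size checks before the agent loop starts.
pub fn preflight(state: AgentState, message: &str) -> Result<(), PreflightError> {
    if state != AgentState::Running {
        return Err(PreflightError::InvalidState);
    }
    // The limit is stated in characters, so count chars rather than bytes.
    if message.chars().count() > MAX_MESSAGE_CHARS {
        return Err(PreflightError::SizeLimitExceeded);
    }
    Ok(())
}
```

Counting characters instead of bytes means a message of exactly 128,000 multi-byte characters still passes.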

The Agent Loop

The core execution engine runs an iterative loop that alternates between LLM calls and tool execution.

Key Parameters

Parameter	Value	Description
`MAX_ITERATIONS`	50	Maximum LLM call iterations per invocation
`MAX_RETRIES`	3	Retries for rate-limited/overloaded API calls
`BASE_RETRY_DELAY_MS`	1000	Base for exponential backoff
`TOOL_TIMEOUT_SECS`	120	Per-tool execution timeout
`AGENT_TOOL_TIMEOUT_SECS`	600	Timeout for inter-agent tools (agent_send, agent_spawn)
`MAX_CONTINUATIONS`	5	MaxTokens continuations before returning partial
`DEFAULT_CONTEXT_WINDOW`	200,000	Token budget for context management
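
To show how `MAX_ITERATIONS` and `MAX_CONTINUATIONS` bound the loop, here is a simplified sketch. The `LlmResponse` and `LoopOutcome` types and the closure-based `run_loop` signature are assumptions for illustration, and the exact point at which a partial result is returned (here, on the fifth MaxTokens response) is a guess at the semantics rather than confirmed behavior.

```rust
pub const MAX_ITERATIONS: usize = 50;
pub const MAX_CONTINUATIONS: usize = 5;

/// Simplified view of what one LLM call can return (hypothetical type).
#[derive(Debug)]
pub enum LlmResponse {
    EndTurn(String),
    ToolUse(Vec<String>),
    MaxTokens(String),
}

#[derive(Debug, PartialEq)]
pub enum LoopOutcome {
    Complete(String),
    Partial(String),
    IterationLimit,
}

/// Alternates LLM calls and tool execution until the model ends the turn
/// or one of the two bounds is hit.
pub fn run_loop(
    mut call_llm: impl FnMut(usize) -> LlmResponse,
    mut run_tool: impl FnMut(&str),
) -> LoopOutcome {
    let mut text = String::new();
    let mut continuations = 0;
    for iteration in 0..MAX_ITERATIONS {
        match call_llm(iteration) {
            LlmResponse::EndTurn(chunk) => {
                text.push_str(&chunk);
                return LoopOutcome::Complete(text);
            }
            LlmResponse::ToolUse(calls) => {
                // Results would be appended to history before the next call.
                for name in &calls {
                    run_tool(name);
                }
            }
            LlmResponse::MaxTokens(chunk) => {
                // Output was cut off: keep the text and ask the model to continue.
                text.push_str(&chunk);
                continuations += 1;
                if continuations >= MAX_CONTINUATIONS {
                    return LoopOutcome::Partial(text);
                }
            }
        }
    }
    LoopOutcome::IterationLimit
}
```

A model that keeps requesting tools without ever ending the turn exhausts the 50 iterations and yields `IterationLimit` in this sketch.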

Environment Overrides

Variable	Purpose
`HOZIRON_TOOL_TIMEOUT_SECS`	Override tool timeout (0 = disable)
`HOZIRON_AGENT_TOOL_TIMEOUT_SECS`	Override agent delegation timeout (0 = disable)

Note: These environment variables control the execution kernel's timeout behavior. They are part of the Hoziron platform configuration namespace.
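
One way to read such an override is sketched below. The helper names are hypothetical, and falling back to the default on an unparseable value is an assumption; the text only specifies that 0 disables the limit.

```rust
use std::time::Duration;

pub const TOOL_TIMEOUT_SECS: u64 = 120;

/// Resolves a timeout from an optional raw override.
/// Returns `None` when the limit is disabled (override of 0).
/// Unparseable values fall back to the default (an assumption).
pub fn resolve_timeout(raw: Option<&str>, default_secs: u64) -> Option<Duration> {
    let secs = raw
        .and_then(|value| value.trim().parse::<u64>().ok())
        .unwrap_or(default_secs);
    if secs == 0 {
        None
    } else {
        Some(Duration::from_secs(secs))
    }
}

/// Reads the regular tool timeout from the environment.
pub fn tool_timeout_from_env() -> Option<Duration> {
    let raw = std::env::var("HOZIRON_TOOL_TIMEOUT_SECS").ok();
    resolve_timeout(raw.as_deref(), TOOL_TIMEOUT_SECS)
}
```

Representing "disabled" as `None` rather than a zero duration keeps a caller from accidentally applying a timeout that fires immediately.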

Memory Recall

Before the first LLM call, the loop recalls relevant memories using the user's message as a query:

Vector similarity (preferred) — if an embedding driver is configured, the message is embedded and used for approximate nearest-neighbor search against the agent's memory store
Text search (fallback) — if embedding fails or isn't configured, falls back to text-based recall

Up to 5 memory fragments are retrieved and injected into the system prompt as a structured section.
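
The preferred-then-fallback order can be sketched like this. The `Memory` type and `recall` signature are hypothetical, and the vector path uses an exhaustive cosine-similarity scan rather than a real approximate nearest-neighbor index; a substring match stands in for text search.

```rust
pub const MAX_RECALLED: usize = 5;

/// A stored memory fragment with its embedding (hypothetical shape).
pub struct Memory {
    pub text: String,
    pub embedding: Vec<f32>,
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// `query_embedding` is `None` when no embedding driver is configured
/// or when embedding the query failed; both cases use the text fallback.
pub fn recall<'a>(
    query: &str,
    query_embedding: Option<Vec<f32>>,
    store: &'a [Memory],
) -> Vec<&'a str> {
    if let Some(q) = query_embedding {
        let mut scored: Vec<(f32, &str)> = store
            .iter()
            .map(|m| (cosine(&q, &m.embedding), m.text.as_str()))
            .collect();
        // Highest similarity first.
        scored.sort_by(|a, b| b.0.total_cmp(&a.0));
        return scored
            .into_iter()
            .take(MAX_RECALLED)
            .map(|(_, text)| text)
            .collect();
    }
    // Fallback: case-insensitive substring match.
    let needle = query.to_lowercase();
    store
        .iter()
        .filter(|m| m.text.to_lowercase().contains(&needle))
        .take(MAX_RECALLED)
        .map(|m| m.text.as_str())
        .collect()
}
```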

LLM Call with Retry

Each LLM call is wrapped in retry logic with exponential backoff. The retry logic handles:

Rate limiting (429) — backs off exponentially
Overloaded (529) — same backoff strategy
Fallback models — if configured, tries alternative models before failing
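
A minimal sketch of the backoff schedule, using the `MAX_RETRIES` and `BASE_RETRY_DELAY_MS` values from the parameter table. The doubling factor, the absence of jitter, the `LlmError` type, and the injected `sleep` callback are assumptions for the example; fallback models are not shown.

```rust
use std::time::Duration;

pub const MAX_RETRIES: u32 = 3;
pub const BASE_RETRY_DELAY_MS: u64 = 1000;

#[derive(Debug, PartialEq)]
pub enum LlmError {
    RateLimited, // HTTP 429
    Overloaded,  // HTTP 529
    Other(String),
}

/// Delay before retry number `attempt` (0-based): base * 2^attempt.
pub fn backoff_delay(attempt: u32) -> Duration {
    Duration::from_millis(BASE_RETRY_DELAY_MS * 2u64.pow(attempt))
}

/// Retries only rate-limit and overload errors; anything else is
/// returned to the caller immediately.
pub fn call_with_retry<T>(
    mut call: impl FnMut() -> Result<T, LlmError>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, LlmError> {
    let mut attempt = 0;
    loop {
        match call() {
            Err(LlmError::RateLimited | LlmError::Overloaded) if attempt < MAX_RETRIES => {
                sleep(backoff_delay(attempt));
                attempt += 1;
            }
            other => return other,
        }
    }
}
```

Passing `sleep` in as a parameter keeps the schedule testable without real waiting; production code would pass a function that actually sleeps.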

Tool Execution

When the LLM responds with ToolUse, each tool call is processed:

Loop guard check — prevents infinite loops (same tool + same input repeated)
- Allow — proceed normally
- Warn — proceed but append warning to result
- Block — reject this call, return error to LLM
- CircuitBreak — abort the entire agent loop
BeforeToolCall hook — plugin hook can block execution (observe/allow-deny only — see note below)
Visibility check — the tool must have been in the available_tools list offered to the model in the first place, per the two independent ADR-052 allow-lists (allow_skill_tools for in-process skill tools, allow_integration_tools for MCP/contract tools). This is a visibility control, not the trust boundary; the trust boundary is the outbound mediation step that follows.
Outbound mediation (MCP/contract tools only) — before any outbound call actually leaves the process, it crosses CoreOutboundMediator.mediate() (ADR-042, generalized to REST/SOAP by ADR-047). The mediator classifies the destination against the licence's system-of-record designation and either hydrates authorized PII fields, tokenizes/blocks unauthorized ones, or rejects the call outright — fail-closed. This is the actual data-trust enforcement point; it runs regardless of what the model was shown, so a fabricated or injected call is stopped here even if it was never in available_tools. In-process skill tools (no outbound IO) do not cross this seam.
Timeout-wrapped execution — each tool runs with a configurable timeout:
- Regular tools: 120 seconds
- Inter-agent tools (agent_send, agent_spawn): 600 seconds
- Timeout = 0 disables the limit (for slow local inference)
AfterToolCall hook — observability hook fires post-execution. Distinct in kind from the mediator: the hook is a fan-out, policy-blind observer (feeds the audit chain via AuditSink); the mediator is the single, mandatory, transform-capable authority on the outbound path. The two are never conflated.
Result truncation — large tool outputs are dynamically truncated based on remaining context budget
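
The four loop guard verdicts could be derived from a repeat counter, as in the sketch below. The thresholds (3, 5, 8) and the `LoopGuard` type are hypothetical; the text only states that the guard keys on the same tool being called with the same input.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum Verdict {
    Allow,
    Warn,
    Block,
    CircuitBreak,
}

// Illustrative thresholds only; the real values are not given in the text.
pub const WARN_AT: u32 = 3;
pub const BLOCK_AT: u32 = 5;
pub const BREAK_AT: u32 = 8;

/// Counts identical (tool, input) calls within one agent loop.
#[derive(Default)]
pub struct LoopGuard {
    seen: HashMap<(String, String), u32>,
}

impl LoopGuard {
    pub fn check(&mut self, tool: &str, input: &str) -> Verdict {
        let count = self
            .seen
            .entry((tool.to_string(), input.to_string()))
            .or_insert(0);
        *count += 1;
        match *count {
            n if n >= BREAK_AT => Verdict::CircuitBreak,
            n if n >= BLOCK_AT => Verdict::Block,
            n if n >= WARN_AT => Verdict::Warn,
            _ => Verdict::Allow,
        }
    }
}
```

Changing the input resets the count for that call, so an agent that is paging through results with different arguments is not penalized.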

Tool Error Handling

After all tool calls complete:

If any tool call returned an error, guidance is injected telling the LLM not to fabricate results
If approval was denied, guidance tells the LLM not to retry denied tools
The LLM sees both successful results and error messages in its next turn

Phantom Action Detection

A safety mechanism detects when the LLM claims to have performed an action (sent a message, posted to a channel) without actually calling any tools. When detected:

The claim is captured
A re-prompt is injected: "You claimed to perform an action but did not call any tools..."
The loop continues, forcing the LLM to use actual tools

This prevents hallucinated completions where the agent tells the user "Message sent!" without actually sending anything.
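
A crude version of the check might look like the following. The phrase list and the function name are hypothetical stand-ins for whatever heuristic the real detector uses; the re-prompt text is quoted from above, elision included.

```rust
/// Re-prompt text as quoted in this document (the tail is elided there too).
pub const REPROMPT: &str = "You claimed to perform an action but did not call any tools...";

// Hypothetical claim phrases; the real detector's rules are not specified.
const CLAIM_PHRASES: [&str; 3] = ["message sent", "i have sent", "i posted"];

/// Returns the re-prompt to inject when the reply claims an action
/// but the turn made no tool calls; otherwise returns `None`.
pub fn detect_phantom_action(reply: &str, tool_calls_made: usize) -> Option<&'static str> {
    if tool_calls_made > 0 {
        return None;
    }
    let lower = reply.to_lowercase();
    if CLAIM_PHRASES.iter().any(|phrase| lower.contains(*phrase)) {
        Some(REPROMPT)
    } else {
        None
    }
}
```

The tool-call count is checked first, so a reply that says "Message sent!" after a real send is never flagged.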

Session Management

History Trimming

Before each LLM call, the message history is checked:

Maximum history length: a global default, which a per-agent setting can override
When exceeded: oldest messages are drained
After trimming: history is validated for tool_use/tool_result pairing
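
The reason for the pairing validation is that draining from the front can remove a tool_use while leaving its tool_result behind. A minimal sketch, assuming a simplified `Message` type and handling only that orphaned-result case:

```rust
use std::collections::HashSet;

/// Simplified history entry (hypothetical; real messages carry more data).
#[derive(Debug, Clone, PartialEq)]
pub enum Message {
    User(String),
    Assistant(String),
    ToolUse { id: String },
    ToolResult { id: String },
}

/// Drains the oldest messages beyond `max_messages`, then removes any
/// tool_result whose matching tool_use is no longer in the history.
pub fn trim_history(history: &mut Vec<Message>, max_messages: usize) {
    if history.len() > max_messages {
        let excess = history.len() - max_messages;
        history.drain(..excess);
    }
    // `retain` visits elements in order, so a tool_use is always seen
    // before the tool_result that answers it.
    let mut open_ids: HashSet<String> = HashSet::new();
    history.retain(|message| match message {
        Message::ToolUse { id } => {
            open_ids.insert(id.clone());
            true
        }
        Message::ToolResult { id } => open_ids.contains(id),
        _ => true,
    });
}
```

For example, trimming a four-message history to two entries where the cut falls between a tool_use and its tool_result leaves only the final assistant message.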

Context Overflow Recovery

A multi-stage pipeline handles context overflow:

Guard — compact oversized tool results before LLM call
Recovery Stage 1 — trim old messages
Recovery Stage 2 — more aggressive trimming
Final Error — suggest /reset or /compact

Session Persistence

After the loop completes:

Final assistant message is saved to session (preserving Thinking blocks for reasoning models)
Heartbeat turns are pruned (saves context budget)
The interaction is remembered in the memory substrate (with embedding if available)

Silent Completions

Agents can intentionally choose not to respond by outputting NO_REPLY or [SILENT]. When detected:

An internal marker [no reply needed] is stored in history
The response is returned as an empty string with silent: true
Channel adapters suppress message delivery
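
The handling can be sketched as follows. The `TurnResult` type and `finalize_reply` function are hypothetical, and matching the sentinel against the whole trimmed reply (rather than a prefix or substring) is an assumption.

```rust
#[derive(Debug, PartialEq)]
pub struct TurnResult {
    pub response: String,
    pub silent: bool,
}

/// Marker stored in history in place of the sentinel.
pub const HISTORY_MARKER: &str = "[no reply needed]";

/// Converts the model's raw reply into the result returned to the caller.
pub fn finalize_reply(raw: &str, history: &mut Vec<String>) -> TurnResult {
    let trimmed = raw.trim();
    if trimmed == "NO_REPLY" || trimmed == "[SILENT]" {
        history.push(HISTORY_MARKER.to_string());
        return TurnResult {
            response: String::new(),
            silent: true,
        };
    }
    history.push(raw.to_string());
    TurnResult {
        response: raw.to_string(),
        silent: false,
    }
}
```

A channel adapter would then check `silent` before delivery instead of testing for an empty string, which keeps a legitimately empty reply distinguishable from an intentional silence.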

Resource Tracking

The scheduler tracks per-agent resource usage:

Tokens: rolling 1-hour window, checked against max_llm_tokens_per_hour
Tool calls: tracked per minute
Cost: estimated from token usage × model pricing

When a quota is exceeded, the agent's next invocation is rejected with QuotaExceeded.
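
A rolling one-hour token window can be sketched with a queue of timestamped usage records. The `TokenWindow` type and its methods are hypothetical; only `max_llm_tokens_per_hour` and `QuotaExceeded` are names from the text, and timestamps are plain seconds to keep the example deterministic.

```rust
use std::collections::VecDeque;

pub const WINDOW_SECS: u64 = 3600;

#[derive(Debug, PartialEq)]
pub struct QuotaExceeded;

pub struct TokenWindow {
    max_llm_tokens_per_hour: u64,
    /// (timestamp in seconds, tokens used), oldest first.
    events: VecDeque<(u64, u64)>,
}

impl TokenWindow {
    pub fn new(max_llm_tokens_per_hour: u64) -> Self {
        Self {
            max_llm_tokens_per_hour,
            events: VecDeque::new(),
        }
    }

    /// Records usage after a completed LLM call.
    pub fn record(&mut self, now: u64, tokens: u64) {
        self.events.push_back((now, tokens));
    }

    /// Checked before the next invocation: evicts records older than the
    /// window, then rejects if the remaining total is over the quota.
    pub fn admit(&mut self, now: u64) -> Result<(), QuotaExceeded> {
        while let Some(&(ts, _)) = self.events.front() {
            if now.saturating_sub(ts) >= WINDOW_SECS {
                self.events.pop_front();
            } else {
                break;
            }
        }
        let used: u64 = self.events.iter().map(|&(_, tokens)| tokens).sum();
        if used > self.max_llm_tokens_per_hour {
            Err(QuotaExceeded)
        } else {
            Ok(())
        }
    }
}
```

Because usage is recorded after a call and checked before the next one, the invocation that crosses the limit completes and the following one is rejected, which matches the behavior described above.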

Related:

invocation-model.md — the unified entry point upstream of this flow
../architecture/data-flow.md — the same seam at the architecture layer
provider-routing.md, pii-data-protection.md, permission-model.md
docs/decisions/036-hoziron-openfang-boundary.md, 042-outbound-tool-call-mediation-seam.md, 047-unified-outbound-egress.md, 052-tool-visibility-gate.md, 061-agent-turn-ingress-authority.md