Performance Troubleshooting
High memory, slow responses, and optimization strategies.
High memory usage
Diagnosis:
hoziron status --json
# Check agent count and memory subsystem
ps aux | grep hoziron
Root causes:
- Each agent holds conversation history in memory
- Long-lived agents accumulate context over time
- Many idle agents still hold base memory (~10 MB each)
Fix:
- Reset an agent's session: POST /agents/{id}/session/reset
- Clear history: DELETE /agents/{id}/history
- Terminate idle agents: hoziron agent stop <id>
- Schedule agents (start on demand) rather than running perpetually
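The two HTTP fixes above can be scripted. The following is a minimal sketch, not an official client: the base URL is an assumption (use your instance's address), authentication is not covered, and `build_request` and `send` are hypothetical helper names. Only the two endpoint paths and their methods come from this page.

```python
import urllib.parse
import urllib.request

# Assumption: address of the Hoziron API; replace with your instance's address.
BASE_URL = "http://localhost:8080"


def build_request(agent_id, action, base_url=BASE_URL):
    """Build (but do not send) a request for one of the two cleanup endpoints."""
    safe_id = urllib.parse.quote(agent_id, safe="")
    routes = {
        "reset_session": ("POST", f"/agents/{safe_id}/session/reset"),
        "clear_history": ("DELETE", f"/agents/{safe_id}/history"),
    }
    method, path = routes[action]
    return urllib.request.Request(base_url.rstrip("/") + path, method=method)


def send(request, timeout_secs=30):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout_secs) as response:
        return response.status


# Usage (performs a real HTTP call):
#   send(build_request("my-agent", "reset_session"))
```

Building the request separately from sending it makes it easy to log or review what will be called before touching a live agent.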
Slow responses
Diagnosis:
- Is it the LLM or the platform?
  - If hoziron health shows providers healthy → latency is at the provider
  - If the health check itself is slow → check network/DNS
- Check timeout config:
  hoziron config get server.limits.request_timeout_secs # Default: 600 (10 minutes)
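To judge whether the health check itself is slow, it helps to time it. The sketch below wraps the `hoziron health` command from the diagnosis steps in a generic timer. `time_call` and `run_health_check` are hypothetical helper names, and what counts as "slow" is left to you, since this page gives no threshold.

```python
import subprocess
import time


def time_call(fn):
    """Call fn() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def run_health_check():
    """Run `hoziron health` and return its exit code."""
    completed = subprocess.run(
        ["hoziron", "health"], capture_output=True, text=True
    )
    return completed.returncode


# Usage (requires the hoziron CLI on PATH):
#   exit_code, seconds = time_call(run_health_check)
#   print(f"health check exited {exit_code} after {seconds:.2f}s")
```

If the health check returns quickly and reports healthy providers, the remaining latency is most likely on the provider side.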
Fix:
- For faster responses, use a faster model (Groq with Llama is significantly faster than Claude/GPT-4)
- Configure complexity routing:
  [routing]
  simple_model = "groq/llama-3.1-8b-instant"
  complex_model = "anthropic/claude-sonnet-4-20250514"
  simple_threshold = 100
  complex_threshold = 500
- Check provider quotas: you may be rate-limited at the provider level
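This page does not spell out how the two routing thresholds are applied, so the following is only an illustration of one plausible reading, not Hoziron's actual logic. It assumes each request gets a numeric complexity score, that scores below `simple_threshold` go to the simple model, and that scores at or above `complex_threshold` go to the complex model. What happens in between is unknown here, so the sketch returns a caller-supplied default.

```python
# Values copied from the [routing] example; the selection rule below is assumed.
ROUTING = {
    "simple_model": "groq/llama-3.1-8b-instant",
    "complex_model": "anthropic/claude-sonnet-4-20250514",
    "simple_threshold": 100,
    "complex_threshold": 500,
}


def pick_model(score, routing=ROUTING, default=None):
    """Pick a model for a complexity score under the assumed rule.

    Assumption: score < simple_threshold -> simple model,
    score >= complex_threshold -> complex model, otherwise `default`.
    """
    if score < routing["simple_threshold"]:
        return routing["simple_model"]
    if score >= routing["complex_threshold"]:
        return routing["complex_model"]
    return default


print(pick_model(50))
print(pick_model(900))
```

Under this reading, widening the gap between the two thresholds sends more traffic to whatever the middle band resolves to, so verify the real behavior before tuning.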
Capacity guidelines
| Workload | CPU | Memory | Storage |
|---|---|---|---|
| Minimal (1–5 agents) | 2 cores | 2 GB | 1 GB |
| Standard (10–50 agents) | 4 cores | 4 GB | 10 GB |
| Production (50–200 agents) | 8 cores | 8 GB | 50 GB |
| Enterprise (200+ agents) | 16+ cores | 16+ GB | 100+ GB |
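The table can be turned into a small lookup for provisioning scripts. This is a sketch with one stated assumption: the table's ranges leave a gap (6 to 9 agents) and overlap at 50 and 200, so the code picks the smallest tier whose upper bound covers the agent count. `recommend` is a hypothetical helper name.

```python
# (upper bound on agents, spec); None means no upper bound.
TIERS = [
    (5, {"workload": "Minimal", "cpu": "2 cores", "memory": "2 GB", "storage": "1 GB"}),
    (50, {"workload": "Standard", "cpu": "4 cores", "memory": "4 GB", "storage": "10 GB"}),
    (200, {"workload": "Production", "cpu": "8 cores", "memory": "8 GB", "storage": "50 GB"}),
    (None, {"workload": "Enterprise", "cpu": "16+ cores", "memory": "16+ GB", "storage": "100+ GB"}),
]


def recommend(agent_count):
    """Return the smallest tier whose upper bound covers agent_count."""
    if agent_count < 1:
        raise ValueError("agent_count must be at least 1")
    for upper, spec in TIERS:
        if upper is None or agent_count <= upper:
            return spec
    raise AssertionError("unreachable: the last tier has no upper bound")


print(recommend(30)["workload"])
```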
Per-agent overhead
- Memory: ~10 MB base + conversation context
- Storage: ~1 MB per 1000 messages
- CPU: Negligible when idle; bursts during LLM interactions
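These per-agent figures make a rough sizing estimate straightforward. The sketch below applies them directly. The average context size per agent is an input you must measure or guess, since this page only gives the ~10 MB base, and both function names are hypothetical.

```python
BASE_MEMORY_MB = 10.0                 # ~10 MB base per agent (from this page)
STORAGE_MB_PER_1000_MESSAGES = 1.0    # ~1 MB per 1000 messages (from this page)


def estimate_memory_mb(agent_count, avg_context_mb=0.0):
    """Approximate agent memory: base overhead plus average context per agent."""
    return agent_count * (BASE_MEMORY_MB + avg_context_mb)


def estimate_storage_mb(total_messages):
    """Approximate message storage from the total number of stored messages."""
    return total_messages / 1000 * STORAGE_MB_PER_1000_MESSAGES


print(estimate_memory_mb(50, avg_context_mb=2.0))
print(estimate_storage_mb(250_000))
```

For example, 200 idle agents with no accumulated context come to about 2000 MB of base memory, before counting the platform's own footprint.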
Scaling considerations
- Hoziron is single-process, single-node by design (simplicity over distribution)
- Scale vertically for more agents on one instance
- For multi-team isolation, run separate instances per team
- The circuit breaker protects against provider overload
- Health monitoring auto-recovers degraded subsystems (max 3 attempts with cooldown)
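The "max 3 attempts with cooldown" behavior is a bounded retry pattern. The sketch below illustrates that general pattern only; it is not Hoziron's implementation. The cooldown length is an assumption (this page gives no value), and `recover` and `check` are hypothetical names.

```python
import time


def recover(check, max_attempts=3, cooldown_secs=5.0, sleep=time.sleep):
    """Call check() up to max_attempts times, pausing between failed attempts.

    Returns the attempt number that succeeded, or None if all attempts failed.
    cooldown_secs is an assumed value, not one documented for Hoziron.
    """
    for attempt in range(1, max_attempts + 1):
        if check():
            return attempt
        if attempt < max_attempts:
            sleep(cooldown_secs)
    return None


# Usage with a stand-in check that becomes healthy on the third try:
outcomes = iter([False, False, True])
print(recover(lambda: next(outcomes), sleep=lambda secs: None))
```

Capping the attempts keeps a persistently failing subsystem from being retried forever, which is why a subsystem that stays degraded after the last attempt needs manual attention.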
Storage breakdown
- data/hoziron.db: main state database (agents, sessions, config)
- data/memory/: per-agent KV stores
- data/audit/: audit trail with Merkle chain
- data/packages/: installed catalog packages
If storage is growing fast, check audit trail and agent history accumulation.
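To see which of these locations is growing, compare their sizes over time. The sketch below totals the bytes under each path listed above and ranks them. It assumes it is run from the directory that contains `data/`, and `size_bytes` and `report` are hypothetical helper names.

```python
import os

# Paths from the storage breakdown, relative to the Hoziron working directory.
DATA_PATHS = ["data/hoziron.db", "data/memory", "data/audit", "data/packages"]


def size_bytes(path):
    """Return the size of a file, or the total size of files under a directory.

    Missing paths count as 0 bytes; symlinked files are skipped.
    """
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def report(paths=DATA_PATHS):
    """Return (path, bytes) pairs sorted from largest to smallest."""
    sizes = {path: size_bytes(path) for path in paths}
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)


for path, size in report():
    print(f"{size / 1_000_000:10.1f} MB  {path}")
```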