Provider Troubleshooting

Issues with LLM providers: authentication failures, tripped circuit breakers, connectivity problems, and rate limiting.

Provider auth failure

Symptoms: Requests fail with "Provider authentication failed" or "401 Unauthorized".

Diagnosis:

# Test the provider key
hoziron config test-key anthropic

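# Show which env var the provider reads its key from
hoziron config get providers.anthropic.api_key_env

# Verify that env var is set in the current shell (without printing the secret)
[ -n "$ANTHROPIC_API_KEY" ] && echo "set" || echo "not set"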

Fix:

  1. Verify the API key is valid (not expired, not revoked at the provider's dashboard)
  2. Ensure the env var is available to the daemon process:
    # If using systemd, add to the [Service] section of the unit:
    Environment=ANTHROPIC_API_KEY=sk-ant-...
    
  3. Re-set the key: hoziron config set-key anthropic
  4. Restart the daemon if the env var was changed externally (a config reload picks up file changes, not environment changes)

Provider keys are resolved lazily: a missing key does not raise an error at startup, and the failure only appears on the first request that needs the key.
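
To illustrate why a missing key only surfaces on first use, here is a minimal Python sketch of lazy resolution. It is not Hoziron's implementation; the `LazyKey` class, its `get` method, and the error text are hypothetical names chosen for illustration.

```python
import os


class LazyKey:
    """Hypothetical helper: reads an API key from the environment on first use."""

    def __init__(self, env_var, environ=None):
        self.env_var = env_var
        # Construction never reads the variable, so startup cannot fail here.
        self._environ = os.environ if environ is None else environ
        self._value = None

    def get(self):
        # The lookup happens on the first call; a missing key fails only now.
        if self._value is None:
            value = self._environ.get(self.env_var)
            if not value:
                raise RuntimeError(
                    f"Provider authentication failed: {self.env_var} is not set"
                )
            self._value = value
        return self._value
```

With this pattern, constructing `LazyKey("ANTHROPIC_API_KEY")` always succeeds, which is why running the key test right after a configuration change is a cheap way to catch the problem before real traffic does.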

Circuit breaker tripped

Symptoms: All requests to a provider fail immediately. Logs show "circuit breaker open."

Diagnosis:

hoziron health --json
# Look for providers with state: "Open"

Fix:

The circuit breaker auto-recovers after the cooldown period (default: 60s). After cooldown, one probe request is sent (HalfOpen state). If it succeeds, the circuit closes.

To tune:

  • [health].failure_threshold (default 5): consecutive failures needed to trip the breaker
  • [health].recovery_cooldown_secs (default 60): seconds to wait before sending the probe request
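
The interaction between the two settings can be sketched as a small state machine. This is an illustrative Python model, not Hoziron's code: the `CircuitBreaker` class and its method names are hypothetical, and the behavior on a failed probe (reopening the circuit and restarting the cooldown) is an assumption based on the usual circuit breaker pattern.

```python
import time


class CircuitBreaker:
    """Illustrative Closed / Open / HalfOpen breaker (hypothetical names)."""

    def __init__(self, failure_threshold=5, recovery_cooldown_secs=60,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_cooldown_secs = recovery_cooldown_secs
        self._clock = clock
        self.state = "Closed"
        self._failures = 0
        self._opened_at = None

    def allow_request(self):
        if self.state == "Open":
            # After the cooldown, let exactly one probe request through.
            if self._clock() - self._opened_at >= self.recovery_cooldown_secs:
                self.state = "HalfOpen"
                return True
            return False
        if self.state == "HalfOpen":
            return False  # a probe is already in flight
        return True

    def record_success(self):
        # A successful request (including the probe) closes the circuit.
        self._failures = 0
        self.state = "Closed"

    def record_failure(self):
        if self.state == "HalfOpen":
            self._trip()  # assumption: a failed probe reopens the circuit
            return
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._trip()

    def _trip(self):
        self.state = "Open"
        self._opened_at = self._clock()
        self._failures = 0
```

In this model, lowering `failure_threshold` makes the breaker trip sooner on a flaky provider, while raising `recovery_cooldown_secs` delays the probe and therefore the earliest possible recovery.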

If the provider is genuinely down, configure a fallback via complexity routing:

[routing]
simple_model = "groq/llama-3.1-8b-instant"
medium_model = "anthropic/claude-sonnet-4-20250514"
complex_model = "openai/gpt-4o"

Provider URL not reachable (Docker / air-gapped)

Symptoms: Connection timeout or "connection refused" to Ollama or vLLM.

Common causes:

  • Ollama bound to 127.0.0.1 but Hoziron is in a container
  • Wrong port or hostname

Fix:

  • Start Ollama with OLLAMA_HOST=0.0.0.0 ollama serve
  • In Docker, use --add-host=host.docker.internal:host-gateway and set:
    [providers.ollama]
    base_url = "http://host.docker.internal:11434"
    
  • Verify from inside the container:
    wget -qO- http://host.docker.internal:11434/api/tags
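
If the container image ships Python but not wget, a plain TCP check can separate "nothing is listening" from HTTP-level problems. This is a minimal sketch using only the standard library; the `can_connect` helper is a hypothetical name, and it only tests that the port accepts connections, not that Ollama answers correctly.

```python
import socket
from urllib.parse import urlparse


def can_connect(base_url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(base_url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    print(can_connect("http://host.docker.internal:11434"))
```

A `False` result here points at binding, port, or hostname issues rather than at the provider configuration itself.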
    

Rate limiting (429)

If a provider returns 429, Hoziron retries with exponential backoff (up to 3 attempts). If all retries fail and fallback models are configured, the next model in the chain is tried.
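
The retry-then-fallback sequence can be sketched as follows. This Python model is illustrative only: the function names, the `RateLimited` exception, the 1-second base delay, and the reading of "up to 3 attempts" as three total attempts per model are all assumptions, not Hoziron's documented behavior.

```python
import time


class RateLimited(Exception):
    """Stands in for an HTTP 429 response from a provider."""


def call_with_backoff(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send(); on RateLimited, wait base_delay * 2**attempt and retry."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # retries exhausted for this model
            sleep(base_delay * 2 ** attempt)


def call_with_fallback(models, send_to, **backoff_kwargs):
    """Try each model in order, moving on once its retries are exhausted."""
    last_error = None
    for model in models:
        try:
            return call_with_backoff(lambda m=model: send_to(m), **backoff_kwargs)
        except RateLimited as err:
            last_error = err
    if last_error is None:
        raise ValueError("models must not be empty")
    raise last_error
```

Under these assumptions, a request to a rate-limited first model waits 1 s and then 2 s between its three attempts before the second model in the chain is tried.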

To reduce rate limiting:

  • Use complexity routing to spread load across providers
  • Reduce concurrent agent count
  • Check your provider's rate limit tier
