
Fallback Providers

> 📖 Translated from the official Hermes Agent documentation. Last updated: 2026-04-16


Hermes Agent has three layers of resilience that keep your sessions running when providers hit issues:

  1. Credential pools — rotate across multiple API keys for the same provider (tried first)
  2. Primary model fallback — automatically switches to a different provider:model pair when your main model fails
  3. Auxiliary task fallback — independent provider resolution for side tasks like vision, compression, and web extraction

Credential pools handle same-provider rotation (e.g., multiple OpenRouter keys). This page covers cross-provider fallback. Both are optional and work independently.

Primary Model Fallback

When your main LLM provider encounters errors — rate limits, server overload, auth failures, connection drops — Hermes can automatically switch to a backup provider:model pair mid-session without losing your conversation.

Configuration

Add a fallback_model section to ~/.hermes/config.yaml:

```yaml
fallback_model:
  provider: openrouter
  model: anthropic/claude-sonnet-4
```

Both provider and model are required. If either is missing, the fallback is disabled.
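That validation rule can be sketched as follows (the function name and dict shape are illustrative, not Hermes internals):

```python
def fallback_enabled(config: dict) -> bool:
    """A fallback_model entry only counts if both fields are present and non-empty."""
    fb = config.get("fallback_model") or {}
    return bool(fb.get("provider")) and bool(fb.get("model"))

# Both fields present: fallback is active.
assert fallback_enabled({"fallback_model": {"provider": "openrouter",
                                            "model": "anthropic/claude-sonnet-4"}})
# Missing model: fallback is silently disabled.
assert not fallback_enabled({"fallback_model": {"provider": "openrouter"}})
```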

Supported Providers

| Provider | Value | Requirements |
|---|---|---|
| AI Gateway | `ai-gateway` | `AI_GATEWAY_API_KEY` |
| OpenRouter | `openrouter` | `OPENROUTER_API_KEY` |
| Nous Portal | `nous` | `hermes auth` (OAuth) |
| OpenAI Codex | `openai-codex` | `hermes model` (ChatGPT OAuth) |
| GitHub Copilot | `copilot` | `COPILOT_GITHUB_TOKEN`, `GH_TOKEN`, or `GITHUB_TOKEN` |
| GitHub Copilot ACP | `copilot-acp` | External process (editor integration) |
| Anthropic | `anthropic` | `ANTHROPIC_API_KEY` or Claude Code credentials |
| z.ai / GLM | `zai` | `GLM_API_KEY` |
| Kimi / Moonshot | `kimi-coding` | `KIMI_API_KEY` |
| MiniMax | `minimax` | `MINIMAX_API_KEY` |
| MiniMax (China) | `minimax-cn` | `MINIMAX_CN_API_KEY` |
| DeepSeek | `deepseek` | `DEEPSEEK_API_KEY` |
| OpenCode Zen | `opencode-zen` | `OPENCODE_ZEN_API_KEY` |
| OpenCode Go | `opencode-go` | `OPENCODE_GO_API_KEY` |
| Kilo Code | `kilocode` | `KILOCODE_API_KEY` |
| Xiaomi MiMo | `xiaomi` | `XIAOMI_API_KEY` |
| Arcee AI | `arcee` | `ARCEEAI_API_KEY` |
| Alibaba / DashScope | `alibaba` | `DASHSCOPE_API_KEY` |
| Hugging Face | `huggingface` | `HF_TOKEN` |
| Custom endpoint | `custom` | `base_url` + `api_key_env` (see below) |

Custom Endpoint Fallback

For a custom OpenAI-compatible endpoint, add base_url and optionally api_key_env:

```yaml
fallback_model:
  provider: custom
  model: my-local-model
  base_url: http://localhost:8000/v1
  api_key_env: MY_LOCAL_KEY          # env var name containing the API key
```

When Fallback Triggers

The fallback activates automatically when the primary model fails with:

  • Rate limits (HTTP 429) — after exhausting retry attempts
  • Server errors (HTTP 500, 502, 503) — after exhausting retry attempts
  • Auth failures (HTTP 401, 403) — immediately (no point retrying)
  • Not found (HTTP 404) — immediately
  • Invalid responses — when the API returns malformed or empty responses repeatedly
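The decision rules above can be sketched as a small classifier (the function name and return values are illustrative assumptions, not the Hermes API):

```python
def fallback_action(status: int, retries_left: int) -> str:
    """Classify a failed primary-model call per the documented trigger rules.

    Returns "retry", "fallback", or "raise".
    """
    if status in (401, 403, 404):          # auth failures / not found: no point retrying
        return "fallback"
    if status in (429, 500, 502, 503):     # rate limits / server errors: retry first
        return "retry" if retries_left > 0 else "fallback"
    return "raise"                         # anything else surfaces to the caller

assert fallback_action(401, retries_left=5) == "fallback"   # immediate switch
assert fallback_action(429, retries_left=2) == "retry"      # still retrying
assert fallback_action(503, retries_left=0) == "fallback"   # retries exhausted
```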

When triggered, Hermes:

  1. Resolves credentials for the fallback provider
  2. Builds a new API client
  3. Swaps the model, provider, and client in-place
  4. Resets the retry counter and continues the conversation

The switch is seamless — your conversation history, tool calls, and context are preserved. The agent continues from exactly where it left off, just using a different model.
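Conceptually, the four-step swap can be sketched like this (the `Session` class and hook names are illustrative, not the real Hermes internals; note that `history` is never touched):

```python
class Session:
    """Toy session illustrating the in-place provider swap."""

    def __init__(self, provider, model, history):
        self.provider, self.model = provider, model
        self.history = history                              # preserved across the swap
        self.client = None
        self.retries = 0

    def switch_to_fallback(self, fb_provider, fb_model,
                           resolve_credentials, build_client):
        creds = resolve_credentials(fb_provider)            # 1. resolve credentials
        self.client = build_client(fb_provider, creds)      # 2. build a new API client
        self.provider, self.model = fb_provider, fb_model   # 3. swap in place
        self.retries = 0                                    # 4. reset retry counter

# Demo with stub hooks: the conversation survives the switch untouched.
s = Session("anthropic", "claude-sonnet-4-6", history=["hello"])
s.switch_to_fallback("openrouter", "anthropic/claude-sonnet-4",
                     resolve_credentials=lambda p: "stub-key",
                     build_client=lambda p, c: object())
assert s.provider == "openrouter" and s.history == ["hello"]
```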


Examples

OpenRouter as fallback for Anthropic native:

```yaml
model:
  provider: anthropic
  default: claude-sonnet-4-6

fallback_model:
  provider: openrouter
  model: anthropic/claude-sonnet-4
```

Nous Portal as fallback for OpenRouter:

```yaml
model:
  provider: openrouter
  default: anthropic/claude-opus-4

fallback_model:
  provider: nous
  model: nous-hermes-3
```

Local model as fallback for cloud:

```yaml
fallback_model:
  provider: custom
  model: llama-3.1-70b
  base_url: http://localhost:8000/v1
  api_key_env: LOCAL_API_KEY
```

Codex OAuth as fallback:

```yaml
fallback_model:
  provider: openai-codex
  model: gpt-5.3-codex
```

Where Fallback Works

| Context | Fallback Supported |
|---|---|
| CLI sessions | ✔ |
| Messaging gateway (Telegram, Discord, etc.) | ✔ |
| Subagent delegation | ✘ (subagents do not inherit fallback config) |
| Cron jobs | ✘ (run with a fixed provider) |
| Auxiliary tasks (vision, compression) | ✘ (use their own provider chain — see below) |



Auxiliary Task Fallback

Hermes uses separate lightweight models for side tasks. Each task has its own provider resolution chain that acts as a built-in fallback system.

Tasks with Independent Provider Resolution

| Task | What It Does | Config Key |
|---|---|---|
| Vision | Image analysis, browser screenshots | `auxiliary.vision` |
| Web Extract | Web page summarization | `auxiliary.web_extract` |
| Compression | Context compression summaries | `auxiliary.compression` |
| Session Search | Past session summarization | `auxiliary.session_search` |
| Skills Hub | Skill search and discovery | `auxiliary.skills_hub` |
| MCP | MCP helper operations | `auxiliary.mcp` |
| Memory Flush | Memory consolidation | `auxiliary.flush_memories` |

Auto-Detection Chain

When a task's provider is set to "auto" (the default), Hermes tries providers in order until one works:

For text tasks (compression, web extract, etc.):

OpenRouter → Nous Portal → Custom endpoint → Codex OAuth →
API-key providers (z.ai, Kimi, MiniMax, Xiaomi MiMo, Hugging Face, Anthropic) → give up

For vision tasks:

Main provider (if vision-capable) → OpenRouter → Nous Portal →
Codex OAuth → Anthropic → Custom endpoint → give up

If the resolved provider fails at call time, Hermes also has an internal retry: if the provider is not OpenRouter and no explicit base_url is set, it tries OpenRouter as a last-resort fallback.
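The API-key portion of the text-task chain can be sketched as a resolver that probes credentials in order (the OAuth and custom-endpoint steps are omitted, and the resolver itself is an illustrative sketch, not Hermes code):

```python
import os

# Probe order follows the documented text-task chain; the OAuth steps
# (Nous Portal, Codex) and the custom-endpoint step are left out here.
TEXT_CHAIN = [
    ("openrouter", "OPENROUTER_API_KEY"),
    ("zai", "GLM_API_KEY"),
    ("kimi-coding", "KIMI_API_KEY"),
    ("minimax", "MINIMAX_API_KEY"),
    ("xiaomi", "XIAOMI_API_KEY"),
    ("huggingface", "HF_TOKEN"),
    ("anthropic", "ANTHROPIC_API_KEY"),
]

def resolve_auto(chain, env=os.environ):
    """Return the first provider whose credential is present, else None ("give up")."""
    for provider, key_var in chain:
        if env.get(key_var):
            return provider
    return None

# Only a GLM key configured: the chain settles on z.ai.
assert resolve_auto(TEXT_CHAIN, {"GLM_API_KEY": "k"}) == "zai"
```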

Configuring Auxiliary Providers

Each task can be configured independently in config.yaml:

```yaml
auxiliary:
  vision:
    provider: "auto"              # auto | openrouter | nous | codex | main | anthropic
    model: ""                     # e.g. "openai/gpt-4o"
    base_url: ""                  # direct endpoint (takes precedence over provider)
    api_key: ""                   # API key for base_url

  web_extract:
    provider: "auto"
    model: ""

  compression:
    provider: "auto"
    model: ""

  session_search:
    provider: "auto"
    model: ""

  skills_hub:
    provider: "auto"
    model: ""

  mcp:
    provider: "auto"
    model: ""

  flush_memories:
    provider: "auto"
    model: ""
```

Every task above follows the same provider / model / base_url pattern. Context compression is configured under auxiliary.compression:

```yaml
auxiliary:
  compression:
    provider: main                                    # Same provider options as other auxiliary tasks
    model: google/gemini-3-flash-preview
    base_url: null                                    # Custom OpenAI-compatible endpoint
```

And the fallback model uses:

```yaml
fallback_model:
  provider: openrouter
  model: anthropic/claude-sonnet-4
  # base_url: http://localhost:8000/v1               # Optional custom endpoint
```

All three — auxiliary, compression, fallback — work the same way: set provider to pick who handles the request, model to pick which model, and base_url to point at a custom endpoint (overrides provider).

Provider Options for Auxiliary Tasks

These options apply to auxiliary:, compression:, and fallback_model: configs only — "main" is not a valid value for your top-level model.provider. For custom endpoints, use provider: custom in your model: section (see AI Providers).

| Provider | Description | Requirements |
|---|---|---|
| `"auto"` | Try providers in order until one works (default) | At least one provider configured |
| `"openrouter"` | Force OpenRouter | `OPENROUTER_API_KEY` |
| `"nous"` | Force Nous Portal | `hermes auth` |
| `"codex"` | Force Codex OAuth | `hermes model` → Codex |
| `"main"` | Use whatever provider the main agent uses (auxiliary tasks only) | Active main provider configured |
| `"anthropic"` | Force Anthropic native | `ANTHROPIC_API_KEY` or Claude Code credentials |

Direct Endpoint Override

For any auxiliary task, setting base_url bypasses provider resolution entirely and sends requests directly to that endpoint:

```yaml
auxiliary:
  vision:
    base_url: "http://localhost:1234/v1"
    api_key: "local-key"
    model: "qwen2.5-vl"
```

base_url takes precedence over provider. Hermes uses the configured api_key for authentication, falling back to OPENAI_API_KEY if not set. It does not reuse OPENROUTER_API_KEY for custom endpoints.
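That precedence can be expressed as a small resolver (a sketch of the documented behavior; the function name is hypothetical):

```python
import os

def resolve_endpoint_auth(task_cfg: dict, env=os.environ):
    """Return (base_url, api_key) for a direct endpoint, or None when
    normal provider resolution should apply.

    Mirrors the documented precedence: base_url bypasses provider
    resolution, and its key falls back to OPENAI_API_KEY when api_key
    is unset. OPENROUTER_API_KEY is never consulted here.
    """
    base_url = task_cfg.get("base_url")
    if not base_url:
        return None                     # no direct endpoint configured
    key = task_cfg.get("api_key") or env.get("OPENAI_API_KEY", "")
    return base_url, key

cfg = {"base_url": "http://localhost:1234/v1", "api_key": "local-key"}
assert resolve_endpoint_auth(cfg, {}) == ("http://localhost:1234/v1", "local-key")
```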


Context Compression Fallback

Context compression uses the auxiliary.compression config block to control which model and provider handles summarization:

```yaml
auxiliary:
  compression:
    provider: "auto"                              # auto | openrouter | nous | main
    model: "google/gemini-3-flash-preview"
```


If no provider is available for compression, Hermes drops middle conversation turns without generating a summary rather than failing the session.
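That degraded mode can be sketched as follows (the function name and the head/tail split sizes are illustrative assumptions, not Hermes parameters):

```python
def compress_history(turns, summarize=None, keep_head=2, keep_tail=4):
    """Compress the middle of a conversation.

    With a summarizer available, the middle turns are replaced by one
    summary entry; with none (no provider), they are dropped outright.
    """
    if len(turns) <= keep_head + keep_tail:
        return list(turns)                          # nothing to compress
    head = turns[:keep_head]
    middle = turns[keep_head:-keep_tail]
    tail = turns[-keep_tail:]
    if summarize is None:
        return head + tail                          # degraded: drop middle, no summary
    return head + [summarize(middle)] + tail

turns = list(range(10))
assert compress_history(turns, summarize=None) == [0, 1, 6, 7, 8, 9]
```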


Delegation Provider Override

Subagents spawned by delegate_task do not use the primary fallback model. However, they can be routed to a different provider:model pair for cost optimization:

```yaml
delegation:
  provider: "openrouter"                      # override provider for all subagents
  model: "google/gemini-3-flash-preview"      # override model
  # base_url: "http://localhost:1234/v1"      # or use a direct endpoint
  # api_key: "local-key"
```

See Subagent Delegation for full configuration details.


Cron Job Providers

Cron jobs run with whatever provider is configured at execution time. They do not support a fallback model. To use a different provider for cron jobs, configure provider and model overrides on the cron job itself:

```
cronjob(
    action="create",
    schedule="every 2h",
    prompt="Check server status",
    provider="openrouter",
    model="google/gemini-3-flash-preview"
)
```

See Scheduled Tasks (Cron) for full configuration details.


Summary

| Feature | Fallback Mechanism | Config Location |
|---|---|---|
| Main agent model | `fallback_model` in config.yaml — one-shot failover on errors | `fallback_model:` (top-level) |
| Vision | Auto-detection chain + internal OpenRouter retry | `auxiliary.vision` |
| Web extraction | Auto-detection chain + internal OpenRouter retry | `auxiliary.web_extract` |
| Context compression | Auto-detection chain, degrades to no-summary if unavailable | `auxiliary.compression` |
| Session search | Auto-detection chain | `auxiliary.session_search` |
| Skills hub | Auto-detection chain | `auxiliary.skills_hub` |
| MCP helpers | Auto-detection chain | `auxiliary.mcp` |
| Memory flush | Auto-detection chain | `auxiliary.flush_memories` |
| Delegation | Provider override only (no automatic fallback) | `delegation.provider` / `delegation.model` |
| Cron jobs | Per-job provider override only (no automatic fallback) | Per-job `provider` / `model` |
