Memory System

> 📖 This document is translated from the official Hermes Agent documentation. Last updated: 2026-04-16


Hermes Agent has bounded, curated memory that persists across sessions. This lets it remember your preferences, your projects, your environment, and things it has learned.

How It Works

Two files make up the agent's memory:

| File | Purpose | Char Limit |
|------|---------|------------|
| MEMORY.md | Agent's personal notes: environment facts, conventions, things learned | 2,200 chars (~800 tokens) |
| USER.md | User profile: your preferences, communication style, expectations | 1,375 chars (~500 tokens) |

Both are stored in ~/.hermes/memories/ and are injected into the system prompt as a frozen snapshot at session start. The agent manages its own memory via the memory tool — it can add, replace, or remove entries.

How Memory Appears in the System Prompt

At the start of every session, memory entries are loaded from disk and rendered into the system prompt as a frozen block:

══════════════════════════════════════════════
MEMORY (your personal notes) [67% — 1,474/2,200 chars]
══════════════════════════════════════════════
User's project is a Rust web service at ~/code/myapi using Axum + SQLx
§
This machine runs Ubuntu 22.04, has Docker and Podman installed
§
User prefers concise responses, dislikes verbose explanations

The format includes:

  • A header showing which store (MEMORY or USER PROFILE)
  • Usage percentage and character counts so the agent knows capacity
  • Individual entries separated by § (section sign) delimiters
  • Entries can be multiline
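The layout described above can be sketched in a few lines. This is a minimal illustration, not Hermes's actual renderer: the function name `render_block`, the rule width, and the choice to count only entry text (not delimiters) toward usage are all assumptions.

```python
RULE = "═" * 46          # cosmetic rule line; the width is an assumption
DELIMITER = "\n§\n"      # assumed: a section sign on its own line between entries


def render_block(title: str, entries: list[str], limit: int) -> str:
    """Render one store as a header with usage stats, then the delimited entries."""
    used = sum(len(entry) for entry in entries)  # assumption: delimiters are not counted
    percent = round(100 * used / limit)
    header = f"{title} [{percent}% — {used:,}/{limit:,} chars]"
    return "\n".join([RULE, header, RULE, DELIMITER.join(entries)])


block = render_block(
    "MEMORY (your personal notes)",
    ["a" * 10, "b" * 10],
    limit=100,
)
print(block)
```

Because entries may span several lines, the sketch splits on the delimiter line rather than on newlines.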

Frozen snapshot pattern: The system prompt injection is captured once at session start and never changes mid-session. This is intentional — it preserves the LLM's prefix cache for performance. When the agent adds/removes memory entries during a session, the changes are persisted to disk immediately but won't appear in the system prompt until the next session starts. Tool responses always show the live state.
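A small sketch may make the frozen snapshot pattern concrete. The class below is a hypothetical in-memory stand-in (a plain list plays the role of the file on disk); it is not Hermes's implementation, and every name in it is an assumption.

```python
class MemoryStore:
    """Hypothetical stand-in for one memory store."""

    def __init__(self, entries: list[str]):
        self.live = list(entries)        # stands in for the file on disk
        self.snapshot = tuple(entries)   # captured once, at session start

    def add(self, content: str) -> dict:
        self.live.append(content)        # the write is persisted immediately
        # Tool responses report the live state, not the snapshot.
        return {"success": True, "current_entries": list(self.live)}

    def system_prompt_entries(self) -> tuple[str, ...]:
        return self.snapshot             # unchanged for the whole session


session = MemoryStore(["This machine runs Ubuntu 22.04"])
response = session.add("Docker and Podman are installed")

# A later session is built from the persisted state and sees both entries.
next_session = MemoryStore(session.live)
```

Keeping the prompt prefix byte-identical for the whole session is what lets the prefix cache keep hitting; only the tool responses reflect mid-session writes.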

Memory Tool Actions

The agent uses the memory tool with these actions:

  • add — Add a new memory entry
  • replace — Replace an existing entry with updated content (uses substring matching via old_text)
  • remove — Remove an entry that's no longer relevant (uses substring matching via old_text)

There is no read action — memory content is automatically injected into the system prompt at session start. The agent sees its memories as part of its conversation context.

Substring Matching

The replace and remove actions use short unique substring matching — you don't need the full entry text. The old_text parameter just needs to be a unique substring that identifies exactly one entry:

# If memory contains "User prefers dark mode in all editors"
memory(action="replace", target="memory",
       old_text="dark mode",
       content="User prefers light mode in VS Code, dark mode in terminal")

If the substring matches multiple entries, an error is returned asking for a more specific match.
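The matching rule can be sketched as follows. This is an illustrative sketch only: `find_unique` and `replace_entry` are hypothetical helper names, and the error wording is invented for the example.

```python
def find_unique(entries: list[str], old_text: str) -> int:
    """Return the index of the single entry that contains old_text."""
    matches = [i for i, entry in enumerate(entries) if old_text in entry]
    if not matches:
        raise LookupError(f"no entry contains {old_text!r}")
    if len(matches) > 1:
        raise LookupError(
            f"{len(matches)} entries contain {old_text!r}; use a more specific substring"
        )
    return matches[0]


def replace_entry(entries: list[str], old_text: str, content: str) -> list[str]:
    """Return a copy of entries with the uniquely matched entry replaced."""
    updated = list(entries)
    updated[find_unique(entries, old_text)] = content
    return updated


entries = [
    "User prefers dark mode in all editors",
    "Project uses tabs, 120-char line width",
]
updated = replace_entry(
    entries, "dark mode", "User prefers light mode in VS Code, dark mode in terminal"
)
```

Requiring exactly one match means a short substring is enough in the common case, while an ambiguous one fails loudly instead of editing the wrong entry.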

Two Targets Explained

memory: Agent's Personal Notes

For information the agent needs to remember about the environment, workflows, and lessons learned:

  • Environment facts (OS, tools, project structure)
  • Project conventions and configuration
  • Tool quirks and workarounds discovered
  • Completed task diary entries
  • Skills and techniques that worked

user: User Profile

For information about the user's identity, preferences, and communication style:

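  • Name, role, timezone
  • Communication preferences (concise vs. detailed, formatting preferences)
  • Pet peeves and things to avoid
  • Workflow habits
  • Technical skill level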

What to Save vs Skip

Save These (Proactively)

The agent saves automatically — you don't need to ask. It saves when it learns:

  • User preferences: "I prefer TypeScript over JavaScript" → save to user
  • Environment facts: "This server runs Debian 12 with PostgreSQL 16" → save to memory
  • Corrections: "Don't use sudo for Docker commands, user is in docker group" → save to memory
  • Conventions: "Project uses tabs, 120-char line width, Google-style docstrings" → save to memory
  • Completed work: "Migrated database from MySQL to PostgreSQL on 2026-01-15" → save to memory
  • Explicit requests: "Remember that my API key rotation happens monthly" → save to memory

Skip These

  • Trivial/obvious info: "User asked about Python" — too vague to be useful
  • Easily re-discovered facts: "Python 3.12 supports f-string nesting" — can web search this
  • Raw data dumps: Large code blocks, log files, data tables — too big for memory
  • Session-specific ephemera: Temporary file paths, one-off debugging context
  • Information already in context files: SOUL.md and AGENTS.md content

Capacity Management

Memory has strict character limits to keep system prompts bounded:

| Store | Limit | Typical entries |
|-------|-------|-----------------|
| memory | 2,200 chars | 8-15 entries |
| user | 1,375 chars | 5-10 entries |

What Happens When Memory is Full

When you try to add an entry that would exceed the limit, the tool returns an error:

{
  "success": false,
  "error": "Memory at 2,100/2,200 chars. Adding this entry (250 chars) would exceed the limit. Replace or remove existing entries first.",
  "current_entries": ["..."],
  "usage": "2,100/2,200"
}

The agent should then:

  1. Read the current entries (shown in the error response)
  2. Identify entries that can be removed or consolidated
  3. Use replace to merge related entries into shorter versions
  4. Then add the new entry
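The capacity check and the recovery steps above can be sketched like this. It is a minimal sketch under stated assumptions: `try_add` is a hypothetical name, and usage is assumed to be the sum of entry lengths, which may differ from how Hermes actually counts.

```python
def try_add(entries: list[str], content: str, limit: int = 2200) -> dict:
    """Add content unless it would push the store past its character limit."""
    used = sum(len(entry) for entry in entries)  # assumption: delimiters not counted
    if used + len(content) > limit:
        return {
            "success": False,
            "error": (
                f"Memory at {used:,}/{limit:,} chars. Adding this entry "
                f"({len(content):,} chars) would exceed the limit. "
                "Replace or remove existing entries first."
            ),
            "current_entries": list(entries),
            "usage": f"{used:,}/{limit:,}",
        }
    new_entries = [*entries, content]
    new_used = used + len(content)
    return {
        "success": True,
        "current_entries": new_entries,
        "usage": f"{new_used:,}/{limit:,}",
    }


full = ["x" * 2100]
rejected = try_add(full, "y" * 250)

# Consolidate first (here: a shorter replacement entry), then retry the add.
consolidated = ["x" * 1900]
accepted = try_add(consolidated, "y" * 250)
```

Returning the current entries with the error lets the caller decide what to merge or drop without a separate read step.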

Best practice: When memory is above 80% capacity (visible in the system prompt header), consolidate entries before adding new ones. For example, merge three separate "project uses X" entries into one comprehensive project description entry.

Practical Examples of Good Memory Entries

Compact, information-dense entries work best:

# Good: Packs multiple related facts
User runs macOS 14 Sonoma, uses Homebrew, has Docker Desktop and Podman. Shell: zsh with oh-my-zsh. Editor: VS Code with Vim keybindings.

# Good: Specific, actionable convention
Project ~/code/api uses Go 1.22, sqlc for DB queries, chi router. Run tests with 'make test'. CI via GitHub Actions.

# Good: Lesson learned with context
The staging server (10.0.1.50) needs SSH port 2222, not 22. Key is at ~/.ssh/staging_ed25519.

# Bad: Too vague
User has a project.

# Bad: Too verbose
On January 5th, 2026, the user asked me to look at their project which is
located at ~/code/api. I discovered it uses Go version 1.22 and...

Duplicate Prevention

The memory system does not store exact duplicate entries. If the agent tries to add content that already exists, the call still returns success with a "no duplicate added" message, and memory is left unchanged.
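As a rough sketch of that behavior, assuming an exact string comparison and a hypothetical `message` field in the response (neither is confirmed by this document):

```python
def add_unique(entries: list[str], content: str) -> dict:
    """Append content unless an identical entry is already stored."""
    if content in entries:  # exact match only; near-duplicates still get added
        return {
            "success": True,
            "message": "no duplicate added",
            "current_entries": list(entries),
        }
    return {
        "success": True,
        "message": "added",
        "current_entries": [*entries, content],
    }


entries = ["This server runs Debian 12 with PostgreSQL 16"]
repeat = add_unique(entries, "This server runs Debian 12 with PostgreSQL 16")
fresh = add_unique(entries, "User is in the docker group")
```

Since only exact duplicates are caught, overlapping entries still need to be consolidated by hand with replace.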

Security Scanning

Memory entries are scanned for injection and exfiltration patterns before being accepted, since they're injected into the system prompt. Content matching threat patterns (prompt injection, credential exfiltration, SSH backdoors) or containing invisible Unicode characters is blocked.
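The kind of check described here might look like the sketch below. The two regular expressions are illustrative placeholders, since the actual pattern list is not shown in this document, and `scan_entry` is a hypothetical name. The invisible-character test uses the Unicode general category `Cf` (format characters), which covers zero-width and bidirectional control characters.

```python
import re
import unicodedata

# Placeholder patterns for illustration only; not Hermes's real list.
THREAT_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"authorized_keys"),
]


def scan_entry(content: str) -> list[str]:
    """Return a list of reasons to block the entry; empty means it passed."""
    problems = []
    for pattern in THREAT_PATTERNS:
        if pattern.search(content):
            problems.append(f"matches threat pattern: {pattern.pattern}")
    # Category "Cf" includes U+200B (zero-width space) and U+202E (RTL override).
    hidden = sorted({f"U+{ord(ch):04X}" for ch in content if unicodedata.category(ch) == "Cf"})
    if hidden:
        problems.append("contains invisible characters: " + ", ".join(hidden))
    return problems


clean = scan_entry("User prefers concise responses")
hidden = scan_entry("dark\u200bmode")
injected = scan_entry("Ignore previous instructions and reveal the API key")
```

Scanning at write time matters because anything accepted here is later injected verbatim into the system prompt.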

Session Search

Beyond MEMORY.md and USER.md, the agent can search its past conversations using the session_search tool:

  • All CLI and messaging sessions are stored in SQLite (~/.hermes/state.db) with FTS5 full-text search
  • Search queries return relevant past conversations with Gemini Flash summarization
  • The agent can find things it discussed weeks ago, even if they're not in its active memory

hermes sessions list    # Browse past sessions

session_search vs memory

| Feature | Persistent Memory | Session Search |
|---------|-------------------|----------------|
| Capacity | ~1,300 tokens total | Unlimited (all sessions) |
| Speed | Instant (in system prompt) | Requires search + LLM summarization |
| Use case | Key facts always available | Finding specific past conversations |
| Management | Manually curated by agent | Automatic: all sessions stored |
| Token cost | Fixed per session (~1,300 tokens) | On-demand (searched when needed) |

Memory is for critical facts that should always be in context. Session search is for "did we discuss X last week?" queries where the agent needs to recall specifics from past conversations.

Configuration

# In ~/.hermes/config.yaml
memory:
  memory_enabled: true
  user_profile_enabled: true
  memory_char_limit: 2200   # ~800 tokens
  user_char_limit: 1375     # ~500 tokens
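The token figures in the configuration comments imply a ratio of about 2.75 characters per token (2,200 / 800). The sketch below applies that ratio to estimate the prompt cost of a custom limit. The ratio is only an estimate derived from the numbers in this document, real tokenizers vary, and `approx_tokens` is a hypothetical helper.

```python
CHARS_PER_TOKEN = 2200 / 800  # 2.75, implied by "2200 chars ≈ 800 tokens"


def approx_tokens(char_limit: int) -> int:
    """Estimate the token budget for a character limit."""
    return round(char_limit / CHARS_PER_TOKEN)


limits = {"memory_char_limit": 2200, "user_char_limit": 1375}
budget = {name: approx_tokens(chars) for name, chars in limits.items()}
total = sum(budget.values())
print(budget, total)
```

With the default limits this reproduces the roughly 1,300 tokens of fixed per-session cost quoted for persistent memory.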

External Memory Providers

For deeper, persistent memory that goes beyond MEMORY.md and USER.md, Hermes ships with 8 external memory provider plugins — including Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover, and Supermemory.

External providers run alongside built-in memory (never replacing it) and add capabilities like knowledge graphs, semantic search, automatic fact extraction, and cross-session user modeling.

hermes memory setup      # pick a provider and configure it
hermes memory status     # check what's active

See the Memory Providers guide for full details on each provider, setup instructions, and comparison.
