Hermes Agent: Building Real Multi-Agent Support

The Problem with Hermes’s Built-in Multi-Agent

Hermes Agent ships with delegate_task, which spins up sub-agents in-process. It is fast and simple, but look at the source code:

DELEGATE_BLOCKED_TOOLS = frozenset({"delegate_task", "clarify", "memory", ...})
child = AIAgent(..., skip_memory=True, ...)

The memory tool is blocked for sub-agents and each child is created with skip_memory=True, so every insight a sub-agent develops dies when the thread exits. The swarm does the work but never gets smarter.

That’s the fundamental problem. Sub-agents are disposable compute, not collaborative intelligence. I wanted something different.

What I Built Instead

Each sub-agent is a complete Hermes instance — own OS process, own config, own state, full memory access.

Main Agent delegates to Sub-Agent via Skill Manager, injecting only the skills needed

The Lifecycle

Spawn → Execute → Handoff → Complete → Merge Learnings → Cleanup

Spawn: spawn-agent.sh snapshots the main agent’s config into an isolated instance
Execute: The sub-agent runs with full autonomy — no restricted tools, real memory
Handoff: Sub-agent writes a structured handoff with findings, memory updates, and skill recommendations
Complete: complete-agent.sh validates the handoff, sends the results via the message queue, and deletes the instance directory immediately
Merge Learnings: The main agent absorbs the learnings through the native memory pipeline
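
To make the Handoff and Merge steps concrete, here is a minimal sketch of what a structured handoff and its validation could look like. The JSON layout, the field names (findings, memory_updates, skill_recommendations), and the functions parse_handoff and merge_learnings are all assumptions for illustration, not the format used in the repo, and the plain list stands in for the native memory pipeline.

```python
import json
from dataclasses import dataclass, field
from typing import List

# Hypothetical schema: these key names are assumptions, not the repo's format.
REQUIRED_KEYS = ("findings", "memory_updates", "skill_recommendations")


@dataclass
class Handoff:
    findings: str
    memory_updates: List[str] = field(default_factory=list)
    skill_recommendations: List[str] = field(default_factory=list)


def parse_handoff(raw: str) -> Handoff:
    """Validate a JSON handoff; raise ValueError if it is malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"handoff is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("handoff must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"handoff is missing keys: {missing}")
    for key in ("memory_updates", "skill_recommendations"):
        if not isinstance(data[key], list):
            raise ValueError(f"{key} must be a list")
    return Handoff(
        findings=str(data["findings"]),
        memory_updates=[str(item) for item in data["memory_updates"]],
        skill_recommendations=[str(item) for item in data["skill_recommendations"]],
    )


def merge_learnings(main_memory: List[str], handoff: Handoff) -> List[str]:
    """Return main_memory plus any memory updates it does not already hold."""
    merged = list(main_memory)
    for update in handoff.memory_updates:
        if update not in merged:
            merged.append(update)
    return merged
```

Validating before merging means a half-written handoff is rejected as a whole instead of leaking partial entries into the main agent's memory.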

Instances are ephemeral. Learnings are permanent.
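
The ephemeral-instance idea can be sketched in a few lines of Python. The real implementation uses shell scripts (spawn-agent.sh and complete-agent.sh); this is only an illustration of the same shape, and the AGENT_HOME environment variable, the directory layout, and the function names are hypothetical.

```python
import os
import shutil
import subprocess
from pathlib import Path
from typing import List


def spawn_instance(source_home: Path, instances_root: Path, agent_id: str) -> Path:
    """Snapshot the running agent's home into an isolated per-agent directory."""
    instance_home = instances_root / agent_id
    # copytree fails if the target already exists, so two agents cannot share a home.
    shutil.copytree(source_home, instance_home)
    return instance_home


def run_in_instance(instance_home: Path, command: List[str]) -> subprocess.CompletedProcess:
    """Run a command in its own OS process, pointed at the isolated home."""
    # AGENT_HOME is a made-up variable name used only for this sketch.
    env = dict(os.environ, AGENT_HOME=str(instance_home))
    return subprocess.run(command, env=env, capture_output=True, text=True, check=False)


def cleanup_instance(instance_home: Path) -> None:
    """Delete the instance directory; anything worth keeping is already in the handoff."""
    shutil.rmtree(instance_home, ignore_errors=True)
```

Passing source_home explicitly, rather than reading a global default, makes it obvious which config the sub-agent is forked from.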

Mistakes I Made Along the Way

Zombie agents in the registry. With bash strict mode enabled, a missing handoff file made the cleanup script exit early, leaving dead entries behind in the registry. Fixed with graceful degradation: always clean up the registry, even on failure.
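
The "always clean up" rule maps onto a try/finally. The actual fix lives in a bash script; the sketch below is a Python rendering of the same idea, assuming a hypothetical registry stored as a JSON file that maps agent IDs to metadata.

```python
import json
from pathlib import Path
from typing import Optional


def complete_agent(agent_id: str, registry_path: Path, handoff_path: Path) -> Optional[dict]:
    """Read the handoff if there is one, but always deregister the agent."""
    handoff = None
    try:
        handoff = json.loads(handoff_path.read_text())
    except (OSError, json.JSONDecodeError) as exc:
        # A missing or corrupt handoff is reported, not fatal.
        print(f"warning: no usable handoff for {agent_id}: {exc}")
    finally:
        # Runs on success and on failure, so no dead entry is left behind.
        registry = json.loads(registry_path.read_text())
        registry.pop(agent_id, None)
        registry_path.write_text(json.dumps(registry))
    return handoff
```

A caller that gets None back knows the agent produced nothing usable, while the registry stays accurate either way.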

Agent ignored my sub-agent skill. Given a choice between the native delegate_task and my shell script approach, the LLM picked the simpler option every time: the model gravitates to the path of least resistance. Fixed by adding a Decision Guide that explains when each approach is appropriate, so the agent now knows when to use the lightweight in-process delegate and when to spin up a full isolated instance.

Wrong API keys. The spawn script was pulling its config from the global Hermes install instead of the project-local agent. Fixed by forking from the running instance, so the sub-agent inherits the correct context.

Why This Matters

The core insight: learning shouldn’t be scoped to a thread lifetime.

If you’re building a multi-agent system and your sub-agents can’t retain what they discover, you’re running an expensive stateless compute cluster, not a system that gets smarter over time.

Process isolation costs more than in-process threads. But it buys you:

Real memory that persists across the agent’s lifetime
No cross-contamination between concurrent agents
Clean handoff artifacts you can inspect and audit
Agents that actually accumulate knowledge

All experiments done with Qoder’s expert mode — highly recommended for long-running agentic tasks where you want the agent to make mistakes, learn, and fix them autonomously.

GitHub

Full implementation: github.com/Czhang0727/agent-from-scratch

Next: how skill management keeps the main agent sane as the number of skills grows.