Agent from Scratch Part 0: What Is an Agent?

I’m starting to build a general agent framework from scratch, sharing what I’ve learned over the past few years. Let’s start from the very beginning.

What Is an Agent?

IMO, an agent is a workflow that can think like a human and do what a human can do. The concept predates LLMs: backend system design already had stateful agents.

The only reason “agents” are popular now is large models. They finally make it possible to generalize agent design instead of hand-crafting it for each narrow task.

The 10,000ft View: An Agent Is a PC

Back to old-fashioned computing: we have IO, a CPU, and storage.

An agent maps almost perfectly:

CPU → LLM
IO → connector to external devices (tools, APIs, sensors)
Storage → memory

Yep, it’s that simple.
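To make the mapping concrete, here is a minimal Python sketch. Every name in it (`Agent`, `fake_llm`, the `tools` dict, the `step` method) is hypothetical and invented for illustration. The fake LLM always picks the same tool, so the example runs without a real model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    llm: Callable[[str], str]                         # "CPU": turns a prompt into a decision
    tools: dict[str, Callable[[str], str]]            # "IO": connectors to the outside world
    memory: list[str] = field(default_factory=list)   # "Storage": what the agent remembers

    def step(self, user_input: str) -> str:
        # Build a prompt from stored memory plus the new input.
        prompt = "\n".join(self.memory + [user_input])
        tool_name = self.llm(prompt)                  # the "CPU" picks a tool
        result = self.tools[tool_name](user_input)    # the "IO" layer runs it
        self.memory.append(f"{user_input} -> {result}")  # persist to "storage"
        return result


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call: always chooses the "upper" tool.
    return "upper"


agent = Agent(llm=fake_llm, tools={"upper": str.upper})
print(agent.step("hello"))
```

Swapping `fake_llm` for a real model call, or adding entries to `tools`, changes one component without touching the other two.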

Figure: Agent architecture. External Input flows into the Agent, which connects to Memory, External Tools, and Guidelines and produces the Agent Output.

Over time, engineers added fancy stuff to make each component faster:

Better CPU → better models
Larger bandwidth → larger context windows
More applications → more skills / MCP servers

Nothing fundamentally changed.

The Agent Heartbeat

Here’s the pseudocode for agent orchestration. If you know how OpenClaw works, this is pretty much the heartbeat:

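import time

while True:
    time.sleep(1)  # poll once per second
    # read: pick up whatever input is waiting
    user_input = read_input(context)
    # think: turn the input into an intent and a plan
    intent_and_plan = think(context, user_input)
    # do: execute the plan
    execution_result = do(context, intent_and_plan)
    # evaluate: this phase can sometimes run asynchronously
    evaluation(context, execution_result)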

Simple loop: read, think, do, evaluate. Repeat.
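As a rough illustration, the loop can be made runnable with stub functions. The sketch below assumes the context is a plain dict with an `inbox` and a `history`. The function bodies are placeholders rather than a real implementation, and the loop is bounded by a hypothetical `max_ticks` parameter so the example terminates.

```python
import time


def read_input(context):
    # Pop the next pending message, or return None when the inbox is empty.
    return context["inbox"].pop(0) if context["inbox"] else None


def think(context, user_input):
    # Stand-in for an LLM call: the "plan" is just a tagged copy of the input.
    return {"action": "echo", "payload": user_input}


def do(context, plan):
    # Stand-in for tool execution.
    return plan["payload"].upper()


def evaluation(context, result):
    # Record the outcome so later iterations can see it.
    context["history"].append(result)


def heartbeat(context, max_ticks, interval_s=0.0):
    for _ in range(max_ticks):      # bounded here; a long-running agent would loop forever
        time.sleep(interval_s)
        user_input = read_input(context)
        if user_input is None:
            continue                # nothing to do on this tick
        plan = think(context, user_input)
        result = do(context, plan)
        evaluation(context, result)


context = {"inbox": ["ping", "pong"], "history": []}
heartbeat(context, max_ticks=5)
print(context["history"])
```

Three of the five ticks find an empty inbox and do nothing, which is exactly the waste that polling introduces.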

The Event-Driven Upgrade

There’s a known problem with sleep: the loop wastes resources while it waits. The solution? Go event-driven, just like JavaScript.

Claude Code’s internals indicate they’re doing the same thing. So the loop evolves:

User interaction side:

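# PubSubClient stands in for whatever message broker client you use
pub_sub_client = PubSubClient()

# publish the user's request, then wait for the agent's answer
user_input = read_user_input()
pub_sub_client.send("user_input", user_input)
result = pub_sub_client.subscribe(topic="task_result")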

Consumer (agent) side:

# block until a request arrives instead of polling
user_input = pub_sub_client.subscribe(topic="user_input")
intent_and_plan = think(context, user_input)
execution_result = do(context, intent_and_plan)
pub_sub_client.send("task_result", execution_result)
# this phase can sometimes run asynchronously
evaluation(context, execution_result)

Clean decoupling. The agent becomes a proper event consumer.
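Here is one way the two sides could be wired together in a single process. This is a sketch under stated assumptions: the `PubSubClient` below is a hypothetical stand-in built on the standard library's `queue.Queue`, not a real broker client, and each message goes to exactly one subscriber instead of fanning out. The `None` shutdown sentinel and the `agent_worker` function are also invented for the example.

```python
import queue
import threading


class PubSubClient:
    """Tiny in-process stand-in for a real message broker."""

    def __init__(self):
        self._topics = {}
        self._lock = threading.Lock()

    def _queue(self, topic):
        # Create the topic's queue on first use; the lock keeps this thread-safe.
        with self._lock:
            return self._topics.setdefault(topic, queue.Queue())

    def send(self, topic, message):
        self._queue(topic).put(message)

    def subscribe(self, topic, timeout=None):
        # Blocks until a message arrives: no polling and no sleep.
        return self._queue(topic).get(timeout=timeout)


def agent_worker(client):
    while True:
        user_input = client.subscribe("user_input")
        if user_input is None:          # sentinel: shut the worker down
            break
        result = user_input.upper()     # stand-in for think() followed by do()
        client.send("task_result", result)


client = PubSubClient()
worker = threading.Thread(target=agent_worker, args=(client,))
worker.start()

# User interaction side: publish a request and wait for the answer.
client.send("user_input", "hello agent")
result = client.subscribe("task_result", timeout=5)

client.send("user_input", None)         # ask the worker to stop
worker.join(timeout=5)
print(result)
```

The worker thread sleeps inside `Queue.get` until a message shows up, so it uses no CPU while idle. That is the improvement over the polling loop.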

What’s Coming

In this series, I’ll dig deeper into each component:

IO — how the agent talks to the world
Orchestration — prompt engineering and the harness system
Skills — user manuals for tools
Memory — expanding the context window
Multi-agent — when one agent isn’t enough

Track progress and raise issues / PRs at github.com/Czhang0727/agent-from-scratch.