Start here if Agentic AI is new to you.
This chapter assumes nothing. By the end, you will know what an agent is, what makes a system "agentic", what tools and memory are doing, and you will have walked through the lifecycle of a single agent answering one question. The rest of the manual will then make sense.
- Basic Python (required): you can read a function, a class, and a list comprehension. We will not teach Python here.
- What an LLM is (required): a model that takes text in and produces text out. ChatGPT, Claude, Gemini are examples. You have used at least one.
- Prompting basics (helpful): you have written a prompt, gotten a response, and tweaked the prompt. You know "system prompt" vs "user prompt".
- API calls (helpful): comfortable with HTTP requests, JSON, and reading API documentation.
- Async programming (optional): familiarity with async/await in Python or JavaScript. The parallel chapters lean on this.
- Distributed systems vocabulary (optional): words like consensus, quorum, fault tolerance. Useful for Chapter 20, not required.
What is an "agent", really?
An LLM by itself is a function: text in, text out. It has no memory between calls, no ability to take action, no way to look things up. It just predicts the next token.
An agent wraps an LLM in a loop and gives it three things it lacks: the ability to use tools, the ability to remember across turns, and the ability to decide what to do next. That loop, perceive, decide, act, observe, repeat, is what makes a system "agentic".
The three things every agent does, in order
Step 1: Perceive. The agent receives the goal plus whatever context is relevant: prior messages, tool results from earlier in the loop, retrieved documents. This becomes the prompt sent to the LLM.
The hard part is not what to include. It is what to exclude. Stuffing too much into context confuses the model and burns tokens. A good agent passes only what is needed for the current step.
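One simple way to act on this is a recency window: keep the system message and only the most recent turns. The sketch below is a crude baseline, not the whole story (real systems also use retrieval and summarization), and the message format is the role/content dicts used throughout this chapter:

```python
def build_context(history, max_recent=8):
    """Keep the system message plus only the most recent turns.

    A deliberately simple strategy for limiting what the model sees:
    everything else in history is excluded from this step's prompt.
    """
    system = [m for m in history if m["role"] == "system"]
    rest = [m for m in history if m["role"] != "system"]
    return system + rest[-max_recent:]
```

Even this crude filter keeps the prompt bounded no matter how long the loop runs, which is the property that matters.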
Step 2: Decide. The LLM responds with a structured decision. Modern systems use function calling or tool use, where the model produces something like:
```json
{
  "action": "call_tool",
  "tool": "search_web",
  "args": { "query": "agentic AI orchestration patterns" }
}
```
Or it can declare it is finished:
```json
{
  "action": "final_answer",
  "text": "Based on the search results, the four main patterns are..."
}
```
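Before acting on that decision, the agent has to parse and validate it. A minimal sketch, assuming the JSON schema above arrives as a raw string (real function-calling APIs hand back structured objects, which makes this step easier):

```python
import json

def parse_decision(raw):
    """Validate a raw model reply against the decision schema above.

    Malformed output becomes an 'error' action that the loop can feed
    back to the model instead of crashing.
    """
    try:
        decision = json.loads(raw)
    except json.JSONDecodeError:
        return {"action": "error", "text": "reply was not valid JSON"}
    action = decision.get("action")
    if action == "call_tool" and "tool" in decision and "args" in decision:
        return decision
    if action == "final_answer" and "text" in decision:
        return decision
    return {"action": "error", "text": f"malformed decision: {raw[:80]}"}
```

The design choice mirrors the tool-error handling later in this chapter: bad model output is turned into feedback, never an unhandled exception.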
Step 3: Act and observe. The agent runs the chosen tool, captures the output, appends it to memory, then loops back to step 1. The next perception now includes what just happened. This is how the agent "learns" within a single task.
If the model returned final_answer, the loop exits and the answer is returned to the user.
A complete tiny agent in Python
Here is the smallest honest agent. Roughly 30 lines. Read it top to bottom. The patterns you will see in production systems are elaborations of this.
```python
def tiny_agent(goal, llm, tools, max_steps=10):
    """A minimal agent. Loops until the LLM says 'done' or budget runs out."""
    history = [{"role": "system", "content": f"You are an agent. Goal: {goal}"}]
    for step in range(max_steps):
        # 1. PERCEIVE: build the prompt from history
        decision = llm(history, available_tools=list(tools.keys()))
        # 2. DECIDE: did the model finish or call a tool?
        if decision["action"] == "final_answer":
            return decision["text"]
        if decision["action"] == "call_tool":
            tool_name = decision["tool"]
            args = decision["args"]
            # 3. ACT: run the tool, observe the result
            if tool_name not in tools:
                result = f"Error: tool '{tool_name}' is not available"
            else:
                try:
                    result = tools[tool_name](**args)
                except Exception as e:
                    result = f"Tool error: {e}"
            # Append the tool call AND its result to history (this is memory)
            history.append({"role": "assistant", "content": f"calling {tool_name}({args})"})
            history.append({"role": "tool", "content": str(result)})
    return "Agent stopped: max steps reached without final answer"
```
Three things to internalize from this code. First, history is everything: it is the only memory the agent has within one task. Second, the loop has a hard upper bound (max_steps): without this, a confused agent loops forever. Third, tool errors are caught and fed back: the agent gets the chance to recover instead of crashing.
An exercise: give tiny_agent a goal and a single tool, search_web. Mentally walk through the loop. What does step 1 look like? Step 2? When does the loop exit? What would happen if search_web threw an error on every call?
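One way to check your walkthrough is to run the loop against a scripted stub. The sketch below reproduces tiny_agent from above; scripted_llm and the search_web stub are invented for the demo and stand in for a real model and a real search API:

```python
def tiny_agent(goal, llm, tools, max_steps=10):
    """The minimal agent from above, reproduced so this demo runs standalone."""
    history = [{"role": "system", "content": f"You are an agent. Goal: {goal}"}]
    for step in range(max_steps):
        decision = llm(history, available_tools=list(tools.keys()))
        if decision["action"] == "final_answer":
            return decision["text"]
        if decision["action"] == "call_tool":
            tool_name, args = decision["tool"], decision["args"]
            if tool_name not in tools:
                result = f"Error: tool '{tool_name}' is not available"
            else:
                try:
                    result = tools[tool_name](**args)
                except Exception as e:
                    result = f"Tool error: {e}"
            history.append({"role": "assistant", "content": f"calling {tool_name}({args})"})
            history.append({"role": "tool", "content": str(result)})
    return "Agent stopped: max steps reached without final answer"

def scripted_llm(history, available_tools):
    """Stub LLM: calls search_web once, then answers from the tool result."""
    tool_msgs = [m for m in history if m["role"] == "tool"]
    if tool_msgs:
        return {"action": "final_answer",
                "text": f"Answer based on: {tool_msgs[-1]['content']}"}
    return {"action": "call_tool", "tool": "search_web",
            "args": {"query": "time in Paris"}}

tools = {"search_web": lambda query: f"3 results for '{query}'"}
print(tiny_agent("What time is it in Paris?", scripted_llm, tools))
# → Answer based on: 3 results for 'time in Paris'
```

Two iterations, exactly as the walkthrough predicts: step 1 perceives only the goal, step 2 chooses search_web, and the second pass sees the tool result and finishes. Swap the lambda for one that raises, and you will see the "Tool error" feedback path fire on every loop until max_steps runs out.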
From one agent to many: why orchestration?
A single agent is enough for narrow, well-scoped tasks. But three things break when you push it harder:
- Context bloat: as the loop grows, history balloons. The model gets confused, slow, and expensive. There is a hard ceiling around 100k to 200k tokens before quality drops sharply.
- Skill spread: the same prompt cannot make a model great at researching, writing, coding, and reviewing. Specialization beats generalization.
- Failure correlation: a single agent's blind spots are the system's blind spots. No one is checking its work.
Multi-agent systems address all three: shorter contexts per agent, specialization by role, and cross-agent verification. That is what the rest of this manual covers.
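As a first taste of what that looks like, here is a deliberately naive two-agent pipeline. The researcher and writer roles, the prompt wording, and the stubs are all illustrative, not a pattern from any specific framework; the point is only that each agent sees a short, role-specific context instead of one bloated history:

```python
def run_pipeline(question, researcher, writer):
    """Naive two-agent pipeline: specialization by role, short context per agent."""
    # The researcher sees only the question, not the writing instructions.
    notes = researcher(f"Gather key facts about: {question}")
    # The writer sees only the notes, not the research process.
    return writer(f"Write a short report on {question!r} using only these notes:\n{notes}")

# Stubs standing in for two specialized LLM-backed agents.
researcher = lambda prompt: "fact A; fact B"
writer = lambda prompt: f"Report: {prompt.splitlines()[-1]}"
print(run_pipeline("EV battery market", researcher, writer))
# → Report: fact A; fact B
```

Notice what a single agent would have carried instead: one history containing the research prompts, the raw findings, and the drafting instructions all at once. Splitting the work keeps each context small, which is the first of the three failure modes above addressed directly.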
A quick self-test: for each task below, decide whether a single agent or a multi-agent system is the better fit.
1. Translating a 200-word email from English to Japanese.
2. Researching a market and producing a 10-page report with citations.
3. Reviewing a pull request that changes the payment system.
4. Answering "what time is it in Paris right now?".
Answers: 1. single (narrow, no skill spread). 2. multi (research, writing, fact-checking are different skills, plus context bloat). 3. multi (proposer plus critic plus security is the textbook debate pattern). 4. single (one tool call, done).
The conceptual map: what we will cover
Each chapter answers one question. Use this as a roadmap.
Recommended next steps
- Build the tiny agent above with one real LLM and one real tool (web search, a calculator, anything). Watch the history grow. Break it on purpose: remove the max_steps cap, see what happens.
- Then read Chapter 2 (Architecture). The seven-layer architecture will map directly onto what you just built, and you will see what your minimal version was missing.
- Pick a real problem you care about. Decide single vs multi, sketch the agents on paper, and only then start coding. The thinking matters more than the framework.
- One scope note before you go. Almost every agent in this manual is the call-and-response kind: a function that takes a prompt and returns an answer. That is one valid kind of agent, and the kind nearly all production systems ship today. Chapter 24 names a different kind, the one almost nobody is building yet, that watches continuous streams instead of answering questions. If after building tiny agents you find yourself wondering why every example feels like a chatbot under the hood, that chapter will tell you what is missing.