CHAPTER 02
Architecture & flow
A practical seven-layer way to think about agent systems: identity, router, state, messaging, memory, tools, and observability. One author's mental model, not a standard.
Explore architecture
CHAPTER 09
Generalists & specialists
Three legitimate shapes for an agent in production: generalist, specialist, generalist plus RAG. Where knowledge actually lives across five anchors (weights, fine-tune, prompt, tools, retrieved). How the choice changes which guards do real work.
Pick the right shape
CHAPTER 10
Fine-tuning
When fine-tuning actually pays off, when prompting or RAG would do the same job for less, and the math you should not skip: compute scaling, sample efficiency, catastrophic forgetting, and a derivation of the DPO loss. Plus the four failure modes specific to fine-tuned models.
Decide whether to fine-tune
CHAPTER 11
When the agent itself is wrong
LLM agents hallucinate their own capabilities, drift away from the original ask, and try to call tools they were never authorized for. Three external checks (capability registry, pinned ask, tool gate) close the gap. No attacker required.
Defend against drift
CHAPTER 13
Trust, privileges & RAG
What your agent is allowed to do, and how it earns more. Pre-config vs runtime privileges, RAG access as a security boundary, the six trust mechanisms, and a working behavior-tracking system: Beta-distributed reputation with exponential decay, signed Ed25519 capability tokens, Sybil detection by correlation analysis, and append-only audit chains.
Manage privileges
CHAPTER 14
Control plane
Real-time enforcement over heterogeneous data sources. Sub-80ms compliance budgets, in-memory policy engines, classification-aware retrieval, PII redaction, contextvar-threaded lineage, multi-store right-to-erasure with signed deletion certificates, and residency routing. Compliance that actually runs while the agent is running.
Enforce in real time
CHAPTER 16
Predicting what's next
A real-time predictive model of the system the agent operates in or observes. What it buys you (anomaly detection, lower latency, compressed memory, faster failure detection), the surprise-minimization math, and the failure mode of agents that trust predictions over reality. The brain's predictive system, applied as engineering.
Predict the environment
CHAPTER 21
Infra & deployment
Running agents in dev, test, staging, and production. Blue-green and canary deployments, version pinning, and five real-world case studies from software, fintech, retail, e-commerce, and healthcare.
Deploy to prod
CHAPTER 23
The 2026 frontier
Where agent security has moved in the last twelve months. Cross-Agent Privilege Escalation, Agent Session Smuggling, the OWASP Top 10 for Agentic Applications, the four foundations from CSA, and the IETF drafts (HDP, AIP) converging on the fix. Plain language, verified sources.
Cross the frontier
CHAPTER 24
Production case studies
Real shipped systems with public sources. GitHub Copilot Workspace, Anthropic's computer use, Cursor's background agent, the Devin benchmark controversy, Shopify Sidekick, the Berkeley scanning-agent results. What each engineering team reported in writing, and what those choices reveal.
See real systems
CHAPTER 25
End-to-end walkthrough
One coherent customer support agent, designed from architecture through trust, control plane, predictability, risk, alerting, evaluation, guardrails, and deployment. Each section names which earlier chapter it draws from. The whole system in one place.
Walk it through
CHAPTER 27
Beyond software
Agents that watch instead of answer. The case for moving agentic AI out of software workflows and into clinical floors, neuroimaging streams, and autonomous labs. Grounded in 2025–26 research from BMC, Nature, Royal Society Open Science, Frontiers, and Meta FAIR. The next decade's frontier.
Cross the frontier
CHAPTER 29
Glossary
Every term defined in one place. Agent, blackboard, blast radius, BFT, MCP, A2A, semantic consensus, and more.
Look up terms
CHAPTER 30
References
Curated bibliography. Foundational papers (ReAct, Reflexion, Toolformer), 2025–26 protocols (MCP, A2A), benchmarks, security research, and the multi-agent frameworks that shaped this manual.
Browse references