Where this is all heading.
The rest of this manual covers what works now. This chapter is different. It's a map of where the field is moving and what production teams should start preparing for. Some of these ideas are already showing up in research papers; some are still in early prototypes. Treat this as a guide for what to keep your eye on, not a how-to.
Six trends look poised to reshape how agents are built between 2026 and 2030. They aren't independent; they reinforce each other. Together, they suggest agents will move from "smart assistants you instruct" to "self-improving systems you supervise".
1 · Agents that learn from their own work
Today's agents start fresh on every task. They don't get better from doing more work, except in the sense that you (a human) might tweak the prompt based on what you saw. The next wave of agents will themselves notice what worked, save it, and use it next time.
The pattern that's getting traction is called a skill library. The agent runs a task, succeeds, and saves the strategy that worked as a named, reusable function. Next time it sees a similar task, it checks its library first before trying to figure things out from scratch. The original demo of this idea was Voyager (NVIDIA / Caltech / Stanford, 2023), which played Minecraft and built up a library of crafting strategies. By 2026, frameworks like SAGE (Self-Improving Agents Guide, 2026) are formalizing this with reinforcement learning so the library improves over time, not just grows.
What this means for builders: storing successful agent traces in a structured way will become as standard as logging is today. A "skill" is just a function with a description, an example call, and a success metric. Once you have a hundred of them, your agent has a head start on any task that resembles something it's done before.
2 · World models: agents that imagine before acting
Today's agents act, see what happened, then react (Yao 2023). This is fine when actions are cheap. But when actions are expensive (sending an email, transferring money, deploying code), it would be much better if the agent could imagine the consequences first.
A "world model" is what researchers call an internal simulator: the agent's own predictive model of how the environment will respond to a given action. With a good world model, an agent can run several "what if" scenarios in its head, pick the action with the best predicted outcome, and only then commit. Recent work like SIMA 2 from Google DeepMind (2025) showed agents in 3D environments that can self-improve by generating their own practice tasks and rewards using a world model. EvoAgent (arXiv 2025) combined world models with continual learning to handle long, multi-step tasks.
For production agents, the most useful version of this in the near term will be cheaper: instead of a full physics simulator, a "lightweight what-if" model that predicts API responses, user reactions, or financial outcomes well enough that the agent can compare two plans before acting on either. Think of it as turning every agent action from a "send" into a "preview".
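The "preview before send" loop itself is simple to express. In this sketch, `predict_outcome` is a hypothetical stand-in for the lightweight what-if model (in practice an LLM call or a small learned predictor), and the plan names and scores are made up:

```python
def predict_outcome(plan: str) -> float:
    """Stand-in for a lightweight world model: scores a plan's expected value.
    In a real system this would be an LLM call or a small learned predictor."""
    toy_scores = {"send_now": 0.4, "send_after_review": 0.7}
    return toy_scores.get(plan, 0.0)

def choose_plan(plans: list[str]) -> str:
    # Imagine each plan's consequences first, then commit to the best one.
    scored = {plan: predict_outcome(plan) for plan in plans}
    return max(scored, key=scored.get)

print(choose_plan(["send_now", "send_after_review"]))  # send_after_review
```

The whole trick is that `predict_outcome` runs before anything irreversible happens; the agent only ever commits to the winner of the comparison.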
The calibration problem: when imagined consequences distort decisions
There's a subtle failure mode worth understanding. If the world model thinks the consequences of an action are worse than they really are, the agent becomes pathologically risk-averse: it refuses to ship anything, declines reasonable requests, and asks for human approval on safe actions. If the world model thinks consequences are milder than reality, the agent ships dangerous things confidently. Both failures look like alignment problems, but they're really calibration problems, and they're fixable with technique rather than philosophy.
The mechanisms that help:
- Counterfactual baselines. Don't just ask "what happens if I do X?" Ask "what happens if I do X minus what happens if I do nothing?" The difference is the causal effect of the action, which is what you actually care about. Without this, the model conflates "this outcome is bad" with "this action caused something bad," which leads to refusing to act even when inaction is worse.
- Conservative-optimistic ensemble. Run two world models (or the same model twice with different priors): one that errs pessimistic, one that errs optimistic. Compare their predictions. When they agree, you have a confident estimate. When they disagree sharply, you've found a high-uncertainty action where the agent should defer to a human or pick the safer alternative. The disagreement itself is the signal.
- Calibration on held-out outcomes. Periodically check the model's predicted distribution against what actually happened. If it predicted "30% chance the user will be unhappy" 1000 times, the user should have been unhappy roughly 300 times. If the actual rate is 60%, the model is miscalibrated and shouldn't be trusted for decisions until retrained. This is the same proper-scoring-rule discipline that weather forecasters use, applied to agents.
- Action-cost asymmetry. Reversible actions deserve a lower confidence threshold than irreversible ones. The world model output should feed into a decision rule that is more cautious for actions that can't be undone. "Send a draft to a colleague" and "publish to all customers" both look like "send a message" to the LLM but warrant very different confidence floors. Encode this asymmetry explicitly.
The upshot: the answer to "what if the imagined consequences are too harsh and force out wrong answers?" is not to imagine consequences less, it's to calibrate the imagining. Counterfactual reasoning, ensemble disagreement, and held-out calibration are the technical tools. None of these are speculative; they're standard techniques in causal inference and decision theory waiting to be applied to LLM agents.
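The three tools can be sketched in a few lines. Everything below is illustrative: the risk scores are toy numbers, and the 0.3 disagreement limit is an arbitrary placeholder for whatever threshold your domain calls for.

```python
def causal_effect(predict, action) -> float:
    """Counterfactual baseline: score the action relative to doing nothing,
    so 'this outcome is bad' isn't confused with 'this action caused it'."""
    return predict(action) - predict("do_nothing")

def ensemble_estimate(pessimist: float, optimist: float,
                      disagreement_limit: float = 0.3):
    """Two differently-biased world models predict harm for the same action.
    Agreement -> act on the average; sharp disagreement -> escalate."""
    if abs(pessimist - optimist) > disagreement_limit:
        return None  # high uncertainty: defer to a human
    return (pessimist + optimist) / 2

def calibration_gap(predicted_probs, actual_outcomes) -> float:
    """Held-out check: observed bad-outcome rate minus the model's average
    predicted rate. Near zero means calibrated; a large gap means retrain."""
    return (sum(actual_outcomes) / len(actual_outcomes)
            - sum(predicted_probs) / len(predicted_probs))

predict = {"send_email": 0.5, "do_nothing": 0.75}.get  # toy risk scores
print(causal_effect(predict, "send_email"))  # -0.25: acting beats inaction
print(ensemble_estimate(0.375, 0.25))        # 0.3125: models agree, act
print(ensemble_estimate(0.9, 0.1))           # None: disagree, escalate
print(calibration_gap([0.25] * 8, [1, 1, 1, 1, 0, 0, 0, 0]))  # 0.25: harm underestimated
```

Note that `ensemble_estimate` returning `None` isn't a failure; the disagreement is the signal that routes the decision to a human.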
3 · Multi-agent economies
Once agents can talk to other agents, via A2A (Yang 2025) and similar protocols, and once those agents are owned by different companies, you have the makings of an economy. An agent representing you could pay another agent representing a vendor for a piece of work. Both agents might in turn pay other agents for sub-tasks.
This sounds science-fictional but the building blocks are mostly already here. A2A handles discovery. Stablecoins and account abstraction handle payments. The missing pieces are mostly social and legal: who's liable when an agent overspends? How do you audit a chain of agent-to-agent transactions? What does "consent" mean when an agent acts on your behalf?
Expect the first wave of agent-to-agent economies in narrow domains: ad bidding, supply-chain procurement, freelance task marketplaces. Then expect a long, messy decade of figuring out the social rules. The technical pieces are easier than the human ones.
4 · Embodied agents: the physical / digital bridge
Most of this manual assumes agents that live in software: they call APIs, read documents, send messages. But a growing line of research is putting these same models into robots, drones, and other physical systems. A 2025 survey on world models for embodied AI (arXiv 2025) covered how models that learn how the physical world responds can ground agent decisions in reality.
Why this matters even if you don't build robots: the patterns will flow back into pure-software agents. Embodied agents have to deal with real consequences (a robot can break things), partial observation (you can't see everything), and irreversible actions. The techniques that solve these problems for robots, especially world models and recovery procedures, will improve software agents that operate in expensive real-world systems like financial markets, supply chains, and healthcare.
A useful prediction: by 2028, "the agent that controls the warehouse" will look architecturally a lot like "the agent that books your meetings". Different tools, same orchestration, same memory, same guardrails.
5 · Formal verification of agent behavior
Today, you mostly know your agent works by testing it. You run examples, check the outputs, and ship if things look good. But "looks good" doesn't scale to systems that handle thousands of decisions per second across millions of users. Eventually we're going to need to prove certain things about agent behavior, not just test them.
Some early research is exploring how to mathematically guarantee certain agent properties. Examples of properties you might want to prove: "this agent will never call the delete-everything tool", "this agent's spending will never exceed $X per hour", "this agent's response will always cite at least one source from the approved list".
The hard part is that LLMs are huge statistical models, so proving things about their full behavior is currently impossible. The realistic path is a hybrid: prove properties about the wrapper around the LLM (the heuristics, the schemas, the tool restrictions), and use that wrapper to constrain what the LLM can do. This is essentially formalizing the layered guardrails approach from Chapter 17. Tools and standards for this don't exist yet, but they're coming. Expect to see "verified agent" frameworks emerge in regulated industries first (finance, healthcare, defense).
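As a sketch of what "prove properties about the wrapper" means in practice: the two checks below are ordinary deterministic code, so they can be tested exhaustively (or verified formally) regardless of what the LLM proposes. The class, exception, and tool names are invented for illustration.

```python
class BudgetExceeded(Exception): pass
class ToolNotAllowed(Exception): pass

class VerifiedWrapper:
    """Constraints enforced outside the LLM: provable because the wrapper,
    not the model, decides whether a proposed tool call executes."""
    def __init__(self, allowed_tools: set[str], hourly_budget: float):
        self.allowed_tools = allowed_tools
        self.hourly_budget = hourly_budget
        self.spent_this_hour = 0.0

    def execute(self, tool: str, cost: float, action):
        if tool not in self.allowed_tools:            # property 1: tool allowlist
            raise ToolNotAllowed(tool)
        if self.spent_this_hour + cost > self.hourly_budget:
            raise BudgetExceeded(tool)                # property 2: spend cap
        self.spent_this_hour += cost
        return action()

wrapper = VerifiedWrapper(allowed_tools={"search", "draft_email"},
                          hourly_budget=5.0)
wrapper.execute("search", cost=0.10, action=lambda: "results")  # allowed
try:
    wrapper.execute("delete_everything", cost=0.0, action=lambda: None)
except ToolNotAllowed:
    print("blocked")  # the model never had the ability to run this
```

The property "this agent will never call the delete-everything tool" holds by construction: the only path to execution passes through the allowlist check.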
A reasonable worry to address here: can model providers prevent or charge for these wrappers? The answer today is no, and the architecture is the reason. A wrapper sits on the wrapper-builder's infrastructure, not the model provider's. It filters input before the prompt is sent, validates output after the response returns, and gates tool calls outside the model entirely. The LLM doesn't know it's being wrapped; from its perspective it just sees a request and produces a reply. There's no API surface a provider could lock down to stop this without breaking their product for everyone, and even if they tried, the constraint stack works identically against open-weight models (Llama, Mistral, Qwen, DeepSeek) where there's no provider to gatekeep at all. "Verified agent" frameworks will likely become a market signal (auditors and insurers asking for them) rather than a permission a provider grants. The closer parallel isn't the App Store, where Apple controls distribution; it's TLS, where the protocol is open and the verification ecosystem grew around it without anyone owning it.
6 · The shift from "instruct the agent" to "supervise the agent"
Tying all of this together: the role of a person working with an agent is shifting. Today, you mostly instruct: write a prompt, watch the output, write a better prompt. Tomorrow, you'll mostly supervise: set goals, define what's off-limits, review summaries of what the agent did, intervene only when something looks wrong.
This is similar to the shift from writing assembly to writing in a high-level language, or from manually tuning database queries to letting the optimizer handle it. You give up some control in exchange for being able to operate at a much larger scale.
The skills that will matter most in this new world:
- Designing good objectives. If you can express what you want as a metric the agent can improve, you can hand off the work. If you can't, you can't.
- Spotting reward hacking. Knowing when an agent is technically meeting the goal but in a way that violates the spirit. This is harder than it sounds.
- Building trust gradients. Knowing how to expand an agent's authority over time as it earns trust, without letting one good week convince you to remove all the guardrails.
- Triaging escalations. When the agent flags something for human review, knowing how to decide quickly without re-doing the agent's work.
- Reading agent traces. Becoming fluent at scanning a multi-agent log and spotting where things went sideways. Like debugging, but for systems that don't crash, just drift.
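The trust-gradient skill from the list above lends itself to a simple policy sketch. The thresholds here are deliberately made up (four weeks at 95% is an example policy, not a recommendation), and the level names are illustrative:

```python
def autonomy_level(weekly_success_rates: list[float],
                   required_weeks: int = 4, threshold: float = 0.95) -> str:
    """Expand authority only after sustained performance, never one good week."""
    recent = weekly_success_rates[-required_weeks:]
    if len(recent) < required_weeks:
        return "supervised"             # not enough history yet
    if all(rate >= threshold for rate in recent):
        return "autonomous_reversible"  # free rein on undoable actions only
    return "supervised"

print(autonomy_level([0.99]))                    # supervised: one good week
print(autonomy_level([0.96, 0.97, 0.95, 0.98]))  # autonomous_reversible
```

Note the promotion only unlocks reversible actions; irreversible ones keep their human-approval gate regardless of track record.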
What probably isn't going to happen (or not soon)
Two predictions to push back on, since they get a lot of airtime:
- "Single super-agents will replace multi-agent systems." Even as base models get smarter, the case for splitting a system into specialized agents stays strong: easier to debug, easier to audit, easier to scope permissions, easier to swap one component without breaking others. The shape of agent systems will keep looking like teams, not solo geniuses.
- "Agents will replace all the human work." The pattern that's actually emerging is leverage, not replacement. One person plus a good set of agents does the work of three. The roles change, the headcount doesn't disappear, and the people who learn to direct agents well will be in much higher demand than the people who don't. There's a real bottleneck worth naming here: the gap between the people who can build agents and the people with the domain knowledge to direct them usefully. Most of the value isn't in either skill alone, it's in bridging them. That's why low-code and no-code agent builders (the next section) matter more than they first appear to.
Low-code agent platforms: the near-term shape of the bridge
Web development went through a familiar arc: in the 90s, you needed to know HTML and a server stack to put up a page; by the 2010s, Squarespace, Wix, and Webflow let domain experts (small business owners, photographers, nonprofits) ship without writing code. Agents are headed for the same arc, faster. The constraints are similar: most of the value comes from understanding a specific business problem deeply, not from being able to wire up a Python loop, a memory store, and a guardrail pipeline by hand.
The technical pieces that make low-code agent builders viable today are now in place: standardized protocols (MCP 2025, A2A) so a builder can compose tools without custom integration work, capability tokens for safe delegation, well-understood patterns (the seven in Chapter 5) that can be exposed as configurable templates rather than handwritten code, and evaluation frameworks robust enough that platform users can validate their agents without being ML researchers. Expect a wave of these to appear over the next 24 months, both as standalone products and as features inside existing SaaS tools (the customer-support platform that ships an agent builder, the analytics tool that ships a research-agent template, etc.).
The interesting strategic question for the field isn't whether these platforms appear, it's where the value accrues: with the platform vendor (like Squarespace), with the domain expert who composes good agents on top, or with the operators who run the underlying compute. History suggests all three layers capture some, with the balance depending on how commoditized each layer becomes.
Practical advice for the next 24 months
- Invest in observability. Whatever you build, you'll want to be able to look back at agent decisions and understand them. Good logs are the foundation everything else (skill libraries, preference learning, world models, verification) is built on.
- Build for the wrapper, not the model. Models will keep getting better and cheaper. The layers around them (your heuristics, your schemas, your guardrails, your routing) are where your real product value lives. Make those solid.
- Track which model you used for what. When models change (and they will, every few months), you want to be able to A/B test, roll back, and compare. Pin your model versions; never rely on "latest".
- Get good at evaluation. The teams winning over the next few years won't be the ones with access to the best models (everyone will have those). They'll be the ones who can tell, faster than competitors, whether a change made things better or worse.
- Stay curious about hybrid approaches. The most interesting work is at the intersections: heuristics + LLMs, world models + RL, multi-agent + verification. The breakthroughs are happening where multiple ideas meet.
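The version-pinning advice in the list above can be sketched as a small routing table. The model names, tasks, and bucket labels are placeholders; the point is that "which model handles what" lives in versioned config, so A/B tests and rollbacks are edits to a table, not code changes.

```python
# Pinned model registry: every task maps to an exact version, never "latest".
PINNED_MODELS = {
    "summarize": "example-model-2026-01-15",
    "classify":  "example-model-2025-11-02",
}
CANDIDATE_MODELS = {
    "summarize": "example-model-2026-03-01",  # under A/B test, not yet promoted
}

def model_for(task: str, ab_bucket: str = "control") -> str:
    if ab_bucket == "treatment" and task in CANDIDATE_MODELS:
        return CANDIDATE_MODELS[task]
    return PINNED_MODELS[task]  # rollback = simply don't promote the candidate

print(model_for("summarize"))                         # example-model-2026-01-15
print(model_for("summarize", ab_bucket="treatment"))  # example-model-2026-03-01
```

Logging the returned model id alongside each agent decision is what makes the later comparison ("did the new model make things better or worse?") possible at all.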