06 Context exchange · how agents share what they know without leaking what they shouldn't

Two agents collaborating is not "send everything you know."

The previous chapter covered the seven shapes of agent collaboration. Once you have picked a shape, the next question shows up almost immediately: when the conductor delegates work to a specialist, what context comes along, and what stays behind? Pass too little and the specialist guesses. Pass too much and you have just leaked the customer's password to a model that did not need it.

This is the question almost every "agent system" gets wrong on the first try. The default in most frameworks is to dump the full conversation history into every sub-call. It is fast to build and it works for demos. In production, it leaks data, blows up the context window, and makes audits impossible because you cannot tell which agent saw what.

The fix is three small ideas working together: a typed envelope around every piece of context, a short handshake that pins down what will flow before any data flows, and a compartment that enforces the agreement at the boundary. None of this is new in computer security; we are just applying old ideas to a new shape of system.

Every other security control in this manual is a wall around an agent. Context exchange is the door. Get the door right and most of the walls become much easier.

The three pieces, in plain English

Envelope
A wrapper around any piece of information passed between agents. Carries provenance (who produced it), classification (how sensitive), and TTL (when to discard). Replaces the bare string that most frameworks send today.
Handshake
A short negotiation that runs before delegation. Both sides declare what they need, what they can return, and what classification of data they handle. The output is a contract that both sides honor for the rest of the conversation.
Compartment
A boundary across which envelopes must be filtered. It applies the handshake's contract to every envelope that crosses by minimizing keys, redacting patterns, and validating tags. The boundary is one function call, not a discipline.
Together
The envelope makes context inspectable. The handshake makes the rules explicit before any data flows. The compartment makes the rules enforceable. Skip any one and the other two stop helping much.

Why a typed envelope, when a string almost works

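The most common way one agent passes context to another today is a string. The orchestrator says "look up this customer's order status," the specialist reads it, calls a tool, returns a string. It works for the demo. It breaks for three reasons in real systems: a bare string carries no record of who produced it, no label for how sensitive it is, and no point at which it should be discarded.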

The envelope is the smallest amount of metadata that fixes all three. The companion code defines it as a frozen dataclass with seven fields beyond the payload itself.

from dataclasses import dataclass, field
from enum import IntEnum
import time, uuid


class Classification(IntEnum):
    PUBLIC = 0          # safe to send anywhere
    INTERNAL = 1        # within the org, but not to third parties
    CONFIDENTIAL = 2    # only to agents with a need-to-know
    SECRET = 3          # only to a small named set; never to logs


@dataclass(frozen=True)
class ContextEnvelope:
    payload: object                                  # the actual data
    producer: str                                    # who created this envelope
    classification: Classification = Classification.INTERNAL
    ttl_seconds: float | None = None           # when to discard
    created_at: float = field(default_factory=time.time)
    envelope_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    derived_from: tuple = ()                       # provenance chain
    tags: tuple = ()                               # free-form routing tags

Three design choices in that snippet are worth calling out. The envelope is frozen: to "modify" an envelope, you create a new one with explicit lineage, the same way version control records every change. Classification is a lattice: combining envelopes with different classifications yields the highest one, which is the standard rule from any label-based system. The TTL is wall-clock: not "ten more steps," not "until the conversation ends," but a real expiry, because that is the only kind of TTL that survives an agent that gets confused about what step it is on.
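To make the three choices above concrete, here is a minimal sketch of how they might look in code. The `combine` helper, the `is_expired` method, and the `EnvelopeError` class body are assumptions made for illustration; the companion kit may name or structure them differently. `require_fresh` is the method name the compartment code in this chapter calls, but its body here is a guess.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Optional
import time
import uuid


class Classification(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    SECRET = 3


class EnvelopeError(Exception):
    """Raised when an envelope may not be used (here: it has expired)."""


@dataclass(frozen=True)
class ContextEnvelope:
    payload: object
    producer: str
    classification: Classification = Classification.INTERNAL
    ttl_seconds: Optional[float] = None
    created_at: float = field(default_factory=time.time)
    envelope_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    derived_from: tuple = ()
    tags: tuple = ()

    def is_expired(self, now: Optional[float] = None) -> bool:
        # Wall-clock expiry. A TTL of None means "no expiry set".
        if self.ttl_seconds is None:
            return False
        now = time.time() if now is None else now
        return now - self.created_at > self.ttl_seconds

    def require_fresh(self) -> None:
        if self.is_expired():
            raise EnvelopeError(f"envelope {self.envelope_id} has expired")


def combine(payload, producer, *parents):
    """Hypothetical helper: build a new envelope from one or more parents."""
    return ContextEnvelope(
        payload=payload,
        producer=producer,
        # Lattice join: as sensitive as the most sensitive parent.
        classification=max(p.classification for p in parents),
        # Record lineage instead of mutating a parent in place.
        derived_from=tuple(p.envelope_id for p in parents),
    )


note = ContextEnvelope(payload="order shipped", producer="orders_agent")
key = ContextEnvelope(payload="api key", producer="vault_agent",
                      classification=Classification.SECRET)
merged = combine("order shipped; key rotated", "orchestrator", note, key)
print(merged.classification.name)   # SECRET
print(len(merged.derived_from))     # 2
```

This sketch does not carry the parents' TTLs forward into the combined envelope; a fuller version would probably take the shortest remaining lifetime.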

The handshake, before any data flows

Now that we have a way to carry context, the next question is what context to carry. The wrong answer is "everything we have, and trust the specialist to ignore the parts it does not need." That is exactly how prompt-injection attacks chain through agent systems, and it is also how the orchestrator's instructions on tone-of-voice end up in the customer's reply by mistake.

The right answer is older than agents. Before TLS sends bytes, it negotiates a cipher suite. Before gRPC sends a call, both sides agree on a protobuf schema. Before OAuth issues a token, the scopes are pinned. The pattern is the same: agree on the contract before any data moves. We do the same thing for inter-agent context.

The companion kit's handshake has three pieces. The callee publishes its AgentCapabilities: the highest classification it accepts, the highest it produces, the tags it adds to its outputs, the sub-tools it might call, and how long it will keep a session alive. The caller sends a HandshakeRequest: the task summary, the classification of the data it intends to send, the tags it needs back, and the keys it is willing to share (the need-to-know set). The negotiation function checks compatibility on five rules and either fails fast with a reason, or returns a SessionContract that both sides will honor.
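The three records described above could be sketched as frozen dataclasses. The field names are taken from the way the chapter's negotiation and example code use them; the default values (in particular the 60-second session defaults) are assumptions, since the chapter does not state them.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Classification(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    SECRET = 3


class HandshakeError(Exception):
    """Raised when caller and callee cannot agree on a contract."""


@dataclass(frozen=True)
class AgentCapabilities:
    """Published by the callee before any session starts."""
    agent_id: str
    accepts_max_classification: Classification
    produces_max_classification: Classification
    output_tags: frozenset = frozenset()
    sub_tools: frozenset = frozenset()
    max_session_seconds: float = 60.0           # assumed default


@dataclass(frozen=True)
class HandshakeRequest:
    """Sent by the caller to open a session."""
    caller_id: str
    task_summary: str
    input_classification: Classification
    required_output_tags: frozenset = frozenset()
    need_to_know_keys: Optional[frozenset] = None
    requested_session_seconds: float = 60.0     # assumed default


@dataclass(frozen=True)
class SessionContract:
    """The agreed terms; both sides honor this for the whole session."""
    session_id: str
    caller_id: str
    callee_id: str
    max_input_classification: Classification
    max_output_classification: Classification
    allowed_input_keys: frozenset
    required_output_tags: frozenset
    ttl_seconds: float
    sub_tools_disclosed: frozenset


request = HandshakeRequest(
    caller_id="orchestrator",
    task_summary="summarize this customer ticket",
    input_classification=Classification.CONFIDENTIAL,
    required_output_tags=frozenset({"summary"}),
    need_to_know_keys=frozenset({"ticket_text", "category"}),
)
print(sorted(request.need_to_know_keys))   # ['category', 'ticket_text']
```

Freezing all three records matches the envelope's design: once a contract is issued, neither side can quietly widen it.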

def negotiate(request: HandshakeRequest,
              callee_caps: AgentCapabilities) -> SessionContract:
    # 1. Caller's data classification must not exceed callee's intake.
    if request.input_classification > callee_caps.accepts_max_classification:
        raise HandshakeError("input classification exceeds callee's max")

    # 2. Output ceiling is bounded by what the callee can produce
    #    AND by what the caller already declared (the response cannot be
    #    more sensitive than the input it was derived from).
    output_ceiling = min(callee_caps.produces_max_classification,
                          request.input_classification)

    # 3. Required output tags must be a subset of what callee produces.
    missing = request.required_output_tags - callee_caps.output_tags
    if missing:
        raise HandshakeError(f"callee does not produce: {sorted(missing)}")

    # 4. Session length must fit inside callee's window.
    if request.requested_session_seconds > callee_caps.max_session_seconds:
        raise HandshakeError("requested session exceeds callee max")

    # 5. Need-to-know set: only the keys the caller declared can flow.
    allowed_keys = request.need_to_know_keys or frozenset()

    return SessionContract(
        session_id=uuid.uuid4().hex,
        caller_id=request.caller_id,
        callee_id=callee_caps.agent_id,
        max_input_classification=request.input_classification,
        max_output_classification=output_ceiling,
        allowed_input_keys=allowed_keys,
        required_output_tags=request.required_output_tags,
        ttl_seconds=request.requested_session_seconds,
        sub_tools_disclosed=callee_caps.sub_tools,
    )

Notice rule 2: the response classification is bounded above by the input classification. Think of it as water not flowing uphill. If the caller sent the agent only internal data, the agent's response cannot be secret, because there was no secret input from which to derive a secret answer. This single rule prevents one of the most common sins in agent systems: a chatty specialist that "helpfully" includes more context than it was asked for.
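Rule 2 in isolation is small enough to try by hand. The sketch below pulls the ceiling calculation into a standalone function; `output_ceiling` is a name invented here for illustration and is not part of the companion kit.

```python
from enum import IntEnum


class Classification(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    SECRET = 3


def output_ceiling(callee_produces_max: Classification,
                   caller_input: Classification) -> Classification:
    # Rule 2: the lower of the two bounds wins.
    return min(callee_produces_max, caller_input)


# A callee able to produce SECRET output, fed only INTERNAL input,
# is still capped at INTERNAL for this session.
ceiling = output_ceiling(Classification.SECRET, Classification.INTERNAL)
print(ceiling.name)          # INTERNAL

# A response labeled above the ceiling is the case the inbound check rejects.
response = Classification.SECRET
print(response > ceiling)    # True
```

Because `Classification` is an `IntEnum`, `min` and `>` compare the levels directly and no lookup table is needed.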

The compartment, where the contract becomes enforceable

A handshake is just an agreement until something enforces it. Without an enforcer, both sides have to remember the contract on every message, and one slip leaks data the contract was supposed to prevent. The compartment is the enforcer.

A compartment has two gates. gate_outbound filters envelopes leaving the caller's side: it rejects expired envelopes, checks classification, minimizes the payload to the agreed keys, and redacts known PII patterns. gate_inbound validates envelopes returned by the callee: it rejects expired envelopes, checks the response classification, and verifies that all required tags are present.

@dataclass(frozen=True)
class Compartment:
    contract: SessionContract

    def gate_outbound(self, envelope: ContextEnvelope) -> ContextEnvelope:
        envelope.require_fresh()                              # 1. TTL
        if envelope.classification > self.contract.max_input_classification:
            raise EnvelopeError("exceeds contract input ceiling")  # 2. classification
        out = envelope
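        # 3. minimize. Note: an empty allowed_input_keys skips this step, so a
        #    contract with no need-to-know keys passes dict payloads whole.
        if isinstance(out.payload, dict) and self.contract.allowed_input_keys:
            out = minimize(out, self.contract.allowed_input_keys)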
        out = redact(out)                                       # 4. redact PII
        return out

    def gate_inbound(self, envelope: ContextEnvelope) -> ContextEnvelope:
        envelope.require_fresh()
        if envelope.classification > self.contract.max_output_classification:
            raise EnvelopeError("response classification too high")
        missing = self.contract.required_output_tags - set(envelope.tags)
        if missing:
            raise EnvelopeError(f"missing required tags: {sorted(missing)}")
        return envelope

The compartment is deliberately small. It does four things on the way out and three on the way in, and that is all. Bigger filters become arguments. The point of a compartment is that callers do not have to remember the rules; they hand an envelope to gate_outbound and get back something safe to send. The rules live in one place.

An end-to-end example

Here is what the whole flow looks like for a realistic case: an orchestrator handing a customer ticket to a summarizer specialist. The orchestrator has the full ticket, including the customer's PII and an internal note that should never reach a third-party model. The summarizer needs the ticket text and the category, nothing else.

# Step 1. Specialist publishes capabilities.
summarizer = AgentCapabilities(
    agent_id="summarizer_v2",
    accepts_max_classification=Classification.CONFIDENTIAL,
    produces_max_classification=Classification.CONFIDENTIAL,
    output_tags=frozenset({"summary", "category"}),
    sub_tools=frozenset({"web_search"}),
    max_session_seconds=120,
)

# Step 2. Orchestrator sends a handshake request.
request = HandshakeRequest(
    caller_id="orchestrator",
    task_summary="summarize this customer ticket",
    input_classification=Classification.CONFIDENTIAL,
    required_output_tags=frozenset({"summary"}),
    need_to_know_keys=frozenset({"ticket_text", "category"}),
)

# Step 3. Negotiate. If incompatible, this raises and no data flows.
contract = negotiate(request, summarizer)
compartment = Compartment(contract=contract)

# Step 4. Wrap the raw context. The orchestrator has lots of fields,
# most of which the summarizer should not see.
raw = ContextEnvelope(
    payload={
        "ticket_text": "card 4111-1111-1111-1111 charged twice. user@x.com",
        "category": "billing",
        "internal_note": "VIP customer, escalate fast",    # NOT in NTK
        "user_pii": "ssn 123-45-6789",                  # NOT in NTK
    },
    producer="orchestrator",
    classification=Classification.CONFIDENTIAL,
)

# Step 5. The compartment minimizes and redacts on the way out.
safe = compartment.gate_outbound(raw)
# safe.payload is now {"ticket_text": "card [REDACTED] charged twice. [REDACTED]",
#                       "category": "billing"}
# internal_note and user_pii are gone. The credit card and email are gone too.

Three things happened in step 5 that would not have happened in a string-based system. The keys internal_note and user_pii were dropped because they were not in the need-to-know set. The credit card number was redacted because it matched a default pattern. The customer email was redacted for the same reason. The summarizer never saw any of it.

On the way back, the specialist returns its summary. The compartment validates that it has the summary tag the orchestrator asked for, that it is not classified higher than the ceiling, and that the envelope is still fresh. If any of those fail, the orchestrator gets a clean exception, not a silently-broken response.
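The return path can be exercised in the same way. The sketch below restates the three inbound checks as a free function so that it runs on its own; the `receive` wrapper, the trimmed envelope, and the sample responses are hypothetical and exist only to show what a "clean exception" looks like from the orchestrator's side.

```python
import time
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Optional


class Classification(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    SECRET = 3


class EnvelopeError(Exception):
    """Raised when an envelope fails a compartment check."""


@dataclass(frozen=True)
class ContextEnvelope:
    # Trimmed stand-in for the chapter's envelope.
    payload: object
    producer: str
    classification: Classification = Classification.INTERNAL
    ttl_seconds: Optional[float] = None
    created_at: float = field(default_factory=time.time)
    tags: tuple = ()

    def require_fresh(self) -> None:
        if (self.ttl_seconds is not None
                and time.time() - self.created_at > self.ttl_seconds):
            raise EnvelopeError("envelope expired")


def gate_inbound(envelope, max_output, required_tags):
    # The same three checks as Compartment.gate_inbound, as a free function.
    envelope.require_fresh()
    if envelope.classification > max_output:
        raise EnvelopeError("response classification too high")
    missing = required_tags - set(envelope.tags)
    if missing:
        raise EnvelopeError(f"missing required tags: {sorted(missing)}")
    return envelope


def receive(envelope):
    """Hypothetical orchestrator-side wrapper: accept, or report why not."""
    try:
        gate_inbound(envelope, Classification.CONFIDENTIAL,
                     frozenset({"summary"}))
        return "accepted"
    except EnvelopeError as exc:
        return f"rejected: {exc}"


good = ContextEnvelope("Customer was double-charged.", "summarizer_v2",
                       Classification.CONFIDENTIAL, tags=("summary",))
untagged = ContextEnvelope("Customer was double-charged.", "summarizer_v2",
                           Classification.CONFIDENTIAL)
stale = ContextEnvelope("old summary", "summarizer_v2",
                        Classification.CONFIDENTIAL, ttl_seconds=30,
                        created_at=time.time() - 60, tags=("summary",))

print(receive(good))       # accepted
print(receive(untagged))   # rejected: missing required tags: ['summary']
print(receive(stale))      # rejected: envelope expired
```

Each rejection names the rule that failed, which is what lets the orchestrator decide whether to retry, renegotiate, or give up.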

Five failure modes this prevents

To make the value concrete, here are five failures that are common in real agent systems and how each piece of the design prevents them.

| Failure mode | What goes wrong | Which piece prevents it |
| --- | --- | --- |
| Surplus context leak | Orchestrator dumps full ticket including internal notes; specialist quotes them in the customer reply. | Need-to-know keys in the handshake plus minimize in the compartment. |
| PII in logs | Specialist logs the raw input for debugging; the log harvests credit card numbers and emails. | Redact in the compartment runs before the envelope ever leaves the caller's side. |
| Stale data drift | Long-running session keeps using a token or document that has since been revoked. | TTL on the envelope plus require_fresh on every read. |
| Quiet privilege creep | Specialist starts calling sub-tools the caller never knew about, escalating its effective scope. | sub_tools_disclosed in the contract makes the specialist's tool set public up front. |
| Output-too-classified | Specialist receives public data, confabulates a "secret" tag onto its response, gets routed to high-trust callers. | Output ceiling rule in negotiate: response classification cannot exceed input classification. |
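The chapter says only that sub_tools_disclosed makes the tool set public; it does not show how a caller acts on it. One way a caller might use the disclosure is to compare it against the tools a specialist reports having called. The sketch below assumes such a report exists; `undisclosed_tools` and the `tools_used` list are hypothetical names, not part of the companion kit.

```python
def undisclosed_tools(tools_used, sub_tools_disclosed):
    """Return the tools that were called but never declared in the contract."""
    return sorted(set(tools_used) - set(sub_tools_disclosed))


# The contract disclosed one sub-tool at handshake time.
disclosed = frozenset({"web_search"})

# Assumed: the specialist (or a tracing layer) reports which tools it called.
tools_used = ["web_search", "send_email", "web_search"]

extra = undisclosed_tools(tools_used, disclosed)
print(extra)   # ['send_email']
if extra:
    print("flag session for review")
```

A check like this only detects creep after the fact. Blocking an undisclosed call before it runs would need a hook in the specialist's tool dispatcher, which is outside what this chapter describes.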

How this fits the seven patterns

Different patterns have different context-exchange profiles. The same three building blocks apply, but how you use them shifts.

What this is not

A few things this chapter is deliberately not promising.

Practical guidance

The smallest version of this idea fits in 200 lines of Python and is checked in next door (companion kit, context_exchange/). Read it, run the demo, port the parts that fit your stack. Skip the rest.