Chapter 20: Glossary
Consolidated reference for the book’s architectural vocabulary. Definitions are binding within the book. Where a term has a broader industry meaning, that meaning is noted; the book’s stricter use is what governs every other chapter.
Terms
Action surface
The set of tools and operations an agent is permitted to invoke. One of the six axes of bounded autonomy (Chapter 5). The most consequential axis in defining the agent’s blast radius.
Adaptation
The fourth property of an agent (Chapter 2): the agent incorporates feedback from the environment into subsequent reasoning. Not equivalent to machine learning; can be as simple as revising a plan after a tool call fails.
Affinity routing
A model-gateway strategy that routes requests sharing a static prompt prefix to the same upstream model session or region, preserving the provider-side prompt cache that naive load balancing would destroy (Chapter 15).
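A minimal sketch of the routing decision, assuming the gateway sees the static prefix as a string and chooses among a fixed list of upstreams. The function name `pick_upstream` and the upstream labels are hypothetical; a real gateway may key on a provider-specific cache identifier instead.

```python
import hashlib


def pick_upstream(static_prefix: str, upstreams: list[str]) -> str:
    """Map a static prompt prefix to one upstream so repeats reach the same cache."""
    digest = hashlib.sha256(static_prefix.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(upstreams)
    return upstreams[index]


UPSTREAMS = ["region-a", "region-b", "region-c"]  # hypothetical upstream labels
PREFIX = "SYSTEM: You are a claims assistant.\nTOOLS: lookup_claim, send_letter"

# Two requests that share the prefix are routed to the same upstream.
first = pick_upstream(PREFIX, UPSTREAMS)
second = pick_upstream(PREFIX, UPSTREAMS)
```

Plain modulo hashing remaps most prefixes when the upstream list changes; a consistent-hashing ring is the usual refinement where that matters.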
Agent
A stateful, goal-directed computational entity that reasons about possible actions, selects among them, executes actions in an environment, and updates internal state based on feedback (Chapter 2). The four-property definition is the book’s principal discriminator.
Agent boundary
The logical line around the closed loop of state, choice, action, and feedback. Not a physical or process boundary; an architectural one.
Agentic system
A system that embeds probabilistic reasoning components inside deterministic infrastructure (Chapter 1). The system has at least one agent, bounded by infrastructure, governed by deterministic enforcement.
Anti-pattern
A structural choice that reliably produces specific failure modes. Cataloged for agentic systems in Chapter 11.
Approval gate
A governance component (Chapter 6) that routes a proposed action to a human reviewer before commit. Has explicit routing rules, context, decision semantics, and timeout behavior. Distinct from the plan approval gate, which operates on a structured plan before execution begins.
Plan approval gate
A governance component (Chapter 6) that routes a structured plan — tools, data scopes, milestones, projected spend — to a human reviewer before any consequential action executes. Distinct from a per-action approval gate. Re-triggers on replanning that changes tools, data scope, or irreversible milestones.
Audit trail
The portion of the trace (Chapter 12) retained for compliance and incident response, with retention and integrity guarantees stronger than for routine operational traces.
Assembled context
The context the harness builds on each turn — scoped retrieval, loaded skills, working memory, and the current goal — rather than an ever-growing transcript accumulated turn by turn. Contrast prompt stuffing (Chapter 19).
Autonomous
Describes an agent that operates in a self-directed way. In this book, autonomy is always understood as bounded in practice (see Bounded autonomy).
Backend-for-agent (BFA)
An inbound integration pattern in which an agent’s tools wrap a platform’s existing internal APIs, carrying the invoking user’s downscoped token, rather than querying the database directly, so the platform’s own gateway enforces tenancy and access control (Chapter 14).
Bounded authority
In delegation patterns (e.g., orchestrator–worker), the explicit limitation of what a worker can do or decide. Distinct from bounded autonomy (which constrains reasoning loops).
Bounded autonomy
Explicitly constrained reasoning and action loops with cost, iteration, and risk limits enforced by architecture (Chapter 5). The defining discipline of production agentic systems.
Bounding layer
The architectural layer that enforces the six axes of bounded autonomy: iteration, cost, time, action surface, data access, reversibility. Sits between the agent and the rest of the deterministic infrastructure.
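One way to picture the six axes as a single deterministic check is sketched below. The class names, field names, and the `violated_axes` helper are assumptions made for illustration, not the book's reference design; in particular, treating every irreversible action as a violation is a simplification of the reversibility envelope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bounds:
    max_iterations: int
    max_cost_usd: float
    max_seconds: float
    allowed_tools: frozenset
    allowed_scopes: frozenset


@dataclass(frozen=True)
class Proposal:
    tool: str
    data_scope: str
    reversible: bool


def violated_axes(bounds, proposal, iterations, cost_usd, elapsed_seconds):
    """Return the axes a proposed action would violate; an empty list means proceed."""
    checks = [
        ("iteration", iterations >= bounds.max_iterations),
        ("cost", cost_usd >= bounds.max_cost_usd),
        ("time", elapsed_seconds >= bounds.max_seconds),
        ("action surface", proposal.tool not in bounds.allowed_tools),
        ("data access", proposal.data_scope not in bounds.allowed_scopes),
        # Simplification: anything irreversible falls outside the envelope.
        ("reversibility", not proposal.reversible),
    ]
    return [axis for axis, failed in checks if failed]
```

The check runs in ordinary code outside the model, so a violation cannot be talked around by the prompt.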
Chunked buffering
A resolution to the streaming-versus-validation paradox in which output is accumulated into semantic units, each cleared by a policy gate before it is flushed to the user, so no unvalidated text is rendered. Trades a small increase in time to first token for a hard guarantee; contrast optimistic streaming (Chapter 13).
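The buffering step can be sketched as a generator, assuming a sentence is the semantic unit and the policy gate is a plain predicate. The names `chunked_stream` and `no_account_numbers`, and the `[withheld]` placeholder, are hypothetical.

```python
from typing import Callable, Iterable, Iterator


def chunked_stream(
    tokens: Iterable[str],
    gate: Callable[[str], bool],
    withheld: str = "[withheld]",
) -> Iterator[str]:
    """Accumulate tokens into sentences and flush each only after the gate clears it."""
    buffer = ""
    for token in tokens:
        buffer += token
        if buffer.rstrip().endswith((".", "!", "?")):
            yield buffer if gate(buffer) else withheld
            buffer = ""
    if buffer:  # trailing text without closing punctuation is still gated
        yield buffer if gate(buffer) else withheld


def no_account_numbers(chunk: str) -> bool:
    """Toy policy gate: refuse any chunk that contains an account reference."""
    return "ACCT-" not in chunk


tokens = ["Your ", "refund ", "is ", "approved. ", "Ref ", "ACCT-", "991. ", "Bye"]
rendered = list(chunked_stream(tokens, no_account_numbers))
# rendered == ["Your refund is approved. ", "[withheld]", "Bye"]
```

The first chunk reaches the user only once a full sentence has arrived, which is where the added time to first token comes from.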
Coordination pattern
A pattern that structures interaction among multiple agents (Chapter 9). Distinct from a control pattern; presupposes multi-agent structure.
Control pattern
A pattern that structures task execution (Chapter 9). May apply to single-agent or multi-agent systems.
Capability negotiation
The architectural footprint that distinguishes a skill from an ordinary memory entry. A factual memory entry returns text; a skill, on activation, attempts to alter the agent’s action surface, declaring the tools and data scopes it requires to function. That declaration must be negotiated with the bounding layer (Chapter 5) and the governance layer (Chapter 6) — admitted, constrained, or refused — before the skill can be used. Plain memory never touches the action surface; a skill always proposes to. It is what makes skills a distinct pattern rather than a kind of document (Chapter 10).
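The three outcomes (admitted, constrained, refused) can be sketched as a set intersection, assuming a skill declares its tools and data scopes up front and marks some tools as required. `Declaration` and `negotiate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Declaration:
    tools: frozenset
    scopes: frozenset


def negotiate(requested, permitted, required_tools):
    """Return (outcome, grant) for a skill's declared tools and data scopes."""
    grant = Declaration(
        requested.tools & permitted.tools,
        requested.scopes & permitted.scopes,
    )
    if not required_tools <= grant.tools:
        return "refused", None  # the skill cannot function without these tools
    if grant == requested:
        return "admitted", grant
    return "constrained", grant  # usable, but with a narrower surface than requested
```

A factual memory entry never reaches this function, because it declares nothing.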
Cognitive pattern
A pattern that structures the internal reasoning of a single agent (Chapter 4). In 2026, several cognitive patterns have eroded into model-internal behavior.
Confused deputy
A security failure mode (Chapter 11) in which an agent acting for an unprivileged user uses its own higher-privileged access to read or mutate data the user is not entitled to. Defended by propagating the user’s identity to the data layer, so that the agent inherits the user’s authorization rather than a god-mode service account; not defended by trusting the agent to self-restrict.
Cost budget
A hard ceiling on the total monetary cost of a run, measured across all model calls, tool invocations, and downstream resource use (Chapter 5). One of the six axes of bounded autonomy.
Critic
A component, typically a model call with a different prompt, role, or model, that evaluates the output of another component (the executor) and provides a verdict. See Critic–Executor Split.
Critic–Executor Split
A governance pattern (Chapter 6) in which generation is separated from evaluation. Effective when the critic has access to information the executor does not.
Curated semantic memory
Semantic memory whose contents are introduced through controlled processes, with explicit freshness tracking and retirement (Chapter 7). Contrasted with uncurated vector indexes.
Data access scope
The set of data an agent is permitted to read or write (Chapter 5). One of the six axes of bounded autonomy. Often confused with action surface; logically distinct.
Debate
A coordination pattern (Chapter 9) in which two or more agents argue opposing positions before a judge renders a verdict.
Deterministic infrastructure
The non-probabilistic components of an agentic system (Chapter 1): tool adapters, memory stores, validators, policy gates, trace stores, orchestration code. Carries the architectural commitments that bound the agent.
Drift
Behavioral change over time without code change. Can come from model upgrades, memory accumulation, tool changes, or shifting input distributions (Chapter 18).
Dynamic capability loading
The architectural pattern of extending a running agent with new procedural capability (instructions, the tools it needs, and its data scopes) without redeployment, by loading a runtime capability payload on demand. The Agent Skill standard is one implementation (Chapter 10).
Execution seam
The architectural boundary between the model deciding to call a tool and the harness executing that call. Tool selection can move into the model (it is part of the inner cognitive loop); tool execution should not, because the call is where a bound can refuse, a gate can escalate, and a trace entry can attach. The discriminating question for any tool is whether the harness can refuse or modify the call before its effect occurs. Where execution returns to the harness’s own code, the seam holds even when selection happened inside the model; where a provider runs the tool and selects it inside a single API call (provider-hosted execution), the seam is lost and the effect has already happened on someone else’s infrastructure before the harness can gate it. Effectful and irreversible capability must run on the harness side of the seam; provider-hosted execution is reserved for the read-only or idempotent (Chapter 19).
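A minimal sketch of the harness side of the seam, assuming the model's proposal arrives as a dictionary with `tool` and `args` keys. The `dispatch` function, the gate, and the example tool are hypothetical.

```python
def dispatch(proposal, registry, gate, trace):
    """Execute a model-proposed tool call only after the gate allows it."""
    allowed = gate(proposal)
    trace.append({"tool": proposal["tool"], "allowed": allowed})
    if not allowed:
        return {"status": "refused"}  # the effect never happens
    result = registry[proposal["tool"]](**proposal["args"])
    return {"status": "ok", "result": result}


sent = []  # stands in for a real side effect


def send_email(to):
    sent.append(to)
    return "sent"


def internal_only(proposal):
    """Toy gate: allow mail only to a hypothetical internal domain."""
    return proposal["args"]["to"].endswith("@example.com")
```

The discriminating question in the definition maps to the `if not allowed` branch: the harness still holds the call when the decision is made.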
Egress filtering
Inspection of an outbound prompt at the model gateway before it leaves the corporate network; a match for regulated data reroutes the request to an internally hosted model rather than a public one. The gateway is the system’s final egress boundary (Chapter 15).
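The reroute decision can be sketched as a pattern match, assuming regulated data is recognizable by shape. The single pattern below (a US Social Security number shape) and the route labels are illustrative assumptions; production gateways typically rely on richer classifiers.

```python
import re

# Hypothetical detector list; one SSN-shaped pattern for illustration only.
REGULATED_PATTERNS = [re.compile(r"\b\d{3}-\d{2}-\d{4}\b")]


def choose_route(prompt: str) -> str:
    """Send prompts that match regulated data to the internally hosted model."""
    if any(pattern.search(prompt) for pattern in REGULATED_PATTERNS):
        return "internal"
    return "public"
```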
Episodic memory
A persistent record of past tasks, outcomes, and notable observations (Chapter 7). Append-only or versioned; retrieval-mediated; scoped to identity.
Evaluator-optimizer
A control pattern (Chapter 9) in which a generator produces a candidate and an evaluator critiques it; the generator revises until the evaluator accepts. Named as a workflow in Anthropic’s “Building effective agents” guidance.
Failure mode
A recurring way an agentic system goes wrong (Chapter 11). Each has a typical cause, a cascade, and a structural defense.
glass layer
A term coined in this book for the user interface and client-side interaction state when that surface is load-bearing governance rather than presentation — lowercase by convention, parallel to “bounding layer” and “governance layer.” Not industry-standard: practitioners say presentation layer or human-in-the-loop interface; adjacent usages include “single pane of glass” (ops dashboards) and “glass cockpit” (aviation HMI), neither equivalent. Where a human is the final policy gate, the glass layer is the policy engine (Chapter 13).
Governance layer
Structural enforcement mechanisms including validators, policy gates, approval gates, risk-based escalation, and rollback paths (Chapter 6). Load-bearing architectural element; not a compliance bolt-on.
Handoff
A control pattern (Chapter 9) in which control is transferred from one agent to another at a defined boundary. State passed across the handoff is structured.
Harness
The deterministic envelope that turns a model into an agent (Chapter 19; introduced in Chapter 4): the code that assembles context, calls the model, parses its intent, dispatches each proposed action through the bounding and governance layers, observes the result, persists state, and decides whether to loop again. The inner reason–act loop may be model-internal; the harness is everything around it. The term denotes the engineering artifact (a test or agent harness), not the marketing verb.
Human-in-the-loop
A governance pattern (Chapter 6) in which defined actions are routed to a human reviewer before commit. Effective when used selectively; degrades when used universally (approval fatigue).
Idempotency / idempotency key
A property (and the token that enforces it) ensuring that repeating an operation has the same effect as performing it once. An idempotency key attached to a tool call lets the adapter recognize a retry and avoid double-executing a side effect after a network timeout. It is the structural defense against cascading tool failures (Chapter 11, Chapter 18).
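A minimal sketch of an adapter that honors an idempotency key, assuming results are kept in memory. The `PaymentAdapter` class and its fields are hypothetical; a real adapter would persist keys in a durable store.

```python
class PaymentAdapter:
    """Toy tool adapter that executes each idempotency key at most once."""

    def __init__(self):
        self._results = {}  # idempotency key -> stored receipt
        self.charges = []   # stands in for the real side effect

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        if idempotency_key in self._results:
            # A retry: return the stored receipt without charging again.
            return self._results[idempotency_key]
        self.charges.append(amount_cents)
        receipt = {"charge_id": len(self.charges), "amount_cents": amount_cents}
        self._results[idempotency_key] = receipt
        return receipt
```

A caller that times out can resend the same key safely; only a new key produces a new charge.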
Ingestion pipeline
The deterministic ETL pipeline that prepares enterprise data for semantic memory: redaction, identity and access-control tagging, lineage, cache invalidation, and structural extraction all run before anything is stored. The write path the memory gateway’s read path depends on (Chapter 8).
Iteration limit
The maximum number of reasoning/action steps an agent may take before the loop is aborted (Chapter 5). One of the six axes of bounded autonomy.
Knowledge graph
A store of typed entities and explicit relationships extracted from source data, queried by deterministic traversal rather than fuzzy similarity. The physical realization of an ontology and a complement to vector retrieval in semantic memory (Chapter 7, Chapter 8).
Late-binding authorization
An access-control pattern for semantic memory: each record is tagged at ingestion with its source’s access-control list, and the memory gateway re-evaluates the querying user’s entitlements at read time, kept current by a synchronization worker (Chapter 8).
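The read-time check can be sketched as follows, assuming each record carries the groups copied from its source's access-control list and the gateway holds a user-to-groups map that a synchronization worker (not shown) keeps current. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    text: str
    acl: frozenset  # groups allowed to read, tagged at ingestion


class MemoryGateway:
    def __init__(self, records, entitlements):
        self._records = records
        self._entitlements = entitlements  # user -> set of groups, updated externally

    def read(self, user: str, query: str) -> list:
        """Evaluate the user's current groups on every read, never at write time."""
        groups = self._entitlements.get(user, set())
        return [
            record.text
            for record in self._records
            if query.lower() in record.text.lower() and record.acl & groups
        ]
```

Because entitlements are looked up per read, revoking a group membership takes effect on the next query without re-ingesting anything.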
Lethal trifecta
The combination of untrusted content, sensitive data access, and external action capability; the most-studied class of agent vulnerabilities (Chapter 6, Chapter 11). Defended by layered governance enforced at the action and output layers, not the prompt layer.
Lineage
Immutable provenance metadata (source system, document identifier, version, ingestion time) carried by every chunk and entity in semantic memory, so a flawed retrieval can be traced to its origin and corrected at source (Chapter 8).
LLM-as-a-judge
A probabilistic evaluation pattern in which a model scores or compares another model’s output (Chapter 12). The most common way to test output quality at scale (Layer 3), but inherently weaker than deterministic checks, because the judge shares the failure modes of the agent it grades. Useful for quality signals; insufficient for Layer 1/2 governance, which must be deterministic.
MCP (Model Context Protocol)
A transport protocol for tools, resources, and prompts between an agent and external services. Plumbing layer, not a pattern; distinct from Skills (Chapter 10).
Memory
Structured state influencing reasoning (Chapter 7). Subtypes: working, episodic, semantic.
Memory compaction
Summarization and pruning to keep memory bounded (Chapter 7). The standard mechanism: summarize older entries and discard or move them to cold storage.
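The standard mechanism can be sketched in a few lines, assuming entries are ordered oldest first and the summarizer is supplied by the caller (in practice often a model call). The `compact` function is a hypothetical name.

```python
def compact(entries, keep_recent, summarize):
    """Replace all but the newest entries with one summary; return (kept, archived)."""
    split = max(len(entries) - keep_recent, 0)
    older, recent = entries[:split], entries[split:]
    if not older:
        return list(recent), []
    # The archived list is what would be discarded or moved to cold storage.
    return [summarize(older)] + list(recent), list(older)
```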
Memory gateway
The architectural component through which an agent’s memory reads and writes are mediated (Chapter 7). Enforces retrieval policy, scoping, redaction, and governance.
Model gateway
An internal proxy through which all model inference passes, enforcing data residency, capability-tier routing, affinity caching, failover, and cost attribution. The deterministic boundary between the agent and the probabilistic models it consumes (Chapter 15).
Multi-agent system
A system with more than one agent (Chapter 9). True multi-agent systems require coordination patterns; many “multi-agent” systems are single agents stylized as several.
Ontology
A governed, explicit map of a business domain (its entity types and the relationships among them) that gives an agent deterministic definitions to reason over instead of inferring domain structure from text. Realized physically as a knowledge graph (Chapter 7).
Optimistic streaming
A resolution to the streaming-versus-validation paradox in which tokens stream to the client as provisional output while policy gates run in parallel; a failing gate triggers a visible redaction. Trades a brief exposure risk for ideal time to first token; contrast chunked buffering (Chapter 13).
Orchestrator–Worker
A control pattern (Chapter 9) in which a central agent delegates specialist subtasks to worker agents with bounded authority.
Pattern
A reusable design solution to a recurring architectural problem. In this book, most patterns are documented in a compressed form (Intent / Forces / Tradeoffs / Where-to-read-more) with cross-references to canonical sources, rather than in a uniform long-form template (Chapter 4).
Policy gate
A governance component (Chapter 6) that enforces operational, security, or compliance rules deterministically. Policy is expressed as code or in a rule engine, not as prompts.
Progressive disclosure
The loading mechanism by which an agent accesses skills (Chapter 10): discovery (name + description), activation (full manifest), execution (referenced resources). Reduces context footprint.
Prompt caching
Provider-side reuse of a previously processed static context prefix, so that resending it on later turns costs a fraction of the first call (Chapter 3, Chapter 18). The architectural reason a large static block (semantic memory, tool documentation, a stable system prompt, a loaded skill) can be supplied each turn without destroying per-session economics. Does not help dynamic per-turn content, and does nothing for attention degradation.
Prompt stuffing
Accumulating prior turns, tool outputs, and retrieved chunks into the context window each iteration instead of assembling a fresh, scoped context each turn (Chapter 19). A common precursor to context exhaustion (Chapter 11). Contrast assembled context.
Reasoning model
A language model whose architecture and training emphasize multi-step reasoning. In 2026, reasoning models internalize many patterns (ReAct, Plan–Execute, Reflection) that were previously architectural. They often expose reasoning tokens separately from completion tokens in API billing; routing and cost attribution are developed in Chapter 15, and trajectory observability for reasoning spend in Chapter 12.
Reasoning trajectory
The observable path an agent takes through harness turns on a task — tool calls, state changes, and iteration-level progress — as distinct from final output quality. Scored from structured trace events without reading chain-of-thought prose (Chapter 12).
Thrash signature
A recognizable trace pattern indicating iteration without progress: repeated near-identical tool calls, oscillating plans, or high reasoning-token spend with flat state delta. Early warning for loop failures (Chapter 11, Chapter 12).
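One of the signals, repeated identical tool calls, can be detected from structured trace events alone. The sketch below assumes each event is a dictionary with `tool` and `args` keys; the window and threshold values are arbitrary, and it matches only exact repeats, not near-identical calls or oscillating plans.

```python
import json
from collections import Counter


def is_thrashing(events, window=6, repeat_threshold=3):
    """Flag a run whose recent tool calls repeat the same tool with the same arguments."""
    recent = events[-window:]
    fingerprints = Counter(
        (event["tool"], json.dumps(event["args"], sort_keys=True))
        for event in recent
    )
    return any(count >= repeat_threshold for count in fingerprints.values())
```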
Replay
Re-execution of an agent’s deterministic substrate against a captured trace (Chapter 12). The foundation of regression testing, counterfactual analysis, and incident response.
Reversibility envelope
The set of actions an agent may take without explicit human approval, defined by what is reversible from the system’s standpoint (Chapter 5). One of the six axes of bounded autonomy.
Risk-based escalation
A governance pattern (Chapter 6) that routes actions through governance paths of varying strictness based on a risk score.
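As a sketch, the routing reduces to thresholds on the score. The cut-off values and path names below are assumptions chosen for illustration, and the score is assumed to lie between 0 and 1.

```python
def governance_path(risk_score: float) -> str:
    """Map a risk score in [0, 1] to a governance path of increasing strictness."""
    if risk_score < 0.3:   # hypothetical threshold
        return "auto_commit"
    if risk_score < 0.7:   # hypothetical threshold
        return "policy_gate"
    return "human_approval"
```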
Rollback
A compensating mechanism for actions that turn out to be wrong (Chapter 6). Reversible actions have direct inverses; partially reversible actions have compensating workflows; truly irreversible actions cannot be rolled back and must be prevented by the reversibility envelope.
Runtime capability payload
The unit loaded by dynamic capability loading: a self-contained bundle of procedural instructions, required tool declarations, and data scopes, injected into the agent’s context when a task calls for it. A skill is one form (Chapter 10).
Saga
A control pattern (Chapter 9) from microservices literature applied to agentic systems: a sequence of actions with compensating actions for each step.
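A minimal sketch of the compensation logic, assuming each step is supplied as an (action, compensation) pair of callables. `run_saga` is a hypothetical name, and a production version would also record progress durably.

```python
def run_saga(steps):
    """Run steps in order; on failure, run compensations for completed steps in reverse."""
    compensations = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(compensations):
                undo()
            return False
        compensations.append(compensate)
    return True
```

The failed step's own compensation is not run, because that step never completed.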
Schema validator
A governance component (Chapter 6) that enforces structural correctness on every output emitted by the agent. Deterministic; always-on; the lowest-cost, highest-impact validator class.
Self-consistency
A cognitive pattern (Chapter 4) and coordination pattern (Chapter 9) in which multiple independent reasoning traces are aggregated to reduce stochastic error.
Semantic layer (metrics layer)
Deterministic middleware that holds governed definitions of business metrics and compiles them to exact queries. The agent requests a metric by name rather than writing SQL, keeping business-rule definitions out of the model’s guesswork and preventing structural hallucination (Chapter 14).
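The idea can be sketched as a lookup from metric name to a governed query template. The metric name, table, and column names below are invented for illustration; the query is returned with bound parameters rather than executed.

```python
# Hypothetical governed definitions; the agent never sees or writes this SQL.
METRICS = {
    "net_revenue": "SELECT SUM(amount - refunds) FROM orders WHERE status = 'settled'",
}


def compile_metric(name: str, start_date: str, end_date: str):
    """Compile a named metric to an exact parameterized query, or refuse unknown names."""
    if name not in METRICS:
        raise KeyError(f"unknown metric: {name}")
    sql = METRICS[name] + " AND order_date >= ? AND order_date < ?"
    return sql, (start_date, end_date)
```

An unknown metric name fails loudly instead of letting the model improvise a definition.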
Semantic memory
Structured domain knowledge stored explicitly (Chapter 7): facts, rules, ontologies, schemas, policy. Curated, versioned, treated like a database.
Semantic routing
A model-gateway strategy in which the agent requests a capability tier (frontier reasoning, fast structured output, or cheap classification) rather than a named model, and the gateway maps the tier to the most cost-effective model available (Chapter 15).
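A minimal sketch of the tier lookup, assuming the gateway keeps a catalog of models per tier with a relative cost for each. The model names and cost figures are placeholders, not real products or prices.

```python
# Hypothetical catalog: tier -> {model name: relative cost}.
CATALOG = {
    "frontier_reasoning": {"large-a": 15.0, "large-b": 12.0},
    "fast_structured_output": {"mid-a": 3.0},
    "cheap_classification": {"small-a": 0.25, "small-b": 0.4},
}


def resolve(tier: str, available: set) -> str:
    """Return the cheapest currently available model for the requested tier."""
    candidates = {
        name: cost for name, cost in CATALOG[tier].items() if name in available
    }
    if not candidates:
        raise LookupError(f"no available model for tier {tier!r}")
    return min(candidates, key=candidates.get)
```

Because the agent names only the tier, the gateway can swap models or fail over without any change to agent code.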
Session
A coherent unit of agent activity (typically a task or conversation). The unit of identity, scoping, and trace correlation.
Skill
A runtime-loaded packaged capability (a folder with a SKILL.md manifest) that extends an agent on demand (Chapter 10). Subordinate to architecture; subject to admission and governance.
Small language model (SLM)
A model small enough (roughly one to several billion parameters) to run locally on a device, container, or bare-metal server. Fixed weights, no network latency, and no egress risk make it behave more like a deterministic tool than a cloud model (Chapter 15).
State hydration (suspend-and-resume)
The operational mechanism by which an agent’s working memory is written to a durable store and its compute process terminated while it waits (typically for a human-in-the-loop approval), then reloaded into fresh compute to resume the session (Chapter 7, Chapter 18). Makes long-running agentic workflows affordable on serverless or autoscaled infrastructure.
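The suspend and resume halves can be sketched as a JSON round trip, assuming working memory is JSON-serializable and the durable store behaves like a key-value map. The function names and the dictionary standing in for the store are hypothetical.

```python
import json


def suspend(store, session_id: str, working_memory: dict) -> None:
    """Persist working memory so the compute process can be terminated."""
    store[session_id] = json.dumps(working_memory)


def resume(store, session_id: str) -> dict:
    """Reload working memory into a fresh process once the wait is over."""
    return json.loads(store[session_id])
```

Nothing is held in process memory between the two calls, which is what lets the waiting period cost storage rather than compute.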
Structural hallucination
A confidently wrong result produced by a syntactically valid operation, most often a model-written query that executes cleanly but encodes a guessed business rule. Invisible to schema validation because the output is well-formed; prevented by a semantic layer (Chapter 11, Chapter 14).
Structured friction
Deliberate interface design that forces cognitive engagement before a high-stakes action (showing the execution payload and risk score, and requiring explicit acknowledgments) so that a human approval gate resists rubber-stamping (Chapter 13).
System prompt
The instructions the agent receives at the beginning of a session. Expresses preferences and orientation; does not enforce constraints. The architectural commitment is that constraints live outside the prompt. Relying on a system prompt to enforce limits or policy is the anti-pattern called prompt-based bounding (Chapter 5) or prompt-based governance (Chapter 6, Chapter 11). It is a structural mistake, because the model can ignore the prompt.
Text-to-SQL
The anti-pattern of giving a model a raw database schema and asking it to write queries directly, which forces it to guess deterministic business rules. The usual source of structural hallucination; the architectural answer is a semantic layer (Chapter 14).
Time budget
The wall-clock limit for an agentic run (Chapter 5). One of the six axes of bounded autonomy.
Tombstone (eviction)
An explicit deletion record issued by the ingestion pipeline when a source document changes, removing every chunk of the prior version rather than relying on content-keyed upsert, which leaves orphaned fragments behind (Chapter 8).
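The difference from content-keyed upsert can be sketched with an index keyed by document, version, and chunk position. The `reingest` function and the key layout are assumptions for illustration.

```python
def reingest(index: dict, doc_id: str, version: int, chunks: list) -> list:
    """Tombstone every chunk of the prior version, then insert the new chunks."""
    tombstoned = [key for key in index if key[0] == doc_id]
    for key in tombstoned:
        del index[key]
    for position, text in enumerate(chunks):
        index[(doc_id, version, position)] = text
    return tombstoned  # keys removed, suitable for an audit record
```

If the new version has fewer chunks than the old one, an upsert keyed on content or position would leave the surplus chunks in place; deleting by document identifier first removes them.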
Tool
A function the agent can invoke to act on the environment. The action surface (Chapter 5) is composed of the tools available to the agent.
Tool injection
A failure mode (Chapter 11) in which a tool’s response contains content that, when read by the agent, alters its behavior against the user’s interest. Tool injection is the spark that ignites the lethal trifecta cascade: combined with sensitive data access and external action capability, an injected tool response becomes a data-exfiltration incident.
Trace
The structured record of every event in an agentic session (Chapter 12). System of record for replay, audit, and regression testing.
Trace progressive disclosure
The glass-layer mapping of granular backend trace events to a small set of human-legible milestones, with detail available on demand, avoiding both the opaque spinner and the debug-log firehose (Chapter 13).
Validator
See Schema validator or Output validator. The deterministic enforcement of correctness at the boundary between agent and world.
Working memory
Task-scoped state used during reasoning (Chapter 7). Lifecycle bounded to the task; not persisted by default.
Maintenance
This glossary is maintained continuously with the rest of the book. Terms that appear in chapters but not in this glossary are either standard usage or undefined; raise undefined terms as defects.