Introduction
In the first article of this series, we defined what AI agents are and when to use them. Now it is time to dive into the core mechanisms that govern them. Three fundamental concepts form the backbone of every modern agent system: the OODA loop for decision structure, the ReAct pattern for reasoning-action alternation, and tool calling for interaction with the external world.
These are not abstract academic concepts: they are the implementation patterns that every developer must master to build effective, reliable, and debuggable agents. LangGraph, CrewAI, and AutoGen all implement them internally, but understanding the theoretical foundations is what distinguishes a developer who uses a framework from one who understands its architecture and can solve problems when they arise.
What You Will Learn in This Article
- The OODA loop (Observe-Orient-Decide-Act) and its application to AI systems
- The ReAct pattern: how to alternate reasoning and action in a structured way
- The tool calling mechanism: from concept to implementation
- Comparison between ReAct, Chain-of-Thought, and pure Function Calling
- Advanced patterns: Tree-of-Thought and Plan-First
- Common mistakes to avoid when designing agents
The OODA Loop: Observe-Orient-Decide-Act
The OODA loop (Observe, Orient, Decide, Act) is a decision-making framework developed by Colonel John Boyd of the US Air Force in the 1970s, originally designed to model the decision-making process of fighter pilots in high-speed, high-uncertainty situations. Boyd observed that the pilots who won dogfights were not necessarily those with the fastest aircraft, but those who completed their decision cycle more rapidly than their adversaries.
This model has proven extraordinarily versatile: from military strategy to business, from cybersecurity to artificial intelligence. Applied to agentic systems, the OODA loop provides the structural framework for understanding how an AI agent processes information and makes decisions.
The Four Phases of the OODA Loop
1. Observe
The observation phase consists of gathering data from the environment. For an AI agent, the sources of observation include:
- User input: the message, question, or instruction received
- Tool results: data returned by calls to external tools (search results, API data, file contents)
- Memory state: the context of previous interactions, information stored in long-term memory
- Environment feedback: error messages, HTTP status codes, exceptions, system metrics
- Current task state: which sub-goals have been completed, which remain to be executed
The quality of the observation phase determines the quality of the entire rest of the cycle. An agent that does not collect sufficient data or that ignores important signals (such as an HTTP 429 rate-limit status code) will produce suboptimal decisions.
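To make the observation phase concrete, here is a minimal, illustrative helper (the function name and signal labels are our own, not from any framework) that maps environment feedback such as HTTP status codes to signals the orientation phase can reason about:

```python
# Hypothetical helper: classify raw environment feedback into signals
# the agent's orientation phase can act on. Labels are illustrative.
def classify_feedback(status_code: int) -> str:
    """Map an HTTP status code to an agent-level observation signal."""
    if status_code == 429:
        return "rate_limited"       # back off before retrying the tool
    if status_code in (401, 403):
        return "auth_error"         # escalate: retrying will not help
    if 500 <= status_code < 600:
        return "transient_error"    # safe to retry with backoff
    if 200 <= status_code < 300:
        return "ok"
    return "unexpected"
```

An agent that surfaces `rate_limited` as a distinct signal can decide to wait and retry, instead of treating every non-200 response as a generic failure.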
2. Orient
Boyd considered this the most critical phase of the entire loop. Orientation consists of contextualizing and interpreting the data collected during observation. Having the data is not enough: you need to understand what the data means in the current context.
For an AI agent, orientation encompasses:
- Analysis: decomposing received data into meaningful components
- Pattern matching: recognizing situations similar to those already encountered
- Prioritization: establishing what is urgent, what is important, what can wait
- Context evaluation: how the new data fits into the overall picture of the task
- Information gap identification: what information is missing to make an informed decision
In practice, this phase corresponds to the LLM's reasoning: the model analyzes the complete context (system prompt, conversation history, tool results) and formulates an understanding of the current situation. The quality of the system prompt is fundamental here: a prompt that guides the model to explicitly consider the context, evaluate alternatives, and identify risks produces a much more effective orientation.
3. Decide
Based on the orientation, the agent selects the action to take. The decision may be:
- Invoke a tool: call an external function to obtain data or execute an operation
- Generate a response: produce the final output for the user
- Request additional input: ask the user for clarifications
- Reformulate the plan: abandon the current approach and try an alternative strategy
- Terminate: conclude that the objective has been achieved (or that it is not achievable with the available resources)
4. Act
The agent executes the chosen decision. This phase transforms intention into concrete action: the tool is invoked, the response is generated, the clarification request is formulated. After the action, the loop resumes from the observation phase: the agent collects the results of the action and begins a new cycle.
The OODA Loop Applied to an AI Agent
| OODA Phase | AI Agent | Technical Component |
|---|---|---|
| Observe | Gather user input, tool results, memory state | Input processing, context assembly |
| Orient | Analyze context, identify gaps, prioritize | LLM reasoning, system prompt guidance |
| Decide | Choose: tool call, final response, or request input | LLM output parsing, action selection |
| Act | Execute tool, generate output, modify state | Tool execution, response generation |
Cycle Time: The Key Metric
Boyd insisted that competitive advantage lies not in the quality of a single decision, but in the speed at which the entire cycle is completed. The same principle applies to AI agents: an agent that rapidly completes multiple OODA cycles, adapting at each iteration, is more effective than an agent that tries to achieve perfection in a single long cycle.
In practical terms, this means designing agents with atomic and fast tools rather than monolithic and slow ones. A tool that executes a single fast query and returns results in milliseconds allows the agent to iterate rapidly; a tool that aggregates data from 10 different sources and takes 30 seconds blocks the entire cycle.
OODA in Practice: Designing for Speed
When building AI agents, treat cycle time as a primary design constraint. Each tool should follow the Single Responsibility Principle: do one thing, do it fast, and return structured results. This allows the agent to chain multiple rapid observations rather than waiting for one slow, all-encompassing operation.
As a rule of thumb: if a tool takes more than five seconds to return, split it into smaller, composable tools. The agent's reasoning engine is smart enough to chain them together, and the overall system becomes more resilient and easier to debug.
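As an illustration of this principle, the sketch below (data and names invented for the example) exposes two atomic, single-purpose tools that an agent can chain across fast OODA cycles, instead of one slow tool that aggregates everything:

```python
# Hypothetical in-memory data standing in for real backends.
SALES = {"emea": 1200, "apac": 950}
TOP_PRODUCT = {"emea": "widget-a", "apac": "widget-b"}

def get_sales_total(region: str) -> dict:
    """Atomic tool: one fast lookup, structured result."""
    return {"region": region, "total": SALES[region]}

def get_top_product(region: str) -> dict:
    """Atomic tool: single responsibility, easy to debug in isolation."""
    return {"region": region, "top": TOP_PRODUCT[region]}

# The agent chains atomic calls across cycles instead of waiting on
# one monolithic aggregate_everything() tool.
```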
The ReAct Pattern: Reasoning + Acting
The ReAct (Reasoning + Acting) pattern, introduced in 2022 by researchers at Princeton and Google in the paper "ReAct: Synergizing Reasoning and Acting in Language Models", represents a fundamental breakthrough in AI agent design. Before ReAct, two separate approaches existed: Chain-of-Thought (CoT), which improved the model's reasoning capabilities but did not include external actions, and action planning systems, which executed actions but without transparent reasoning.
ReAct unifies these two approaches into a single structured loop where the model explicitly alternates between thinking phases and action phases:
Thought: I need to find global EV sales data for 2025.
    I will search for the most recent aggregated data first.
Action: web_search("global electric vehicle sales 2025 data")
Observation: Results show that 18.2 million EVs were sold
    globally in 2025, a 25% increase over 2024...

Thought: I have the aggregated data. Now I need the breakdown
    by manufacturer to identify market leaders.
Action: web_search("top EV manufacturers market share 2025")
Observation: BYD leads with 23% market share, followed by Tesla
    at 18%, Hyundai-Kia at 9%...

Thought: I have enough data for the first part of the report.
    I can now synthesize the collected information
    into a structured format for the user.
Action: generate_response(structured report with data and trends)
The Structure of the ReAct Loop
The ReAct loop follows a rigorous cyclic pattern with three repeating components:
- Thought: the model explicitly articulates its reasoning. What does it know so far? What is it missing? Which tool should it use and why? The thought is visible in the agent's trace, making the behavior transparent and debuggable.
- Action: based on the thought, the model selects a tool to invoke with specific parameters. The action is concrete and verifiable: you can check which tool was called, with which arguments, and whether the call succeeded.
- Observation: the result of the action is returned to the model. The observation becomes part of the context for the next thought, creating a continuous feedback loop that guides the reasoning toward the objective.
ReAct Pseudocode Implementation
def react_loop(user_query, tools, max_iterations=10):
    context = [system_prompt, user_query]
    for i in range(max_iterations):
        # The model generates Thought + Action
        response = llm.generate(context)
        # Parse the response
        thought = extract_thought(response)
        action = extract_action(response)
        # If the action is "final_answer", terminate the loop
        if action.name == "final_answer":
            return action.arguments["answer"]
        # Execute the tool
        try:
            observation = execute_tool(action.name, action.arguments)
        except ToolError as e:
            observation = f"Error: {str(e)}"
        # Add thought, action, and observation to context
        context.append(f"Thought: {thought}")
        context.append(f"Action: {action.name}({action.arguments})")
        context.append(f"Observation: {observation}")
    # Iteration limit reached
    return "Unable to complete the task within the iteration limit"
A Complete ReAct Implementation in Python
Let us build a more realistic ReAct loop that integrates with an actual LLM API and demonstrates proper tool registration, execution, and error handling:
import json
from typing import Callable
from openai import OpenAI

class Tool:
    """Represents a callable tool for the agent."""
    def __init__(self, name: str, description: str,
                 parameters: dict, func: Callable):
        self.name = name
        self.description = description
        self.parameters = parameters
        self.func = func

    def execute(self, **kwargs) -> str:
        try:
            result = self.func(**kwargs)
            return json.dumps(result, indent=2)
        except Exception as e:
            return json.dumps({"error": str(e)})

    def to_schema(self) -> dict:
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters
            }
        }

class ReActAgent:
    """A ReAct agent that alternates reasoning and action."""
    def __init__(self, model: str = "gpt-4o",
                 max_iterations: int = 10):
        self.client = OpenAI()
        self.model = model
        self.max_iterations = max_iterations
        self.tools: dict[str, Tool] = {}

    def register_tool(self, tool: Tool):
        self.tools[tool.name] = tool

    def run(self, user_query: str) -> str:
        messages = [
            {"role": "system", "content": self._system_prompt()},
            {"role": "user", "content": user_query}
        ]
        for iteration in range(self.max_iterations):
            # Call the LLM with tool definitions
            response = self.client.chat.completions.create(
                model=self.model,
                messages=messages,
                tools=[t.to_schema() for t in self.tools.values()],
                tool_choice="auto"
            )
            message = response.choices[0].message
            # If there are no tool calls, we have the final answer
            if not message.tool_calls:
                return message.content
            # Process each tool call
            messages.append(message)
            for tool_call in message.tool_calls:
                tool_name = tool_call.function.name
                arguments = json.loads(tool_call.function.arguments)
                # Execute the tool
                if tool_name in self.tools:
                    result = self.tools[tool_name].execute(**arguments)
                else:
                    result = json.dumps({
                        "error": f"Unknown tool: {tool_name}"
                    })
                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": result
                })
        return "Maximum iterations reached without resolution."

    def _system_prompt(self) -> str:
        return """You are a helpful research assistant.
Think step by step before taking action.
Use tools when you need real-world data.
Always verify information from multiple sources.
When you have enough information, provide a clear answer."""
Why ReAct Works Better Than Reasoning Alone
The fundamental advantage of ReAct over pure Chain-of-Thought is that the model is not forced to reason solely based on its internal knowledge (which may be outdated, incomplete, or incorrect). Through actions, the model can acquire fresh information from the real world and use it to refine its reasoning.
The concrete benefits of the ReAct pattern include:
- Reasoning transparency: every logical step is visible in the trace, facilitating debugging and post-mortem analysis when the agent produces unexpected results
- Grounding in facts: observations from tools anchor the reasoning to real data, reducing the model's hallucinations and confabulations
- Adaptability: the agent can change strategy in real time based on action results, without being bound to a rigid plan
- Superior performance: empirical studies demonstrate that ReAct outperforms both CoT alone and action planning alone on reasoning and decision-making benchmarks
Tool Calling: The Bridge Between AI and the Real World
Tool calling (also called "function calling" by OpenAI) is the mechanism through which an AI agent concretely interacts with the external world. Without tool calling, an LLM can only generate text; with tool calling, it can perform web searches, query databases, create files, send emails, call APIs, and much more.
The mechanism works in three fundamental steps:
- The model generates a structured tool call request: instead of producing free-form text, the model generates a JSON object specifying which tool to invoke and with which parameters. The model has been trained to produce this format when it determines that an external action is necessary.
- The runtime executes the corresponding function: the framework (LangGraph, CrewAI, or a custom runtime) receives the request, validates the parameters, executes the function, and captures the result.
- The result is returned to the model: the tool's output is added to the conversation context as an "observation," and the model can continue reasoning with the new information.
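The runtime's side of this handshake (step 2) can be sketched in a few lines. The registry format and call shape below are illustrative, not any specific framework's API:

```python
# Minimal sketch of the runtime dispatch step: validate a model-generated
# tool call against a registry, execute it, and package the observation.
import json

# Hypothetical registry: each entry holds the callable and its
# required parameter names.
TOOL_REGISTRY = {
    "add": {"func": lambda a, b: a + b, "required": ["a", "b"]},
}

def dispatch(tool_call: dict) -> str:
    """Validate and execute a tool call; return the observation as JSON."""
    name = tool_call["tool"]
    args = tool_call.get("arguments", {})
    spec = TOOL_REGISTRY.get(name)
    if spec is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    missing = [p for p in spec["required"] if p not in args]
    if missing:
        return json.dumps({"error": f"missing arguments: {missing}"})
    result = spec["func"](**args)
    return json.dumps({"result": result})
```

Note that validation failures are returned as observations rather than raised: feeding the error back to the model lets it correct the call on the next cycle.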
Tool Definition: The Schema
Every tool must be defined with a clear schema that the model can understand. The schema includes three fundamental elements: the tool name, a natural language description (which the LLM uses to understand when to use it), and the parameter specification (which the LLM uses to understand how to use it).
{
  "name": "search_database",
  "description": "Search records in the customer database. Use this tool when the user asks for information about specific customers, orders, or historical data. Supports searching by name, email, or order ID.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The search term (customer name, email, or order ID)"
      },
      "filter_type": {
        "type": "string",
        "enum": ["name", "email", "order_id"],
        "description": "The type of field to filter on"
      },
      "limit": {
        "type": "integer",
        "default": 10,
        "description": "Maximum number of results to return"
      }
    },
    "required": ["query", "filter_type"]
  }
}
Tool Call Flow
Let us examine the complete flow of a tool call, from the moment the model decides to use a tool until the result is returned:
1. The user asks: "Find the orders for customer Mario Rossi"
        |
        v
2. The LLM analyzes the request and the available tools
        |
        v
3. The LLM generates a structured tool call:
   {
     "tool": "search_database",
     "arguments": {
       "query": "Mario Rossi",
       "filter_type": "name",
       "limit": 5
     }
   }
        |
        v
4. The runtime:
   a) Validates arguments against the tool schema
   b) Executes search_database("Mario Rossi", "name", 5)
   c) Captures the result
        |
        v
5. The result is returned to the LLM:
   {
     "results": [
       {"order_id": "ORD-2024-1542", "date": "2024-03-15", "total": 249.99},
       {"order_id": "ORD-2024-1897", "date": "2024-06-22", "total": 89.50}
     ],
     "total_count": 2
   }
        |
        v
6. The LLM generates the final response for the user:
   "I found 2 orders for Mario Rossi:
    - ORD-2024-1542 from 03/15/2024: EUR 249.99
    - ORD-2024-1897 from 06/22/2024: EUR 89.50"
The Tool Description: The Most Important Element
The tool description is the most critical element of the entire tool calling mechanism, yet it is often overlooked. The LLM does not execute code: it reads the natural language description and decides whether and when to invoke the tool based solely on that description.
An effective description must answer three questions:
- What does the tool do? A clear, concise explanation of the functionality
- When to use it? The specific use cases where the tool is appropriate (and implicitly, when NOT to use it)
- How to use it? Information about parameters, constraints, expected formats
Example: Good vs Bad Description
Bad description: "Search the database."
Too vague. The LLM does not know which database, what type of data it contains, when it is appropriate to use this tool vs. others available, or what format the query should take.
Good description: "Search records in the corporate customer database. Use this tool when the user asks for information about customers, orders, or invoices. Supports searching by customer name (partial or complete), email, order ID (format ORD-YYYY-NNNN), or invoice number. Returns a maximum of 50 results sorted by date descending."
Specific, contextual, and includes information about formats and limits that help the LLM generate correct parameters.
Building Tools with Python Decorators
In modern frameworks, tools are typically defined using decorators that automatically extract the schema from type hints and docstrings. Here is a practical example using the pattern adopted by LangChain:
from typing import Literal
from langchain_core.tools import tool
from pydantic import BaseModel, Field

class SearchInput(BaseModel):
    """Input schema for the customer search tool."""
    query: str = Field(description="Search term: name, email, or order ID")
    # Literal constrains the value and is emitted as an enum in the schema
    filter_type: Literal["name", "email", "order_id"] = Field(
        description="Field to filter on"
    )
    max_results: int = Field(
        default=10,
        description="Maximum number of results (1-50)"
    )

@tool("search_customers", args_schema=SearchInput)
def search_customers(query: str, filter_type: str,
                     max_results: int = 10) -> dict:
    """Search records in the customer database.

    Use this tool when the user asks about specific customers,
    their orders, or historical purchase data. Supports partial
    name matching, exact email lookup, and order ID search
    (format: ORD-YYYY-NNNN).
    """
    # Database query logic here ("db" is an assumed client)
    results = db.customers.find(
        {filter_type: {"$regex": query, "$options": "i"}},
        limit=max_results
    )
    return {
        "results": [r.to_dict() for r in results],
        "total_count": len(results),
        "query_metadata": {
            "filter": filter_type,
            "term": query
        }
    }

# Registering multiple tools with an agent
# (calculate_metrics and send_notification are defined elsewhere)
tools = [search_customers, calculate_metrics, send_notification]
agent = create_react_agent(llm, tools)
The Complete Agent Loop: OODA + ReAct + Tool Calling
The three concepts presented so far are not alternatives but rather complementary and interconnected. OODA provides the high-level decision framework, ReAct implements the reasoning-action pattern within that framework, and tool calling is the technical mechanism that makes actions possible.
How the Three Concepts Integrate
| Level | Concept | Role |
|---|---|---|
| Strategic | OODA Loop | Decision framework: how to structure the reasoning and action cycle |
| Tactical | ReAct Pattern | Implementation pattern: how to alternate thought and action explicitly |
| Operational | Tool Calling | Technical mechanism: how the agent concretely executes external actions |
In a LangGraph agent, for example, the state graph implements the OODA loop (each node corresponds to a phase), the reasoning nodes follow the ReAct pattern (Thought-Action-Observation), and actions materialize through tool calling (the model generates tool call JSON, the runtime executes them). Understanding all three levels is essential for designing agents that are simultaneously effective, debuggable, and maintainable.
The Unified Agent Cycle in Code
Here is a complete implementation that explicitly maps OODA phases to the ReAct loop with tool calling, showing how the three concepts work together:
class OODAReActAgent:
    """Agent that explicitly maps OODA phases to ReAct steps."""
    def __init__(self, llm, tools: list[Tool], max_cycles: int = 15):
        self.llm = llm
        self.tools = {t.name: t for t in tools}
        self.max_cycles = max_cycles
        self.trace = []  # Full execution trace

    def run(self, task: str) -> dict:
        state = {
            "task": task,
            "observations": [],
            "actions_taken": [],
            "current_plan": None,
            "completed": False
        }
        for cycle in range(self.max_cycles):
            # === OBSERVE (OODA Phase 1) ===
            context = self._assemble_context(state)

            # === ORIENT (OODA Phase 2) ===
            # The LLM analyzes context and produces a Thought
            thought = self._reason(context)
            self.trace.append({"phase": "orient", "thought": thought})

            # === DECIDE (OODA Phase 3) ===
            decision = self._decide(thought, context)
            self.trace.append({"phase": "decide", "decision": decision})
            if decision["type"] == "final_answer":
                state["completed"] = True
                return {
                    "answer": decision["content"],
                    "cycles": cycle + 1,
                    "trace": self.trace
                }

            # === ACT (OODA Phase 4) ===
            tool_name = decision["tool"]
            tool_args = decision["arguments"]
            observation = self._execute_tool(tool_name, tool_args)
            self.trace.append({
                "phase": "act",
                "tool": tool_name,
                "result_preview": observation[:200]
            })
            # Feed the observation back into state for the next OODA cycle
            state["observations"].append(observation)
            state["actions_taken"].append({
                "tool": tool_name,
                "args": tool_args
            })
        return {
            "answer": "Task could not be completed within cycle limit.",
            "cycles": self.max_cycles,
            "trace": self.trace
        }

    def _execute_tool(self, name: str, args: dict) -> str:
        if name not in self.tools:
            return f"Error: tool '{name}' not found"
        try:
            return self.tools[name].execute(**args)
        except Exception as e:
            return f"Tool execution error: {str(e)}"
Chain of Thought vs ReAct: Comparison
To consolidate the understanding of the patterns presented, let us compare them in a table that highlights their structural differences, advantages, and limitations:
Reasoning Pattern Comparison Table
| Feature | Chain-of-Thought (CoT) | Pure Function Calling | ReAct |
|---|---|---|---|
| Explicit reasoning | Yes (step-by-step) | No (implicit) | Yes (visible Thought) |
| External actions | No | Yes | Yes |
| Access to fresh data | No (training data only) | Yes (via tool) | Yes (via tool) |
| Debuggability | High (visible trace) | Medium (tool I/O only) | Very high (thought + action) |
| Hallucination risk | High (no grounding) | Low (real data) | Low (grounding + reasoning) |
| Token cost | Medium | Low | High (thought + action + obs) |
| Ideal use case | Mathematical reasoning, logic | Simple tasks with APIs | Complex multi-step tasks |
| Adopted by | Advanced prompt engineering | OpenAI Function Calling | LangGraph, modern agents |
The choice of pattern depends on the use case. For purely cognitive tasks (mathematical reasoning, logical analysis), Chain-of-Thought is sufficient. For simple tasks that require a single external action (calling an API, querying a database), pure function calling is efficient and cost-effective. For complex tasks that require iterative reasoning with multiple actions, ReAct is the gold standard.
When CoT Fails: The Hallucination Trap
A critical failure mode of pure Chain-of-Thought is that the model may reason flawlessly about incorrect premises. If the model's training data contains outdated information, it will produce logically sound but factually wrong conclusions. ReAct mitigates this by grounding each reasoning step in real-world observations from tools.
For example, a CoT model asked about current stock prices might reason perfectly but use prices from its training data. A ReAct agent would invoke a stock price API, obtain real-time data, and reason about actual current values.
Planning Strategies: Plan-Then-Execute vs Interleaved
Beyond the fundamental patterns, research in AI agents has produced more sophisticated approaches to improve reasoning quality and planning capabilities. Two patterns deserve particular attention.
Tree-of-Thought (ToT)
Tree-of-Thought, proposed in 2023, extends Chain-of-Thought from a linear chain to a tree structure. Instead of following a single reasoning path, the model explores multiple alternatives in parallel, evaluates each branch, and selects the most promising one.
In practice, this means the agent:
- Generates several possible strategies for solving the problem
- Evaluates each strategy with a feasibility and success probability score
- Explores the most promising branch while keeping others as fallbacks
- If the chosen branch fails, backtracks and tries an alternative
class TreeOfThoughtPlanner:
    """Implements Tree-of-Thought for multi-path exploration."""
    def __init__(self, llm, breadth: int = 3, depth: int = 5):
        self.llm = llm
        self.breadth = breadth  # Number of branches per node
        self.depth = depth      # Maximum tree depth

    def solve(self, problem: str) -> str:
        root = {"problem": problem, "thoughts": [], "score": 0}
        best_solution = self._explore(root, depth=0)
        return best_solution

    def _explore(self, node: dict, depth: int) -> str:
        if depth >= self.depth:
            return self._evaluate_leaf(node)
        # Generate multiple candidate thoughts
        candidates = self._generate_thoughts(node, self.breadth)
        # Score each candidate
        scored = []
        for candidate in candidates:
            score = self._evaluate_thought(candidate, node)
            scored.append((candidate, score))
        # Sort by score descending, explore the best branch first
        scored.sort(key=lambda x: x[1], reverse=True)
        for candidate, score in scored:
            child = {
                "problem": node["problem"],
                "thoughts": node["thoughts"] + [candidate],
                "score": score
            }
            result = self._explore(child, depth + 1)
            if self._is_solution(result):
                return result
        return None  # Backtrack

    def _generate_thoughts(self, node: dict, n: int) -> list:
        prompt = f"""Given the problem: {node['problem']}
And the reasoning so far: {node['thoughts']}
Generate {n} different possible next steps."""
        response = self.llm.generate(prompt)
        return self._parse_thoughts(response)
ToT is particularly effective for problems with non-obvious solutions (puzzles, strategic planning, decisions with complex trade-offs), but it has a high computational cost because it requires multiple generations for each node of the tree.
Plan-First (Anticipatory Planning)
The Plan-First approach inverts the logic of traditional ReAct. Instead of alternating thought and action incrementally, the agent formulates a complete plan before executing any action, then executes the plan step by step, adapting it if necessary.
The Plan-First flow follows this structure:
[PHASE 1: PLANNING]
Input: "Create a report on the team's Q4 2025 performance"
Plan generated by the agent:
Step 1: Retrieve Q4 2025 completed sprint data
Step 2: Calculate average velocity and trends
Step 3: Identify the most frequent blockers
Step 4: Generate charts for key metrics
Step 5: Synthesize the report in a structured format
[PHASE 2: EXECUTION]
Execute the plan step-by-step, with the ability to:
- Re-plan if a step fails
- Add intermediate steps if necessary
- Skip steps that are no longer relevant
class PlanFirstAgent:
    """Agent that plans before executing."""
    def __init__(self, llm, tools: list):
        self.llm = llm
        self.tools = tools

    def run(self, task: str) -> str:
        # Phase 1: Generate the plan
        plan = self._create_plan(task)
        results = []
        # Phase 2: Execute step by step. An index-based loop is used so
        # the plan can be revised mid-execution without breaking iteration.
        i = 0
        while i < len(plan):
            result = self._execute_step(plan[i])
            if result.get("error"):
                # Re-plan the remaining steps if one fails
                remaining = plan[i + 1:]
                plan = plan[:i + 1] + self._replan(
                    task, results, plan[i], result["error"], remaining
                )
            else:
                results.append(result)
            i += 1
        # Synthesize the final answer from the step results
        return self._synthesize(task, results)

    def _create_plan(self, task: str) -> list[dict]:
        prompt = f"""Create a step-by-step plan for: {task}
Each step should specify:
- action: the tool to use
- purpose: why this step is needed
- dependencies: which previous steps it depends on
Return as a JSON array."""
        response = self.llm.generate(prompt)
        return json.loads(response)

    def _replan(self, task, completed, failed_step,
                error, remaining) -> list[dict]:
        prompt = f"""The plan for '{task}' failed at step:
{json.dumps(failed_step)}
Error: {error}
Completed steps: {json.dumps(completed)}
Remaining steps: {json.dumps(remaining)}
Create a revised plan for the remaining work."""
        response = self.llm.generate(prompt)
        return json.loads(response)
Plan-First is advantageous for long and complex tasks where an overview is fundamental. The risk is that the initial plan may be based on incorrect assumptions that only emerge during execution, requiring costly re-planning. For this reason, modern implementations combine Plan-First with ReAct's flexibility: plan at the beginning, but maintain the ability to adapt the plan at each iteration.
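One way to combine the two approaches is a simple heuristic that decides when execution should stop following the current plan and trigger a fresh planning pass. The function and thresholds below are illustrative assumptions, not a standard recipe:

```python
# Illustrative heuristic: when should a Plan-First agent fall back to
# re-planning? Thresholds are assumptions chosen for the example.
def should_replan(failed_steps: int, total_steps: int,
                  consecutive_failures: int) -> bool:
    """Re-plan when failures suggest the plan's assumptions were wrong."""
    # Two failures in a row usually mean the approach, not the step, is bad
    if consecutive_failures >= 2:
        return True
    # A high overall failure rate also invalidates the plan
    return total_steps > 0 and failed_steps / total_steps > 0.3
```

Tuning these thresholds trades re-planning cost (extra LLM calls) against the cost of persisting with a plan built on stale assumptions.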
Practical Patterns: Observer, Reflection, and Self-Correction
Beyond the core reasoning frameworks, several production-grade patterns have emerged that significantly improve agent reliability. These patterns address common failure modes and make agents more robust in real-world scenarios.
The Observer Pattern
The Observer pattern adds a monitoring layer that watches the agent's behavior and intervenes when anomalies are detected. This is analogous to a supervisor who does not perform the work but ensures quality and safety.
class AgentObserver:
    """Monitors agent behavior for anomalies."""
    def __init__(self, max_cost: float = 5.0,
                 max_tool_calls: int = 50,
                 forbidden_actions: list[str] = None):
        self.total_cost = 0.0
        self.tool_call_count = 0
        self.max_cost = max_cost
        self.max_tool_calls = max_tool_calls
        self.forbidden_actions = forbidden_actions or []
        self.action_history = []

    def check_action(self, action: dict) -> dict:
        """Validate an action before execution."""
        # Check forbidden actions
        if action["tool"] in self.forbidden_actions:
            return {
                "allowed": False,
                "reason": f"Tool '{action['tool']}' is forbidden"
            }
        # Check cost budget
        estimated_cost = self._estimate_cost(action)
        if self.total_cost + estimated_cost > self.max_cost:
            return {
                "allowed": False,
                "reason": f"Cost budget exceeded: "
                          f"{self.total_cost + estimated_cost:.2f} "
                          f"over limit {self.max_cost:.2f}"
            }