The AI Agent node transforms n8n from an automation tool into an intelligent system that reasons, decides, and acts. Unlike a simple LLM call that processes one prompt and returns one response, an AI agent observes its environment, evaluates available tools, decides which actions to take, executes those actions, and repeats until it achieves its goal.
This distinction matters. A basic LLM node might answer “What’s the weather in Tokyo?” with a generic response or hallucinated data. An AI agent connected to a weather API tool will recognize it needs real-time data, call the weather service, and return accurate current conditions. The agent doesn’t just generate text. It solves problems.
Why Agents Change Everything
Traditional automation requires you to define every step explicitly. If-then logic handles known scenarios, but unexpected inputs break workflows. AI agents introduce adaptive decision-making. You define the goal and available tools. The agent figures out the path.
This power comes with complexity. Agents can call tools in unexpected orders, get stuck in loops, consume excessive API tokens, or produce inconsistent results. Building production-ready agents requires understanding how they work, not just connecting nodes and hoping for the best.
What You’ll Learn
- When to use the AI Agent node versus simpler LLM calls
- How the agent reasoning loop works under the hood
- Connecting and configuring tools that extend agent capabilities
- Memory options including Vector Store Memory for semantic recall
- Output parsers for structured JSON responses
- Embeddings configuration for RAG and vector search
- Prompt engineering techniques specific to agents
- Error handling patterns that prevent runaway costs
- Agent architecture patterns: routing, orchestrator, sequential, and ReAct
- Debugging techniques when agents misbehave
- Real-world examples with complete configurations
When to Use the AI Agent Node
Not every AI workflow needs an agent. Sometimes a direct LLM call is faster, cheaper, and more predictable. Use this table to decide:
| Scenario | Use AI Agent? | Recommended Approach |
|---|---|---|
| Simple text generation | No | Basic LLM node for direct prompts |
| Single API call with LLM formatting | No | HTTP Request + LLM node chain |
| Multi-step task requiring decisions | Yes | Agent can choose which steps to take |
| Dynamic tool selection based on input | Yes | Agent decides which tools to use |
| Conversation with context memory | Yes | Agent with memory maintains state |
| Fixed sequential processing | No | Standard workflow with explicit nodes |
| Research requiring multiple sources | Yes | Agent can query and synthesize |
| Classification with static categories | No | LLM with structured output |
Rule of thumb: If the task requires the AI to decide what to do next based on intermediate results, use an agent. If the steps are predictable and fixed, use a standard workflow.
Agent vs Basic LLM Node
The Basic LLM node (and the OpenAI/Anthropic nodes in “message” mode) processes a single prompt and returns a response. These nodes cannot call external tools, remember previous messages, or make decisions about what to do next.
The AI Agent node operates in a loop:
- Receives input and evaluates the situation
- Decides which tool (if any) would help
- Calls the selected tool and observes the result
- Decides if the goal is achieved or if more actions are needed
- Repeats until done or reaches a limit
This loop is what makes agents powerful and potentially expensive. Each iteration may involve multiple LLM calls and tool invocations.
Understanding AI Agents in n8n
An AI agent is not magic. It’s a structured program that uses an LLM as its reasoning engine. Understanding the components helps you build agents that work reliably.
The Agent Architecture
Every n8n AI Agent consists of four key components:
1. Chat Model (Required). The LLM that powers reasoning: OpenAI, Anthropic, Google, or any compatible model provider. This is the “brain” that interprets inputs, decides on actions, and generates responses.
2. Tools (Required). External capabilities the agent can invoke. Without tools, an agent is just an expensive chatbot. Tools let agents search the web, query databases, call APIs, execute code, and interact with the real world.
3. Memory (Optional). Storage for conversation history and context. Without memory, each agent invocation starts fresh with no knowledge of previous interactions. Memory enables multi-turn conversations and context-aware responses.
4. System Prompt. Instructions that define the agent’s behavior, personality, constraints, and goals. A well-crafted system prompt is the difference between a helpful assistant and an unpredictable liability.
The Agent Loop Explained
When you trigger an AI Agent node, here’s what happens internally:
1. RECEIVE INPUT
User message or trigger data arrives
2. THINK
LLM evaluates: "What should I do with this input?"
Considers available tools and their descriptions
3. DECIDE
LLM chooses: call a tool, respond directly, or give up
Returns structured output indicating the action
4. ACT
If tool selected: n8n executes the tool and captures result
Tool output becomes new context for next iteration
5. OBSERVE
LLM receives tool result
Evaluates: "Did this solve the problem?"
6. REPEAT OR RESPOND
If goal achieved: generate final response
If not: return to step 2 with new information
If stuck: error or fallback response
This loop can repeat multiple times per request. A research agent might search the web, read several pages, synthesize findings, and verify accuracy before responding. Each step costs API tokens and time.
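Conceptually, the loop boils down to a few lines of control flow. The JavaScript sketch below is an illustration only: callLLM and runTool are hypothetical stand-ins for the LLM request and tool execution that n8n and LangChain handle internally, not actual n8n APIs.
// Hypothetical sketch of the agent loop. callLLM and runTool stand in for
// the LLM request and tool execution that n8n/LangChain perform internally.
async function agentLoop(userInput, tools, maxIterations = 5) {
  const messages = [{ role: 'user', content: userInput }];
  for (let i = 0; i < maxIterations; i++) {
    const decision = await callLLM(messages, tools); // THINK + DECIDE
    if (decision.type === 'final_answer') {
      return decision.content; // goal achieved: respond
    }
    const result = await runTool(decision.tool, decision.args); // ACT
    // OBSERVE: the tool result becomes context for the next iteration
    messages.push({ role: 'tool', name: decision.tool, content: result });
  }
  throw new Error('Reached max iterations without a final answer'); // stuck
}
The maxIterations guard is the part most tutorials omit, and it is exactly what prevents the runaway-cost loops discussed later in this guide.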
LangChain Under the Hood
n8n’s AI Agent node is built on LangChain, a popular framework for building LLM applications. Understanding this helps when debugging or reading n8n logs:
- Tools Agent: The default (and only) agent type since n8n v1.82.0. Uses the LLM’s native function/tool calling capabilities.
- Tool Calling: Modern LLMs support structured tool definitions. The agent sends tool schemas to the LLM, which returns structured JSON indicating which tool to call and with what parameters (a representative exchange is sketched after this list). See OpenAI’s function calling documentation for how this works under the hood.
- Callbacks: LangChain provides hooks into the agent’s reasoning process, visible in n8n’s execution logs.
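As a rough illustration of that exchange (exact field names vary by provider), the agent advertises a schema for each tool, and the model replies with a structured call instead of prose:
// Tool schema the agent sends to the LLM (simplified; fields vary by provider)
{
  "name": "get_weather",
  "description": "Get current weather conditions for a city",
  "parameters": {
    "type": "object",
    "properties": { "city": { "type": "string" } },
    "required": ["city"]
  }
}

// Structured response the LLM returns when it decides to use the tool
{ "name": "get_weather", "arguments": { "city": "Tokyo" } }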
For the complete technical reference, see the official n8n AI Agent node documentation.
Setting Up Your First AI Agent
Let’s build a functional agent step by step. We’ll create a simple assistant that can answer questions and perform web searches.
Step 1: Add the AI Agent Node
- Open your n8n workflow
- Click + to add a node
- Search for “AI Agent”
- Select AI Agent from the results
You’ll see a node with multiple connection points along its bottom edge. These are inputs for the required and optional components.
Step 2: Connect a Chat Model
The agent needs an LLM to think. Click the + under “Chat Model” and choose your provider:
For OpenAI:
- Select “OpenAI Chat Model”
- Create or select your OpenAI credentials
- Choose a model that supports function calling (check OpenAI’s current recommendations for agents)
- Leave temperature at 0.7 for balanced creativity
For Anthropic:
- Select “Anthropic Chat Model”
- Create or select your Anthropic credentials
- Choose a model with tool use capabilities
For self-hosted models:
- Select “Ollama Chat Model” or compatible provider
- Configure the endpoint URL
- Select your locally running model
Step 3: Add at Least One Tool
An agent without tools is just a chatbot. Click the + under “Tool” to add capabilities:
For a basic setup, add the Calculator tool:
- Click + under Tool
- Search for “Calculator”
- Select “Calculator” tool
- No configuration needed
Now your agent can perform math operations when questions require calculations.
Step 4: Configure the System Prompt
Click on the AI Agent node to open its settings. The System Message field defines your agent’s behavior:
You are a helpful assistant that answers questions accurately and concisely.
When asked math questions, use the calculator tool to compute exact answers.
When you don't know something, say so honestly rather than guessing.
Keep responses under 200 words unless the user asks for detail.
This prompt establishes:
- The agent’s role and tone
- When to use available tools
- Boundaries and constraints
- Response format expectations
Step 5: Add a Trigger and Test
Connect a trigger node to start your agent:
- Add a Chat Trigger node for interactive testing
- Connect it to the AI Agent node
- Click “Chat” in the bottom panel
- Ask: “What is 847 * 293?”
The agent should recognize this requires calculation, invoke the Calculator tool, and return the exact answer (248,171).
Common Setup Mistakes
Mistake 1: No tools connected
Error: Agent requires at least one tool
Solution: Add at least one tool sub-node to the agent
Mistake 2: Wrong credential type
Error: Invalid API key or authentication failed
Solution: Verify credentials match the selected chat model provider
Mistake 3: Model doesn’t support tool calling
Error: Tool calling not supported
Solution: Use a model that supports function calling (check your provider's documentation)
Tools: Giving Your Agent Capabilities
Tools are what make agents useful. Each tool extends what your agent can do beyond generating text. Choose tools based on what tasks your agent needs to accomplish.
Built-in Tool Nodes
n8n provides several ready-to-use tool nodes:
| Tool | Purpose | Use Case |
|---|---|---|
| Calculator | Math operations | Financial calculations, unit conversions |
| Code | Execute JavaScript/Python | Custom logic, data transformation |
| HTTP Request | Call any API | Integration with external services |
| Wikipedia | Search Wikipedia | General knowledge lookups |
| Wolfram Alpha | Computational knowledge | Scientific queries, data analysis |
| SerpAPI | Google search | Web research, current information |
| Workflow | Call n8n workflows | Delegate to specialized sub-workflows |
The Workflow Tool: Your Secret Weapon
The Workflow tool deserves special attention. It lets your agent call other n8n workflows as tools. This enables powerful patterns:
Pattern: Specialized Sub-Agents. Create separate workflows for specific tasks:
- A lookup-customer workflow queries your CRM
- A check-inventory workflow queries your database
- A send-notification workflow handles alerts
Your main agent can then call these as needed, keeping each workflow focused and maintainable.
Pattern: Permission Boundaries. Sensitive operations live in separate workflows with their own credentials and logging. The agent can trigger them but doesn’t have direct access to the credentials.
For setting up sub-workflows that your agent can call, see our guide on the Webhook node for creating callable endpoints.
Custom Tools with HTTP Request
The HTTP Request tool connects your agent to any API. Configure it to call your own services or third-party APIs:
- Add “HTTP Request Tool” under Tools
- Configure the request:
- Method: GET, POST, etc.
- URL: The API endpoint
- Authentication: Add credentials if needed
- Define the tool description clearly:
Search our product database.
Input: A search query string for products.
Output: JSON array of matching products with name, price, and availability.
The description tells the LLM when and how to use this tool. Be specific about inputs and outputs.
For details on configuring HTTP requests, see our HTTP Request node guide.
Tool Selection Best Practices
Start minimal. Begin with one or two tools. Add more only when needed. More tools mean more decisions for the agent, which can lead to confusion and increased costs.
Write clear descriptions. The agent decides which tool to use based on descriptions. Vague descriptions lead to poor tool selection:
Bad: "Use this for data"
Good: "Query the customer database by email address. Returns customer profile including name, purchase history, and account status."
Test tool boundaries. Verify your agent chooses the right tool for different inputs. Ask questions that should and shouldn’t trigger each tool.
Consider tool cost. Some tools are expensive (web search, premium APIs). Add usage limits or use cheaper alternatives when appropriate.
Memory: Maintaining Context Across Conversations
Without memory, every message to your agent starts a blank conversation. Memory lets agents remember previous interactions, enabling multi-turn conversations and contextual responses.
Memory Types Compared
| Memory Type | Persistence | Best For | Limitations |
|---|---|---|---|
| Simple Buffer Memory | Session only | Testing, single interactions | Lost when workflow restarts |
| Window Buffer Memory | Session only | Limited context conversations | Fixed message window |
| Postgres Chat Memory | Persistent | Production multi-turn chats | Requires database setup |
| Redis Chat Memory | Persistent | High-performance production | Requires Redis server |
| Motorhead Memory | Persistent | Managed memory service | External dependency |
| Vector Store Memory | Persistent | Semantic recall over long history | Requires embeddings and vector store setup |
Simple Buffer Memory
The easiest option for getting started:
- Click + under “Memory” on your AI Agent
- Select “Simple Memory”
- No configuration needed
Simple Memory stores conversation history in the n8n process’s own memory. Messages persist between executions for a given session, but everything is lost when n8n restarts.
Use for: Testing, single-session interactions, demonstrations.
Window Buffer Memory
Limits how many messages are remembered:
- Add “Window Buffer Memory”
- Set Context Window Length (e.g., 10 messages)
This prevents memory from growing infinitely. Older messages are dropped when the window fills. Useful when you want recent context without accumulating a long history.
Use for: Cost control, focused conversations, avoiding context overflow.
Postgres Chat Memory
For production applications requiring persistent memory:
- Add “Postgres Chat Memory”
- Connect your Postgres credentials
- Configure Session ID (unique per conversation)
// Dynamic session ID based on user
{{ $json.userId }}_{{ $json.conversationId }}
The session ID determines which conversation to load/save. Use a consistent ID per user or conversation to maintain continuity.
For Postgres setup guidance, see our Postgres setup guide.
Use for: Production chatbots, customer support, any application requiring conversation history.
Vector Store Memory
For agents that need semantic recall over long conversation histories or knowledge bases, Vector Store Memory provides intelligent context retrieval:
- Add “Vector Store Memory”
- Connect an Embeddings sub-node (required for converting text to vectors)
- Connect a Vector Store sub-node (Pinecone, Qdrant, Supabase, etc.)
- Configure Session ID for conversation separation
// Session ID for user-specific memory
{{ $json.userId }}_session
Unlike buffer-based memory that stores raw message history, Vector Store Memory converts messages into vector embeddings and stores them in a vector database. When retrieving context, it performs semantic search to find the most relevant past messages rather than returning the most recent ones.
Key difference from other memory types: Vector Store Memory retrieves contextually relevant messages, not just chronologically recent ones. A question about “pricing” will retrieve past pricing discussions even if they happened many messages ago.
When to use Vector Store Memory:
- Long-running conversations where recent messages aren’t always relevant
- Knowledge-base-style agents where semantic retrieval matters
- Applications requiring recall beyond typical context window limits
Requirements:
- Embeddings sub-node (see Embeddings section below)
- Vector Store sub-node (Pinecone, Qdrant, Supabase, Chroma, etc.)
- Additional infrastructure costs for vector database
When Memory Causes Problems
Memory isn’t always beneficial:
Context overflow: Very long conversations exceed the LLM’s context window, causing errors or truncation. Use Window Buffer to limit history.
Irrelevant context: Old messages about different topics confuse the agent. Consider clearing memory for new topics.
Cost multiplication: Every message in memory is sent to the LLM on each turn. Long histories increase token usage significantly.
Privacy concerns: Persistent memory stores user conversations. Ensure compliance with data retention policies.
Output Parsers: Structuring Agent Responses
When you need agents to return data in a specific format - JSON for API responses, structured data for downstream processing, or validated schemas for database insertion - output parsers enforce that structure.
Why Output Parsers Matter
LLMs generate free-form text by default. When your workflow needs structured data, you have two options: hope the LLM follows your prompt instructions, or enforce structure with an output parser. Parsers validate LLM output against a schema and reject malformed responses.
Important caveat: Output parsers work more reliably with LLM Chain nodes than with AI Agent nodes. Agents have complex reasoning loops that can interfere with structured output generation. If you need guaranteed structured output, consider routing the agent’s response through a separate LLM Chain with an output parser for the final formatting step.
Structured Output Parser
The most common parser for enforcing JSON structure. Two configuration methods:
Method 1: Generate from JSON Example. Provide an example of your desired output, and the parser infers the schema:
{
"customerName": "John Doe",
"orderTotal": 149.99,
"items": ["Widget A", "Widget B"],
"priority": "high"
}
This creates a schema expecting strings, numbers, and arrays that match your example’s structure.
Method 2: Define Using JSON Schema. For precise control, define an explicit JSON Schema:
{
"type": "object",
"properties": {
"customerName": { "type": "string" },
"orderTotal": { "type": "number" },
"items": {
"type": "array",
"items": { "type": "string" }
},
"priority": {
"type": "string",
"enum": ["low", "medium", "high"]
}
},
"required": ["customerName", "orderTotal"]
}
Schema limitations: n8n’s JSON Schema implementation doesn’t support $ref for referencing other schemas. Keep schemas self-contained.
Auto-fixing Output Parser
When LLM outputs almost match your schema but have minor formatting issues, the Auto-fixing Output Parser provides resilience:
- Add “Auto-fixing Output Parser”
- Connect it to a Structured Output Parser (the parser you want to fix)
- Connect both to a Chat Model
When parsing fails, this parser sends the malformed output back to the LLM with instructions to fix the formatting. This adds an extra LLM call but recovers from many common failures.
When to use Auto-fixing:
- Production systems where occasional parsing failures are unacceptable
- Complex schemas where LLMs sometimes make minor format errors
- Situations where retry logic is preferable to hard failures
When to avoid:
- Cost-sensitive applications (each fix attempt costs tokens)
- Simple schemas that rarely fail
- Time-critical applications (retry adds latency)
Item List Output Parser
For responses that should be lists rather than complex objects:
- Add “Item List Output Parser”
- Configure the item type (strings, objects, etc.)
- Connect to your LLM
Use this when you need the LLM to return a simple array of items - tags, keywords, categories, or simple recommendations.
Common Parser Issues
“Model output doesn’t fit required format” error:
- The LLM response doesn’t match the expected schema
- Check if your prompt clearly instructs the format
- Consider using Auto-fixing Output Parser
- Simplify the schema if it’s too complex for the model to follow
Parser not working reliably with agents:
- Agent reasoning steps interfere with structured output
- Solution: Use a separate LLM Chain node after the agent for final formatting
- Pass the agent’s raw response to an LLM Chain with an output parser
Nested object validation failing:
- Deep nesting increases failure probability
- Flatten schemas where possible
- Test with simpler schemas first, then add complexity
Best Practice: Separate Reasoning from Formatting
For reliable structured output from agent workflows:
[AI Agent] → [Set node: extract response] → [LLM Chain + Output Parser] → [Output]
The agent handles reasoning and tool use. A separate LLM Chain with an output parser handles the final formatting. This separation provides more reliable results than trying to get the agent to output structured data directly.
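The extract step can be a Set node or a short Code node. Here is a minimal Code node sketch, assuming the agent’s answer arrives in an output field (the actual field name depends on your agent configuration):
// Code node: pass only the agent's final text on to the formatting chain.
// Assumes the agent's answer is in an `output` field; adjust to your setup.
return $input.all().map(item => ({
  json: { text: item.json.output ?? '' }
}));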
Embeddings: Vector Representations for Semantic Search
Embeddings convert text into numerical vectors that capture semantic meaning. They’re essential for Vector Store Memory, RAG (Retrieval-Augmented Generation) workflows, and semantic search functionality.
When You Need Embeddings
Connect an Embeddings sub-node when using:
- Vector Store Memory - Requires embeddings to store and retrieve conversation context
- Vector Store nodes - For building searchable knowledge bases
- Document loaders with vector stores - Embedding documents for retrieval
Without embeddings, vector-based features won’t function. The embeddings node converts text chunks into vector representations that vector databases can index and search.
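To make “semantic search” concrete: vector stores rank results by similarity between embedding vectors, commonly cosine similarity. The toy JavaScript sketch below uses made-up three-dimensional vectors purely for illustration; real embeddings have hundreds or thousands of dimensions.
// Cosine similarity: how vector stores rank "closeness" between embeddings.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors (illustrative values only)
const pricingQuestion = [0.9, 0.1, 0.2];
const pricingAnswer   = [0.8, 0.2, 0.1];
const weatherChat     = [0.1, 0.9, 0.3];
console.log(cosineSimilarity(pricingQuestion, pricingAnswer)); // ~0.99, related
console.log(cosineSimilarity(pricingQuestion, weatherChat));   // ~0.27, unrelated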
Available Embedding Providers
| Provider | Node Name | Best For |
|---|---|---|
| OpenAI | Embeddings OpenAI | Most common, reliable, good documentation |
| Azure OpenAI | Embeddings Azure OpenAI | Enterprise compliance, Azure ecosystem |
| Cohere | Embeddings Cohere | Alternative provider, competitive pricing |
| Ollama | Embeddings Ollama | Self-hosted, local deployment, privacy |
| Google AI | Embeddings Google AI | Google Cloud ecosystem integration |
| AWS Bedrock | Embeddings AWS Bedrock | AWS ecosystem, enterprise deployments |
| Mistral | Embeddings Mistral | Open-source alternative, European hosting |
| HuggingFace | Embeddings HuggingFace | Custom models, open-source flexibility |
| Google Vertex AI | Embeddings Google Vertex AI | Enterprise Google Cloud deployments |
Configuring Embeddings
For most providers, configuration involves:
- Credentials: Add your API key for the embedding provider
- Model selection: Choose the embedding model (providers offer multiple options)
- Batch size: How many texts to embed per API call (default usually works)
- Timeout: Maximum wait time for embedding requests
OpenAI example setup:
- Add “Embeddings OpenAI” sub-node
- Select or create OpenAI credentials
- Choose your embedding model from the dropdown
- Leave batch size at default unless processing large volumes
Matching Dimensions to Vector Stores
Embedding models produce vectors of specific dimensions. Your vector store must be configured for the same dimension size:
| Model Type | Typical Dimensions |
|---|---|
| Small/efficient models | 384-512 |
| Standard models | 1536 |
| Large/advanced models | 3072+ |
Critical: If you change embedding models, you must re-embed all existing data. Different models produce incompatible vector spaces.
Sub-node Behavior Note
Embeddings nodes in n8n are “sub-nodes” that always resolve to the first item when multiple inputs arrive. If you need to embed multiple documents, process them in a loop or use batch operations within a single node rather than expecting the embeddings node to handle arrays automatically.
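One way to apply that loop advice is a Code node that fans an array of documents out into individual n8n items, so each document reaches the embedding step as its own item. The documents field name below is an assumption; match it to your data:
// Code node: fan out an array of documents into separate items, since
// sub-nodes only see the first item of a multi-item input.
// Assumes the incoming item has a `documents` array; adjust the field name.
const docs = $json.documents ?? [];
return docs.map(text => ({ json: { text } }));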
Cost Considerations
Embedding API calls are typically cheaper than LLM calls but can add up with large document sets:
- Initial indexing: One-time cost when building your knowledge base
- Query time: Small cost per search query
- Re-indexing: Full cost again if you change embedding models
For large knowledge bases, calculate embedding costs before committing to a provider.
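A back-of-the-envelope estimate is easy to script. The per-token price below is a placeholder, not a real rate; substitute your provider’s current pricing:
// Rough embedding-cost estimate. PRICE_PER_1K_TOKENS is a placeholder;
// check your provider's pricing page for the real number.
const PRICE_PER_1K_TOKENS = 0.0001; // hypothetical rate in USD
const documents = 50000;            // documents to index
const avgTokensPerDoc = 500;        // rough average length
const totalTokens = documents * avgTokensPerDoc;
const cost = (totalTokens / 1000) * PRICE_PER_1K_TOKENS;
console.log(`~${totalTokens.toLocaleString()} tokens ≈ $${cost.toFixed(2)}`);
// 25,000,000 tokens ≈ $2.50 at the placeholder rate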
Prompt Engineering for Agents
The system prompt shapes agent behavior more than any other configuration. A well-designed prompt produces consistent, helpful results. A poor prompt leads to erratic behavior.
Anatomy of an Effective Agent Prompt
Structure your system prompt with these sections:
ROLE AND IDENTITY
You are [role description]. Your purpose is [primary goal].
CAPABILITIES AND TOOLS
You have access to the following tools:
- [Tool 1]: Use for [specific purpose]
- [Tool 2]: Use for [specific purpose]
BEHAVIOR GUIDELINES
- [Specific instruction 1]
- [Specific instruction 2]
- [Constraint or boundary]
RESPONSE FORMAT
[How to structure outputs]
ERROR HANDLING
If [situation], then [action].
Example: Customer Support Agent Prompt
You are a customer support agent for TechStore, an electronics retailer.
CAPABILITIES:
- order_lookup: Search orders by order ID or customer email
- product_search: Find products in our catalog
- escalate_ticket: Create a support ticket for complex issues
GUIDELINES:
- Always verify customer identity before sharing order details
- For refund requests over $500, use escalate_ticket
- Never share internal pricing or margin information
- If a product is out of stock, suggest similar alternatives
- Keep responses friendly but professional
RESPONSE FORMAT:
- Start with acknowledging the customer's question
- Provide the answer or solution
- End with asking if there's anything else you can help with
WHEN UNSURE:
- If you cannot find an order, ask the customer to verify details
- If a question is outside your scope, explain and use escalate_ticket
Common Prompt Mistakes
Too vague:
Bad: "Help users with their questions."
Good: "Answer questions about order status, returns, and product availability."
No tool guidance:
Bad: "Use tools when needed."
Good: "Use order_lookup when asked about order status. Use product_search for inventory questions."
Missing constraints:
Bad: No limits specified
Good: "Do not process refunds directly. Create an escalation ticket instead."
Conflicting instructions:
Bad: "Be concise" + "Provide detailed explanations"
Good: "Keep responses under 100 words unless the user asks for details."
Dynamic Prompts with Expressions
Use n8n expressions to customize prompts based on context:
You are assisting {{ $json.customerName }}.
Their account type is {{ $json.accountType }}.
{{ $json.accountType === 'premium' ? 'Offer priority support options.' : '' }}
For expression syntax, see our n8n expressions guide.
Error Handling and Production Patterns
The most common complaint about AI agents: they work in demos but fail in production. The difference is error handling. Production agents need graceful degradation, cost controls, and monitoring.
The #1 Mistake: No Error Handling
Most agent tutorials skip error handling entirely. In production, this causes:
- Workflows crashing on API failures
- Runaway token costs from infinite loops
- Silent failures with no debugging information
- Poor user experience on edge cases
Enable Continue On Fail
For the AI Agent node and connected tools:
- Click the node
- Go to Settings (gear icon)
- Enable Continue On Fail
This prevents a single tool failure from crashing the entire workflow. The agent receives an error message and can decide how to proceed.
Configure Timeouts
Agents can get stuck thinking. Set reasonable timeouts:
- In AI Agent settings, look for timeout options
- For HTTP Request tools, set explicit timeouts
- Consider workflow-level timeouts for overall execution
For timeout configuration, see our timeout troubleshooting guide.
Token and Cost Limits
Prevent runaway costs with these patterns:
Max iterations: Some LLM providers support limiting reasoning steps. Check your chat model settings.
Token tracking: Add a Code node to track and log token usage per execution.
Budget alerts: Set up monitoring to alert when costs exceed thresholds.
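To implement the token-tracking tip, a minimal Code node sketch. Where usage data appears in the output, and under which field names, varies by provider and n8n version, so treat the tokenUsage field here as an assumption and adjust to what your execution logs actually show:
// Code node: log token usage from a previous LLM node's output.
// The `tokenUsage` field and its sub-fields are assumptions; check your
// own execution data for the real structure.
const usage = $json.tokenUsage ?? {};
const total = (usage.promptTokens ?? 0) + (usage.completionTokens ?? 0);
console.log(`Execution used approximately ${total} tokens`);
return [{ json: { ...$json, totalTokens: total } }];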
For API rate limiting strategies, see our rate limits guide.
Production Architecture: Separation of Concerns
Reddit users consistently point out that monolithic “super agents” fail in production. Instead, use this pattern:
Master Orchestrator Agent:
- Receives user input
- Determines which specialized agent to call
- Routes requests appropriately
- Handles fallbacks
Specialized Sub-Workflows:
- Each handles one domain (orders, products, support)
- Isolated error handling
- Independent testing and deployment
- Separate credentials and permissions
This pattern is more maintainable, easier to debug, and fails more gracefully than a single agent trying to do everything.
Fallback Responses
Always provide a fallback when agents fail:
// In a Code node after the agent
if ($json.error || !$json.response) {
return [{
json: {
response: "I'm having trouble processing your request. " +
"Please try again or contact support at [email protected]",
fallback: true
}
}];
}
return $input.all();
Agent Architecture Patterns
As agent workflows grow in complexity, the architecture you choose determines maintainability, reliability, and cost efficiency. Different patterns solve different problems. Choose based on your requirements, not just what seems cool.
Single Agent Pattern
The simplest architecture: one AI Agent node with multiple tools connected.
[Trigger] → [AI Agent with Tools A, B, C] → [Output]
When to use:
- Straightforward tasks with clear boundaries
- Prototyping and proof-of-concept
- Fewer than 5-6 tools needed
- Tasks where tool selection is usually obvious
Limitations:
- Tool overload confuses the agent (more tools = more decisions = more errors)
- Debugging is difficult when behavior is unexpected
- No separation of concerns
- Single point of failure
Best practice: Start here, but don’t stay here if complexity grows.
Routing Pattern (Classification-Based Branching)
Route requests to specialized handlers based on classification:
[Trigger] → [Text Classifier/LLM] → [Switch Node]
├→ [Agent A: Orders]
├→ [Agent B: Products]
└→ [Agent C: Support]
A classifier (simple LLM call or text classifier node) categorizes incoming requests. The Switch node routes to the appropriate specialized agent, each with focused tools and prompts.
When to use:
- Clear, distinct request categories
- Different expertise needed per category
- Teams owning different domains
- When you want deterministic routing
Benefits:
- Clear separation of concerns
- Each agent has focused tools and prompts
- Easier debugging (you know which path executed)
- Independent deployment per branch
Implementation tip: Use an LLM to classify into categories, not a keyword match. “I need to return my broken laptop” should route to Returns rather than being misrouted by the keywords “broken” or “laptop”.
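If the classifier returns a free-text label, normalize it before the Switch node so routing stays deterministic even when the LLM varies its casing or phrasing. A minimal Code node sketch (the category field and label set are illustrative):
// Code node: normalize the classifier's label before the Switch node.
// The `category` field and the label list are illustrative; match them to
// your classifier's actual output.
const raw = String($json.category ?? '').toLowerCase().trim();
const known = ['orders', 'products', 'support'];
const category = known.find(label => raw.includes(label)) ?? 'support'; // safe default route
return [{ json: { ...$json, category } }];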
Orchestrator Pattern (Hierarchical Multi-Agent)
A master agent delegates to specialist agents:
[Trigger] → [Master Orchestrator Agent]
├→ [Tool: Email Agent Workflow]
├→ [Tool: Calendar Agent Workflow]
└→ [Tool: Research Agent Workflow]
The master agent has Workflow tools that call other n8n workflows, each containing their own specialized agents. The master decides which specialist to invoke based on the task.
When to use:
- Complex tasks requiring multiple specializations
- When specialists need different tool sets
- Scalable systems that grow over time
- Enterprise applications with clear domain boundaries
Setup in n8n:
- Create specialized workflows with their own AI Agent nodes
- Add Workflow tools to your master agent
- Describe each workflow tool clearly so the master knows when to use it
Example specialist descriptions:
email_agent: Handle email-related tasks including composing,
searching inbox, and managing drafts. Use for any request
involving sending or reading emails.
calendar_agent: Manage calendar events including scheduling,
rescheduling, and checking availability. Use for any request
involving meetings or time slots.
research_agent: Perform web research and gather information
from multiple sources. Use when the user needs information
that requires searching and synthesizing data.
Benefits:
- Modular and scalable
- Each specialist can be developed and tested independently
- Clear ownership and responsibility boundaries
- Easier to add new capabilities
Sequential Chain Pattern
Agents process in stages, each handling one part of a pipeline:
[Trigger] → [Agent A: Research] → [Agent B: Analysis] → [Agent C: Report] → [Output]
Each agent receives the output of the previous agent, performs its specific task, and passes results forward.
When to use:
- Multi-stage processing pipelines
- Tasks with clear sequential phases
- When each phase requires different expertise
- Content generation workflows (research → outline → write → edit)
Example flow:
- Research Agent: Gathers raw information from web search, databases
- Analysis Agent: Processes and structures the raw data
- Report Agent: Generates formatted output from analyzed data
Context passing: Use n8n expressions to pass relevant context between agents:
// System prompt for Analysis Agent
You are analyzing research results from a previous step.
Research findings: {{ $json.researchOutput }}
Your task: Extract key insights and structure them for reporting.
Considerations:
- Each stage adds latency
- Errors compound through the chain
- Earlier stages need to output structured data for later stages to consume
ReAct Pattern (Reasoning with Reflection)
An agent that explicitly reasons through problems with observation-reflection cycles:
Input → [Think] → [Act] → [Observe] → [Reflect] → [Repeat or Respond]
The ReAct (Reason + Act) pattern encourages the agent to verbalize its reasoning before acting, observe results carefully, and reflect on whether the approach is working.
Implementing in n8n: Encode ReAct in your system prompt:
Before taking any action, follow this process:
THINK: Analyze the request and state your reasoning
- What is the user asking for?
- What information do I need?
- Which tool is most appropriate?
ACT: Execute one action based on your reasoning
- Use the appropriate tool
- Be specific about parameters
OBSERVE: Examine the tool's response
- Did I get the expected data?
- Is it complete and accurate?
REFLECT: Evaluate progress
- Does this answer the user's question?
- Do I need additional information?
- Should I try a different approach?
Continue this cycle until you can provide a complete answer.
When to use:
- Complex reasoning tasks
- Multi-step problem solving
- Tasks requiring careful verification
- When reliability is more important than speed
Trade-off: More reliable results but slower and more expensive (more tokens for reasoning).
Pattern Selection Guide
| Pattern | Complexity | Best For | Avoid When |
|---|---|---|---|
| Single Agent | Low | Simple tasks, few tools, prototypes | >5 tools, complex routing needs |
| Routing | Medium | Clear category boundaries, team ownership | Fuzzy or overlapping categories |
| Orchestrator | High | Many specialists, enterprise scale | Simple single-domain tasks |
| Sequential | Medium | Pipeline processing, staged workflows | Tasks requiring iteration/backtracking |
| ReAct | High | Complex reasoning, high-stakes accuracy | Speed is priority, simple tasks |
Combining Patterns
Real-world systems often combine patterns:
- Routing + Orchestrator: Route to domain-specific orchestrators
- Sequential + ReAct: Each sequential stage uses ReAct reasoning
- Orchestrator + Sequential: Master routes to sequential pipelines
Start simple. Add architectural complexity only when you’ve validated the simpler approach doesn’t meet requirements.
Real-World Examples
Example 1: FAQ Support Agent
An agent that answers questions using a knowledge base.
Setup:
- Chat Model: OpenAI (or any provider with function calling support)
- Tools: HTTP Request (queries your FAQ API)
- Memory: Postgres Chat Memory
System Prompt:
You are a support agent for SaaS Product X.
TOOL USAGE:
Use the faq_search tool for any product questions. Search with relevant keywords.
GUIDELINES:
- Always search the FAQ before answering product questions
- If FAQ doesn't have the answer, acknowledge this and offer to escalate
- For billing questions, direct users to [email protected]
- Never make up features or capabilities
RESPONSE FORMAT:
- Answer the question directly
- Include relevant FAQ article link if available
- Ask if they need more help
FAQ Search Tool Configuration:
Method: GET
URL: https://api.yoursite.com/faq/search
Query Parameters:
q: {{ $json.searchQuery }}
Tool Description:
Search the FAQ knowledge base. Input: search keywords.
Output: Array of matching FAQ articles with title, content, and URL.
Example 2: Research Assistant
An agent that gathers information from multiple sources.
Setup:
- Chat Model: Anthropic (or any provider with strong reasoning capabilities)
- Tools: SerpAPI (web search), HTTP Request (for specific sites)
- Memory: Window Buffer (10 messages)
System Prompt:
You are a research assistant that gathers and synthesizes information.
RESEARCH PROCESS:
1. Identify key search terms for the topic
2. Use web_search to find relevant sources
3. Gather information from multiple sources
4. Synthesize findings into a coherent summary
5. Cite sources in your response
QUALITY STANDARDS:
- Verify information across multiple sources when possible
- Distinguish between facts and opinions
- Note when information may be outdated
- Acknowledge limitations in available data
OUTPUT FORMAT:
- Executive summary (2-3 sentences)
- Key findings (bullet points)
- Sources consulted (with links)
Example 3: Data Processing Agent
An agent that processes and transforms data from various sources.
Setup:
- Chat Model: Any provider with function calling support
- Tools: Code (for transformations), Workflow (for database operations)
- Memory: None (stateless processing)
System Prompt:
You are a data processing agent that transforms and analyzes data.
AVAILABLE TOOLS:
- execute_code: Run JavaScript for data transformation
- query_database: Execute read-only database queries
- export_results: Save processed data to storage
PROCESSING RULES:
- Validate data format before processing
- Handle missing or null values gracefully
- Log transformation steps for debugging
- Never modify source data directly
ERROR HANDLING:
- On invalid data, return descriptive error with problem rows
- On query failure, suggest corrected query syntax
- On timeout, recommend smaller batch sizes
For custom logic in your data processing agents, see our Code node guide.
Debugging AI Agent Workflows
When agents misbehave, systematic debugging reveals the problem faster than guessing.
Check Execution Logs
Every agent execution logs its reasoning:
- Go to Executions in n8n
- Click on the failed or unexpected execution
- Click on the AI Agent node
- Expand the output to see:
- Input received
- LLM reasoning steps
- Tool calls made
- Final response
The logs show exactly what the agent “thought” at each step.
Common Failure Patterns
Agent ignores tools:
- Check tool descriptions are clear and specific
- Verify tools are properly connected
- Test if the LLM understands when to use the tool
Agent calls wrong tool:
- Tool descriptions may be ambiguous
- Add clearer differentiation in descriptions
- Consider reducing tool count
Agent loops infinitely:
- Tool returning unclear results
- Prompt doesn’t define completion criteria
- Add max iteration limits
Agent hallucinates tool results:
- Tool returning empty or error responses
- Agent not recognizing tool failures
- Add explicit error checking in prompt
Testing Strategies
Unit test each tool: Before connecting tools to agents, test them independently with known inputs.
Test edge cases: Empty inputs, invalid data, unusual formats. See how the agent handles unexpected situations.
Compare against baseline: Keep a simple version working. Compare complex agent behavior against the simple baseline.
Log everything: Add logging nodes to capture inputs, outputs, and intermediate states. Logs are essential for production debugging.
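A minimal pass-through logger as a Code node sketch. console.log output shows up in n8n’s execution data; swap in an HTTP call or database write for durable logs. The field names are assumptions that depend on your trigger and agent configuration:
// Code node: pass-through logger for agent inputs and outputs.
// `chatInput` and `output` are assumed field names; adjust to your workflow.
const entry = {
  timestamp: new Date().toISOString(),
  input: $json.chatInput ?? null,
  output: $json.output ?? null,
};
console.log(JSON.stringify(entry));
return $input.all(); // pass data through unchanged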
For comprehensive debugging, try our workflow debugger tool.
Pro Tips for Production Agents
1. Start Simple, Add Complexity
Build the simplest agent that could work. Add tools and capabilities only when testing reveals they’re needed. Every addition increases potential failure points.
2. Version Your Prompts
Store prompts in version control or a configuration system. When agent behavior changes unexpectedly, you can diff prompts to find what changed.
3. Monitor Costs
Track token usage per agent and per user. Set up alerts before costs become problems. Consider caching frequent queries.
4. Plan for Latency
Agent responses take time, sometimes 10-30 seconds for complex reasoning. Design UIs with loading states. Consider streaming responses if your platform supports it.
5. Document Tool Contracts
Each tool should have clear documentation:
- What inputs it accepts
- What outputs it returns
- Error conditions and responses
- Rate limits and costs
6. Use Staging Environments
Test agent changes in staging before production. Agent behavior can change subtly with prompt tweaks. Verify with real-world test cases.
For complex agent architectures and custom implementations, our workflow development services can accelerate your project. For architectural guidance, explore our n8n consulting services.
Frequently Asked Questions
How do I make my AI agent remember previous conversations?
Add a Memory node to your AI Agent. For testing, use Simple Memory which stores history in n8n’s memory during execution.
For production, use Postgres Chat Memory or Redis Chat Memory which persist conversations across sessions.
The key is the Session ID parameter. Use a consistent ID per user or conversation (like user_123_conv_456) so the agent loads the correct history. Without memory, every message starts a new conversation with no context.
Why does my agent keep calling the same tool repeatedly in a loop?
This happens when the tool returns results the agent doesn’t understand or can’t use to make progress.
Check three things:
- Verify the tool returns clear, actionable data
- Ensure your prompt explains how to interpret tool results and when the task is complete
- Add a maximum iteration limit in your chat model settings if available
The agent loop continues until the LLM decides the goal is achieved. If it never recognizes completion, it loops forever.
Can I use multiple AI agents in one workflow?
Yes, and this is often the best architecture for complex tasks.
Create separate AI Agent nodes for different responsibilities (one for classification, one for research, one for response generation). Connect them with standard n8n nodes to route data based on the first agent’s output.
You can also use the Workflow tool to let one agent call another agent’s workflow as a sub-routine. This separation of concerns makes debugging easier and failures more isolated.
How do I handle it when the LLM returns unexpected or malformed responses?
Enable Continue On Fail on your AI Agent node so errors don’t crash the workflow.
Add a Code node after the agent to validate the response structure. Check for required fields, reasonable values, and expected formats. If validation fails, route to a fallback response or retry logic.
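A minimal validation sketch for that Code node, assuming the agent’s answer arrives in an output field and downstream steps expect a non-empty string:
// Code node: validate the agent's response before downstream processing.
// Assumes the answer is in `output`; adjust the checks to your schema.
const response = $json.output;
if (typeof response !== 'string' || response.trim().length === 0) {
  return [{
    json: {
      response: 'Sorry, something went wrong. Please try again.',
      fallback: true,
    },
  }];
}
return $input.all();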
For structured outputs, use output parsers in your chat model configuration to enforce JSON schemas. Always have a graceful fallback message for when things go wrong.
What’s the best way to limit API costs when using AI agents?
Several strategies work together:
- Limit memory: Use Window Buffer Memory with a small window (5-10 messages) to reduce context sent to the LLM
- Right-size models: Choose the smallest model that handles your use case - often significantly cheaper
- Cache queries: Store responses to frequent identical questions
- Set token limits: Configure maximum tokens in your chat model settings
- Track usage: Add a Code node to log token usage per execution and set up cost alerts
Also consider whether simpler non-agent approaches could handle common cases without LLM reasoning overhead.