The AI Agent node transforms n8n from an automation tool into an intelligent system that reasons, decides, and acts. Unlike a simple LLM call that processes one prompt and returns one response, an AI agent observes its environment, evaluates available tools, decides which actions to take, executes those actions, and repeats until it achieves its goal.
This distinction matters. A basic LLM node might answer “What’s the weather in Tokyo?” with a generic response or hallucinated data. An AI agent connected to a weather API tool will recognize it needs real-time data, call the weather service, and return accurate current conditions. The agent doesn’t just generate text. It solves problems.
Why Agents Change Everything
Traditional automation requires you to define every step explicitly. If-then logic handles known scenarios, but unexpected inputs break workflows. AI agents introduce adaptive decision-making. You define the goal and available tools. The agent figures out the path.
This power comes with complexity. Agents can call tools in unexpected orders, get stuck in loops, consume excessive API tokens, or produce inconsistent results. Building production-ready agents requires understanding how they work, not just connecting nodes and hoping for the best.
What You’ll Learn
- When to use the AI Agent node versus simpler LLM calls
- How the agent reasoning loop works under the hood
- Connecting and configuring tools that extend agent capabilities
- Memory options including Vector Store Memory for semantic recall
- Output parsers for structured JSON responses
- Embeddings configuration for RAG and vector search
- Prompt engineering techniques specific to agents
- Error handling patterns that prevent runaway costs
- Agent architecture patterns: routing, orchestrator, sequential, and ReAct
- Debugging techniques when agents misbehave
- Real-world examples with complete configurations
When to Use the AI Agent Node
Not every AI workflow needs an agent. Sometimes a direct LLM call is faster, cheaper, and more predictable. Use this table to decide:
| Scenario | Use AI Agent? | Recommended Approach |
|---|---|---|
| Simple text generation | No | Basic LLM node for direct prompts |
| Single API call with LLM formatting | No | HTTP Request + LLM node chain |
| Multi-step task requiring decisions | Yes | Agent can choose which steps to take |
| Dynamic tool selection based on input | Yes | Agent decides which tools to use |
| Conversation with context memory | Yes | Agent with memory maintains state |
| Fixed sequential processing | No | Standard workflow with explicit nodes |
| Research requiring multiple sources | Yes | Agent can query and synthesize |
| Classification with static categories | No | LLM with structured output |
Rule of thumb: If the task requires the AI to decide what to do next based on intermediate results, use an agent. If the steps are predictable and fixed, use a standard workflow.
Agent vs Basic LLM Node
The Basic LLM node (and the OpenAI/Anthropic nodes in “message” mode) processes a single prompt and returns a response. These nodes cannot call external tools, remember previous messages, or make decisions about what to do next.
The AI Agent node operates in a loop:
- Receives input and evaluates the situation
- Decides which tool (if any) would help
- Calls the selected tool and observes the result
- Decides if the goal is achieved or if more actions are needed
- Repeats until done or reaches a limit
This loop is what makes agents powerful and potentially expensive. Each iteration may involve multiple LLM calls and tool invocations.
Understanding AI Agents in n8n
An AI agent is not magic. It’s a structured program that uses an LLM as its reasoning engine. Understanding the components helps you build agents that work reliably.
The Agent Architecture
Every n8n AI Agent consists of four key components:
1. Chat Model (Required). The LLM that powers reasoning: OpenAI, Anthropic, Google, or any compatible model provider. This is the “brain” that interprets inputs, decides on actions, and generates responses.
2. Tools (Required). External capabilities the agent can invoke. Without tools, an agent is just an expensive chatbot. Tools let agents search the web, query databases, call APIs, execute code, and interact with the real world.
3. Memory (Optional). Storage for conversation history and context. Without memory, each agent invocation starts fresh with no knowledge of previous interactions. Memory enables multi-turn conversations and context-aware responses.
4. System Prompt. Instructions that define the agent’s behavior, personality, constraints, and goals. A well-crafted system prompt is the difference between a helpful assistant and an unpredictable liability.
The Agent Loop Explained
When you trigger an AI Agent node, here’s what happens internally:
1. RECEIVE INPUT
User message or trigger data arrives
2. THINK
LLM evaluates: "What should I do with this input?"
Considers available tools and their descriptions
3. DECIDE
LLM chooses: call a tool, respond directly, or give up
Returns structured output indicating the action
4. ACT
If tool selected: n8n executes the tool and captures result
Tool output becomes new context for next iteration
5. OBSERVE
LLM receives tool result
Evaluates: "Did this solve the problem?"
6. REPEAT OR RESPOND
If goal achieved: generate final response
If not: return to step 2 with new information
If stuck: error or fallback response
This loop can repeat multiple times per request. A research agent might search the web, read several pages, synthesize findings, and verify accuracy before responding. Each step costs API tokens and time.
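Conceptually, the loop boils down to a few lines of control flow. The JavaScript sketch below is an illustration only: callLLM and runTool are hypothetical stand-ins for the LLM request and tool execution that n8n and LangChain handle internally, not actual n8n APIs.
// Hypothetical sketch of the agent loop. callLLM and runTool stand in for
// the LLM request and tool execution that n8n/LangChain perform internally.
async function agentLoop(userInput, tools, maxIterations = 5) {
  const messages = [{ role: 'user', content: userInput }];
  for (let i = 0; i < maxIterations; i++) {
    const decision = await callLLM(messages, tools); // THINK + DECIDE
    if (decision.type === 'final_answer') {
      return decision.content; // goal achieved: respond
    }
    const result = await runTool(decision.tool, decision.args); // ACT
    // OBSERVE: the tool result becomes context for the next iteration
    messages.push({ role: 'tool', name: decision.tool, content: result });
  }
  throw new Error('Reached max iterations without a final answer'); // stuck
}
The maxIterations guard is the part most tutorials omit, and it is exactly what prevents the runaway-cost loops discussed later in this guide.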
LangChain Under the Hood
n8n’s AI Agent node is built on LangChain, a popular framework for building LLM applications. Understanding this helps when debugging or reading n8n logs:
- Tools Agent: The default (and only) agent type since n8n v1.82.0. Uses the LLM’s native function/tool calling capabilities.
- Tool Calling: Modern LLMs support structured tool definitions. The agent sends tool schemas to the LLM, which returns structured JSON indicating which tool to call and with what parameters (a representative exchange is sketched after this list). See OpenAI’s function calling documentation for how this works under the hood.
- Callbacks: LangChain provides hooks into the agent’s reasoning process, visible in n8n’s execution logs.
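As a rough illustration of that exchange (exact field names vary by provider), the agent advertises a schema for each tool, and the model replies with a structured call instead of prose:
// Tool schema the agent sends to the LLM (simplified; fields vary by provider)
{
  "name": "get_weather",
  "description": "Get current weather conditions for a city",
  "parameters": {
    "type": "object",
    "properties": { "city": { "type": "string" } },
    "required": ["city"]
  }
}

// Structured response the LLM returns when it decides to use the tool
{ "name": "get_weather", "arguments": { "city": "Tokyo" } }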
For the complete technical reference, see the official n8n AI Agent node documentation.
Setting Up Your First AI Agent
Let’s build a functional agent step by step. We’ll create a simple assistant that can answer questions and perform web searches.
Step 1: Add the AI Agent Node
- Open your n8n workflow
- Click + to add a node
- Search for “AI Agent”
- Select AI Agent from the results
You’ll see a node with multiple connection points along its bottom edge. These are inputs for the required and optional components.
Step 2: Connect a Chat Model
The agent needs an LLM to think. Click the + under “Chat Model” and choose your provider:
For OpenAI:
- Select “OpenAI Chat Model”
- Create or select your OpenAI credentials
- Choose a model that supports function calling (check OpenAI’s current recommendations for agents)
- Leave temperature at 0.7 for balanced creativity
For Anthropic:
- Select “Anthropic Chat Model”
- Create or select your Anthropic credentials
- Choose a model with tool use capabilities
For self-hosted models:
- Select “Ollama Chat Model” or compatible provider
- Configure the endpoint URL
- Select your locally running model
Step 3: Add at Least One Tool
An agent without tools is just a chatbot. Click the + under “Tool” to add capabilities:
For a basic setup, add the Calculator tool:
- Click + under Tool
- Search for “Calculator”
- Select “Calculator” tool
- No configuration needed
Now your agent can perform math operations when questions require calculations.
Step 4: Configure the System Prompt
Click on the AI Agent node to open its settings. The System Message field defines your agent’s behavior:
You are a helpful assistant that answers questions accurately and concisely.
When asked math questions, use the calculator tool to compute exact answers.
When you don't know something, say so honestly rather than guessing.
Keep responses under 200 words unless the user asks for detail.
This prompt establishes:
- The agent’s role and tone
- When to use available tools
- Boundaries and constraints
- Response format expectations
Step 5: Add a Trigger and Test
Connect a trigger node to start your agent:
- Add a Chat Trigger node for interactive testing
- Connect it to the AI Agent node
- Click “Chat” in the bottom panel
- Ask: “What is 847 * 293?”
The agent should recognize this requires calculation, invoke the Calculator tool, and return the exact answer (248,171).
Common Setup Mistakes
Mistake 1: No tools connected
Error: Agent requires at least one tool
Solution: Add at least one tool sub-node to the agent
Mistake 2: Wrong credential type
Error: Invalid API key or authentication failed
Solution: Verify credentials match the selected chat model provider
Mistake 3: Model doesn’t support tool calling
Error: Tool calling not supported
Solution: Use a model that supports function calling (check your provider's documentation)
Tools: Giving Your Agent Capabilities
Tools are what make agents useful. Each tool extends what your agent can do beyond generating text. Choose tools based on what tasks your agent needs to accomplish.
Built-in Tool Nodes
n8n provides several ready-to-use tool nodes:
| Tool | Purpose | Use Case |
|---|---|---|
| Calculator | Math operations | Financial calculations, unit conversions |
| Code | Execute JavaScript/Python | Custom logic, data transformation |
| HTTP Request | Call any API | Integration with external services |
| Wikipedia | Search Wikipedia | General knowledge lookups |
| Wolfram Alpha | Computational knowledge | Scientific queries, data analysis |
| SerpAPI | Google search | Web research, current information |
| Workflow | Call n8n workflows | Delegate to specialized sub-workflows |
The Workflow Tool: Your Secret Weapon
The Workflow tool deserves special attention. It lets your agent call other n8n workflows as tools. This enables powerful patterns:
Pattern: Specialized Sub-Agents. Create separate workflows for specific tasks:
- A lookup-customer workflow queries your CRM
- A check-inventory workflow queries your database
- A send-notification workflow handles alerts
Your main agent can then call these as needed, keeping each workflow focused and maintainable.
Pattern: Permission Boundaries. Sensitive operations live in separate workflows with their own credentials and logging. The agent can trigger them but doesn’t have direct access to the credentials.
For setting up sub-workflows that your agent can call, see our guide on the Webhook node for creating callable endpoints.
Custom Tools with HTTP Request
The HTTP Request tool connects your agent to any API. Configure it to call your own services or third-party APIs:
- Add “HTTP Request Tool” under Tools
- Configure the request:
- Method: GET, POST, etc.
- URL: The API endpoint
- Authentication: Add credentials if needed
- Define the tool description clearly:
Search our product database.
Input: A search query string for products.
Output: JSON array of matching products with name, price, and availability.
The description tells the LLM when and how to use this tool. Be specific about inputs and outputs.
For details on configuring HTTP requests, see our HTTP Request node guide.
Tool Selection Best Practices
Start minimal. Begin with one or two tools. Add more only when needed. More tools mean more decisions for the agent, which can lead to confusion and increased costs.
Write clear descriptions. The agent decides which tool to use based on descriptions. Vague descriptions lead to poor tool selection:
Bad: "Use this for data"
Good: "Query the customer database by email address. Returns customer profile including name, purchase history, and account status."
Test tool boundaries. Verify your agent chooses the right tool for different inputs. Ask questions that should and shouldn’t trigger each tool.
Consider tool cost. Some tools are expensive (web search, premium APIs). Add usage limits or use cheaper alternatives when appropriate.
Memory: Maintaining Context Across Conversations
Without memory, every message to your agent starts a blank conversation. Memory lets agents remember previous interactions, enabling multi-turn conversations and contextual responses.
Memory Types Compared
| Memory Type | Persistence | Best For | Limitations |
|---|---|---|---|
| Simple Buffer Memory | Session only | Testing, single interactions | Lost when workflow restarts |
| Window Buffer Memory | Session only | Limited context conversations | Fixed message window |
| Postgres Chat Memory | Persistent | Production multi-turn chats | Requires database setup |
| Redis Chat Memory | Persistent | High-performance production | Requires Redis server |
| Motorhead Memory | Persistent | Managed memory service | External dependency |
| Vector Store Memory | Persistent | Semantic recall over long history | Requires embeddings and vector store setup |
Simple Buffer Memory
The easiest option for getting started:
- Click + under “Memory” on your AI Agent
- Select “Simple Memory”
- No configuration needed
Simple Memory stores conversation history in the n8n process’s own memory. Messages persist between executions for a given session, but everything is lost when n8n restarts.
Use for: Testing, single-session interactions, demonstrations.
Window Buffer Memory
Limits how many messages are remembered:
- Add “Window Buffer Memory”
- Set Context Window Length (e.g., 10 messages)
This prevents memory from growing infinitely. Older messages are dropped when the window fills. Useful when you want recent context without accumulating a long history.
Use for: Cost control, focused conversations, avoiding context overflow.
Postgres Chat Memory
For production applications requiring persistent memory:
- Add “Postgres Chat Memory”
- Connect your Postgres credentials
- Configure Session ID (unique per conversation)
// Dynamic session ID based on user
{{ $json.userId }}_{{ $json.conversationId }}
The session ID determines which conversation to load/save. Use a consistent ID per user or conversation to maintain continuity.
For Postgres setup guidance, see our Postgres setup guide.
Use for: Production chatbots, customer support, any application requiring conversation history.
Vector Store Memory
For agents that need semantic recall over long conversation histories or knowledge bases, Vector Store Memory provides intelligent context retrieval:
- Add “Vector Store Memory”
- Connect an Embeddings sub-node (required for converting text to vectors)
- Connect a Vector Store sub-node (Pinecone, Qdrant, Supabase, etc.)
- Configure Session ID for conversation separation
// Session ID for user-specific memory
{{ $json.userId }}_session
Unlike buffer-based memory that stores raw message history, Vector Store Memory converts messages into vector embeddings and stores them in a vector database. When retrieving context, it performs semantic search to find the most relevant past messages rather than returning the most recent ones.
Key difference from other memory types: Vector Store Memory retrieves contextually relevant messages, not just chronologically recent ones. A question about “pricing” will retrieve past pricing discussions even if they happened many messages ago.
When to use Vector Store Memory:
- Long-running conversations where recent messages aren’t always relevant
- Knowledge-base-style agents where semantic retrieval matters
- Applications requiring recall beyond typical context window limits
Requirements:
- Embeddings sub-node (see Embeddings section below)
- Vector Store sub-node (Pinecone, Qdrant, Supabase, Chroma, etc.)
- Additional infrastructure costs for vector database
When Memory Causes Problems
Memory isn’t always beneficial:
Context overflow: Very long conversations exceed the LLM’s context window, causing errors or truncation. Use Window Buffer to limit history.
Irrelevant context: Old messages about different topics confuse the agent. Consider clearing memory for new topics.
Cost multiplication: Every message in memory is sent to the LLM on each turn. Long histories increase token usage significantly.
Privacy concerns: Persistent memory stores user conversations. Ensure compliance with data retention policies.
Output Parsers: Structuring Agent Responses
When you need agents to return data in a specific format - JSON for API responses, structured data for downstream processing, or validated schemas for database insertion - output parsers enforce that structure.
Why Output Parsers Matter
LLMs generate free-form text by default. When your workflow needs structured data, you have two options: hope the LLM follows your prompt instructions, or enforce structure with an output parser. Parsers validate LLM output against a schema and reject malformed responses.
Important caveat: Output parsers work more reliably with LLM Chain nodes than with AI Agent nodes. Agents have complex reasoning loops that can interfere with structured output generation. If you need guaranteed structured output, consider routing the agent’s response through a separate LLM Chain with an output parser for the final formatting step.
Structured Output Parser
The most common parser for enforcing JSON structure. Two configuration methods:
Method 1: Generate from JSON Example. Provide an example of your desired output, and the parser infers the schema:
{
"customerName": "John Doe",
"orderTotal": 149.99,
"items": ["Widget A", "Widget B"],
"priority": "high"
}
This creates a schema expecting strings, numbers, and arrays that match your example’s structure.
Method 2: Define Using JSON Schema. For precise control, define an explicit JSON Schema:
{
"type": "object",
"properties": {
"customerName": { "type": "string" },
"orderTotal": { "type": "number" },
"items": {
"type": "array",
"items": { "type": "string" }
},
"priority": {
"type": "string",
"enum": ["low", "medium", "high"]
}
},
"required": ["customerName", "orderTotal"]
}
Schema limitations: n8n’s JSON Schema implementation doesn’t support $ref for referencing other schemas. Keep schemas self-contained.
Auto-fixing Output Parser
When LLM outputs almost match your schema but have minor formatting issues, the Auto-fixing Output Parser provides resilience:
- Add “Auto-fixing Output Parser”
- Connect it to a Structured Output Parser (the parser you want to fix)
- Connect both to a Chat Model
When parsing fails, this parser sends the malformed output back to the LLM with instructions to fix the formatting. This adds an extra LLM call but recovers from many common failures.
When to use Auto-fixing:
- Production systems where occasional parsing failures are unacceptable
- Complex schemas where LLMs sometimes make minor format errors
- Situations where retry logic is preferable to hard failures
When to avoid:
- Cost-sensitive applications (each fix attempt costs tokens)
- Simple schemas that rarely fail
- Time-critical applications (retry adds latency)
Item List Output Parser
For responses that should be lists rather than complex objects:
- Add “Item List Output Parser”
- Configure the item type (strings, objects, etc.)
- Connect to your LLM
Use this when you need the LLM to return a simple array of items - tags, keywords, categories, or simple recommendations.
Common Parser Issues
“Model output doesn’t fit required format” error:
- The LLM response doesn’t match the expected schema
- Check if your prompt clearly instructs the format
- Consider using Auto-fixing Output Parser
- Simplify the schema if it’s too complex for the model to follow
Parser not working reliably with agents:
- Agent reasoning steps interfere with structured output
- Solution: Use a separate LLM Chain node after the agent for final formatting
- Pass the agent’s raw response to an LLM Chain with an output parser
Nested object validation failing:
- Deep nesting increases failure probability
- Flatten schemas where possible
- Test with simpler schemas first, then add complexity
Best Practice: Separate Reasoning from Formatting
For reliable structured output from agent workflows:
[AI Agent] → [Set node: extract response] → [LLM Chain + Output Parser] → [Output]
The agent handles reasoning and tool use. A separate LLM Chain with an output parser handles the final formatting. This separation provides more reliable results than trying to get the agent to output structured data directly.
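The extract step can be a Set node or a short Code node. Here is a minimal Code node sketch, assuming the agent’s answer arrives in an output field (the actual field name depends on your agent configuration):
// Code node: pass only the agent's final text on to the formatting chain.
// Assumes the agent's answer is in an `output` field; adjust to your setup.
return $input.all().map(item => ({
  json: { text: item.json.output ?? '' }
}));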
Embeddings: Vector Representations for Semantic Search
Embeddings convert text into numerical vectors that capture semantic meaning. They’re essential for Vector Store Memory, RAG (Retrieval-Augmented Generation) workflows, and semantic search functionality.
When You Need Embeddings
Connect an Embeddings sub-node when using:
- Vector Store Memory - Requires embeddings to store and retrieve conversation context
- Vector Store nodes - For building searchable knowledge bases
- Document loaders with vector stores - Embedding documents for retrieval
Without embeddings, vector-based features won’t function. The embeddings node converts text chunks into vector representations that vector databases can index and search.
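To make “semantic search” concrete: vector stores rank results by similarity between embedding vectors, commonly cosine similarity. The toy JavaScript sketch below uses made-up three-dimensional vectors purely for illustration; real embeddings have hundreds or thousands of dimensions.
// Cosine similarity: how vector stores rank "closeness" between embeddings.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors (illustrative values only)
const pricingQuestion = [0.9, 0.1, 0.2];
const pricingAnswer   = [0.8, 0.2, 0.1];
const weatherChat     = [0.1, 0.9, 0.3];
console.log(cosineSimilarity(pricingQuestion, pricingAnswer)); // ~0.99, related
console.log(cosineSimilarity(pricingQuestion, weatherChat));   // ~0.27, unrelated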
Available Embedding Providers
| Provider | Node Name | Best For |
|---|---|---|
| OpenAI | Embeddings OpenAI | Most common, reliable, good documentation |
| Azure OpenAI | Embeddings Azure OpenAI | Enterprise compliance, Azure ecosystem |
| Cohere | Embeddings Cohere | Alternative provider, competitive pricing |
| Ollama | Embeddings Ollama | Self-hosted, local deployment, privacy |
| Google AI | Embeddings Google AI | Google Cloud ecosystem integration |
| AWS Bedrock | Embeddings AWS Bedrock | AWS ecosystem, enterprise deployments |
| Mistral | Embeddings Mistral | Open-source alternative, European hosting |
| HuggingFace | Embeddings HuggingFace | Custom models, open-source flexibility |
| Google Vertex AI | Embeddings Google Vertex AI | Enterprise Google Cloud deployments |
Configuring Embeddings
For most providers, configuration involves:
- Credentials: Add your API key for the embedding provider
- Model selection: Choose the embedding model (providers offer multiple options)
- Batch size: How many texts to embed per API call (default usually works)
- Timeout: Maximum wait time for embedding requests
OpenAI example setup:
- Add “Embeddings OpenAI” sub-node
- Select or create OpenAI credentials
- Choose your embedding model from the dropdown
- Leave batch size at default unless processing large volumes
Matching Dimensions to Vector Stores
Embedding models produce vectors of specific dimensions. Your vector store must be configured for the same dimension size:
| Model Type | Typical Dimensions |
|---|---|
| Small/efficient models | 384-512 |
| Standard models | 1536 |
| Large/advanced models | 3072+ |
Critical: If you change embedding models, you must re-embed all existing data. Different models produce incompatible vector spaces.
Sub-node Behavior Note
Embeddings nodes in n8n are “sub-nodes” that always resolve to the first item when multiple inputs arrive. If you need to embed multiple documents, process them in a loop or use batch operations within a single node rather than expecting the embeddings node to handle arrays automatically.
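One way to apply that loop advice is a Code node that fans an array of documents out into individual n8n items, so each document reaches the embedding step as its own item. The documents field name below is an assumption; match it to your data:
// Code node: fan out an array of documents into separate items, since
// sub-nodes only see the first item of a multi-item input.
// Assumes the incoming item has a `documents` array; adjust the field name.
const docs = $json.documents ?? [];
return docs.map(text => ({ json: { text } }));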
Cost Considerations
Embedding API calls are typically cheaper than LLM calls but can add up with large document sets:
- Initial indexing: One-time cost when building your knowledge base
- Query time: Small cost per search query
- Re-indexing: Full cost again if you change embedding models
For large knowledge bases, calculate embedding costs before committing to a provider.
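A back-of-the-envelope estimate is easy to script. The per-token price below is a placeholder, not a real rate; substitute your provider’s current pricing:
// Rough embedding-cost estimate. PRICE_PER_1K_TOKENS is a placeholder;
// check your provider's pricing page for the real number.
const PRICE_PER_1K_TOKENS = 0.0001; // hypothetical rate in USD
const documents = 50000;            // documents to index
const avgTokensPerDoc = 500;        // rough average length
const totalTokens = documents * avgTokensPerDoc;
const cost = (totalTokens / 1000) * PRICE_PER_1K_TOKENS;
console.log(`~${totalTokens.toLocaleString()} tokens ≈ $${cost.toFixed(2)}`);
// 25,000,000 tokens ≈ $2.50 at the placeholder rate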
Prompt Engineering for Agents
The system prompt shapes agent behavior more than any other configuration. A well-designed prompt produces consistent, helpful results. A poor prompt leads to erratic behavior.
Anatomy of an Effective Agent Prompt
Structure your system prompt with these sections:
ROLE AND IDENTITY
You are [role description]. Your purpose is [primary goal].
CAPABILITIES AND TOOLS
You have access to the following tools:
- [Tool 1]: Use for [specific purpose]
- [Tool 2]: Use for [specific purpose]
BEHAVIOR GUIDELINES
- [Specific instruction 1]
- [Specific instruction 2]
- [Constraint or boundary]
RESPONSE FORMAT
[How to structure outputs]
ERROR HANDLING
If [situation], then [action].
Example: Customer Support Agent Prompt
You are a customer support agent for TechStore, an electronics retailer.
CAPABILITIES:
- order_lookup: Search orders by order ID or customer email
- product_search: Find products in our catalog
- escalate_ticket: Create a support ticket for complex issues
GUIDELINES:
- Always verify customer identity before sharing order details
- For refund requests over $500, use escalate_ticket
- Never share internal pricing or margin information
- If a product is out of stock, suggest similar alternatives
- Keep responses friendly but professional
RESPONSE FORMAT:
- Start with acknowledging the customer's question
- Provide the answer or solution
- End with asking if there's anything else you can help with
WHEN UNSURE:
- If you cannot find an order, ask the customer to verify details
- If a question is outside your scope, explain and use escalate_ticket
Common Prompt Mistakes
Too vague:
Bad: "Help users with their questions."
Good: "Answer questions about order status, returns, and product availability."
No tool guidance:
Bad: "Use tools when needed."
Good: "Use order_lookup when asked about order status. Use product_search for inventory questions."
Missing constraints:
Bad: No limits specified
Good: "Do not process refunds directly. Create an escalation ticket instead."
Conflicting instructions:
Bad: "Be concise" + "Provide detailed explanations"
Good: "Keep responses under 100 words unless the user asks for details."
Dynamic Prompts with Expressions
Use n8n expressions to customize prompts based on context:
You are assisting {{ $json.customerName }}.
Their account type is {{ $json.accountType }}.
{{ $json.accountType === 'premium' ? 'Offer priority support options.' : '' }}
For expression syntax, see our n8n expressions guide.
Error Handling and Production Patterns
The most common complaint about AI agents: they work in demos but fail in production. The difference is error handling. Production agents need graceful degradation, cost controls, and monitoring.
The #1 Mistake: No Error Handling
Most agent tutorials skip error handling entirely. In production, this causes:
- Workflows crashing on API failures
- Runaway token costs from infinite loops
- Silent failures with no debugging information
- Poor user experience on edge cases
Enable Continue On Fail
For the AI Agent node and connected tools:
- Click the node
- Go to Settings (gear icon)
- Enable Continue On Fail
This prevents a single tool failure from crashing the entire workflow. The agent receives an error message and can decide how to proceed.
Configure Timeouts
Agents can get stuck thinking. Set reasonable timeouts:
- In AI Agent settings, look for timeout options
- For HTTP Request tools, set explicit timeouts
- Consider workflow-level timeouts for overall execution
For timeout configuration, see our timeout troubleshooting guide.
Token and Cost Limits
Prevent runaway costs with these patterns:
Max iterations: Some LLM providers support limiting reasoning steps. Check your chat model settings.
Token tracking: Add a Code node to track and log token usage per execution.
Budget alerts: Set up monitoring to alert when costs exceed thresholds.
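To implement the token-tracking tip, a minimal Code node sketch. Where usage data appears in the output, and under which field names, varies by provider and n8n version, so treat the tokenUsage field here as an assumption and adjust to what your execution logs actually show:
// Code node: log token usage from a previous LLM node's output.
// The `tokenUsage` field and its sub-fields are assumptions; check your
// own execution data for the real structure.
const usage = $json.tokenUsage ?? {};
const total = (usage.promptTokens ?? 0) + (usage.completionTokens ?? 0);
console.log(`Execution used approximately ${total} tokens`);
return [{ json: { ...$json, totalTokens: total } }];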
For API rate limiting strategies, see our rate limits guide.
Production Architecture: Separation of Concerns
Reddit users consistently point out that monolithic “super agents” fail in production. Instead, use this pattern:
Master Orchestrator Agent:
- Receives user input
- Determines which specialized agent to call
- Routes requests appropriately
- Handles fallbacks
Specialized Sub-Workflows:
- Each handles one domain (orders, products, support)
- Isolated error handling
- Independent testing and deployment
- Separate credentials and permissions
This pattern is more maintainable, easier to debug, and fails more gracefully than a single agent trying to do everything.
Fallback Responses
Always provide a fallback when agents fail:
// In a Code node after the agent
if ($json.error || !$json.response) {
return [{
json: {
response: "I'm having trouble processing your request. " +
"Please try again or contact support at [email protected]",
fallback: true
}
}];
}
return $input.all();
Agent Architecture Patterns
As agent workflows grow in complexity, the architecture you choose determines maintainability, reliability, and cost efficiency. Different patterns solve different problems. Choose based on your requirements, not just what seems cool.
Single Agent Pattern
The simplest architecture: one AI Agent node with multiple tools connected.
[Trigger] → [AI Agent with Tools A, B, C] → [Output]
When to use:
- Straightforward tasks with clear boundaries
- Prototyping and proof-of-concept
- Fewer than 5-6 tools needed
- Tasks where tool selection is usually obvious
Limitations:
- Tool overload confuses the agent (more tools = more decisions = more errors)
- Debugging is difficult when behavior is unexpected
- No separation of concerns
- Single point of failure
Best practice: Start here, but don’t stay here if complexity grows.
Routing Pattern (Classification-Based Branching)
Route requests to specialized handlers based on classification:
[Trigger] → [Text Classifier/LLM] → [Switch Node]
├→ [Agent A: Orders]
├→ [Agent B: Products]
└→ [Agent C: Support]
A classifier (simple LLM call or text classifier node) categorizes incoming requests. The Switch node routes to the appropriate specialized agent, each with focused tools and prompts.
When to use:
- Clear, distinct request categories
- Different expertise needed per category
- Teams owning different domains
- When you want deterministic routing
Benefits:
- Clear separation of concerns
- Each agent has focused tools and prompts
- Easier debugging (you know which path executed)
- Independent deployment per branch
Implementation tip: Use an LLM to classify into categories, not a keyword match. “I need to return my broken laptop” should route to Returns rather than being misrouted by the keywords “broken” or “laptop”.
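If the classifier returns a free-text label, normalize it before the Switch node so routing stays deterministic even when the LLM varies its casing or phrasing. A minimal Code node sketch (the category field and label set are illustrative):
// Code node: normalize the classifier's label before the Switch node.
// The `category` field and the label list are illustrative; match them to
// your classifier's actual output.
const raw = String($json.category ?? '').toLowerCase().trim();
const known = ['orders', 'products', 'support'];
const category = known.find(label => raw.includes(label)) ?? 'support'; // safe default route
return [{ json: { ...$json, category } }];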
Orchestrator Pattern (Hierarchical Multi-Agent)
A master agent delegates to specialist agents:
[Trigger] → [Master Orchestrator Agent]
├→ [Tool: Email Agent Workflow]
├→ [Tool: Calendar Agent Workflow]
└→ [Tool: Research Agent Workflow]
The master agent has Workflow tools that call other n8n workflows, each containing their own specialized agents. The master decides which specialist to invoke based on the task.
When to use:
- Complex tasks requiring multiple specializations
- When specialists need different tool sets
- Scalable systems that grow over time
- Enterprise applications with clear domain boundaries
Setup in n8n:
- Create specialized workflows with their own AI Agent nodes
- Add Workflow tools to your master agent
- Describe each workflow tool clearly so the master knows when to use it
Example specialist descriptions:
email_agent: Handle email-related tasks including composing,
searching inbox, and managing drafts. Use for any request
involving sending or reading emails.
calendar_agent: Manage calendar events including scheduling,
rescheduling, and checking availability. Use for any request
involving meetings or time slots.
research_agent: Perform web research and gather information
from multiple sources. Use when the user needs information
that requires searching and synthesizing data.
Benefits:
- Modular and scalable
- Each specialist can be developed and tested independently
- Clear ownership and responsibility boundaries
- Easier to add new capabilities
Sequential Chain Pattern
Agents process in stages, each handling one part of a pipeline:
[Trigger] → [Agent A: Research] → [Agent B: Analysis] → [Agent C: Report] → [Output]
Each agent receives the output of the previous agent, performs its specific task, and passes results forward.
When to use:
- Multi-stage processing pipelines
- Tasks with clear sequential phases
- When each phase requires different expertise
- Content generation workflows (research → outline → write → edit)
Example flow:
- Research Agent: Gathers raw information from web search, databases
- Analysis Agent: Processes and structures the raw data
- Report Agent: Generates formatted output from analyzed data
Context passing: Use n8n expressions to pass relevant context between agents:
// System prompt for Analysis Agent
You are analyzing research results from a previous step.
Research findings: {{ $json.researchOutput }}
Your task: Extract key insights and structure them for reporting.
Considerations:
- Each stage adds latency
- Errors compound through the chain
- Earlier stages need to output structured data for later stages to consume
ReAct Pattern (Reasoning with Reflection)
An agent that explicitly reasons through problems with observation-reflection cycles:
Input → [Think] → [Act] → [Observe] → [Reflect] → [Repeat or Respond]
The ReAct (Reason + Act) pattern encourages the agent to verbalize its reasoning before acting, observe results carefully, and reflect on whether the approach is working.
Implementing in n8n: Encode ReAct in your system prompt:
Before taking any action, follow this process:
THINK: Analyze the request and state your reasoning
- What is the user asking for?
- What information do I need?
- Which tool is most appropriate?
ACT: Execute one action based on your reasoning
- Use the appropriate tool
- Be specific about parameters
OBSERVE: Examine the tool's response
- Did I get the expected data?
- Is it complete and accurate?
REFLECT: Evaluate progress
- Does this answer the user's question?
- Do I need additional information?
- Should I try a different approach?
Continue this cycle until you can provide a complete answer.
When to use:
- Complex reasoning tasks
- Multi-step problem solving
- Tasks requiring careful verification
- When reliability is more important than speed
Trade-off: More reliable results but slower and more expensive (more tokens for reasoning).
Pattern Selection Guide
| Pattern | Complexity | Best For | Avoid When |
|---|---|---|---|
| Single Agent | Low | Simple tasks, few tools, prototypes | >5 tools, complex routing needs |
| Routing | Medium | Clear category boundaries, team ownership | Fuzzy or overlapping categories |
| Orchestrator | High | Many specialists, enterprise scale | Simple single-domain tasks |
| Sequential | Medium | Pipeline processing, staged workflows | Tasks requiring iteration/backtracking |
| ReAct | High | Complex reasoning, high-stakes accuracy | Speed is priority, simple tasks |
Combining Patterns
Real-world systems often combine patterns:
- Routing + Orchestrator: Route to domain-specific orchestrators
- Sequential + ReAct: Each sequential stage uses ReAct reasoning
- Orchestrator + Sequential: Master routes to sequential pipelines
Start simple. Add architectural complexity only when you’ve validated the simpler approach doesn’t meet requirements.
Real-World Examples
Example 1: FAQ Support Agent
An agent that answers questions using a knowledge base.
Setup:
- Chat Model: OpenAI (or any provider with function calling support)
- Tools: HTTP Request (queries your FAQ API)
- Memory: Postgres Chat Memory
System Prompt:
You are a support agent for SaaS Product X.
TOOL USAGE:
Use the faq_search tool for any product questions. Search with relevant keywords.
GUIDELINES:
- Always search the FAQ before answering product questions
- If FAQ doesn't have the answer, acknowledge this and offer to escalate
- For billing questions, direct users to [email protected]
- Never make up features or capabilities
RESPONSE FORMAT:
- Answer the question directly
- Include relevant FAQ article link if available
- Ask if they need more help
FAQ Search Tool Configuration:
Method: GET
URL: https://api.yoursite.com/faq/search
Query Parameters:
q: {{ $json.searchQuery }}
Tool Description:
Search the FAQ knowledge base. Input: search keywords.
Output: Array of matching FAQ articles with title, content, and URL.
Example 2: Research Assistant
An agent that gathers information from multiple sources.
Setup:
- Chat Model: Anthropic (or any provider with strong reasoning capabilities)
- Tools: SerpAPI (web search), HTTP Request (for specific sites)
- Memory: Window Buffer (10 messages)
System Prompt:
You are a research assistant that gathers and synthesizes information.
RESEARCH PROCESS:
1. Identify key search terms for the topic
2. Use web_search to find relevant sources
3. Gather information from multiple sources
4. Synthesize findings into a coherent summary
5. Cite sources in your response
QUALITY STANDARDS:
- Verify information across multiple sources when possible
- Distinguish between facts and opinions
- Note when information may be outdated
- Acknowledge limitations in available data
OUTPUT FORMAT:
- Executive summary (2-3 sentences)
- Key findings (bullet points)
- Sources consulted (with links)
Example 3: Data Processing Agent
An agent that processes and transforms data from various sources.
Setup:
- Chat Model: Any provider with function calling support
- Tools: Code (for transformations), Workflow (for database operations)
- Memory: None (stateless processing)
System Prompt:
You are a data processing agent that transforms and analyzes data.
AVAILABLE TOOLS:
- execute_code: Run JavaScript for data transformation
- query_database: Execute read-only database queries
- export_results: Save processed data to storage
PROCESSING RULES:
- Validate data format before processing
- Handle missing or null values gracefully
- Log transformation steps for debugging
- Never modify source data directly
ERROR HANDLING:
- On invalid data, return descriptive error with problem rows
- On query failure, suggest corrected query syntax
- On timeout, recommend smaller batch sizes
For custom logic in your data processing agents, see our Code node guide.
Debugging AI Agent Workflows
When agents misbehave, systematic debugging reveals the problem faster than guessing.
Check Execution Logs
Every agent execution logs its reasoning:
- Go to Executions in n8n
- Click on the failed or unexpected execution
- Click on the AI Agent node
- Expand the output to see:
- Input received
- LLM reasoning steps
- Tool calls made
- Final response
The logs show exactly what the agent “thought” at each step.
Common Failure Patterns
Agent ignores tools:
- Check tool descriptions are clear and specific
- Verify tools are properly connected
- Test if the LLM understands when to use the tool
Agent calls wrong tool:
- Tool descriptions may be ambiguous
- Add clearer differentiation in descriptions
- Consider reducing tool count
Agent loops infinitely:
- Tool returning unclear results
- Prompt doesn’t define completion criteria
- Add max iteration limits
Agent hallucinates tool results:
- Tool returning empty or error responses
- Agent not recognizing tool failures
- Add explicit error checking in prompt
Testing Strategies
Unit test each tool: Before connecting tools to agents, test them independently with known inputs.
Test edge cases: Empty inputs, invalid data, unusual formats. See how the agent handles unexpected situations.
Compare against baseline: Keep a simple version working. Compare complex agent behavior against the simple baseline.
Log everything: Add logging nodes to capture inputs, outputs, and intermediate states. Logs are essential for production debugging.
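A minimal pass-through logger as a Code node sketch. console.log output shows up in n8n’s execution data; swap in an HTTP call or database write for durable logs. The field names are assumptions that depend on your trigger and agent configuration:
// Code node: pass-through logger for agent inputs and outputs.
// `chatInput` and `output` are assumed field names; adjust to your workflow.
const entry = {
  timestamp: new Date().toISOString(),
  input: $json.chatInput ?? null,
  output: $json.output ?? null,
};
console.log(JSON.stringify(entry));
return $input.all(); // pass data through unchanged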
For comprehensive debugging, try our workflow debugger tool.
Pro Tips for Production Agents
1. Start Simple, Add Complexity
Build the simplest agent that could work. Add tools and capabilities only when testing reveals they’re needed. Every addition increases potential failure points.
2. Version Your Prompts
Store prompts in version control or a configuration system. When agent behavior changes unexpectedly, you can diff prompts to find what changed.
3. Monitor Costs
Track token usage per agent and per user. Set up alerts before costs become problems. Consider caching frequent queries.
4. Plan for Latency
Agent responses take time, sometimes 10-30 seconds for complex reasoning. Design UIs with loading states. Consider streaming responses if your platform supports it.
5. Document Tool Contracts
Each tool should have clear documentation:
- What inputs it accepts
- What outputs it returns
- Error conditions and responses
- Rate limits and costs
6. Use Staging Environments
Test agent changes in staging before production. Agent behavior can change subtly with prompt tweaks. Verify with real-world test cases.
For complex agent architectures and custom implementations, our workflow development services can accelerate your project. For architectural guidance, explore our n8n consulting services.
Frequently Asked Questions
How do I make my AI agent remember previous conversations?
Add a Memory node to your AI Agent. For testing, use Simple Memory which stores history in n8n’s memory during execution.
For production, use Postgres Chat Memory or Redis Chat Memory which persist conversations across sessions.
The key is the Session ID parameter. Use a consistent ID per user or conversation (like user_123_conv_456) so the agent loads the correct history. Without memory, every message starts a new conversation with no context.
Why does my agent keep calling the same tool repeatedly in a loop?
This happens when the tool returns results the agent doesn’t understand or can’t use to make progress.
Check three things:
- Verify the tool returns clear, actionable data
- Ensure your prompt explains how to interpret tool results and when the task is complete
- Add a maximum iteration limit in your chat model settings if available
The agent loop continues until the LLM decides the goal is achieved. If it never recognizes completion, it loops forever.
Can I use multiple AI agents in one workflow?
Yes, and this is often the best architecture for complex tasks.
Create separate AI Agent nodes for different responsibilities (one for classification, one for research, one for response generation). Connect them with standard n8n nodes to route data based on the first agent’s output.
You can also use the Workflow tool to let one agent call another agent’s workflow as a sub-routine. This separation of concerns makes debugging easier and failures more isolated.
How do I handle it when the LLM returns unexpected or malformed responses?
Enable Continue On Fail on your AI Agent node so errors don’t crash the workflow.
Add a Code node after the agent to validate the response structure. Check for required fields, reasonable values, and expected formats. If validation fails, route to a fallback response or retry logic.
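A minimal validation sketch for that Code node, assuming the agent’s answer arrives in an output field and downstream steps expect a non-empty string:
// Code node: validate the agent's response before downstream processing.
// Assumes the answer is in `output`; adjust the checks to your schema.
const response = $json.output;
if (typeof response !== 'string' || response.trim().length === 0) {
  return [{
    json: {
      response: 'Sorry, something went wrong. Please try again.',
      fallback: true,
    },
  }];
}
return $input.all();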
For structured outputs, use output parsers in your chat model configuration to enforce JSON schemas. Always have a graceful fallback message for when things go wrong.
What’s the best way to limit API costs when using AI agents?
Several strategies work together:
- Limit memory: Use Window Buffer Memory with a small window (5-10 messages) to reduce context sent to the LLM
- Right-size models: Choose the smallest model that handles your use case - often significantly cheaper
- Cache queries: Store responses to frequent identical questions
- Set token limits: Configure maximum tokens in your chat model settings
- Track usage: Add a Code node to log token usage per execution and set up cost alerts
Also consider whether simpler non-agent approaches could handle common cases without LLM reasoning overhead.