n8n Database Sync: Keep Your Systems in Perfect Harmony
Your databases are slowly drifting apart, and every minute makes reconciliation harder.
That customer updated their email in your CRM last week. The billing system still has the old address. Your support team just spent 20 minutes tracking down why a subscription renewal email bounced. Multiply this by hundreds of records across multiple systems, and you have a data consistency nightmare that costs real money and erodes customer trust.
Most businesses run multiple systems that need to share data: CRMs feeding billing platforms, e-commerce backends syncing with inventory systems, analytics databases pulling from production. When these systems fall out of sync, the consequences compound daily.
The Silent Data Drift Problem
Data inconsistency rarely announces itself with loud errors. Instead, it creates a slow accumulation of problems:
- Sales teams contact churned customers because the CRM never received the cancellation
- Inventory counts diverge between your warehouse system and online store
- Financial reports show different numbers than operational dashboards
- Customer support lacks visibility into order history from connected systems
Manual synchronization attempts fail at scale. Spreadsheet exports, copy-paste operations, and scheduled “someone checks this weekly” processes break down as data volumes grow.
Why Database Sync Automation Matters
Automated database synchronization solves these problems by establishing continuous data flow between systems. When a record changes in one database, the change propagates to related systems automatically.
n8n excels at this task because it connects to virtually any database and provides the visual workflow tools to build sophisticated sync logic without custom code. You can implement one-way replication, bidirectional sync, or selective synchronization based on business rules.
What You’ll Learn
- Sync patterns explained: one-way, two-way, and incremental approaches
- n8n sync architecture and which nodes handle each responsibility
- Change detection strategies using timestamps, watermarks, and hashing
- Database node configuration for reliable sync operations
- A complete sync workflow built step by step with code examples
- Error handling patterns that make sync workflows production-ready
- Performance optimization for large-scale synchronization
Understanding Database Sync Patterns
Before building sync workflows, you need to choose the right pattern for your use case. Each pattern has distinct characteristics that affect reliability, complexity, and conflict handling.
One-Way Sync (Source of Truth Model)
In one-way synchronization, data flows in a single direction. One system serves as the authoritative source, and other systems receive updates from it.
How it works:
- Changes occur only in the source system
- The sync workflow detects those changes
- Changes replicate to target systems
- Target systems never write back to the source
Best for:
- Reporting databases that mirror production data
- Read-only analytics systems
- Backup or archival systems
- Downstream systems that consume but never modify records
Advantages:
- No conflict resolution needed
- Simpler to implement and debug
- Clear data ownership
Two-Way Sync (Bidirectional)
Two-way synchronization allows changes in either system to propagate to the other. Both databases can serve as sources of truth for different scenarios.
How it works:
- Changes can occur in either system
- The sync workflow monitors both systems
- Changes from each propagate to the other
- Conflict resolution handles simultaneous changes
Best for:
- CRM and email platform integration
- Multi-system workflows where different teams update different fields
- Offline-capable applications that sync when reconnected
Challenges:
- Requires conflict detection and resolution
- More complex workflow logic
- Risk of sync loops without proper safeguards
Incremental Sync vs Full Sync
Full sync compares entire datasets on each run. Every record in the source gets checked against the target.
Incremental sync only processes records that changed since the last sync. This requires tracking what was previously synchronized.
| Aspect | Full Sync | Incremental Sync |
|---|---|---|
| Data volume processed | Everything | Only changes |
| Performance | Slower, scales poorly | Fast, proportional to change volume |
| Complexity | Simpler logic | Requires change tracking |
| Missed changes risk | None | Possible if tracking fails |
| Best for | Small datasets, initial loads | Large datasets, frequent syncs |
Rule of thumb: Use full sync for initial data loads and periodic reconciliation. Use incremental sync for ongoing synchronization with datasets over a few hundred records.
Pattern Selection Guide
| Scenario | Recommended Pattern |
|---|---|
| Production DB to reporting DB | One-way incremental |
| CRM to email marketing platform | Two-way incremental |
| E-commerce to inventory system | Two-way incremental |
| Data warehouse population | One-way full (nightly) |
| Real-time dashboard updates | One-way incremental |
| Multi-region database copies | One-way incremental |
The n8n Sync Architecture
Every database sync workflow in n8n follows a common structure. Understanding these components helps you build reliable synchronization.
Core Components
A typical sync workflow includes five stages:
1. Trigger
Starts the workflow. Options include:
- Schedule Trigger: Run sync at fixed intervals (every 5 minutes, hourly, daily)
- Webhook: Triggered by external events (database triggers, application events)
- Manual: For testing or on-demand synchronization
2. Read (Extract)
Query both source and target databases to retrieve current state. Use database nodes (MySQL, PostgreSQL, MongoDB, etc.) with appropriate filters.
3. Compare (Detect Changes)
Identify what changed between source and target. The Compare Datasets node handles this automatically, outputting four categories:
- Records only in source (new records to add)
- Records only in target (potentially deleted records)
- Records that match exactly (no action needed)
- Records that exist in both but differ (updates needed)
4. Transform
Reshape data to match the target system’s requirements. Use the Edit Fields node or Code node for field mapping and data transformation. Our data transformation guide covers this in depth.
5. Write (Load)
Apply changes to the target database. Handle inserts, updates, and deletes based on what the comparison revealed.
Visual Workflow Structure
[Schedule Trigger]
        ↓
[Query Source DB] ──────────────────────┐
        ↓                               │
[Query Target DB] ──────┐               │
                        ↓               ↓
                     [Compare Datasets]
                              │
       ┌──────────────┬───────┴───────┬──────────────┐
       ↓              ↓               ↓              ↓
  [In A Only]    [Different]     [In B Only]      [Same]
       ↓              ↓               ↓              ↓
  [Insert to     [Update in      [Handle        [No Action]
   Target]        Target]         Deletions]
Key Nodes for Sync Workflows
| Node | Role in Sync |
|---|---|
| Database nodes (MySQL, PostgreSQL, etc.) | Read from and write to databases |
| Compare Datasets | Detect differences between source and target |
| Merge | Combine data streams, enrich records |
| Edit Fields | Transform data structure between systems |
| Code | Custom transformation and business logic |
| If | Conditional routing based on data values |
| Error Trigger | Catch and handle sync failures |
Change Detection Strategies
Detecting which records changed is the foundation of efficient synchronization. Several strategies work depending on your database capabilities and requirements.
Timestamp-Based Detection
The most common approach uses timestamp columns that track when records were last modified.
Requirements:
- Source database has created_at and updated_at (or similar) columns
- These columns update automatically on insert and modification
- Timestamps use consistent timezone handling
Query pattern:
SELECT * FROM customers
WHERE updated_at > :last_sync_timestamp
ORDER BY updated_at ASC
Tracking the watermark:
Store the highest updated_at value from each sync run. On the next run, query only records modified after that timestamp.
// In a Code node - store last sync time
const staticData = $getWorkflowStaticData('global');
const items = $input.all();
if (items.length > 0) {
// Find the most recent timestamp
const maxTimestamp = items.reduce((max, item) => {
const itemTime = new Date(item.json.updated_at).getTime();
return itemTime > max ? itemTime : max;
}, 0);
staticData.lastSyncTimestamp = new Date(maxTimestamp).toISOString();
}
return items;
Limitations:
- Requires schema changes if timestamps don’t exist
- Deleted records don’t update timestamps (need soft delete pattern)
- Clock skew between systems can cause missed records
Sequential ID Tracking
For systems without timestamps, track the highest ID processed.
Query pattern:
SELECT * FROM orders
WHERE id > :last_processed_id
ORDER BY id ASC
When it works:
- IDs are sequential and monotonically increasing
- Records are never modified after creation (append-only data)
- You only need to capture new records, not updates
Limitations:
- Doesn’t detect modifications to existing records
- Doesn’t work with UUID or non-sequential primary keys
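As with timestamps, the highest processed ID can be kept in workflow static data. A minimal Code node sketch, assuming a numeric id column as in the query above:
// Track the highest processed ID in workflow static data,
// mirroring the timestamp watermark pattern above
const staticData = $getWorkflowStaticData('global');
const items = $input.all();
if (items.length > 0) {
  staticData.lastProcessedId = items.reduce(
    (max, item) => Math.max(max, Number(item.json.id)),
    staticData.lastProcessedId || 0
  );
}
return items;
On the next run, feed staticData.lastProcessedId into the :last_processed_id query parameter.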
Hash-Based Comparison
Calculate a hash of each record’s relevant fields. Compare hashes between source and target to detect changes.
Implementation:
// Generate hash for comparison in a Code node
const crypto = require('crypto');
const items = $input.all();
return items.map(item => {
const data = item.json;
// Create hash from relevant fields only
const hashInput = JSON.stringify({
name: data.name,
email: data.email,
status: data.status
// Exclude: id, created_at, updated_at, internal fields
});
const hash = crypto
.createHash('md5')
.update(hashInput)
.digest('hex');
return {
json: {
...data,
_syncHash: hash
}
};
});
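The comparison half can run in a second Code node once both datasets carry a _syncHash. This is a sketch under assumptions: the target rows were hashed the same way, and “Query Target DB” is a placeholder node name to adjust for your workflow:
// Keep only source records that are new or whose hash differs from the target
const sourceItems = $input.all();
const targetHashById = new Map(
  $('Query Target DB').all().map(item => [item.json.id, item.json._syncHash])
);
return sourceItems.filter(item => {
  const targetHash = targetHashById.get(item.json.id);
  return targetHash === undefined || targetHash !== item.json._syncHash;
});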
Advantages:
- Works without timestamp columns
- Detects any field-level change
- Database-agnostic
Limitations:
- Requires full dataset comparison (can’t query “only changed records”)
- Computationally expensive for large datasets
- Hash collisions (extremely rare but possible)
Handling Soft Deletes
Hard deletes (removing rows entirely) are problematic for sync because you can’t query something that no longer exists.
Soft delete pattern:
Add a deleted_at timestamp column. Instead of deleting rows, set this timestamp:
UPDATE customers
SET deleted_at = NOW()
WHERE id = :customer_id
Your sync query includes deleted records:
SELECT * FROM customers
WHERE updated_at > :last_sync_timestamp
OR deleted_at > :last_sync_timestamp
The sync workflow then handles soft-deleted records by removing them from the target or marking them as deleted there.
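One way to handle this in the workflow is to split soft-deleted rows from active ones so each branch gets its own write logic. A small Code node sketch, assuming the deleted_at column from the example above:
// Tag items so a downstream If node can route deletions separately
const items = $input.all();
return items.map(item => ({
  json: {
    ...item.json,
    _syncAction: item.json.deleted_at ? 'delete' : 'upsert'
  }
}));
An If node on _syncAction then routes each item to the appropriate database operation.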
Database Node Configuration
n8n supports numerous database integrations. While node names differ, configuration principles remain consistent across database types.
Connection Best Practices
Use credentials, not hardcoded values:
Store connection strings, passwords, and API keys in n8n credentials. Never put sensitive data directly in node parameters.
Use internal hostnames:
When n8n and databases run in the same network (Docker, Kubernetes), use internal DNS names rather than public IPs. This improves security and reduces latency.
Test connections independently:
Before building complex workflows, create a simple test workflow that connects to your database and runs a basic query. Verify connectivity first.
Parameterized Queries
Never concatenate user input or variable data directly into SQL strings. This creates SQL injection vulnerabilities.
Wrong approach:
SELECT * FROM users WHERE email = '{{ $json.email }}'
Correct approach (parameterized):
Most n8n database nodes support query parameters. Use placeholders and bind values separately:
SELECT * FROM users WHERE email = $1
With the parameter value {{ $json.email }} passed through the node’s parameter configuration.
Query Patterns for Sync Operations
Select with timestamp filter:
SELECT id, name, email, status, updated_at
FROM customers
WHERE updated_at > $1
ORDER BY updated_at ASC
LIMIT 1000
The LIMIT clause prevents overwhelming the workflow with too many records at once. Process in batches for large datasets.
Upsert (Insert or Update):
Many databases support upsert operations that insert new records or update existing ones:
INSERT INTO customers (id, name, email, updated_at)
VALUES ($1, $2, $3, NOW())
ON CONFLICT (id) DO UPDATE SET
name = EXCLUDED.name,
email = EXCLUDED.email,
updated_at = NOW()
This pattern is ideal for sync operations because you don’t need to first check whether a record exists.
Batch operations:
For better performance, insert or update multiple records in a single query:
INSERT INTO customers (id, name, email)
VALUES
($1, $2, $3),
($4, $5, $6),
($7, $8, $9)
ON CONFLICT (id) DO UPDATE SET
name = EXCLUDED.name,
email = EXCLUDED.email
Construct these queries dynamically in a Code node based on your item count.
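A sketch of that Code node, assuming PostgreSQL-style numbered placeholders and the customers columns from the example; how you pass the generated query and parameters onward depends on your database node's options:
// Build one multi-row upsert statement plus its parameter list
const items = $input.all();
const columns = ['id', 'name', 'email'];
const placeholders = [];
const values = [];
items.forEach((item, rowIndex) => {
  const base = rowIndex * columns.length;
  placeholders.push(`($${base + 1}, $${base + 2}, $${base + 3})`);
  columns.forEach(col => values.push(item.json[col]));
});
const query = `INSERT INTO customers (id, name, email)
VALUES ${placeholders.join(', ')}
ON CONFLICT (id) DO UPDATE SET
  name = EXCLUDED.name,
  email = EXCLUDED.email`;
return [{ json: { query, values } }];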
Handling Different Database Types
While SQL syntax varies slightly between databases, n8n’s database nodes abstract most differences. Key considerations:
| Database | Upsert Syntax | Timestamp Type |
|---|---|---|
| PostgreSQL | ON CONFLICT ... DO UPDATE | TIMESTAMP WITH TIME ZONE |
| MySQL | ON DUPLICATE KEY UPDATE | DATETIME or TIMESTAMP |
| SQLite | ON CONFLICT ... DO UPDATE | TEXT (ISO format) |
| MongoDB | updateOne with upsert: true | Date |
For complex multi-database scenarios, consider using the Merge node’s SQL Query mode which uses AlaSQL for in-workflow data manipulation.
Building a Complete Sync Workflow
Let’s build a production-ready sync workflow step by step. This example synchronizes customer records from a source database to a target database using incremental sync.
Step 1: Set Up the Trigger
Add a Schedule Trigger node configured to run every 15 minutes:
- Mode: At Regular Intervals
- Interval: 15 minutes
For more frequent syncing, consider webhook-based triggers from your source application.
Step 2: Retrieve Last Sync Timestamp
Add a Code node to get the stored watermark:
const staticData = $getWorkflowStaticData('global');
// Default to 24 hours ago if no previous sync
const defaultStart = new Date();
defaultStart.setHours(defaultStart.getHours() - 24);
const lastSync = staticData.lastSyncTimestamp || defaultStart.toISOString();
return [{
json: {
lastSyncTimestamp: lastSync
}
}];
Step 3: Query Source Database
Add your database node (MySQL, PostgreSQL, etc.) with this query:
SELECT id, name, email, phone, status, updated_at
FROM customers
WHERE updated_at > $1
ORDER BY updated_at ASC
LIMIT 500
Pass {{ $json.lastSyncTimestamp }} as the parameter.
Step 4: Query Target Database
Add another database node querying the target. Use the same timestamp filter to narrow the comparison set. A target row last modified before the watermark won't appear in this query even if its source counterpart changed, so Step 6 notes how to guard against the resulting insert collision:
SELECT id, name, email, phone, status, updated_at
FROM customers
WHERE updated_at > $1
ORDER BY id ASC
Step 5: Compare Datasets
Add a Compare Datasets node:
- Input A Field: id
- Input B Field: id
Connect the source database output to Input A, target database output to Input B.
The node outputs four branches:
- In A Only: New records to insert
- In B Only: Records to potentially delete (or investigate)
- Same: Identical records, no action needed
- Different: Records that need updating
Step 6: Handle New Records (In A Only)
Connect a database node to the “In A Only” output:
INSERT INTO customers (id, name, email, phone, status, updated_at)
VALUES ($1, $2, $3, $4, $5, $6)
Map the fields from the incoming items. If a matching row could already exist in the target (for example because the Step 4 query window excluded it), use the upsert pattern shown earlier (ON CONFLICT ... DO UPDATE) instead of a plain insert so the write cannot fail on a duplicate key.
Step 7: Handle Updates (Different)
Connect a database node to the “Different” output:
UPDATE customers
SET name = $2, email = $3, phone = $4, status = $5, updated_at = $6
WHERE id = $1
Step 8: Update Watermark
After processing, add a Code node to store the new watermark:
const staticData = $getWorkflowStaticData('global');
const items = $input.all();
if (items.length > 0) {
const timestamps = items
.map(item => new Date(item.json.updated_at).getTime())
.filter(t => !isNaN(t));
if (timestamps.length > 0) {
const maxTimestamp = Math.max(...timestamps);
staticData.lastSyncTimestamp = new Date(maxTimestamp).toISOString();
}
}
return [{ json: { status: 'sync_complete', processed: items.length } }];
Step 9: Add Notifications
Connect the final output to a notification node (Slack, email, webhook) to report sync status:
// Summary message for the notification node
const processed = $input.first().json.processed ?? 0;
const message = `Database sync completed. Processed ${processed} records.`;
return [{ json: { message } }];
Error Handling and Reliability
Production sync workflows must handle failures gracefully. A sync that crashes halfway through can leave data in an inconsistent state.
Transaction Batching
n8n database nodes often support transaction modes:
- Single Statement: Execute all data in one query
- Independently: One query per item, each commits separately
- Transaction: All queries in one transaction, rollback on any failure
For sync operations, “Transaction” mode ensures atomicity. Either all changes apply or none do.
Retry Patterns
Configure retry behavior on database nodes:
- Open node Settings
- Enable Retry On Fail
- Set Max Tries: 3
- Set Wait Between Tries: 5000ms
For transient database connection issues, retries often succeed.
Idempotency
Design sync operations to be safely repeatable. If a sync runs twice with the same data, the result should be identical.
Upsert operations are inherently idempotent. Inserting a record that already exists updates it instead of failing or creating a duplicate.
Avoid:
- Incrementing counters on each sync
- Appending to arrays without deduplication
- Operations that depend on current state
Error Notifications
Add an Error Trigger workflow that catches failures:
[Error Trigger] → [Format Error Message] → [Send to Slack/Email]
Include in your error notification:
- Workflow name
- Error message
- Execution timestamp
- Link to execution log
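A Code node sketch that formats the Error Trigger payload into a single message; field names follow the Error Trigger output but can vary between n8n versions, so verify them against a test execution:
// Build a readable alert from the Error Trigger data
const data = $input.first().json;
const message = [
  `Workflow: ${data.workflow?.name ?? 'unknown'}`,
  `Error: ${data.execution?.error?.message ?? 'no message'}`,
  `Failed at: ${new Date().toISOString()}`,
  `Execution: ${data.execution?.url ?? 'n/a'}`
].join('\n');
return [{ json: { message } }];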
Dead Letter Handling
For records that consistently fail, implement a dead letter pattern:
// In a Code node after catching errors
const failedItems = $input.all();
const staticData = $getWorkflowStaticData('global');
if (!staticData.deadLetter) {
staticData.deadLetter = [];
}
// Store failed items for later investigation
failedItems.forEach(item => {
staticData.deadLetter.push({
data: item.json,
failedAt: new Date().toISOString(),
reason: item.json._error || 'Unknown error'
});
});
// Keep only last 100 failures
staticData.deadLetter = staticData.deadLetter.slice(-100);
return [{ json: { deadLetterCount: staticData.deadLetter.length } }];
Periodically review dead letter records to identify systemic issues.
For comprehensive error handling patterns, see our guide on fixing n8n timeout errors and the error trigger node documentation.
Performance Optimization
As data volumes grow, sync workflows need optimization to maintain acceptable performance.
Batch Size Tuning
Process records in batches rather than all at once. The Loop Over Items node (Split in Batches) handles this:
[Query Results] → [Loop Over Items (batch: 100)] → [Database Insert] → [Loop]
↓
[Done: Log Summary]
Recommended batch sizes:
| Operation Type | Batch Size |
|---|---|
| Simple inserts | 100-500 |
| Complex updates with validation | 25-50 |
| Operations with external API calls | 10-25 |
Our batch processing guide covers this in detail.
Incremental Sync Scheduling
Balance freshness against system load:
| Requirement | Sync Interval |
|---|---|
| Near real-time | 1-5 minutes |
| Timely | 15-30 minutes |
| Daily reporting | Once per day (off-peak hours) |
| Archival | Weekly or monthly |
More frequent syncs process fewer records per run, reducing peak load.
Database Indexing
Ensure your databases have indexes on:
- Primary key columns used for matching (usually automatic)
- Timestamp columns used for incremental queries
- Foreign key columns used in joins
Without proper indexes, sync queries scan entire tables, and query time grows steadily with table size until syncs slow to a crawl or time out.
Using Sub-Workflows
For complex sync logic, break workflows into sub-workflows:
Main: [Trigger] → [Query Changes] → [Execute Sub-Workflow per batch]
Sub: [Execute Workflow Trigger] → [Transform] → [Insert/Update] → [Return Results]
Sub-workflows provide memory isolation and cleaner error handling. Learn more in our workflow best practices guide.
When to Consider Queue Mode
For very high-volume synchronization (thousands of records per minute), consider n8n queue mode. Queue mode distributes execution across multiple workers, enabling horizontal scaling.
Common Sync Gotchas
These issues cause silent data corruption or unexpected failures. Address them before going to production.
Timezone Mismatches
Databases store timestamps differently. One system might use UTC, another local time. A record updated at “2024-03-15 10:00:00” in UTC appears as a different time in a system using EST.
Fix: Normalize all timestamps to UTC in your transformation step:
// Convert to UTC ISO format regardless of source timezone
const normalizedTime = new Date(item.json.updated_at).toISOString();
NULL vs Empty String
Some databases treat NULL and '' (empty string) as equivalent. Others don’t. A field that’s NULL in the source might become '' in the target, causing the Compare Datasets node to flag it as “different” on every sync.
Fix: Normalize empty values in your transformation:
// Treat empty strings as null for consistency
const cleanValue = item.json.phone?.trim() || null;
Character Encoding Issues
Latin-1 encoded data imported into a UTF-8 database creates garbled text. Customer names with accents or special characters become unreadable.
Fix: Ensure both databases use UTF-8. When querying, explicitly set character encoding in your connection string if supported.
Decimal Precision Differences
19.99 stored as DECIMAL(10,2) in one database might become 19.990000000001 as a FLOAT in another. These tiny differences cause records to appear “different” indefinitely.
Fix: Round decimal values to consistent precision:
const price = Math.round(item.json.price * 100) / 100;
Auto-Increment ID Conflicts
If both databases generate their own auto-increment IDs, the same customer might be ID 1001 in one system and ID 5847 in another. Syncing by ID fails entirely.
Fix: Use a business identifier (email, external_id, SKU) for matching instead of database-generated IDs. Or establish one system as the ID authority and propagate its IDs to others.
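A hedged sketch of the matching side: a Code node that builds a normalized key from a business identifier (email is assumed here) that both datasets can be compared on:
// Add a normalized match key so Compare Datasets can match on a
// business identifier instead of database-generated IDs
return $input.all().map(item => ({
  json: {
    ...item.json,
    _matchKey: (item.json.email || '').trim().toLowerCase()
  }
}));
Run the same node on both inputs and set the Compare Datasets fields to _matchKey.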
Frequently Asked Questions
How do I sync data between two different database types in n8n?
n8n treats all databases uniformly at the workflow level. To sync between different database types (for example, MySQL to PostgreSQL):
- Add separate database nodes for each type
- Query the source using its native node
- Use the Compare Datasets node to identify changes
- Transform data as needed (field names, data types)
- Write to the target using its native node
The key is the transformation step. Different databases may use different column names, data types, or structures. Use the Edit Fields node or Code node to map source schema to target schema.
For type conversions, JavaScript in the Code node handles most cases:
return $input.all().map(item => ({
json: {
...item.json,
// Convert MySQL DATETIME to PostgreSQL TIMESTAMP format
created_at: new Date(item.json.created_at).toISOString()
}
}));
What’s the best way to detect changes in a database for syncing?
The best method depends on your database capabilities:
If your database has updated_at timestamps: Use timestamp-based incremental sync. Query records where updated_at > last_sync_time. This is the most efficient and reliable method.
If timestamps aren’t available but IDs are sequential: Track the highest processed ID. This only catches new records, not modifications.
If you need to detect any change without timestamps: Use hash-based comparison. Calculate a hash of each record’s fields and compare against stored hashes. This requires processing the full dataset.
For true real-time sync: Use database triggers (if supported) that fire webhooks when records change. n8n receives the webhook and processes only the changed record immediately.
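If you go the webhook route, a short Code node can normalize the incoming payload into the same item shape the scheduled sync produces, so both paths share the downstream upsert logic. The payload envelope here is an assumption; adjust it to whatever your trigger actually sends:
// Normalize a single-record webhook payload for the shared sync path
const payload = $input.first().json;
const record = payload.record || payload; // assumed envelope
return [{
  json: {
    id: record.id,
    name: record.name,
    email: record.email,
    status: record.status,
    updated_at: record.updated_at || new Date().toISOString()
  }
}];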
The n8n blog on data synchronization provides additional context on these approaches.
Can n8n handle two-way database synchronization?
Yes, but two-way sync requires careful design to avoid conflicts and sync loops.
Key considerations:
- Conflict detection: When the same record changes in both systems between syncs, you need rules to decide which change wins. Options include “last write wins” (based on timestamp) or “source system wins” (designated authority).
- Loop prevention: If System A syncs to System B, and System B syncs back to System A, you can create infinite loops. Prevent this (see the sketch below) by:
  - Using a synced_from field that records the origin of changes
  - Comparing timestamps to avoid re-syncing changes that originated from the other system
  - Using separate “source” and “sync” timestamp columns
- Workflow structure: Run two separate sync workflows (A→B and B→A) rather than one bidirectional workflow. This simplifies debugging.
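A minimal loop-prevention sketch for a Code node placed before the write step, assuming each sync workflow stamps a synced_from column when it writes and real user edits clear it:
// Skip records whose latest change originated from the system we are
// about to write to, so the A→B sync does not echo B's own changes back
const TARGET_SYSTEM = 'system_b'; // this workflow writes A → B
return $input.all().filter(item => item.json.synced_from !== TARGET_SYSTEM);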
For most use cases, designating one system as the source of truth (one-way sync) is simpler and more reliable.
How do I prevent duplicate records during sync operations?
Duplicates typically occur when sync workflows don’t properly track what’s been processed. Prevention strategies:
Use upsert operations:
INSERT INTO customers (id, email, name)
VALUES ($1, $2, $3)
ON CONFLICT (id) DO UPDATE SET
email = EXCLUDED.email,
name = EXCLUDED.name
Upsert inserts new records and updates existing ones, making it impossible to create duplicates on the primary key.
Add unique constraints:
Beyond primary keys, add unique constraints on business identifiers (email, external_id, etc.). The database rejects duplicates automatically.
Check before insert:
If upsert isn’t available, query for existence first:
// Pseudocode flow
const exists = await queryTargetForRecord(record.id);
if (exists) {
await updateRecord(record);
} else {
await insertRecord(record);
}
Use the Compare Datasets node:
The Compare Datasets node explicitly separates “new records” from “existing but different” records, routing them to appropriate insert or update operations.
What happens if my sync workflow fails mid-execution?
When a sync fails partway through, some records may have synced while others didn’t. Recovery depends on your workflow design:
With idempotent operations (upserts):
Simply re-run the sync. Records that already synced update to the same values (no harm). Records that didn’t sync now process normally.
With non-idempotent operations:
You may have partial data. Options include:
- Manual reconciliation: Run a full comparison to identify missing records
- Checkpointing: Store progress after each batch and resume from the last checkpoint (see the sketch after this list)
- Transaction rollback: If using database transactions, the partial changes roll back automatically
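A minimal checkpointing sketch for a Code node placed at the end of each batch, assuming the batches were tagged with a batchIndex field when they were created:
// Record the last successfully processed batch so a re-run can skip it
const staticData = $getWorkflowStaticData('global');
const items = $input.all();
staticData.lastCompletedBatch = items[0]?.json.batchIndex ?? 0;
return [{ json: { checkpoint: staticData.lastCompletedBatch } }];
On restart, compare each batch's batchIndex against staticData.lastCompletedBatch and skip batches that already completed.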
Best practices for recovery:
- Design for idempotency from the start
- Use batch processing with progress tracking
- Implement error notifications so you know about failures immediately
- Log which records were processed for post-failure analysis
For mission-critical synchronization, consider queue mode which provides better failure isolation and recovery.
Making Sync Workflows Production-Ready
Database synchronization sits at the heart of data integrity across your organization. A well-designed sync workflow runs reliably for months without intervention. A poorly designed one creates more problems than it solves.
Start simple: Begin with one-way incremental sync. Add complexity (two-way, real-time, conflict resolution) only when genuinely needed.
Test thoroughly: Use representative data volumes. A sync that works with 100 records might fail with 10,000.
Monitor continuously: Set up alerts for sync failures, unusual record counts, and performance degradation.
Document your design: Future you (or your team) will thank present you for clear documentation of sync logic, conflict rules, and recovery procedures.
For complex synchronization requirements or help optimizing existing sync workflows, our n8n consulting services provide expert guidance. If you need a complete sync solution built to your specifications, explore our workflow development services.
Use our free workflow debugger to troubleshoot sync issues, and check the n8n community forum for additional patterns and community solutions.