In This Article
- 1. How Self-Hosted n8n Breaks in Production
- 2. The Root Cause: One Event Loop Doing Everything
- 3. First-Line Defence: Timeout and Concurrency Environment Variables
- 4. Nginx Configuration for Long-Running Webhooks
- 5. Memory Management: Stopping the OOM Killer
- 6. Recognising the Signals That Tell You Queue Mode Is Needed
- 7. Queue Mode Architecture: How It Works
- 8. Queue Mode Setup on DigitalOcean / AWS (Docker Compose)
- 9. Scaling Workers and Sizing Your Droplet or EC2 Instance
- 10. Production Self-Hosting Checklist
- 11. Frequently Asked Questions
How Self-Hosted n8n Breaks in Production
A self-hosted n8n instance installed on a DigitalOcean Droplet or an AWS EC2 box from a getting-started guide will run fine for days or weeks. Then, as your automation footprint grows, it starts failing in a handful of very specific ways. Knowing these failure modes by name is the first step to fixing them.
504 Gateway Timeout on Webhooks
An external service posts to your n8n webhook URL and receives a 504 back after 30–60 seconds. The workflow was still executing — n8n just couldn't respond in time. Your Nginx or Caddy reverse proxy gave up waiting. The incoming data may be lost entirely unless the caller retries.
The Frozen Editor
You open the n8n UI to edit a workflow and it loads slowly — or not at all. Buttons stop responding. Save operations spin indefinitely. The instance is alive but unresponsive to UI interactions. This happens because the editor and the execution engine share the same Node.js event loop, and heavy executions are consuming all of it.
Out-of-Memory Process Kill
The n8n process disappears without warning. No graceful shutdown. Your process manager (PM2, systemd, or Docker) restarts it. In-flight executions are lost. The culprit is almost always a workflow that pulled a large response into memory — a 20MB API response, a workflow processing hundreds of items in a single batch, or a Code node that built a large in-memory array.
Webhook Queue Backup
A traffic spike sends 50 webhook calls in rapid succession. In default mode n8n has no concurrency cap, so it starts all 50 executions on the same event loop, where they compete for CPU time and memory. Callers that posted early get their responses after 90 seconds. Callers that posted later time out entirely. Nothing in the n8n UI indicates this is happening: executions appear to run successfully, just slowly.
All four of these failures are solvable. But they require different interventions at different layers of the stack. Starting from the inside out — environment configuration, then proxy settings, then architecture — is the correct order.
The Root Cause: One Event Loop Doing Everything
n8n is built on Node.js, which runs on a single-threaded event loop. In a default single-instance deployment, that one event loop handles everything simultaneously: responding to incoming webhooks, executing workflow steps, serving the editor UI, writing execution logs to the database, and processing the results of API calls as they return.
Most of the time this works well. Node.js is extremely efficient at I/O-bound work — waiting for HTTP responses from external APIs is the dominant activity in most n8n workflows, and the event loop handles concurrent waits gracefully. The problem arises when something CPU-intensive or memory-intensive enters the loop.
The event loop is not interruptible
When a Code node in n8n runs a tight JavaScript loop over thousands of items (sorting, transforming, aggregating), that computation runs synchronously on the event loop. Nothing else can happen until it finishes. Incoming webhook requests queue up behind it. The editor becomes unresponsive. If the computation takes 8 seconds, every webhook that arrived during that window has already burned up to 8 seconds of its reverse proxy timeout budget before n8n has even looked at it.
The same applies to large JSON serialisation and deserialisation. Parsing a 15MB API response into n8n's internal data format is a synchronous CPU operation. While it's running, the event loop is blocked. Add three workflows doing this in parallel and you have a completely frozen instance.
Understanding this constraint tells you what the fixes are: either prevent the event loop from being blocked (configuration and workflow design), or move the execution work off the main process entirely (queue mode).
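One way to observe a blocked event loop from the outside is to time a request that should cost almost nothing. The sketch below is a minimal probe, assuming n8n listens on 127.0.0.1:5678 and answers on its /healthz endpoint. The function names and the 0.5s and 2s thresholds are illustrative choices, not n8n settings.

```shell
# Probe how long n8n takes to answer a trivial request. A blocked event loop
# shows up as a slow /healthz response even though the endpoint does no work.
# Assumption: n8n is reachable at this URL; override N8N_URL if not.
N8N_URL="${N8N_URL:-http://127.0.0.1:5678/healthz}"

# Classify a latency in seconds (fractions allowed) against two thresholds.
# The thresholds are illustrative, not n8n defaults.
classify_latency() {
  awk -v t="$1" 'BEGIN {
    if (t < 0.5)      print "ok"
    else if (t < 2.0) print "degraded"
    else              print "blocked"
  }'
}

# Measure one request and print "<seconds> <status>".
# A failed or timed-out request is reported as 10 seconds.
probe_once() {
  local t
  t=$(curl -s -o /dev/null --max-time 10 -w '%{time_total}' "$N8N_URL") || t=10
  echo "$t $(classify_latency "$t")"
}
```

Running `while sleep 5; do probe_once; done` in one terminal while a heavy workflow executes should make the stall visible: readings that sit at a few milliseconds when idle and jump into seconds during a Code node run point at the event loop rather than the network.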
First-Line Defence: Timeout and Concurrency Environment Variables
Before touching your architecture, there are several environment variables that every production n8n deployment should have explicitly configured. The defaults are designed for development environments — they are not safe for production.
n8n Production Environment Variables — Core
# ── Execution timeouts ─────────────────────────────────────────────
# Maximum time (seconds) any single workflow execution is allowed to run.
# Default is -1 (unlimited). On a shared VPS, set this to prevent runaway
# executions from monopolising the event loop indefinitely.
EXECUTIONS_TIMEOUT=3600

# Hard ceiling. Users and admins cannot set a timeout higher than this.
# Set to match or slightly exceed your longest legitimate workflow.
EXECUTIONS_TIMEOUT_MAX=7200

# ── Concurrency ─────────────────────────────────────────────────────
# Maximum concurrent production executions on this instance.
# Default is unlimited. On a 2 vCPU / 4GB Droplet, 5–10 is a safe start.
# Increase only after confirming memory headroom per execution.
N8N_CONCURRENCY_PRODUCTION_LIMIT=10

# ── Webhook response timeout ─────────────────────────────────────────
# How long n8n waits (ms) for a "respond to webhook" node to complete
# before returning a timeout response to the caller. Default: 5000ms.
# Set to match your slowest legitimate synchronous webhook workflow.
N8N_DEFAULT_WEBHOOK_TIMEOUT=30000

# ── Payload limits ───────────────────────────────────────────────────
# Maximum inbound request body size, in MiB (not bytes). Default: 16.
# For most webhook integrations 16MB is already large; lower it to
# prevent deliberate or accidental oversized payloads from hitting memory.
N8N_PAYLOAD_SIZE_MAX=8

# ── Process hygiene ──────────────────────────────────────────────────
# Prune execution history to prevent the database from growing without bound.
# EXECUTIONS_DATA_MAX_AGE is in hours: 720 deletes anything older than 30 days.
EXECUTIONS_DATA_PRUNE=true
EXECUTIONS_DATA_MAX_AGE=720
EXECUTIONS_DATA_SAVE_ON_ERROR=all
EXECUTIONS_DATA_SAVE_ON_SUCCESS=none
A few of these deserve elaboration.
EXECUTIONS_TIMEOUT=3600 is a protection against the runaway workflow — one that gets into an unexpected loop, or calls an external API that never responds. Without a timeout, a single stuck execution can hold the event loop partially occupied indefinitely. Set this to the maximum time any of your workflows could legitimately need; 3600 seconds (one hour) is a conservative ceiling that catches most problems.
N8N_CONCURRENCY_PRODUCTION_LIMIT is the most impactful single variable for preventing memory spikes. By default, n8n will attempt to run as many concurrent executions as it receives. If 30 webhook calls arrive simultaneously, n8n attempts to start 30 executions. On a 4GB VPS where each execution consumes 200–400MB, that's an immediate OOM kill. Setting this to 10 means n8n queues incoming executions and processes them at a steady rate rather than attempting to parallelise infinitely.
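The arithmetic behind a safe limit can be written down directly. The sketch below assumes you have already measured the inputs yourself. The function name safe_concurrency and the example figures (1GB reserved for the OS and neighbouring services, 450MB base process, 300MB per execution) are illustrative, not values n8n reports.

```shell
# Estimate a safe N8N_CONCURRENCY_PRODUCTION_LIMIT from measured memory.
# All arguments are in MB:
#   $1 total RAM, $2 RAM reserved for the OS and other services,
#   $3 n8n base process, $4 memory per execution
safe_concurrency() {
  local total_mb reserved_mb base_mb per_exec_mb free_mb
  total_mb=$1; reserved_mb=$2; base_mb=$3; per_exec_mb=$4
  free_mb=$((total_mb - reserved_mb - base_mb))
  # No headroom (or a nonsensical per-execution figure) means a limit of 0.
  if [ "$per_exec_mb" -le 0 ] || [ "$free_mb" -le 0 ]; then
    echo 0
    return 0
  fi
  echo $((free_mb / per_exec_mb))
}

# Example: 4GB VPS, 1GB reserved, 450MB base process, 300MB per execution.
safe_concurrency 4096 1024 450 300   # prints 8
```

With those inputs the result is 8, which is why a limit of 10 is a starting point to validate rather than a guarantee: the right number depends on what your own workflows hold in memory.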
EXECUTIONS_DATA_SAVE_ON_SUCCESS=none is a significant quality-of-life change for busy instances. By default n8n saves the full data for every successful execution. On a high-volume instance processing thousands of executions daily, this grows your database by gigabytes per week and adds write overhead to every execution. Save on error only unless you have a specific debugging need for successful execution data.
Database matters here
If you're running n8n with SQLite (the default for quick installs), these pruning settings matter even more — SQLite does not handle concurrent write load well, and a large database on a spinning disk or a crowded NVMe will become a significant bottleneck. For any production n8n deployment expecting more than a few hundred executions per day, migrate to PostgreSQL 13+. The official n8n docs recommend this clearly; the SQLite default is a development convenience, not a production recommendation.
Nginx Configuration for Long-Running Webhooks
The vast majority of n8n webhook timeout issues reported on forums and community threads are not n8n problems at all — they're proxy problems. Nginx's default proxy_read_timeout is 60 seconds. If your workflow takes longer than 60 seconds to produce a response (even a simple "received" response), Nginx closes the connection and returns a 504 to the caller. n8n never had a chance to respond.
There are two categories of n8n webhook workflows and they need different proxy handling.
Respond Immediately
The webhook receives the payload, returns 200 OK immediately, and the rest of the workflow executes asynchronously. The caller doesn't wait. Use the "Respond to Webhook" node early in the workflow before any slow steps. Most webhook integrations (Stripe, HubSpot, GitHub) work this way.
Synchronous Response Required
The calling system expects a meaningful response body from the workflow before it continues. Your workflow must complete before responding. These need elevated proxy timeouts and a clearly bounded execution time. AI chatbot integrations often fall into this category.
For production instances handling both types, here is the Nginx server block configuration that covers both cases correctly:
Nginx — /etc/nginx/sites-available/n8n
server {
    listen 443 ssl http2;
    server_name n8n.yourdomain.com;

    ssl_certificate /etc/letsencrypt/live/n8n.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/n8n.yourdomain.com/privkey.pem;

    # ── Standard proxy settings ─────────────────────────────────────
    client_max_body_size 16m;

    # ── Timeouts for standard editor and API traffic ─────────────────
    proxy_read_timeout 120s;
    proxy_send_timeout 120s;
    proxy_connect_timeout 10s;

    location / {
        proxy_pass http://127.0.0.1:5678;
        # Forwarding headers are repeated in every location: Nginx only
        # inherits proxy_set_header from the server level when a block
        # sets none of its own, and each block sets Upgrade/Connection.
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # WebSocket support: required for the n8n editor's live UI
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }

    # ── Elevated timeouts for synchronous webhook workflows ──────────
    # Apply only to /webhook/ paths. This does NOT affect the editor.
    # Must be at least N8N_DEFAULT_WEBHOOK_TIMEOUT plus a 30s buffer.
    location /webhook/ {
        proxy_pass http://127.0.0.1:5678;
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }

    # Test/manual webhook endpoints use the same elevated timeout
    location /webhook-test/ {
        proxy_pass http://127.0.0.1:5678;
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
The critical detail here is the path-scoped timeout on /webhook/. You do not want proxy_read_timeout 300s on the entire server block — that would allow broken clients to hold editor connections open for five minutes each, which is a different kind of resource exhaustion. Scope the elevated timeout to webhook paths only, and keep the editor on a shorter leash.
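The buffer rule is simple arithmetic, but it is easy to get wrong when one value is in milliseconds and the other in seconds. A small sketch, using a hypothetical helper name (proxy_timeout_for), that derives the minimum proxy_read_timeout from an n8n-side timeout:

```shell
# Derive the minimum proxy_read_timeout (seconds) for a given n8n timeout.
#   $1 n8n-side timeout in milliseconds
#   $2 buffer in seconds (optional, defaults to 30)
proxy_timeout_for() {
  local webhook_timeout_ms buffer_s
  webhook_timeout_ms=$1
  buffer_s=${2:-30}
  # Round the millisecond value up to whole seconds, then add the buffer.
  echo $(( (webhook_timeout_ms + 999) / 1000 + buffer_s ))
}

proxy_timeout_for 30000   # prints 60
```

With the 30000ms value used in this article the floor is 60s, so the 300s set on the webhook locations leaves ample margin. After editing the file, `sudo nginx -t` validates the configuration before you reload.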
Memory Management: Stopping the OOM Killer
Memory is the most common cause of unexpected n8n process death on self-hosted instances. The Linux OOM (out-of-memory) killer terminates the largest memory consumer when the system runs out of RAM. On a 2GB Droplet running n8n, that's always n8n.
The memory profile of a running n8n instance looks like this: the base process consumes around 300–450MB at rest. Each active execution adds 50–200MB depending on data volume. A workflow that fetches a large paginated API response, or one that uses a Code node to build a large array, can spike to 800MB for a single execution. Three such executions running simultaneously on a 2GB VPS will kill the process.
Set Node.js max heap size explicitly
By default, Node.js allocates heap based on available system RAM — on a 4GB server it may attempt to use 2–3GB of heap before garbage collecting. Set the V8 max heap size to a value that leaves room for the OS and other processes. Add this to your environment:
NODE_OPTIONS=--max-old-space-size=1536
On a 4GB VPS, a 1536MB (1.5GB) cap leaves sufficient headroom for the OS, PostgreSQL, Nginx, and Redis. On a 2GB VPS, cap at 768MB and keep concurrency at 3–5.
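Both figures above work out to three eighths of total RAM. The sketch below turns that into a helper for other server sizes. Treat the 3/8 ratio as a rule of thumb inferred from those two numbers, not as a Node.js or n8n recommendation; the function names are hypothetical.

```shell
# Total RAM in MB, read from a meminfo-format file (defaults to /proc/meminfo).
total_ram_mb() {
  awk '$1 == "MemTotal:" { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Heap cap in MB: three eighths of the given RAM figure.
# The ratio is an assumption derived from 1536/4096 and 768/2048.
heap_cap_mb() {
  echo $(( $1 * 3 / 8 ))
}

# Print a ready-to-paste line for a nominal 4GB server.
echo "NODE_OPTIONS=--max-old-space-size=$(heap_cap_mb 4096)"
```

On a real host, `heap_cap_mb "$(total_ram_mb)"` gives a slightly lower figure than the nominal one, because the kernel reserves part of the advertised RAM. To confirm the cap took effect inside the container, `node -p "require('v8').getHeapStatistics().heap_size_limit / 1048576"` prints the limit Node.js is actually running with.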
Paginate large data fetches in workflows
The most common cause of memory spikes is fetching all items from a paginated API in one shot and storing them in n8n's execution data. A workflow that pulls 10,000 CRM records in one HTTP call and then runs them through a Code node will hold all 10,000 records in memory simultaneously. Use the Loop Over Items node with batch sizes of 50–200 and process pages sequentially. Your execution will take longer, but your memory footprint stays bounded.
Add swap on any instance under 4GB
Swap is not a substitute for RAM — swapping under load will make your instance painfully slow. But it prevents the OOM kill, which turns a slowdown into a process restart with lost in-flight executions. On DigitalOcean or AWS, add 2GB swap as a safety net:
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
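It is worth confirming that the swap is actually active, including after a reboot, since a typo in /etc/fstab fails quietly. A minimal check, with a hypothetical function name, that reads the kernel's own accounting:

```shell
# Print configured swap in MB from a meminfo-format file
# (defaults to /proc/meminfo). Prints 0 when no swap is active.
swap_total_mb() {
  awk '$1 == "SwapTotal:" { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}
```

After the commands above, `swap_total_mb` should print 2048 or a value within a megabyte of it. `swapon --show` and `free -h` report the same information in a more readable form.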
Configure n8n's binary data storage for large files
When workflows handle attachments, images, or PDF files, n8n by default stores binary data in memory during execution. For workflows handling files over 1MB regularly, switch to filesystem binary data storage so files are written to disk rather than held in the Node.js heap:
N8N_DEFAULT_BINARY_DATA_MODE=filesystem
N8N_BINARY_DATA_STORAGE_PATH=/home/n8n/.n8n/binaryData
Recognising the Signals That Tell You Queue Mode Is Needed
Environment variable tuning and proxy configuration will take a well-configured single-instance n8n surprisingly far. But there is a ceiling. When you hit it, the only real fix is queue mode — and the signals are unmistakable once you know what to look for.
The editor becomes unresponsive during peak execution periods
You can only use the n8n UI reliably at times when no heavy workflows are running. This is not a network or browser issue — it's the event loop being starved by execution work.
External callers report webhook timeouts during traffic bursts
Your workflows complete fine under normal load, but when 20+ webhooks arrive in a minute, callers start getting 504s. The concurrency limit is working as designed — but there's now a queue building in front of it that the single process can't drain fast enough.
You need to run more concurrent executions than a single VPS can safely support
Your workflows have grown to the point where the business requires 20+ concurrent executions. Sizing up to a 16GB server to run everything on one process is technically possible but wasteful — queue mode lets you scale horizontally with smaller, cheaper nodes.
A single bad workflow can take down all your other automations
On a single-process instance, one memory-leaking workflow or one infinite loop kills everything. Queue mode isolates executions — a worker process that OOM-kills takes only its in-flight executions with it; the main instance and other workers continue running.
If you're seeing two or more of these symptoms, adding more environment variables won't help. The architecture has to change.
Queue Mode Architecture: How It Works
Queue mode splits the single n8n process into three distinct roles. Each role runs as a separate container (or process), and each one can be scaled independently.
Queue Mode Component Roles
Main Instance
Runs the editor UI and API. Receives incoming webhook calls. Fires scheduled triggers. When a workflow execution is needed, it writes an execution record to PostgreSQL and pushes a job onto the Redis queue — but does not execute the workflow itself. The main instance's event loop is now mostly free, which is why the editor stays responsive.
Execution job pushed to Redis queue
Redis (Message Broker)
Maintains the queue of pending executions as a list. Workers poll Redis for new jobs. When a worker finishes a job, it acknowledges completion. Redis also stores the execution status so the main instance can report results back to the UI and API.
Worker picks up execution job
Worker(s) — one or many
Each worker is a separate n8n process that only executes workflows. It polls the Redis queue, picks up jobs, runs the workflow steps, writes results to PostgreSQL, and updates the execution status. Workers have no UI, no webhook listener, no scheduler. You can run as many as your hardware supports, and add more at any time to handle increased load without touching the main instance.
The architectural win is separation of concerns. Incoming webhooks now hit the main instance, which responds immediately by queueing the job — the caller gets an acknowledgement in milliseconds, not after the workflow finishes. Workers execute the actual workflow in parallel, as many as you've configured. A worker that crashes takes only its own current execution with it; the main instance and the other workers continue unaffected.
This also means you can schedule maintenance on workers independently. Drain a worker's queue (stop sending it new jobs), wait for its current execution to finish, update the container, restart it. Zero-downtime maintenance on the execution layer.
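Because the queue lives in Redis, you can inspect it directly to see whether workers are keeping up. The sketch below rests on an assumption you should verify on your own instance: that n8n's queue uses the prefix "bull" and the queue name "jobs", so pending jobs sit in a Redis list called bull:jobs:wait. The service name redis and the function names are likewise illustrative.

```shell
# Number of jobs waiting for a worker.
# Assumption: pending jobs live in the Redis list bull:jobs:wait.
# Check the real key names first with: redis-cli --scan --pattern 'bull:*'
queue_depth() {
  docker compose exec -T redis redis-cli LLEN "${1:-bull:jobs:wait}"
}

# Compare pending jobs against total worker slots
# (number of workers multiplied by concurrency per worker).
backlog_status() {
  local pending slots
  pending=$1
  slots=$2
  if [ "$pending" -eq 0 ]; then
    echo "idle"
  elif [ "$pending" -le "$slots" ]; then
    echo "keeping-up"
  else
    echo "backlogged"
  fi
}
```

For example, `backlog_status "$(queue_depth)" 20` on a deployment with two workers at concurrency 10 reports backlogged once more than 20 jobs are waiting. A backlog that persists across several checks is the signal to add a worker.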
Queue Mode Setup on DigitalOcean / AWS (Docker Compose)
The following docker-compose.yml is a production-ready queue mode setup for a single Droplet or EC2 instance. It runs the main n8n instance, two workers, and Redis on the same host. For higher loads, workers can be moved to separate hosts — but starting with this collocated configuration lets you validate the setup before adding network complexity.
docker-compose.yml — n8n Queue Mode (Single Host)
version: "3.8"

# Shared environment variables for all n8n containers.
# Store these in a .env file alongside docker-compose.yml; never hardcode.
x-n8n-env: &n8n-env
  DB_TYPE: postgresdb
  DB_POSTGRESDB_HOST: your-postgres-host
  DB_POSTGRESDB_PORT: 5432
  DB_POSTGRESDB_DATABASE: n8n
  DB_POSTGRESDB_USER: n8n_user
  DB_POSTGRESDB_PASSWORD: ${POSTGRES_PASSWORD}
  # Queue mode activation
  EXECUTIONS_MODE: queue
  QUEUE_BULL_REDIS_HOST: redis
  QUEUE_BULL_REDIS_PORT: 6379
  # CRITICAL: This key must be IDENTICAL across main and all workers.
  # Different keys mean workers cannot decrypt credentials; every workflow fails.
  N8N_ENCRYPTION_KEY: ${N8N_ENCRYPTION_KEY}
  # Timeouts and limits (match Nginx proxy_read_timeout)
  EXECUTIONS_TIMEOUT: 3600
  EXECUTIONS_TIMEOUT_MAX: 7200
  N8N_DEFAULT_WEBHOOK_TIMEOUT: 30000
  N8N_CONCURRENCY_PRODUCTION_LIMIT: 10
  # Memory and payload (N8N_PAYLOAD_SIZE_MAX is in MiB, not bytes)
  NODE_OPTIONS: --max-old-space-size=1024
  N8N_PAYLOAD_SIZE_MAX: 8
  N8N_DEFAULT_BINARY_DATA_MODE: filesystem
  N8N_BINARY_DATA_STORAGE_PATH: /home/node/.n8n/binaryData
  # Execution data pruning
  EXECUTIONS_DATA_PRUNE: "true"
  EXECUTIONS_DATA_MAX_AGE: 720
  EXECUTIONS_DATA_SAVE_ON_ERROR: all
  EXECUTIONS_DATA_SAVE_ON_SUCCESS: none
  N8N_HOST: n8n.yourdomain.com
  N8N_PROTOCOL: https
  WEBHOOK_URL: https://n8n.yourdomain.com/

services:
  redis:
    image: redis:7-alpine
    restart: unless-stopped
    volumes:
      - redis_data:/data
    command: redis-server --appendonly yes --maxmemory 256mb --maxmemory-policy allkeys-lru
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

  n8n-main:
    image: n8nio/n8n:latest
    restart: unless-stopped
    environment:
      <<: *n8n-env
    ports:
      - "127.0.0.1:5678:5678"
    volumes:
      - n8n_data:/home/node/.n8n
    depends_on:
      redis:
        condition: service_healthy
    command: ["n8n", "start"]

  n8n-worker-1:
    image: n8nio/n8n:latest
    restart: unless-stopped
    environment:
      <<: *n8n-env
      # Each worker runs up to 10 jobs concurrently.
      # I/O-heavy workflows: set 10–20. CPU-heavy: set 2–5.
      N8N_CONCURRENCY_PRODUCTION_LIMIT: 10
    volumes:
      - n8n_data:/home/node/.n8n
    depends_on:
      redis:
        condition: service_healthy
      n8n-main:
        condition: service_started
    command: ["n8n", "worker", "--concurrency=10"]

  n8n-worker-2:
    image: n8nio/n8n:latest
    restart: unless-stopped
    environment:
      <<: *n8n-env
      N8N_CONCURRENCY_PRODUCTION_LIMIT: 10
    volumes:
      - n8n_data:/home/node/.n8n
    depends_on:
      redis:
        condition: service_healthy
      n8n-main:
        condition: service_started
    command: ["n8n", "worker", "--concurrency=10"]

volumes:
  redis_data:
  n8n_data:
There are three things in this configuration that will silently destroy your deployment if you get them wrong:
N8N_ENCRYPTION_KEY must be identical across all containers
n8n encrypts all stored credentials (API keys, OAuth tokens, database passwords) using this key. Workers decrypt credentials at execution time. If a worker has a different key — even by a single character — it cannot decrypt any credentials, and every workflow execution that uses credentials will fail silently. Generate one key and use it everywhere. Store it in a .env file that all containers source. Never hardcode it in the compose file.
All containers must share the same n8n_data volume
The main instance and all workers exchange binary data and local n8n configuration through the same mounted volume (workflow definitions themselves live in PostgreSQL). If workers have a separate volume mount, they cannot access binary files produced by other workflow steps (attachments downloaded by the main instance and needed by a worker, for example). On multi-host deployments, replace the local volume with an NFS mount or an S3-backed filesystem.
SQLite is incompatible with queue mode
Queue mode requires PostgreSQL. Multiple processes writing to the same SQLite file simultaneously will produce database corruption. If you're upgrading a single-instance SQLite deployment to queue mode, migrate to PostgreSQL first. n8n provides a migration guide in the official docs.
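The first of these three points can be verified mechanically instead of by eye. The sketch below compares a short SHA-256 fingerprint of the key as each container sees it, so the key itself never lands in your terminal history or logs. The service names passed to check_encryption_keys are assumed to match your compose file, and all function names are hypothetical.

```shell
# Read a key on stdin and print the first 12 hex characters of its SHA-256.
key_fingerprint() {
  sha256sum | cut -c1-12
}

# Succeed only if every argument is the same non-empty fingerprint.
fingerprints_match() {
  local first fp
  first=$1
  [ -n "$first" ] || return 1
  for fp in "$@"; do
    [ "$fp" = "$first" ] || return 1
  done
  return 0
}

# Collect one fingerprint per compose service and compare them.
# Returns 2 if a container cannot be reached or has no key set.
check_encryption_keys() {
  local svc key fps
  fps=""
  for svc in "$@"; do
    key=$(docker compose exec -T "$svc" printenv N8N_ENCRYPTION_KEY) || return 2
    [ -n "$key" ] || return 2
    fps="$fps $(printf '%s' "$key" | key_fingerprint)"
  done
  # Fingerprints are plain hex strings, so word splitting is safe here.
  fingerprints_match $fps
}
```

A typical call would be `check_encryption_keys n8n-main n8n-worker-1 n8n-worker-2 && echo "keys match"`. Running it after every deployment that touches the .env file catches a mismatch before the first credentialed workflow fails.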
Scaling Workers and Sizing Your Droplet or EC2 Instance
The mistake most teams make when planning queue mode is calculating the number of workers from peak concurrent executions alone, without accounting for the memory each worker consumes.
Each n8n worker process consumes 200–500MB of RAM at steady state (not counting execution data spikes). The actual ceiling depends on how memory-intensive your workflows are. Before choosing hardware, profile your current single-instance setup:
Bash — Profile Per-Execution Memory Usage
# Watch n8n memory every 5 seconds while workflows run.
# Record the delta between peak and baseline to estimate per-execution cost.
watch -n 5 'ps aux | grep n8n | grep -v grep | awk "{print \$6/1024 \" MB\"}"'
# Or with Docker:
watch -n 5 'docker stats n8n-main --no-stream --format "{{.MemUsage}}"'
Use this sizing table as a starting point, then adjust based on your measured per-execution memory:
| Server Size | Workers | Concurrency / Worker | Total Concurrent Executions |
|---|---|---|---|
| 2 vCPU / 4GB (DO $24/mo · t3.medium) | 2 | 5 | 10 |
| 4 vCPU / 8GB (DO $48/mo · t3.xlarge) | 3 | 10 | 30 |
| 8 vCPU / 16GB (DO $96/mo · t3.2xlarge) | 5 | 15 | 75 |
| Separate worker hosts (multi-host Docker / ECS) | Unlimited | 10–20 | Scales horizontally |
For CPU-heavy workflows (large Code nodes, image processing, complex data transformations), halve the concurrency-per-worker figures in this table and add workers instead. For I/O-heavy workflows (HTTP requests, database queries, API calls — the dominant pattern in most n8n deployments), the table values are appropriate.
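The table rows follow from two small calculations: how many workers a target concurrency needs, and how much RAM that worker fleet is likely to claim. A sketch with hypothetical function names; the 350MB worker baseline and 150MB per execution in the example are placeholder measurements, so substitute your own profiling results.

```shell
# Workers needed to reach a target concurrency (rounds up).
#   $1 target concurrent executions, $2 concurrency per worker
workers_needed() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Rough RAM (MB) claimed by the worker fleet when fully loaded.
#   $1 workers, $2 baseline MB per worker,
#   $3 concurrency per worker, $4 MB per execution
workers_ram_mb() {
  echo $(( $1 * ($2 + $3 * $4) ))
}

workers_needed 30 10          # prints 3
workers_ram_mb 2 350 5 150    # prints 2200
```

The second result shows why the 4GB row stops at two workers with concurrency 5: about 2.2GB for workers leaves the remainder for the main instance, Redis, Nginx, and the OS.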
On AWS, prefer memory-optimised instances (r-series) over compute-optimised (c-series) for n8n workers — most of the time is spent waiting, not computing. DigitalOcean's General Purpose Droplets are a cost-effective choice for the same reason; the dedicated vCPUs prevent noisy-neighbour CPU interference that affects latency on the Premium Intel/AMD lines.
Production Self-Hosting Checklist
Before declaring a self-hosted n8n instance production-ready, go through this list. Each item represents a failure mode we've seen take down a live instance.
EXECUTIONS_TIMEOUT is set to a finite value
The default is unlimited. A workflow that enters an unexpected loop will run forever without this set.
N8N_CONCURRENCY_PRODUCTION_LIMIT is set based on actual memory profile
Measure per-execution memory first. A limit of 10 on a 2GB VPS with 300MB-per-execution workflows will kill the process. Calculate: (available RAM - base process - OS overhead) / memory per execution.
Nginx proxy_read_timeout matches N8N_DEFAULT_WEBHOOK_TIMEOUT
If the proxy times out before n8n does, the caller gets a 504 and n8n continues executing pointlessly. Keep them aligned with a 30-second buffer on the proxy side.
PostgreSQL is used, not SQLite
Mandatory for queue mode. Strongly recommended for any instance processing more than a few hundred executions per day. SQLite under concurrent write load degrades quickly and can corrupt.
All queue mode containers share the same N8N_ENCRYPTION_KEY
The most common queue mode mistake. Verify this by running a workflow that uses credentials on a fresh worker deployment — a silent credential decryption failure looks like a workflow that runs but takes no action.
Redis has a maxmemory policy configured
Without maxmemory-policy allkeys-lru, Redis will consume all available memory and cause the OS to kill it. A dead Redis in queue mode means the main instance can no longer enqueue executions — new webhooks are rejected entirely.
Execution data pruning is configured
Without pruning, your PostgreSQL database grows without bound. A 10GB execution history table on a shared database server will degrade query performance across the board. Set EXECUTIONS_DATA_PRUNE=true and EXECUTIONS_DATA_MAX_AGE to match your debugging window.
Swap is configured on instances under 8GB RAM
Not a performance solution — a crash-prevention measure. 2GB swap on a 4GB instance converts an OOM kill (lost in-flight executions, unexpected restart) into a period of degraded performance (slow but alive).
Container restart policies are set to unless-stopped
All n8n containers (main, workers, Redis) must restart automatically on failure. The default Docker behaviour is no restart — a dead worker container stays dead until you manually intervene.
A memory alert is configured at the server level
DigitalOcean Droplet alerts and AWS CloudWatch both support memory utilisation thresholds. Set a warning at 75% and a critical alert at 90%. The first alert gives you time to investigate; the second means you're minutes from an OOM kill.
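If provider-level alerting is not available, the same two thresholds from the last checklist item can be evaluated on the host itself. This is a minimal sketch rather than a monitoring system: the function names are hypothetical, and it only prints a level that a cron job or existing notifier would have to act on.

```shell
# Percentage of RAM in use, from a meminfo-format file
# (defaults to /proc/meminfo). Uses MemAvailable, which accounts
# for reclaimable cache, rather than MemFree.
mem_used_pct() {
  awk '$1 == "MemTotal:"     { total = $2 }
       $1 == "MemAvailable:" { avail = $2 }
       END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }' \
    "${1:-/proc/meminfo}"
}

# Map a usage percentage to the warning (75%) and critical (90%) levels.
alert_level() {
  local pct
  pct=$1
  if [ "$pct" -ge 90 ]; then
    echo "critical"
  elif [ "$pct" -ge 75 ]; then
    echo "warning"
  else
    echo "ok"
  fi
}
```

Calling `alert_level "$(mem_used_pct)"` from cron every minute and forwarding anything other than ok to your team chat gives a basic early warning.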
Frequently Asked Questions
Is n8n Cloud just a managed version of this? Should I bother self-hosting at all?
n8n Cloud handles all of this for you — infrastructure, scaling, updates, and uptime. The case for self-hosting is data sovereignty and cost at volume. If your workflows process sensitive data (patient records, financial data, confidential contracts) and your organisation's data policies require it to stay within your own infrastructure, self-hosting is not optional. At high execution volumes, n8n Cloud's execution-based pricing can also become substantial — a well-configured self-hosted instance on a $96/month server handling millions of executions per month will cost significantly less than the equivalent Cloud plan. If neither of those factors applies to you, Cloud is the correct choice and everything in this article is someone else's problem.
My webhook works fine manually but times out in production traffic — why?
Manual tests run one at a time and the instance is under no other load. In production, your webhook fires while other workflows are already running — the event loop is partially occupied, and your new execution has to wait. Under concurrent load, what takes 8 seconds in isolation may take 45 seconds because the event loop is contested. The fix is either reducing concurrency (via N8N_CONCURRENCY_PRODUCTION_LIMIT) so the event loop isn't saturated when your webhook arrives, or moving to queue mode so the main instance is never burdened with execution work at all. For webhooks that must respond synchronously within a short window (under 10 seconds), queue mode is the only reliable solution — you cannot guarantee event loop availability with configuration alone.
Does queue mode affect how "Respond to Webhook" nodes work?
Yes, and this is a detail that trips up most queue mode migrations. In default mode, the "Respond to Webhook" node can return a response mid-execution because the same process that received the webhook is also executing the workflow. In queue mode, the main instance receives the webhook and immediately queues the execution to a worker — it no longer controls the execution. Synchronous webhook responses (where the caller waits for a meaningful result) must complete within N8N_DEFAULT_WEBHOOK_TIMEOUT milliseconds on the main instance. If your synchronous webhook workflow takes longer than that, queue mode forces you to restructure it: accept the webhook immediately with a job ID, execute asynchronously, and provide a status endpoint the caller can poll. This is actually a better pattern for long-running operations regardless of n8n's architecture.
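From the caller's side, the accept-then-poll pattern looks roughly like the sketch below. Everything specific in it is an assumption made for illustration: the status URL, the idea that a second n8n webhook workflow returns the job state as plain text, and the state names success, failed, and running.

```shell
# Poll a status command until it prints "success" or "failed",
# or until the attempts run out.
#   $1 max attempts, $2 delay in seconds, remaining args: status command
# Any other output (including a failed status call) counts as "still running".
poll_job() {
  local max delay attempt status
  max=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    status=$("$@") || status="error"
    case "$status" in
      success) echo "success after $attempt attempt(s)"; return 0 ;;
      failed)  echo "job failed"; return 1 ;;
    esac
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  echo "timed out after $max attempt(s)"
  return 2
}

# Hypothetical status endpoint; replace the URL with your own workflow's path.
job_status() {
  curl -fsS "https://n8n.yourdomain.com/webhook/job-status?id=$1"
}
```

A caller that received job ID abc123 from the initial webhook would run `poll_job 30 2 job_status abc123`, checking every two seconds for up to a minute. The initial request returns in milliseconds, so no proxy timeout is ever in play.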
How do I migrate from SQLite to PostgreSQL without losing execution history?
The n8n docs include an official migration path using the n8n export:workflow, export:credentials, import:workflow, and import:credentials CLI commands. Execution history itself cannot be migrated this way: it is tied to the database schema in a way that doesn't transfer cleanly between SQLite and PostgreSQL. In practice, most teams treat the migration as a clean cut: export all workflows and credentials (which are the valuable artefacts), bring up the new PostgreSQL-backed instance, import, and accept the loss of historical execution data. If historical data matters for compliance or auditing, export your execution logs through n8n's API and archive them before switching databases.
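The export and import steps can be wrapped so they are rehearsed before they are run for real. In the sketch below the n8n CLI subcommands are real, while the wrapper functions, the DRY_RUN switch, and the backup directory layout are illustrative choices. It assumes the commands run where the n8n CLI is available, for example inside the container.

```shell
# With DRY_RUN=1, print each command instead of executing it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Export all workflows and credentials from the old (SQLite) instance.
export_n8n() {
  local dir
  dir=$1
  run mkdir -p "$dir"
  run n8n export:workflow --all --output="$dir/workflows.json"
  run n8n export:credentials --all --output="$dir/credentials.json"
}

# Import them into the new (PostgreSQL-backed) instance.
import_n8n() {
  local dir
  dir=$1
  run n8n import:workflow --input="$dir/workflows.json"
  run n8n import:credentials --input="$dir/credentials.json"
}
```

Setting `DRY_RUN=1` and calling `export_n8n /backup` prints the three commands without touching anything. Exported credentials remain encrypted, so the new instance must be started with the same N8N_ENCRYPTION_KEY as the old one or the import will produce credentials it cannot read.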
What's the minimum viable setup for a data-privacy-conscious small business that can't use n8n Cloud?
A DigitalOcean 4GB General Purpose Droplet ($24/month) running Docker Compose with n8n, managed PostgreSQL (DO's managed database starts at $15/month), and Nginx reverse proxy. Total infrastructure cost: around $40–50/month. With the environment variables from Section 3 configured correctly, this setup handles around 10 concurrent executions comfortably for typical I/O-bound automation workflows, in line with the sizing table above. Add queue mode and a second worker container on the same Droplet when you start seeing editor sluggishness. Move workers to dedicated Droplets only when you're running 30+ concurrent executions regularly. This is a sensible growth path: start simple, add queue mode when you see the signals, scale workers when you need capacity.
Can I use Caddy instead of Nginx as the reverse proxy?
Yes. Caddy handles TLS automatically (no Certbot needed) and its configuration for n8n is simpler. The equivalent Caddy configuration for elevated webhook timeouts uses the transport block in the reverse proxy directive: set read_timeout 300s and write_timeout 300s within a handle /webhook/* block while leaving the root handler on shorter timeouts. Caddy is a reasonable choice for teams comfortable with its syntax; Nginx is preferred when you need fine-grained per-location control or are managing it alongside other services that already run on Nginx.
Written by
Brendan Andrew Chase
AI agent specialist and digital marketing consultant with 10+ years building automation systems for small and mid-sized businesses across the US, UK, and EU. 200+ projects delivered. Founder of Extra Large Marketing Digital, based in Rio de Janeiro.