Cluster mode
We are introducing cluster support: your app runs as systemd template units, and DollarDeploy automatically starts the necessary number of copies and stops the previous ones. This works really well with port reuse (SO_REUSEPORT) on Linux.
Normally we run your app as a single isolated systemd unit (service), which works well for simple apps and non-production workloads. Cluster mode improves performance by distributing load across multiple CPU cores: your app runs as multiple instances behind the same port, using the reuse-port support in modern Linux and other operating systems.
How it works
Modern Linux (and other operating systems) support SO_REUSEPORT — a socket option that lets multiple processes bind to the exact same port simultaneously. The kernel evenly distributes incoming connections across all of them. No additional load balancer, no glue code, no multiple ports needed. Your app just listens on its port like normal, and the OS handles the rest.
const { createServer } = require("node:http");
const server = createServer((req, res) => {
  res.writeHead(200, { "Content-Type": "text/html" });
  res.write("Hello, world!\n");
  res.end();
});
server.listen({ host: '127.0.0.1', port: 3000, reusePort: true });
This runs the server in reuse-port mode, allowing multiple copies of the app to listen on the same port.
DollarDeploy pairs this with systemd template units — a single unit file (app@.service) that systemd instantiates as many times as you need (app@1.service, app@2.service, …). Each instance gets the same environment and configuration. When you push a new deploy, we start the new generation of instances first, wait for all of them to be active, then stop the previous ones. At no point is the port unhandled.
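The generation swap can be sketched as a simple numbering scheme. This is illustrative only; the function and the ever-increasing instance numbers are assumptions for the sketch, not necessarily DollarDeploy's actual naming logic:

```python
def next_generation(active: set[int], nodes: int) -> list[int]:
    """Pick instance numbers for the next deploy generation.

    With app@1 and app@2 active and nodes=2, the next generation is
    app@3 and app@4: start them, wait until active, then stop 1 and 2.
    """
    start = max(active, default=0) + 1
    return list(range(start, start + nodes))

print(next_generation({1, 2}, 2))  # [3, 4]
```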
Node.js 22+ supports SO_REUSEPORT with the single option shown above. Python, Java, Go and other languages and frameworks have similar options.
Configuration
Enable cluster mode in your app's environment variables:
SYSTEMD_CLUSTER_MODE=1
The default is 0 (single service, the existing behavior). Setting it to 1 switches to template unit mode. Requires SYSTEMD_CLUSTER_NODES to be at least 1.
SYSTEMD_CLUSTER_NODES=2
The default is 2. This controls how many instances run in parallel. All instances share the same configuration and environment variables — the OS automatically distributes traffic between them. A good starting point is one instance per CPU core, but for I/O-bound apps two to four is often plenty regardless of core count.
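The sizing advice can be written down as a small heuristic. This is a sketch, not DollarDeploy's logic; the io_bound flag and the cap of four are assumptions taken from the paragraph above:

```python
import os

def suggested_cluster_nodes(io_bound: bool = False) -> int:
    """Starting point for SYSTEMD_CLUSTER_NODES: one instance per core
    for CPU-bound apps, capped at four for I/O-bound ones."""
    cores = os.cpu_count() or 2  # fall back to 2 if the count is unknown
    return min(cores, 4) if io_bound else cores

print(suggested_cluster_nodes())               # e.g. 8 on an 8-core machine
print(suggested_cluster_nodes(io_bound=True))  # at most 4
```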
Why not just use a reverse proxy?
The traditional alternative to running multiple processes is configuring Nginx to proxy to several servers on predetermined ports. With SO_REUSEPORT there is one single port: any number of app instances listen on it, and Nginx (if you use one at all) needs no per-instance upstream configuration. There are no extra ports to allocate and no process manager to maintain.
What the kernel actually does with multiple listeners
When two or more processes bind to the same port with SO_REUSEPORT, the kernel doesn't round-robin connections between them. It hashes the connection's 4-tuple — source IP, source port, destination IP, destination port — and uses that hash to assign it to one specific listener. Each connection therefore stays pinned to a single process for its entire lifetime; a client's next connection may land on a different instance, since its ephemeral source port changes.
This matters because it keeps related data warm in that CPU core's L1/L2 cache. Round-robin would bounce connections between cores constantly, causing cache misses on every packet. The hash approach means each core mostly talks to its own set of connections without stepping on others.
Each process gets its own dedicated accept queue. There's no shared lock, no thundering herd problem, no cores racing each other for the right to call accept(). This is perfect for CPU-bound workloads: instead of one core at 100% and the rest idle, you get even distribution across all of them.
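On Linux you can watch this distribution happen with two raw sockets in a single process. This Python sketch stands in for two app instances; it requires SO_REUSEPORT (Linux 3.9+):

```python
import selectors
import socket

def reuseport_listener(port: int) -> socket.socket:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_REUSEPORT must be set before bind() on every socket in the group
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.listen(64)
    return s

first = reuseport_listener(0)        # let the OS pick a free port
port = first.getsockname()[1]
second = reuseport_listener(port)    # joins the same listener group

# Each client connection uses a distinct ephemeral source port, so the
# kernel's 4-tuple hash spreads the connections across both listeners.
clients = [socket.create_connection(("127.0.0.1", port)) for _ in range(40)]

sel = selectors.DefaultSelector()
sel.register(first, selectors.EVENT_READ, "first")
sel.register(second, selectors.EVENT_READ, "second")

counts = {"first": 0, "second": 0}
while sum(counts.values()) < 40:
    for key, _ in sel.select(timeout=2):
        conn, _addr = key.fileobj.accept()
        counts[key.data] += 1
        conn.close()

print(counts)  # roughly even, e.g. {'first': 19, 'second': 21}
for c in clients:
    c.close()
```

Note that neither listener ever sees the other's connections: each accept() call drains only its own queue.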
How zero-downtime deploys work
When you redeploy, DollarDeploy starts the new instances first. As SO_REUSEPORT allows multiple processes to bind the same port simultaneously, new instances join the kernel's listener group immediately — the port is never closed. For a brief window, both old and new instances are accepting connections. Once all new instances are active, the old ones stop accepting new connections while finishing any in-flight requests or tasks. The kernel automatically stops routing to a socket once it's closed. No dropped packets, no disconnections, nothing visible to the client.
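The never-closed-port property is easy to verify with raw sockets. A Python sketch of the mechanism (not DollarDeploy's deploy code; Linux only):

```python
import socket

def reuseport_listener(port: int) -> socket.socket:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.listen(16)
    return s

old_gen = reuseport_listener(0)      # "old" instance; OS picks the port
port = old_gen.getsockname()[1]
new_gen = reuseport_listener(port)   # new generation joins the group first

old_gen.close()                      # ...then the old generation leaves

# The port was never unbound, so a new connection still succeeds and the
# kernel routes it to the remaining listener.
client = socket.create_connection(("127.0.0.1", port))
conn, _addr = new_gen.accept()
peer = conn.getpeername()
print("still accepting on port", port)
conn.close(); client.close(); new_gen.close()
```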
Enabling it in your app
Most runtimes support SO_REUSEPORT with little or no code change.
Node.js — if you're using the cluster module you're already close, but for cluster mode with DollarDeploy you don't need cluster at all. Each systemd instance is a separate process; just call server.listen({ reusePort: true }).
Python (Gunicorn) — pass --reuse-port:
> gunicorn app:app --bind 0.0.0.0:8000 --reuse-port
Python (raw socket) — set the option before bind():
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
Go — the standard net.Listen doesn't expose it directly; use net.ListenConfig with a Control function to call setsockopt on the raw file descriptor before binding.
uWSGI — add --reuse-port to your config. Each worker process started by systemd will bind independently.
If your framework or runtime doesn't support it explicitly, you'll get an "Address already in use" error when the second instance starts. Check your runtime's docs for a reuse_port, --reuse-port, or SO_REUSEPORT option — it's almost universally available in anything built in the last five years.
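That failure mode is easy to reproduce in Python: without SO_REUSEPORT, a second bind to an occupied port fails immediately:

```python
import errno
import socket

first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))           # note: no SO_REUSEPORT set
first.listen(1)
port = first.getsockname()[1]

second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
err = None
try:
    second.bind(("127.0.0.1", port))   # a second "instance" without the option
except OSError as e:
    err = e
print(err)  # on Linux: [Errno 98] Address already in use
second.close()
first.close()
```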
Caveats worth knowing
Uneven load with long-lived connections. The kernel's hash distributes connections, not request volume. If some clients hold connections open for a long time (WebSockets, HTTP/2, long-polling), one instance can end up holding several heavy connections while another is mostly idle. If that's your workload, cluster mode still helps but you may want to think about connection limits per instance.
Scaling down drops queued connections. When an instance stops and closes its socket, any connections sitting in that instance's accept queue that hadn't been accept()'d yet get a RST. DollarDeploy handles this by waiting for the new instances to be fully active before stopping old ones — old instances are never stopped mid-request — but it's worth knowing if you're managing the lifecycle yourself.
OS support. SO_REUSEPORT has been in Linux since kernel 3.9 (2013), so any modern server you'd actually deploy to has it. BSD has had its own variant for longer, though with slightly different semantics. Windows and macOS do not support it in the same way. This is a non-issue for DollarDeploy since all deployments run on Linux.
Stateless apps. Your app should be stateless and keep no internal state in application memory, because an instance can be restarted and a user's next request can be routed to a different instance. Use Redis or another cache or database to store state and read it when processing a request. This also matters for WebSockets and HTTP streaming: make sure the client can handle reconnects and that stream state is available to the other cluster instances.
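A sketch of the idea in Python, using a plain dict as a stand-in for Redis. In a real cluster each instance is a separate process, so an in-memory dict would not actually be shared; a real Redis client would be:

```python
# Stand-in for a shared store such as Redis. In production this must be
# an external store that ALL instances can reach, never process memory.
shared_store: dict[str, int] = {}

def handle_request(instance_id: int, session_id: str) -> str:
    """Any instance can serve any request: session state lives in the
    shared store, not in the process that happened to get the connection."""
    visits = shared_store.get(session_id, 0) + 1
    shared_store[session_id] = visits
    return f"instance {instance_id}: visit {visits}"

# The kernel may route the same client to different instances across
# reconnects; the visit count survives the hop because it is shared.
print(handle_request(1, "sess-abc"))  # instance 1: visit 1
print(handle_request(2, "sess-abc"))  # instance 2: visit 2
```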
Sample systemd configuration and code
Create the following systemd unit file, adjusting the paths to where your app and Node.js are located:
# /etc/systemd/system/myapp@.service
[Unit]
Description=My Node.js App (instance %i)
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=ubuntu
Group=ubuntu
WorkingDirectory=/home/ubuntu/myapp
# All instances share the same port — SO_REUSEPORT lets the kernel distribute connections
Environment=NODE_ENV=production
Environment=PORT=3000
ExecStart=/usr/bin/node /home/ubuntu/myapp/index.js
Restart=on-failure
RestartSec=5s
# Give the app time to finish in-flight requests on shutdown
TimeoutStopSec=30
KillMode=mixed
KillSignal=SIGTERM
# Basic hardening
NoNewPrivileges=yes
ProtectSystem=strict
PrivateTmp=yes
ReadWritePaths=/home/ubuntu/myapp /var/log/myapp
[Install]
WantedBy=multi-user.target
Here is a minimal index.js that correctly enables SO_REUSEPORT:
// index.js
const http = require('http')
const PORT = process.env.PORT || 3000
const server = http.createServer((req, res) => {
res.writeHead(200)
res.end(`hello from pid ${process.pid}\n`)
})
// Listen with SO_REUSEPORT set
server.listen({
host: "127.0.0.1",
port: PORT,
reusePort: true
}, () => {
console.log(`[pid ${process.pid}] listening on :${PORT}`)
})
// Graceful shutdown — finish in-flight requests before exiting
process.on('SIGTERM', () => {
server.close(() => process.exit(0))
})
How to create new instances and stop previous ones:
# Start two instances
sudo systemctl start myapp@1.service myapp@2.service
# Check both are up
sudo systemctl status "myapp@*.service"
# Tail logs from all instances merged
sudo journalctl -u "myapp@*.service" -f
# Rolling deploy: start new, wait, stop old
sudo systemctl start myapp@3.service myapp@4.service
# ... wait for health ...
sudo systemctl stop myapp@1.service myapp@2.service
sudo systemctl reset-failed "myapp@1.service" "myapp@2.service"
The only thing that makes this a cluster rather than a regular unit is the @ in the systemd unit filename. The PORT is identical across all instances — the kernel handles distribution via SO_REUSEPORT transparently. The %i in Description= is just for systemctl status readability so you can tell instances apart in the output.
When to use it
Cluster mode is useful if you're:
- Running a CPU-bound app and want to distribute work across multiple cores without a complicated process manager inside your app
- Deploying frequently and need zero-downtime deploys
As always, the existing single-service mode remains the default — cluster mode is opt-in. Give it a try and let us know how it goes.