Vertical scaling (same server)

Add more workers

Each worker handles one call at a time. To increase capacity, add more worker instances:
  1. Create and start the new systemd instances:
# Enable and start workers 17-20
for i in $(seq 17 20); do
  systemctl enable "voxcore@$i"
  systemctl start "voxcore@$i"
done
  2. Add the new sockets to the nginx upstream:
upstream voxcore_workers {
    least_conn;
    server unix:/tmp/voxcore_1.sock max_conns=1 fail_timeout=10;
    ...
    server unix:/tmp/voxcore_16.sock max_conns=1 fail_timeout=10;
    # New workers
    server unix:/tmp/voxcore_17.sock max_conns=1 fail_timeout=10;
    server unix:/tmp/voxcore_18.sock max_conns=1 fail_timeout=10;
    server unix:/tmp/voxcore_19.sock max_conns=1 fail_timeout=10;
    server unix:/tmp/voxcore_20.sock max_conns=1 fail_timeout=10;
}
  3. Reload nginx and update .env:
nginx -t && systemctl reload nginx
# Update VOXCORE_WORKERS in .env to match (used by health endpoint)
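The new server lines in the upstream block follow a fixed pattern, so they can be generated rather than typed by hand. The following is a minimal sketch; the helper name upstream_lines is hypothetical and not part of VoxCore. It only prints lines, which you would still paste into the upstream block yourself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one nginx "server" line per worker socket
# for workers FIRST..LAST, using the socket path and parameters from the
# upstream block in this section.
upstream_lines() {
  local first="$1" last="$2" i
  for i in $(seq "$first" "$last"); do
    printf '    server unix:/tmp/voxcore_%d.sock max_conns=1 fail_timeout=10;\n' "$i"
  done
}
```

Calling upstream_lines 17 20 prints the four lines for workers 17 through 20. Run nginx -t after editing, as in the reload step.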
Memory budget: each idle worker uses ~260 MB. On an 8 GB server, 16 workers is the safe maximum (16 × 260 MB ≈ 4 GB idle baseline, with the rest as headroom for active calls). Add more RAM before adding more workers.
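The budget arithmetic can be checked before you enable more workers. This is a rough sketch using the ~260 MB idle figure above. The function names are hypothetical, and the headroom figure is only total RAM minus the idle baseline, so it still has to cover the OS, nginx, and active calls.

```shell
#!/usr/bin/env bash
# Idle memory baseline in MB: WORKERS x PER_WORKER_MB (default 260).
idle_baseline_mb() {
  echo $(( $1 * ${2:-260} ))
}

# RAM in MB left after the idle baseline: TOTAL_MB - baseline(WORKERS).
headroom_mb() {
  echo $(( $1 - $(idle_baseline_mb "$2" "${3:-260}") ))
}
```

For example, idle_baseline_mb 16 gives 4160, and headroom_mb 8192 16 gives 4032. With 20 workers on the same 8192 MB, the idle baseline rises to 5200 MB and the remainder drops to 2992 MB.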

Horizontal scaling (multiple servers)

Add a new fleet server

Adding a fleet touches two independent paths — inbound and outbound are wired separately. Do both, or the new fleet only serves half of your traffic.
1. Deploy VoxCore

Set up VoxCore on the new server with the same code, .env, and systemd template. Start 16 workers. Confirm https://<new-host>/health/fleet responds before continuing.
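The health confirmation can be scripted so that the later steps do not run against a server that is not ready. This is a minimal sketch, assuming that any successful HTTP response from /health/fleet counts as healthy. The response body format is not described here, so it is not inspected. The function name and its defaults are hypothetical.

```shell
#!/usr/bin/env bash
# Poll https://<host>/health/fleet until it answers with a success status.
# Arguments: HOST [ATTEMPTS=30] [DELAY_SECONDS=2]
wait_for_fleet_health() {
  local host="$1" attempts="${2:-30}" delay="${3:-2}" n
  for n in $(seq 1 "$attempts"); do
    # -f: fail on HTTP status >= 400; -sS: silent, but still print errors
    if curl -fsS --max-time 5 -o /dev/null "https://${host}/health/fleet"; then
      echo "fleet health OK on ${host} (attempt ${n})"
      return 0
    fi
    sleep "$delay"
  done
  echo "fleet health check failed on ${host} after ${attempts} attempts" >&2
  return 1
}
```

A typical call would be wait_for_fleet_health <new-host> && echo "continue with nginx setup".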
2. Configure fleet nginx

Set up nginx with the same Unix socket upstream pattern (max_conns=1 per socket), the 429-retry blocks for /attach, /livekit/dialout, /livekit/widget, and SSL.
3. Inbound (path 1): register with the WSS ingress

If inbound WebSocket calls (iCallMate, Exotel) are fronted by the HAProxy ingress, run add-fleet.sh <new-host> on the ingress host. It clones the last server line, validates the config with haproxy -c, reloads with zero downtime, and rolls back automatically if the new backend does not come UP. Do not hand-edit HAProxy server lines. See WSS ingress.
4. Outbound (path 2): register in VoxBridge

Add the new server URL to the VoxBridge system settings (the voxcore_fleet setting on the VoxUI Settings page) so that outbound calls are distributed to it. pick_fleet_server() health-checks the list on every dialout, so the new server is picked up immediately, with no restart and no code change.
VoxBridge distributes outbound calls across fleet servers using health checks — it picks the server with the most available capacity.
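To illustrate the "most available capacity" rule, here is a small sketch of the selection step. It is not VoxBridge's pick_fleet_server() implementation. The input format (one "url available_slots" pair per line) and the function name are assumptions made for the example, since the health response format is not specified here.

```shell
#!/usr/bin/env bash
# Illustrative only: read "url available_slots" pairs on stdin and print
# the url with the most free slots. Lines with 0 or a non-numeric value
# are skipped (treated as unhealthy). On ties, the first server wins.
# Exits non-zero when no server has capacity.
pick_most_available() {
  awk '$2 ~ /^[0-9]+$/ && $2 + 0 > best { best = $2 + 0; url = $1 }
       END { if (url == "") exit 1; print url }'
}
```

Given the input lines "https://a 3", "https://b 7", and "https://c 0", the function prints https://b.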
The HAProxy WSS ingress is a single point of failure until a second ingress + managed LB/VRRP is added. Plan the second ingress before crossing ~3 fleets. The ingress is WSS-only, so the VoxBridge voxcore_fleet registration (path 2) is what makes a new fleet usable for outbound calls regardless of the ingress.

Zero-downtime restarts

Restart workers one at a time. While one restarts, the other 15 continue serving:
for i in $(seq 1 16); do
  systemctl restart "voxcore@$i"
  sleep 5
done
Active calls on the restarting worker will be dropped. Schedule restarts during low-traffic periods.
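A fixed sleep does not catch a worker that fails to come back. The variant below is a sketch that checks systemctl is-active after each restart and stops the rollout on the first worker that does not recover, so the remaining workers keep serving. The function name and the 30-second default are hypothetical. An active unit is also not proof that the worker's socket is accepting calls.

```shell
#!/usr/bin/env bash
# Restart voxcore@FIRST..LAST one at a time and stop at the first failure.
# Arguments: [FIRST=1] [LAST=16] [TIMEOUT_SECONDS=30]
rolling_restart() {
  local first="${1:-1}" last="${2:-16}" timeout="${3:-30}" i waited
  for i in $(seq "$first" "$last"); do
    systemctl restart "voxcore@$i" || return 1
    waited=0
    # Wait for the unit to report active before touching the next worker.
    until systemctl is-active --quiet "voxcore@$i"; do
      if [ "$waited" -ge "$timeout" ]; then
        echo "voxcore@$i did not become active within ${timeout}s; stopping" >&2
        return 1
      fi
      sleep 1
      waited=$(( waited + 1 ))
    done
    echo "voxcore@$i active"
  done
}
```

Run it as rolling_restart 1 16. Calls on the worker being restarted are still dropped, so the low-traffic scheduling advice still applies.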