Principle

Dockerization should first change deployment mechanics, not product behavior. The first target is not a full rewrite, not Kubernetes, and not a new call router. The first target is:
  • same APIs
  • same runtime config flow
  • same call lifecycle
  • same post-call result flow
  • same one-call-per-worker capacity model
  • repeatable container images

Phase 1: local Docker Compose

Purpose: make every repo runnable with one command for development and staging smoke tests. Services:
Service           Image/process
voxbridge         FastAPI app
voxui             Vite build served by nginx/Caddy
voxcore-worker    One VoxCore worker process
voxdialler        Campaign loop + health server
mongo             Local development MongoDB
redis             Local development Redis
minio             Local recording storage
This phase should not try to model 1K channels. It should make integration testing cheap.
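
As a rough illustration of the service list above, the sketch below builds a Compose-style definition in Python and prints it as JSON. Only the service names and MAX_CONCURRENT_CALLS come from this plan. The build paths, image tags, and the MinIO command are assumptions chosen for illustration, and the real repos may need ports, volumes, and env files that are not shown. JSON is valid YAML, so the output should be usable as a Compose file, but treat that as something to verify rather than a tested setup.

```python
import json

# Service names follow the Phase 1 list; every build path and image tag
# below is a hypothetical placeholder, not the project's real layout.
SERVICES = {
    "voxbridge": {"build": "./voxbridge"},
    "voxui": {"build": "./voxui"},
    "voxcore-worker": {
        "build": "./voxcore",
        # One call per worker, matching the capacity model in this plan.
        "environment": {"MAX_CONCURRENT_CALLS": "1"},
    },
    "voxdialler": {"build": "./voxdialler"},
    "mongo": {"image": "mongo:7"},
    "redis": {"image": "redis:7"},
    "minio": {"image": "minio/minio", "command": "server /data"},
}


def render_compose(services):
    """Return a Compose-style document as a JSON string."""
    return json.dumps({"services": services}, indent=2)


if __name__ == "__main__":
    print(render_compose(SERVICES))
```

Writing the output to a file and starting it with a single docker compose command is the "one command" goal of this phase.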

Phase 2: cheapest production Docker host

Purpose: replace systemd app management on a VPS with container management. Recommended host shape:
fleet host:
  haproxy/nginx container
  voxcore-worker-1
  voxcore-worker-2
  ...
  voxcore-worker-N
Each worker container runs MAX_CONCURRENT_CALLS=1. The local LB sets maxconn 1 per backend server, so a busy worker is never handed a second call. The public URLs remain:
wss://fleet.client.com/ws/{bot_id}
https://fleet.client.com/livekit/dialout
https://fleet.client.com/attach
Part of this “future” ingress scaffolding already exists in production. Aetherix fronts two fleets behind an HAProxy ingress at calls.vohci.com, generated by render.sh and extended by add-fleet.sh. The templated, brand-portable HAProxy config is therefore already battle-tested, and the remaining step for Phase 2 is the containerized per-host LB, not the HAProxy concept itself. See Fleet capacity and WSS ingress.
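
To make the "maxconn 1 per backend" rule concrete, here is a minimal sketch that renders an HAProxy backend stanza for N worker containers. It is not the output of render.sh, which is not reproduced here. The backend name, the worker port 8000, and the use of container names as hostnames are assumptions.

```python
def render_backend(worker_count, port=8000, name="voxcore_workers"):
    """Render an HAProxy backend with one server line per worker container.

    The port and backend name are hypothetical; adjust to the real fleet.
    """
    if worker_count < 1:
        raise ValueError("worker_count must be at least 1")
    lines = [
        f"backend {name}",
        # Prefer the server with the fewest open connections.
        "    balance leastconn",
    ]
    for i in range(1, worker_count + 1):
        host = f"voxcore-worker-{i}"
        # maxconn 1 keeps each worker at one concurrent call.
        lines.append(f"    server {host} {host}:{port} check maxconn 1")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_backend(3))
```

With maxconn 1 on every server line, HAProxy holds or rejects extra connections instead of doubling up on a worker, which matches MAX_CONCURRENT_CALLS=1 inside the container.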

Phase 3: image registry and scripted deploys

Purpose: stop rsyncing code. Workflow:
  1. Build images for voxcore, voxbridge, voxui, and voxdialler.
  2. Push to registry.
  3. Pull on target servers.
  4. Restart containers with health checks.
  5. Roll back by pinning a previous image tag.
Image tags should include the git SHA, which identifies the exact build, plus a moving environment alias such as staging or production.
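
One possible tagging scheme is sketched below: each build gets an immutable SHA tag and a moving environment alias, and a rollback pins the SHA tag of an earlier build. The registry hostname registry.example.com, the 12-character SHA prefix, and the function names are assumptions for illustration.

```python
import re

ENVIRONMENTS = ("staging", "production")


def image_tags(registry, service, git_sha, environment):
    """Return [immutable SHA tag, moving environment alias] for one image."""
    if not re.fullmatch(r"[0-9a-f]{7,40}", git_sha):
        raise ValueError(f"not a git SHA: {git_sha!r}")
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    repo = f"{registry}/{service}"
    return [f"{repo}:{git_sha[:12]}", f"{repo}:{environment}"]


def rollback_ref(registry, service, previous_sha):
    """Return the pinned reference to deploy when rolling back."""
    return image_tags(registry, service, previous_sha, "production")[0]


if __name__ == "__main__":
    sha = "3f9c2a7d1b4e5f60718293a4b5c6d7e8f9012345"
    for tag in image_tags("registry.example.com", "voxcore", sha, "staging"):
        print(tag)
```

Deploying by SHA tag and treating the alias only as a pointer keeps rollbacks deterministic, since the alias may have moved by the time a rollback is needed.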

Phase 4: capacity registry

Purpose: prepare for hundreds to thousands of call slots. Replace “poll every fleet URL” with a central registry. The registry must support:
  • worker registration with TTL
  • atomic slot reservation
  • reservation expiry if attach/start fails
  • drain mode for deploys
  • per-client or per-campaign quotas later
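
The first four requirements above can be sketched as a small in-memory registry. This is an illustration of the semantics only, assuming one slot per worker. A real registry would need shared storage so that several routers see the same state. All class, method, and parameter names are hypothetical, and quotas are left out.

```python
import threading
import time
import uuid


class CapacityRegistry:
    """In-memory sketch: one call slot per registered worker."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._lock = threading.Lock()
        self._workers = {}       # worker_id -> {"expires", "draining"}
        self._reservations = {}  # worker_id -> {"token", "expires", "active"}

    def register(self, worker_id, ttl=15.0):
        """Register or heartbeat a worker; it lapses after ttl seconds."""
        with self._lock:
            entry = self._workers.setdefault(worker_id, {"draining": False})
            entry["expires"] = self._clock() + ttl

    def set_draining(self, worker_id, draining=True):
        """Stop handing out new calls to a worker, e.g. before a deploy."""
        with self._lock:
            if worker_id in self._workers:
                self._workers[worker_id]["draining"] = draining

    def reserve(self, hold_ttl=10.0):
        """Atomically claim a free slot; return (worker_id, token) or None."""
        with self._lock:
            now = self._clock()
            for worker_id, worker in self._workers.items():
                if worker["expires"] <= now or worker["draining"]:
                    continue
                held = self._reservations.get(worker_id)
                if held and (held["active"] or held["expires"] > now):
                    continue
                token = uuid.uuid4().hex
                self._reservations[worker_id] = {
                    "token": token, "expires": now + hold_ttl, "active": False,
                }
                return worker_id, token
            return None

    def confirm(self, worker_id, token):
        """Mark the hold as a live call once attach/start succeeded."""
        with self._lock:
            held = self._reservations.get(worker_id)
            if not held or held["token"] != token:
                return False
            if not held["active"] and held["expires"] <= self._clock():
                return False  # hold lapsed before attach/start finished
            held["active"] = True
            return True

    def release(self, worker_id, token):
        """Free the slot when the call ends."""
        with self._lock:
            held = self._reservations.get(worker_id)
            if held and held["token"] == token:
                del self._reservations[worker_id]
```

A caller would reserve a slot, attempt attach/start against that worker, then call confirm on success or simply let the hold expire on failure. The sketch does not reclaim slots from workers that die mid-call, which a production registry would have to handle.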

Phase 5: orchestration

Only introduce Kubernetes/Nomad/RKE2 when the operational pain justifies it:
  • many fleet hosts
  • repeated rolling deploys
  • need for autoscaling
  • centralized service discovery
  • stronger isolation between tenants
For cost-sensitive deployments in India, managed Kubernetes on hyperscalers is not automatically the cheapest path. Containers on negotiated VPS or bare-metal capacity may be cheaper until scale and operational load justify orchestration.