Platform overview

Vox is split into five working repos. The important architectural boundary is that VoxCore runs calls, while VoxBridge owns durable business state.

| Repo | Plane | Owns |
| --- | --- | --- |
| voxcore | Runtime/media plane | Live call workers, transport adapters, Pipecat pipeline, recording upload, post-call packaging, result delivery, Agent Desk handoff hooks |
| voxbridge | Control plane | Auth, bots, campaigns, system settings, runtime config, call records, CRM integrations, API keys, recordings, fleet selection |
| voxui | Operator surface | Bot builder, campaign builder, call logs, analytics, settings, Agent Desk, knowledge bases |
| voxdialler | Campaign execution plane | Campaign leases, pacing, retry scheduling, LiveKit SIP dialing, AMD screening, answered-call attach |
| vohci-widget | Embeddable web surface | Browser call widget (React + LiveKit client) that joins a LiveKit room handled by VoxCore via /livekit/widget |

VoxCore runtime architecture

VoxCore is a FastAPI app running multiple single-process uvicorn workers behind nginx. Each production worker is one call slot. Production settings:

| Setting | Current production intent |
| --- | --- |
| MAX_CONCURRENT_CALLS | 1 per worker |
| VOXCORE_WORKERS | Actual worker count reported in health, commonly 16 on 8 GB / 4 CPU hosts |
| nginx max_conns=1 | Prevents a busy worker socket from receiving another call |
| uvicorn --workers | Always 1; systemd instance count is the worker count |

This model is intentionally simple: one stuck/crashed worker can drop one call, not a batch of calls.
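The one-call-per-worker rule can be pictured as a non-blocking slot guard inside each worker. The sketch below is an illustration only: `CallSlots` and its methods are hypothetical names, not VoxCore's actual code, and it assumes a full worker rejects a second call immediately rather than queueing it.

```python
import threading


class CallSlots:
    """Hypothetical capacity guard: a full worker answers 'busy' instead of queueing."""

    def __init__(self, max_concurrent_calls: int = 1) -> None:
        self._max = max_concurrent_calls
        self._active = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        """Claim a slot if one is free; never waits."""
        with self._lock:
            if self._active >= self._max:
                return False
            self._active += 1
            return True

    def release(self) -> None:
        """Free a slot when the call ends (intended for a finally block)."""
        with self._lock:
            if self._active > 0:
                self._active -= 1

    @property
    def available(self) -> int:
        with self._lock:
            return self._max - self._active


slots = CallSlots(max_concurrent_calls=1)  # mirrors MAX_CONCURRENT_CALLS=1
first = slots.try_acquire()    # the worker takes the call
second = slots.try_acquire()   # the worker is busy; the caller should go elsewhere
slots.release()                # call ended, slot is free again
```

With one slot per process, nginx `max_conns=1` and the in-app guard act as two layers of the same limit: the proxy avoids sending a second call, and the app refuses one if it arrives anyway.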

Call entry points

| Entry point | Route | Used by | Notes |
| --- | --- | --- | --- |
| iCallMate WebSocket | wss://fleet.example.com/ws/{bot_id} | iCallMate dialler | 8 kHz LINEAR16 |
| Exotel Voicebot | wss://fleet.example.com/exotel/{bot_id} | Exotel App Bazaar | 8 kHz LINEAR16, built-in Pipecat serializer |
| LiveKit SIP inbound | POST /livekit/dispatch | LiveKit SIP webhook | DID -> LiveKit room -> bot |
| LiveKit SIP outbound | POST /livekit/dialout | VoxBridge direct dialout | VoxCore creates room and dials SIP participant |
| Campaign attach | POST /attach | VoxDialler | Dialler creates/screens SIP call, then VoxCore joins answered room |
| Widget call | POST /livekit/widget | VoxUI/web widget | Uses LiveKit room path |

All call paths converge on build_and_run_pipeline() in src/voxcore/pipeline/factory.py and run_post_call() in src/voxcore/routes/_post_call.py.
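The convergence described above might look roughly like the following. This is a sketch under stated assumptions: the two functions are stand-ins with invented bodies, and `CallContext` and `handle_call` are hypothetical names. Whether the real code runs post-call work after a pipeline failure is also an assumption made here for illustration.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class CallContext:
    """Hypothetical per-call context shared by every entry point."""
    bot_id: str
    transport: str                      # e.g. "exotel", "livekit", "widget"
    events: list = field(default_factory=list)


async def build_and_run_pipeline(ctx: CallContext) -> None:
    # Stand-in for the real function in src/voxcore/pipeline/factory.py.
    ctx.events.append("pipeline")
    if ctx.bot_id == "broken":
        raise RuntimeError("simulated mid-call failure")


async def run_post_call(ctx: CallContext) -> None:
    # Stand-in for the real function in src/voxcore/routes/_post_call.py.
    ctx.events.append("post_call")


async def handle_call(ctx: CallContext) -> None:
    """Shared tail that each route could delegate to (assumed shape)."""
    try:
        await build_and_run_pipeline(ctx)
    finally:
        # Assumption: post-call packaging runs even if the pipeline raised.
        await run_post_call(ctx)


ok = CallContext(bot_id="bot-1", transport="exotel")
asyncio.run(handle_call(ok))

failed = CallContext(bot_id="broken", transport="livekit")
try:
    asyncio.run(handle_call(failed))
except RuntimeError:
    pass  # the error still propagates after post-call work has run
```

Keeping the transport-specific code in the route and everything else behind one shared tail means a new entry point only has to build its context and transport.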

Pipeline

Current provider families:

| Layer | Providers |
| --- | --- |
| STT | Deepgram Nova, Deepgram Flux, Soniox |
| LLM | Gemini via Google AI, OpenAI, Google Vertex AI |
| TTS | ElevenLabs, Sarvam, Soniox |
| VAD / turns | Silero ONNX, SmartTurnV3 ONNX, or Flux external turn detection |

Built-in tools include end_call, transfer_call, detected_voicemail, and search_knowledge. The search_knowledge tool queries the bot’s Knowledge Base, which VoxBridge backs with Qdrant vector search over uploaded, chunked, and embedded documents. Custom HTTP tools run through the tool runtime and emit telemetry events.

When live prompt caching is active, VoxCore serves the static policy/system prompt from a Gemini/Vertex CachedContent entry, or uses OpenAI prompt_cache_key hints. In that mode factory.py passes tools=NOT_GIVEN and folds both the tools and the system instruction into the cached content, because Gemini rejects per-request tools combined with cached_content.
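The caching rule can be illustrated with a small request builder. This is a minimal sketch, not the code in factory.py: `build_llm_request` is a hypothetical helper, the `NOT_GIVEN` sentinel is defined locally to stand in for the SDK's, and the cache entry name is made up.

```python
from typing import Any, Optional


class _NotGiven:
    """Local sentinel meaning 'omit this field'; stands in for an SDK's NOT_GIVEN."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


def build_llm_request(
    system_instruction: str,
    tools: list[dict[str, Any]],
    cached_content: Optional[str] = None,
) -> dict[str, Any]:
    """Return per-request fields, omitting anything already held in the cache entry."""
    if cached_content is not None:
        # Tools and system instruction live inside the cached content,
        # so they must not be sent again with the request.
        fields = {
            "cached_content": cached_content,
            "tools": NOT_GIVEN,
            "system_instruction": NOT_GIVEN,
        }
    else:
        fields = {"tools": tools, "system_instruction": system_instruction}
    # Strip sentinel values so they never reach the wire.
    return {k: v for k, v in fields.items() if v is not NOT_GIVEN}


tools = [{"name": "end_call"}, {"name": "search_knowledge"}]
uncached = build_llm_request("You are a support agent.", tools)
cached = build_llm_request(
    "You are a support agent.", tools, cached_content="cachedContents/example"
)
```

One consequence of this design is that a change to a bot's tools or system prompt requires a new cache entry, since neither can be overridden per request while the cache is in use.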

Outbound and campaign flows

Direct outbound calls and campaign calls are intentionally handled differently. For direct outbound calls, VoxBridge currently polls the configured voxcore_fleet URLs and picks the server with the most available capacity. Campaign calls go through VoxDialler, which performs pacing and attach orchestration separately.
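The "most available capacity" choice for direct outbound calls can be sketched as below. The `HostHealth` fields, the `pick_host` function, and the host URLs are assumptions for illustration; the actual health payload and selection code in VoxBridge may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HostHealth:
    """Hypothetical snapshot of one fleet host's health response."""
    url: str
    workers: int        # total call slots reported by the host
    active_calls: int   # slots currently in use
    reachable: bool = True

    @property
    def available(self) -> int:
        return max(self.workers - self.active_calls, 0)


def pick_host(reports: list[HostHealth]) -> Optional[HostHealth]:
    """Return the reachable host with the most free slots, or None if none has room."""
    candidates = [r for r in reports if r.reachable and r.available > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.available)


fleet = [
    HostHealth("https://core-a.example.com", workers=16, active_calls=15),
    HostHealth("https://core-b.example.com", workers=16, active_calls=4),
    HostHealth("https://core-c.example.com", workers=16, active_calls=0, reachable=False),
]
chosen = pick_host(fleet)
```

A polled snapshot can be stale by the time the call is placed, so two dispatchers could pick the same free slot. The selected host still has to be able to answer "busy", and the caller has to be ready to retry elsewhere.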

State boundaries

VoxCore should be treated as mostly stateless call execution:
  • It does not store bots, users, campaigns, analytics, or CRM config.
  • It fetches runtime config at call start.
  • It uploads recordings to configured S3-compatible storage.
  • It sends final results to VoxBridge through a durable result outbox.
  • It persists Agent Desk handoff enqueue attempts through a separate outbox.
VoxBridge is the source of truth for product state. MongoDB contains durable records; Redis is used for runtime/cache concerns; Qdrant stores knowledge-base vectors.
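The durable outbox idea in the list above can be sketched with a small SQLite-backed queue. This is an assumed design for illustration only: the storage engine, table layout, and the `ResultOutbox` class are hypothetical and say nothing about how VoxCore's outbox is actually implemented.

```python
import json
import sqlite3
from typing import Callable


class ResultOutbox:
    """Persist results first, deliver later; rows are deleted only after success."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "payload TEXT NOT NULL, "
            "attempts INTEGER NOT NULL DEFAULT 0)"
        )
        self._db.commit()

    def enqueue(self, result: dict) -> None:
        self._db.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(result),))
        self._db.commit()

    def flush(self, deliver: Callable[[dict], bool]) -> int:
        """Try each pending row once, oldest first; return how many were delivered."""
        delivered = 0
        rows = self._db.execute("SELECT id, payload FROM outbox ORDER BY id").fetchall()
        for row_id, payload in rows:
            try:
                ok = deliver(json.loads(payload))
            except Exception:
                ok = False  # treat transport errors as "retry later"
            if ok:
                self._db.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
                delivered += 1
            else:
                self._db.execute(
                    "UPDATE outbox SET attempts = attempts + 1 WHERE id = ?", (row_id,)
                )
        self._db.commit()
        return delivered

    def pending(self) -> int:
        return self._db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]


def bridge_down(result: dict) -> bool:
    raise ConnectionError("VoxBridge unavailable")  # simulated outage


outbox = ResultOutbox()
outbox.enqueue({"call_id": "c-1", "status": "completed"})
first_try = outbox.flush(bridge_down)             # delivery fails, row stays
still_pending = outbox.pending()
second_try = outbox.flush(lambda result: True)    # VoxBridge is back
```

Because a row is removed only after a confirmed delivery, a crash between delivery and delete can resend a result, so the receiving side would need to treat deliveries as idempotent (for example, keyed by call ID).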

Failure model

| Failure | Expected behavior |
| --- | --- |
| One VoxCore worker crashes | Active call on that worker is lost; other workers continue. |
| VoxBridge temporarily unavailable during result delivery | Result remains in VoxCore outbox and is retried. |
| Agent Desk enqueue temporarily fails | Handoff remains in Agent Desk outbox and is retried. |
| Fleet host full | nginx/app capacity returns busy; VoxBridge/VoxDialler should pick another host or retry later. |
| Per-host MinIO used for recordings | Playback depends on finding the originating host; shared object storage is preferred. |

Scaling direction

The current production architecture is good for controlled fleet growth, but 1K-channel scale needs more than “more workers”:
  • worker/container registration with TTL
  • atomic capacity reservation
  • partitioned VoxDialler workers or campaign leases
  • shared object storage everywhere
  • stronger metrics, tracing, and per-provider cost visibility
  • scripted or orchestrated host/container provisioning
See Fleet capacity and Docker roadmap for the scale plan.
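The first two items in the list, registration with TTL and atomic capacity reservation, can be pictured with an in-memory model. This is only a sketch of the concept: `FleetRegistry` is a hypothetical class, the in-process lock stands in for whatever atomic primitive a shared store would provide, and nothing here reflects the actual scale plan.

```python
import threading
import time
from typing import Callable, Optional


class FleetRegistry:
    """In-memory stand-in for a shared store holding TTL'd worker entries."""

    def __init__(
        self,
        ttl_seconds: float = 15.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._workers: dict[str, dict] = {}  # worker_id -> expires / reserved_by

    def heartbeat(self, worker_id: str) -> None:
        """Register or refresh a worker; it stops being eligible if heartbeats stop."""
        with self._lock:
            entry = self._workers.setdefault(worker_id, {"reserved_by": None})
            entry["expires"] = self._clock() + self._ttl

    def reserve(self, call_id: str) -> Optional[str]:
        """Atomically claim one live, free worker for call_id, or return None."""
        with self._lock:
            now = self._clock()
            for worker_id, entry in self._workers.items():
                if entry["expires"] > now and entry["reserved_by"] is None:
                    entry["reserved_by"] = call_id
                    return worker_id
            return None

    def release(self, worker_id: str) -> None:
        with self._lock:
            if worker_id in self._workers:
                self._workers[worker_id]["reserved_by"] = None


now = [0.0]  # fake clock so expiry can be shown without sleeping
registry = FleetRegistry(ttl_seconds=10.0, clock=lambda: now[0])
registry.heartbeat("host-a/worker-1")
first = registry.reserve("call-1")    # claims the only worker
second = registry.reserve("call-2")   # nothing free, caller must retry or queue
registry.release("host-a/worker-1")
now[0] = 11.0                         # heartbeat missed, entry has expired
third = registry.reserve("call-3")
```

The difference from polling is that checking for a free slot and claiming it happen in one step, so two dispatchers cannot both win the same worker.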