Platform overview
Vox is split into five working repos. The important architectural boundary is that VoxCore runs calls, while VoxBridge owns durable business state.

| Repo | Plane | Owns |
|---|---|---|
| voxcore | Runtime/media plane | Live call workers, transport adapters, Pipecat pipeline, recording upload, post-call packaging, result delivery, Agent Desk handoff hooks |
| voxbridge | Control plane | Auth, bots, campaigns, system settings, runtime config, call records, CRM integrations, API keys, recordings, fleet selection |
| voxui | Operator surface | Bot builder, campaign builder, call logs, analytics, settings, Agent Desk, knowledge bases |
| voxdialler | Campaign execution plane | Campaign leases, pacing, retry scheduling, LiveKit SIP dialing, AMD screening, answered-call attach |
| vohci-widget | Embeddable web surface | Browser call widget (React + LiveKit client) that joins a LiveKit room handled by VoxCore via /livekit/widget |
VoxCore runtime architecture
VoxCore is a FastAPI app running multiple single-process uvicorn workers behind nginx. Each production worker is one call slot. Production settings:

| Setting | Current production intent |
|---|---|
| MAX_CONCURRENT_CALLS | 1 per worker |
| VOXCORE_WORKERS | Actual worker count reported in health, commonly 16 on 8 GB / 4 CPU hosts |
| nginx max_conns=1 | Prevents a busy worker socket from receiving another call |
| uvicorn --workers | Always 1; systemd instance count is the worker count |
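
Because each worker is one call slot, host capacity follows directly from the settings above. The sketch below only illustrates that arithmetic; the `HostCapacity` class and its field names are hypothetical and do not reflect the actual VoxCore health payload.

```python
from dataclasses import dataclass


@dataclass
class HostCapacity:
    """Hypothetical capacity summary for one VoxCore host (illustrative only)."""

    workers: int               # number of systemd worker instances (VOXCORE_WORKERS)
    max_concurrent_calls: int  # per-worker limit (MAX_CONCURRENT_CALLS)
    active_calls: int          # calls currently in progress on this host

    @property
    def total_slots(self) -> int:
        # One slot per worker when MAX_CONCURRENT_CALLS is 1.
        return self.workers * self.max_concurrent_calls

    @property
    def available_slots(self) -> int:
        # Never report negative capacity, even if the counters drift.
        return max(self.total_slots - self.active_calls, 0)


host = HostCapacity(workers=16, max_concurrent_calls=1, active_calls=5)
print(host.total_slots, host.available_slots)
```

With 16 workers and one call per worker, the host exposes 16 slots, so adding capacity means adding worker instances rather than raising the per-worker limit.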
Call entry points
| Entry point | Route | Used by | Notes |
|---|---|---|---|
| iCallMate WebSocket | wss://fleet.example.com/ws/{bot_id} | iCallMate dialler | 8 kHz LINEAR16 |
| Exotel Voicebot | wss://fleet.example.com/exotel/{bot_id} | Exotel App Bazaar | 8 kHz LINEAR16, built-in Pipecat serializer |
| LiveKit SIP inbound | POST /livekit/dispatch | LiveKit SIP webhook | DID -> LiveKit room -> bot |
| LiveKit SIP outbound | POST /livekit/dialout | VoxBridge direct dialout | VoxCore creates room and dials SIP participant |
| Campaign attach | POST /attach | VoxDialler | Dialler creates/screens SIP call, then VoxCore joins answered room |
| Widget call | POST /livekit/widget | VoxUI/web widget | Uses LiveKit room path |
build_and_run_pipeline() in src/voxcore/pipeline/factory.py and run_post_call() in src/voxcore/routes/_post_call.py.
Pipeline
Current provider families:

| Layer | Providers |
|---|---|
| STT | Deepgram Nova, Deepgram Flux, Soniox |
| LLM | Gemini via Google AI, OpenAI, Google Vertex AI |
| TTS | ElevenLabs, Sarvam, Soniox |
| VAD / turns | Silero ONNX, SmartTurnV3 ONNX, or Flux external turn detection |
end_call, transfer_call, detected_voicemail, and search_knowledge. The search_knowledge tool queries the bot’s Knowledge Base, which VoxBridge backs with Qdrant vector search over uploaded, chunked, and embedded documents. Custom HTTP tools run through the tool runtime and emit telemetry events.
When live prompt caching is active, VoxCore serves the static policy/system prompt from a Gemini/Vertex CachedContent entry (or OpenAI prompt_cache_key hints). In that mode factory.py passes tools=NOT_GIVEN and folds tools and the system instruction into the cached content, because Gemini rejects per-request tools combined with cached_content.
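
The branching described above can be sketched as follows. This is not the code in factory.py: the `llm_request_fields` helper, its parameters, and the local `NOT_GIVEN` sentinel are all hypothetical stand-ins used only to show which fields are omitted when a cache entry is in use.

```python
from typing import Any, Dict, List, Optional

# Hypothetical sentinel meaning "omit this field from the request".
# The real NOT_GIVEN used by factory.py comes from its own dependencies.
NOT_GIVEN: Any = object()


def llm_request_fields(
    system_prompt: str,
    tools: List[dict],
    cached_content: Optional[str],
) -> Dict[str, Any]:
    """Return the per-request fields for one LLM call (illustrative only)."""
    if cached_content is not None:
        # Tools and the system instruction already live inside the cache
        # entry, so they must not be sent again with the request.
        return {
            "cached_content": cached_content,
            "system_instruction": NOT_GIVEN,
            "tools": NOT_GIVEN,
        }
    # No cache: send the prompt and tools with every request.
    return {
        "cached_content": NOT_GIVEN,
        "system_instruction": system_prompt,
        "tools": tools,
    }


cached = llm_request_fields("policy", [{"name": "end_call"}], "cachedContents/example")
plain = llm_request_fields("policy", [{"name": "end_call"}], None)
```

The point of the single branch is that a request never carries both a cache reference and its own tool list.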
Outbound and campaign flows
Direct outbound calls and campaign calls are intentionally handled differently. VoxBridge currently distributes direct outbound calls by polling the configured voxcore_fleet URLs and choosing the server with the most available capacity. VoxDialler performs campaign pacing and attach orchestration separately.
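
A minimal sketch of the "most available capacity" choice, assuming the poll step has already produced a free-slot count per fleet URL. The `pick_host` function and the shape of its input are assumptions for illustration, not VoxBridge's actual code; `None` marks a host that did not answer the poll.

```python
from typing import Dict, Optional


def pick_host(available: Dict[str, Optional[int]]) -> Optional[str]:
    """Pick the fleet URL with the most free call slots (illustrative only).

    `available` maps each fleet URL to its polled free-slot count,
    or to None if the host could not be reached.
    """
    # Ignore unreachable hosts and hosts with no free slots.
    candidates = {
        url: free
        for url, free in available.items()
        if free is not None and free > 0
    }
    if not candidates:
        return None  # every host is full or down; caller retries later
    return max(candidates, key=lambda url: candidates[url])


polled = {"https://a.example.com": 2, "https://b.example.com": 5, "https://c.example.com": None}
print(pick_host(polled))
```

Note that polling followed by selection is not atomic: two callers can read the same free count and pick the same host, which is why atomic capacity reservation appears under the scaling direction.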
State boundaries
VoxCore should be treated as mostly stateless call execution:
- It does not store bots, users, campaigns, analytics, or CRM config.
- It fetches runtime config at call start.
- It uploads recordings to configured S3-compatible storage.
- It sends final results to VoxBridge through a durable result outbox.
- It persists Agent Desk handoff enqueue attempts through a separate outbox.
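
The outbox pattern behind the last two points can be sketched as below. The source does not say how VoxCore stores its outboxes, so SQLite, the `ResultOutbox` class, and the `send` callback are all assumptions chosen to keep the example self-contained.

```python
import json
import sqlite3
from typing import Callable


class ResultOutbox:
    """Illustrative durable outbox: rows survive until delivery succeeds."""

    def __init__(self, path: str = ":memory:") -> None:
        # A file path would make this durable across restarts;
        # ":memory:" is used here only so the sketch runs anywhere.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "payload TEXT NOT NULL, "
            "attempts INTEGER NOT NULL DEFAULT 0)"
        )
        self.db.commit()

    def enqueue(self, result: dict) -> None:
        # Persist first; delivery happens later in flush().
        self.db.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(result),))
        self.db.commit()

    def flush(self, send: Callable[[dict], bool]) -> int:
        """Try to deliver every pending row; return how many were delivered."""
        rows = self.db.execute("SELECT id, payload FROM outbox ORDER BY id").fetchall()
        delivered = 0
        for row_id, payload in rows:
            try:
                ok = send(json.loads(payload))
            except Exception:
                ok = False  # treat any delivery error as "retry later"
            if ok:
                self.db.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
                delivered += 1
            else:
                self.db.execute(
                    "UPDATE outbox SET attempts = attempts + 1 WHERE id = ?", (row_id,)
                )
        self.db.commit()
        return delivered

    def pending(self) -> int:
        return self.db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
```

A row is deleted only after `send` reports success, so a VoxBridge outage leaves the result queued for the next `flush` call. Delivery is at-least-once, which means the receiver should tolerate duplicates.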
Failure model
| Failure | Expected behavior |
|---|---|
| One VoxCore worker crashes | Active call on that worker is lost; other workers continue. |
| VoxBridge temporarily unavailable during result delivery | Result remains in VoxCore outbox and is retried. |
| Agent Desk enqueue temporarily fails | Handoff remains in Agent Desk outbox and is retried. |
| Fleet host full | nginx/app capacity returns busy; VoxBridge/VoxDialler should pick another host or retry later. |
| Per-host MinIO used for recordings | Playback depends on finding the originating host; shared object storage is preferred. |
Scaling direction
The current production architecture is good for controlled fleet growth, but 1K-channel scale needs more than “more workers”:
- worker/container registration with TTL
- atomic capacity reservation
- partitioned VoxDialler workers or campaign leases
- shared object storage everywhere
- stronger metrics, tracing, and per-provider cost visibility
- scripted or orchestrated host/container provisioning
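
The first two items, registration with TTL and atomic capacity reservation, can be illustrated together. This is a single-process sketch under stated assumptions: `FleetRegistry` and its methods are hypothetical, and a real deployment would need a shared store rather than an in-memory dict guarded by a lock.

```python
import threading
import time
from typing import Callable, Dict, Optional


class FleetRegistry:
    """Illustrative in-memory registry: workers expire unless they heartbeat."""

    def __init__(self, ttl_seconds: float = 30.0, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        # worker_id -> {"expires": deadline, "free": free slot count}
        self._workers: Dict[str, dict] = {}

    def heartbeat(self, worker_id: str, free_slots: int) -> None:
        # Each heartbeat renews the TTL and reports the worker's own view of its capacity.
        with self._lock:
            self._workers[worker_id] = {
                "expires": self._clock() + self._ttl,
                "free": free_slots,
            }

    def reserve(self) -> Optional[str]:
        """Atomically claim one slot on a live worker, or return None."""
        with self._lock:
            now = self._clock()
            # Drop workers whose registration has expired.
            for wid in [w for w, s in self._workers.items() if s["expires"] <= now]:
                del self._workers[wid]
            # Check and decrement under the same lock so two callers
            # can never claim the same slot.
            for wid, state in self._workers.items():
                if state["free"] > 0:
                    state["free"] -= 1
                    return wid
            return None
```

Unlike poll-then-pick selection, the check and the decrement happen in one critical section, so a slot cannot be handed out twice. The TTL keeps a crashed worker from being selected indefinitely.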