search_knowledge tool. VoxBridge owns ingestion (extract → chunk → embed → index) and the internal search endpoint; the vectors live in Qdrant and the metadata in MongoDB.
Code: routes/knowledge.py, services/knowledge_service.py, knowledge_documents.py, knowledge_chunking.py, knowledge_embedding.py, knowledge_qdrant.py, models/knowledge.py.
Stores
| Store | Collection / object | Holds |
|---|---|---|
| MongoDB | knowledge_bases | KB metadata, counts, status (active/disabled), default_language, tenant_id |
| MongoDB | knowledge_documents | Per-document parse_status, chunk_count, text_char_count, storage_key |
| Qdrant | collection vox_knowledge_chunks (configurable) | Chunk vectors + payload (tenant_id, kb_id, document_id, chunk_id, chunk_text, active, …) |
IDs follow fixed patterns: kb_<hex> for knowledge bases, doc_<hex> for documents, and {document_id}_chunk_{i} for chunks. Qdrant point IDs are a deterministic UUID5 of the chunk ID, so re-ingestion is idempotent.
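The ID scheme above can be sketched in a few lines. This is an illustration only: the UUID5 namespace used by knowledge_qdrant.py is not stated here, so `POINT_ID_NAMESPACE` and both helper names are assumptions.

```python
import uuid

# Assumption: the real namespace is not documented; any fixed namespace
# gives the same determinism property.
POINT_ID_NAMESPACE = uuid.NAMESPACE_URL


def make_chunk_id(document_id: str, index: int) -> str:
    """Build a chunk ID in the documented {document_id}_chunk_{i} form."""
    return f"{document_id}_chunk_{index}"


def make_point_id(chunk_id: str) -> str:
    """Derive a Qdrant point ID as a UUID5 of the chunk ID.

    UUID5 is a hash of (namespace, name), so the same chunk ID always maps
    to the same point ID and a repeated upsert overwrites the point instead
    of creating a duplicate.
    """
    return str(uuid.uuid5(POINT_ID_NAMESPACE, chunk_id))
```

Because the point ID depends only on the chunk ID, re-ingesting a document with the same document_id and chunk order writes to the same points.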
Admin endpoints
Under /api/v1/knowledge-bases, all requiring admin:
| Method | Path | Purpose |
|---|---|---|
| POST | / | Create a KB |
| GET | / | List KBs for the tenant |
| GET | /{kb_id} | Get one KB |
| PATCH | /{kb_id} | Update name/description/status/language |
| DELETE | /{kb_id} | Soft-disable (sets status=disabled) |
| POST | /{kb_id}/documents | Upload + ingest a document |
| GET | /{kb_id}/documents | List documents |
| DELETE | /{kb_id}/documents/{document_id} | Delete document + its Qdrant chunks |
Ingestion pipeline
ingest_document runs synchronously within the upload request and records progress on the document:
| Stage | Module | Notes |
|---|---|---|
| Extract | knowledge_documents.extract_text | Supported: .pdf, .txt, .md, .csv, .docx. PDF via pypdf, DOCX via python-docx (paragraphs + tables). Unsupported types raise KnowledgeDocumentError. |
| Chunk | knowledge_chunking.chunk_text | Normalizes whitespace, then slides a window of chunk_size characters with overlap, preferring to break at a \n, ". ", or space boundary past 50% of the window. Defaults: chunk_size_chars=1200, chunk_overlap_chars=180. |
| Embed | knowledge_embedding.embed_texts | google-genai client, model from knowledge.embedding_model (settings default gemini-embedding-001; the route falls back to text-embedding-004 if the field is unset), output_dimensionality = embedding_dimensions (768). Requires api_keys.google. |
| Index | knowledge_qdrant.upsert_chunks | Ensures the collection (cosine distance) + payload indexes on tenant_id/kb_id/document_id/active. |
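The Chunk stage can be sketched as follows. This is a minimal illustration of a sliding window with boundary preference, not the code in knowledge_chunking.py: the exact normalization rules, the boundary priority order (newline, then ". ", then space), and the parameter names are assumptions.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse runs of spaces/tabs and excess blank lines (assumed rules)."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r" ?\n ?", "\n", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def chunk_text(text: str, chunk_size: int = 1200, overlap: int = 180) -> list[str]:
    """Split text into overlapping chunks of at most chunk_size characters."""
    # Keeping overlap below half the window guarantees forward progress,
    # because a preferred boundary is only accepted past the 50% mark.
    if chunk_size <= 0 or not 0 <= overlap * 2 < chunk_size:
        raise ValueError("overlap must be non-negative and below chunk_size / 2")

    text = normalize_whitespace(text)
    chunks: list[str] = []
    start, n = 0, len(text)
    while start < n:
        end = min(start + chunk_size, n)
        if end < n:
            # Look for a natural boundary in the second half of the window.
            floor = start + chunk_size // 2
            for sep in ("\n", ". ", " "):
                cut = text.rfind(sep, floor, end)
                if cut != -1:
                    end = cut + len(sep)
                    break
        chunk = text[start:end].strip()
        if chunk:
            chunks.append(chunk)
        if end >= n:
            break
        # Step back by the overlap so adjacent chunks share context.
        start = end - overlap
    return chunks
```

If no boundary is found past the halfway mark, the sketch falls back to a hard cut at chunk_size characters.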
The upload route enforces max_upload_mb (default 20) and returns 413 if exceeded. On any ingestion error the document is marked failed with a truncated parse_error (the upload still returns 200 with that status).
Upload is not deferred to a worker — extraction, embedding, and Qdrant upsert all happen inside the request. Large documents therefore make the upload call slow rather than returning a queued status.
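The failure handling described above can be pictured with a small sketch. Only the failed status, the parse_error field, and the max_upload_mb default come from this page; the "processing" and "ready" status values, the 300-character truncation length, the 1 MB = 1024 * 1024 bytes conversion, and the function names are hypothetical.

```python
from typing import Callable, Iterable

MAX_PARSE_ERROR_CHARS = 300  # assumed truncation length


def exceeds_upload_limit(size_bytes: int, max_upload_mb: int = 20) -> bool:
    """True when the route would answer 413 (assumes 1 MB = 1024 * 1024 bytes)."""
    return size_bytes > max_upload_mb * 1024 * 1024


def run_ingestion(document: dict, stages: Iterable[Callable[[dict], None]]) -> dict:
    """Run the stages in order inside the request and record the outcome.

    Errors are captured on the document rather than raised, which is why
    the upload can still return 200 with a failed status.
    """
    document["parse_status"] = "processing"  # hypothetical status value
    try:
        for stage in stages:
            stage(document)
        document["parse_status"] = "ready"  # hypothetical status value
    except Exception as exc:
        document["parse_status"] = "failed"
        document["parse_error"] = str(exc)[:MAX_PARSE_ERROR_CHARS]
    return document
```

A caller would therefore check parse_status in the response body instead of relying on the HTTP status code alone.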
Bot attachment
A bot’s KB attachment lives in bot.knowledge (BotKnowledgeConfig in models/knowledge.py):
| Field | Default | Purpose |
|---|---|---|
| enabled | false | Master toggle |
| kb_ids | [] | Attached KBs (deduped) |
| top_k | 4 (1–10) | Max chunks returned |
| score_threshold | 0.55 (0–1) | Min cosine score |
| strict | true | If true, emit fallback_message when no hit |
| trigger_instructions | "" | Natural-language guidance injected into the search_knowledge tool description in VoxCore |
| fallback_message | default sentence | Spoken when nothing is found in strict mode |
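The field table maps naturally onto a small validated model. The sketch below uses a standard-library dataclass to stay self-contained; the real BotKnowledgeConfig in models/knowledge.py may be built differently, and the fallback text here is a placeholder because the actual default sentence is not given on this page.

```python
from dataclasses import dataclass, field

# Placeholder only: the real default sentence is not documented here.
DEFAULT_FALLBACK_MESSAGE = "<default sentence>"


@dataclass
class BotKnowledgeConfig:
    enabled: bool = False
    kb_ids: list[str] = field(default_factory=list)
    top_k: int = 4
    score_threshold: float = 0.55
    strict: bool = True
    trigger_instructions: str = ""
    fallback_message: str = DEFAULT_FALLBACK_MESSAGE

    def __post_init__(self) -> None:
        # Dedupe attached KBs while keeping their first-seen order.
        self.kb_ids = list(dict.fromkeys(self.kb_ids))
        if not 1 <= self.top_k <= 10:
            raise ValueError("top_k must be between 1 and 10")
        if not 0.0 <= self.score_threshold <= 1.0:
            raise ValueError("score_threshold must be between 0 and 1")
```

For example, `BotKnowledgeConfig(enabled=True, kb_ids=["kb_1", "kb_2", "kb_1"])` keeps two KB IDs and the documented defaults for every other field.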
Search contract (VoxCore)
VoxCore calls POST /api/v1/internal/knowledge/search, authenticated by X-VoxCore-Secret. The request carries bot_id, session_id, query, and optional kb_ids/top_k/score_threshold/strict.
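As an illustration of the request shape, the sketch below builds (but does not send) such a call with the standard library. The JSON encoding of the body, the base URL, and the helper name are assumptions; only the path, the header name, and the field names come from the contract above.

```python
import json
import urllib.request

OPTIONAL_FIELDS = {"kb_ids", "top_k", "score_threshold", "strict"}


def build_search_request(base_url, secret, bot_id, session_id, query, **optional):
    """Return an unsent urllib Request for the internal search endpoint."""
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")

    body = {"bot_id": bot_id, "session_id": session_id, "query": query}
    # Optional overrides are only sent when the caller provides them.
    body.update({k: v for k, v in optional.items() if v is not None})

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/internal/knowledge/search",
        data=json.dumps(body).encode("utf-8"),  # assumes a JSON body
        headers={
            "Content-Type": "application/json",
            "X-VoxCore-Secret": secret,
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would perform the call; the sketch stops before that so it can be inspected without a running VoxBridge.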
search_knowledge enforces and accelerates access:
- Access guard: validate_bot_kb_access_with_meta intersects the requested kb_ids with the bot’s attached, active KBs. Unattached or disabled KBs are silently dropped; no active KB → empty hits.
- Redis caching (layered): KB-access (60s), query embeddings (24h), and full results (30m on hit, 2m on no-hit). Result cache keys include a kb_revision derived from each KB’s updated_at/counts/status, so editing a KB invalidates cached answers.
- Qdrant filter: tenant_id + kb_id ∈ active + active=true, top-k with score_threshold.
- Response includes hits[] (with score, source_name, chunk_text) and a metrics block (embedding/qdrant/cache timings and cache-hit flags). In strict mode with no hits, fallback_message is returned.
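The access guard, revision-aware cache key, and Qdrant filter can be sketched together. Everything below is an assumption-laden illustration: the helper names, the count field names (document_count, chunk_count), the key prefix, the hashing scheme, and the fallback to all attached KBs when kb_ids is omitted are hypothetical. The filter dictionary follows Qdrant's REST filter format (must conditions with match value / any).

```python
import hashlib
import json


def allowed_kb_ids(requested, attached_active):
    """Keep only requested KBs that are attached to the bot and active."""
    if not requested:  # assumption: omitted kb_ids means "all attached"
        return sorted(attached_active)
    return [kb_id for kb_id in requested if kb_id in attached_active]


def kb_revision(kbs):
    """Fingerprint the KB metadata so any KB edit changes the cache key."""
    parts = [
        f"{kb['kb_id']}:{kb['updated_at']}:{kb['document_count']}:"
        f"{kb['chunk_count']}:{kb['status']}"
        for kb in sorted(kbs, key=lambda kb: kb["kb_id"])
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()[:16]


def result_cache_key(tenant_id, kb_ids, revision, query, top_k, score_threshold):
    """Build a result cache key that embeds the KB revision."""
    payload = json.dumps(
        [tenant_id, sorted(kb_ids), revision, query.strip().lower(), top_k, score_threshold]
    )
    return "kb:result:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def result_ttl_seconds(hits):
    """30 minutes when something was found, 2 minutes for a no-hit result."""
    return 30 * 60 if hits else 2 * 60


def qdrant_filter(tenant_id, kb_ids):
    """Restrict the vector search to the tenant's allowed, active chunks."""
    return {
        "must": [
            {"key": "tenant_id", "match": {"value": tenant_id}},
            {"key": "kb_id", "match": {"any": list(kb_ids)}},
            {"key": "active", "match": {"value": True}},
        ]
    }
```

Embedding the revision in the key means stale results are never looked up after a KB edit; they simply expire under their TTL, so no explicit invalidation step is needed.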
Related docs
- VoxCore pipeline: How search_knowledge is registered and invoked mid-call.
- Settings: The knowledge system-settings block (Qdrant URL, embedding model, chunk sizes).
- Tools & integrations: Bot-level tool configuration including knowledge.
- VoxBridge overview: Where the KB pipeline fits in the control plane.