A knowledge base (KB) is a tenant-scoped collection of documents that VoxCore can query at call time via the search_knowledge tool. VoxBridge owns ingestion (extract → chunk → embed → index) and the internal search endpoint; the vectors live in Qdrant and the metadata in MongoDB. Code: routes/knowledge.py, services/knowledge_service.py, knowledge_documents.py, knowledge_chunking.py, knowledge_embedding.py, knowledge_qdrant.py, models/knowledge.py.

Stores

| Store | Collection / object | Holds |
| --- | --- | --- |
| MongoDB | knowledge_bases | KB metadata, counts, status (active/disabled), default_language, tenant_id |
| MongoDB | knowledge_documents | Per-document parse_status, chunk_count, text_char_count, storage_key |
| Qdrant | collection vox_knowledge_chunks (configurable) | Chunk vectors + payload (tenant_id, kb_id, document_id, chunk_id, chunk_text, active, …) |

KB and document IDs are prefixed (kb_<hex>, doc_<hex>), and chunk IDs take the form {document_id}_chunk_{i}. Qdrant point IDs are a deterministic UUID5 of the chunk ID, so re-ingestion is idempotent: ingesting the same chunk again writes to the same point instead of creating a duplicate.
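The ID scheme can be illustrated with Python's standard uuid module. This is a minimal sketch, not the code in knowledge_qdrant.py: the helper names and the choice of uuid.NAMESPACE_URL as the UUID5 namespace are assumptions, since this page does not say which namespace VoxBridge uses.

```python
import uuid

# Assumption: the real namespace is not documented here; any fixed namespace
# gives the same determinism property.
POINT_NAMESPACE = uuid.NAMESPACE_URL


def make_chunk_id(document_id: str, index: int) -> str:
    """Build a chunk ID in the documented {document_id}_chunk_{i} form."""
    return f"{document_id}_chunk_{index}"


def make_point_id(chunk_id: str) -> str:
    """Derive a deterministic UUID5 string from a chunk ID (hypothetical helper)."""
    return str(uuid.uuid5(POINT_NAMESPACE, chunk_id))
```

Because the same chunk ID always maps to the same point ID, an upsert keyed on it replaces the earlier point.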

Admin endpoints

Under /api/v1/knowledge-bases, all requiring admin:

| Method | Path | Purpose |
| --- | --- | --- |
| POST | / | Create a KB |
| GET | / | List KBs for the tenant |
| GET | /{kb_id} | Get one KB |
| PATCH | /{kb_id} | Update name/description/status/language |
| DELETE | /{kb_id} | Soft-disable (sets status=disabled) |
| POST | /{kb_id}/documents | Upload + ingest a document |
| GET | /{kb_id}/documents | List documents |
| DELETE | /{kb_id}/documents/{document_id} | Delete document + its Qdrant chunks |
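As a rough illustration of how a client might address these routes, the sketch below builds (but does not send) requests with the standard library. The host, the bearer-token Authorization header, and the JSON field names are assumptions; this page only states that the routes require admin and does not describe the auth scheme or request bodies.

```python
import json
import urllib.request

BASE_PATH = "/api/v1/knowledge-bases"


def kb_request(host, method, path="/", body=None, token="<admin-token>"):
    """Build an unsent request for one of the admin routes (hypothetical helper)."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    # Assumption: bearer-token auth; the actual admin auth mechanism is not documented here.
    headers = {"Authorization": f"Bearer {token}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{host}{BASE_PATH}{path}", data=data, headers=headers, method=method
    )


# Soft-disable alternative: PATCH the status field (field name assumed).
req = kb_request("https://voxbridge.example", "PATCH", "/kb_abc123", {"status": "disabled"})
```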

Ingestion pipeline

ingest_document runs synchronously within the upload request and records progress on the document:

| Stage | Module | Notes |
| --- | --- | --- |
| Extract | knowledge_documents.extract_text | Supported: .pdf, .txt, .md, .csv, .docx. PDF via pypdf, DOCX via python-docx (paragraphs + tables). Unsupported types raise KnowledgeDocumentError. |
| Chunk | knowledge_chunking.chunk_text | Normalizes whitespace, then slides a window of chunk_size chars with overlap, preferring to break at a \n, ". ", or space boundary past 50% of the window. Defaults chunk_size_chars=1200, chunk_overlap_chars=180. |
| Embed | knowledge_embedding.embed_texts | google-genai client, model from knowledge.embedding_model (settings default gemini-embedding-001; the route falls back to text-embedding-004 if the field is unset), output_dimensionality = embedding_dimensions (768). Requires api_keys.google. |
| Index | knowledge_qdrant.upsert_chunks | Ensures the collection (cosine distance) + payload indexes on tenant_id/kb_id/document_id/active. |

The upload route enforces max_upload_mb (default 20) and returns 413 if the file exceeds it. On any ingestion error the document is marked failed with a truncated parse_error; the upload itself still returns 200, with the failed status in the response.
Upload is not deferred to a worker: extraction, embedding, and the Qdrant upsert all happen inside the request. Large documents therefore make the upload call slow rather than returning a queued status.
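The chunking stage is the least obvious step, so here is a minimal sketch of a sliding window with overlap and boundary preference. It is not the code in knowledge_chunking.py: the exact normalization rules, the priority order of the separators, and the forward-progress guard are assumptions made for illustration.

```python
import re


def chunk_text(text: str, chunk_size: int = 1200, overlap: int = 180) -> list[str]:
    """Split text into overlapping character windows, preferring natural breaks."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    # Assumed normalization: collapse runs of spaces/tabs and cap blank-line runs.
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text).strip()

    chunks: list[str] = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            midpoint = start + chunk_size // 2
            # Try separators in (assumed) priority order; only accept a
            # boundary that falls in the second half of the window.
            for sep in ("\n", ". ", " "):
                cut = text.rfind(sep, midpoint, end)
                if cut != -1:
                    end = cut + len(sep)
                    break
        piece = text[start:end].strip()
        if piece:
            chunks.append(piece)
        if end >= len(text):
            break
        # Step back by the overlap, but always move forward at least one char.
        start = max(end - overlap, start + 1)
    return chunks
```

With the defaults, each chunk is at most 1200 characters and shares up to 180 characters with its predecessor, which keeps sentences that straddle a boundary retrievable from either side.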

Bot attachment

A bot’s KB attachment lives in bot.knowledge (BotKnowledgeConfig in models/knowledge.py):

| Field | Default | Purpose |
| --- | --- | --- |
| enabled | false | Master toggle |
| kb_ids | [] | Attached KBs (deduped) |
| top_k | 4 (1–10) | Max chunks returned |
| score_threshold | 0.55 (0–1) | Min cosine score |
| strict | true | If true, emit fallback_message when no hit |
| trigger_instructions | "" | Natural-language guidance injected into the search_knowledge tool description in VoxCore |
| fallback_message | default sentence | Spoken when nothing is found in strict mode |
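The defaults and bounds in the table can be expressed as a small model. The sketch below uses a standard-library dataclass under a different class name; the real BotKnowledgeConfig may be built differently, and whether out-of-range values are rejected or clamped is an assumption (this version rejects them). The fallback text is a placeholder, since the actual default sentence is not given here.

```python
from dataclasses import dataclass, field

# Placeholder only: the real default sentence is not stated on this page.
DEFAULT_FALLBACK = "<default fallback sentence>"


@dataclass
class BotKnowledgeConfigSketch:
    enabled: bool = False
    kb_ids: list[str] = field(default_factory=list)
    top_k: int = 4
    score_threshold: float = 0.55
    strict: bool = True
    trigger_instructions: str = ""
    fallback_message: str = DEFAULT_FALLBACK

    def __post_init__(self) -> None:
        # dict.fromkeys keeps the first occurrence of each ID, preserving order.
        self.kb_ids = list(dict.fromkeys(self.kb_ids))
        # Assumption: invalid values raise rather than being clamped.
        if not 1 <= self.top_k <= 10:
            raise ValueError("top_k must be between 1 and 10")
        if not 0.0 <= self.score_threshold <= 1.0:
            raise ValueError("score_threshold must be between 0 and 1")
```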

Search contract (VoxCore)

VoxCore calls POST /api/v1/internal/knowledge/search, authenticated by X-VoxCore-Secret. The request carries bot_id, session_id, query, and optional kb_ids/top_k/score_threshold/strict. search_knowledge enforces and accelerates access:
  • Access guard: validate_bot_kb_access_with_meta intersects the requested kb_ids with the bot’s attached, active KBs. Unattached or disabled KBs are silently dropped; no active KB → empty hits.
  • Redis caching (layered): KB-access (60s), query embeddings (24h), and full results (30m on hit, 2m on no-hit). Result cache keys include a kb_revision derived from each KB’s updated_at/counts/status, so editing a KB invalidates cached answers.
  • Qdrant filter: tenant_id + kb_id ∈ active + active=true, top-k with score_threshold.
  • Response includes hits[] (with score, source_name, chunk_text) and a metrics block (embedding/qdrant/cache timings and cache-hit flags). In strict mode with no hits, fallback_message is returned.
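
The access guard and the revision-aware result cache described above can be sketched as follows. This is an illustration of the idea rather than the service code: the function names, the key prefix, the hashing scheme, the query normalization, and the shape of the KB dictionaries (kb_id, updated_at, counts, status) are all assumptions. Only the TTL values and the inputs to kb_revision come from this page.

```python
import hashlib
import json

RESULT_TTL_HIT_S = 30 * 60  # full results with at least one hit
RESULT_TTL_MISS_S = 2 * 60  # full results with no hits


def allowed_kb_ids(requested: list[str], attached: list[str], active: set[str]) -> list[str]:
    """Keep only requested KBs that are attached to the bot and active."""
    attached_set = set(attached)
    return [kb for kb in dict.fromkeys(requested) if kb in attached_set and kb in active]


def kb_revision(kb: dict) -> str:
    """Fingerprint the fields whose change should invalidate cached results."""
    material = json.dumps(
        {"updated_at": kb["updated_at"], "counts": kb["counts"], "status": kb["status"]},
        sort_keys=True,
    )
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:12]


def result_cache_key(tenant_id: str, query: str, kbs: list[dict],
                     top_k: int, score_threshold: float) -> str:
    """Build a result-cache key that changes whenever any KB revision changes."""
    material = json.dumps(
        {
            "tenant_id": tenant_id,
            "query": query.strip().lower(),  # assumed normalization
            "kbs": sorted((kb["kb_id"], kb_revision(kb)) for kb in kbs),
            "top_k": top_k,
            "score_threshold": score_threshold,
        },
        sort_keys=True,
    )
    return "kb:result:" + hashlib.sha256(material.encode("utf-8")).hexdigest()


def result_ttl_seconds(hits: list) -> int:
    """Cache hits longer than misses so new content becomes visible quickly."""
    return RESULT_TTL_HIT_S if hits else RESULT_TTL_MISS_S
```

Putting the revision into the key means no explicit invalidation is needed: after a KB edit, lookups simply miss the old entry, which then expires on its own TTL.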

VoxCore pipeline

How search_knowledge is registered and invoked mid-call.

Settings

The knowledge system-settings block (Qdrant URL, embedding model, chunk sizes).

Tools & integrations

Bot-level tool configuration including knowledge.

VoxBridge overview

Where the KB pipeline fits in the control plane.