What is the Knowledge Base?

The Knowledge Base (KB) is a document store with semantic search. You upload PDFs, Word documents, or plain text files; Oshara automatically chunks them and creates vector embeddings. During a call, the agent can search the KB using natural language queries and synthesise answers from the retrieved passages.

Pipeline

1. Upload document (PDF / DOCX / TXT)
2. Text extraction
3. Chunking (≈500 tokens per chunk)
4. Embedding (OpenAI text-embedding-ada-002)
5. pgvector storage (cosine similarity index)
6. Agent calls kb tool → semantic search → top-k passages → LLM answer
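
To make the chunking and retrieval stages concrete, here is a minimal, self-contained sketch. It is not Oshara's implementation: the function names are hypothetical, whitespace-separated words stand in for model tokens, and a word-count vector stands in for the text-embedding-ada-002 embedding so the example runs without any external service. Only the overall shape (chunk, embed, rank by cosine similarity, keep the top k) follows the pipeline above.

```python
import math
import re
from collections import Counter

CHUNK_TOKENS = 500  # approximate chunk size named in the pipeline


def chunk_text(text, max_tokens=CHUNK_TOKENS):
    """Split text into chunks of at most max_tokens words.

    Assumption: words approximate tokens; a real tokenizer counts differently.
    """
    words = text.split()
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]


def embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, chunks, top_k=3):
    """Return the top_k (score, chunk) pairs, most similar first."""
    q = embed(query)
    scored = [(cosine_similarity(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

In the real pipeline the ranking step would run inside the vector index rather than in application code; the sketch only illustrates why a query about shipping retrieves the shipping passage ahead of unrelated ones.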

Document statuses

| Status | Meaning |
| --- | --- |
| PENDING | Document uploaded, awaiting processing. |
| PROCESSED | Chunks extracted and embedded. Ready for search. |
| FAILED | Processing error (unsupported format, corrupt file, etc.). |
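
Because a document is only searchable once it reaches PROCESSED, client code will typically wait for that state after an upload. The sketch below shows one way to do this. This page does not describe how the status is fetched, so `get_status` is a hypothetical caller-supplied function that returns one of the three status strings; the timeout and interval values are also assumptions.

```python
import time
from enum import Enum


class DocumentStatus(str, Enum):
    PENDING = "PENDING"
    PROCESSED = "PROCESSED"
    FAILED = "FAILED"


def wait_until_processed(get_status, timeout_s=60.0, interval_s=2.0, sleep=time.sleep):
    """Poll get_status() until the document leaves PENDING.

    get_status is a hypothetical callable returning a status string.
    Returns True on PROCESSED; raises RuntimeError on FAILED and
    TimeoutError if the document is still PENDING at the deadline.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = DocumentStatus(get_status())
        if status is DocumentStatus.PROCESSED:
            return True
        if status is DocumentStatus.FAILED:
            raise RuntimeError("document processing failed")
        if time.monotonic() >= deadline:
            raise TimeoutError("document still PENDING after timeout")
        sleep(interval_s)
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real delays.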

Attaching the KB to an agent

Documents are attached per-character. Upload documents to a character’s slug and they are automatically available to any kb tool defined on that character.
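
The per-character scoping can be pictured as a mapping from character slug to that character's documents. The following in-memory model is purely illustrative: the class and method names are hypothetical and do not correspond to any Oshara API. It only shows that a document uploaded under one slug is visible to kb tools on that character and not on others.

```python
from collections import defaultdict


class KnowledgeBaseIndex:
    """In-memory illustration of per-character document scoping (hypothetical)."""

    def __init__(self):
        self._docs_by_slug = defaultdict(list)

    def upload(self, character_slug, filename):
        """Record a document under the given character slug."""
        self._docs_by_slug[character_slug].append(filename)

    def documents_for(self, character_slug):
        """Documents visible to kb tools defined on this character."""
        return list(self._docs_by_slug.get(character_slug, []))


kb = KnowledgeBaseIndex()
kb.upload("support-agent", "refund-policy.pdf")
kb.upload("sales-agent", "pricing.txt")
```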

Best practices

| Tip | Details |
| --- | --- |
| One topic per file | Splitting content by topic (e.g. refund-policy.pdf, shipping-faq.pdf) improves retrieval precision. |
| Descriptive filenames | The filename is returned in search results as metadata.source; the LLM uses it to cite sources. |
| Clean text | PDFs with complex layouts, scanned images, or lots of tables may extract poorly. Use TXT or DOCX for structured content. |
| Remove boilerplate | Legal disclaimers and headers repeated across many pages consume embedding budget and dilute results. |
| Size limit | Individual files up to 50 MB. No limit on total documents per character. |
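
Several of these tips can be checked locally before uploading. The sketch below is one possible pre-upload check, under stated assumptions: "50 MB" is read as 50 × 1024 × 1024 bytes, the allowed extensions are taken from the formats listed in the pipeline, and the "descriptive filename" test is a rough heuristic of our own that flags names such as scan_001 or untitled. The function name `check_upload` is hypothetical.

```python
import re
from pathlib import Path

MAX_BYTES = 50 * 1024 * 1024  # assumption: "50 MB" means 50 MiB
ALLOWED_SUFFIXES = {".pdf", ".docx", ".txt"}
# Heuristic: a stem that is only a generic word plus digits/punctuation.
GENERIC_NAME = re.compile(r"^(document|file|scan|untitled|new)?[\W_\d]*$", re.IGNORECASE)


def check_upload(path):
    """Return a list of warnings for a file; an empty list means no issues found."""
    path = Path(path)
    problems = []
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported format: {path.suffix or '(none)'}")
    if path.stat().st_size > MAX_BYTES:
        problems.append("file exceeds 50 MB")
    if GENERIC_NAME.match(path.stem):
        problems.append("filename is not descriptive")
    return problems
```

A file named refund-policy.txt passes all three checks, while scan_001.png is flagged for both its format and its name.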