# Building a Local Knowledge Base
Witsy lets you transform your local documents into a searchable knowledge base using Retrieval-Augmented Generation (RAG). By indexing your files, you can ask your AI assistant questions about specific technical manuals, project notes, or legal documents without uploading them to a third-party service, provided you use local models.
## Prerequisites: Setting up Embeddings
Before building a knowledge base, you must configure an Embedding Provider. This is the engine that converts your text into numerical vectors for searching.
- Open Settings > Embeddings.
- Choose your provider:
  - Ollama: Best for privacy. Use models like `mxbai-embed-large` or `nomic-embed-text`.
  - OpenAI: Faster setup, requires an API key. Uses `text-embedding-3-small` or `text-embedding-3-large`.
- Save your settings to enable indexing capabilities.
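Under the hood, the embedding provider turns each piece of text into a numeric vector, and search works by comparing those vectors. A minimal sketch of the comparison step in Python (the 4-dimensional vectors below are made up for illustration; real embedding models emit hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 for identical directions, near 0.0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings standing in for real model output.
query = [0.9, 0.1, 0.0, 0.2]
doc_firewall = [0.85, 0.15, 0.05, 0.25]  # similar topic -> high score
doc_recipes = [0.0, 0.9, 0.4, 0.0]       # unrelated topic -> low score

print(cosine_similarity(query, doc_firewall) > cosine_similarity(query, doc_recipes))
```

This is why the choice of embedding model matters: a stronger model places related texts closer together in vector space, which directly improves retrieval quality.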
## How to Create a Knowledge Base
A knowledge base (referred to in the UI as a Document Repository) is a collection of files grouped together for specific context.
- Navigate to the Knowledge Base section in the sidebar.
- Click Create New Repository.
- Give it a descriptive name (e.g., "Project X Documentation" or "Company HR Policies").
- Drag and drop files or select a folder to start the indexing process.
> [!TIP]
> Witsy supports standard text formats, PDFs, and Markdown files. During indexing, you will see a progress bar as the application chunks the text and generates embeddings.
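The chunking step splits each document into overlapping windows before embedding them. A rough Python sketch of one common approach, fixed-size character chunks with overlap (the sizes are illustrative, not Witsy's actual settings):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap,
    so content spanning a boundary still appears whole in some chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

pieces = chunk_text("x" * 500)
print(len(pieces), len(pieces[0]), len(pieces[-1]))
```

Each chunk is then embedded separately, which is why indexing a large document produces many vectors and takes noticeably longer than indexing a short note.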
## Usage Scenarios
### Chatting with a Specific Repository
If you want the AI to focus exclusively on a set of documents:
- Start a new chat.
- In the chat configuration area (bottom or side panel), look for the Knowledge Base dropdown.
- Select your desired repository.
- Ask questions like: "Based on the attached documentation, how do I configure the firewall?"
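When a repository is selected, the assistant retrieves the chunks most relevant to your question and feeds them to the model as context. A toy Python sketch of that retrieval step, using word overlap as a stand-in for real embedding similarity (not Witsy's actual implementation):

```python
import re

def words(text: str) -> set[str]:
    """Lowercase word set; a real RAG pipeline compares embedding vectors instead."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks sharing the most words with the query."""
    return sorted(chunks, key=lambda c: len(words(query) & words(c)), reverse=True)[:k]

repo = [
    "To configure the firewall, open the network settings panel.",
    "Holiday requests must be submitted two weeks in advance.",
    "The firewall blocks inbound traffic on all ports by default.",
]
context = retrieve("How do I configure the firewall?", repo)

# The retrieved chunks are prepended to the prompt that goes to the model.
prompt = "Answer using only this context:\n" + "\n".join(context)
print(context[0])
```

This is also why phrasing matters: a question that shares vocabulary (or, with embeddings, meaning) with the documents retrieves better context and gets better answers.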
### Using Knowledge Bases with Folders
You can automate context by attaching a knowledge base to a specific folder. Any chat created within that folder will automatically inherit those documents as context.
- Right-click a folder in the sidebar and select Edit Folder.
- In the Defaults tab, select a Document Repository.
- Save changes.
## Recipe: Building a "Local First" Private Research Assistant
If you want to ensure your data never leaves your machine while using RAG, follow this configuration:
- Run Ollama locally: Ensure Ollama is running on your machine.
- Set Chat Engine: In Witsy settings, set your Chat Engine to Ollama and select a model (e.g., `llama3`).
- Set Embedding Engine: Set your Embedding Engine to Ollama and select `mxbai-embed-large`.
- Create Repository: Add your sensitive PDF/text files to a new repository.
- Query: You can now search and chat with your documents. All processing stays on your local hardware.
## Managing Your Index
To keep your knowledge base performant and relevant:
- Refresh Index: If you modify the files in a linked folder, click the Refresh icon in the Knowledge Base view to re-index changed files.
- Clear Index: If the AI is providing outdated information, you can clear the repository index and re-upload the files.
- Multi-Repo Search: You can attach multiple repositories to a single chat if your research spans across different projects.
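Conceptually, multi-repository search pools the scored chunks from every attached repository and re-ranks them together. A small Python sketch (the repository names and scores below are hypothetical):

```python
import heapq

def merge_results(repos: dict[str, list[tuple[float, str]]], k: int = 3) -> list[tuple[float, str]]:
    """Pool (score, chunk) pairs from every repository and keep the k best overall."""
    pooled = [item for results in repos.values() for item in results]
    return heapq.nlargest(k, pooled)

results = merge_results({
    "project-x": [(0.91, "Firewall setup guide"), (0.40, "Release notes")],
    "hr-policies": [(0.55, "VPN access policy")],
}, k=2)
print([chunk for _, chunk in results])
```

Because scores are comparable across repositories, the most relevant chunk wins regardless of which repository it came from.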
## Troubleshooting
| Issue | Solution |
| :--- | :--- |
| "No Embedding Provider Configured" | Go to Settings > Embeddings and ensure a provider is selected and the API key (if applicable) is valid. |
| Search results are irrelevant | Ensure the model used for embeddings is fully downloaded in Ollama, or try a more capable embedding model such as `text-embedding-3-large`. |
| Indexing is very slow | Large PDF files with many images can slow down extraction. Try converting them to Markdown or plain text for faster processing. |