# Building a Local Knowledge Base
Witsy lets you transform your local documents into a searchable knowledge base using Retrieval-Augmented Generation (RAG). By indexing your files, you can ask your AI assistant questions about specific technical manuals, project notes, or legal documents without uploading them to a third-party service, provided you use local models.
## Prerequisites: Setting up Embeddings
Before building a knowledge base, you must configure an Embedding Provider. This is the engine that converts your text into numerical vectors for searching.
- Open Settings > Embeddings.
- Choose your provider:
  - Ollama: Best for privacy. Use models like `mxbai-embed-large` or `nomic-embed-text`.
  - OpenAI: Faster setup, requires an API key. Uses `text-embedding-3-small` or `text-embedding-3-large`.
- Save your settings to enable indexing capabilities.
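Under the hood, the embedding provider turns each piece of text into a numeric vector, and search works by comparing those vectors. A minimal sketch of the comparison step in Python (the 4-dimensional vectors below are made up for illustration; real embedding models emit hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 for identical directions, near 0.0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings standing in for real model output.
query = [0.9, 0.1, 0.0, 0.2]
doc_firewall = [0.85, 0.15, 0.05, 0.25]  # similar topic -> high score
doc_recipes = [0.0, 0.9, 0.4, 0.0]       # unrelated topic -> low score

print(cosine_similarity(query, doc_firewall) > cosine_similarity(query, doc_recipes))
```

This is why the choice of embedding model matters: a stronger model places related texts closer together in vector space, which directly improves retrieval quality.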
## How to Create a Knowledge Base
A knowledge base (referred to in the UI as a Document Repository) is a collection of files grouped together for specific context.
- Navigate to the Knowledge Base section in the sidebar.
- Click Create New Repository.
- Give it a descriptive name (e.g., "Project X Documentation" or "Company HR Policies").
- Drag and drop files or select a folder to start the indexing process.
> [!TIP]
> Witsy supports standard text formats, PDFs, and Markdown files. During indexing, you will see a progress bar as the application chunks the text and generates embeddings.
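The chunking step splits each document into overlapping windows before embedding them. A rough Python sketch of one common approach, fixed-size character chunks with overlap (the sizes are illustrative, not Witsy's actual settings):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap,
    so content spanning a boundary still appears whole in some chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

pieces = chunk_text("x" * 500)
print(len(pieces), len(pieces[0]), len(pieces[-1]))
```

Each chunk is then embedded separately, which is why indexing a large document produces many vectors and takes noticeably longer than indexing a short note.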
## Usage Scenarios
### Chatting with a Specific Repository
If you want the AI to focus exclusively on a set of documents:
- Start a new chat.
- In the chat configuration area (bottom or side panel), look for the Knowledge Base dropdown.
- Select your desired repository.
- Ask questions like: "Based on the attached documentation, how do I configure the firewall?"
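When a repository is selected, the assistant retrieves the chunks most relevant to your question and feeds them to the model as context. A toy Python sketch of that retrieval step, using word overlap as a stand-in for real embedding similarity (not Witsy's actual implementation):

```python
import re

def words(text: str) -> set[str]:
    """Lowercase word set; a real RAG pipeline compares embedding vectors instead."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks sharing the most words with the query."""
    return sorted(chunks, key=lambda c: len(words(query) & words(c)), reverse=True)[:k]

repo = [
    "To configure the firewall, open the network settings panel.",
    "Holiday requests must be submitted two weeks in advance.",
    "The firewall blocks inbound traffic on all ports by default.",
]
context = retrieve("How do I configure the firewall?", repo)

# The retrieved chunks are prepended to the prompt that goes to the model.
prompt = "Answer using only this context:\n" + "\n".join(context)
print(context[0])
```

This is also why phrasing matters: a question that shares vocabulary (or, with embeddings, meaning) with the documents retrieves better context and gets better answers.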
### Using Knowledge Bases with Folders
You can automate context by attaching a knowledge base to a specific folder. Any chat created within that folder will automatically inherit those documents as context.
- Right-click a folder in the sidebar and select Edit Folder.
- In the Defaults tab, select a Document Repository.
- Save changes.
## Recipe: Building a "Local First" Private Research Assistant
If you want to ensure your data never leaves your machine while using RAG, follow this configuration:
- Run Ollama locally: Ensure Ollama is running on your machine.
- Set Chat Engine: In Witsy settings, set your Chat Engine to Ollama and select a model (e.g., `llama3`).
- Set Embedding Engine: Set your Embedding Engine to Ollama and select `mxbai-embed-large`.
- Create Repository: Add your sensitive PDF/text files to a new repository.
- Query: You can now search and chat with your documents. All processing stays on your local hardware.
## Managing Your Index
To keep your knowledge base performant and relevant:
- Refresh Index: If you modify the files in a linked folder, click the Refresh icon in the Knowledge Base view to re-index changed files.
- Clear Index: If the AI is providing outdated information, you can clear the repository index and re-upload the files.
- Multi-Repo Search: You can attach multiple repositories to a single chat if your research spans across different projects.
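Conceptually, multi-repository search pools the scored chunks from every attached repository and re-ranks them together. A small Python sketch (the repository names and scores below are hypothetical):

```python
import heapq

def merge_results(repos: dict[str, list[tuple[float, str]]], k: int = 3) -> list[tuple[float, str]]:
    """Pool (score, chunk) pairs from every repository and keep the k best overall."""
    pooled = [item for results in repos.values() for item in results]
    return heapq.nlargest(k, pooled)

results = merge_results({
    "project-x": [(0.91, "Firewall setup guide"), (0.40, "Release notes")],
    "hr-policies": [(0.55, "VPN access policy")],
}, k=2)
print([chunk for _, chunk in results])
```

Because scores are comparable across repositories, the most relevant chunk wins regardless of which repository it came from.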
## Troubleshooting
| Issue | Solution |
| :--- | :--- |
| "No Embedding Provider Configured" | Go to Settings > Embeddings and ensure a provider is selected and the API key (if applicable) is valid. |
| Search results are irrelevant | Ensure the model used for embeddings is fully downloaded in Ollama, or try a more capable embedding model such as `text-embedding-3-large`. |
| Indexing is very slow | Large PDF files with many images can slow down extraction. Try converting them to Markdown or plain text for faster processing. |