How to Build a Knowledge Base That Actually Works with AI
2026-03-16
You've decided to deploy an AI agent. You upload your documentation, hit "go," and wait for the magic to happen.
But instead of helpful answers, your agent gives vague responses, cites irrelevant information, or worse — makes things up entirely.
The problem isn't the AI. It's your knowledge base.
AI agents powered by Retrieval-Augmented Generation (RAG) are only as good as the documents you feed them. Garbage in, garbage out.
This guide shows you how to build a knowledge base that actually works with AI — from document structure to maintenance strategies.
Why Most Knowledge Bases Fail with AI
Problem 1: Content Written for Humans, Not AI
Humans can infer context. AI agents can't (yet).
Human-friendly but AI-unfriendly:
Q: How much does it cost?
A: Check our pricing page.
AI-friendly:
Q: How much does it cost?
A: We offer four pricing plans:
- Free: $0/month — 1 agent, 100 messages/month
- Starter: $24/month — 3 agents, 5,000 messages/month
- Growth: $79/month — 10 agents, 25,000 messages/month
- Scale: $199/month — 50 agents, 100,000 messages/month
Annual billing saves 17%. Enterprise plans available for 100+ agents.
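Why it matters: AI agents need complete, self-contained information. The agent can only answer from the text it retrieves, so a pointer like "check the pricing page" leaves it with nothing to quote and forces users to leave the conversation.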
Problem 2: Outdated or Conflicting Information
Retrieval ranks documents by how closely they match the question, not by which one is newer or more authoritative.
If you have two documents saying:
- Document A (2024): "Our Starter plan is $49/month"
- Document B (2026): "Our Starter plan is $24/month"
The AI might cite either one, giving wrong information.
Solution: Remove or archive outdated content before uploading.
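One way to find archive candidates before uploading is to group documents by topic and keep only the newest in each group. This is a minimal sketch, not a feature of any particular platform; the `topic`, `year`, and `text` fields are hypothetical and assume you already track that information for each document.

```python
from collections import defaultdict

def split_current_and_outdated(documents):
    """Keep the newest document per topic; return (current, outdated).

    Each document is a dict with hypothetical keys 'topic', 'year', 'text'.
    """
    by_topic = defaultdict(list)
    for doc in documents:
        by_topic[doc["topic"]].append(doc)

    current, outdated = [], []
    for docs in by_topic.values():
        # Sort newest first so index 0 is the version to keep.
        newest_first = sorted(docs, key=lambda d: d["year"], reverse=True)
        current.append(newest_first[0])
        outdated.extend(newest_first[1:])
    return current, outdated

docs = [
    {"topic": "starter-price", "year": 2024, "text": "Our Starter plan is $49/month"},
    {"topic": "starter-price", "year": 2026, "text": "Our Starter plan is $24/month"},
]
current, outdated = split_current_and_outdated(docs)
```

Here `current` holds the 2026 document and `outdated` holds the 2024 one, which you would archive rather than upload.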
Problem 3: Too Much Irrelevant Content
More documents ≠ better answers.
If you upload 500 documents covering every topic imaginable, retrieval has far more chances to surface a passage that looks related to the question but doesn't answer it.
Example:
User asks: "How do I reset my password?"
If your knowledge base includes:
- Product documentation
- Marketing blog posts
- Company history
- Random meeting notes
The AI might pull from a blog post about "The History of Passwords" instead of the actual password reset guide.
Solution: Only upload content relevant to the agent's specific use case.
Problem 4: Poorly Structured Documents
Walls of text with no headings, bullet points, or structure confuse AI agents.
Bad structure:
Our company offers a variety of plans for different business needs. The free
plan is great for individuals or small projects and includes 1 agent with 100
messages per month. If you need more you can upgrade to Starter which is $24
per month and includes 3 agents and 5000 messages or Growth which is $79 per
month with 10 agents and 25000 messages. We also have Scale for $199 per month
with 50 agents and 100000 messages. Annual billing is available with a discount.
Good structure:
## Pricing Plans
We offer four pricing tiers:
### Free Plan
- **Price:** $0/month
- **Agents:** 1
- **Messages:** 100/month
- **Best for:** Individuals, small projects, trying out the platform
### Starter Plan
- **Price:** $24/month (or $20/month billed annually)
- **Agents:** 3
- **Messages:** 5,000/month
- **Best for:** Small teams, growing businesses
### Growth Plan
- **Price:** $79/month (or $66/month billed annually)
- **Agents:** 10
- **Messages:** 25,000/month
- **Best for:** Scaling businesses, multiple departments
### Scale Plan
- **Price:** $199/month (or $165/month billed annually)
- **Agents:** 50
- **Messages:** 100,000/month
- **Best for:** Large teams, high-volume use cases
Why it matters: Structured content is easier for AI to parse and cite accurately.
The 7 Principles of AI-Friendly Knowledge Bases
1. Self-Contained Information
Each document should answer questions completely without requiring users to jump to other resources.
Bad:
Q: How do I integrate with Slack?
A: See our integrations guide.
Good:
Q: How do I integrate with Slack?
A: Follow these steps to integrate Herm.Chat with Slack:
1. Go to your agent settings
2. Click "Integrations" → "Slack"
3. Click "Connect to Slack"
4. Authorize the Herm.Chat app in your workspace
5. Select which channels the agent should monitor
6. Configure @mention settings
7. Save and test
The integration takes 2-3 minutes to set up. Once connected, team members can
@mention your agent in Slack to ask questions.
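You can scan drafts for answers that send readers elsewhere. The sketch below is a rough heuristic with an assumed phrase list (`DEFERRAL_PATTERNS` is made up for illustration), so expect false positives and tune it to your own writing style.

```python
import re

# Hypothetical phrases that defer the answer to another resource.
DEFERRAL_PATTERNS = [
    r"\bsee (our|the)\b",
    r"\bcheck (our|the)\b",
    r"\brefer to\b",
    r"\bfor more (details|information)\b",
]

def find_deferrals(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that appear to point elsewhere."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(re.search(p, line, flags=re.IGNORECASE) for p in DEFERRAL_PATTERNS):
            hits.append((number, line.strip()))
    return hits

draft = "Q: How do I integrate with Slack?\nA: See our integrations guide."
flagged = find_deferrals(draft)
```

Each flagged line is a candidate for rewriting with the full steps inline.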
2. Clear, Scannable Structure
Use headings, bullet points, and tables to organize information.
Why:
- AI agents parse structured content more accurately
- Users can quickly verify the agent's answer by scanning the source
Best practices:
- Use H2 and H3 headings for sections
- Break long paragraphs into bullet points
- Use tables for comparisons
- Bold key terms
3. Consistent Terminology
Use the same terms throughout your knowledge base.
Inconsistent (bad):
- "AI agent" in one doc
- "chatbot" in another
- "bot" in another
- "assistant" in another
Consistent (good):
- Always use "AI agent"
Why: Inconsistent terminology confuses AI agents and leads to vague or conflicting answers.
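A quick script can show where non-preferred terms still appear. This is a sketch under the assumption that "AI agent" is your preferred term and the variants are the ones listed above; swap in your own glossary.

```python
import re
from collections import Counter

# Assumed glossary: the preferred term and the variants to flag.
PREFERRED = "AI agent"
VARIANTS = ["chatbot", "bot", "assistant"]

def count_variants(text: str) -> Counter:
    """Count whole-word occurrences (singular or plural) of each variant."""
    counts = Counter()
    for term in VARIANTS:
        pattern = rf"\b{re.escape(term)}s?\b"
        matches = re.findall(pattern, text, flags=re.IGNORECASE)
        if matches:
            counts[term] = len(matches)
    return counts

sample = "Our chatbot answers questions. The bot is fast. Ask the AI agent."
variant_counts = count_variants(sample)
```

The word-boundary match means "bot" inside "chatbot" is not counted twice. Review each hit by hand before replacing it with the preferred term, since a word like "assistant" may be legitimate in another context.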
4. Source Attribution
Include metadata so users can verify information:
- Document title
- Last updated date
- Author or department
- Version number (if applicable)
Example:
---
Title: Password Reset Guide
Last Updated: March 2026
Department: IT
Version: 2.1
---
Why: When the AI cites this document, users can check how current and authoritative the information is.
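If you keep metadata in a header like the one above, a small parser can pull it out for audits. This sketch assumes the header is delimited by `---` lines and contains simple `Key: value` pairs; it is not a full YAML parser, and your platform may expect a different metadata format.

```python
def parse_front_matter(document: str) -> tuple[dict[str, str], str]:
    """Split a document into (metadata, body).

    Assumes the metadata block is delimited by lines containing only '---'
    and holds simple 'Key: value' pairs.
    """
    lines = document.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, document

    metadata = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the body.
            return metadata, "\n".join(lines[index + 1:])
        key, separator, value = line.partition(":")
        if separator:
            metadata[key.strip()] = value.strip()

    # No closing delimiter found: treat the whole text as body.
    return {}, document

doc = (
    "---\n"
    "Title: Password Reset Guide\n"
    "Last Updated: March 2026\n"
    "Department: IT\n"
    "Version: 2.1\n"
    "---\n"
    "## Steps\n"
    "Go to settings."
)
metadata, body = parse_front_matter(doc)
```

Documents that return an empty `metadata` dict are the ones still missing a header.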
5. No Assumptions or Jargon
Write clearly and explicitly. Don't assume prior knowledge.
Bad:
Q: How do I deploy the widget?
A: Just drop the script in the footer and you're good.
Good:
Q: How do I deploy the widget?
A: To add the Herm.Chat widget to your website:
1. Copy the embed code from your dashboard (Installation tab)
2. Open your website's HTML editor
3. Paste the code just before the closing </body> tag
4. Save and publish your changes
The widget will appear on all pages where the code is installed.
Platform-specific instructions:
- WordPress: Use a Header/Footer plugin or edit footer.php
- Shopify: Edit theme.liquid
- Wix/Squarespace: Use the Custom Code feature
6. Frequent Updates
AI agents don't know when information is outdated. You must actively maintain your knowledge base.
Maintenance schedule:
- Weekly: Review and update high-traffic documents (e.g., pricing, features)
- Monthly: Audit for outdated content
- Quarterly: Full knowledge base review
Pro tip: Set calendar reminders to prevent your knowledge base from becoming stale.
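Calendar reminders work, and so does a script that lists what is overdue. The sketch below assumes you record a last-reviewed date and a tier for each document; the field names and the 7-day and 30-day intervals are illustrative choices that loosely mirror the weekly and monthly schedule above.

```python
from datetime import date

# Assumed review intervals in days, loosely mirroring the schedule above.
REVIEW_INTERVAL_DAYS = {"high-traffic": 7, "standard": 30}

def overdue_documents(documents, today):
    """Return titles whose last review is older than their tier's interval.

    `documents` is a list of dicts with hypothetical keys:
    'title', 'last_reviewed' (a date), and 'tier'.
    """
    overdue = []
    for doc in documents:
        age_in_days = (today - doc["last_reviewed"]).days
        if age_in_days > REVIEW_INTERVAL_DAYS[doc["tier"]]:
            overdue.append(doc["title"])
    return overdue

library = [
    {"title": "Pricing", "last_reviewed": date(2026, 3, 1), "tier": "high-traffic"},
    {"title": "Onboarding", "last_reviewed": date(2026, 3, 1), "tier": "standard"},
]
needs_review = overdue_documents(library, today=date(2026, 3, 16))
```

Both documents were reviewed 15 days ago, so only the high-traffic "Pricing" document is flagged.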
7. Relevance Over Volume
Only upload documents that are directly relevant to the agent's purpose.
Customer-facing support agent:
- Product documentation
- Pricing information
- FAQs
- Setup guides
- Troubleshooting steps
Don't upload:
- Internal meeting notes
- Marketing copy unrelated to product
- Random blog posts
- Personal opinions
Internal knowledge agent:
- Employee handbook
- IT documentation
- Process guides
- Runbooks
- Onboarding materials
Don't upload:
- External marketing content
- Confidential strategy documents (unless intended for the agent)
- Random emails
How to Build Your Knowledge Base: Step-by-Step
Step 1: Define Your Agent's Purpose
Before you upload anything, be crystal clear about what the agent should do.
Examples:
- "Answer customer questions about our product, pricing, and features"
- "Help employees find HR policies and IT troubleshooting guides"
- "Assist sales reps with product information during customer calls"
Why it matters: Purpose determines what content to include.
Step 2: Audit Your Existing Content
Make a list of all potential documents:
- Product docs
- FAQs
- Support articles
- Wikis
- Runbooks
- Guides
- Policies
For each document, ask:
- Is this relevant to the agent's purpose?
- Is this up-to-date?
- Is this accurate?
- Is this well-structured?
Delete or archive:
- Outdated information
- Irrelevant content
- Drafts or works-in-progress
Step 3: Fill the Gaps
Identify common questions the agent should answer but you don't have documentation for.
Examples:
- "Customers always ask about refunds, but we don't have a refund policy doc"
- "Employees ask about expense reimbursement, but it's buried in a 50-page handbook"
Create new documents to fill these gaps before deploying your agent.
Step 4: Restructure for AI
Go through your remaining documents and apply the 7 principles:
- Make content self-contained (remove "see X for more")
- Add headings, bullet points, and structure
- Standardize terminology
- Add metadata (title, date, version)
- Simplify jargon and assumptions
- Verify accuracy
- Remove irrelevant sections
Step 5: Organize by Category
Group documents into logical categories:
Customer-facing agent:
- Pricing
- Features
- Installation
- Troubleshooting
- Integrations
Internal agent:
- HR Policies
- IT Support
- Processes
- Onboarding
- Tools & Access
Why: Categories help you spot gaps and make maintenance easier.
Step 6: Upload and Test
Upload your documents to your AI agent platform (e.g., Herm.Chat).
Testing checklist:
- Ask 30-50 common questions
- Verify the agent cites the correct document
- Check for accuracy and completeness
- Look for vague or unhelpful answers
- Refine content based on results
Common issues:
- Agent gives vague answers → Content is too sparse
- Agent cites wrong documents → Too much irrelevant content uploaded
- Agent says "I don't know" → Content is missing, or it exists but isn't structured well
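The checklist above can be partly automated once you have a list of questions and the document each answer should come from. This is a sketch only: `ask` stands in for whatever function calls your agent, and it is assumed to return a dict with `answer` and `source` keys, which may not match your platform's actual response format.

```python
from typing import Callable

def run_test_questions(ask: Callable[[str], dict], cases: list[dict]) -> list[dict]:
    """Run each test case and report cases where the cited source is wrong.

    `ask` is a placeholder for your agent call; it is assumed to return
    a dict with 'answer' and 'source' keys.
    """
    failures = []
    for case in cases:
        response = ask(case["question"])
        if response.get("source") != case["expected_source"]:
            failures.append({
                "question": case["question"],
                "expected": case["expected_source"],
                "got": response.get("source"),
            })
    return failures

def fake_agent(question: str) -> dict:
    """Stand-in agent used only to demonstrate the harness."""
    if "password" in question.lower():
        return {"answer": "Go to Settings.", "source": "The History of Passwords"}
    return {"answer": "Starter is $24/month.", "source": "Pricing Guide"}

cases = [
    {"question": "How do I reset my password?", "expected_source": "Password Reset Guide"},
    {"question": "How much is Starter?", "expected_source": "Pricing Guide"},
]
failures = run_test_questions(fake_agent, cases)
```

This only checks which document was cited. Accuracy and completeness of the answer text still need a human review.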
Step 7: Monitor and Iterate
After deployment, review real conversations:
- What questions is the agent struggling with?
- What documents is it citing most often?
- Where are users frustrated?
Use this feedback to improve your knowledge base.
Examples: Bad vs Good Knowledge Base Entries
Example 1: Pricing
❌ Bad:
We have several plans. Check our pricing page for details.
✅ Good:
## Pricing Plans
We offer four pricing tiers to fit every business:
### Free Plan
- **Cost:** $0/month
- **Agents:** 1
- **Messages:** 100/month
- **Best for:** Trying out the platform
### Starter Plan
- **Cost:** $24/month ($20/month annually)
- **Agents:** 3
- **Messages:** 5,000/month
- **Best for:** Small teams
### Growth Plan
- **Cost:** $79/month ($66/month annually)
- **Agents:** 10
- **Messages:** 25,000/month
- **Best for:** Scaling businesses
### Scale Plan
- **Cost:** $199/month ($165/month annually)
- **Agents:** 50
- **Messages:** 100,000/month
- **Best for:** Large teams
Annual billing saves 17%. Enterprise plans available for 100+ agents.
Example 2: Troubleshooting
❌ Bad:
If your widget isn't showing up, check your embed code.
✅ Good:
## Widget Not Showing Up?
Follow these troubleshooting steps:
### 1. Verify Embed Code Placement
- The code should be placed just before the closing </body> tag
- Check that the entire script is copied (no truncation)
- Ensure no extra characters were added when pasting
### 2. Check Browser Cache
- Clear your browser cache and hard refresh (Ctrl+Shift+R or Cmd+Shift+R)
- Test in an incognito/private window
- Try a different browser
### 3. Verify Domain Whitelisting
- Go to your agent settings → Installation
- Confirm your domain is listed under "Allowed Domains"
- Add your domain if missing and save
### 4. Check for JavaScript Conflicts
- Open your browser's developer console (F12)
- Look for errors related to "herm" or "widget"
- Common conflicts: older jQuery versions, ad blockers, privacy extensions
### 5. Still Not Working?
Contact support with:
- Your agent ID
- Your website URL
- Screenshots of the developer console errors
- What browser and device you're using
We typically respond within 2 hours.
Example 3: Internal Policy
❌ Bad:
PTO: 15 days/year. See HR for details.
✅ Good:
## Paid Time Off (PTO) Policy
### Accrual
- **Full-time employees:** 15 days per year
- **Part-time employees:** Pro-rated based on hours worked
- **Accrual starts:** After 90 days of employment
- **Accrual rate:** 1.25 days per month
### Usage
- PTO can be used for vacation, sick leave, or personal time
- Minimum increment: 4 hours (half day)
- Blackout dates: End-of-quarter (last 2 weeks of March, June, September, December)
### Request Process
1. Submit PTO request in BambooHR at least 2 weeks in advance
2. Manager approves or denies within 3 business days
3. For last-minute sick leave, notify your manager via Slack or email ASAP
### Carryover
- Up to 5 unused days can carry over to the next year
- Days beyond 5 are forfeited
- Carryover days must be used by June 30th
### Questions?
Contact HR at hr@company.com or Slack #ask-hr
Advanced: RAG Optimization Techniques
Chunking Strategy
When you upload documents, the AI platform splits them into smaller "chunks" for efficient search.
Best practices:
- Keep paragraphs concise (3-5 sentences)
- Use headings to create natural chunk boundaries
- Avoid giant walls of text
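To see why headings make good chunk boundaries, here is a simplified illustration that splits Markdown at H2 and H3 headings. Real platforms use their own chunking logic (often with size limits and overlap), so treat this as a way to preview how your documents might divide, not as what any specific product does.

```python
import re

def chunk_by_headings(markdown_text: str) -> list[dict[str, str]]:
    """Split Markdown into chunks that each start at an H2 or H3 heading."""
    chunks = []
    heading = "(no heading)"
    buffer = []

    def flush():
        # Only emit a chunk if there is non-blank content under the heading.
        if any(part.strip() for part in buffer):
            chunks.append({"heading": heading, "text": "\n".join(buffer).strip()})

    for line in markdown_text.splitlines():
        match = re.match(r"^(#{2,3})\s+(.*)", line)
        if match:
            flush()
            heading = match.group(2).strip()
            buffer = []
        else:
            buffer.append(line)
    flush()
    return chunks

pricing = (
    "## Pricing Plans\n"
    "We offer four pricing tiers:\n"
    "### Free Plan\n"
    "- **Price:** $0/month\n"
    "- **Agents:** 1\n"
    "### Starter Plan\n"
    "- **Price:** $24/month"
)
chunks = chunk_by_headings(pricing)
```

Each plan ends up in its own chunk labeled with its heading, which is what you want when a user asks about one specific plan.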
Embedding Quality
RAG systems convert text into "embeddings" (vector representations) for semantic search.
Tips for better embeddings:
- Use descriptive headings (not "Introduction" — use "How to Reset Your Password")
- Include key terms in the first sentence of each section
- Avoid pronouns without clear antecedents ("it," "this," "that" without context)
Similarity Thresholds
AI agents use similarity scores to decide which documents to retrieve.
Too low: Agent retrieves irrelevant documents
Too high: Agent says "I don't know" even when the answer exists
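Recommended: 0.7 - 0.8 as a starting point (most platforms handle this automatically). The right value depends on the embedding model and similarity metric, so tune it against your own test questions.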
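The effect of the threshold is easier to see with numbers. The sketch below uses cosine similarity on made-up two-dimensional vectors; real embeddings have hundreds or thousands of dimensions, and your platform may use a different similarity measure.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query_vec, chunks, threshold=0.75):
    """Return (score, title) pairs at or above the threshold, best first."""
    scored = [(cosine_similarity(query_vec, vec), title) for title, vec in chunks]
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)

# Toy vectors invented for illustration only.
query = [1.0, 0.0]
chunks = [
    ("Password Reset Guide", [0.9, 0.1]),
    ("The History of Passwords", [0.5, 0.5]),
    ("Company History", [0.0, 1.0]),
]
default_hits = [title for _, title in retrieve(query, chunks)]
loose_hits = [title for _, title in retrieve(query, chunks, threshold=0.5)]
strict_hits = retrieve(query, chunks, threshold=0.999)
```

At the default threshold only the reset guide is returned. Lowering it to 0.5 also lets in the blog post, and raising it to 0.999 returns nothing, which is the "I don't know" case.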
Source Citation
Configure your agent to cite sources in responses:
Without citations:
Our Starter plan costs $24/month.
With citations:
Our Starter plan costs $24/month.
Source: Pricing Guide (updated March 2026)
Why: Users can verify information, building trust in your AI agent.
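If your platform lets you post-process responses, appending a citation can be as simple as the helper below. The function name and arguments are hypothetical; many platforms offer a built-in citation setting, which is preferable when available.

```python
def with_citation(answer: str, title: str, last_updated: str) -> str:
    """Append a source line built from the document's metadata."""
    return f"{answer}\nSource: {title} (updated {last_updated})"

reply = with_citation(
    "Our Starter plan costs $24/month.",
    title="Pricing Guide",
    last_updated="March 2026",
)
```

The title and date come straight from the document's metadata, which is one more reason to add it to every document.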
Measuring Knowledge Base Quality
Track these metrics:
1. Answer Accuracy Rate
How to measure: Manually review 50 conversations per week. What % of answers are accurate and helpful?
Target: 85-90%
2. "I Don't Know" Rate
How to measure: What % of questions result in "I don't have information on that"?
Target: <10%
If this is high, you're missing important content.
3. Source Citation Rate
How to measure: What % of answers include a source?
Target: >80%
If this is low, your documents may lack clear structure.
4. User Satisfaction
How to measure: Post-conversation survey ("How helpful was this answer?" rated 1-5)
Target: >4.0/5.0
5. Time to Answer
How to measure: Average time from question to complete answer
Target: <5 seconds
If this is slow, you may have too many documents or poor chunk sizes.
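Once you have a batch of reviewed conversations, the rates above are simple averages. This sketch assumes each conversation has been labeled with hypothetical fields (`accurate`, `unanswered`, `cited`, `seconds`); how you collect those labels depends on your platform and review process.

```python
def summarize_metrics(conversations):
    """Compute rates from reviewed conversations.

    Each conversation is a dict with hypothetical keys:
    'accurate' (bool), 'unanswered' (bool), 'cited' (bool), 'seconds' (float).
    """
    total = len(conversations)
    if total == 0:
        raise ValueError("no conversations to summarize")
    return {
        "accuracy_rate": sum(c["accurate"] for c in conversations) / total,
        "unknown_rate": sum(c["unanswered"] for c in conversations) / total,
        "citation_rate": sum(c["cited"] for c in conversations) / total,
        "avg_seconds": sum(c["seconds"] for c in conversations) / total,
    }

reviewed = [
    {"accurate": True, "unanswered": False, "cited": True, "seconds": 2.0},
    {"accurate": True, "unanswered": False, "cited": True, "seconds": 4.0},
    {"accurate": True, "unanswered": False, "cited": False, "seconds": 3.0},
    {"accurate": False, "unanswered": True, "cited": False, "seconds": 3.0},
]
summary = summarize_metrics(reviewed)
```

Compare each value in `summary` against the targets above. A sample of four is only for illustration; the 50 conversations per week suggested earlier gives a more stable picture.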
Common Mistakes to Avoid
❌ Mistake 1: Uploading Everything
Why it fails: Too much content = slower, less accurate answers.
Fix: Be selective. Only upload relevant content.
❌ Mistake 2: Ignoring Document Maintenance
Why it fails: Outdated information leads to wrong answers.
Fix: Set a monthly review schedule.
❌ Mistake 3: Copy-Pasting Website Content
Why it fails: Website copy is optimized for SEO and marketing, not Q&A.
Fix: Rewrite content specifically for AI agents (clear, structured, self-contained).
❌ Mistake 4: No Testing Before Launch
Why it fails: You discover gaps and errors only after customers complain.
Fix: Test with 30-50 real questions before deploying.
❌ Mistake 5: Assuming AI Will "Figure It Out"
Why it fails: AI agents need high-quality, structured input.
Fix: Put in the work upfront to organize your knowledge base properly.
Your Knowledge Base Checklist
Before deploying your AI agent, ensure you've:
- Defined your agent's purpose clearly
- Audited all existing content for relevance and accuracy
- Removed or archived outdated information
- Filled gaps in coverage (common questions without docs)
- Restructured documents with headings, bullets, and tables
- Standardized terminology across all documents
- Added metadata (title, date, version) to each document
- Made content self-contained (no "see X for more")
- Tested with 30-50 real questions
- Set up a monthly review schedule
- Configured source citations in agent responses
- Measured baseline metrics (accuracy, satisfaction, speed)
Ready to build a knowledge base that powers accurate, helpful AI agent responses?
Start Free — Upload your docs, deploy an AI agent, and see the difference a well-structured knowledge base makes. No credit card required.