How to Build a Knowledge Base That Actually Works with AI

2026-03-16

knowledge-base, rag, best-practices, ai-agents

You've decided to deploy an AI agent. You upload your documentation, hit "go," and wait for the magic to happen.

But instead of helpful answers, your agent gives vague responses, cites irrelevant information, or worse — makes things up entirely.

The problem isn't the AI. It's your knowledge base.

AI agents powered by Retrieval-Augmented Generation (RAG) are only as good as the documents you feed them. Garbage in, garbage out.

This guide shows you how to build a knowledge base that actually works with AI — from document structure to maintenance strategies.


Why Most Knowledge Bases Fail with AI

Problem 1: Content Written for Humans, Not AI

Humans can infer context. AI agents can't (yet).

Human-friendly but AI-unfriendly:

Q: How much does it cost?
A: Check our pricing page.

AI-friendly:

Q: How much does it cost?
A: We offer four pricing plans:
- Free: $0/month — 1 agent, 100 messages/month
- Starter: $24/month — 3 agents, 5,000 messages/month
- Growth: $79/month — 10 agents, 25,000 messages/month
- Scale: $199/month — 50 agents, 100,000 messages/month

Annual billing saves about 17%. Enterprise plans available for 100+ agents.

Why it matters: AI agents need complete, self-contained information. An answer like "check the pricing page" forces users to leave the conversation.


Problem 2: Outdated or Conflicting Information

Without explicit dates or version information, AI agents can't reliably tell which document is more recent or authoritative.

If you have two documents saying:

  • Document A (2024): "Our Starter plan is $49/month"
  • Document B (2026): "Our Starter plan is $24/month"

The AI might cite either one, giving wrong information.

Solution: Remove or archive outdated content before uploading.
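One way to catch conflicts like this before uploading is to group documents by topic and keep only the newest one. The following is a minimal sketch, not a feature of any particular platform: the `topic` and `updated` fields are assumptions you would have to maintain yourself, and the exact dates are illustrative.

```python
from datetime import date

# Hypothetical document records; "topic" and "updated" are assumed fields.
docs = [
    {"title": "Document A", "topic": "starter-pricing", "updated": date(2024, 5, 1)},
    {"title": "Document B", "topic": "starter-pricing", "updated": date(2026, 3, 1)},
    {"title": "Password Reset Guide", "topic": "password-reset", "updated": date(2026, 3, 10)},
]


def split_current_and_archived(documents):
    """Keep the newest document per topic; the rest are candidates to archive."""
    newest = {}
    for doc in documents:
        best = newest.get(doc["topic"])
        if best is None or doc["updated"] > best["updated"]:
            newest[doc["topic"]] = doc
    current = list(newest.values())
    archived = [d for d in documents if all(d is not c for c in current)]
    return current, archived


current, archived = split_current_and_archived(docs)
print("Upload:", [d["title"] for d in current])
print("Archive:", [d["title"] for d in archived])
```

Treat the archive list as a prompt for human review rather than an automatic delete, since an older document can still contain details the newer one lacks.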


Problem 3: Too Much Irrelevant Content

More documents ≠ better answers.

If you upload 500 documents covering every topic imaginable, irrelevant content competes with the right document during retrieval, and the agent is more likely to pull the wrong one.

Example:

User asks: "How do I reset my password?"

If your knowledge base includes:

  • Product documentation
  • Marketing blog posts
  • Company history
  • Random meeting notes

The AI might pull from a blog post about "The History of Passwords" instead of the actual password reset guide.

Solution: Only upload content relevant to the agent's specific use case.


Problem 4: Poorly Structured Documents

Walls of text with no headings, bullet points, or structure confuse AI agents.

Bad structure:

Our company offers a variety of plans for different business needs. The free
plan is great for individuals or small projects and includes 1 agent with 100
messages per month. If you need more you can upgrade to Starter which is $24
per month and includes 3 agents and 5000 messages or Growth which is $79 per
month with 10 agents and 25000 messages. We also have Scale for $199 per month
with 50 agents and 100000 messages. Annual billing is available with a discount.

Good structure:

## Pricing Plans

We offer four pricing tiers:

### Free Plan
- **Price:** $0/month
- **Agents:** 1
- **Messages:** 100/month
- **Best for:** Individuals, small projects, trying out the platform

### Starter Plan
- **Price:** $24/month (or $20/month billed annually)
- **Agents:** 3
- **Messages:** 5,000/month
- **Best for:** Small teams, growing businesses

### Growth Plan
- **Price:** $79/month (or $66/month billed annually)
- **Agents:** 10
- **Messages:** 25,000/month
- **Best for:** Scaling businesses, multiple departments

### Scale Plan
- **Price:** $199/month (or $165/month billed annually)
- **Agents:** 50
- **Messages:** 100,000/month
- **Best for:** Large teams, high-volume use cases

Why it matters: Structured content is easier for AI to parse and cite accurately.


The 7 Principles of AI-Friendly Knowledge Bases

1. Self-Contained Information

Each document should answer questions completely without requiring users to jump to other resources.

Bad:

Q: How do I integrate with Slack?
A: See our integrations guide.

Good:

Q: How do I integrate with Slack?
A: Follow these steps to integrate Herm.Chat with Slack:

1. Go to your agent settings
2. Click "Integrations" → "Slack"
3. Click "Connect to Slack"
4. Authorize the Herm.Chat app in your workspace
5. Select which channels the agent should monitor
6. Configure @mention settings
7. Save and test

The integration takes 2-3 minutes to set up. Once connected, team members can
@mention your agent in Slack to ask questions.

2. Clear, Scannable Structure

Use headings, bullet points, and tables to organize information.

Why:

  • AI agents parse structured content more accurately
  • Users can quickly verify the agent's answer by scanning the source

Best practices:

  • Use H2 and H3 headings for sections
  • Break long paragraphs into bullet points
  • Use tables for comparisons
  • Bold key terms

3. Consistent Terminology

Use the same terms throughout your knowledge base.

Inconsistent (bad):

  • "AI agent" in one doc
  • "chatbot" in another
  • "bot" in another
  • "assistant" in another

Consistent (good):

  • Always use "AI agent"

Why: Inconsistent terminology confuses AI agents and leads to vague or conflicting answers.
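A quick scan can show where non-preferred terms still appear. This is a rough sketch that assumes "AI agent" is your preferred term and that the variants listed are the ones you want to flag; adjust both for your own vocabulary.

```python
import re

PREFERRED = "AI agent"
# Variants to flag; this list is an assumption based on the example above.
VARIANTS = ["chatbot", "bot", "assistant"]


def count_variants(text):
    """Count whole-word, case-insensitive uses of each non-preferred term."""
    counts = {}
    for term in VARIANTS:
        pattern = r"\b" + re.escape(term) + r"s?\b"
        found = len(re.findall(pattern, text, flags=re.IGNORECASE))
        if found:
            counts[term] = found
    return counts


sample = (
    "Our chatbot answers questions. The bot learns from your docs. "
    "Each AI agent is isolated."
)
for term, n in count_variants(sample).items():
    print(f"Replace {n} use(s) of '{term}' with '{PREFERRED}'")
```

The word-boundary match means "bot" inside "chatbot" is not counted twice.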


4. Source Attribution

Include metadata so users can verify information:

  • Document title
  • Last updated date
  • Author or department
  • Version number (if applicable)

Example:

---
Title: Password Reset Guide
Last Updated: March 2026
Department: IT
Version: 2.1
---

Why: When the AI cites this document, users can trust the information is current and authoritative.
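If your documents carry a header like the one above, a small script can confirm that none of the fields are missing before upload. This sketch assumes the simple `Key: Value` layout shown in the example and treats `Version` as optional; it is not a full YAML parser.

```python
REQUIRED_FIELDS = ["Title", "Last Updated", "Department"]  # Version is optional


def parse_metadata(document):
    """Read 'Key: Value' lines between the first two '---' lines."""
    lines = document.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    metadata = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    return metadata


doc = """---
Title: Password Reset Guide
Last Updated: March 2026
Department: IT
Version: 2.1
---
To reset your password, open Settings.
"""

meta = parse_metadata(doc)
missing = [field for field in REQUIRED_FIELDS if field not in meta]
print(meta["Title"], "| missing fields:", missing)
```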


5. No Assumptions or Jargon

Write clearly and explicitly. Don't assume prior knowledge.

Bad:

Q: How do I deploy the widget?
A: Just drop the script in the footer and you're good.

Good:

Q: How do I deploy the widget?
A: To add the Herm.Chat widget to your website:

1. Copy the embed code from your dashboard (Installation tab)
2. Open your website's HTML editor
3. Paste the code just before the closing </body> tag
4. Save and publish your changes

The widget will appear on all pages where the code is installed.

Platform-specific instructions:
- WordPress: Use a Header/Footer plugin or edit footer.php
- Shopify: Edit theme.liquid
- Wix/Squarespace: Use the Custom Code feature

6. Frequent Updates

AI agents don't know when information is outdated. You must actively maintain your knowledge base.

Maintenance schedule:

  • Weekly: Review and update high-traffic documents (e.g., pricing, features)
  • Monthly: Audit for outdated content
  • Quarterly: Full knowledge base review

Pro tip: Set calendar reminders to prevent your knowledge base from becoming stale.
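Beyond calendar reminders, you can flag overdue documents with a short script. A minimal sketch, assuming you track a `last_reviewed` date and a `review_every_days` interval per document (both are hypothetical fields, not something a platform is known to provide):

```python
from datetime import date


def overdue_documents(documents, today):
    """Return titles whose last review is older than their review interval."""
    overdue = []
    for doc in documents:
        age_days = (today - doc["last_reviewed"]).days
        if age_days > doc["review_every_days"]:
            overdue.append(doc["title"])
    return overdue


docs = [
    # High-traffic document: reviewed weekly.
    {"title": "Pricing", "last_reviewed": date(2026, 3, 1), "review_every_days": 7},
    # Standard document: reviewed monthly.
    {"title": "Setup Guide", "last_reviewed": date(2026, 3, 1), "review_every_days": 30},
]
print(overdue_documents(docs, today=date(2026, 3, 16)))
```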


7. Relevance Over Volume

Only upload documents that are directly relevant to the agent's purpose.

Customer-facing support agent:

  • Product documentation
  • Pricing information
  • FAQs
  • Setup guides
  • Troubleshooting steps

Don't upload:

  • Internal meeting notes
  • Marketing copy unrelated to product
  • Random blog posts
  • Personal opinions

Internal knowledge agent:

  • Employee handbook
  • IT documentation
  • Process guides
  • Runbooks
  • Onboarding materials

Don't upload:

  • External marketing content
  • Confidential strategy documents (unless intended for the agent)
  • Random emails

How to Build Your Knowledge Base: Step-by-Step

Step 1: Define Your Agent's Purpose

Before you upload anything, be crystal clear about what the agent should do.

Examples:

  • "Answer customer questions about our product, pricing, and features"
  • "Help employees find HR policies and IT troubleshooting guides"
  • "Assist sales reps with product information during customer calls"

Why it matters: Purpose determines what content to include.


Step 2: Audit Your Existing Content

Make a list of all potential documents:

  • Product docs
  • FAQs
  • Support articles
  • Wikis
  • Runbooks
  • Guides
  • Policies

For each document, ask:

  • Is this relevant to the agent's purpose?
  • Is this up-to-date?
  • Is this accurate?
  • Is this well-structured?

Delete or archive:

  • Outdated information
  • Irrelevant content
  • Drafts or works-in-progress

Step 3: Fill the Gaps

Identify common questions the agent should answer but you don't have documentation for.

Examples:

  • "Customers always ask about refunds, but we don't have a refund policy doc"
  • "Employees ask about expense reimbursement, but it's buried in a 50-page handbook"

Create new documents to fill these gaps before deploying your agent.


Step 4: Restructure for AI

Go through your remaining documents and apply the 7 principles:

  1. Make content self-contained (remove "see X for more")
  2. Add headings, bullet points, and structure
  3. Standardize terminology
  4. Add metadata (title, date, version)
  5. Simplify jargon and assumptions
  6. Verify accuracy
  7. Remove irrelevant sections
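The first item on this list is easy to check mechanically. The sketch below flags lines that send the reader elsewhere instead of answering; the phrase patterns are assumptions drawn from the "bad" examples in this guide, so expect to tune them for your own content.

```python
import re

# Phrases that usually signal a non-self-contained answer (assumed list).
DEFERRAL_PATTERNS = [
    r"\bsee (our|the) [\w\s/-]+? (guide|page|docs?)\b",
    r"\bcheck (our|the) [\w\s/-]+? page\b",
    r"\bsee \w+ for (details|more)\b",
]


def find_deferrals(text):
    """Return (line_number, line) pairs that point the reader elsewhere."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(re.search(p, line, flags=re.IGNORECASE) for p in DEFERRAL_PATTERNS):
            hits.append((number, line.strip()))
    return hits


sample = """Q: How much does it cost?
A: Check our pricing page.
Q: How do I integrate with Slack?
A: Follow these steps to integrate with Slack.
PTO: 15 days/year. See HR for details."""

for number, line in find_deferrals(sample):
    print(f"line {number}: {line}")
```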

Step 5: Organize by Category

Group documents into logical categories:

Customer-facing agent:

  • Pricing
  • Features
  • Installation
  • Troubleshooting
  • Integrations

Internal agent:

  • HR Policies
  • IT Support
  • Processes
  • Onboarding
  • Tools & Access

Why: Categories help you spot gaps and make maintenance easier.


Step 6: Upload and Test

Upload your documents to your AI agent platform (e.g., Herm.Chat).

Testing checklist:

  1. Ask 30-50 common questions
  2. Verify the agent cites the correct document
  3. Check for accuracy and completeness
  4. Look for vague or unhelpful answers
  5. Refine content based on results

Common issues:

  • Agent gives vague answers → Content is too sparse
  • Agent cites wrong documents → Too much irrelevant content uploaded
  • Agent says "I don't know" → Content is missing, or exists but isn't structured well
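The testing checklist can be partly automated by pairing each question with the document you expect the agent to cite. In this sketch, `ask_agent` is a hypothetical stand-in with canned replies; replace it with whatever query call your platform actually offers.

```python
def ask_agent(question):
    """Stand-in for a real platform call; returns canned replies for the demo."""
    canned = {
        "How do I reset my password?": {
            "answer": "Open Settings, then Security.",
            "source": "Password Reset Guide",
        },
        "How much is the Starter plan?": {
            "answer": "$24/month.",
            "source": "The History of Passwords",
        },
    }
    return canned.get(question, {"answer": "I don't know", "source": None})


# Each case: (question, title of the document the agent should cite).
TEST_CASES = [
    ("How do I reset my password?", "Password Reset Guide"),
    ("How much is the Starter plan?", "Pricing Guide"),
    ("What is your refund policy?", "Refund Policy"),
]


def run_tests(cases):
    """Return (question, expected, actual) for every wrong or missing citation."""
    failures = []
    for question, expected_source in cases:
        reply = ask_agent(question)
        if reply["source"] != expected_source:
            failures.append((question, expected_source, reply["source"]))
    return failures


for question, expected, actual in run_tests(TEST_CASES):
    print(f"FAIL: {question!r} expected {expected!r}, got {actual!r}")
```

Checking the cited source catches retrieval problems, but you still need a human to judge whether the answer text itself is accurate and complete.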

Step 7: Monitor and Iterate

After deployment, review real conversations:

  • What questions is the agent struggling with?
  • What documents is it citing most often?
  • Where are users frustrated?

Use this feedback to improve your knowledge base.


Examples: Bad vs Good Knowledge Base Entries

Example 1: Pricing

❌ Bad:

We have several plans. Check our pricing page for details.

✅ Good:

## Pricing Plans

We offer four pricing tiers to fit every business:

### Free Plan
- **Cost:** $0/month
- **Agents:** 1
- **Messages:** 100/month
- **Best for:** Trying out the platform

### Starter Plan
- **Cost:** $24/month ($20/month annually)
- **Agents:** 3
- **Messages:** 5,000/month
- **Best for:** Small teams

### Growth Plan
- **Cost:** $79/month ($66/month annually)
- **Agents:** 10
- **Messages:** 25,000/month
- **Best for:** Scaling businesses

### Scale Plan
- **Cost:** $199/month ($165/month annually)
- **Agents:** 50
- **Messages:** 100,000/month
- **Best for:** Large teams

Annual billing saves about 17%. Enterprise plans available for 100+ agents.

Example 2: Troubleshooting

❌ Bad:

If your widget isn't showing up, check your embed code.

✅ Good:

## Widget Not Showing Up?

Follow these troubleshooting steps:

### 1. Verify Embed Code Placement
- The code should be placed just before the closing </body> tag
- Check that the entire script is copied (no truncation)
- Ensure no extra characters were added when pasting

### 2. Check Browser Cache
- Clear your browser cache and hard refresh (Ctrl+Shift+R or Cmd+Shift+R)
- Test in an incognito/private window
- Try a different browser

### 3. Verify Domain Whitelisting
- Go to your agent settings → Installation
- Confirm your domain is listed under "Allowed Domains"
- Add your domain if missing and save

### 4. Check for JavaScript Conflicts
- Open your browser's developer console (F12)
- Look for errors related to "herm" or "widget"
- Common conflicts: older jQuery versions, ad blockers, privacy extensions

### 5. Still Not Working?
Contact support with:
- Your agent ID
- Your website URL
- Screenshots of the developer console errors
- What browser and device you're using

We typically respond within 2 hours.

Example 3: Internal Policy

❌ Bad:

PTO: 15 days/year. See HR for details.

✅ Good:

## Paid Time Off (PTO) Policy

### Accrual
- **Full-time employees:** 15 days per year
- **Part-time employees:** Pro-rated based on hours worked
- **Accrual starts:** After 90 days of employment
- **Accrual rate:** 1.25 days per month

### Usage
- PTO can be used for vacation, sick leave, or personal time
- Minimum increment: 4 hours (half day)
- Blackout dates: End-of-quarter (last 2 weeks of March, June, September, December)

### Request Process
1. Submit PTO request in BambooHR at least 2 weeks in advance
2. Manager approves or denies within 3 business days
3. For last-minute sick leave, notify your manager via Slack or email ASAP

### Carryover
- Up to 5 unused days can carry over to the next year
- Days beyond 5 are forfeited
- Carryover days must be used by June 30th

### Questions?
Contact HR at hr@company.com or Slack #ask-hr

Advanced: RAG Optimization Techniques

Chunking Strategy

When you upload documents, the AI platform splits them into smaller "chunks" for efficient search.

Best practices:

  • Keep paragraphs concise (3-5 sentences)
  • Use headings to create natural chunk boundaries
  • Avoid giant walls of text
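To see why headings matter, here is a simplified heading-based splitter. Real platforms typically also cap chunks by length and may overlap them, so treat this as an illustration of the idea rather than how any specific product works.

```python
def chunk_by_heading(markdown_text):
    """Start a new chunk at every '## ' or '### ' heading."""
    chunks = []
    current = []
    for line in markdown_text.splitlines():
        if line.startswith(("## ", "### ")) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


doc = """## Pricing Plans
We offer four pricing tiers:

### Free Plan
- Price: $0/month

### Starter Plan
- Price: $24/month
"""

for chunk in chunk_by_heading(doc):
    print(repr(chunk))
```

Each plan ends up in its own chunk with its heading attached, so a question about the Starter plan can retrieve just that section.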

Embedding Quality

RAG systems convert text into "embeddings" (vector representations) for semantic search.

Tips for better embeddings:

  • Use descriptive headings (not "Introduction" — use "How to Reset Your Password")
  • Include key terms in the first sentence of each section
  • Avoid pronouns without clear antecedents ("it," "this," "that" without context)

Similarity Thresholds

AI agents use similarity scores to decide which documents to retrieve.

  • Too low: Agent retrieves irrelevant documents
  • Too high: Agent says "I don't know" even when the answer exists

Recommended starting point: 0.7-0.8. The right value depends on the embedding model, and most platforms handle this automatically.
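The effect of the threshold is easier to see with numbers. This sketch uses cosine similarity on toy three-dimensional vectors; real embeddings have hundreds of dimensions, and the vectors and titles here are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, chunks, threshold=0.75):
    """Return (score, title) pairs at or above the threshold, best first."""
    scored = [(cosine_similarity(query_vec, vec), title) for title, vec in chunks]
    return sorted([(s, t) for s, t in scored if s >= threshold], reverse=True)


# Toy vectors standing in for real embeddings.
chunks = [
    ("Password Reset Guide", [0.9, 0.1, 0.0]),
    ("The History of Passwords", [0.6, 0.6, 0.3]),
    ("Pricing Plans", [0.0, 0.2, 0.9]),
]
query = [1.0, 0.0, 0.0]  # pretend embedding of "How do I reset my password?"

for threshold in (0.5, 0.75, 0.999):
    titles = [title for _, title in retrieve(query, chunks, threshold)]
    print(threshold, titles)
```

At 0.5 the blog post sneaks in alongside the reset guide, at 0.75 only the guide is returned, and at 0.999 nothing qualifies, which is when an agent falls back to "I don't know".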


Source Citation

Configure your agent to cite sources in responses:

Without citations:

Our Starter plan costs $24/month.

With citations:

Our Starter plan costs $24/month.

Source: Pricing Guide (updated March 2026)

Why: Users can verify information, building trust in your AI agent.
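If your platform lets you post-process responses, appending the citation can be a one-line formatting step. A small sketch, assuming the source title and last-updated date are available from the retrieved document's metadata (the function name is hypothetical):

```python
def format_answer(answer, source_title, last_updated):
    """Append a source line so readers can check where the answer came from."""
    if not source_title:
        return answer
    return f"{answer}\n\nSource: {source_title} (updated {last_updated})"


print(format_answer("Our Starter plan costs $24/month.", "Pricing Guide", "March 2026"))
```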


Measuring Knowledge Base Quality

Track these metrics:

1. Answer Accuracy Rate

How to measure: Manually review 50 conversations per week. What % of answers are accurate and helpful?

Target: 85-90%


2. "I Don't Know" Rate

How to measure: What % of questions result in "I don't have information on that"?

Target: <10%

If this is high, you're missing important content.


3. Source Citation Rate

How to measure: What % of answers include a source?

Target: >80%

If this is low, your documents may lack clear structure.


4. User Satisfaction

How to measure: Post-conversation survey (e.g., "How helpful was this answer?" rated 1-5)

Target: >4.0/5.0


5. Time to Answer

How to measure: Average time from question to complete answer

Target: <5 seconds

If this is slow, you may have too many documents or poor chunk sizes.
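All five metrics can be computed from a reviewed conversation log. This is a sketch under assumptions: the record fields (`accurate`, `dont_know`, `cited`, `rating`, `seconds`) are hypothetical and would come from your manual review and survey data, and the four sample records are invented.

```python
def summarize(conversations):
    """Compute the five quality metrics from reviewed conversation records."""
    total = len(conversations)

    def pct(count):
        return round(100 * count / total, 1)

    return {
        "accuracy_pct": pct(sum(c["accurate"] for c in conversations)),
        "dont_know_pct": pct(sum(c["dont_know"] for c in conversations)),
        "cited_pct": pct(sum(c["cited"] for c in conversations)),
        "avg_rating": round(sum(c["rating"] for c in conversations) / total, 2),
        "avg_seconds": round(sum(c["seconds"] for c in conversations) / total, 2),
    }


# Invented sample data for illustration only.
log = [
    {"accurate": True, "dont_know": False, "cited": True, "rating": 5, "seconds": 2.0},
    {"accurate": True, "dont_know": False, "cited": True, "rating": 4, "seconds": 3.0},
    {"accurate": False, "dont_know": False, "cited": False, "rating": 2, "seconds": 4.0},
    {"accurate": False, "dont_know": True, "cited": False, "rating": 3, "seconds": 1.0},
]

for metric, value in summarize(log).items():
    print(f"{metric}: {value}")
```

Running this weekly on the same sample size gives you a trend line, which is more useful than any single week's numbers.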


Common Mistakes to Avoid

❌ Mistake 1: Uploading Everything

Why it fails: Too much content = slower, less accurate answers.

Fix: Be selective. Only upload relevant content.


❌ Mistake 2: Ignoring Document Maintenance

Why it fails: Outdated information leads to wrong answers.

Fix: Set a monthly review schedule.


❌ Mistake 3: Copy-Pasting Website Content

Why it fails: Website copy is optimized for SEO and marketing, not Q&A.

Fix: Rewrite content specifically for AI agents (clear, structured, self-contained).


❌ Mistake 4: No Testing Before Launch

Why it fails: You discover gaps and errors only after customers complain.

Fix: Test with 30-50 real questions before deploying.


❌ Mistake 5: Assuming AI Will "Figure It Out"

Why it fails: AI agents need high-quality, structured input.

Fix: Put in the work upfront to organize your knowledge base properly.


Your Knowledge Base Checklist

Before deploying your AI agent, ensure you've:

  • Defined your agent's purpose clearly
  • Audited all existing content for relevance and accuracy
  • Removed or archived outdated information
  • Filled gaps in coverage (common questions without docs)
  • Restructured documents with headings, bullets, and tables
  • Standardized terminology across all documents
  • Added metadata (title, date, version) to each document
  • Made content self-contained (no "see X for more")
  • Tested with 30-50 real questions
  • Set up a monthly review schedule
  • Configured source citations in agent responses
  • Measured baseline metrics (accuracy, satisfaction, speed)

Ready to build a knowledge base that powers accurate, helpful AI agent responses?

Start Free — Upload your docs, deploy an AI agent, and see the difference a well-structured knowledge base makes. No credit card required.