How to Build a Knowledge Base That Actually Works with AI

You've decided to deploy an AI agent. You upload your documentation, hit "go," and wait for the magic to happen.

But instead of helpful answers, your agent gives vague responses, cites irrelevant information, or worse — makes things up entirely.

The problem isn't the AI. It's your knowledge base.

AI agents powered by Retrieval-Augmented Generation (RAG) are only as good as the documents you feed them. Garbage in, garbage out.

This guide shows you how to build a knowledge base that actually works with AI — from document structure to maintenance strategies.

Why Most Knowledge Bases Fail with AI

Problem 1: Content Written for Humans, Not AI

Humans can infer context. AI agents can't (yet).

Human-friendly but AI-unfriendly:

Q: How much does it cost?
A: Check our pricing page.

AI-friendly:

Q: How much does it cost?
A: We offer four pricing plans:
- Free: $0/month — 1 agent, 100 messages/month
- Starter: $24/month — 3 agents, 5,000 messages/month
- Growth: $79/month — 10 agents, 25,000 messages/month
- Scale: $199/month — 50 agents, 100,000 messages/month

Annual billing saves about 17%. Enterprise plans available for 100+ agents.

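Why it matters: A RAG agent answers from the text it retrieves, so each entry needs to be complete and self-contained. An entry that only says "check the pricing page" gives the agent nothing to answer with and forces users to leave the conversation.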

Problem 2: Outdated or Conflicting Information

Unless your documents carry a date or version, AI agents can't reliably tell which one is more recent or authoritative.

If you have two documents saying:

Document A (2024): "Our Starter plan is $49/month"
Document B (2026): "Our Starter plan is $24/month"

The AI might cite either one, giving wrong information.

Solution: Remove or archive outdated content before uploading.
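One way to catch conflicts like this before you upload is to scan your documents for competing price claims. Here's a minimal Python sketch, assuming prices are worded like the two examples above. The `find_price_conflicts` helper and its regex are hypothetical, not part of any platform.

```python
import re
from collections import defaultdict

# Matches phrases such as "Starter plan is $24/month". The wording is an
# assumption; adjust the pattern to match how your own documents state prices.
PRICE_CLAIM = re.compile(r"(\w+) plan is \$(\d+)/month", re.IGNORECASE)


def find_price_conflicts(documents: dict[str, str]) -> dict[str, dict[int, list[str]]]:
    """Return plans quoted at more than one price, with the documents quoting each price."""
    claims = defaultdict(lambda: defaultdict(list))
    for name, text in documents.items():
        for plan, price in PRICE_CLAIM.findall(text):
            claims[plan.lower()][int(price)].append(name)
    # Keep only plans that appear with two or more different prices.
    return {plan: dict(prices) for plan, prices in claims.items() if len(prices) > 1}


docs = {
    "Document A (2024)": "Our Starter plan is $49/month",
    "Document B (2026)": "Our Starter plan is $24/month",
}
conflicts = find_price_conflicts(docs)
```

Here `conflicts` lists both prices for the Starter plan along with the document that quotes each one, so you know which file to archive.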

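Problem 3: Too Much Irrelevant Content

The more unrelated material you upload, the more likely the agent is to retrieve and cite the wrong document. Too much content also makes answers slower.

Solution: Only upload documents that are directly relevant to the agent's purpose.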

Problem 4: Poorly Structured Documents

Walls of text with no headings, bullet points, or structure confuse AI agents.

Bad structure:

Our company offers a variety of plans for different business needs. The free
plan is great for individuals or small projects and includes 1 agent with 100
messages per month. If you need more you can upgrade to Starter which is $24
per month and includes 3 agents and 5000 messages or Growth which is $79 per
month with 10 agents and 25000 messages. We also have Scale for $199 per month
with 50 agents and 100000 messages. Annual billing is available with a discount.

Good structure:

## Pricing Plans

We offer four pricing tiers:

### Free Plan
- **Price:** $0/month
- **Agents:** 1
- **Messages:** 100/month
- **Best for:** Individuals, small projects, trying out the platform

### Starter Plan
- **Price:** $24/month (or $20/month billed annually)
- **Agents:** 3
- **Messages:** 5,000/month
- **Best for:** Small teams, growing businesses

### Growth Plan
- **Price:** $79/month (or $66/month billed annually)
- **Agents:** 10
- **Messages:** 25,000/month
- **Best for:** Scaling businesses, multiple departments

### Scale Plan
- **Price:** $199/month (or $165/month billed annually)
- **Agents:** 50
- **Messages:** 100,000/month
- **Best for:** Large teams, high-volume use cases

Why it matters: Structured content is easier for AI to parse and cite accurately.

The 7 Principles of AI-Friendly Knowledge Bases

1. Self-Contained Information

Each document should answer questions completely without requiring users to jump to other resources.

Bad:

Q: How do I integrate with Slack?
A: See our integrations guide.

Good:

Q: How do I integrate with Slack?
A: Follow these steps to integrate Herm.Chat with Slack:

1. Go to your agent settings
2. Click "Integrations" → "Slack"
3. Click "Connect to Slack"
4. Authorize the Herm.Chat app in your workspace
5. Select which channels the agent should monitor
6. Configure @mention settings
7. Save and test

The integration takes 2-3 minutes to set up. Once connected, team members can
@mention your agent in Slack to ask questions.

2. Clear, Scannable Structure

Use headings, bullet points, and tables to organize information.

Why:

AI agents parse structured content more accurately
Users can quickly verify the agent's answer by scanning the source

Best practices:

Use H2 and H3 headings for sections
Break long paragraphs into bullet points
Use tables for comparisons
Bold key terms

3. Consistent Terminology

Use the same terms throughout your knowledge base.

Inconsistent (bad):

"AI agent" in one doc
"chatbot" in another
"bot" in another
"assistant" in another

Consistent (good):

Always use "AI agent"

Why: Inconsistent terminology confuses AI agents and leads to vague or conflicting answers.
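If you want to find stray terms before uploading, a quick scan helps. The sketch below is a rough Python illustration that counts the variants from the "inconsistent" list above. The `count_discouraged_terms` helper is hypothetical, and the term list is something you'd tailor to your own docs.

```python
import re
from collections import Counter

# Variants to replace with the preferred term "AI agent".
DISCOURAGED_TERMS = ["chatbot", "bot", "assistant"]


def count_discouraged_terms(text: str) -> Counter:
    """Count whole-word occurrences of each discouraged term (singular or plural)."""
    counts = Counter()
    for term in DISCOURAGED_TERMS:
        # \b keeps "bot" from matching inside "chatbot".
        pattern = rf"\b{re.escape(term)}s?\b"
        matches = re.findall(pattern, text, flags=re.IGNORECASE)
        if matches:
            counts[term] = len(matches)
    return counts


sample = "Our chatbot answers billing questions. The bot replies in seconds. Ask the AI agent anything."
report = count_discouraged_terms(sample)
```

Run it over each document and fix the ones where `report` is not empty.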

4. Source Attribution

Include metadata so users can verify information:

Document title
Last updated date
Author or department
Version number (if applicable)

Example:

---
Title: Password Reset Guide
Last Updated: March 2026
Department: IT
Version: 2.1
---

Why: When the AI cites this document, users can see how current it is and who owns it, which makes the answer easier to trust.
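A metadata block like the one above is also easy to read with a script, which is handy for audits. Here's a minimal sketch, assuming every document starts with a block delimited by `---` lines and uses simple `Key: value` pairs. `parse_front_matter` is a hypothetical helper, and a real YAML parser would handle more cases.

```python
def parse_front_matter(document: str) -> tuple[dict[str, str], str]:
    """Split a document into (metadata, body). Returns ({}, document) if no block is found."""
    lines = document.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, document
    metadata = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the body.
            return metadata, "\n".join(lines[index + 1:])
        key, separator, value = line.partition(":")
        if separator:
            metadata[key.strip()] = value.strip()
    # No closing delimiter: treat the whole text as body.
    return {}, document


doc = """---
Title: Password Reset Guide
Last Updated: March 2026
Department: IT
Version: 2.1
---
To reset your password, open Settings."""
metadata, body = parse_front_matter(doc)
```

With the metadata in a dictionary, you can sort documents by `Last Updated` or list the ones with no owner.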

5. No Assumptions or Jargon

Write clearly and explicitly. Don't assume prior knowledge.

Bad:

Q: How do I deploy the widget?
A: Just drop the script in the footer and you're good.

Good:

Q: How do I deploy the widget?
A: To add the Herm.Chat widget to your website:

1. Copy the embed code from your dashboard (Installation tab)
2. Open your website's HTML editor
3. Paste the code just before the closing </body> tag
4. Save and publish your changes

The widget will appear on all pages where the code is installed.

Platform-specific instructions:
- WordPress: Use a Header/Footer plugin or edit footer.php
- Shopify: Edit theme.liquid
- Wix/Squarespace: Use the Custom Code feature

6. Frequent Updates

AI agents don't know when information is outdated. You must actively maintain your knowledge base.

Maintenance schedule:

Weekly: Review and update high-traffic documents (e.g., pricing, features)
Monthly: Audit for outdated content
Quarterly: Full knowledge base review

Pro tip: Set calendar reminders to prevent your knowledge base from becoming stale.
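If calendar reminders aren't enough, you can also let a script tell you what's overdue. This is a small Python sketch of the schedule above, assuming you record a review cadence and a last-reviewed date for each document. The field names and the day counts per cadence are assumptions.

```python
from datetime import date, timedelta

# Review cadences from the maintenance schedule, expressed in days.
# The exact day counts are an assumption.
CADENCE_DAYS = {"weekly": 7, "monthly": 30, "quarterly": 90}


def overdue_documents(documents: list[dict], today: date) -> list[str]:
    """Return titles of documents whose last review is older than their cadence allows."""
    overdue = []
    for doc in documents:
        allowed = timedelta(days=CADENCE_DAYS[doc["cadence"]])
        if today - doc["last_reviewed"] > allowed:
            overdue.append(doc["title"])
    return overdue


documents = [
    {"title": "Pricing Guide", "cadence": "weekly", "last_reviewed": date(2026, 3, 1)},
    {"title": "Password Reset Guide", "cadence": "monthly", "last_reviewed": date(2026, 3, 10)},
    {"title": "Employee Handbook", "cadence": "quarterly", "last_reviewed": date(2025, 11, 1)},
]
stale = overdue_documents(documents, today=date(2026, 3, 20))
```

In this example the pricing guide and the handbook come back as overdue, while the password guide is still within its monthly window.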

7. Relevance Over Volume

Only upload documents that are directly relevant to the agent's purpose.

Customer-facing support agent:

Product documentation
Pricing information
FAQs
Setup guides
Troubleshooting steps

Don't upload:

Internal meeting notes
Marketing copy unrelated to product
Random blog posts
Personal opinions

Internal knowledge agent:

Employee handbook
IT documentation
Process guides
Runbooks
Onboarding materials

Don't upload:

External marketing content
Confidential strategy documents (unless intended for the agent)
Random emails

How to Build Your Knowledge Base: Step-by-Step

Step 1: Define Your Agent's Purpose

Before you upload anything, be crystal clear about what the agent should do.

Examples:

"Answer customer questions about our product, pricing, and features"
"Help employees find HR policies and IT troubleshooting guides"
"Assist sales reps with product information during customer calls"

Why it matters: Purpose determines what content to include.

Step 2: Audit Your Existing Content

Make a list of all potential documents:

Product docs
FAQs
Support articles
Wikis
Runbooks
Guides
Policies

For each document, ask:

Is this relevant to the agent's purpose?
Is this up-to-date?
Is this accurate?
Is this well-structured?

Delete or archive:

Outdated information
Irrelevant content
Drafts or works-in-progress

Step 3: Fill the Gaps

Identify common questions the agent should answer but you don't have documentation for.

Examples:

"Customers always ask about refunds, but we don't have a refund policy doc"
"Employees ask about expense reimbursement, but it's buried in a 50-page handbook"

Create new documents to fill these gaps before deploying your agent.

Step 4: Restructure for AI

Go through your remaining documents and apply the 7 principles:

Make content self-contained (remove "see X for more")
Add headings, bullet points, and structure
Standardize terminology
Add metadata (title, date, version)
Simplify jargon and assumptions
Verify accuracy
Remove irrelevant sections
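The first item on that list is the easiest to automate. Here's a rough Python sketch that flags lines pointing elsewhere instead of answering. The phrase patterns are illustrative assumptions, and `find_pointer_lines` is a hypothetical helper, so expect to tune both for your own wording.

```python
import re

# Phrases that send the reader elsewhere instead of answering.
# This list is an assumption; extend it with the wording your docs use.
POINTER_PATTERNS = [
    re.compile(r"\b(see|check) (our|the) [\w\s/-]+? (guide|page|docs|documentation)\b", re.IGNORECASE),
    re.compile(r"\bsee \w+ for (more|details)\b", re.IGNORECASE),
]


def find_pointer_lines(text: str) -> list[tuple[int, str]]:
    """Return (line number, line) for every line that matches a pointer phrase."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in POINTER_PATTERNS):
            hits.append((number, line.strip()))
    return hits


sample = """Q: How do I integrate with Slack?
A: See our integrations guide.
Q: How much PTO do I get?
A: PTO: 15 days/year. See HR for details.
Q: How much does it cost?
A: We offer four pricing plans, starting at $0/month."""
flagged = find_pointer_lines(sample)
```

Each flagged line is a candidate for rewriting so the answer lives in the entry itself.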

Step 5: Organize by Category

Group documents into logical categories:

Customer-facing agent:

Pricing
Features
Installation
Troubleshooting
Integrations

Internal agent:

HR Policies
IT Support
Processes
Onboarding
Tools & Access

Why: Categories help you spot gaps and make maintenance easier.

Step 6: Upload and Test

Upload your documents to your AI agent platform (e.g., Herm.Chat).

Testing checklist:

Ask at least 20-30 common questions
Verify the agent cites the correct document
Check for accuracy and completeness
Look for vague or unhelpful answers
Refine content based on results

Common issues:

Agent gives vague answers → Content is too sparse
Agent cites wrong documents → Too much irrelevant content uploaded
Agent says "I don't know" → Content is missing, or it exists but isn't structured well enough to be retrieved
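You can turn the testing checklist into a repeatable script so you can rerun it after every content change. The sketch below is a minimal Python illustration. It assumes your platform gives you some way to ask a question and get back an answer plus the cited document; `ask_agent` and `fake_agent` are hypothetical stand-ins, not a real Herm.Chat API.

```python
from typing import Callable

# Each case pairs a question with the title of the document the answer should cite.
QuestionCase = tuple[str, str]


def run_test_questions(
    ask_agent: Callable[[str], tuple[str, str]],
    cases: list[QuestionCase],
) -> list[str]:
    """Ask each question and report the ones that cite the wrong document."""
    failures = []
    for question, expected_source in cases:
        _answer, cited_source = ask_agent(question)
        if cited_source != expected_source:
            failures.append(f"{question!r}: expected {expected_source!r}, got {cited_source!r}")
    return failures


def fake_agent(question: str) -> tuple[str, str]:
    """Stand-in for a real agent call; replace with your platform's client."""
    if "cost" in question.lower():
        return "Our Starter plan costs $24/month.", "Pricing Guide"
    return "I don't have information on that.", ""


cases = [
    ("How much does it cost?", "Pricing Guide"),
    ("How do I reset my password?", "Password Reset Guide"),
]
failures = run_test_questions(fake_agent, cases)
```

This only checks citations. Accuracy and completeness still need a human read of the answers.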

Step 7: Monitor and Iterate

After deployment, review real conversations:

What questions is the agent struggling with?
What documents is it citing most often?
Where are users frustrated?

Use this feedback to improve your knowledge base.

Examples: Bad vs Good Knowledge Base Entries

Example 1: Pricing

❌ Bad:

We have several plans. Check our pricing page for details.

✅ Good:

## Pricing Plans

We offer four pricing tiers to fit every business:

### Free Plan
- **Cost:** $0/month
- **Agents:** 1
- **Messages:** 100/month
- **Best for:** Trying out the platform

### Starter Plan
- **Cost:** $24/month ($20/month annually)
- **Agents:** 3
- **Messages:** 5,000/month
- **Best for:** Small teams

### Growth Plan
- **Cost:** $79/month ($66/month annually)
- **Agents:** 10
- **Messages:** 25,000/month
- **Best for:** Scaling businesses

### Scale Plan
- **Cost:** $199/month ($165/month annually)
- **Agents:** 50
- **Messages:** 100,000/month
- **Best for:** Large teams

Annual billing saves about 17%. Enterprise plans available for 100+ agents.

Example 2: Troubleshooting

❌ Bad:

If your widget isn't showing up, check your embed code.

✅ Good:

## Widget Not Showing Up?

Follow these troubleshooting steps:

### 1. Verify Embed Code Placement
- The code should be placed just before the closing </body> tag
- Check that the entire script is copied (no truncation)
- Ensure no extra characters were added when pasting

### 2. Check Browser Cache
- Clear your browser cache and hard refresh (Ctrl+Shift+R or Cmd+Shift+R)
- Test in an incognito/private window
- Try a different browser

### 3. Verify Domain Whitelisting
- Go to your agent settings → Installation
- Confirm your domain is listed under "Allowed Domains"
- Add your domain if missing and save

### 4. Check for JavaScript Conflicts
- Open your browser's developer console (F12)
- Look for errors related to "herm" or "widget"
- Common conflicts: older jQuery versions, ad blockers, privacy extensions

### 5. Still Not Working?
Contact support with:
- Your agent ID
- Your website URL
- Screenshots of the developer console errors
- What browser and device you're using

We typically respond within 2 hours.

Example 3: Internal Policy

❌ Bad:

PTO: 15 days/year. See HR for details.

✅ Good:

## Paid Time Off (PTO) Policy

### Accrual
- **Full-time employees:** 15 days per year
- **Part-time employees:** Pro-rated based on hours worked
- **Accrual starts:** After 90 days of employment
- **Accrual rate:** 1.25 days per month

### Usage
- PTO can be used for vacation, sick leave, or personal time
- Minimum increment: 4 hours (half day)
- Blackout dates: End-of-quarter (last 2 weeks of March, June, September, December)

### Request Process
1. Submit PTO request in BambooHR at least 2 weeks in advance
2. Manager approves or denies within 3 business days
3. For last-minute sick leave, notify your manager via Slack or email ASAP

### Carryover
- Up to 5 unused days can carry over to the next year
- Days beyond 5 are forfeited
- Carryover days must be used by June 30th

### Questions?
Contact HR at hr@company.com or Slack #ask-hr

Advanced: RAG Optimization Techniques

Chunking Strategy

When you upload documents, the AI platform splits them into smaller "chunks" for efficient search.

Best practices:

Keep paragraphs concise (3-5 sentences)
Use headings to create natural chunk boundaries
Avoid giant walls of text
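To see why headings matter, it helps to picture a simple chunker. The Python sketch below splits a markdown document at H2 and H3 headings. It's an illustration of the idea only: real platforms chunk in their own ways (often by token count), and `chunk_by_heading` is a hypothetical helper.

```python
import re

# Treat H2 and H3 headings as chunk boundaries.
HEADING = re.compile(r"^#{2,3}\s+(.*)$")


def chunk_by_heading(markdown: str) -> list[dict[str, str]]:
    """Split markdown into chunks, one per heading, keeping the heading as a label."""
    chunks = []
    title, lines = "Untitled", []

    def flush() -> None:
        # Skip sections that contain only blank lines.
        text = "\n".join(lines).strip()
        if text:
            chunks.append({"heading": title, "text": text})

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            title, lines = match.group(1).strip(), []
        else:
            lines.append(line)
    flush()
    return chunks


doc = """## Pricing Plans

We offer four pricing tiers:

### Free Plan
- **Price:** $0/month
- **Agents:** 1

### Starter Plan
- **Price:** $24/month
"""
chunks = chunk_by_heading(doc)
```

Each chunk carries its heading, so a search for "Starter plan price" can land on exactly the right section. A wall of text with no headings would come back as one big "Untitled" chunk.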

Embedding Quality

RAG systems convert text into "embeddings" (vector representations) for semantic search.

Tips for better embeddings:

Use descriptive headings (not "Introduction" — use "How to Reset Your Password")
Include key terms in the first sentence of each section
Avoid pronouns without clear antecedents ("it," "this," "that" without context)
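Two of these tips can be checked mechanically. Here's a crude Python sketch that flags generic headings and sections that open with a bare pronoun. The word lists are assumptions and `lint_section` is a hypothetical helper. It will produce false positives (for example, "This guide explains..."), so treat the output as prompts for review.

```python
# Headings that say nothing about the section's content (an illustrative list).
GENERIC_HEADINGS = {"introduction", "overview", "details", "notes", "miscellaneous"}
# Openers that often refer to something outside the section.
VAGUE_OPENERS = ("it ", "this ", "that ")


def lint_section(heading: str, first_sentence: str) -> list[str]:
    """Return warnings for a generic heading or a first sentence that opens with a pronoun."""
    warnings = []
    if heading.strip().lower() in GENERIC_HEADINGS:
        warnings.append(f"Generic heading: {heading!r}")
    if first_sentence.strip().lower().startswith(VAGUE_OPENERS):
        warnings.append("First sentence opens with a pronoun; name the subject instead")
    return warnings


bad = lint_section("Introduction", "This lets you get back into your account.")
good = lint_section("How to Reset Your Password", "To reset your password, open Settings.")
```

The first section triggers both warnings, and the second passes cleanly.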

Similarity Thresholds

AI agents use similarity scores to decide which documents to retrieve.

Too low: Agent retrieves irrelevant documents
Too high: Agent says "I don't know" even when the answer exists

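Recommended starting point: 0.7-0.8, assuming scores on a 0-1 scale such as cosine similarity. The right value depends on the embedding model, and most platforms handle this automatically.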
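Here's what the threshold does in practice. The Python sketch below scores three chunks against a query with cosine similarity and keeps only those at or above the threshold. The two-dimensional vectors are toy values made up for illustration (real embeddings have hundreds of dimensions), and `retrieve` is a hypothetical helper.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def retrieve(query: list[float], chunks: dict[str, list[float]], threshold: float) -> list[str]:
    """Return chunk titles scoring at or above the threshold, best match first."""
    scored = [(cosine_similarity(query, vector), title) for title, vector in chunks.items()]
    kept = sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)
    return [title for _score, title in kept]


# Toy embeddings, invented for this example.
query = [1.0, 0.0]
chunks = {
    "Pricing Guide": [0.9, 0.1],
    "Refund Policy": [0.5, 0.5],
    "Careers Page": [0.0, 1.0],
}

too_low = retrieve(query, chunks, threshold=0.0)     # everything, including the unrelated page
balanced = retrieve(query, chunks, threshold=0.75)   # only the close match
too_high = retrieve(query, chunks, threshold=0.999)  # nothing, so the agent has no answer
```

The same three chunks give three different outcomes depending on the threshold alone, which is why it's worth testing if your platform exposes it.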

Source Citation

Configure your agent to cite sources in responses:

Without citations:

Our Starter plan costs $24/month.

With citations:

Our Starter plan costs $24/month.

Source: Pricing Guide (updated March 2026)

Why: Users can verify information, building trust in your AI agent.
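Most platforms add citations for you, but if you're building the response yourself, the format is simple. Here's a tiny Python sketch that reproduces the cited answer above; `with_citation` is a hypothetical helper, and the title and date would come from your document metadata.

```python
def with_citation(answer: str, source_title: str, last_updated: str) -> str:
    """Append a source line to an answer, using the document's title and update date."""
    return f"{answer}\n\nSource: {source_title} (updated {last_updated})"


reply = with_citation("Our Starter plan costs $24/month.", "Pricing Guide", "March 2026")
```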

Measuring Knowledge Base Quality

Track these metrics:

1. Answer Accuracy Rate

How to measure: Manually review 50 conversations per week. What % of answers are accurate and helpful?

Target: 85-90%

2. "I Don't Know" Rate

How to measure: What % of questions result in "I don't have information on that"?

Target: <10%

If this is high, you're missing important content.

3. Source Citation Rate

How to measure: What % of answers include a source?

Target: >80%

If this is low, your documents may lack clear structure.

4. User Satisfaction

How to measure: Post-conversation survey ("How helpful was this answer?" rated 1-5)

Target: >4.0/5.0

5. Time to Answer

How to measure: Average time from question to complete answer

Target: <5 seconds

If this is slow, you may have too many documents or poor chunk sizes.
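Once you've reviewed a batch of conversations, the five metrics are quick to compute. Here's a minimal Python sketch, assuming you record one row per reviewed answer with the fields shown. The field names are assumptions, and the rating assumes a 1-5 survey scale.

```python
from statistics import mean


def summarize(reviews: list[dict]) -> dict[str, float]:
    """Compute the five knowledge base metrics from manually reviewed answers."""
    if not reviews:
        raise ValueError("Need at least one reviewed answer")
    total = len(reviews)
    return {
        "accuracy_rate": sum(r["accurate"] for r in reviews) / total,
        "dont_know_rate": sum(r["said_dont_know"] for r in reviews) / total,
        "citation_rate": sum(r["cited_source"] for r in reviews) / total,
        "avg_rating": mean(r["rating"] for r in reviews),
        "avg_seconds": mean(r["seconds_to_answer"] for r in reviews),
    }


# Four made-up review rows for illustration.
reviews = [
    {"accurate": True, "said_dont_know": False, "cited_source": True, "rating": 5, "seconds_to_answer": 2.0},
    {"accurate": True, "said_dont_know": False, "cited_source": True, "rating": 4, "seconds_to_answer": 3.0},
    {"accurate": False, "said_dont_know": True, "cited_source": False, "rating": 2, "seconds_to_answer": 1.0},
    {"accurate": True, "said_dont_know": False, "cited_source": True, "rating": 5, "seconds_to_answer": 2.0},
]
summary = summarize(reviews)
```

Compare each value in `summary` against the targets above. In this sample the accuracy rate (75%) and the "I don't know" rate (25%) would both miss their targets.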

Common Mistakes to Avoid

❌ Mistake 1: Uploading Everything

Why it fails: Too much content = slower, less accurate answers.

Fix: Be selective. Only upload relevant content.

❌ Mistake 2: Ignoring Document Maintenance

Why it fails: Outdated information leads to wrong answers.

Fix: Set a regular review schedule: weekly for high-traffic documents, a monthly audit, and a full review every quarter.

❌ Mistake 3: Copy-Pasting Website Content

Why it fails: Website copy is optimized for SEO and marketing, not Q&A.

Fix: Rewrite content specifically for AI agents (clear, structured, self-contained).

❌ Mistake 4: No Testing Before Launch

Why it fails: You discover gaps and errors only after customers complain.

Fix: Test with 30-50 real questions before deploying.

❌ Mistake 5: Assuming AI Will "Figure It Out"

Why it fails: AI agents need high-quality, structured input.

Fix: Put in the work upfront to organize your knowledge base properly.

Your Knowledge Base Checklist

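Before deploying your AI agent, ensure you've:

Defined your agent's purpose
Audited your existing content and archived anything outdated or irrelevant
Filled the gaps for common questions
Restructured your documents to follow the 7 principles
Organized your documents by category
Tested with real questions and refined the content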

Ready to build a knowledge base that powers accurate, helpful AI agent responses?

Start Free — Upload your docs, deploy an AI agent, and see the difference a well-structured knowledge base makes. No credit card required.