Using GitHub as a Knowledge Base for Your AI Tools
A well-structured GitHub repository gives your AI tools a searchable, versioned, plaintext knowledge base. Here's how to set one up.
You gave your AI a folder of documents. It found some of them. It missed the rest. The problem wasn’t the AI - it was the structure.
Most people store knowledge the same way they’ve always stored it: files scattered across Notion pages, Google Docs with names like “Final v3 FINAL”, Confluence wikis that haven’t been updated since 2023. When AI tools try to use this as context, they’re working with a haystack. A well-structured GitHub repository is a filing cabinet - and it’s one of the best ones available.
Why GitHub, not Notion or Google Docs
The tools you already use for documentation aren’t bad. They’re just not built for AI to read.
Notion stores content in a proprietary format. To read a Notion page, an AI needs to use the API, handle the block structure, and parse nested content. Even then, unnamed blocks and database views don’t translate cleanly. Notion is great for humans browsing. It’s friction for AI reading.
Google Docs is similar. Rich text formatting, comments embedded in the content, export that doesn’t always preserve structure. Feeding a Google Doc to an AI usually means copying and pasting - and losing the links between documents.
GitHub repositories, by contrast:
- Store documentation as plaintext Markdown files - a format AI tools can read without conversion or export
- Version every change, so your knowledge has history, not just a current state
- Have a built-in structure: README, files, issues, wiki, discussions - each a distinct bucket for a distinct purpose
- Generate stable URLs for everything - a decision documented in an issue is linkable forever
- Are searchable by both humans and machines: a cloned repo is just files on disk, so standard tools like git and grep work without a proprietary export step
When you point an AI tool at a well-structured GitHub repo, it can navigate it the way a new colleague would navigate a well-organised shared drive - if you’ve done the work of organising it well.
Where Obsidian fits
Obsidian deserves a specific mention because it looks, at first, like the obvious answer. It stores notes as local Markdown files, those notes can be linked together, and the result is often called a vault. For personal knowledge work, that’s excellent. Research notes, meeting notes, half-formed thinking, reading highlights, project journals - an Obsidian vault is a good home for all of that.
But a notes vault and a project knowledge base are not the same thing.
Obsidian is usually shaped around how one person thinks. The structure is personal: folders, tags, backlinks, daily notes, conventions you remember because you made them. An AI can search that if you connect it properly, but the search quality depends on whether your notes are named, linked, and written clearly enough for something else to interpret.
GitHub is better when the knowledge is meant to be acted on by other people or tools. A decision in a repo can be reviewed, linked to an issue, connected to a pull request, and traced through its commit history. A README can tell the AI where to start. A file path can become a stable pointer. That matters when the goal is not just “find my notes” but “help me work on this project safely.”
The practical split is simple:
- Use Obsidian for personal notes, research, and long-form thinking.
- Use GitHub for versioned project knowledge, decisions, templates, and source material you expect AI tools to act on.
- Use your AI tool’s memory feature for decisions, commitments, blockers, preferences, and follow-through that need to survive across sessions.
You can use all three. The mistake is expecting one of them to do every job.
The anatomy of a useful repo
Structure is what makes a repository actually usable as a knowledge base. Here’s what each area is for.
README.md - the front door
This is the first thing AI tools read. It should answer three questions: what is this repo, what’s in it, and where should someone look for what they need. Think of it as the index card on the outside of the filing cabinet.
A README that says “Company knowledge base” and nothing else is useless. A README that says “This repo contains decisions, project briefs, reference docs, and team context - see below for what lives where” gives the AI a navigation map before it reads anything else.
context/ - structured reference
A folder of documents that should be read rather than searched. Good candidates:
- decisions.md - why things are the way they are. This is the most valuable document in most knowledge bases and the one most often missing. Not “we use Stripe” but “we chose Stripe over Paddle because of the EU VAT handling - revisit when we expand to APAC.”
- glossary.md - shared vocabulary. If you use terms that mean something specific in your company or project, define them here. AI tools hallucinate meanings for unexplained jargon.
- people.md - who’s involved and what they do. Roles, responsibilities, who owns what decision.
- principles.md - non-negotiables and working norms. The things that should inform every decision but aren’t written anywhere.
Issues - work in progress
Issues aren’t just bug trackers. They’re a record of open questions, decisions in flight, and tasks. AI tools can query them. Use labels to group related issues so the AI can filter - “decision”, “open question”, “blocked” as labels turn a flat list into something navigable.
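As a rough illustration of what "query them" can look like, here is a minimal Python sketch that builds a request against GitHub's REST endpoint for listing repository issues, filtered by label. The owner and repo names are placeholders, and the helper names (issues_url, fetch_issue_titles) are made up for this example. It assumes a public repository; a private one would also need an authentication token in the request headers.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def issues_url(owner: str, repo: str, labels: list[str], state: str = "open") -> str:
    """Build the REST URL that lists a repo's issues, filtered by label names."""
    # GitHub expects label names as one comma-separated "labels" parameter.
    query = urlencode({"labels": ",".join(labels), "state": state})
    return f"https://api.github.com/repos/{owner}/{repo}/issues?{query}"


def fetch_issue_titles(url: str) -> list[str]:
    """Fetch one page of results and return the issue titles (needs network access)."""
    request = Request(url, headers={"Accept": "application/vnd.github+json"})
    with urlopen(request) as response:
        items = json.load(response)
    # Note: this endpoint also returns pull requests, which GitHub treats as issues.
    return [item["title"] for item in items]
```

Calling issues_url("my-org", "my-company-kb", ["decision"]) gives a URL you can hand to fetch_issue_titles, or to any tool that can make an HTTP request.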
Wiki - long-form reference
Process documentation, onboarding material, anything that changes rarely and needs to be read start-to-finish. The wiki is separate from the codebase, which keeps it from cluttering your file structure.
Discussions - async decisions with trails
Unlike Slack, GitHub Discussions are searchable, permanent, and linked. If a decision was made in a thread, that thread is still there in six months. Use Discussions for anything where you want a record of the conversation, not just the conclusion.
Structure that actually works
A template you can start from:
my-company-kb/
├── README.md                 ← what this is, what's here, where to look
├── context/
│   ├── decisions.md          ← why things are the way they are
│   ├── glossary.md           ← shared vocabulary
│   ├── people.md             ← who's involved and what they do
│   └── principles.md         ← non-negotiables and working norms
├── projects/
│   └── {project-name}/       ← one folder per ongoing project
│       ├── brief.md
│       └── status.md
└── reference/
    └── {topic}.md            ← deep-dive reference docs
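If you'd rather not create those files by hand, a short script can lay down the skeleton. This is a minimal sketch, not an official template: "example-project" and "example-topic" are placeholder names standing in for {project-name} and {topic}, and each file is created with nothing but a heading so you have something to fill in.

```python
from pathlib import Path

# Relative paths taken from the template above. "example-project" and
# "example-topic" are placeholders for {project-name} and {topic}.
TEMPLATE_FILES = (
    "README.md",
    "context/decisions.md",
    "context/glossary.md",
    "context/people.md",
    "context/principles.md",
    "projects/example-project/brief.md",
    "projects/example-project/status.md",
    "reference/example-topic.md",
)


def scaffold(root: Path) -> list[Path]:
    """Create any missing template files under root; existing files are left alone."""
    created = []
    for rel in TEMPLATE_FILES:
        path = root / rel
        if path.exists():
            continue  # never overwrite content someone has already written
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(f"# {path.stem.replace('-', ' ')}\n", encoding="utf-8")
        created.append(path)
    return created
```

Running scaffold(Path("my-company-kb")) returns the list of files it created, and it is safe to run again later because it skips anything that already exists.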
A few naming and structure principles that matter more than most people realise:
Descriptive filenames. q2-pricing-decision.md is useful. notes.md is not. AI tools use filenames as signals before reading content - a vague name gives them no reason to open the file.
Clear headings inside documents. An AI scanning a document uses headings to decide whether to read deeper. Buried information under a generic heading (“Miscellaneous”) often gets missed.
Explicit cross-links. If decisions.md references a project, link to the project folder. If a project status references a principle, link to it. AI tools follow links. A document that’s only reachable by knowing it exists is easy to miss.
One document, one purpose. The instinct is to put everything in one long doc. Resist it. Separate concerns so the AI can fetch exactly what’s relevant, not wade through everything to find one section.
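The cross-linking principle is the easiest one to check mechanically. The sketch below is one possible approach, not a standard tool: starting from README.md, it follows relative Markdown links and reports two things - links that point at files that don't exist, and documents that nothing links to. The function names are made up for this example, and the regex only handles plain inline links of the form [text](target), not reference-style links.

```python
import re
from pathlib import Path

LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")  # inline links: [text](target)
SCHEME_RE = re.compile(r"[a-z][a-z0-9+.-]*:", re.IGNORECASE)  # http:, mailto:, ...


def local_links(doc: Path) -> list[Path]:
    """Return the resolved targets of relative links in one Markdown file."""
    targets = []
    for target in LINK_RE.findall(doc.read_text(encoding="utf-8")):
        if SCHEME_RE.match(target) or target.startswith("#"):
            continue  # skip external URLs and in-page anchors
        targets.append((doc.parent / target.split("#")[0]).resolve())
    return targets


def audit(root: Path, start: str = "README.md") -> tuple[set[Path], set[Path]]:
    """Follow links from the start file; return (broken targets, unreachable docs)."""
    root = root.resolve()
    all_docs = set(root.rglob("*.md"))
    seen: set[Path] = set()
    broken: set[Path] = set()
    queue = [root / start]
    while queue:
        doc = queue.pop()
        if doc in seen:
            continue
        seen.add(doc)
        for target in local_links(doc):
            if not target.exists():
                broken.add(target)
            elif target.suffix == ".md":
                queue.append(target)  # only Markdown files are followed further
    return broken, all_docs - seen
```

Anything in the second set is a document that is only reachable by knowing it exists, which is exactly the case the principle warns about.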
How to feed it to your AI tools
Structure is half the job. The other half is the pointer.
Claude Projects: paste the repo URL directly, or upload key files. Claude will read the content and use it as context for every conversation in that project.
Cursor: use a .cursorrules file (newer versions also support project rules under .cursor/rules) alongside a context/ folder in the workspace. Point Cursor at your KB by referencing the key files from the rules.
Claude Code: a CLAUDE.md file in any directory tells Claude Code what matters in that project. Include links or references to your external KB.
The pattern is the same across tools: the AI needs an explicit pointer, not just access. “The context you need is in this repo, specifically start with README and then context/decisions.md” works. Hoping it will discover the right documents on its own usually doesn’t.
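For tools that only accept pasted or uploaded text, the same pointer can be made literal. Here is a minimal sketch, assuming a local clone of the repo: it reads the files in the order the pointer names them and joins them into one block, each under a header showing its path. The START_HERE tuple and the build_context name are illustrative, not part of any tool's API.

```python
from pathlib import Path

# Reading order from the example pointer above; extend it for your own repo.
START_HERE = ("README.md", "context/decisions.md")


def build_context(root: Path, order: tuple[str, ...] = START_HERE) -> str:
    """Concatenate the listed files in order, each under a header naming its path."""
    parts = []
    for rel in order:
        path = root / rel
        if not path.is_file():
            # Fail loudly: a pointer to a missing file is worse than no pointer.
            raise FileNotFoundError(f"pointer names a missing file: {rel}")
        body = path.read_text(encoding="utf-8").strip()
        parts.append(f"--- {rel} ---\n{body}\n")
    return "\n".join(parts)
```

The path headers matter more than they look: they let the AI cite where a piece of context came from, and they let you spot when it is quoting the wrong file.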
Bootstrap your README
If you’re starting from scratch, this prompt does the work of the first draft:
I'm setting up a GitHub repository as a knowledge base for my AI tools to reference.
Here's what the repo will contain: [describe your content areas]
Write me a README.md that:
- Explains what this repo is and who uses it
- Lists each folder and what belongs there
- Has a brief "how to use this" section for both humans and AI tools
- Is written in plain, clear language - not corporate documentation style
A good README is worth an hour of setup. It’s the document that makes everything else findable.
A ready-to-fork template repo - sebastianebg/github-as-kb - is coming soon. It includes the folder structure above, starter files for each document, and a README template.
Next: Memory, Project Files, and Retrieval: Which One Do You Need? - the broader framework for deciding what belongs where.