AI SEO Automation Python

AI Keyword Classifier

Rules-first AI classification - 50 hours/month saved, <$1 per run

Client: Copenhagen-based SEO agency
Industry: Digital Marketing
Period: September 2025 - Present
Role: Independent Consultant

The Problem

Every SEO strategy requires keyword classification: categorizing thousands of keywords by intent, topic, funnel stage, and other dimensions. At this 60-client agency, manual classification was consuming a large share of analyst time:

  • Each new client required hours of keyword tagging
  • 60 clients meant endless categorization work
  • Inconsistency across team members created reporting problems
  • High-value analysts spent time on low-value data entry

The real cost: Approximately 50 hours per month per SEO analyst spent on manual keyword classification instead of strategy work.


The Solution

Architecture: Rules First, AI Second

Classification Pipeline

Why Rules First?

Approach             | Cost per 10K keywords | Consistency | Speed
100% GPT             | ~$5-10                | Variable    | Slow
Rules only           | $0                    | High        | Fast
Rules + GPT fallback | <$1                   | High        | Fast

Most keywords fall into predictable patterns. Rules handle 70-80% of classifications; GPT handles the edge cases.

The Hybrid Approach

  1. Rules handle the obvious (70-80% of keywords)

    • “buy nike shoes” → Transactional (rule: “buy *”)
    • “how to do seo” → Informational (rule: “how to *”)
  2. AI handles the ambiguous (20-30% of keywords)

    • Only keywords that don’t match rules go to GPT-4
    • Dramatically reduces AI costs
  3. Learning compounds

    • Once a keyword is classified, it’s cached
    • New clients benefit from prior classifications
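
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the production code: the `RULES` list, the function names, the in-memory dict used as a cache, and the `fake_llm` stand-in for the GPT-4 call are all assumptions made for the example.

```python
from fnmatch import fnmatchcase
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical rule list: (wildcard pattern, category), checked in order.
RULES: List[Tuple[str, str]] = [
    ("buy *", "Transactional"),
    ("how to *", "Informational"),
]


def normalize(keyword: str) -> str:
    """Lowercase and collapse whitespace so trivial variants share one entry."""
    return " ".join(keyword.lower().split())


def first_matching_rule(keyword: str) -> Optional[str]:
    """Return the category of the first rule that matches, or None."""
    for pattern, category in RULES:
        if fnmatchcase(keyword, pattern):
            return category
    return None


def classify_keywords(
    keywords: Iterable[str],
    llm_classify: Callable[[List[str]], Dict[str, str]],
    cache: Dict[str, str],
) -> Dict[str, str]:
    """Cache first, then rules, then one LLM call for whatever is left."""
    results: Dict[str, str] = {}
    pending: List[str] = []
    # dict.fromkeys drops duplicates while keeping first-seen order.
    for keyword in dict.fromkeys(normalize(k) for k in keywords):
        if keyword in cache:
            results[keyword] = cache[keyword]
            continue
        category = first_matching_rule(keyword)
        if category is not None:
            results[keyword] = category
        else:
            pending.append(keyword)
    if pending:
        # Only the ambiguous remainder reaches the (paid) model.
        results.update(llm_classify(pending))
    cache.update(results)  # later runs and later clients reuse these answers
    return results


def fake_llm(batch: List[str]) -> Dict[str, str]:
    """Stand-in for the real model call; always answers 'Navigational'."""
    return {keyword: "Navigational" for keyword in batch}
```

Calling `classify_keywords(["buy nike shoes", "nike login"], fake_llm, cache={})` would resolve the first keyword by rule and send only the second to the stand-in model. In a real deployment the cache would need to be persistent rather than an in-memory dict.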

Google Sheets Integration

Why Sheets?

  1. SEO team can edit rules without code deployments
  2. Taxonomy visible and auditable
  3. Version history built-in
  4. Collaborative editing

Sheet 1: Taxonomy

Category      | Description                | Parent
Transactional | Purchase intent keywords   | Intent
Informational | Research/learning keywords | Intent
Navigational  | Brand/site search keywords | Intent

Sheet 2: Rules

Pattern  | Category      | Priority
buy *    | Transactional | 1
how to * | Informational | 1
* price  | Transactional | 2
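
As a rough sketch of how rows from the Rules sheet could be turned into an ordered matcher: the code below assumes the rows arrive as a list of lists with the header row first, treats `*` as a shell-style wildcard, and assumes a lower Priority number wins. None of these details are confirmed by the project; `parse_rules` and `match_rule` are hypothetical names, and fetching the rows from the Sheets API is left out.

```python
from fnmatch import fnmatchcase
from typing import List, Optional, Tuple

Rule = Tuple[int, str, str]  # (priority, pattern, category)


def parse_rules(rows: List[List[str]]) -> List[Rule]:
    """Convert sheet rows (header first) into rules sorted by priority."""
    header, *body = rows
    col = {name.strip().lower(): i for i, name in enumerate(header)}
    rules: List[Rule] = []
    for row in body:
        # Skip blank or incomplete rows instead of failing the whole run.
        if len(row) < len(header) or not row[col["pattern"]].strip():
            continue
        rules.append((
            int(row[col["priority"]]),
            row[col["pattern"]].strip().lower(),
            row[col["category"]].strip(),
        ))
    # Assumption: priority 1 is checked before priority 2. The sort is
    # stable, so rules with equal priority keep their sheet order.
    return sorted(rules, key=lambda rule: rule[0])


def match_rule(keyword: str, rules: List[Rule]) -> Optional[str]:
    """Return the category of the highest-priority matching rule, if any."""
    normalized = " ".join(keyword.lower().split())
    for _priority, pattern, category in rules:
        if fnmatchcase(normalized, pattern):
            return category
    return None


SHEET_ROWS = [
    ["Pattern", "Category", "Priority"],
    ["buy *", "Transactional", "1"],
    ["how to *", "Informational", "1"],
    ["* price", "Transactional", "2"],
]
RULES = parse_rules(SHEET_ROWS)
```

With these rows, a keyword such as "how to check price" matches both `how to *` and `* price`; under the lower-number-wins assumption it resolves to Informational, which is the kind of conflict the Priority column exists to settle.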

Results

Cost Economics

Before: Manual Classification

Metric                    | Value
Analyst hourly cost       | ~$50/hour
Hours per 10,000 keywords | ~5 hours
Cost per classification run | ~$250

After: Automated Classification

Metric              | Value
OpenAI API cost     | <$1 per run
Analyst review time | ~15 minutes
Total cost per run  | <$15

ROI: roughly 94% cost reduction per classification run (from ~$250 to under $15)
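
The arithmetic behind these figures can be checked with a few lines. The inputs are the approximate values from the tables above; the function names are made up for the example, and the API cost is taken at its stated $1 ceiling.

```python
def manual_cost(rate_per_hour: float, hours: float) -> float:
    """Analyst cost of classifying one batch by hand."""
    return rate_per_hour * hours


def automated_cost(rate_per_hour: float, review_minutes: float, api_cost: float) -> float:
    """Analyst review time plus API spend for one automated run."""
    return rate_per_hour * review_minutes / 60 + api_cost


def cost_reduction(before: float, after: float) -> float:
    """Fraction of the original cost that was eliminated."""
    return 1 - after / before


before = manual_cost(rate_per_hour=50, hours=5)                           # 250.0
after = automated_cost(rate_per_hour=50, review_minutes=15, api_cost=1)   # 13.5
print(f"before: ${before:.2f}, after: ${after:.2f}")
print(f"reduction at the $15 ceiling: {cost_reduction(before, 15):.1%}")
print(f"reduction at ${after:.2f}: {cost_reduction(before, after):.1%}")
```

The 94% figure corresponds to the rounded $15 ceiling; with 15 minutes of review and a full $1 of API spend, the run costs $13.50, which puts the reduction slightly higher.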

Production Metrics

Metric       | Value
Time savings | ~50 hours/month per employee
Cost per run | <$1 OpenAI API
Test scale   | 10,000 rows × 3 columns
Users        | Entire SEO team

Operational Benefits

Before                                    | After
4-6 hours to classify new client keywords | 15 minutes to run and review
Different analysts categorized differently | Single source of truth
More clients = more analyst hours         | More clients = same infrastructure
Classification rules locked in code       | SEO team updates rules via Sheets

Technology Stack

Component     | Technology
Compute       | Google Cloud Run
LLM           | OpenAI GPT-4
Configuration | Google Sheets API
Language      | Python

Why Not Just Use ChatGPT?

Common question: “Can’t we just paste keywords into ChatGPT?”

Manual ChatGPT          | This System
Copy-paste required     | Fully automated
Inconsistent formatting | Standardized output
No memory across runs   | Persistent cache
$5-10 per large batch   | <$1 per batch
No audit trail          | Full logging
One person at a time    | Team-wide access

Lessons Learned

  1. Rules beat AI for predictable patterns. 70-80% of keywords follow patterns that simple rules handle faster and cheaper than LLMs. This exemplifies a core principle: workflows with model-powered steps outperform pure agent approaches when the path is predictable and cost-sensitive.

  2. Deduplication is a multiplier. Aggressive dedup before API calls dramatically reduces costs and improves consistency.

  3. Google Sheets as config layer works. Non-technical team members can update rules without developer involvement.

  4. Batch processing is essential. Per-keyword API calls are too expensive at scale.

  5. Start with cost constraints. Designing for <$1/run forced good architecture decisions (dedup, caching, rules-first).
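
Lessons 2 and 4 combine naturally: deduplicate first, then send what remains in batches. The sketch below shows one way to do that. The normalization rule (lowercase, collapsed whitespace) and the batch size are assumptions for illustration; the write-up does not say what the system actually uses.

```python
from typing import Iterable, Iterator, List


def normalize(keyword: str) -> str:
    """Lowercase and collapse whitespace so near-identical keywords merge."""
    return " ".join(keyword.lower().split())


def dedupe(keywords: Iterable[str]) -> List[str]:
    """Return unique normalized keywords in first-seen order, skipping blanks."""
    return list(dict.fromkeys(normalize(k) for k in keywords if k.strip()))


def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


raw = ["Nike Login", "nike  login", " NIKE LOGIN ", "adidas login", ""]
unique = dedupe(raw)
# Hypothetical batch size; each chunk would become one API request.
requests = list(batches(unique, size=200))
```

Here five raw rows collapse to two unique keywords and a single request. Because every variant of a keyword maps to the same normalized form, it also receives the same label, which is where the consistency gain comes from.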


Impact

This system changed how the SEO team operates:

  • Analysts focus on strategy, not data entry. 50 hours per month per employee shifted from classification to client work.
  • New client keyword classification dropped from hours to minutes. What took 4-6 hours of manual tagging now takes about 15 minutes to run and review.
  • The classification cache compounds. Every new client benefits from prior classifications - a growing advantage over competitors starting fresh each time.

The savings paid for the development within the first month.


Want to discuss AI classification?

Building a similar system for keyword tagging, content categorization, or document classification? I design cost-controlled AI systems that handle the predictable with rules and the ambiguous with LLMs - all without breaking the budget. Get in touch.