What Are LLMs, Actually?

Large Language Models explained without hype or jargon. The 4,096-dimensional mental model that explains hallucination, context windows, and how to use them well.

TL;DR: Large Language Models are prediction engines that understand language in thousands of dimensions simultaneously. Where a spreadsheet sees “Paris” as a cell in a row, an LLM sees Paris as a city, a location in France, a location in Texas, a woman’s name, a concept associated with romance, fashion, history—all at once. Context determines which meaning surfaces. This is remarkably powerful, and understanding it explains everything about how to use them well.


“We’re using LLMs” has become shorthand for “we’re doing AI.”

I hear it in boardrooms, pitch decks, and strategy sessions. But when I ask what an LLM actually is, most people can’t answer. They know it powers ChatGPT. They know it’s expensive. They know it feels like talking to something intelligent.

This gap matters. Not because LLMs are dangerous or overhyped—but because understanding the tool unlocks what it can actually do. And what it can do is remarkable.

From Two Dimensions to Four Thousand

Here’s the mental model that changed how I explain this.

[Figure: a 2D spreadsheet versus a 4,096-dimensional LLM representation]

Think about a spreadsheet. You have rows and columns. If you want to store information about Paris, you might have:

City     State/Country
Paris    France
Paris    Texas, USA

That’s two-dimensional. Paris relates to France in one row, Texas in another. The relationships are flat, explicit, manually defined. To find “Paris, France” you look it up. The database retrieves exactly what you stored.
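To make that concrete, here's what the flat lookup looks like in code (a minimal Python sketch; the rows are just the example above):

```python
# A spreadsheet/database lookup: flat, explicit, manually defined.
cities = {
    ("Paris", "France"): {"population": 2_100_000},
    ("Paris", "Texas, USA"): {"population": 25_000},
}

print(cities[("Paris", "France")])  # exact key in, exact row out
print(cities.get(("Paris",)))       # ambiguous key: None — no context, no guess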

Now think about how an LLM “knows” Paris.

When the model processes the word “Paris,” it doesn’t retrieve a row. It activates a representation across 4,096+ dimensions. In that space, “Paris” exists simultaneously as:

  • A city (near “London,” “Rome,” “Berlin”)
  • A location in France (near “Lyon,” “Marseille,” “French”)
  • A location in Texas (near “Houston,” “Dallas,” “Texan”)
  • A person’s name (near “Hilton,” “celebrity,” “heiress”)
  • A concept (near “romance,” “fashion,” “Eiffel Tower,” “croissant”)

All of these are true at the same time. The model holds all these meanings in parallel, weighted by how strongly each dimension activates.

So when you ask “What is the capital of France?”—the context (“capital,” “France”) collapses that 4,096-dimensional cloud into the city meaning. The model predicts “Paris” because in that context, city-Paris is the strongest signal.

But ask “Who was on The Simple Life?” and the same word “Paris” now activates completely different dimensions—celebrity, television, Hilton. Same token, different meaning, determined entirely by context.

[Figure: how context collapses dimensional meaning]

This is what databases cannot do. A spreadsheet needs you to specify which Paris you want. An LLM figures it out from context, the way humans do.
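You can watch this collapse happen with off-the-shelf embeddings. Here's a minimal sketch using the sentence-transformers library; the model name and the candidate meanings are illustrative choices, not a recommendation:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose embedder

# The same word in two different contexts.
contexts = [
    "What is the capital of France? Paris.",
    "Who was on The Simple Life? Paris.",
]
# Two candidate meanings to compare against.
meanings = ["a European capital city", "an American celebrity"]

# Cosine similarity: each context pulls "Paris" toward a different meaning.
print(util.cos_sim(model.encode(contexts), model.encode(meanings)))
```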

What This Makes Possible

This multi-dimensional understanding is why LLMs feel like they “get it.”

They’re not looking things up. They’re not following rules. They’ve encoded the relationships between concepts so densely that they can navigate meaning the way we do—fluidly, contextually, with nuance.

Ask an LLM to “write a formal email declining a meeting” and it knows what “formal” means in email context. Ask it to “explain this code to a junior developer” and it adjusts complexity automatically. Ask it to translate idioms and it finds equivalent expressions, not literal words.

None of this is programmed. It emerges from training on massive amounts of human language. The model learned that “break a leg” doesn’t involve legs, that “let’s circle back” means something specific in corporate contexts, that tone shifts between Slack messages and board presentations.

This is genuinely powerful. It’s why LLMs can draft, summarize, translate, converse, and code in ways that feel natural. They understand language in its full dimensionality.

“But Then It’s Not My Words”

I hear this objection often, especially about writing. If an LLM helped draft it, is it really yours?

Consider: What about brainstorming with a colleague? What if someone else drafts a memo for you, you review it, shape it, and send it out? You didn’t compose every sentence—but you brought the ideas, the direction, the judgment about what’s right. That’s still your work.

LLMs are the same. If you bring the ideas, the creativity, the critical eye that shapes the output into what you actually want—it’s your work. The model is a tool, like a calculator or a research assistant. The thinking is still yours.

But here’s the flip side: if you don’t bring those things—if you just accept whatever the model produces without thought—then yes, it’s not your words. It’s just the most likely prediction based on context, and no one is any wiser for it. The model amplifies what you put in. Put in nothing, get statistical noise. Put in real thinking, get real leverage.

The Tradeoffs of Dimensionality

Understanding the architecture also explains behaviors that confuse people.

Why hallucination happens. When context is ambiguous, multiple meanings activate. The model picks one—and sometimes picks wrong. It’s not “making things up” so much as selecting from many valid interpretations without enough signal to choose correctly. The fix isn’t to eliminate this (you can’t); it’s to provide clearer context and verify outputs that matter.

Why the same prompt gives different answers. The model produces probability distributions, not single answers. “Paris” might be 85% likely in one context, but “Lyon” is still 10%. Temperature settings control how much the model explores lower-probability options. This variability is a feature for creative tasks and a constraint for deterministic ones.
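Here's a toy sketch of what temperature actually does to a distribution (the logits are made up for illustration):

```python
import math

# Made-up scores for three candidate next tokens.
logits = {"Paris": 4.0, "Lyon": 1.5, "Marseille": 1.0}

def softmax(logits, temperature):
    # Lower temperature sharpens the distribution; higher flattens it.
    exp = {tok: math.exp(score / temperature) for tok, score in logits.items()}
    total = sum(exp.values())
    return {tok: round(val / total, 3) for tok, val in exp.items()}

print(softmax(logits, 0.2))  # near-deterministic: "Paris" takes almost everything
print(softmax(logits, 1.0))  # "Lyon" and "Marseille" stay in play
```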

Why context windows matter. The model can only consider what’s in its current context. It doesn’t “remember” previous conversations—each interaction starts fresh. Long context windows let you include more relevant information, but you still have to provide it explicitly.
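This is why chat applications resend the entire conversation on every turn. A sketch using the OpenAI Python client (the model name is an illustrative choice; any chat-completion API works the same way):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment
history = [{"role": "system", "content": "You are a concise assistant."}]

def ask(question: str) -> str:
    # No memory exists outside this list: everything the model should
    # "remember" must be resent inside messages on every single call.
    history.append({"role": "user", "content": question})
    reply = client.chat.completions.create(model="gpt-4o-mini", messages=history)
    answer = reply.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer

ask("My name is Anna.")
print(ask("What is my name?"))  # works only because the history was resent
```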

Why it struggles with math. Arithmetic requires precision. Language models work in probabilities. These are fundamentally different operations. LLMs can reason about math, explain concepts, even write code that calculates—but raw computation isn’t their strength.
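The practical pattern: let the model write the arithmetic and let Python execute it. A sketch, with a stubbed llm() helper standing in for whatever client you actually use:

```python
import ast
import operator

def llm(prompt: str) -> str:
    # Stub for illustration: swap in a real model call.
    return "284.50 * 0.17"

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    # Evaluate literal arithmetic only: no names, no calls, no surprises.
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {expr}")
    return walk(ast.parse(expr, mode="eval"))

# The model translates the question into arithmetic; Python computes it exactly.
expr = llm("Return only a Python arithmetic expression for: 17% tip on $284.50")
print(safe_eval(expr))  # 48.365
```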

These aren’t flaws. They’re natural consequences of how dimensional prediction works. Know them, and you’ll use the tool effectively.

What About AGI?

People ask about artificial general intelligence—the sci-fi scenario where AI becomes genuinely autonomous, perhaps dangerous. Let me be direct: we’re not there. Not close.

LLMs are extremely capable in the right hands, with the right guardrails. They’re also just prediction engines. Sophisticated ones, but bounded. The gap between “predicts language really well” and “generally intelligent autonomous system” is vast—not just technically, but energetically. Current models require enormous compute. The energy requirements for anything approaching AGI, if it’s even possible with this architecture, would be staggering.

The fear often outpaces the reality. What we have is a remarkably useful tool that requires human judgment to wield effectively. That’s not nothing—it’s transformative for many tasks. But it’s not AGI, and treating it as such (either with fear or with misplaced trust) leads to bad decisions.

What LLMs Are Not (And Why It Matters)

The dimensional model clarifies what LLMs can’t replace:

Not a database. Databases retrieve exactly what you stored. LLMs generate plausible outputs based on patterns. For precise factual retrieval, use a database. Use LLMs to interpret, summarize, or communicate what the database returns.

Not a search engine. LLMs don’t query external sources in real time (unless you build that capability). Their knowledge was frozen at training time. For current information, use search. Use LLMs to synthesize what search finds.

Not a calculator. For precise arithmetic, use actual calculators or code. LLMs can help you write that code, or explain what the numbers mean.

The pattern: use the right tool for each part of the job. LLMs excel at language—understanding it, generating it, transforming it. Pair them with systems that handle precision, retrieval, and computation.
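Here's that division of labor as a minimal sketch: SQLite does the exact aggregation, and a stubbed llm() helper (standing in for your real model call) handles the language:

```python
import sqlite3

def llm(prompt: str) -> str:
    # Stub standing in for your real model call.
    return "(model-written summary of the exact figures goes here)"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, revenue REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("EU", 120_000.0), ("US", 95_000.0), ("EU", 40_000.0)])

# Precision lives in SQL: the database does the exact aggregation.
rows = conn.execute(
    "SELECT region, SUM(revenue) FROM orders GROUP BY region").fetchall()

# Language lives in the LLM: it turns exact numbers into a readable brief.
print(llm(f"Write a two-sentence exec summary of revenue by region: {rows}"))
```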

Using LLMs Well

Understanding the architecture leads directly to using it effectively.

Provide rich context. The model’s entire world is what you give it. Include relevant background, terminology, examples of what you want. The more context you provide, the more dimensions you activate in the right direction.
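In practice, “rich context” usually just means a deliberately structured prompt. A sketch of the shape; every field here is illustrative:

```python
background = "B2B SaaS company, 40 people, sells inventory software in the EU."
glossary = "'Net 30' means payment is due 30 days after the invoice date."
example = "Clause 4.2 conflicts: it specifies Net 90; our standard is Net 30."

# Every dimension you want activated has to be inside the prompt itself.
prompt = (
    f"You are reviewing a vendor contract for this company: {background}\n"
    f"Terminology, used precisely: {glossary}\n"
    f"Example of the output I want: {example}\n"
    "Task: flag any clauses that conflict with our standard payment terms."
)
```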

I’ve seen projects fail on this. One FDA document-review project assumed you could feed a 100-page PDF to multiple agents and get useful compliance analysis out. You can’t—not without careful context management, chunking strategies, and explicit instructions about what to look for. Another project, using Azure AI for pharmaceutical translation, skipped the work of finding good training sentence pairs on the assumption that all LLMs work the same way. They don’t. Context and data quality drive everything.
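“Chunking strategies” sounds fancier than it is. A minimal sketch of overlapping chunks (the sizes and the file name are arbitrary placeholders, not tuned recommendations):

```python
def chunk(text: str, size: int = 2000, overlap: int = 200) -> list[str]:
    # Overlap keeps sentences that straddle a boundary visible in both chunks.
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

# Each chunk gets its own focused pass, instead of one 100-page dump.
for piece in chunk(open("document.txt").read()):
    ...  # send `piece` plus explicit instructions about what to look for
```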

Verify what matters. Hallucination isn’t a bug to fear—it’s a natural consequence of dimensional prediction. For critical facts, verify. Build systems that cross-check. Use retrieval to ground the model in actual data. Systematic testing catches errors before production does—I’ve seen models spiral, outputs fail to generate, hallucinations slip through. The answer isn’t to avoid LLMs; it’s to build verification into the system.
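One cheap verification pattern: make the model quote its source, then check the quotes mechanically. A deliberately simple sketch:

```python
import re

def unsupported_quotes(answer: str, source: str) -> list[str]:
    # The model was instructed to wrap every factual claim in <<...>>,
    # quoting the source verbatim; confirm each quote actually appears there.
    return [q for q in re.findall(r"<<(.+?)>>", answer) if q not in source]

source = "The study enrolled 412 patients across 9 sites in 2021."
answer = "Enrollment was <<412 patients>> across <<12 sites>>."
print(unsupported_quotes(answer, source))  # ['12 sites'] caught before production
```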

Start simple. Before reaching for fine-tuning, try few-shot prompting. Give the model a few examples of what you want in the prompt itself. In many cases—more than most people realize—this gets you where you need to go without the cost and complexity of training custom models. Fine-tuning is powerful, but time and resources are finite. Think twice before going down that path.
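Few-shot prompting is exactly what it sounds like: show the model the pattern inside the prompt itself. A sketch with made-up examples:

```python
prompt = """Classify each support ticket as BUG, BILLING, or FEATURE_REQUEST.

Ticket: "I was charged twice this month."
Label: BILLING

Ticket: "The export button crashes the app."
Label: BUG

Ticket: "Could you add dark mode?"
Label: FEATURE_REQUEST

Ticket: "My invoice shows the wrong VAT rate."
Label:"""
# The model completes the pattern in-context: no training run, no custom model.
```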

Match temperature to task. Low temperature (0.0-0.3) for factual, consistent outputs. Higher temperature (0.7+) for creative exploration. The same model behaves very differently depending on this setting.

Combine tools intelligently. The best systems pair LLMs with databases, search, and code execution. Let each tool do what it’s good at. Use databases for precise retrieval. Use search for current information. Use code for calculation. Use LLMs for everything that involves understanding and generating language. The keyword classifier I built demonstrates this: rules handle 70-80% of classifications (fast, deterministic, free), and GPT-4 only processes ambiguous cases—reducing AI costs to under $1 per run.

[Figure: an LLM orchestrating specialized tools]
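Here's the shape of that classifier as a sketch; the keywords and the stubbed llm() helper are illustrative, not the production rules:

```python
RULES = {"refund": "BILLING", "invoice": "BILLING",
         "crash": "BUG", "error": "BUG"}

def llm(prompt: str) -> str:
    # Stub: only the ambiguous minority ever reaches a real model call.
    return "FEATURE_REQUEST"

def classify(ticket: str) -> str:
    text = ticket.lower()
    # Fast, deterministic, free: rules catch the clear-cut majority.
    for keyword, label in RULES.items():
        if keyword in text:
            return label
    # Only what falls through pays for an LLM call.
    return llm(f"Classify as BUG, BILLING, or FEATURE_REQUEST: {ticket}")

print(classify("My invoice shows the wrong VAT rate"))  # rule hit, zero API cost
print(classify("The app feels slow lately"))            # falls through to the model
```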

Get better at prompting. This is a skill that develops with practice. Learn to iterate—refine prompts based on outputs. Use adversarial techniques: ask the model “why does this suck?” or “what would kill this project?” to stress-test your thinking. Break complex tasks into specific sub-tasks. Consolidate and review. The more deliberately you work with the model, the better your results.

The Right Mental Model

Think of LLMs as prediction engines that understand language in thousands of dimensions.

They’ve absorbed the patterns of human communication so deeply that they can navigate meaning the way we do—contextually, fluidly, with nuance. That’s genuinely remarkable. It enables summarization, translation, drafting, conversation, code generation, and countless other language tasks that were impossible to automate before.

But they’re not databases. They’re not search engines. They’re not calculators. They’re not minds.

They’re tools. Powerful, specific tools that do something new under the sun: they understand language well enough to work with it. The Paris in your prompt isn’t a row in a table—it’s a point in a 4,096-dimensional space, ready to mean whatever context demands.

Understand that, and you’ll know exactly when to use them, how to use them well, and what to pair them with.

That’s what LLMs actually are.


This is the foundation. If you’re figuring out how to actually deploy LLMs in your business—where workflows make sense, where agents might fit, and how to avoid the common pitfalls—let’s talk.

