May 5, 2025
Schaun Wheeler

Why LLMs and RAG Aren’t Enough for Building Agentic AI Systems

I wrote a post recently (linked in comments below) on why agentic systems require structures for semantic-associative memory, and why LLMs lack the architecture to do anything but procedural memory. Therefore: LLMs aren't sufficient to build agentic systems. Someone replied with a very thoughtful question about Retrieval-Augmented Generation (RAG), a method that enhances language models by retrieving relevant external documents at query time and using them as additional context for generating responses.
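
Before getting to the answer, it helps to have the mechanics of RAG in front of us. Here is a minimal sketch of the retrieve-then-generate loop, assuming a placeholder embedding function and treating the language model as an opaque callable; none of the names below refer to a real library.

```python
import hashlib
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding; a real pipeline would call an embedding model here."""
    seed = int(hashlib.sha256(text.encode()).hexdigest(), 16) % (2**32)
    v = np.random.default_rng(seed).normal(size=384)
    return v / np.linalg.norm(v)  # unit-normalize so dot product equals cosine similarity

def retrieve(query: str, docs: list[str], k: int = 3) -> list[str]:
    """Rank stored documents by cosine similarity to the query and keep the top k."""
    q = embed(query)
    return sorted(docs, key=lambda d: float(np.dot(q, embed(d))), reverse=True)[:k]

def answer(query: str, docs: list[str], llm) -> str:
    """Stuff the retrieved passages into the prompt. Nothing gets written back anywhere."""
    context = "\n\n".join(retrieve(query, docs))
    prompt = f"Answer using the context.\n\nContext:\n{context}\n\nQuestion: {query}"
    return llm(prompt)  # `llm` is any callable that maps prompt text to a response
```

That is the whole loop: embed, rank by similarity, stuff the winners into a prompt. Notice that nothing retrieved or generated persists past the single call; that fact is the root of every limitation below.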

The question was whether RAG didn't actually provide the solutions to the inadequacies I outlined in my original post.

The answer is: no. RAG is great for fact recall (well, let's say it's solidly good). But it falls short when it comes to forming and using semantic categories — the kind of abstract, relational knowledge humans build up over time.

Here’s why:

No integration

RAG fetches documents, but they don’t get internalized. Humans integrate knowledge into structured representations that guide future learning and generalization.

No schema formation

Humans compress experience into categories (e.g., "tools", "emotions", "rules"). RAG retrieves specific items but doesn’t abstract across them to form stable, general concepts.

No consolidation

There’s no memory strengthening. Humans reinforce category boundaries over time — through repetition, use, and relevance. RAG pipelines don’t naturally evolve or stabilize concepts with repeated exposure.

Flat associations

Retrieval is based on surface similarity (e.g., vector distance). Human categories are rich with associative structure, linking things by function, causality, affordance, and so on; there's a short sketch after these points that makes the contrast concrete.

No cross-episode learning

RAG doesn’t accumulate concepts across interactions. Each retrieval is a fresh lookup. Humans merge partial exposures to build coherent, resilient categories.
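
To make the last two points concrete, here is a toy sketch of what a semantic-associative store does that similarity lookup doesn't: links carry a relation type (function, causality), and they strengthen with repeated exposure. The class, relation names, and entries are all illustrative, not a description of any existing system.

```python
from collections import defaultdict

class AssociativeMemory:
    """Toy semantic-associative store: links are typed and strengthen with use."""

    def __init__(self):
        # (concept, relation) -> {related concept: link strength}
        self.links = defaultdict(lambda: defaultdict(float))

    def associate(self, a: str, relation: str, b: str, weight: float = 1.0) -> None:
        """Record a typed link; repeated exposure consolidates it (raises its strength)."""
        self.links[(a, relation)][b] += weight

    def related(self, concept: str, relation: str, top_n: int = 3) -> list[str]:
        """Follow links by relation type, not by vector distance."""
        neighbors = self.links[(concept, relation)]
        return sorted(neighbors, key=neighbors.get, reverse=True)[:top_n]

memory = AssociativeMemory()
memory.associate("hammer", "used_for", "driving nails")
memory.associate("hammer", "used_for", "driving nails")  # repetition strengthens the link
memory.associate("hammer", "causes", "loud noise")
print(memory.related("hammer", "used_for"))               # ['driving nails']
```

A vector index will happily tell you that "hammer" and "hammer throw" are close; it has no slot for the fact that a hammer is for driving nails, or for making that link more reliable every time it proves useful.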

In short: RAG helps with lookup, not learning. It’s a patch for recall, not a path to structured, practical decision-making.

Until we give agents mechanisms for abstraction, consolidation, and associative generalization, they won't have anything close to human-like semantic memory. In my opinion, it's more realistic to create those mechanisms separately and let them interact with LLMs than it is to think we can somehow train or augment LLMs to the point that they can do something they were not designed to do.
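
One way to picture that separation (my framing, not a description of any existing product) is an agent loop in which an external memory module owns the abstraction and consolidation, and the LLM is called only for language-level work. The sketch below reuses the hypothetical AssociativeMemory class from earlier; llm is again any callable.

```python
def agent_step(observation: str, memory: AssociativeMemory, llm) -> str:
    """Hypothetical agent loop: the memory layer, not the LLM, owns the concepts."""
    # 1. Ask the memory layer which category this observation falls under.
    candidates = memory.related(observation, "instance_of")
    # 2. Use the LLM for what it was designed for: turning context into language.
    prompt = (f"Observation: {observation}\n"
              f"Known categories: {candidates or 'none yet'}\n"
              f"Respond appropriately.")
    response = llm(prompt)
    # 3. Write the outcome back so categories consolidate across episodes,
    #    which is exactly the step a plain RAG pipeline never takes.
    label = candidates[0] if candidates else "unclassified"
    memory.associate(observation, "instance_of", label)
    return response
```

The write-back in step 3 is what separates an agent with memory from a retrieval pipeline: the same observation encountered across many episodes accumulates into a category instead of triggering a fresh lookup every time.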
