May 5, 2025
Schaun Wheeler

Why LLMs and RAG Aren’t Enough for Building Agentic AI Systems

I wrote a post recently (linked in comments below) on why agentic systems require structures for semantic-associative memory, and why LLMs lack the architecture to do anything but procedural memory. Therefore: LLMs aren't sufficient to build agentic systems. Someone replied with a very thoughtful question about Retrieval-Augmented Generation (RAG), a method that enhances language models by retrieving relevant external documents at query time and using them as additional context for generating responses.
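
Before getting to the answer, it helps to have the mechanics of RAG in front of us. Here is a minimal sketch of the retrieve-then-generate loop, assuming a placeholder embedding function and treating the language model as an opaque callable; none of the names below refer to a real library.

```python
import hashlib
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding; a real pipeline would call an embedding model here."""
    seed = int(hashlib.sha256(text.encode()).hexdigest(), 16) % (2**32)
    v = np.random.default_rng(seed).normal(size=384)
    return v / np.linalg.norm(v)  # unit-normalize so dot product equals cosine similarity

def retrieve(query: str, docs: list[str], k: int = 3) -> list[str]:
    """Rank stored documents by cosine similarity to the query and keep the top k."""
    q = embed(query)
    return sorted(docs, key=lambda d: float(np.dot(q, embed(d))), reverse=True)[:k]

def answer(query: str, docs: list[str], llm) -> str:
    """Stuff the retrieved passages into the prompt. Nothing gets written back anywhere."""
    context = "\n\n".join(retrieve(query, docs))
    prompt = f"Answer using the context.\n\nContext:\n{context}\n\nQuestion: {query}"
    return llm(prompt)  # `llm` is any callable that maps prompt text to a response
```

That is the whole loop: embed, rank by similarity, stuff the winners into a prompt. Notice that nothing retrieved or generated persists past the single call; that fact is the root of every limitation below.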

The question was whether RAG didn't actually provide the solutions to the inadequacies I outlined in my original post.

The answer is: no. RAG is great for fact recall (well, let's say it's solidly good). But it falls short when it comes to forming and using semantic categories — the kind of abstract, relational knowledge humans build up over time.

Here’s why:

No integration

RAG fetches documents, but they don’t get internalized. Humans integrate knowledge into structured representations that guide future learning and generalization.

No schema formation

Humans compress experience into categories (e.g., "tools", "emotions", "rules"). RAG retrieves specific items but doesn’t abstract across them to form stable, general concepts.

No consolidation

There’s no memory strengthening. Humans reinforce category boundaries over time — through repetition, use, and relevance. RAG pipelines don’t naturally evolve or stabilize concepts with repeated exposure.

Flat associations

Retrieval is based on surface similarity (e.g., vector distance). Human categories are rich with associative structure, linking things by function, causality, affordance, and so on; there's a short sketch after these points that makes the contrast concrete.

No cross-episode learning

RAG doesn’t accumulate concepts across interactions. Each retrieval is a fresh lookup. Humans merge partial exposures to build coherent, resilient categories.
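
To make the last two points concrete, here is a toy sketch of what a semantic-associative store does that similarity lookup doesn't: links carry a relation type (function, causality), and they strengthen with repeated exposure. The class, relation names, and entries are all illustrative, not a description of any existing system.

```python
from collections import defaultdict

class AssociativeMemory:
    """Toy semantic-associative store: links are typed and strengthen with use."""

    def __init__(self):
        # (concept, relation) -> {related concept: link strength}
        self.links = defaultdict(lambda: defaultdict(float))

    def associate(self, a: str, relation: str, b: str, weight: float = 1.0) -> None:
        """Record a typed link; repeated exposure consolidates it (raises its strength)."""
        self.links[(a, relation)][b] += weight

    def related(self, concept: str, relation: str, top_n: int = 3) -> list[str]:
        """Follow links by relation type, not by vector distance."""
        neighbors = self.links[(concept, relation)]
        return sorted(neighbors, key=neighbors.get, reverse=True)[:top_n]

memory = AssociativeMemory()
memory.associate("hammer", "used_for", "driving nails")
memory.associate("hammer", "used_for", "driving nails")  # repetition strengthens the link
memory.associate("hammer", "causes", "loud noise")
print(memory.related("hammer", "used_for"))               # ['driving nails']
```

A vector index will happily tell you that "hammer" and "hammer throw" are close; it has no slot for the fact that a hammer is for driving nails, or for making that link more reliable every time it proves useful.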

In short: RAG helps with lookup, not learning. It’s a patch for recall, not a path to structured, practical decision-making.

Until we give agents mechanisms for abstraction, consolidation, and associative generalization, they won't have anything close to human-like semantic memory. In my opinion, it's more realistic to create those mechanisms separately and let them interact with LLMs than it is to think we can somehow train or augment LLMs to the point that they can do something they were not designed to do.
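
One way to picture that separation (my framing, not a description of any existing product) is an agent loop in which an external memory module owns the abstraction and consolidation, and the LLM is called only for language-level work. The sketch below reuses the hypothetical AssociativeMemory class from earlier; llm is again any callable.

```python
def agent_step(observation: str, memory: AssociativeMemory, llm) -> str:
    """Hypothetical agent loop: the memory layer, not the LLM, owns the concepts."""
    # 1. Ask the memory layer which category this observation falls under.
    candidates = memory.related(observation, "instance_of")
    # 2. Use the LLM for what it was designed for: turning context into language.
    prompt = (f"Observation: {observation}\n"
              f"Known categories: {candidates or 'none yet'}\n"
              f"Respond appropriately.")
    response = llm(prompt)
    # 3. Write the outcome back so categories consolidate across episodes,
    #    which is exactly the step a plain RAG pipeline never takes.
    label = candidates[0] if candidates else "unclassified"
    memory.associate(observation, "instance_of", label)
    return response
```

The write-back in step 3 is what separates an agent with memory from a retrieval pipeline: the same observation encountered across many episodes accumulates into a category instead of triggering a fresh lookup every time.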
