Schaun Wheeler

LLM "alignment" refers to reducing the gap between LLM output and user expectations. I came across a nice technical overview (LLMs from scratch) comparing the two main methods for LLM alignment: Reinforcement Learning from Human Feedback (RLHF) and Direct Preference Optimization (DPO).

My conclusion: both methods show how talk about AI alignment is really missing the boat.

The difference between RLHF and DPO is technical rather than substantial. RHLF takes feedback about how good an output is, whereas DPO takes actual corrections. Both methods input individual user preferences, but output a model that tries to cater to the average preferences of the whole user base.

In human cognition, "procedural" memory is the mechanism behind automatic actions like riding a bike, typing, or speaking. That's why people can run their mouths without really saying anything—our brain can string together words coherently just by remembering what words tend to follow other words. LLMs mimic procedural memory and only procedural memory. That, in a nutshell, is the alignment problem.

Procedural memory is different from other kinds of implicit memory: associative (Pavlov's dogs), non-associative (people who live near train tracks learn to ignore the sound of passing trains), and priming (repeated exposure to a brand's advertising makes you more willing to give that brand a try).

Neither RHLF nor DPO do anything to represent those other kinds of memory.

Behavioral agents, on the other hand, remember:

  •  which value propositions tends to precede a purchase (associative)
  •  which messages are syntactically similar, and therefore prone to habituation (non-associative)
  • which actions tend to feed into purchase behavior, and are therefore early indicators of purchase behavior (priming)

LLMs can ingest session-specific background information. OpenAI calls this a "system" prompt—the user never sees it, but the LLM acts on it.

Setup with new system prompt

Agentic learning produces tagged weights that represent lessons learned and these tagged weights can be fed into a system prompt to give an LLM information not available from procedural memory. This also allows the LLM to treat users as individuals rather than ingredients for an aggregate

When a user asks an LLM what key features to look for when buying a new laptop, instead of learning that users in general prefer a user-friendly response to a technical response, a primed LLM can know that this particular user in this particular session prefers a technical response, even if everyone else in the world prefers a user-friendly response.

Better LLM training can't solve the alignment problem because LLMs mimic only one out of many kinds of human cognitive capabilities. We get closer to alignment when we mimic more kinds of memory.

This browser does not support inline PDFs. Download the PDF to view it.

How we can reduce the gap between LLM output and user expectations by mimicking human memory.

The LLM alignment problem

LLM "alignment" refers to reducing the gap between LLM output and user expectations. I came across a nice technical overview (LLMs from scratch) comparing the two main methods for LLM alignment: Reinforcement Learning from Human Feedback (RLHF) and Direct Preference Optimization (DPO).

My conclusion: both methods show how talk about AI alignment is really missing the boat.

The difference between RLHF and DPO is technical rather than substantial. RLHF fits a separate reward model to human preference ratings and then optimizes the LLM against it; DPO skips the reward model and optimizes the LLM directly on pairs of preferred and rejected outputs. Both methods take individual user preferences as input, but output a model that tries to cater to the average preferences of the whole user base.
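For the technically curious, here is a minimal sketch of what DPO optimizes (PyTorch-style; the variable names are mine, and it assumes you already have summed log-probabilities for the preferred and rejected responses from the policy being trained and from a frozen reference model). RLHF arrives at a similar place indirectly, by fitting a reward model to the same kind of pooled comparisons first.

    import torch.nn.functional as F

    def dpo_loss(policy_chosen_logp, policy_rejected_logp,
                 ref_chosen_logp, ref_rejected_logp, beta=0.1):
        """Direct Preference Optimization on a batch of preference pairs.

        Pushes the policy to rank the preferred response above the rejected one,
        measured relative to a frozen reference model. The pairs come from many
        different people, so the fitted policy reflects their average preferences.
        """
        chosen_margin = beta * (policy_chosen_logp - ref_chosen_logp)
        rejected_margin = beta * (policy_rejected_logp - ref_rejected_logp)
        # -log(sigmoid(difference)): small when the preferred response wins clearly
        return -F.logsigmoid(chosen_margin - rejected_margin).mean()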

In human cognition, "procedural" memory is the mechanism behind automatic actions like riding a bike, typing, or speaking. That's why people can run their mouths without really saying anything—our brain can string together words coherently just by remembering what words tend to follow other words. LLMs mimic procedural memory and only procedural memory. That, in a nutshell, is the alignment problem.
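To make "procedural" concrete, here is a deliberately crude illustration: a bigram table that remembers only which word tends to follow which, and can still babble out plausible-looking strings. An LLM's statistics are incomparably richer, but the kind of memory is the same.

    from collections import Counter, defaultdict
    import random

    # Toy procedural memory: remember what tends to follow what, nothing else.
    corpus = "the agent sends the message and the user opens the message".split()

    follows = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        follows[prev][nxt] += 1

    def babble(word, steps=8):
        out = [word]
        for _ in range(steps):
            if word not in follows:
                break
            # choose the next word in proportion to how often it followed this one
            word = random.choice(list(follows[word].elements()))
            out.append(word)
        return " ".join(out)

    print(babble("the"))  # fluent-looking chains with no goal behind them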

Procedural memory is different from other kinds of implicit memory: associative (Pavlov's dogs), non-associative (people who live near train tracks learn to ignore the sound of passing trains), and priming (repeated exposure to a brand's advertising makes you more willing to give that brand a try).

Neither RLHF nor DPO does anything to represent those other kinds of memory.

Behavioral agents, on the other hand, remember (see the sketch after this list):

  • which value propositions tend to precede a purchase (associative)
  • which messages are syntactically similar, and are therefore prone to habituation (non-associative)
  • which actions tend to feed into purchase behavior, and are therefore early indicators of purchase intent (priming)
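Here is a rough, purely illustrative sketch of what that might look like as data. The field names and numbers are invented; the point is that each signal is tagged with the kind of memory it represents and is kept per user rather than averaged away.

    from dataclasses import dataclass

    @dataclass
    class LearnedSignal:
        user_id: str
        kind: str      # "associative", "non_associative", or "priming"
        signal: str    # the lesson learned
        weight: float  # how strongly it has been reinforced so far

    signals = [
        # associative: this value proposition tends to precede a purchase
        LearnedSignal("u42", "associative", "free-shipping messaging precedes purchases", 0.81),
        # non-associative: this message structure is wearing out (habituation)
        LearnedSignal("u42", "non_associative", "discount-led subject lines get ignored after repeated sends", 0.64),
        # priming: this early action tends to feed into purchase behavior later
        LearnedSignal("u42", "priming", "wishlist adds precede purchases within a week", 0.77),
    ]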

LLMs can ingest session-specific background information. OpenAI calls this a "system" prompt—the user never sees it, but the LLM acts on it.

(Diagram: setup with a new system prompt)
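Mechanically, that just means prepending a message to the conversation. A minimal example with the OpenAI Python client (the model name and prompt text are illustrative):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative choice of model
        messages=[
            # The system prompt: background the user never sees but the LLM acts on.
            {"role": "system",
             "content": "Session context: this user prefers concise, technical answers."},
            # The user's visible question.
            {"role": "user",
             "content": "What key features should I look for in a new laptop?"},
        ],
    )
    print(response.choices[0].message.content)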

Agentic learning produces tagged weights that represent lessons learned. These tagged weights can be fed into a system prompt, giving the LLM information it cannot get from procedural memory. It also lets the LLM treat users as individuals rather than as ingredients for an aggregate.

When a user asks an LLM what key features to look for when buying a new laptop, instead of learning that users in general prefer a user-friendly response to a technical response, a primed LLM can know that this particular user in this particular session prefers a technical response, even if everyone else in the world prefers a user-friendly response.
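Continuing the illustrative LearnedSignal records from the sketch above, the per-user system prompt might be assembled like this (the thresholds and wording are invented, not a description of any particular product):

    def build_system_prompt(user_id, signals, min_weight=0.5):
        """Render a user's learned, tagged signals into system-prompt text.

        Illustrative only: reuses the LearnedSignal records sketched earlier.
        """
        lines = [f"Background on user {user_id} (never shown to the user):"]
        for s in signals:
            if s.user_id == user_id and s.weight >= min_weight:
                lines.append(f"- [{s.kind}] {s.signal} (weight {s.weight:.2f})")
        # e.g. a session-level preference the agent has learned for this user
        lines.append("This user prefers technical depth over general-audience phrasing.")
        return "\n".join(lines)

    # Feed the result into the "system" message of the chat call above, so the
    # answer is aligned to this user rather than to the average user.
    print(build_system_prompt("u42", signals))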

Better LLM training can't solve the alignment problem because LLMs mimic only one out of many kinds of human cognitive capabilities. We get closer to alignment when we mimic more kinds of memory.
