Agentic systems don't operate over a huge, chaotic message space. They operate over structured action sets — defined semantic categories that make up a treatment policy.
At Aampe, a typical treatment policy might be composed from action sets: day of week, time of day, channel, value proposition, product/offering, tone of voice, incentive level/type, and so on.
Each set has maybe 3 to 20 possible actions. So yes, the space is combinatorially large—but it's semantically tractable. The agent's task isn't to predict the perfect message. It's to select the best combination of these features for a given user at a given moment. That policy then determines what content gets sent.
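To make "structured action sets" concrete, here is a minimal sketch: a handful of hypothetical sets, with a policy formed by picking one action from each. The set names and values are illustrative, not Aampe's actual catalog.

```python
from math import prod

# Hypothetical action sets: names and values are illustrative, not Aampe's actual catalog.
action_sets = {
    "day_of_week": ["Mon", "Wed", "Fri", "Sun"],
    "time_of_day": ["morning", "afternoon", "evening"],
    "channel": ["push", "email", "in_app"],
    "value_proposition": ["save_money", "save_time", "discover"],
    "tone_of_voice": ["playful", "urgent", "matter_of_fact"],
    "incentive": ["none", "low_discount", "high_discount"],
}

# A treatment policy is one action chosen from each set.
example_policy = {name: actions[0] for name, actions in action_sets.items()}

# The full space is the Cartesian product of the sets: combinatorially large,
# but every dimension stays small and semantically meaningful.
n_policies = prod(len(actions) for actions in action_sets.values())
print(example_policy)
print(f"{n_policies} possible policies")  # 4 * 3 * 3 * 3 * 3 * 3 = 972
```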
The main complexity of user engagement is in the user space — all of the different events that can happen, either individually or in sequence, in a user's journey through the app. Agents avoid getting bogged down in that complexity by modeling the action space instead of the user space. The user space gets handled via policy selection rather than direct modeling.
That's not just a design choice. It's the reality of online, real-time, organic human behavior. Every customer's behavior is messy, dynamic, and only partially observable. Some users are effectively cold starts. Others have interacted enough that the agent has built confidence in their preferences.
Agentic learners track this directly: every action is represented as a beta distribution, continuously updated per user. So for any feature — say, "push" vs. "email," or "high discount" vs. "low urgency" — the agent knows:
what it expects the reward to be, and
how confident it is in that estimate.
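A minimal sketch of that per-user bookkeeping, assuming a simple binary reward (engaged / didn't engage) and one Beta posterior per action; the class and example values are illustrative, not Aampe's implementation.

```python
from dataclasses import dataclass

@dataclass
class BetaArm:
    """Beta posterior over one action's reward rate for one user.

    Beta(1, 1) is uniform, i.e. a cold start: no opinion yet."""
    alpha: float = 1.0
    beta: float = 1.0

    def update(self, reward: bool) -> None:
        # Each observed outcome (e.g. engaged / didn't engage after a send)
        # shifts the posterior toward the user's revealed preference.
        if reward:
            self.alpha += 1
        else:
            self.beta += 1

    @property
    def expected_reward(self) -> float:
        # What the agent expects the reward to be.
        return self.alpha / (self.alpha + self.beta)

    @property
    def variance(self) -> float:
        # How confident it is: lower variance = more confidence.
        n = self.alpha + self.beta
        return (self.alpha * self.beta) / (n * n * (n + 1))

# One posterior per (user, feature, action), e.g. channel preferences for one user:
channels = {"push": BetaArm(), "email": BetaArm(), "in_app": BetaArm()}
channels["push"].update(True)    # engaged with a push notification
channels["email"].update(False)  # ignored an email
print(channels["push"].expected_reward, channels["push"].variance)
```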
When the agent selects a policy, it's balancing expected value and uncertainty. Some actions are taken because they're strong bets. Others are taken because uncertainty is high and exploration is needed.
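One standard way to strike that balance is Thompson sampling: draw a sample from each action's posterior and pick the highest draw. The sketch below illustrates the idea using per-action (alpha, beta) counts like those above; it is not necessarily the exact selection rule the agents use, and the numbers are made up.

```python
import random

def thompson_select(arms: dict[str, tuple[float, float]]) -> str:
    """Pick one action from a set: sample each Beta posterior, take the top draw.

    arms maps action name -> (alpha, beta). A well-estimated strong action usually
    wins (a confident, exploitative pick), but a wide posterior can still produce
    the highest draw, so uncertain actions keep getting explored.
    """
    draws = {name: random.betavariate(a, b) for name, (a, b) in arms.items()}
    return max(draws, key=draws.get)

# A full policy is one selection per action set for this user.
user_arms = {
    "channel": {"push": (8, 2), "email": (3, 5), "in_app": (1, 1)},
    "incentive": {"none": (4, 4), "low_discount": (6, 2), "high_discount": (1, 1)},
}
policy = {feature: thompson_select(arms) for feature, arms in user_arms.items()}
print(policy)  # e.g. {'channel': 'push', 'incentive': 'low_discount'}
```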
Because treatments are structured and every send is tied to a selected policy, we can analyze outcomes in detail (a sketch follows this list):
Which features are being chosen confidently vs. tentatively?
How often do exploratory actions outperform exploitative ones?
How does agent learning translate into measurable gains over time?
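A toy illustration of that kind of analysis, assuming a flat send log that records the selection mode for each message; the column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical send log: one row per message, with the selection mode recorded at
# send time and the outcome observed afterwards. Columns and values are made up.
sends = pd.DataFrame({
    "use_case":  ["activation"] * 4 + ["retention"] * 4,
    "mode":      ["explore", "explore", "exploit", "exploit"] * 2,
    "converted": [1, 0, 1, 1, 0, 1, 1, 1],
})

# Conversion rate by use case and selection mode...
rates = sends.groupby(["use_case", "mode"])["converted"].mean().unstack("mode")

# ...and the percent difference between exploit-mode and explore-mode sends,
# the same quantity the table below summarizes per use case and app event.
rates["exploit_vs_explore_pct"] = (rates["exploit"] / rates["explore"] - 1) * 100
print(rates)
```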
This is the core advantage of agentic personalization: it doesn't just deliver content. It runs structured experiments, learns revealed preferences, and adapts policy decisions at the feature level for every individual user.
The table below shows the percent difference in performance between exploit-mode and explore-mode messages for one of our customers. The columns represent eight different use cases agents were assigned to orchestrate, and each row represents a different app event. The higher the value (and the bluer the cell), the more exploit-mode messages outperformed explore-mode messages.