At the core of Aampe’s agentic learning system lies a simple idea: we personalize not only what to communicate to users, but when to communicate it. To do that effectively, we must understand what we’re trying to accomplish in each communication. This is not just a messaging problem—it’s a causal one. If we can't define a desired outcome, we can't measure whether an intervention had an effect.
Our method of per-treatment, per-moment causal analysis rests on precisely this foundation: defining a target event that the agent is trying to nudge the user toward. These events become the positive class in the classification task that underpins our causal modeling. If we choose the wrong target event, the entire learning system may optimize toward irrelevant, infrequent, or even unreachable outcomes. The quality and relevance of the target event determine the value of every subsequent insight.
In this context, the selection of goal events is not an operational detail; it is a foundational design choice, one that determines what the agent learns, how it learns, and ultimately how well it can act.
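To make the classification framing concrete, here is a minimal sketch of how a target event turns into training labels. It assumes an event log with user_id, event_name, and event_time columns, a sends table with user_id and send_time, and a 24-hour attribution window; the column names, function name, and window are illustrative, not Aampe's actual schema or pipeline.

```python
from datetime import timedelta

import pandas as pd

def label_treatments(sends: pd.DataFrame,
                     events: pd.DataFrame,
                     target_event: str,
                     window_hours: int = 24) -> pd.DataFrame:
    """Label each message send as positive if the user performed the target
    event within `window_hours` afterwards. These labels are the positive
    class for the downstream classification / causal model."""
    target = events.loc[events["event_name"] == target_event, ["user_id", "event_time"]]
    merged = sends.merge(target, on="user_id", how="left")
    in_window = (
        (merged["event_time"] > merged["send_time"])
        & (merged["event_time"] <= merged["send_time"] + timedelta(hours=window_hours))
    )
    merged["label"] = in_window
    # Collapse back to one row per send: positive if any qualifying event occurred.
    return merged.groupby(["user_id", "send_time"], as_index=False)["label"].max()
```

Everything downstream, from per-moment effect estimates to scheduling decisions, inherits whatever this labeling step encodes, which is why the choice of target_event matters so much.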
From Business Metrics to Agent Rewards
A crucial distinction in this process is between business goals and agent rewards. Business goals are high-level outcomes: increasing revenue, reducing churn, improving retention. Agent rewards, by contrast, are local and observable: they must correspond to concrete in-app events that can serve as evidence of user engagement following a treatment.
While business goals are directional, agent rewards must be actionable. For example, "user retention" is a business metric, but it cannot be directly observed as a one-off event. "Opening the app", "starting a session", or "completing a purchase" are tangible events that can serve as agent rewards—provided they meet certain criteria.
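One way to picture the translation is a simple mapping from directional business goals to the observable events that can stand in for them. The goal names and event names below are hypothetical placeholders, not a prescribed taxonomy:

```python
# Hypothetical mapping from high-level business goals to concrete,
# observable in-app events an agent can actually be rewarded on.
REWARD_EVENTS_BY_GOAL = {
    "retention": ["app_open", "session_start"],
    "revenue": ["purchase_completed", "subscription_upgrade"],
    "engagement": ["content_viewed", "item_added_to_cart"],
}

def candidate_rewards(business_goal: str) -> list[str]:
    """Return observable events that can serve as reward signals for a goal."""
    return REWARD_EVENTS_BY_GOAL.get(business_goal, [])
```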
What Makes a Bad Target Event?
Not all user events make for useful or reliable agent reward targets. Below are several categories of events that can compromise learning when used as general reward signals:
1. Scheduled Events
These events occur on fixed timelines, independent of user engagement patterns. Because their timing is predetermined, they are unresponsive to interventions. An agent that optimizes for a scheduled event is simply learning the calendar, not learning what actions drive behavior.
Examples of scheduled events include subscription renewals, salary deposits, monthly report generation, and scheduled payments. If an agent sends a message on June 29th and the user renews a subscription on July 1st, there's no causal inference to be made: the event was going to happen anyway. These events provide no opportunity for the agent to demonstrate or improve its influence.
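A quick heuristic for spotting calendar-driven events is to look at per-user inter-arrival times: if the gaps between occurrences are nearly constant, the event is probably running on a schedule rather than responding to behavior. This is a sketch of that check under the same assumed schema as above, with an illustrative threshold; it is not Aampe's production test.

```python
import pandas as pd

def looks_scheduled(events: pd.DataFrame,
                    event_name: str,
                    max_cv: float = 0.05) -> bool:
    """Flag events whose per-user inter-arrival times barely vary.
    A coefficient of variation near zero suggests a fixed cadence
    (renewals, salary deposits), i.e. a poor reward target."""
    gaps = (
        events[events["event_name"] == event_name]
        .sort_values(["user_id", "event_time"])
        .groupby("user_id")["event_time"]
        .diff()
        .dt.total_seconds()
        .dropna()
    )
    if len(gaps) < 2:
        return False
    cv = gaps.std() / gaps.mean()  # near zero => calendar-driven cadence
    return cv < max_cv
```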
2. Exhaustible Events
Some user actions are inherently one-time or low-frequency: once completed, they rarely recur. This makes them fragile and short-lived as reward signals. Once a user completes such an event, they drop out of the learning loop—not because they’re disengaged, but because there’s nothing left for them to do.
Examples of exhaustible events include selecting a favorite team, completing onboarding, linking a payment method or bank account, and filling out a user profile. Agents can't differentiate between lack of interest and lack of opportunity. Over time, as more users complete the action, the remaining pool of "eligible" users shrinks, creating biased and noisy signals. This leads to poor generalization and incomplete learning.
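Exhaustibility shows up in the data as a low repeat rate: almost nobody performs the event more than once. A small sketch of that diagnostic, again with illustrative column names and no claim about where the threshold should sit:

```python
import pandas as pd

def repeat_rate(events: pd.DataFrame, event_name: str) -> float:
    """Share of users who perform the event more than once. A rate near
    zero suggests an exhaustible, one-shot event (onboarding, linking a
    bank account) that will starve the learning loop over time."""
    counts = (
        events[events["event_name"] == event_name]
        .groupby("user_id")
        .size()
    )
    if counts.empty:
        return 0.0
    return float((counts > 1).mean())
```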
3. Highly Context-Dependent Events
These events are only meaningful under certain user conditions or contexts. At any given time, only a subset of users is eligible to perform them. If treated as universal rewards, they generate inconsistent and misleading learning signals because the agent can’t reliably tell who is and isn’t in the right context.
Examples of context-dependent events include sending a referral (only relevant if the user has someone to refer), uploading a tax document (only during filing season), scheduling a delivery (only if an order is pending), and confirming account identity (only required under certain flags). In the absence of explicit context filters, agents can't tell whether a user ignored a message because it wasn't compelling or because it wasn't applicable. This conflates disinterest with ineligibility, making it hard to learn what works.
That being said, context-dependent events can be effective rewards when tied to targeted interventions that only reach pre-qualified users. For example, a message about uploading a tax document sent only to users who have initiated the tax process can use that event as a valid learning signal. The key is scoping the target state to the subset of users for whom it is actually actionable. In this way, the event works well as a localized, conditional reward, even though it’s unsuitable as a global goal state.
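In code, scoping a conditional reward amounts to restricting the labeled sends to users who were actually in the relevant context, so that ineligible users' non-conversions never count as negatives. A minimal sketch, building on the hypothetical labeling step earlier and assuming you can enumerate eligible users (e.g., those who had started the tax-filing flow before the message):

```python
import pandas as pd

def scope_to_eligible(labeled_sends: pd.DataFrame,
                      eligible_users: set) -> pd.DataFrame:
    """Keep labeled sends only for users for whom the target event was
    actually reachable. Everyone else's silence says nothing about the
    message, so those rows are dropped rather than treated as negatives."""
    return labeled_sends[labeled_sends["user_id"].isin(eligible_users)].copy()
```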
What Makes a Good Target Event?
The ideal target event for agentic learning is:
Plausibly triggerable at any time
Observable within the app context
Frequent enough to support causal estimation
Meaningfully connected to engagement or value creation
This is one reason why purchase events work well in e-commerce. While users technically have finite budgets, the variety and frequency of opportunities to buy something mean that purchases are almost always available as a latent behavior. Even if users don't purchase often, the possibility is nearly always present—making purchases a valid, always-on reward target.
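Pulling the criteria together, a candidate target event can be audited with a handful of summary statistics: reach (how many users ever perform it), frequency, repeatability, and cadence variability (calendar-driven events score near zero). This is an illustrative checklist under the same assumed schema, not a definitive acceptance test, and any thresholds would need to be tuned per app:

```python
import pandas as pd

def audit_target_event(events: pd.DataFrame, event_name: str) -> dict:
    """Summarize how well a candidate event satisfies the criteria above."""
    all_users = events["user_id"].nunique()
    ev = events[events["event_name"] == event_name]
    counts = ev.groupby("user_id").size()
    gaps = (
        ev.sort_values(["user_id", "event_time"])
        .groupby("user_id")["event_time"]
        .diff()
        .dt.total_seconds()
        .dropna()
    )
    return {
        "reach": counts.size / all_users if all_users else 0.0,          # share of users who ever perform it
        "events_per_user": float(counts.mean()) if not counts.empty else 0.0,
        "repeat_rate": float((counts > 1).mean()) if not counts.empty else 0.0,
        "cadence_cv": float(gaps.std() / gaps.mean()) if len(gaps) > 1 else None,  # ~0 => scheduled
    }
```

An event like a purchase would typically show broad reach, a healthy repeat rate, and irregular cadence, which is exactly the profile of a reward the agent can plausibly influence at any moment.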