How Agentic Learners Handle Time-Based Cycles Without Breaking

Aug 25, 2025

Schaun Wheeler

A lot depends on implementation, so I can only speak to the agentic learners I’ve designed. One question that comes up often is how they handle time-based events like subscription cycles, product launches, holidays, and seasonality. The answer touches on an important point about agent design in general.

First thing to understand: recurring events (like subscription renewals) don’t make good reward signals. If renewal happens automatically on the 15th of every month, the system would spend most of the month seeing nothing but “failure,” then suddenly see “success” once. That’s not a useful signal. Renewal is a fine business goal, but agents need something more granular to learn from.

So it's a good thing reward signals aren’t the only thing agents rely on. If you want to message a user about an upcoming renewal or refill, you can set eligibility criteria tied to the renewal date. Agents fundamentally rely on instructions about what they can or can’t say. You can make certain vocabulary available for certain days - say, the week before the 15th - and not for others.

The same idea applies to holidays and scheduled events. A holiday is just a due date that applies to everyone, and show releases or sports matches are due dates that apply to specific users. The agent doesn’t need to “remember” those dates internally; it just needs the right eligibility criteria passed in.
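Here is a minimal sketch of date-based eligibility, assuming a simple rule object rather than any particular production system. The names `DueDateRule` and `eligible_labels`, the label strings, and the window lengths are all hypothetical. A rule with no `user_id` behaves like a holiday (it applies to everyone), while a rule with a `user_id` behaves like a renewal date or a match day for one user.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DueDateRule:
    label: str                      # vocabulary unlocked by this rule (hypothetical name)
    due: date                       # renewal date, holiday, match day, ...
    window_days: int = 7            # days before `due` during which the label is allowed
    user_id: Optional[str] = None   # None means the rule applies to everyone


def eligible_labels(today, user_id, base_labels, rules):
    """Return the labels the agent may choose among for this user today."""
    labels = list(base_labels)
    for rule in rules:
        applies_to_user = rule.user_id is None or rule.user_id == user_id
        days_until = (rule.due - today).days
        # Allowed from `window_days` before the due date through the due date itself.
        if applies_to_user and 0 <= days_until <= rule.window_days:
            labels.append(rule.label)
    return labels


rules = [
    DueDateRule("renewal_reminder", date(2025, 9, 15), user_id="u1"),
    DueDateRule("holiday_greeting", date(2025, 12, 25), window_days=3),
]
print(eligible_labels(date(2025, 9, 10), "u1", ["tips"], rules))   # ['tips', 'renewal_reminder']
print(eligible_labels(date(2025, 9, 10), "u2", ["tips"], rules))   # ['tips']
print(eligible_labels(date(2025, 12, 23), "u2", ["tips"], rules))  # ['tips', 'holiday_greeting']
```

The agent itself stores no dates here. Whatever calls the agent passes in the filtered label list, and the agent only learns among the options it was given.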

Dates for new product releases work a little differently. Agents can learn preferences for those, but the important part is the action set (the array of options among which an agent learns to distinguish), not the reward signal. If you’re adding a new product category, you can create a new label and approve content for it, and the agent will explore it like any other. Or you can create a "new product" label and learn which users find announcements about new features useful.
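To make "explore it like any other" concrete, here is a small sketch that uses Beta-Bernoulli Thompson sampling as a stand-in learner. That choice is an assumption for illustration, not a description of how any specific agent is built. The class `LabelLearner`, the label names, and the simulated response rates are all made up. The point is that a new label enters the action set with no evidence behind it, so the learner tries it without any special-case launch logic.

```python
import random


class LabelLearner:
    """Beta-Bernoulli Thompson sampling over a growable set of labels."""

    def __init__(self, labels, seed=None):
        self.rng = random.Random(seed)
        # [successes + 1, failures + 1] per label, i.e. a uniform Beta(1, 1) prior.
        self.counts = {label: [1, 1] for label in labels}

    def add_label(self, label):
        # A new label starts with no evidence, so it gets explored like any other.
        self.counts.setdefault(label, [1, 1])

    def choose(self, eligible=None):
        candidates = [l for l in self.counts if eligible is None or l in eligible]
        if not candidates:
            raise ValueError("no eligible labels to choose from")
        # Sample a plausible success rate per label and pick the best draw.
        draws = {l: self.rng.betavariate(*self.counts[l]) for l in candidates}
        return max(draws, key=draws.get)

    def update(self, label, reward):
        # reward is 1 if the user moved in the right direction, else 0.
        self.counts[label][0 if reward else 1] += 1


learner = LabelLearner(["shoes", "jackets"], seed=0)
learner.add_label("new_product")

# Simulated user with made-up response rates, for illustration only.
true_rates = {"shoes": 0.1, "jackets": 0.1, "new_product": 0.6}
sim = random.Random(1)
for _ in range(500):
    label = learner.choose()
    learner.update(label, 1 if sim.random() < true_rates[label] else 0)
print(learner.counts)
```

The optional `eligible` argument is where date-based eligibility criteria would plug in: the learner only samples among labels it is currently permitted to use.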

That general-label functionality has lots of uses. For one, it’s a way to learn seasonal patterns. Daily and weekly seasonality are straightforward, because those cycles repeat often. Longer cycles - monthly, yearly, holidays - require a broader timing action set, which can take longer to learn: a yearly pattern or an annual holiday gives you only one data point per year. You can chunk larger seasonality into smaller sets - month, season, weekend vs. weekday, sales days, holidays, and so on.
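One way to picture that chunking is a sketch that maps each timestamp to several overlapping timing labels and keeps a running reward total per label. Everything here is an assumption for illustration: the label format, the names `timing_labels` and `TimingRewards`, and the Northern Hemisphere season mapping.

```python
from collections import defaultdict
from datetime import date, datetime

DAY_NAMES = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]
# Northern Hemisphere meteorological seasons, indexed by month - 1 (an assumption).
SEASON_BY_MONTH = ["winter", "winter", "spring", "spring", "spring", "summer",
                   "summer", "summer", "autumn", "autumn", "autumn", "winter"]


def timing_labels(ts, holidays=frozenset()):
    """Map one timestamp to several coarse, overlapping timing labels."""
    labels = [
        f"hour:{ts.hour:02d}",
        f"day:{DAY_NAMES[ts.weekday()]}",
        "daytype:weekend" if ts.weekday() >= 5 else "daytype:weekday",
        f"month:{ts.month:02d}",
        f"season:{SEASON_BY_MONTH[ts.month - 1]}",
    ]
    if ts.date() in holidays:
        labels.append("daytype:holiday")
    return labels


class TimingRewards:
    """Aggregate rewards under every timing label a timestamp belongs to."""

    def __init__(self):
        self.totals = defaultdict(lambda: [0.0, 0])  # label -> [reward sum, count]

    def record(self, ts, reward, holidays=frozenset()):
        for label in timing_labels(ts, holidays):
            self.totals[label][0] += reward
            self.totals[label][1] += 1

    def mean(self, label):
        total, n = self.totals.get(label, (0.0, 0))
        return total / n if n else None


print(timing_labels(datetime(2025, 8, 25, 9, 30)))
# ['hour:09', 'day:mon', 'daytype:weekday', 'month:08', 'season:summer']
```

Because a single observation updates the hour, day, month, and season labels at once, the coarse labels accumulate evidence far faster than a label for one specific calendar date would.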

Those are the three major components of how agents learn to leverage time-recurrence:

1️⃣ A reward signal that tells the agent whether the thing it just did moved the assigned user in the right direction.
2️⃣ Eligibility criteria that give the agent permission to do certain things within certain time windows.
3️⃣ Timing action sets at varying levels of granularity that act as repositories for aggregated rewards.

Those three things are enough to handle refill cycles, product launches, holidays, or travel bookings - without the agent needing special-case logic for each one.