Here is a diagram of our agentic architecture (well, part of it). See the top-right box: "recommender service"? Let’s talk about that. At Aampe, we split copy personalization into two distinct decisions:
- Which item to recommend 
- How to compose the message that delivers it 
Each calls for a different approach.
For item recommendations, we use classical recommender systems: collaborative filtering, content-based ranking, etc. These are built to handle high-cardinality action spaces — often tens or hundreds of thousands of items — by leveraging global similarity structures among users and items.
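To make "global similarity structures" concrete, here is a minimal sketch of item-based collaborative filtering on implicit feedback. It is an illustration only, not Aampe's recommender service: the function names (`item_similarities`, `recommend`), the toy data, and the choice of cosine similarity are all assumptions made for this example.

```python
import math
from collections import defaultdict


def item_similarities(interactions):
    """Cosine similarity between items, based on which users touched them.

    `interactions` maps a user id to the set of item ids that user engaged with.
    (Hypothetical input shape, chosen for illustration.)
    """
    item_users = defaultdict(set)
    for user, items in interactions.items():
        for item in items:
            item_users[item].add(user)

    sims = defaultdict(dict)
    items = sorted(item_users)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            overlap = len(item_users[a] & item_users[b])
            if overlap:
                s = overlap / math.sqrt(len(item_users[a]) * len(item_users[b]))
                sims[a][b] = s
                sims[b][a] = s
    return sims


def recommend(user, interactions, sims, k=3):
    """Score unseen items by their similarity to items the user already engaged with."""
    seen = interactions[user]
    scores = defaultdict(float)
    for item in seen:
        for other, s in sims.get(item, {}).items():
            if other not in seen:
                scores[other] += s
    # Highest score first; ties broken by item id for determinism.
    return sorted(scores, key=lambda it: (-scores[it], it))[:k]


interactions = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"b", "c"},
    "u4": {"d"},
}
sims = item_similarities(interactions)
top_for_u1 = recommend("u1", interactions, sims)
```

The point of the sketch is that `u1` never interacted with item `c`, yet it gets recommended because other users' behavior links it to `a` and `b`. That cross-user generalization is what makes large item catalogs tractable without per-user trial and error.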
For message personalization, we take a different route. Each user has a dedicated semantic-associative agent that composes messages modularly — choosing tone, value proposition, incentive type, product category, and call to action. These decisions use a variant of Thompson sampling, with beta distributions derived from each user’s response history.
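As a rough illustration of how a per-user agent could pick copy components, here is a minimal sketch of Thompson sampling with Beta distributions. The post only says Aampe uses "a variant" of Thompson sampling, so this is the textbook Beta-Bernoulli form, not the production method. The `CopyAgent` class, the dimension and option names, the uniform Beta(1, 1) prior, and the choice to treat each dimension independently are all assumptions for the example.

```python
import random


class CopyAgent:
    """One agent per user; keeps a Beta(alpha, beta) per option in each copy dimension."""

    def __init__(self, options_by_dimension, seed=None):
        self.rng = random.Random(seed)
        # Assumption: start every option at a uniform Beta(1, 1) prior.
        self.params = {
            dim: {opt: [1.0, 1.0] for opt in opts}
            for dim, opts in options_by_dimension.items()
        }

    def compose(self):
        """Sample a plausible response rate for each option and keep the best per dimension."""
        choice = {}
        for dim, opts in self.params.items():
            draws = {opt: self.rng.betavariate(a, b) for opt, (a, b) in opts.items()}
            choice[dim] = max(draws, key=draws.get)
        return choice

    def update(self, choice, responded):
        """Credit (or debit) every component that appeared in the delivered message."""
        idx = 0 if responded else 1  # 0 -> alpha (successes), 1 -> beta (failures)
        for dim, opt in choice.items():
            self.params[dim][opt][idx] += 1.0


agent = CopyAgent(
    {
        "tone": ["playful", "formal"],
        "call_to_action": ["shop_now", "learn_more"],
    },
    seed=0,
)
message_plan = agent.compose()
agent.update(message_plan, responded=True)
```

Because each draw comes from the user's own posterior, options with little history still get sampled occasionally (exploration), while options with a strong response record win most of the time (exploitation). A real system would also need to handle interactions between dimensions, which this sketch ignores.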
Why split the system this way? Sometimes you want to send content without recommending an item — having two separate processes makes that easier. But there are deeper reasons why recommender systems suit item selection and reinforcement learning suits copy composition:
- Cardinality: The item space is vast, so trial and error is inefficient; recommenders generalize across users and items instead. Copy has a smaller, more personal action space where direct exploration works well.
- Objectives: Item recommendations aim at discovery, surfacing new or long-tail content. Copy is about resonance: hitting the right tone based on past responses.
- Decision structure: Item selection is often a single decision. Copy is modular, made of interdependent parts that must cohere, which makes it a good fit for RL over structured actions.
- Hidden dimensions: Item preferences stem from relatively stable traits like taste or relevance. Copy preferences shift quickly and depend on context, which suits RL’s recency-weighted learning.
- Reward density: Responses to any individual item are sparse. Copy choices, by contrast, get feedback from every message delivered, which is dense enough to train RL agents if interpreted correctly.
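The "recency-weighted learning" mentioned in the list above can be sketched as a discounted Beta update: before each new observation, past evidence is shrunk toward the prior so that recent responses count for more. This is one common way to do it, not necessarily what Aampe does; the function names, the `decay` value, and the Beta(1, 1) prior are assumptions for illustration.

```python
def discounted_update(alpha, beta, responded, decay=0.9, prior=1.0):
    """Shrink accumulated evidence toward the prior, then add the new observation.

    decay=1.0 reproduces the ordinary Beta-Bernoulli update (no forgetting).
    """
    alpha = prior + decay * (alpha - prior)
    beta = prior + decay * (beta - prior)
    if responded:
        alpha += 1.0
    else:
        beta += 1.0
    return alpha, beta


def posterior_mean(alpha, beta):
    """Expected response rate under Beta(alpha, beta)."""
    return alpha / (alpha + beta)


def replay(history, decay):
    """Apply the update to a sequence of True/False responses, oldest first."""
    alpha, beta = 1.0, 1.0
    for responded in history:
        alpha, beta = discounted_update(alpha, beta, responded, decay=decay)
    return alpha, beta


# A user who used to respond to a given tone but recently stopped.
history = [True] * 10 + [False] * 10
no_forgetting = posterior_mean(*replay(history, decay=1.0))
with_forgetting = posterior_mean(*replay(history, decay=0.8))
```

With no forgetting, ten old successes and ten recent failures cancel out to a 50% estimate. With decay, the estimate falls well below that, reflecting that the user's preference appears to have shifted.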
In short: recommenders find cross-user/item patterns in large spaces. RL adapts to each user in real time over structured choices. Aampe uses both — each matched to the decision it’s best for.













