Profit maximization is one of the key objectives of a firm. In a marketing context, it translates to targeting only the relevant individuals, namely those who will react favorably to a treatment. Identifying precisely those individuals has been the subject of two distinct Machine Learning approaches associated with optimal decision-making: Uplift Modeling and Reinforcement Learning. Despite their shared focus, the two techniques are fundamentally different. Uplift Modeling uses labeled data to predict the uplift of an action, whereas Reinforcement Learning is an iterative, label-free technique that aims to determine the optimal decision, incorporating the uplift. To date, however, research has scarcely examined the comparative effectiveness of these two approaches, nor has it explored the feasibility of an integrated framework that leverages both disciplines.
The aim of this study is to provide such a framework and to benchmark the business performance of both Uplift Modeling and Reinforcement Learning. Furthermore, the framework accounts for essential requirements of profit maximization in real-world business scenarios that have rarely been covered in the uplift literature. Specifically, it incorporates covariates that capture the expected revenue and costs associated with a given action, which are necessary to account for heterogeneity in spending patterns and action costs.
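As a rough illustration of the ideas above, the following Python sketch estimates uplift from labeled treatment/control data and then applies a profit-aware targeting rule that weighs expected revenue against action cost. It is a minimal sketch under stated assumptions, not the method of the study: the function names (`segment_uplift`, `expected_incremental_profit`, `should_treat`), the segment-level difference in conversion rates, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def segment_uplift(records):
    """Estimate uplift per segment as the difference in conversion rates
    between treated and control individuals.

    `records` is an iterable of (segment, treated, converted) tuples;
    this data layout is an assumption made for the example.
    """
    counts = defaultdict(lambda: {"t_n": 0, "t_y": 0, "c_n": 0, "c_y": 0})
    for segment, treated, converted in records:
        c = counts[segment]
        if treated:
            c["t_n"] += 1
            c["t_y"] += converted
        else:
            c["c_n"] += 1
            c["c_y"] += converted

    uplift = {}
    for segment, c in counts.items():
        if c["t_n"] == 0 or c["c_n"] == 0:
            continue  # uplift is undefined without both groups
        uplift[segment] = c["t_y"] / c["t_n"] - c["c_y"] / c["c_n"]
    return uplift


def expected_incremental_profit(uplift, revenue, cost):
    """Expected profit gained by treating: uplift * revenue - cost."""
    return uplift * revenue - cost


def should_treat(uplift, revenue, cost):
    """Target only when the expected incremental profit is positive."""
    return expected_incremental_profit(uplift, revenue, cost) > 0


# Toy records, invented for illustration only.
records = (
    [("A", True, 1)] * 3 + [("A", True, 0)] * 1
    + [("A", False, 1)] * 1 + [("A", False, 0)] * 3
    + [("B", True, 1)] * 2 + [("B", True, 0)] * 2
    + [("B", False, 1)] * 2 + [("B", False, 0)] * 2
)

uplift = segment_uplift(records)
for segment in sorted(uplift):
    decision = should_treat(uplift[segment], revenue=20.0, cost=4.0)
    print(segment, uplift[segment], decision)
```

In this sketch, segment A responds to the treatment while segment B converts at the same rate with or without it, so only A is worth the action cost. A Reinforcement Learning approach would instead learn such a targeting rule iteratively from observed rewards rather than from a labeled data set.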
Table of Contents
- Introduction
- Research Contributions
- Structure of this Paper
- Preliminaries
- Uplift Modeling in Machine Learning
- Profit Maximization in Uplift Modeling
- Policy Learning and Multi-Armed Bandit Models
- Reinforcement Learning for Uplift Modeling
- Policy Learning Approaches to Uplift Modeling
- Multi-Armed Bandit Models for Uplift Modeling
- Related Literature
- Profit Maximization through Uplift Modeling
- Reinforcement Learning for Uplift Modeling
- Experiment
- Methodology
- Data Sets
- Empirical Results
- Regret-Optimality as the reward metric for PL

Quote paper
Jon Henrik Rosenkranz (Author), 2023, Maximizing profit in uplift modeling through regret-optimal policy learning strategies, Munich, GRIN Verlag, https://www.grin.com/document/1378908