Beyond Winning and Losing: Modeling Human Motivations and Behaviors with Vector-Valued Inverse Reinforcement Learning
In recent years, reinforcement learning (RL) methods have been applied to model gameplay with great success, achieving super-human performance in environments such as Atari, Go, and Poker. However, these studies mostly focus on winning the game and have largely ignored the rich and complex human motivations that are essential for understanding the diversity of human behavior. In this paper, we present a multi-motivation behavior model that investigates multifaceted human motivations and learns the underlying value structure of the agents. Our approach extends inverse RL to vector-valued rewards, replacing the single-objective optimality assumption of standard inverse RL with the significantly weaker requirement of Pareto optimality. Our model therefore accommodates a wider range of behaviors that commonly appear in real-world environments. For practical assessment, our algorithm is tested on World of Warcraft datasets and demonstrates improvements over existing methods.
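To make the Pareto-optimality criterion concrete, the following is a minimal sketch of how a vector-valued return can be checked for Pareto dominance and how a set of candidate policies can be filtered to its Pareto front. The function names (`dominates`, `pareto_front`) and the three-objective example are illustrative assumptions, not the paper's actual API or data.

```python
import numpy as np

def dominates(u, v):
    """True if reward vector u Pareto-dominates v: u is at least as
    good as v in every objective and strictly better in at least one."""
    u, v = np.asarray(u), np.asarray(v)
    return bool(np.all(u >= v) and np.any(u > v))

def pareto_front(returns):
    """Keep only the Pareto-optimal rows of a set of vector-valued returns.
    Under the weakened inverse-RL assumption, any behavior whose return
    lies on this front is admissible as (approximately) rational."""
    returns = np.asarray(returns)
    keep = [
        i for i, r in enumerate(returns)
        if not any(dominates(other, r)
                   for j, other in enumerate(returns) if j != i)
    ]
    return returns[keep]

# Hypothetical example: three policies scored on
# (combat, social, exploration) motivation dimensions.
candidates = [[3.0, 1.0, 2.0],   # Pareto-optimal
              [2.0, 1.0, 1.0],   # dominated by the first policy
              [1.0, 3.0, 2.0]]   # Pareto-optimal (best social score)
front = pareto_front(candidates)
```

A scalar-reward inverse-RL model would label only one of these policies as optimal; under Pareto optimality, both non-dominated policies are consistent with rational behavior, which is how the model admits a wider range of observed play.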