A Two-Tier DRL and LLM-Based Agent System for Enhancing Fighting Game Enjoyability
Developing a two-tier agent (TTA) system that enhances player enjoyment in Street Fighter II using deep reinforcement learning (DRL) and a Large Language Model Hyper-Agent, which dynamically selects suitable DRL opponents based on player data and feedback.
To enhance your enjoyment, let’s start with a video!
(In this clip, the player in grey is the deep reinforcement learning agent, while the player in white is a human player.)
Paper
The paper can be found here; please download the PDF file.
The password to the PDF file is my full name (ShourenWang) + _ + my NYU Net ID (e.g., JohnnySilverhand_js1111). You should know the password if you are allowed to view this file.
Abstract
Deep reinforcement learning (DRL) has effectively enhanced gameplay experiences and game design across various game genres. However, few studies on fighting game agents have focused explicitly on enhancing player enjoyment, a critical factor for both developers and players. To address this gap and establish a practical baseline for designing enjoyability-focused agents, we propose a two-tier agent (TTA) system and conducted experiments in the classic fighting game Street Fighter II. The first tier of TTA employs a task-oriented network architecture, modularized reward functions, and hybrid training to produce diverse and skilled DRL agents. In the second tier of TTA, a Large Language Model Hyper-Agent, leveraging players' playing data and feedback, dynamically selects suitable DRL opponents. In addition, we investigate and model several key factors that affect the enjoyability of the opponent. The experiments demonstrate improvements of 218.56% to 472.55% in the execution of advanced skills over baseline methods. The trained agents also exhibit distinct game-playing styles. Additionally, we conducted a small-scale user study, and the overall enjoyment reported in players' feedback validates the effectiveness of our TTA system.
(The content below is outdated and will be updated in the coming weeks.)
Introduction
This project focuses on developing a Deep Reinforcement Learning (DRL)-based game-playing agent for Street Fighter II (SF2), with an emphasis on making the agent enjoyable to play against. Conducted as part of my research internship at the NYU Game Innovation Lab under the guidance of Prof. Julian Togelius, in collaboration with Zehua Jiang, Fernando Silva, and Sam Earle, the project explores advanced training techniques, such as self-play and reward design, to create an engaging and adaptive game-playing agent.
Method
Initial Challenges
The project began by investigating existing baselines for SF2 DRL agents. The first baseline lacked accessible code or a GitHub repository, while the second was a very simple project with poor agent performance. To improve on these, I tested various DRL models and extended them with auxiliary objectives to better capture key environmental information. However, the agents still failed to learn advanced strategies, which pointed to limitations in the original training methods rather than in the models' capacity.
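As a rough illustration of the auxiliary-objective idea, the sketch below attaches an extra prediction head to an actor-critic network; the architecture, head sizes, and the choice of predicting a normalized health difference are assumptions for illustration, not the exact setup used in this project.

```python
import torch
import torch.nn as nn

class PolicyWithAuxHead(nn.Module):
    """Actor-critic trunk plus an auxiliary head that predicts
    environment information (here, a hypothetical health difference)."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.policy_head = nn.Linear(hidden, n_actions)  # action logits
        self.value_head = nn.Linear(hidden, 1)           # state-value estimate
        self.aux_head = nn.Linear(hidden, 1)             # auxiliary prediction

    def forward(self, obs: torch.Tensor):
        z = self.trunk(obs)
        return self.policy_head(z), self.value_head(z), self.aux_head(z)

# During training, the auxiliary loss is simply added to the RL loss, e.g.:
# loss = rl_loss + aux_coef * mse(aux_pred, true_health_diff)
```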
Advanced Training Techniques
Hybrid Self-Play
Inspired by successful self-play pipelines in other game AI research, I integrated a hybrid self-play training approach. In this method:
- Self-Play: The agent trains by competing against copies of itself, enabling iterative improvement.
- Built-in AI Opponents: To further enhance training diversity, I incorporated the rule-based built-in AI designed by CAPCOM engineers as opponents during self-play.
This combination enabled the agent to effectively learn advanced skills and strategies, surpassing the limitations of earlier approaches.
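A minimal sketch of how such opponent mixing could be implemented is shown below; the snapshot pool, the mixing probability, and the function names are illustrative assumptions rather than the project's actual pipeline.

```python
import random

def sample_opponent(snapshot_pool, built_in_ais, p_self_play=0.7):
    """Hybrid self-play: mix frozen copies of the learning agent with
    rule-based built-in AI opponents (values here are illustrative)."""
    if snapshot_pool and random.random() < p_self_play:
        # Self-play branch: fight a frozen snapshot of an earlier policy.
        return random.choice(snapshot_pool)
    # Diversity branch: fight one of the rule-based built-in opponents.
    return random.choice(built_in_ais)
```

In such a setup, a frozen copy of the current policy would typically be appended to `snapshot_pool` every fixed number of training steps, so the agent keeps facing progressively stronger versions of itself.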
Reward Design
To make the agent more enjoyable to play against, I designed several reward functions aimed at promoting advanced and diverse behaviors. This involved:
- Rewarding advanced strategies, e.g., special moves.
- Penalizing repetitive or overly simplistic strategies.
The reward design process played a crucial role in guiding the agent toward behaviors that align with human players' preferences (a rough sketch of the modular structure follows).
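The sketch below illustrates what such a modularized reward function might look like; the `info` keys, weights, and thresholds are hypothetical placeholders, not the exact terms used in the experiments.

```python
def modular_reward(info, prev_info, weights=None):
    """Modularized reward: each term is computed independently and
    combined with tunable weights (all names and values illustrative)."""
    w = weights or {"damage": 1.0, "special": 0.5, "repetition": -0.3}

    # Damage trading: reward dealing damage, penalize taking it.
    damage = ((prev_info["enemy_hp"] - info["enemy_hp"])
              - (prev_info["agent_hp"] - info["agent_hp"]))

    # Advanced skills: bonus when a special move lands on this step.
    special = 1.0 if info.get("special_move_hit") else 0.0

    # Diversity: flag overly repetitive play for a penalty.
    repetition = 1.0 if info.get("repeated_action_streak", 0) > 5 else 0.0

    return (w["damage"] * damage
            + w["special"] * special
            + w["repetition"] * repetition)
```

Keeping each term separate makes it easy to re-weight or ablate individual behaviors without rewriting the rest of the reward code.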
Results
The results demonstrated significant improvements in the agent’s learning and enjoyability:
- Skill Advancement: The agent successfully learned advanced strategies, showcasing a level of play comparable to CAPCOM’s built-in AI.
- Player Enjoyability: The enhanced reward functions and self-play pipeline resulted in agents that provided more engaging gameplay experiences.
Future Work and Dissemination
The research is ongoing, with further efforts directed toward refining the training setup and exploring hyper-agent methods to further enhance enjoyability. The work is targeted for submission to ICML 2025 and IEEE CoG 2025.
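As described in the abstract, the hyper-agent tier uses a Large Language Model to select a suitable DRL opponent from the trained pool based on player data and feedback. The sketch below shows one way such a selection step could be wired up; the prompt wording, the `llm` callable, and the data fields are assumptions, not the system's actual interface.

```python
def select_opponent(llm, agent_pool, player_stats, player_feedback):
    """LLM Hyper-Agent sketch: turn play data and feedback into a prompt
    and let the model pick the next DRL opponent. `llm` is any
    text-in/text-out callable (e.g., a thin chat-API wrapper)."""
    prompt = (
        "You select Street Fighter II opponents to maximize player enjoyment.\n"
        f"Available agents (name: play style): {agent_pool}\n"
        f"Recent player stats: {player_stats}\n"
        f"Player feedback: {player_feedback}\n"
        "Reply with exactly one agent name from the list."
    )
    choice = llm(prompt).strip()
    # Fall back to a default agent if the reply is not a known name.
    return choice if choice in agent_pool else next(iter(agent_pool))
```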
Codebase
The code is available in this GitHub repository. The project code is well structured and should be compatible with other OpenAI Gym or Gym Retro-based tasks. The agents' trained models are currently not provided, since hosting them requires the Git LFS service.
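For reference, loading the SF2 environment through Gym Retro typically looks like the snippet below; the exact game ID and default state depend on the ROM integration installed locally, and the random-action loop is only a smoke test.

```python
import retro

# Requires the ROM to be imported locally first:
#   python -m retro.import /path/to/your/roms
env = retro.make(game="StreetFighterIISpecialChampionEdition-Genesis")

obs = env.reset()
done = False
while not done:
    # Random actions, just to verify the environment runs.
    obs, reward, done, info = env.step(env.action_space.sample())
env.close()
```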