Lin, Tianyi; Zhou, Zhengyuan; Ba, Wenjia; Zhang, Jiawei - 2021
We consider online no-regret learning in unknown games with bandit feedback, where each agent observes only its reward at each time step (determined by all players' current joint action) rather than its gradient. We focus on the class of smooth and strongly monotone games and study optimal...