Towards Learning from Implicit Human Reward: (Extended Abstract)

Guangliang Li, Hamdi Dibeklioglu, S Whiteson, Hayley Hung

Research output: Contribution to conference › Abstract › Scientific

2 Citations (Scopus)


The TAMER framework provides a way for agents to learn to solve tasks using human-generated rewards. Previous research showed that humans give copious feedback early in training but only sparse feedback thereafter, and that an agent's competitive feedback (informing the trainer about its performance relative to other trainers) can greatly affect the trainer's engagement and the agent's learning. In this paper, we present the first large-scale study of TAMER, involving 561 subjects, which investigates the effect of the agent's competitive feedback in a new setting as well as the potential for learning from trainers' facial expressions. Our results show for the first time that a TAMER agent can successfully learn to play Infinite Mario, a challenging reinforcement-learning benchmark problem. In addition, our study supports prior results demonstrating the importance of bi-directional feedback and competitive elements in the training interface. Finally, our results shed light on the potential for using trainers' facial expressions as reward signals, as well as the role of age and gender in trainer behavior and agent performance.
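As context for readers unfamiliar with the framework, the TAMER loop described above can be sketched in a few lines: the agent fits a model H(s, a) of the human trainer's reward by supervised updates and acts greedily with respect to it. The 1-D track task, feature function, and simulated trainer below are hypothetical stand-ins for illustration, not the Infinite Mario setup or feedback interface used in the study.

```python
# Minimal TAMER-style sketch: learn a linear model H(s, a) of the human
# trainer's reward and act greedily with respect to it. Task, features,
# and the simulated trainer are illustrative assumptions, not the paper's.

ACTIONS = [-1, +1]   # move left / move right on a 1-D track
GOAL = 5             # hypothetical goal position

def features(state, action):
    """Tiny feature vector for the linear human-reward model."""
    return [1.0, float(state), float(action), float(state * action)]

def predict(weights, state, action):
    """H(s, a): predicted human reward for taking `action` in `state`."""
    return sum(w * f for w, f in zip(weights, features(state, action)))

def simulated_trainer(state, action):
    """Stand-in for a human trainer: +1 if the action moves toward the goal."""
    return 1.0 if (GOAL - state) * action > 0 else -1.0

def tamer_episode(weights, alpha=0.1, steps=50):
    """One training episode: act greedily on H, then update H toward
    the feedback signal with a supervised gradient step."""
    state = 0
    for _ in range(steps):
        action = max(ACTIONS, key=lambda a: predict(weights, state, a))
        h = simulated_trainer(state, action)            # feedback signal
        error = h - predict(weights, state, action)     # supervised target
        for i, f in enumerate(features(state, action)):
            weights[i] += alpha * error * f
        state += action
    return weights

weights = tamer_episode([0.0, 0.0, 0.0, 0.0])
```

Note the contrast with standard reinforcement learning: there is no discounted return here; the human signal is treated as a direct supervised label for the current state-action pair, which is the core idea behind the framework as described in the abstract.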
Original language: English
Number of pages: 2
Publication status: Published - 2016
Event: AAMAS 2016: 15th International Conference on Autonomous Agents and Multiagent Systems - Singapore, Singapore
Duration: 9 May 2016 – 13 May 2016
Conference number: 15


Conference: AAMAS 2016
Abbreviated title: AAMAS


  • Reinforcement learning
  • Human-agent interaction

