
Bikramjit Banerjee's Publications


Team Learning from Human Demonstration with Coordination Confidence

Bikramjit Banerjee, Syamala Vittanala, and Matthew E. Taylor. Team Learning from Human Demonstration with Coordination Confidence. The Knowledge Engineering Review, 34(e12), Cambridge University Press, 2019.

Download

[PDF] 

Abstract

Among an array of techniques proposed to speed up reinforcement learning (RL), learning from human demonstration has a proven record of success. A related technique, called Human Agent Transfer, and its confidence-based derivatives have been successfully applied to single-agent RL. This article investigates their application to collaborative multi-agent RL problems. We show that a first-cut extension may leave room for improvement in some domains, and propose a new algorithm called coordination confidence (CC). CC analyzes the difference in perspectives between a human demonstrator (global view) and the learning agents (local view), and informs the agents' action choices when the difference is critical and simply following the human demonstration can lead to miscoordination. We conduct experiments in three domains to investigate the performance of CC in comparison with relevant baselines.
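
The abstract describes the idea only at a high level. As a rough, hypothetical illustration of confidence-gated action selection (not the paper's actual CC algorithm; the names choose_action, demo_policy, and confidence are assumptions made for this sketch), an agent might follow the demonstration-derived action only when an estimated coordination confidence for the current state clears a threshold, and otherwise act on its own learned values:

# Hypothetical sketch of confidence-gated action selection (not the paper's
# CC algorithm): follow the human-demonstration policy only when an estimated
# coordination confidence for the current state is high enough.

def choose_action(state, q_values, demo_policy, confidence, threshold=0.8):
    """Pick an action for `state`.

    q_values:    dict mapping (state, action) -> the agent's learned value
    demo_policy: dict mapping state -> action suggested by the demonstration
    confidence:  dict mapping state -> estimated probability that following
                 the demonstration will not cause miscoordination
    """
    if state in demo_policy and confidence.get(state, 0.0) >= threshold:
        return demo_policy[state]  # trust the human demonstration here
    # Otherwise act greedily on the agent's own (local-view) value estimates.
    candidates = [(a, v) for (s, a), v in q_values.items() if s == state]
    return max(candidates, key=lambda av: av[1])[0]

For example, with q_values = {("s0", "left"): 0.2, ("s0", "right"): 0.5}, demo_policy = {"s0": "left"}, and confidence = {"s0": 0.3}, the low confidence causes the agent to ignore the demonstration and pick "right" from its own estimates.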

BibTeX

@Article{Banerjee19:Team,
  author =    {Bikramjit Banerjee and Syamala Vittanala and Matthew E. Taylor},
  title =     {Team Learning from Human Demonstration with Coordination Confidence},
  journal =   {The Knowledge Engineering Review},
  year =      {2019},
  volume =    {34},
  number =    {e12},
  publisher = {Cambridge University Press},
  abstract =  {Among an array of techniques proposed to speed up
    reinforcement learning (RL), learning from human demonstration has a
    proven record of success. A related technique, called Human Agent
    Transfer, and its confidence-based derivatives have been successfully
    applied to single-agent RL. This article investigates their application to
    collaborative multi-agent RL problems. We show that a first-cut extension
    may leave room for improvement in some domains, and propose a new
    algorithm called coordination confidence (CC). CC analyzes the difference
    in perspectives between a human demonstrator (global view) and the
    learning agents (local view), and informs the agents' action choices when
    the difference is critical and simply following the human demonstration can
    lead to miscoordination. We conduct experiments in three domains to
    investigate the performance of CC in comparison with relevant baselines.},
}

Generated by bib2html.pl (written by Patrick Riley) on Sat May 29, 2021 15:48:22