Bikramjit Banerjee's Publications

Reinforcement Learning as a Rehearsal for Swarm Foraging

Trung Nguyen and Bikramjit Banerjee. Reinforcement Learning as a Rehearsal for Swarm Foraging. Swarm Intelligence, 16(1):29–58, Springer, 2022.

Abstract

Foraging in a swarm of robots has been investigated by many researchers, where the prevalent techniques have been hand-designed algorithms with parameters often tuned via machine learning. Our departure point is one such algorithm, where we replace a hand-coded decision procedure with reinforcement learning (RL), resulting in significantly superior performance. We situate our approach within the reinforcement learning as a rehearsal (RLaR) framework that we have recently introduced. We instantiate RLaR for the foraging problem, and experimentally show that a key component of RLaR, a conditional probability distribution function, can be modeled as a uni-modal distribution (with a lower memory footprint) despite evidence that it is multi-modal. Our experiments also show that the learned behavior has some degree of scalability in terms of variations in the swarm size or the environment.

BibTeX

@Article{Nguyen22:Reinforcement,
  author =       {Trung Nguyen and Bikramjit Banerjee},
  title =        {{Reinforcement Learning as a Rehearsal for Swarm Foraging}},
  journal =      {Swarm Intelligence},
  year =         {2022},
  volume =       {16},
  number =       {1},
  pages =        {29--58},
  publisher =    {Springer},
  abstract =     {Foraging in a swarm of robots has been
  investigated by many researchers, where the prevalent techniques
  have been hand designed algorithms with parameters often tuned
  via machine learning. Our departure point is one such algorithm,
  where we replace a handcoded decision procedure with reinforcement
  learning (RL), resulting in significantly superior performance.
  We situate our approach within the reinforcement learning as a
  rehearsal (RLaR) framework, that we have recently introduced.
  We instantiate RLaR for the foraging problem, and experimentally
  show that a key component of RLaR---a conditional probability
  distribution function---can be modeled as a uni-modal distribution
  (with a lower memory footprint) despite evidence that it is
  multi-modal. Our experiments also show that the learned behavior
  has some degree of scalability in terms of variations in the
  swarm size or the environment.},
}
