Q-learning policies for multi-agent foraging task

M. Yogeswaran; Ponnambalam S G

doi:10.1007/978-3-642-15810-0_25

Published in

2010

Volume: 103 CCIS

Pages: 194 - 201

Abstract

The trade-off between exploitation and exploration in multi-agent systems learning has been a crucial area of research for the past few decades. A proper learning policy is necessary to address this trade-off so that agents can react rapidly and adapt in a dynamic environment. A family of core learning policies suitable for the non-stationary multi-agent foraging task modeled in this paper was identified in the open literature. The model is used to compare and contrast the identified learning policies, namely greedy, ε-greedy and Boltzmann distribution. A simple random search is also included to justify the convergence of Q-learning. A number of simulation-based experiments were conducted, and the performances of the learning policies are discussed based on the numerical results obtained. © 2010 Springer-Verlag.
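
For readers unfamiliar with the three policies named in the abstract, the following is a minimal sketch of how greedy, ε-greedy and Boltzmann action selection are commonly defined over a list of Q-values, together with the standard tabular Q-learning update. It is not taken from the paper: the function names, the parameters (`epsilon`, `temperature`, `alpha`, `gamma`) and the list-of-lists Q-table are assumptions made for illustration, and the paper's foraging model and settings may differ.

```python
import math
import random


def greedy(q_values):
    """Pick the action with the highest estimated value (first index on ties)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, else act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy(q_values)


def boltzmann_probs(q_values, temperature):
    """Softmax over Q-values; subtracting the max keeps exp() numerically stable."""
    highest = max(q_values)
    weights = [math.exp((q - highest) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann(q_values, temperature, rng=random):
    """Sample an action in proportion to its Boltzmann (softmax) probability."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def q_update(q_table, state, action, reward, next_state, alpha, gamma):
    """Standard tabular Q-learning update (illustrative, not the paper's code)."""
    best_next = max(q_table[next_state])
    td_error = reward + gamma * best_next - q_table[state][action]
    q_table[state][action] += alpha * td_error
```

In this sketch, a higher `temperature` makes the Boltzmann policy closer to uniform random choice, while a temperature near zero makes it behave almost greedily; `epsilon` plays a similar role for ε-greedy by fixing the fraction of purely exploratory moves.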
