
JACIII Vol.23 No.5 pp. 920-927
doi: 10.20965/jaciii.2019.p0920


Developing End-to-End Control Policies for Robotic Swarms Using Deep Q-learning

Yufei Wei*, Xiaotong Nie*, Motoaki Hiraga*, Kazuhiro Ohkura*, and Zlatan Car**

*Graduate School of Engineering, Hiroshima University
1-4-1 Kagamiyama, Higashi-hiroshima, Hiroshima 739-8527, Japan

**Faculty of Engineering, University of Rijeka
58 Vukovarska, Rijeka 51000, Croatia

Received: September 25, 2018
Accepted: May 1, 2019
Published: September 20, 2019
Keywords: swarm robotics, automatic design, deep reinforcement learning, deep Q-learning

This study explores the use of deep Q-learning, a popular deep reinforcement learning algorithm, in developing end-to-end control policies for robotic swarms. Each robot has only limited local sensory capabilities; in a swarm, however, robots can accomplish collective tasks beyond the capability of a single robot. Compared with most automatic design approaches proposed so far, which belong to the field of evolutionary robotics, deep reinforcement learning techniques provide two advantages: (i) they enable researchers to develop control policies in an end-to-end fashion; and (ii) they require fewer computational resources, especially when the control policy to be developed has a large parameter space. The proposed approach is evaluated in a round-trip task, in which the robots are required to travel between two destinations as many times as possible. Simulation results show that the proposed approach can learn control policies for robotic swarms directly from high-dimensional raw camera pixel inputs.
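For readers unfamiliar with deep Q-learning, the following is a minimal sketch of the two generic ingredients of the algorithm: the one-step Q-learning target and epsilon-greedy action selection. It is not the authors' implementation. The function names, the discount factor, and the toy values are assumptions made for illustration, and the Q-values that a deep network would compute from camera pixels are passed in here as plain lists.

```python
import random


def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """One-step Q-learning targets: r + gamma * max_a' Q(s', a').

    next_q_values holds one list of action values per transition. In deep
    Q-learning these would come from a neural network (assumed, not shown).
    """
    targets = []
    for r, q_next, done in zip(rewards, next_q_values, dones):
        # Terminal transitions do not bootstrap from the next state.
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(r + bootstrap)
    return targets


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    # Toy batch of two transitions; the second one is terminal.
    print(td_targets([1.0, 0.0], [[0.5, 2.0], [1.0, 3.0]], [False, True], gamma=0.5))
    print(epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0))
```

In a swarm setting, one plausible arrangement is for every robot to run the same policy on its own local observation, but the abstract does not specify this detail.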

Initial experiment environment for a robotic swarm that is controlled by deep Q-learning

Cite this article as:
Y. Wei, X. Nie, M. Hiraga, K. Ohkura, and Z. Car, “Developing End-to-End Control Policies for Robotic Swarms Using Deep Q-learning,” J. Adv. Comput. Intell. Intell. Inform., Vol.23 No.5, pp. 920-927, 2019.
