Behavior Acquisition in Partially Observable Environments by Autonomous Segmentation of the Observation Space

Kousuke Inoue*, Tamio Arai**, and Jun Ota***

*Department of Intelligent Systems Engineering, Faculty of Engineering, Ibaraki University
4-12-1 Nakanarusawa-cho, Hitachi, Ibaraki 316-8511, Japan

**Shibaura Institute of Technology
3-7-5 Toyosu, Koto-ku, Tokyo 135-8548, Japan

***Research into Artifacts, Center for Engineering (RACE), The University of Tokyo
5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8568, Japan

December 26, 2014
May 11, 2015
June 20, 2015
learning, partially observable Markov decision process, autonomous state-space construction

State representation
In this paper, we propose a method by which an agent autonomously constructs a state representation that achieves state identification with a sufficient Markovian property, using a continuous, multi-dimensional observation space in partially observable environments. To deal with the non-Markovian property of the environment, the state is represented as a decision tree over past observations and actions. This representation is gradually segmented until appropriate state distinction is achieved. Because the observation space of the agent is not segmented in advance, the agent has to determine the cause of any insufficiency in its state representation: (1) insufficient segmentation of the observation space, or (2) perceptual aliasing. In the proposed method, the cause is determined by a statistical analysis of past experiences, and the method of state segmentation is chosen according to this cause. Results of simulations in two-dimensional grid environments and of experiments with a real mobile robot navigating in a two-dimensional continuous workspace show that an agent can successfully acquire navigation behaviors in the presence of many hidden states.
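
The choice between the two causes can be illustrated with a minimal sketch. It is not the statistical analysis used in the proposed method; it only assumes one plausible form of such a test. The experiences stored at one leaf of the tree are split at the median of each dimension of the current observation, and the returns on the two sides are compared with a Welch t-statistic. If no such split separates the returns, the same test is repeated on the preceding observation. The names `Experience`, `welch_t`, `best_split`, `diagnose`, and the threshold `t_crit` are hypothetical, and past actions are omitted for brevity.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean, median, variance
from typing import Callable, List, Sequence, Tuple


@dataclass
class Experience:
    obs: Sequence[float]       # current continuous observation
    prev_obs: Sequence[float]  # observation one step earlier (history)
    ret: float                 # return obtained after this step


def welch_t(a: List[float], b: List[float]) -> float:
    """Absolute Welch t-statistic; 0.0 when either group is too small."""
    if len(a) < 2 or len(b) < 2:
        return 0.0
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        return 0.0 if mean(a) == mean(b) else float("inf")
    return abs(mean(a) - mean(b)) / se


def best_split(samples: List[Experience],
               key: Callable[[Experience], Sequence[float]]
               ) -> Tuple[float, int, float]:
    """Try a median cut on every dimension; return (t, dimension, cut)."""
    best = (0.0, -1, 0.0)
    for d in range(len(key(samples[0]))):
        cut = median(key(e)[d] for e in samples)
        lo = [e.ret for e in samples if key(e)[d] <= cut]
        hi = [e.ret for e in samples if key(e)[d] > cut]
        t = welch_t(lo, hi)
        if t > best[0]:
            best = (t, d, cut)
    return best


def diagnose(samples: List[Experience], t_crit: float = 3.0):
    """Guess why a leaf's returns are inconsistent (assumed threshold test)."""
    # Cause (1): the current observation itself separates the returns.
    t, d, cut = best_split(samples, lambda e: e.obs)
    if t > t_crit:
        return ("segment_observation", d, cut)
    # Cause (2): only the history separates them, i.e. perceptual aliasing.
    t, d, cut = best_split(samples, lambda e: e.prev_obs)
    if t > t_crit:
        return ("extend_history", d, cut)
    return ("sufficient", -1, 0.0)
```

In this sketch, `"segment_observation"` would correspond to refining the leaf along the current observation, and `"extend_history"` to adding a test on an earlier observation to the decision tree. The median cut and the fixed threshold are simplifications chosen to keep the example short.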
