
JACIII Vol.18 No.6 pp. 992-998
doi: 10.20965/jaciii.2014.p0992


Mutually Dependent Markov Decision Processes

Toshiharu Fujita* and Akifumi Kira**

*Graduate School of Engineering, Kyushu Institute of Technology, 1-1 Sensui-cho, Tobata, Kitakyushu 804-8550, Japan

**Graduate School of Economics and Management, Tohoku University, 27-1 Kawauchi, Aoba-ku, Sendai 980-8576, Japan

Received: February 20, 2014
Accepted: June 10, 2014
Published: November 20, 2014
Keywords: Markov decision process, dynamic programming, nonserial system, additive reward
In this paper, we introduce a basic framework for mutually dependent Markov decision processes (MDMDP), in which two processes depend on each other recursively. The model is built on two types of finite-stage Markov decision processes. At each stage, the reward in one process is given by the optimal value of a problem in the other process, whose initial state is determined by the current state and decision in the original process. We formulate the MDMDP model and derive mutually dependent recursive equations by dynamic programming. We then illustrate MDMDP with a numerical example. The model makes some classes of complex multi-stage decision processes easier to treat.
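The mutual dependence described in the abstract can be sketched as a pair of value functions that call each other. The following is a minimal illustration only, not the formulation from the paper: the two-state toy data, the names `u`, `v`, `start_a`, `start_b`, `p_a`, `p_b`, and in particular the assumption that the embedded problem always has one stage fewer than the calling problem (which is what makes the recursion terminate here) are all hypothetical. Rewards are accumulated additively, in line with the "additive reward" keyword.

```python
from functools import lru_cache

# Hypothetical toy data: both processes share two states and two actions.
STATES = (0, 1)
ACTIONS = (0, 1)

# Assumed terminal rewards when no stages remain.
TERMINAL_A = {0: 0.0, 1: 1.0}
TERMINAL_B = {0: 2.0, 1: 0.0}


def p_a(x, a):
    """Assumed transition law of process A: {next_state: probability}."""
    stay = 0.2 if a == 1 else 0.9
    return {x: stay, 1 - x: 1.0 - stay}


def p_b(y, b):
    """Assumed transition law of process B: {next_state: probability}."""
    stay = 0.5 if b == 1 else 0.7
    return {y: stay, 1 - y: 1.0 - stay}


def start_b(x, a):
    """Initial state of the B-problem induced by state x and decision a in A."""
    return (x + a) % 2


def start_a(y, b):
    """Initial state of the A-problem induced by state y and decision b in B."""
    return y * b


@lru_cache(maxsize=None)
def u(n, x):
    """Optimal expected value of process A with n stages left, from state x."""
    if n == 0:
        return TERMINAL_A[x]
    return max(
        # Stage reward = optimal value of the other process (assumed n - 1 stages).
        v(n - 1, start_b(x, a))
        + sum(p * u(n - 1, z) for z, p in p_a(x, a).items())
        for a in ACTIONS
    )


@lru_cache(maxsize=None)
def v(m, y):
    """Optimal expected value of process B with m stages left, from state y."""
    if m == 0:
        return TERMINAL_B[y]
    return max(
        # Stage reward = optimal value of the other process (assumed m - 1 stages).
        u(m - 1, start_a(y, b))
        + sum(p * v(m - 1, z) for z, p in p_b(y, b).items())
        for b in ACTIONS
    )
```

Calling `u(3, 0)` evaluates a three-stage A-problem; memoization via `lru_cache` keeps the mutual recursion from recomputing the same subproblems. A different rule for the horizon of the embedded problem would require a different termination argument.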
Cite this article as:
T. Fujita and A. Kira, “Mutually Dependent Markov Decision Processes,” J. Adv. Comput. Intell. Intell. Inform., Vol.18 No.6, pp. 992-998, 2014.
