Multi-Robot Behavior Adaptation to Humans’ Intention in Human-Robot Interaction Using Information-Driven Fuzzy Friend-Q Learning

Lue-Feng Chen; Zhen-Tao Liu; Min Wu; Fangyan Dong; Kaoru Hirota

doi:10.20965/jaciii.2015.p0173

JACIII Vol.19 No.2 pp. 173-184

(2015)

Paper:

Lue-Feng Chen^*,^**, Zhen-Tao Liu^**, Min Wu^**, Fangyan Dong^*, and Kaoru Hirota^*

^*Department of Computational Intelligence and Systems Science, Tokyo Institute of Technology
G3-49, 4259 Nagatsuta, Midori-ku, Yokohama, Kanagawa 226-8502, Japan

^**School of Automation, China University of Geosciences
No. 388 Lumo Road, Hongshan District, Wuhan, Hubei 430074, China

Received:

May 31, 2014

Accepted:

November 13, 2014

Published:

March 20, 2015

Keywords:

human-robot interaction, Q-learning, behavior adaptation, intention understanding, information-driven

Abstract

A multi-robot behavior adaptation mechanism that adapts to human intention is proposed for human-robot interaction (HRI), in which information-driven fuzzy friend-Q learning (IDFFQ) is used to generate an optimal behavior-selection policy and intention is understood mainly from human emotions. The mechanism aims to endow robots with human-oriented interaction capabilities so that they can understand human intentions and adapt their behaviors to them. It also decreases the response time (RT) of robots by embedding human identification information, such as religion, in behavior selection, and increases human satisfaction by considering deep-level information, including intention and emotion, so that interactions run smoothly. Experiments are performed in a scenario of drinking at a bar. Results show that the proposal requires 51 fewer learning steps than fuzzy production rule based friend-Q learning (FPRFQ), and that the robots’ RT is about 25% of the time consumed by FPRFQ. Additionally, emotion recognition and intention understanding achieve accuracies of 80.36% and 85.71%, respectively. Moreover, a subjective evaluation by customers through a questionnaire yields a response of “satisfied.” Based on these preliminary experiments, the proposal is being extended to service robots for behavior adaptation to customers’ intention to drink at a bar.
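
For orientation, the friend-Q backup that IDFFQ extends, as its name suggests, can be sketched as follows. This is a minimal sketch of plain tabular friend-Q learning only, not the authors' IDFFQ: the fuzzy and information-driven components are not specified in this abstract and are left out. The function name, the state labels, the behavior names, the reward, and the `alpha` and `gamma` values are all hypothetical and chosen purely for illustration.

```python
import itertools
from collections import defaultdict


def friend_q_update(q, state, joint_action, reward, next_state,
                    joint_actions, alpha=0.1, gamma=0.9):
    """Apply one friend-Q backup to a table keyed by (state, joint_action).

    Friend-Q treats the other agents as cooperative, so the bootstrap
    target maximizes over the full joint action in the next state.
    """
    best_next = max(q[(next_state, a)] for a in joint_actions)
    old = q[(state, joint_action)]
    q[(state, joint_action)] = (1 - alpha) * old + alpha * (reward + gamma * best_next)
    return q[(state, joint_action)]


# Hypothetical toy setting: two robots, each choosing one of two behaviors.
behaviors = ("greet", "serve")
joint_actions = list(itertools.product(behaviors, repeat=2))

q = defaultdict(float)  # unseen entries start at 0.0

# First backup: every next-state value is still 0, so only the reward counts.
first = friend_q_update(q, "s0", ("greet", "serve"), 1.0, "s1",
                        joint_actions, alpha=0.5, gamma=0.9)

# Suppose one joint action in s1 has since been valued at 2.0.
q[("s1", ("serve", "serve"))] = 2.0
second = friend_q_update(q, "s0", ("greet", "serve"), 1.0, "s1",
                         joint_actions, alpha=0.5, gamma=0.9)
```

The table is indexed by the joint action of all robots rather than by each robot's own action, which is the point where friend-Q differs from independent single-agent Q-learning.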

Cite this article as:

L. Chen, Z. Liu, M. Wu, F. Dong, and K. Hirota, “Multi-Robot Behavior Adaptation to Humans’ Intention in Human-Robot Interaction Using Information-Driven Fuzzy Friend-Q Learning,” J. Adv. Comput. Intell. Intell. Inform., Vol.19 No.2, pp. 173-184, 2015.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
