
JACIII Vol.27 No.2 pp. 173-181
doi: 10.20965/jaciii.2023.p0173

Research Paper:

Capsule Network Extension Based on Metric Learning

Nozomu Ohta, Shin Kawai, and Hajime Nobuhara

Department of Intelligent Interaction Technologies, University of Tsukuba
1-1-1 Tenoudai, Tsukuba Science City, Ibaraki 305-8573, Japan

Received: February 13, 2022
Accepted: October 20, 2022
Published: March 20, 2023
Keywords: deep learning, capsule network, angular loss, image analysis

A capsule network (CapsNet) is a deep learning model for image classification that is robust to changes in the poses of objects in images. A capsule is a vector whose direction represents the presence, position, size, and pose of an object. However, in a CapsNet, the distribution of capsules is concentrated in a class, and the number of capsules increases with the number of classes. In addition, training a CapsNet is computationally expensive. We propose a method that increases the diversity of capsule directions and decreases the computational cost of CapsNet training by allowing a single capsule to represent multiple object classes. To determine the distance between classes, we used an additive angular margin loss called ArcFace. To validate the proposed method, we examined the distribution of the capsules using principal component analysis. In addition, using the MNIST, Fashion-MNIST, EMNIST, SVHN, and CIFAR-10 datasets, as well as the corresponding affine-transformed datasets, we measured the accuracy and training time of the proposed method and the original CapsNet. Compared with the original CapsNet, the proposed method improved accuracy by 8.91% on the CIFAR-10 dataset and reduced training time by more than 19% on each dataset.
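To make the additive angular margin idea concrete, the following is a minimal NumPy sketch of an ArcFace-style loss. It is an illustration under stated assumptions, not the authors' implementation: the function name `arcface_loss`, the argument names, and the scale `s` and margin `m` values are hypothetical and are not taken from the paper. The sketch normalizes the feature vectors and class weights, adds the margin to the angle of the true class only, and then applies softmax cross-entropy.

```python
import numpy as np

def arcface_loss(features, weights, labels, s=30.0, m=0.5):
    """Mean ArcFace-style loss (illustrative sketch, not the paper's code).

    features: (N, D) feature vectors, e.g. capsule outputs (assumption).
    weights:  (C, D) one weight vector per class.
    labels:   (N,) integer class indices.
    s, m:     scale and angular margin in radians (illustrative values).
    """
    features = np.asarray(features, dtype=float)
    weights = np.asarray(weights, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # L2-normalize so each dot product equals the cosine of the angle.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = np.clip(f @ w.T, -1.0, 1.0)  # shape (N, C)
    theta = np.arccos(cos)

    # Add the margin m to the true-class angle only.
    # Simplification: no special handling when theta + m exceeds pi.
    rows = np.arange(labels.shape[0])
    theta[rows, labels] += m

    # Scaled logits followed by a numerically stable log-softmax.
    logits = s * np.cos(theta)
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[rows, labels].mean())
```

With `m=0` this reduces to a normalized softmax loss. A positive margin makes the true class harder to satisfy, so the same inputs produce a larger loss, which is what pushes classes apart in angle during training.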

CapsNet and the proposed method

Cite this article as:
N. Ohta, S. Kawai, and H. Nobuhara, “Capsule Network Extension Based on Metric Learning,” J. Adv. Comput. Intell. Intell. Inform., Vol.27 No.2, pp. 173-181, 2023.
