Research Paper:
Deep Convolutional Neural Networks Based on Knowledge Distillation for Offline Handwritten Chinese Character Recognition
Hongli He*, Zongnan Zhu**, Zhuo Li***, and Yongping Dan**

*Rail Transit Institute, Henan College of Transportation
No.259 Tonghui Road, Baisha Vocational Education Park, Zhengdong New District, Zhengzhou 450061, China
**School of Electronic and Information, Zhongyuan University of Technology
No.41 Zhongyuan Road, Zhengzhou 450007, China
Corresponding author
***Graduate School of Science and Engineering, Ritsumeikan University
1-1-1 Noji-higashi, Kusatsu, Shiga 525-8577, Japan
Handwritten Chinese character recognition (HCCR) is a challenging research area in computer vision, and deep convolutional neural networks (DCNNs) have achieved outstanding performance in this field. However, DCNNs require a large number of parameters and consume considerable memory. To address these issues, this paper proposes an approach based on an attention mechanism and knowledge distillation: the attention mechanism improves feature extraction, while knowledge distillation reduces the number of parameters. Experimental results show that ResNet18 achieves a recognition accuracy of 97.63% on the HCCR dataset with 11.25 million parameters. Compared with other methods, this study improves the performance of HCCR.
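The abstract does not spell out the distillation objective, so the following PyTorch-style sketch is only an illustration of the standard soft-target knowledge-distillation loss (temperature-scaled KL divergence blended with cross-entropy), one common way a compact student such as ResNet18 can be trained against a larger teacher. The temperature `T`, weight `alpha`, and the teacher/student models are illustrative assumptions, not configurations reported in this paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Soft-target knowledge-distillation loss: KL divergence between
    temperature-softened teacher and student distributions, blended with the
    usual hard-label cross-entropy. T and alpha are illustrative values."""
    # Soft targets from the (frozen) teacher, softened by temperature T.
    soft_targets = F.softmax(teacher_logits / T, dim=1)
    log_student = F.log_softmax(student_logits / T, dim=1)
    # The KL term is scaled by T^2 so its gradient magnitude stays comparable as T changes.
    kd_term = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
    # Hard-label cross-entropy on the ground-truth character classes.
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

# Minimal usage sketch: a larger teacher network distils into a ResNet18 student.
# teacher.eval()
# with torch.no_grad():
#     teacher_logits = teacher(images)
# student_logits = student(images)
# loss = distillation_loss(student_logits, teacher_logits, labels)
```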
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.