
# Class Imbalanced Fault Diagnosis via Combining K-Means Clustering Algorithm with Generative Adversarial Networks

## Huifang Li^{†}, Rui Fan, Qisong Shi, and Zijian Du

School of Automation, Beijing Institute of Technology

No.5 Zhongguancun South Street, Haidian District, Beijing 100081, China

^{†}Corresponding author

Recent advances in machine learning and communication technologies have enabled new approaches to automated fault detection and diagnosis in industrial systems. Because different classes of faults occur at widely varying frequencies, the class distribution of real-world industrial fault data is usually imbalanced. However, most prior machine learning-based classification methods do not take this imbalance into account; they therefore tend to be biased toward the majority classes and achieve poor accuracy on minority ones. To address this problem, we propose a fault diagnosis approach based on a *k*-means clustering generative adversarial network (KM-GAN) that reduces the imbalance in fault data and improves diagnostic accuracy for minority classes. First, we design a new oversampling method that combines the *k*-means clustering algorithm with a GAN to generate diverse minority-class samples that follow a distribution similar to that of the original minority data. The *k*-means clustering algorithm divides the minority-class samples into *k* clusters, and a GAN learns the data distribution of the resulting clusters and generates a given number of minority-class samples to supplement the original dataset. Then, we construct a heterogeneous ensemble model based on a deep neural network (DNN) and a deep belief network (DBN) as the fault classifier to improve generalization: the DNN and DBN models are trained separately on the supplemented dataset, and their outputs are averaged to give the final diagnostic result. A series of comparative experiments verifies the effectiveness of the proposed method, and the results show that it improves diagnostic accuracy for minority-class samples.
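The two stages described above can be illustrated with a minimal sketch, which is not the authors' implementation. It uses only the Python standard library, and every name in it (`kmeans`, `fit_gaussian`, `oversample_minority`, `average_ensemble`) is hypothetical. Two assumptions are worth stating plainly: a per-cluster Gaussian sampler stands in for the GAN (a real KM-GAN would train a generator and discriminator instead), and synthetic samples are allocated to clusters in proportion to cluster size, which the abstract does not specify.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def fit_gaussian(members):
    """Per-feature mean and standard deviation of one cluster."""
    cols = list(zip(*members))
    means = [sum(col) / len(col) for col in cols]
    stds = [(sum((x - m) ** 2 for x in col) / len(col)) ** 0.5
            for col, m in zip(cols, means)]
    return means, stds


def oversample_minority(minority, n_new, k=2, seed=0):
    """Cluster the minority class, then draw n_new synthetic samples."""
    rng = random.Random(seed)
    labels = kmeans(minority, k, seed=seed)
    # ASSUMPTION: a Gaussian per cluster replaces the GAN of the paper.
    models = {}
    for c in set(labels):
        members = [p for p, l in zip(minority, labels) if l == c]
        models[c] = fit_gaussian(members)
    synthetic = []
    for _ in range(n_new):
        # ASSUMPTION: clusters are drawn in proportion to their size.
        means, stds = models[rng.choice(labels)]
        synthetic.append([rng.gauss(m, s) for m, s in zip(means, stds)])
    return synthetic


def average_ensemble(probs_a, probs_b):
    """Average two models' class probabilities; return (argmax, averages)."""
    avg = [(a + b) / 2 for a, b in zip(probs_a, probs_b)]
    return avg.index(max(avg)), avg


# Toy minority class made of two well-separated groups of fault samples.
minority = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1],
            [5.0, 5.1], [5.1, 5.0], [4.9, 5.2]]
synthetic = oversample_minority(minority, n_new=10, k=2)

# Hypothetical class-probability outputs of the DNN and DBN for one sample.
label, avg = average_ensemble([0.6, 0.4], [0.1, 0.9])
```

Clustering before generation means each generator only has to model one mode of the minority data, which is the abstract's stated route to diverse synthetic samples. In the last line the two hypothetical models disagree, and averaging their outputs selects class index 1.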

*J. Adv. Comput. Intell. Intell. Inform.*, Vol.25 No.3, pp. 346-355, 2021.


This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.