Research Paper:
Research on Multidimensional Power Big Data Clustering Algorithm Based on Graph Mode
Xue Han*, Yue Zhang*, and Sheng Gao**
*State Grid East Inner Mongolia Information & Telecommunication Company
Hohhot, Inner Mongolia 010010, China
Corresponding author
**SICT Shenyang Institute of Computing Technology Co. Ltd., CAS
Shenyang, Liaoning 110000, China
Power system data comprise many characteristics and indicators; they are high-dimensional and contain redundant information, which increases computation and storage overhead. To reduce the dimension of power data, eliminate redundant information, and reduce the delay time, a data clustering algorithm is proposed. First, an algorithm based on principal component analysis (PCA) and kernel local Fisher discriminant analysis is used to reduce the dimension of large multidimensional samples and improve the accuracy of subsequent clustering. Next, redundant records in the dimension-reduced data are processed using a Bloom filter structure to improve data quality. Finally, building on this parallel processing of redundant data, data clustering is completed in the graph model. Simulation results show that the correctness and stability of this method both exceed 85% and that the delay time is reduced, indicating good application prospects.
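The abstract does not give the details of the Bloom filter structure, so the following is only a minimal sketch of how Bloom-filter-based redundancy removal might look. The class `BloomFilter`, the helper `drop_redundant`, the bit-array size, the number of hash functions, and the rounding precision are all assumptions made for illustration and are not taken from the paper.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter using double hashing (illustrative only)."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive all bit positions from two 64-bit halves of one SHA-256 digest.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step, never zero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def drop_redundant(records, decimals=3):
    """Keep the first occurrence of each record, compared after rounding.

    Hypothetical helper: 'redundant' is assumed here to mean 'identical
    after rounding each feature to `decimals` places'.
    """
    seen = BloomFilter()
    kept = []
    for rec in records:
        key = ",".join(f"{v:.{decimals}f}" for v in rec).encode()
        if key in seen:
            continue  # probably seen before (false positives are possible)
        seen.add(key)
        kept.append(rec)
    return kept


reduced = [(1.0, 2.0), (1.0, 2.0), (3.0, 4.0)]
unique = drop_redundant(reduced)
```

A Bloom filter never reports a stored key as absent, but it can report an unseen key as present, so a small fraction of distinct records may be discarded. The bit-array size and hash count control that trade-off between memory and false-positive rate.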

Schematic of parallel K-means clustering algorithm
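The schematic itself is not reproduced here, so the following is a generic sketch of one common way to parallelize K-means: each worker computes per-cluster sums and counts for its own data chunk, and the partial results are merged to update the centroids. The function names, the chunking scheme, and the use of a thread pool are assumptions for illustration and may differ from the scheme in the paper.

```python
from concurrent.futures import ThreadPoolExecutor


def _sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _partial_stats(chunk, centroids):
    """Map step: per-cluster coordinate sums and counts for one data chunk."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in chunk:
        j = min(range(len(centroids)), key=lambda c: _sq_dist(point, centroids[c]))
        counts[j] += 1
        for d, value in enumerate(point):
            sums[j][d] += value
    return sums, counts


def parallel_kmeans(points, init_centroids, n_workers=4, max_iter=50):
    centroids = [list(c) for c in init_centroids]
    chunks = [points[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(max_iter):
            # Each worker assigns the points of its chunk to the nearest centroid.
            results = list(pool.map(lambda ch: _partial_stats(ch, centroids), chunks))
            # Reduce step: merge partial sums and counts into new centroids.
            new_centroids = []
            for j, old in enumerate(centroids):
                total = sum(r[1][j] for r in results)
                if total == 0:
                    new_centroids.append(old)  # empty cluster keeps its centroid
                    continue
                new_centroids.append(
                    [sum(r[0][j][d] for r in results) / total for d in range(len(old))]
                )
            if new_centroids == centroids:
                break  # assignments no longer change
            centroids = new_centroids
    return centroids


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centers = parallel_kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

Threads are used only to show the partition-and-merge structure. In CPython they give little speedup for CPU-bound work, so a real deployment would more likely use processes or a distributed framework.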
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.