JACIII Vol.27 No.5 pp. 780-789
doi: 10.20965/jaciii.2023.p0780

Research Paper:

Prediction and Characteristic Analysis of Enterprise Digital Transformation Integrating XGBoost and SHAP

Dan Tang*,** and Jiangying Wei*,**,†

*School of Statistics, Huaqiao University
No.668 Jimei Avenue, Jimei District, Xiamen, Fujian 361021, China

**Institute of Quantitative Economics, Huaqiao University
No.668 Jimei Avenue, Jimei District, Xiamen, Fujian 361021, China

†Corresponding author

Received: December 30, 2022
Accepted: April 18, 2023
Published: September 20, 2023
Keywords: digital transformation, text mining, XGBoost, SHAP

Objective: An interpretable model of enterprise digital transformation that integrates XGBoost and Shapley additive explanations (SHAP) is proposed to accurately identify the important factors that affect the digital transformation of enterprises and how they act, to improve the digital capabilities and levels of enterprises, and to guard against the risks of digital transformation. Method: Annual reports of listed companies from 2009 to 2021 serve as the research data. First, a digital transformation index is constructed using text mining. Second, an enterprise digital transformation prediction model based on XGBoost is constructed and compared with other mainstream algorithms, such as linear regression and random forest, to find the best overall model. Finally, the SHAP interpretation framework is introduced to quantify and attribute the importance of each characteristic variable. Results: The XGBoost model outperformed the compared models on both the mean absolute error and R² performance indicators. In addition, development capability, comprehensive capability, and solvency are important characteristics influencing the digital transformation of enterprises, and they differ in the way, direction, and strength of their influence. Research value: This paper applies the XGBoost ensemble learning method to identify the factors of enterprise digital transformation, which enables enterprises to assess their digital transformation status, discover the key determinants of digital transformation, and adopt effective digital transformation modes for higher value.
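
As a rough illustration of the attribution step described above, the following minimal Python sketch computes exact Shapley values, the quantity on which SHAP is built, by enumerating feature coalitions. It is not the authors' implementation: `toy_model`, its coefficients, and the `background` rows are hypothetical stand-ins for a trained predictor and its reference data, and only the three feature names are borrowed from the abstract.

```python
from itertools import combinations
from math import factorial
from statistics import mean


def shapley_values(predict, x, background):
    """Exact Shapley values of each feature for a single instance x.

    Features outside a coalition are filled in from the background rows and
    the predictions are averaged (an assumed, simple value function).
    """
    n = len(x)

    def value(subset):
        # Average prediction when only the features in `subset` are fixed to x.
        preds = []
        for row in background:
            mixed = [x[j] if j in subset else row[j] for j in range(n)]
            preds.append(predict(mixed))
        return mean(preds)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(features):
    # Hypothetical predictor of a digital transformation index; the
    # coefficients are invented for illustration only.
    development, comprehensive, solvency = features
    return (2.0 * development + 1.0 * comprehensive
            - 0.5 * solvency + 0.5 * development * comprehensive)


# Hypothetical reference rows: [development, comprehensive, solvency].
background = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0],
]
x = [1.0, 1.0, 1.0]

phi = shapley_values(toy_model, x, background)
baseline = mean([toy_model(row) for row in background])

for name, contribution in zip(["development", "comprehensive", "solvency"], phi):
    print(f"{name}: {contribution:+.4f}")
print(f"baseline: {baseline:.4f}, prediction: {toy_model(x):.4f}")
```

The attributions sum to the gap between the prediction for `x` and the average prediction over the background rows, which is the additivity property that makes the contributions comparable across features. The sign of each value gives the direction of a feature's influence and its magnitude gives the strength. Exact enumeration grows exponentially with the number of features, so it is only practical for small illustrations like this one.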

SHAP feature analysis summary chart


Cite this article as:
D. Tang and J. Wei, “Prediction and Characteristic Analysis of Enterprise Digital Transformation Integrating XGBoost and SHAP,” J. Adv. Comput. Intell. Intell. Inform., Vol.27 No.5, pp. 780-789, 2023.
