Semantically Enhanced Code Clone Refinement Algorithm Based on Analysis of Multiple Detection Reports

Ricardo Sotolongo, Fangyan Dong, and Kaoru Hirota

Department of Computational Intelligence and Systems Science, Tokyo Institute of Technology, G3-49, 4259 Nagatsuta, Midori-ku, Yokohama 226-8502, Japan

October 14, 2010
December 21, 2010
May 20, 2011
Keywords: code clones, refinement, semantic analysis, WordNet
An algorithm for refining code clones is proposed, based on semantic analysis, using WordNet, of the reports produced by multiple detection tools. It parses the reports of different detection tools for new clone specifications and refines the locations of existing clones using semantic information contained in the source code. Applied to a real, complex software system and compared with three other well-known detection algorithms, it discovers 4888 more clone pairs than the average detected by the other tools and makes the code clones 3 lines longer (for a subset of the same system, the results are proportional to the size reduction). The objective is to provide a larger number of code clones with more appropriate localization for use in refactoring processes.
R. Sotolongo, F. Dong, and K. Hirota, “Semantically Enhanced Code Clone Refinement Algorithm Based on Analysis of Multiple Detection Reports,” J. Adv. Comput. Intell. Intell. Inform., Vol.15 No.3, pp. 322-328, 2011.
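
The two steps summarized in the abstract, collecting clone pairs from several tools' reports and then refining clone boundaries with semantic information, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the names `Fragment`, `merge_reports`, and `extend_pair` are hypothetical, the similarity threshold is arbitrary, and the small `SYNONYMS` table is only a stand-in for a WordNet lookup.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A code fragment: file path plus 1-based inclusive line range."""
    path: str
    start: int
    end: int


# Assumption: a tiny hand-written synonym table standing in for WordNet.
SYNONYMS = {"total": "sum", "amount": "sum"}


def concepts(line):
    """Map the words of a source line to canonical concepts."""
    words = re.findall(r"[A-Za-z]+", line.lower())
    return {SYNONYMS.get(w, w) for w in words}


def similarity(line_a, line_b):
    """Jaccard similarity between the concept sets of two lines."""
    ca, cb = concepts(line_a), concepts(line_b)
    if not ca or not cb:
        return 0.0
    return len(ca & cb) / len(ca | cb)


def merge_reports(reports):
    """Union the clone pairs of several tools, ignoring pair order."""
    merged = set()
    for report in reports:
        for a, b in report:
            pair = sorted((a, b), key=lambda f: (f.path, f.start, f.end))
            merged.add(tuple(pair))
    return merged


def extend_pair(pair, sources, threshold=0.5):
    """Grow both fragments downward while the next lines stay similar."""
    a, b = pair
    lines_a, lines_b = sources[a.path], sources[b.path]
    end_a, end_b = a.end, b.end
    # With 1-based inclusive ends, list index `end` is the following line.
    while (end_a < len(lines_a) and end_b < len(lines_b)
           and similarity(lines_a[end_a], lines_b[end_b]) >= threshold):
        end_a += 1
        end_b += 1
    return Fragment(a.path, a.start, end_a), Fragment(b.path, b.start, end_b)
```

In this sketch, `sources` maps each file path to its list of lines, and each report is a list of `(Fragment, Fragment)` pairs. A real implementation would also need to parse each tool's report format and would query WordNet instead of a fixed table.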
