Blending Code and Text Analysis to Detect Fragile Comments
Abstract
Refactoring is a common software development practice, and many simple refactorings can be performed automatically by tools. Identifier
renaming is a widely performed refactoring activity. With tool support, rename refactorings can rely on the program structure to ensure
correctness of the code transformation. Unfortunately, textual references to the renamed identifier that appear in unstructured comment
text cannot be detected through the syntax of the language, and are thus fragile with respect to identifier renaming. We designed a new
rule-based approach to detect fragile comments. Our approach, called Fraco, takes into account the type of the identifier, its morphology, its
scope, and the location of comments. We evaluated the approach by comparing its precision and recall against hand-annotated
benchmarks created for six target Java systems, and compared the results against the performance of Eclipse’s automated in-comment identifier
replacement feature. Fraco performed with near-optimal precision and recall on most components of our evaluation data set, and generally
outperformed the baseline Eclipse feature. As part of our evaluation, we also noted that more than half of the identifiers in our
data set had fragile comments after renaming, which further motivates the need for research on automatic comment refactoring.
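To make the notion of a fragile comment concrete, the following is a minimal Java sketch. It is not Fraco's rule set: the class FragileCommentSketch, its method names, and the two matching rules (verbatim whole-word match, and a match on the identifier's camelCase words with an optional plural "s") are assumptions introduced here purely for illustration. It shows why a comment such as "Returns the item count." can go stale when a field named itemCount is renamed, even though the identifier never appears verbatim in the comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical illustration only; these are NOT Fraco's actual rules.
class FragileCommentSketch {

    // Splits a camelCase identifier into lowercase words, e.g. "itemCount" -> [item, count].
    public static List<String> splitCamelCase(String identifier) {
        List<String> words = new ArrayList<>();
        for (String part : identifier.split("(?<=[a-z0-9])(?=[A-Z])")) {
            words.add(part.toLowerCase(Locale.ROOT));
        }
        return words;
    }

    // True if the comment contains the identifier verbatim as a whole word.
    public static boolean mentionsExactly(String comment, String identifier) {
        return Pattern.compile("\\b" + Pattern.quote(identifier) + "\\b")
                .matcher(comment)
                .find();
    }

    // True if the comment contains the identifier's words as a phrase,
    // ignoring case and allowing a trailing plural "s" (assumed morphology rule).
    public static boolean mentionsAsPhrase(String comment, String identifier) {
        List<String> quoted = new ArrayList<>();
        for (String word : splitCamelCase(identifier)) {
            quoted.add(Pattern.quote(word));
        }
        String regex = "\\b" + String.join("\\s+", quoted) + "s?\\b";
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
                .matcher(comment)
                .find();
    }

    // A comment is treated as fragile if either rule finds a reference.
    public static boolean isFragile(String comment, String identifier) {
        return mentionsExactly(comment, identifier) || mentionsAsPhrase(comment, identifier);
    }

    public static void main(String[] args) {
        String identifier = "itemCount";
        String[] comments = {
            "Updates itemCount after a removal.",
            "Returns the item count.",
            "Resets the counter."
        };
        for (String comment : comments) {
            System.out.println(isFragile(comment, identifier) + " <- " + comment);
        }
    }
}
```

A purely textual check like this ignores the identifier's type and scope and the location of the comment, which the abstract names as inputs to Fraco; it is meant only to show the kind of reference that language syntax alone cannot reveal.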
Bio
Martin Robillard is a Professor of Computer Science at McGill University. His current research focuses on problems related to software
evolution, architecture and design, and software reuse. He served as the Program Co-Chair for the 20th ACM SIGSOFT International Symposium on the
Foundations of Software Engineering (FSE 2012) and the 39th ACM/IEEE International Conference on Software Engineering (ICSE 2017). He
received his Ph.D. and M.Sc. in Computer Science from the University of British Columbia and a B.Eng. from École Polytechnique de Montréal.