An Extensive Study of Independent Comment Changes in Java Projects

Under review, 2020

Recommended citation: Chao Wang, Hao He, Uma Paroma, Darko Marinov, and Minghui Zhou. An Extensive Study of Independent Comment Changes in Java Projects. Under Review. Not Available

Abstract

While code comments are valuable for software development, code often has low-quality comments or misses comments altogether, which we call suboptimal comments. Such suboptimal comments create challenges in code comprehension and maintenance. Despite substantial research on suboptimal comments, empirical knowledge about why comments are suboptimal is lacking, affecting commenting practice and related research. We help bridge this knowledge gap by investigating independent comment changes (comment changes committed independently of code changes), which likely attempt to address suboptimal comments. We collect 23M+ comment changes from 4,410 open-source Java repositories and find that ∼16% of comment changes are independent, indicating that a considerable number of comments may be suboptimal. Our thematic analysis of 3,600 randomly sampled independent comment changes provides a two-dimensional taxonomy of what is changed (comment category) and how it is changed (commenting activity category). We find that some combinations of comment and activity categories have a relatively high frequency although those comments are not a large proportion of all comments; the reason may be that some comments easily become obsolete or inconsistent. By further inspecting extensive related materials for these independent comment changes, and validating our findings with a survey of 33 developer respondents, we find four reasons for suboptimal comments: belief in future actions, lack of comment guidelines, ineffective use of tools, and legacy. We finally provide implications for project maintainers, researchers, and tool designers.

Download Paper Here