Posts by Collection

portfolio

ShootGame

A Minimal 2D Shooter Game Implemented in Java

RISC-V Simulator

A Simple RISC-V CPU Simulator with 5 Stage Pipeline, Branch Prediction and Cache Simulation

Intelligent and Secure Library Migration Recommendation

Library migration is a common development activity during software evolution. To support this activity, we design a multi-metric ranking algorithm to mine library migrations from large-scale open-source data. We further develop MigrationAdvisor, a demo tool to recommend library migrations. The backend data have been deployed in an internal tool at Huawei.

publications

Poster: Retroreflective MIMO communication

Published in Proceedings of the 20th International Workshop on Mobile Computing Systems and Applications, 2019

We propose a retroreflective MIMO channel design based on polarization division multiplexing (PDM), using multiple LCD modulators and photodiode (PD) receivers. An LCD shutter works as a bi-state modulator that rotates polarized light by 0° or 90°. With a polarizer on each side of the LCD, it either retroreflects incoming light or absorbs it. The retroreflected light is polarized to the angle of the front polarizer; this polarization is imperceptible to human eyes but can be separated using polarizers on the PD receivers.
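The separation of polarized channels at the receivers can be illustrated with Malus's law, which gives the fraction of linearly polarized light that passes an ideal polarizer. The following is a minimal sketch, not the poster's actual system model: the two channel angles (0° and 90°), the receiver names, and the function names are all assumptions made for illustration, and real LCD shutters and polarizers are not ideal.

```python
import math


def transmitted_fraction(light_angle_deg: float, polarizer_angle_deg: float) -> float:
    """Malus's law: fraction of intensity passing an ideal linear polarizer."""
    theta = math.radians(light_angle_deg - polarizer_angle_deg)
    return math.cos(theta) ** 2


# Hypothetical setup: two channels with orthogonal polarization angles,
# and one photodiode receiver behind a polarizer aligned with each channel.
CHANNEL_ANGLES = {"channel_0": 0.0, "channel_90": 90.0}
RECEIVER_ANGLES = {"pd_0": 0.0, "pd_90": 90.0}


def coupling_matrix() -> dict:
    """Fraction of each channel's intensity that reaches each receiver."""
    return {
        rx: {
            ch: transmitted_fraction(ch_angle, rx_angle)
            for ch, ch_angle in CHANNEL_ANGLES.items()
        }
        for rx, rx_angle in RECEIVER_ANGLES.items()
    }


if __name__ == "__main__":
    for rx, row in coupling_matrix().items():
        print(rx, {ch: round(value, 6) for ch, value in row.items()})
```

Under these idealized assumptions each receiver sees its own channel at full intensity and the orthogonal channel at essentially zero, which is why orthogonal polarizations can carry independent streams.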

Recommended citation: Yue Wu, Kenuo Xu, Hao He, Zihang Wu and Chenren Xu. "Poster: Retroreflective MIMO Communication." Proceedings of the 20th International Workshop on Mobile Computing Systems and Applications. ACM, 2019.

Understanding Source Code Comments at Large-Scale

Published in Proceedings of the 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ’19), 2019

Source code comments are important for any software, but the basic patterns of writing comments across domains and programming languages remain unclear. In this paper, we take a first step toward understanding differences in commenting practices by analyzing the comment density of 150 projects in 5 different programming languages. We have found that there are noticeable differences in comment density, which may be related to the programming language used in the project and the purpose of the project.
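As a rough illustration of what a comment density measurement might look like, the sketch below counts comment lines as a share of non-blank lines. This is an assumption for illustration only: the paper's exact definition and tooling are not given here, the function name is hypothetical, and the sketch only recognizes whole-line comments with a given prefix (block comments and trailing comments are ignored).

```python
def comment_density(source: str, line_prefixes: tuple = ("//", "#")) -> float:
    """Share of non-blank lines that are whole-line comments.

    Simplified, assumed definition: a line counts as a comment only if,
    after stripping whitespace, it starts with one of `line_prefixes`.
    """
    lines = [line.strip() for line in source.splitlines() if line.strip()]
    if not lines:
        return 0.0
    comment_lines = sum(1 for line in lines if line.startswith(line_prefixes))
    return comment_lines / len(lines)


if __name__ == "__main__":
    sample = """
    // Compute the answer.
    int x = 1;

    // Return it to the caller.
    return x;
    """
    print(comment_density(sample))
```

A study across languages would need a per-language comment parser; the prefix tuple here is only a stand-in for that step.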

Recommended citation: Hao He. Understanding Source Code Comments at Large-Scale. In Proceedings of the 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ’19), August 26–30, 2019, Tallinn, Estonia. ACM, New York, NY, USA, 3 pages.

Download Paper Here

An Extensive Study of Independent Comment Changes in Java Projects

Published in Preparing for Another Submission..., 2020

While code comments are valuable for software development, code often has low-quality comments or misses comments altogether, which we call suboptimal comments. Such suboptimal comments create challenges in code comprehension and maintenance. Despite substantial research on suboptimal comments, empirical knowledge about why comments are suboptimal is lacking, affecting commenting practice and related research. We help bridge this knowledge gap by investigating independent comment changes (comment changes committed independently of code changes), which likely attempt to address suboptimal comments. We collect 23M+ comment changes from 4,410 open-source Java repositories and find that ~16% of comment changes are independent, indicating that a considerable number of comments may be suboptimal. Our thematic analysis of 3,600 randomly sampled independent comment changes provides a two-dimensional taxonomy of what is changed (comment category) and how it is changed (commenting activity category). We find that some combinations of comment and activity categories have a relatively high frequency although those comments are not a large proportion of all comments; the reason may be that some comments easily become obsolete or inconsistent. By further inspecting extensive related materials for these independent comment changes, and validating our findings with a survey of 33 developer respondents, we find four reasons for suboptimal comments: belief in future actions, lack of comment guidelines, ineffective use of tools, and legacy. We finally provide implications for project maintainers, researchers, and tool designers.

Recommended citation: Chao Wang, Hao He, Uma Paroma, Darko Marinov, and Minghui Zhou. An Extensive Study of Independent Comment Changes in Java Projects. Preparing for Another Submission...

Download Paper Here

MigrationAdvisor: Recommending Library Migration from GitHub Repositories

Published in Submitted to ICSE 2021 Demonstration Track, 2020

During software maintenance, developers may need to migrate an already used library to another library with similar functionalities. However, it is difficult to make the optimal migration decision with limited information, knowledge, or expertise. In this paper, we present MigrationAdvisor, a tool that objectively recommends library migration targets through intelligent analysis of existing GitHub repositories. We have conducted systematic evaluations of the correctness of its recommendations, and plan to evaluate the usefulness of our tool by collecting developer usage feedback in an industry context. An online introduction video is available at https://youtu.be/JZaVMWFfQO4

Recommended citation: Hao He, Yulin Xu, Xiao Cheng, Guangtai Liang and Minghui Zhou. MigrationAdvisor: Recommending Library Migration from GitHub Repositories. Submitted to ICSE 2021 Demonstrations.

Download Paper Here

A Multi-Metric Ranking Approach for Library Migration Recommendations

Published in Proceedings of the 28th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2021), 2021

The wide adoption of third-party libraries in software projects is beneficial but also risky. Third-party libraries may have security vulnerabilities, may be abandoned by their maintainers, or may no longer align with current project requirements. Under such circumstances, developers need to migrate a library to another library with similar functionalities, but the migration decisions are often opinion-based and sub-optimal with limited information at hand. Therefore, several filtering-based approaches have been proposed to mine library migrations from existing software data, but they suffer from either low precision or low recall with different thresholds, which limits their usefulness in supporting migration decisions. In this paper, we present a novel approach that utilizes multiple metrics to rank and therefore recommend library migrations. Given a library to migrate, our approach first generates candidate target libraries from a large corpus of software repositories, and then ranks them by combining the following four metrics to capture different dimensions of evidence from development history: Rule Support, Message Support, Distance Support, and API Support. We evaluate the performance of our approach with 773 migration rules (190 source libraries) that we borrow from previous work and recover from 21,358 Java GitHub projects. The experiments show that our metrics are effective in helping identify real migration targets among other libraries, and our approach significantly outperforms existing works, with MRR of 0.8566, top-1 precision of 0.7947, top-10 NDCG of 0.7665, and top-20 recall of 0.8939. To demonstrate the generality of our approach, we manually verify the recommendation results of the 480 most popular libraries and confirm 661 new migration rules from 231 libraries with comparable performance.
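For readers unfamiliar with the ranking measures quoted above, here is a minimal sketch of how MRR and top-k recall are commonly computed for ranked recommendations. It uses the standard textbook definitions, which may differ in detail from the paper's evaluation setup; the function names and the library names in the example are hypothetical.

```python
def reciprocal_rank(ranked: list, relevant: set) -> float:
    """1 / rank of the first relevant item, or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(results: dict, truth: dict) -> float:
    """Average reciprocal rank over all source libraries in `truth`."""
    return sum(reciprocal_rank(results[q], truth[q]) for q in truth) / len(truth)


def top_k_recall(results: dict, truth: dict, k: int) -> float:
    """Share of all true targets that appear in the top k recommendations."""
    found = sum(len(set(results[q][:k]) & truth[q]) for q in truth)
    total = sum(len(truth[q]) for q in truth)
    return found / total


if __name__ == "__main__":
    # Hypothetical ranked recommendations per source library.
    results = {
        "source-lib-1": ["cand-x", "cand-y", "cand-z"],
        "source-lib-2": ["cand-p", "cand-q"],
    }
    # Hypothetical ground-truth migration targets per source library.
    truth = {
        "source-lib-1": {"cand-y"},
        "source-lib-2": {"cand-p", "cand-r"},
    }
    print(mean_reciprocal_rank(results, truth))  # 0.75
    print(top_k_recall(results, truth, k=2))
```

In the example, the first relevant candidate sits at rank 2 for one library and rank 1 for the other, giving an MRR of (1/2 + 1) / 2 = 0.75.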

Recommended citation: Hao He, Yulin Xu, Yixiao Ma, Yifei Xu, Guangtai Liang and Minghui Zhou. A Multi-Metric Ranking Approach for Library Migration Recommendations. In Proceedings of the 28th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2021). Acceptance Rate: 25.5% (42/165). PDF. Chinese version.

Download Paper Here

talks

teaching

Introduction to Computer Systems, Teaching Assistant, Fall 2018

Undergraduate course, Peking University, School of Electronic Engineering and Computer Science, 2018

Introduction to Computer Systems is an undergraduate course at Peking University. The course originates from the well-known CMU 15-213 course. It covers a wide range of selected topics from system programming, computer organization, operating systems, and networks. Up to 400 prospective computer science students take this course each year.

Introduction to Computation (C), Teaching Assistant, Fall 2020

Undergraduate course, Peking University, School of Electronic Engineering and Computer Science, 2020

Introduction to Computation (C) is an undergraduate course at Peking University. It is an introductory programming course for students majoring in the liberal arts (literature, foreign languages, history, etc.).