A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.



Mining GitHub Repository Information using the Official REST API

2 minute read


GitHub provides an HTTP API (not very convenient or well documented) for requesting information from GitHub. We can use it to request repository information in JSON format. You can apply various search conditions and sort the results if necessary. For example, if you want to collect the 1000 most-starred repositories whose language is Java, you can use the following request.
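The request itself is not reproduced in this excerpt. As a rough sketch only, the query could be built against GitHub's repository search endpoint as below. The helper name `build_search_urls` is hypothetical, and the sketch assumes the search API returns at most 100 results per page, so 1000 repositories need ten paged requests.

```python
from urllib.parse import urlencode

# GitHub REST API endpoint for repository search.
SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_urls(language, total=1000, per_page=100):
    """Return the paged search URLs needed to list `total` repositories.

    Hypothetical helper: it only builds URLs and sends no requests.
    """
    pages = (total + per_page - 1) // per_page  # ceiling division
    urls = []
    for page in range(1, pages + 1):
        query = urlencode({
            "q": f"language:{language}",  # search condition
            "sort": "stars",              # sort by star count
            "order": "desc",              # most starred first
            "per_page": per_page,
            "page": page,
        })
        urls.append(f"{SEARCH_URL}?{query}")
    return urls


urls = build_search_urls("java")
```

Each URL can then be fetched with any HTTP client and the JSON body parsed; unauthenticated requests are rate limited, so a real crawler would probably send a token and pause between pages.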

Implementing a RISC-V CPU Simulator Based on a Five-Stage Pipeline

16 minute read


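RISC-V is an open-source architecture and instruction set standard that originated at Berkeley. This simulator implements the RV64I instruction set defined in RISC-V Specification 2.2, is based on the standard five-stage pipeline, and includes a branch prediction module and virtual memory simulation. Implementing a complete CPU simulator is good practice for systems programming and deepens your understanding of computer architecture. Before starting, you should read and thoroughly understand Chapter 4 of Computer Systems: A Programmer’s Perspective, or the relevant chapters of Computer Organization and Design: The Hardware/Software Interface.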
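As a small illustration of what the decode stage of such a simulator has to do, the sketch below extracts the fixed-position fields of a 32-bit RISC-V instruction word. It is not taken from the simulator itself; the function names `decode_fields` and `imm_i` are hypothetical, and only the bit positions come from the RISC-V base instruction formats.

```python
def decode_fields(inst):
    """Split a 32-bit instruction word into its fixed-position fields."""
    return {
        "opcode": inst & 0x7F,          # bits 6..0
        "rd": (inst >> 7) & 0x1F,       # bits 11..7
        "funct3": (inst >> 12) & 0x7,   # bits 14..12
        "rs1": (inst >> 15) & 0x1F,     # bits 19..15
        "rs2": (inst >> 20) & 0x1F,     # bits 24..20
        "funct7": (inst >> 25) & 0x7F,  # bits 31..25
    }


def imm_i(inst):
    """Sign-extended 12-bit immediate of an I-type instruction (bits 31..20)."""
    imm = (inst >> 20) & 0xFFF
    return imm - 0x1000 if imm & 0x800 else imm


add_fields = decode_fields(0x002081B3)  # encoding of: add x3, x1, x2
addi_imm = imm_i(0xFFF00093)            # encoding of: addi x1, x0, -1
```

A full decoder would go on to dispatch on `opcode`, `funct3`, and `funct7` and to handle the S, B, U, and J immediate layouts.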

Building Event System in Unity3D

3 minute read


When I was developing a simple 3D game using Unity 3D, I found it non-trivial to build an event system that could handle dynamic game events efficiently and elegantly.

Hidden Markov Model

1 minute read


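The Hidden Markov Model (HMM) is an important machine learning model. Intuitively, it addresses problems of the following kind: something has a state that we cannot observe at a given time (or position), but we have a reference object whose state at that time (or position) we do know, and we assume that the reference object's state is related to the hidden one. We can then use machine learning to infer which state the original object is most likely in at a given time (or position). In other words, it is a model based on probability and statistics.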
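One common way to infer the most likely hidden states from the observed ones is the Viterbi algorithm. The excerpt does not say which algorithm the post uses, so the following is only a minimal sketch; the state names and all probabilities are made-up illustration values, not data from the post.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden-state sequence."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        layer = {}
        for s in states:
            layer[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(layer)
    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for layer in reversed(best[1:]):
        path.append(layer[path[-1]][1])
    path.reverse()
    return prob, path


# Illustrative numbers only: hidden health state, observed symptom.
states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit_p = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}
prob, path = viterbi(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p)
```

Here the symptom plays the role of the reference object and the health state is the hidden one. For long sequences, a practical version would work with log-probabilities to avoid underflow.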


less than 1 minute read




less than 1 minute read


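In fact, the $H(\vec{x},t)$ written in the previous post is the summation form of a two-dimensional Fourier transform. The earlier brute-force computation is a two-dimensional Discrete Fourier Transform (DFT); using a two-dimensional Fast Fourier Transform (FFT) reduces the complexity from $O(n^4)$ to $O(n^2\log{n})$.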
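The complexity gap can be seen in a small sketch that compares a direct 2D DFT with a row-column FFT. This is not the post's code: the function names are hypothetical, the forward-transform sign convention $e^{-2\pi i(ux+vy)/n}$ is an assumption that may differ from the one used for $H(\vec{x},t)$, and the FFT shown is a plain radix-2 version that assumes $n$ is a power of two.

```python
import cmath


def dft2_naive(grid):
    """Direct 2D DFT: n^2 outputs, each a sum over n^2 inputs, so O(n^4)."""
    n = len(grid)
    out = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0j
            for x in range(n):
                for y in range(n):
                    angle = -2 * cmath.pi * (u * x + v * y) / n
                    total += grid[x][y] * cmath.exp(1j * angle)
            out[u][v] = total
    return out


def fft1(values):
    """Recursive radix-2 Cooley-Tukey FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft1(values[0::2])
    odd = fft1(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft2(grid):
    """2D FFT: transform every row, then every column, so O(n^2 log n)."""
    rows = [fft1(row) for row in grid]
    cols = [fft1(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]
```

The row-column split works because the 2D kernel factors into a product of two 1D kernels, so the transform amounts to $2n$ one-dimensional FFTs of length $n$.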


less than 1 minute read




5 minute read




RISC-V Simulator

A Simple RISC-V CPU Simulator with 5 Stage Pipeline, Branch Prediction and Cache Simulation


A Minimal 2D Shooter Game Implemented in Java


Poster: Retroreflective MIMO communication

Published in Proceedings of the 20th International Workshop on Mobile Computing Systems and Applications, 2019

We propose to design a retroreflective MIMO channel based on polarization division multiplexing (PDM), with multiple LCD modulators and photodiode (PD) receivers. The LCD shutter works as a bi-state modulator that rotates polarized light by 0° or 90°. With a polarizer on each side of the LCD, it can either retroreflect incoming light or absorb it. The retroreflected light is polarized to the angle of the front polarizer, which is imperceptible to human eyes but can be separated using polarizers on the PD receivers.

Recommended citation: Yue Wu, Kenuo Xu, Hao He, Zihang Wu and Chenren Xu. "Poster: Retroreflective MIMO Communication." Proceedings of the 20th International Workshop on Mobile Computing Systems and Applications. ACM, 2019.

Understanding Source Code Comments at Large-Scale

Published in Proceedings of the 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ’19), 2019

Source code comments are important for any software, but the basic patterns of writing comments across domains and programming languages remain unclear. In this paper, we take a first step toward understanding differences in commenting practices by analyzing the comment density of 150 projects in 5 different programming languages. We have found that there are noticeable differences in comment density, which may be related to the programming language used in the project and the purpose of the project.

Recommended citation: Hao He. Understanding Source Code Comments at Large-Scale. In Proceedings of the 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ’19), August 26–30, 2019, Tallinn, Estonia. ACM, New York, NY, USA, 3 pages.

An Extensive Study of Independent Comment Changes in Java Projects

Under review, 2020

While code comments are valuable for software development, code often has low-quality comments or misses comments altogether, which we call suboptimal comments. Such suboptimal comments create challenges in code comprehension and maintenance. Despite substantial research on suboptimal comments, empirical knowledge about why comments are suboptimal is lacking, affecting commenting practice and related research. We help bridge this knowledge gap by investigating independent comment changes (comment changes committed independently of code changes), which likely attempt to address suboptimal comments. We collect 23M+ comment changes from 4,410 open-source Java repositories and find that ∼16% of comment changes are independent, indicating a considerable amount of comments may be suboptimal. Our thematic analysis of 3,600 randomly sampled independent comment changes provides a two-dimensional taxonomy about what is changed (comment category) and how it changed (commenting activity category). We find some combinations of comment and activity categories have a relatively high frequency although those comments are not a large proportion of all comments; the reason may be that some comments easily become obsolete/inconsistent. By further inspecting extensive related materials for these independent comment changes, and validating it with a survey of 33 developer respondents, we find four reasons for suboptimal comments: belief in future actions, lack of comment guidelines, ineffective use of tools, and legacy. We finally provide implications for project maintainers, researchers, and tool designers.

Recommended citation: Chao Wang, Hao He, Uma Paroma, Darko Marinov, and Minghui Zhou. An Extensive Study of Independent Comment Changes in Java Projects. Under Review. Not Available



Introduction to Computer Systems, Teaching Assistant, Fall 2018

Undergraduate course, Peking University, School of Electronic Engineering and Computer Science, 2018

Introduction to Computer Systems is an undergraduate course at Peking University. This course originates from the famous CMU 15-213 course. It covers a wide range of selected topics from system programming, computer organization, operating systems, and networks. Up to 400 prospective computer science students take this course each year.