<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://hehao98.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://hehao98.github.io/" rel="alternate" type="text/html" /><updated>2026-01-28T11:00:11-08:00</updated><id>https://hehao98.github.io/feed.xml</id><title type="html">Hao He</title><subtitle></subtitle><author><name>Hao He</name><email>haohe@andrew.cmu.edu</email></author><entry><title type="html">Writing Release Notes for Your Software: How to Get it Right</title><link href="https://hehao98.github.io/posts/2022/3/8/release-note" rel="alternate" type="text/html" title="Writing Release Notes for Your Software: How to Get it Right" /><published>2022-03-08T00:00:00-08:00</published><updated>2022-03-08T00:00:00-08:00</updated><id>https://hehao98.github.io/posts/2022/3/8/release-note</id><content type="html" xml:base="https://hehao98.github.io/posts/2022/3/8/release-note"><![CDATA[<p>Release notes are important. However, there is a lack of tutorials or widely acknowledged standards on how to produce them. Without “the right way,” release notes may cause all kinds of issues. In this article, we provide an FAQ-style introduction on how to produce the “right” release notes for your users, based on <a href="https://hehao98.github.io/publication/2022-release-notes">our recent research</a> on ~1000 real release note issues in GitHub projects. This is still a preliminary draft, so if you have any suggestions or critiques, feel free to comment below!</p>

<p>This article is also available on <a href="https://osslab-pku.github.io/2022/03/10/Release-Note/">our lab website</a>!</p>

<h2 id="why-producing-a-release-note">Why Produce Release Notes?</h2>

<p>In short, release notes <strong>convey the impact of changes</strong> to your downstream users. In the GitHub context, your users are most often people who use your code, for example by importing it through a package manager or by downloading your tool or application. Without release notes, users will not know what to do with your new version. They typically have questions like: What are the advantages of the new version? Should any part of my code, configuration, or workflow be adjusted? A well-written release note answers such questions. <strong>As long as your software project has gained a non-trivial user base and is constantly evolving, you should consider starting to provide release notes for its new releases.</strong></p>

<h2 id="why-should-i-learn-how-to-produce-release-notes">Why Should I Learn How to Produce Release Notes?</h2>

<p>If your release notes are confusing, hard to understand, or missing important information (e.g., breaking changes), <strong>they will greatly frustrate your users when they update</strong>. Some users may open issues to complain (<em>which is what we have observed and systematically analyzed</em>), but most probably will not. Ultimately, such negative experiences may drive them to alternatives and leave your project behind its competition. It pays to learn from others’ lessons before you start working on release notes.</p>

<h2 id="what-should-i-include-in-release-notes">What Should I Include in Release Notes?</h2>

<p>In principle, <strong>all user-visible or user-affecting changes should be included</strong>, such as:</p>

<ol>
  <li>Breaking / backward-incompatible changes (to your APIs, command-line interfaces, etc.). If you do not announce them, they will immediately frustrate some users when they update.</li>
  <li>New features</li>
  <li>Fixed bugs</li>
  <li>Non-functional enhancements (performance, memory, etc.)</li>
  <li>Documentation changes</li>
  <li>Dependency or environment changes (e.g., if you raise the minimum supported version of a dependency, this affects people still using older versions!)</li>
  <li>License changes (a new license may rule out some of your users)</li>
  <li>Security changes (e.g., relevant security vulnerabilities and CVEs)</li>
</ol>

<p>We found in our analysis that <strong>breaking changes are especially likely to be missing from release notes</strong>. One reason is that breaking changes have the biggest impact (e.g., program crashes) yet are hard to detect. Unfortunately, automated detection of breaking changes remains a formidable research challenge. We therefore suggest manually and carefully inspecting whether any change in a release will break many clients. For large projects, it may be desirable to distribute this workload among all developers (e.g., by using <a href="https://www.conventionalcommits.org/en/v1.0.0-beta.4/">Angular Conventional Commits</a> to label breaking changes at commit time). Apart from breaking changes, <strong>dependency, license, and security-related changes are also frequently missing.</strong></p>
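
<p>As a minimal sketch of how such commit labels could later be picked up when drafting release notes, the Python snippet below flags commit messages that carry the Conventional Commits markers for breaking changes: a <code>!</code> before the colon in the header, or a <code>BREAKING CHANGE:</code> footer. The function names are hypothetical, the pattern only approximates the specification, and the check is an aid for, not a replacement of, manual inspection.</p>

```python
import re

# Approximate header shape: type, optional "(scope)", optional "!", then ": description".
HEADER = re.compile(r"^(\w+)(\([^)]*\))?(!)?: (.+)$")
# Footer marker for breaking changes; the hyphenated spelling is accepted as well.
FOOTER = re.compile(r"^BREAKING[ -]CHANGE: ", re.MULTILINE)


def is_breaking(message):
    """Return True if a commit message is marked as a breaking change."""
    lines = message.strip().splitlines()
    if not lines:
        return False
    header = HEADER.match(lines[0])
    has_bang = bool(header and header.group(3))
    return has_bang or bool(FOOTER.search(message))


def breaking_changes(messages):
    """Return the header line of every commit message marked as breaking."""
    return [m.strip().splitlines()[0] for m in messages if is_breaking(m)]
```

<p>Calling <code>breaking_changes</code> on the commit messages between two releases gives a starting list for the breaking changes section, which a maintainer can then review and extend by hand.</p>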

<p>For each change, it is very useful to <strong>add some supplementary information</strong>, including:</p>

<ol>
  <li>For each change, links to related pull requests, issues, and commits, for readers who want more information.</li>
  <li>For any backward incompatible change, guides for migration, mitigation, or additional setup required.</li>
  <li>For any intriguing new features or migration guides, code examples to explain them.</li>
  <li>Known issues, if there are any, so that users know the downsides and can decide whether to update.</li>
  <li>Attributions, to acknowledge the people who have contributed to the release!</li>
</ol>

<h2 id="how-should-i-organize-release-notes">How Should I Organize Release Notes?</h2>

<p>Although there are a lot of changes to describe, this does not necessarily mean that you should flood your users with overwhelming and hard-to-grasp information. In fact, release note organization is very important so that users can quickly find the information they want.</p>

<p><strong>One common anti-pattern in release note production is to simply aggregate and list all commit messages or issue titles between two versions.</strong> At minimum, these commit messages and issues need to be organized into categories for quick reference. Better still, write a short summary that highlights the most important changes and piques your users’ interest.</p>
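
<p>To illustrate the difference between a raw list and a categorized one, here is a small Python sketch that groups commit titles into sections. It assumes titles follow a <code>type(scope): description</code> shape; the section names and the type-to-section mapping are hypothetical choices, not a standard.</p>

```python
from collections import defaultdict

# Hypothetical mapping from commit type to release note section.
SECTIONS = {
    "feat": "New Features",
    "fix": "Bug Fixes",
    "perf": "Performance",
    "docs": "Documentation",
}


def group_changes(titles):
    """Group commit titles such as 'fix: handle empty input' by section."""
    grouped = defaultdict(list)
    for title in titles:
        prefix, sep, description = title.partition(": ")
        breaking = prefix.endswith("!")
        # Strip the breaking marker and an optional "(scope)" suffix.
        commit_type = prefix.rstrip("!").split("(")[0]
        if sep and breaking:
            section = "Breaking Changes"
        elif sep and commit_type in SECTIONS:
            section = SECTIONS[commit_type]
        else:
            # Unrecognized titles are kept verbatim so nothing is lost.
            section, description = "Other Changes", title
        grouped[section].append(description)
    return dict(grouped)


def render(grouped):
    """Render the grouped changes as plain text, breaking changes first."""
    order = ["Breaking Changes", *SECTIONS.values(), "Other Changes"]
    lines = []
    for section in order:
        if section in grouped:
            lines.append(section)
            lines.extend("  - " + item for item in grouped[section])
    return "\n".join(lines)
```

<p>The output is only a skeleton: the highlights and the user-facing wording still have to be written by a person.</p>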

<p><strong>If your project is huge, consider further organizing changes by module</strong>. Each module can have its own release notes, or its own section within a single release note.</p>

<h2 id="can-i-automate-the-process">Can I Automate the Process?</h2>

<p>Of course you can. There are several well-developed tools you can use: <a href="https://github.com/semantic-release/semantic-release">Semantic Release</a>, <a href="https://github.com/github-changelog-generator/github-changelog-generator">github-changelog-generator</a>, <a href="https://github.com/release-it/release-it">Release It</a>, <a href="https://github.com/release-drafter/release-drafter">Release Drafter</a>. Refer to their documentation on how to integrate them with your project. All these tools require you to <strong>adopt some way to systematically organize all changes</strong>, for example by following a commit message template or by adding labels to your issues and pull requests. It is therefore important to have a well-defined process for managing changes before using these tools to automate the release process!</p>

<p>Although automation is appealing, we found that <strong>users frequently complain that generated release notes are uninformative</strong>, because the commit messages, issue titles, or pull request titles they are built from were not clear and user-friendly to begin with. If you want to generate good release notes, you also need to pay special attention to the quality of your generation sources!</p>

<h2 id="how-to-make-release-notes-more-accessible-to-my-users">How Do I Make Release Notes (More) Accessible to My Users?</h2>

<p>Accessibility does not come for free. We found that <strong>users frequently complain about difficulty accessing release notes, due to broken links, release notes missing from the repository, missing notifications, etc.</strong> To avoid these issues, it is important to make some effort to publicize your release notes!</p>

<p>We advise publishing release notes in the following places:</p>

<ol>
  <li>GitHub release pages</li>
  <li>Project websites</li>
  <li>In-app notifications, if you are developing an app</li>
  <li>Community channels (mailing list, Slack, Discord, etc.), if your software has a community</li>
  <li>Git repositories, if you want collaborative editing prior to release</li>
</ol>

<p>Pay attention to the <strong>links</strong> between these places. Links tend to break over time, and a broken link can be frustrating.</p>
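
<p>One hedged way to catch link rot early is to check the links in a release note before publishing it. The Python sketch below uses only the standard library; the URL pattern is a rough approximation, the function names are hypothetical, and some servers reject <code>HEAD</code> requests, so a reported failure should be confirmed by hand.</p>

```python
import re
import urllib.request

# Rough pattern for http(s) URLs; stops at whitespace, quotes, and closing brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"')\]]+")


def extract_links(text):
    """Return the distinct http(s) URLs found in a release note, in order."""
    seen = []
    for url in URL_PATTERN.findall(text):
        url = url.rstrip(".,;")  # drop trailing sentence punctuation
        if url not in seen:
            seen.append(url)
    return seen


def is_reachable(url, timeout=10):
    """Return True if a HEAD request succeeds (requires network access)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status in range(200, 400)
    except (OSError, ValueError):
        # URLError and HTTPError are subclasses of OSError.
        return False


def broken_links(text, check=is_reachable):
    """Return the links in text for which the check function fails."""
    return [url for url in extract_links(text) if not check(url)]
```

<p>Passing a custom <code>check</code> function makes it possible to test the logic offline or to plug in a different HTTP client.</p>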

<h2 id="examplars">Exemplars</h2>

<p>Many popular packages do an excellent job of producing release notes. Take a look at them and see how they follow the principles mentioned in this article!</p>

<h2 id="additional-information">Additional Information</h2>

<p>This article is based on our recent research paper <a href="https://hehao98.github.io/publication/2022-release-notes">Demystifying Software Release Note Issues on GitHub</a> published in <a href="https://conf.researchr.org/home/icpc-2022"><em>2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC 2022)</em></a>.</p>

<p>If you find this article helpful, please consider citing our paper:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>@inproceedings{2022ICPC-ReleaseNote,
  author    = {Jianyu Wu and
               Hao He and
               Wenxin Xiao and
               Kai Gao and
               Minghui Zhou},
  title     = {Demystifying Software Release Note Issues on GitHub},
  booktitle = {Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension, ICPC 2022, Pittsburgh, USA, May 16-17, 2022},
  pages     = {},
  publisher = {},
  year      = {2022}
}
</code></pre></div></div>]]></content><author><name>Hao He</name><email>haohe@andrew.cmu.edu</email></author><category term="English" /><category term="CI/CD/DevOps" /><category term="Release Engineering" /><summary type="html"><![CDATA[Release notes are important. However, there is a lack of tutorials or widely acknowledged standards on how to produce them. Without “the right way,” release notes may cause all kinds of issues. In this article, we provide an FAQ-style introduction on how to produce the “right” release notes for your users, based on our recent research on ~1000 real release note issues in GitHub projects. This is still a preliminary draft, so if you have any suggestions or critiques, feel free to comment below!]]></summary></entry><entry><title type="html">An Introduction to Quantitative Analysis Research on Open-Source Software</title><link href="https://hehao98.github.io/posts/2021/12/onboarding/" rel="alternate" type="text/html" title="开源软件量化分析研究入门" /><published>2021-12-01T00:00:00-08:00</published><updated>2021-12-01T00:00:00-08:00</updated><id>https://hehao98.github.io/posts/2021/12/onboarding</id><content type="html" xml:base="https://hehao98.github.io/posts/2021/12/onboarding/"><![CDATA[<p>(Taken from internal material I wrote for our lab, expanded from an early version by Prof. Minghui Zhou with some of my own thoughts added. I found it very useful, so I am keeping a copy here.)</p>

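<h2 id="先置知识">Prerequisites</h2>

<p>If you have no prior research experience, I strongly recommend reading through every article in the list below before starting a research project. They are all tutorial-style articles written by well-known scholars across computer science, covering the basic methodology of research, what graduate students should and should not do, how to write papers, how to maximize a paper’s chance of acceptance, how to give talks, and so on. Because I consider all of this to be <strong>common knowledge</strong> for a graduate student, this article does not cover these topics further. Once you have some research experience, it is also worth rereading the parts you are unsure about; you will usually get more out of them.</p>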

<ul>
  <li><a href="http://www.cs.utexas.edu/~EWD/ewd06xx/EWD637.PDF">The Three Golden Rules for Successful Scientific Research</a> (by Prof. E.W. Dijkstra)</li>
  <li><a href="https://stearnslab.yale.edu/modest-advice">Some Modest Advice for Graduate Students</a> (by Prof. Stephen C. Stearns)</li>
  <li><a href="http://spoke.compose.cs.cmu.edu/write/">How to Write a Good Research Paper</a> (by Prof. Mary Shaw)</li>
  <li><a href="http://taoxie.cs.illinois.edu/publications/writepapers.pdf">Common Advice on Writing Research Papers</a> (by Prof. Tao Xie)</li>
  <li><a href="https://www.usenix.org/legacy/publications/library/proceedings/dsl97/good_paper.html">How/How Not to Write a Good System Paper</a> (by Dr. Roy Levin and Dr. David D. Redell)</li>
  <li><a href="http://dl.acm.org/citation.cfm?id=636197">Preliminary Guidelines for Empirical Research in Software Engineering</a> (by Prof. Barbara Kitchenham et al.)</li>
  <li><a href="http://matt-welsh.blogspot.hk/2016/04/why-i-gave-your-paper-strong-accept.html">Why I gave your paper a Strong Accept</a> (by Matt Welsh)</li>
  <li><a href="http://matt-welsh.blogspot.hk/2016/04/why-i-gave-your-paper-strong-reject.html">Why I gave your paper a Strong Reject</a> (by Matt Welsh)</li>
  <li><a href="http://www.cs.virginia.edu/~robins/YouAndYourResearch.html">You and Your Research</a> (by Prof. Richard Hamming) (the topic of this one is how to do <strong>outstanding</strong> research)</li>
  <li><a href="http://homes.cs.washington.edu/~mernst/advice/giving-talk.html">How to Give a Technical Presentation</a> (by Prof. Michael Ernst)</li>
</ul>

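<p>Then, before introducing specific research directions, I will try to give the big picture. Broadly speaking, all research in the world falls into two categories: <strong>solution-seeking research</strong> and <strong>knowledge-seeking research</strong>. For software engineering, the differences between the two are summarized in the table below:</p>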

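<table>
  <thead>
    <tr>
      <th> </th>
      <th>Knowledge-Seeking Research</th>
      <th>Solution-Seeking Research</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Goal</strong></td>
      <td>Put forward claims and support them with scientific methods. Once validated, these claims can guide software development <strong>practice</strong> and thereby improve the <strong>efficiency</strong> or <strong>quality</strong> of software development</td>
      <td>Design a solution, or improve an existing one, to solve a specific <strong>practical</strong> problem and thereby improve the <strong>efficiency</strong> or <strong>quality</strong> of software development</td>
    </tr>
    <tr>
      <td><strong>Focus</strong></td>
      <td>A phenomenon in software development that is not yet understood or not widely known</td>
      <td>Practical problems in specific settings, and the obstacles to solving them</td>
    </tr>
    <tr>
      <td><strong>Output</strong></td>
      <td>A set of conclusions validated by scientific methods for the <strong>first</strong> time. The analysis may also produce some solutions along the way, such as a data analysis pipeline or an algorithm for some kind of estimation</td>
      <td>A <strong>new</strong> solution (algorithm, system, model, tool, etc.), together with a systematic validation of it</td>
    </tr>
    <tr>
      <td><strong>Examples</strong></td>
      <td>What kind of people become long-term contributors in open-source communities [1]? What exactly are the challenges and benefits of code review [2]? What kind of open-source projects fail [3]?</td>
      <td>How can properties of programs with interprocedural calls be analyzed precisely [4]? How can we generate test cases for complex programs that are more likely to find bugs [5]? How can inconsistencies between code and comments be detected automatically [6]?</td>
    </tr>
  </tbody>
</table>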

<ul>
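  <li><strong>Table adapted from</strong>: Stol, Klaas-Jan, and Brian Fitzgerald. “The ABC of software engineering research.” ACM Transactions on Software Engineering and Methodology (TOSEM) 27.3 (2018): 1-51. (This article gives a comprehensive, in-depth introduction to the research paradigms of <strong>knowledge-seeking research</strong> in software engineering. Its scope is broad and not limited to the data analysis studies that are mainstream today; it is worth reading in full once you have some research experience.)</li>
  <li>References for the examples: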
    <ol>
      <li>Zhou, Minghui, and Audris Mockus. “What make long term contributors: Willingness and opportunity in OSS community.” <em>2012 34th International Conference on Software Engineering (ICSE)</em>. IEEE, 2012.</li>
      <li>Bacchelli, Alberto, and Christian Bird. “Expectations, outcomes, and challenges of modern code review.” <em>2013 35th International Conference on Software Engineering (ICSE)</em>. IEEE, 2013.</li>
      <li>Valiev, Marat, Bogdan Vasilescu, and James Herbsleb. “Ecosystem-level determinants of sustained activity in open-source projects: A case study of the PyPI ecosystem.” <em>Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering</em>. 2018.</li>
      <li>Reps, Thomas, Susan Horwitz, and Mooly Sagiv. “Precise interprocedural dataflow analysis via graph reachability.” <em>Proceedings of the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages</em>. 1995.</li>
      <li>Cadar, Cristian, Daniel Dunbar, and Dawson R. Engler. “Klee: unassisted and automatic generation of high-coverage tests for complex systems programs.” <em>OSDI</em>. Vol. 8. 2008.</li>
      <li>Tan, Lin, et al. “/* iComment: Bugs or bad comments? */.” <em>Proceedings of the 21st ACM SIGOPS Symposium on Operating Systems Principles</em>. 2007.</li>
    </ol>
  </li>
</ul>

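<p>Some of the bolded items in the table deserve special attention. First, a piece of research should do something <strong>novel</strong> in some respect, something nobody has done before. For example, suppose someone claims that because Newton and Leibniz invented calculus, country X holds no independent intellectual property over it, and so they “invent” a differently named theory that is in essence identical to calculus. Such reinvention is not scientific research. If, on the other hand, someone notices that Newton and Leibniz’s calculus lacks a rigorous definition of infinitesimals and develops new theory to supply one, that is scientific research.</p>

<p>Second, for research to be recognized, it must <strong>fit the values of some research community</strong>. In every engineering discipline, research must set out to <strong>solve real problems</strong>. The problem you attack <strong>must not be an imaginary pseudo-problem that does not exist in practice</strong>. A well-known bit by the comedian Guo Degang is a classic example of a pseudo-problem:</p>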

<blockquote>
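  <p>If an expert stoops to arguing with a layman, the expert is the amateur!</p>

  <p>Say I tell a rocket scientist: your rocket is no good, the fuel is wrong. I reckon it should burn firewood, or better yet coal, and it has to be premium coal; washed coal won’t do.</p>

  <p>If that scientist so much as gives me a proper look, he has already lost!</p>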
</blockquote>

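<p>“Should a rocket burn firewood, washed coal, or premium coal?” is a typical pseudo-problem. To avoid posing one, <strong>a researcher in an engineering discipline must first be a competent engineer</strong>, with most of the knowledge and skills expected of engineers in that field. Of course, in actual research, judging whether a problem is a pseudo-problem can be very hard and is often contested. Some problems are not worthless, only of very limited value; others are dismissed at first and later turn out to have far-reaching impact. To get your paper published, if your problem does not look practically valuable, you may need to spend more space <strong>arguing</strong> why the problem you solve has practical value, or why solving it has the potential to create such value. You can make this argument by citing prior work, by describing in detail a real-world use case the reader may not know, or even by devoting part of the study to answering the question “Why is this problem valuable?”</p>

<p>Software engineering as a field cares about how to improve the <strong>efficiency</strong> and <strong>quality</strong> of software development. If your research question does not meet this criterion but still seems valuable, you may need to find the field that values it and submit to the corresponding conferences or journals. In addition, as a side effect of peer review, every field develops its own <strong>taste</strong>, even though reviewers’ preferences may keep some meaningful research from being published. The best way to get a feel for a research community’s taste is to collaborate with scholars who have an established track record in that field.</p>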

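<h2 id="研究领域">Research Area</h2>

<p>In short, our lab’s main object of study is open-source software development; any method, technique, or tool that bears on the efficiency and quality of open-source development is in scope. Our primary discipline is software engineering, but our work also touches on systems software, human-computer interaction, social computing, and related fields.</p>

<p>Open-source software development involves three elements: the software, the individuals and groups who develop it, and the methods, techniques, and tools developers use. Our research therefore falls roughly into two kinds: one leans toward studying human behavior, including individual behavior and group collaboration; the other leans technical, for example how to design automated tools that support various development activities. The former is mostly knowledge-seeking research, the latter mostly solution-seeking research.</p>

<p>Our lab has long focused on <strong>knowledge-seeking</strong> research in software engineering. Why does software engineering need knowledge-seeking research rather than just building algorithms, systems, and tools? The first reason is that although software development rests on mathematics as its theoretical foundation and is supported by computing technology, it is <strong>ultimately a human activity</strong>. For example, developing software requires <strong>people</strong> to interact with <strong>technology</strong> in ways that are <strong>extremely complex yet must not go wrong</strong>; it requires <strong>communication</strong> and <strong>organization</strong> among <strong>people</strong>; and it requires lowering the <strong>barrier to entry</strong> of the technology (for example by controlling code complexity and writing technical documentation) so that less brilliant people can take part too. Given our limited understanding of human intelligence, these problems are hard to solve mathematically; instead, the best solutions emerge through repeated iteration, by analyzing and reflecting on past events. The second reason is that real software development settings are extremely complex, and many problems are unlikely to have a general solution (a silver bullet), so <strong>even defining the right technical problem is not easy</strong> and needs to be backed by scientific evidence and arguments. Without knowledge-seeking research to give scientific guidance to <strong>software engineering research</strong> and <strong>software development practice</strong>, people may waste time on technical pseudo-problems, and management decisions made on a whim may have dire consequences. Interested readers can look at <em>The Mythical Man-Month</em> to learn about the problems IBM ran into when developing its operating system in the 1960s. Edsger W. Dijkstra, a famous computer scientist of the same era, once commented on the “software engineering” of the time:</p>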

<blockquote>
  <p>The required techniques of effective reasoning are pretty formal, but as long as programming is done by people that don’t master them, the software crisis will remain with us and will be considered an incurable disease. And you know what incurable diseases do: they invite the quacks and charlatans in, who in this case take the form of Software Engineering gurus. (Loose paraphrase: writing code demands a high level of skill, but the people who write it are usually fools, so they never manage to write good software. Since good software never arrives, a crowd of so-called “software engineering” experts shows up, and their views are too young, too simple, sometimes naïve!)</p>
</blockquote>

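<p>Of course, after nearly sixty years of progress in the software industry, we no longer have a so-called “software crisis”; if anything, we live in an era where “even fools can develop software.” With the support of all kinds of development tools and mature open-source ecosystems, anyone from elementary school students to trainees in crash courses such as Beida Jade Bird (北大青鸟) can write usable applications in very little time. The software industry owes where it is today to the joint efforts of countless practitioners and software engineering researchers.</p>

<p>Knowledge-seeking research requires adopting some research paradigm. There are many common ones, such as field studies, laboratory experiments, and computer simulations; see the article the table above was adapted from for details. However, owing to the nature of the field, most knowledge-seeking research in software engineering today consists of <strong>field studies</strong>, carried out mainly by <strong>observing data from existing software development activities</strong> or by <strong>interviewing or surveying the people involved</strong>. Controlled experiments are comparatively rare, mainly because the high cost and uncontrollability of software development make such experiments both unaffordable and unlikely to yield realistic results. And because software development is both extremely complex and involves humans, there is essentially no research that answers practical questions through computer simulation.</p>

<p>The two approaches in bold above have become the mainstream paradigms largely thanks to the popularity and success of <strong>open-source software</strong> in recent years. Because open-source projects generally keep their data open, and successful ones tend to have very high code quality and project management standards, they provide <strong>massive amounts of real software development activity data</strong> that can be used to answer many questions that were previously hard to answer for lack of data (using corporate data for published research is often difficult). At the same time, <strong>the open-source model itself raises many entirely new questions</strong>, such as how open-source projects can be sustained and how companies should participate in open source.</p>

<p>As for our lab’s own research, our <strong>object of study</strong> is mainly the software development activity data of open-source software. Our <strong>research goal</strong> is usually to observe and measure how people develop in large-scale software projects or ecosystems and to quantify the relationships between developers, their environment, and software artifacts, in the hope of understanding and mastering ways to control large, complex software systems. Our <strong>research style</strong> is usually to start from a practical problem that is widespread in software development and use data mining and analysis to explore the problem deeply and comprehensively; where necessary, we validate our conclusions through interviews and surveys; where possible, we build on the results with quantification and modeling, use the models to validate our conclusions, and build recommendation tools that make predictions and recommendations for software development practices.</p>

<p>We often sum this up as <strong>quantitative analysis of open-source software</strong>. The following keywords also summarize our lab’s current research area:</p>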

<table>
  <thead>
    <tr>
      <th>Discipline</th>
      <th>Software Engineering</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Subfield keywords</td>
      <td>Data-Driven Software Development</td>
    </tr>
    <tr>
      <td> </td>
      <td>Mining Software Repositories</td>
    </tr>
    <tr>
      <td> </td>
      <td>Software Measurement</td>
    </tr>
    <tr>
      <td> </td>
      <td>Empirical Software Engineering</td>
    </tr>
    <tr>
      <td> </td>
      <td>Software Digital Sociology</td>
    </tr>
    <tr>
      <td> </td>
      <td>Software Digital Archeology</td>
    </tr>
  </tbody>
</table>

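<h2 id="研究对象">Objects of Study</h2>

<p>As mentioned earlier, our lab’s main object of study is <strong>software development activity data</strong>. This is a very broad concept. In general, any data produced during software development counts, including but not limited to:</p>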

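<ul>
  <li><strong>Source code</strong>: For open-source software, code is clearly the easiest data to obtain and the most important foundational data. Some studies summarize developers’ usage patterns by examining source code alone, for example: how do developers use inheritance in Java (Stevenson and Wood, ICSE 2018)? What kind of Python code counts as “Pythonic” (Peng et al., SANER 2021)? More often, though, source code serves as base data that is combined with other data to answer more complex questions, for example combined with version control data to study code refactoring.</li>
  <li><strong>Documentation</strong>: Software engineering textbooks often say that software is code plus documentation. Source code needs accompanying documentation before a group of people can truly understand, use, develop, and maintain it. Types of documentation include, but are not limited to, code comments, API documentation, tutorials, and workflow documentation. Research on documentation data generally goes in two directions: summarizing best practices for various kinds of software documentation, and using documentation to support other research, for example analyzing software compatibility through the evolution of API documentation.</li>
  <li><strong>Configuration files</strong>: Modern software development not only depends on all sorts of other software (toolchains, runtime environments, frameworks, libraries, etc.) but often needs to customize that software heavily to fit a project’s needs. This gives rise to all kinds of configuration files, such as those for project dependencies, continuous integration, and static analysis tools. Research on configuration files mostly analyzes existing files to observe the behavior of the developers or ecosystems involved and then proposes improvements.</li>
  <li><strong>Changes</strong> and change-related documentation: Software is not static; it changes continuously to meet new requirements. Many tools and management processes exist to accommodate change, the most important and best known being version control systems such as git. The change history recorded in git underpins almost all research on software evolution.</li>
  <li><strong>Development process</strong>: Coordinating many people to develop software together requires an efficient development process. Today, the vast majority of open-source projects follow a Bug Report (Issue) + Patch (Pull Request) + Release process. The data this process generates can be used not only to observe and improve current development processes but also to observe all kinds of interesting phenomena in how people interact.</li>
  <li><strong>Developers</strong>: The traces developers themselves leave in open-source communities are another distinctive kind of research data. Research characterizing developers is still relatively scarce; existing studies have characterized developers’ work habits, domain expertise, task preferences, and so on.</li>
  <li><strong>Package hosting platforms</strong>: Maven, npm, PyPI, etc.</li>
  <li><strong>Social data</strong>: mailing lists, Stack Overflow, Twitter, Slack, etc.</li>
</ul>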
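
<p>As a concrete illustration of working with change history, the following Python sketch turns the output of <code>git log</code> into records that can be counted or joined with other data. It is a minimal sketch, assuming git is installed and that a tab-separated log format is acceptable; the function names and the choice of fields are illustrative, not a fixed recipe.</p>

```python
import subprocess
from collections import Counter

# Tab-separated fields: commit hash, author name, author date (ISO 8601), subject.
LOG_FORMAT = "%H%x09%an%x09%aI%x09%s"


def read_git_log(repo_path):
    """Run `git log` in repo_path and return its raw output (requires git)."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--pretty=format:" + LOG_FORMAT],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_git_log(raw):
    """Parse the raw log text into a list of commit dictionaries."""
    commits = []
    for line in raw.splitlines():
        # Split at most three times so tabs inside the subject are preserved.
        parts = line.split("\t", 3)
        if len(parts) != 4:
            continue
        sha, author, date, subject = parts
        commits.append({"sha": sha, "author": author, "date": date, "subject": subject})
    return commits


def commits_per_author(commits):
    """Count how many commits each author name has made."""
    return Counter(commit["author"] for commit in commits)
```

<p>Note that author names in real repositories are noisy (the same person often appears under several names or emails), so counts like these usually need identity merging before they support any conclusion.</p>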

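<h2 id="研究问题">Research Questions</h2>

<p>In software engineering, every research question must relate to improving development efficiency and software quality; that is, it must be relevant. It should also have the following properties:</p>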

<ol>
  <li>
    <p>Novelty (novel): the topic should focus on a new problem and pursue it with original, in-depth research.</p>
  </li>
  <li>
    <p>Generality (non-trivial): the problem and its solution should generalize rather than apply only to a very small group.</p>
  </li>
</ol>

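<p>Our research centers on <strong>open-source software</strong> and <strong>open-source communities</strong>, using their data to answer questions about <strong>code</strong>, <strong>documentation</strong>, <strong>communities</strong>, and <strong>ecosystems</strong>. For the specific problems we are working on, see the later lists in the recommended reading section.</p>

<h2 id="研究方法">Research Methods</h2>

<h3 id="数据集">Datasets</h3>

<p>Many studies use project data from GitHub directly, and there are also several widely used public datasets, such as Libraries.io, GHTorrent, and World of Code. Beyond these, the data of thousands upon thousands of open-source projects on the Internet is open, so a great deal of software development activity data is available. Open-source projects typically run online issue trackers, version control systems, mailing lists, and the like, for example <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1000000">Mozilla’s online issue tracking system</a>, <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git">the Linux kernel’s online code repository (mainline)</a>, and <a href="https://lkml.org/">the Linux kernel’s online mailing list</a>. You can write crawlers to collect whatever data your research question calls for.</p>

<h3 id="分析方法">Analysis Methods</h3>

<p>In my view, which analysis method to use depends entirely on the project at hand. In general, simple descriptive results only call for straightforward, intuition-driven data analysis and visualization; conclusions about correlation or causation require statistical tools such as correlation analysis, regression analysis, and significance tests; and systematically describing and summarizing complex phenomena usually calls for some qualitative method, such as thematic analysis.</p>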
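
<p>As one small illustration of the statistical tools mentioned above, the sketch below computes a Pearson correlation coefficient using only the Python standard library. The function name and the sample numbers in the usage note are made up for illustration; a real study would also need a significance test and a check that the method’s assumptions hold (for skewed software data, a rank-based correlation is often the safer choice).</p>

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) in (0, 1):
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)
```

<p>For example, with hypothetical per-project data, <code>pearson([12, 40, 7, 95], [3, 11, 2, 30])</code> would measure how strongly monthly commit counts and newly opened issues move together. A high coefficient alone says nothing about causation.</p>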

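<p>This article does not dwell on specific tools. For one thing, teaching yourself the tools needed for a given analysis should be an essential skill; for another, different people prefer different tools and are not equally productive with them. Here is a (far from complete) list of commonly used tools:</p>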

<ul>
  <li>
    <p>Raw data processing: shell scripts, Perl, Python</p>
  </li>
  <li>
    <p>Data analysis: R, Python + pandas + matplotlib + Seaborn, Jupyter Lab</p>
  </li>
  <li>
    <p>Qualitative analysis: interviews, surveys, open coding, thematic analysis</p>
  </li>
</ul>

<h2 id="推荐阅读">Recommended Reading</h2>

<h3 id="领域科普文章">Introductory Reading on the Field</h3>

<ol>
  <li>Brooks Jr, Frederick P. <em>No Silver Bullet – Essence and Accident in Software Engineering</em>. April, 1987.</li>
  <li>Brooks Jr, Frederick P. <em>The Mythical Man-Month: Essays on Software Engineering</em>. Pearson Education, 1995.</li>
  <li>邹欣. 构建之法: 现代软件工程. 人民邮电出版社. 2017年6月.</li>
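  <li>Eric Raymond. <em>The Cathedral and the Bazaar</em> (大教堂与集市). (A Chinese edition is available, translated by 卫剑钒, who received his Ph.D. from the Institute of Software at Peking University.)</li>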
  <li>周明辉, 郭长国. 基于大数据的软件工程新思维.  中国计算机学会通讯. 第十卷第3期. 2014年3月.</li>
  <li>周明辉, 张伟等. 开源软件的量化分析.中国计算机学会通讯. 第十二卷第2期. 2016年2月.</li>
  <li>周明辉, 张宇霞, 谭鑫. 软件数字社会学. 中国科学信息科学. 2019. 49(11). 1399-1411.</li>
</ol>

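<p><strong>Notes:</strong> [1] and [2] are important early works in software engineering. [1] was the first to articulate the core thesis that software engineering has no universal solution; the full text is easy to find via Google. [2] records the lessons IBM learned while organizing the development of the System/360 operating system in the 1960s; a Chinese translation exists, but it reads stiffly, so the original English edition is recommended. [3] is an introductory software engineering textbook that can be bought online; compared with other textbooks, it is short, easy to follow, and lively, which makes it good leisure reading. [5], [6], and [7] are overviews of our lab’s research area; you can email Prof. Zhou to obtain them.</p>

<h3 id="领域早期工作">Early Work in the Field</h3>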

<ol>
  <li>Mockus, Audris, Roy T. Fielding, and James D. Herbsleb. “Two case studies of open source software development: Apache and Mozilla.” <em>ACM Transactions on Software Engineering and Methodology (TOSEM)</em> 11.3 (2002): 309-346.</li>
  <li>Herbsleb, James D., and Audris Mockus. “An empirical study of speed and communication in globally distributed software development.” <em>IEEE Transactions on Software Engineering</em> 29.6 (2003): 481-494.</li>
  <li>Mockus, Audris, and David M. Weiss. “Globalization by chunking: A quantitative approach.” <em>IEEE Software</em> 18.2 (2001): 30-37.</li>
  <li>Parnas, D.L. (December 1972). “On the Criteria To Be Used in Decomposing Systems into Modules”. Communications of the ACM. 15 (12): 1053–58.</li>
  <li>Conway, M. E. (1968). “How Do Committees invent?” Datamation 14(4): 28-31.</li>
</ol>

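<p>Notes: Using data to explore open-source development and globally distributed development was the pioneering work of this field (these papers are also a pleasure to read); pay particular attention to their research questions and methods and to how these relate to software development efficiency and quality. Parnas is a pioneer of software engineering whose work on software modularity remains unsurpassed [4]. Conway’s law [5] comes from the social and management literature; drawing on his experience with the IBM System/360 project, Brooks argued in <em>The Mythical Man-Month</em> for the relevance and importance of Conway’s law in software development, and the idea of socio-technical congruence has spread through software engineering ever since.</p>

<h3 id="组内主要工作">Major Work from Our Group</h3>

<h4 id="软件项目如何吸引和指导新人涉及developer-expertise的度量newcomer-onboarding-and-retaininglong-term-contributor等">How software projects attract and mentor newcomers (covering the measurement of developer expertise, newcomer onboarding and retaining, long-term contributors, etc.)</h4>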

<ol>
  <li>Zhou, Minghui, and Audris Mockus. “Developer fluency: Achieving true mastery in software projects.” <em>Proceedings of the eighteenth ACM SIGSOFT International Symposium on Foundations of Software Engineering</em>. 2010.</li>
  <li>Zhou, Minghui, and Audris Mockus. “What make long term contributors: Willingness and opportunity in OSS community.” <em>2012 34th International Conference on Software Engineering (ICSE)</em>. IEEE, 2012.</li>
  <li>Zhou, Minghui, and Audris Mockus. “Who will stay in the floss community? modeling participant’s initial behavior.” <em>IEEE Transactions on Software Engineering</em> 41.1 (2014): 82-99.</li>
  <li>Minghui Zhou. Onboarding and Retaining of Contributors in FLOSS Ecosystem. <em>In book “Towards Engineering Free/Libre Open Source Software (FLOSS) Ecosystems for Impact and Sustainability”</em>, Springer, Singapore, pp:107-117, 2019.</li>
  <li>Tan, Xin, Minghui Zhou, and Zeyu Sun. “A first look at good first issues on GitHub.” <em>Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering</em>. 2020.</li>
</ol>

<h4 id="linux社区代码提交实践和社区可扩展性">The Linux community: patch submission practices and community scalability</h4>

<ol>
  <li>Zhou, Minghui, et al. “On the scalability of Linux kernel maintainers’ work.” <em>Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering</em>. 2017.</li>
  <li>Xu, Yulin, and Minghui Zhou. “A multi-level dataset of Linux kernel patchwork.” <em>2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)</em>. IEEE, 2018.</li>
  <li>Tan, Xin, and Minghui Zhou. “How to communicate when submitting patches: An empirical study of the linux kernel.” <em>Proceedings of the ACM on Human-Computer Interaction</em> 3.CSCW (2019): 1-26.</li>
  <li>Xin Tan and Minghui Zhou. “Scaling open source software communities: Challenges and practices of decentralization.” <em>IEEE Software</em> (2020).</li>
</ol>

<h4 id="开源项目中的公司参与">Company participation in open-source projects</h4>

<ol>
  <li>Zhou, Minghui, et al. “Inflow and retention in OSS communities with commercial involvement: A case study of three hybrid projects.” <em>ACM Transactions on Software Engineering and Methodology (TOSEM)</em> 25.2 (2016): 1-29.</li>
  <li>Zhang, Yuxia, et al. “Companies’ domination in FLOSS development: An empirical study of OpenStack.” <em>Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings</em>. 2018.</li>
  <li>Zhang, Yuxia, et al. “Companies’ participation in OSS development: An empirical study of OpenStack.” <em>IEEE Transactions on Software Engineering</em> (2019).</li>
  <li>Zhang, Yuxia, et al. “How do companies collaborate in open source ecosystems? An empirical study of OpenStack.” <em>2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE)</em>. IEEE, 2020.</li>
</ol>

<h4 id="软件依赖开源供应链">Software dependencies / open-source supply chain</h4>

<ol>
  <li>Hao He, et al. A Large-Scale Empirical Study on Java Library Migrations: Prevalence, Trends, and Rationales. FSE 2021. Aug 2021.</li>
  <li>Hao He, et al. A Multi-Metric Ranking Approach for Library Migration Recommendations. Proceedings of the 28th IEEE International Conference on Software Analysis, Evolution and Reengineering. 2021</li>
  <li>He, Hao, et al. MigrationAdvisor: Recommending Library Migrations from Large-Scale Open-Source Data. ICSE 2021 Tool Demo.</li>
</ol>]]></content><author><name>Hao He</name><email>haohe@andrew.cmu.edu</email></author><category term="Chinese" /><category term="Research Methodology" /><summary type="html"><![CDATA[(Taken from internal material I wrote for our lab, expanded from an early version by Prof. Minghui Zhou with some of my own thoughts added. I found it very useful, so I am keeping a copy here.)]]></summary></entry><entry><title type="html">Planning and Execution During Your Ph.D.</title><link href="https://hehao98.github.io/posts/2021/12/planning-and-execution/" rel="alternate" type="text/html" title="Ph.D.攻读期间的规划与执行" /><published>2021-12-01T00:00:00-08:00</published><updated>2021-12-01T00:00:00-08:00</updated><id>https://hehao98.github.io/posts/2021/12/planning-and-execution</id><content type="html" xml:base="https://hehao98.github.io/posts/2021/12/planning-and-execution/"><![CDATA[<p>(Taken from internal material I wrote for our lab; I found it useful, so I am keeping a copy here.)</p>

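<p>Great, I now have a brilliant idea, and all that is left is to write it up and publish it at a top conference! The paper will win Best Paper, and I will be on my way to the peak of my life... Hold on. Stop the wishful thinking for a moment and consider carefully how far this idea still is from a good piece of research. To close that gap and do outstanding research, we may need a sharp mind, a meticulous plan, iron discipline in execution, and some luck. In fact, throughout history, many researchers have often had similar ideas at around the same time, yet only one or two ended up producing results and being remembered. This tells us that a good strategy for planning and execution matters a great deal. My own ability and experience are limited, but I feel the importance of execution keenly, so I will try to share some modest thoughts and experience here.</p>

<h2 id="制定研究protocol">Drawing Up a Research Protocol</h2>

<p>First, your research idea must not be too rough; refine it as far as you can. For empirical research, it usually needs to be refined to the point where you can answer the following questions:</p>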

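<ol>
  <li>(Motivation) Why is this research needed? How does it help software development?</li>
  <li>(Related Work) What is the core difference between your research and existing work?</li>
  <li>(Research Questions and Methods) What exactly are the research questions, and what method will answer each one?</li>
</ol>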

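<p>Once you have such a research protocol, discuss it with people more experienced than you. Experienced researchers often have the sense and intuition to judge whether a problem is worth pursuing, and they can spot issues that need to be resolved. If everyone agrees that the protocol is a good one, you can consider going ahead. Otherwise, you usually need to read more literature, talk to industry, or take part in or observe software development in order to come up with a better protocol.</p>

<h2 id="规划你的研究">Planning Your Research</h2>

<p>From a planning perspective, different types of research call for different strategies. So first make clear what type your research is, and then devise a strategy suited to it.</p>

<p>One kind of research goes like this: the idea may come from a chance flash of inspiration; its perspective may be novel and interesting; its topic may be some emerging thing; coming up with it may not have been easy; yet once the paper is written, the observations and analysis look plain, like something anyone could have thought of. Some call this “sudden idea” research; I prefer to call it “quick draw” research. Although the perspective is novel, the work itself may not be that hard, and anyone could think of doing it. A common pattern for empirical studies in software engineering is that a new tool, process, or system appears, and we survey and summarize how it is being used and identify points for improvement. For example, GitHub launches Dependabot, and someone rushes to summarize how Dependabot is used [1]; or deep learning is hot, so we rush to summarize the difficulties developers run into with deep learning [2]; and so on. Technical papers are similar: when you discover an interesting technical problem, you need to define it and propose a solution quickly. For this kind of research, once someone has covered the topic, publishing another paper becomes hard and requires a significant incremental contribution (which usually puts you in one of the two categories described below). So if you have an idea of this kind, carry it out as fast as you can and be the first mover (the “first to eat the crab”). The best way to plan is to target a suitable top-conference deadline and get a paper submitted first.</p>

<p>Another kind of research goes like this: the idea may no longer be particularly novel and there may be plenty of related work, but the larger research problem is still not well solved. Someone needs to keep pushing the direction forward, but doing so is not easy and requires a thorough, deep understanding of the problem’s characteristics and of existing work. Some call this “systematic” research. It is worth noting that whether a direction still needs new research, that is, whether the direction is dead, is an even harder question to judge, and I will not discuss it here. This kind of research does not win on the idea alone; it needs very solid execution and technical methods that are distinctive, interesting, and sound.</p>

<p>The last kind of research goes like this: nobody has worked on the topic for a while and recent papers are few; the problem may be very valuable, but the line of work seems hard to carry any further. Some call this “tackling hard problems” research. Again, whether such topics are still worth doing is beyond the scope of this article, but clearly, for the sake of your physical and mental health, I only recommend taking on this kind of problem when your graduation is already secure.</p>

<p>For the last two kinds of research, the biggest trap is the time trap: because the topic takes a long time, it is easy to slack off day after day and waste your time and energy. If that also leads to serious physical or mental health problems, the loss is even greater. These two kinds of research therefore need clear milestone deliverables: for example, first a literature survey, then an analysis of the nature and difficulty of the problem, and finally multiple rounds of iteration on the solution. Along the way, intermediate results can also be considered for publication. The first round of publication corresponds to a literature review or synthesis, the second can be an empirical study or an experience paper, and so on.</p>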

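<h2 id="最大化工作效率">Maximizing Your Productivity</h2>

<p>As mentioned above, one of the biggest traps of a Ph.D. is the time trap. Research is a protracted campaign with a long feedback cycle, much like running a marathon; you need habits that keep your daily productivity up so that you produce enough output.</p>

<p>The precondition for maximizing productivity is a healthy body and a positive attitude toward each day of research. I am no expert here, but I can offer a few small suggestions.</p>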

<ol>
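  <li>Keep written records of your research plan and intermediate results.</li>
  <li>Set a small goal to accomplish every day.</li>
  <li>Use fragmented time to think about your problem whenever possible.</li>
  <li>Try to get into a state of <a href="https://zh.wikipedia.org/wiki/%E5%BF%83%E6%B5%81%E7%90%86%E8%AB%96">“flow”</a> as often as possible.</li>
  <li>Know your own physical and mental state and plan accordingly: push hard when you should, and take it easy when you should.</li>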
</ol>

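<p>In addition, to maximize productivity, mastering the tools you are using matters a lot. Unfortunately, undergraduate computer science education in China largely neglects this. I still remember my first-year programming instructor saying in class:</p>

<blockquote>
  <p>Tools are not important. Students at Peking University are all smart, so you can just learn them on your own after class.</p>
</blockquote>

<p>This is simply <strong>leading students astray</strong>. Some people do succeed by teaching themselves through unconventional routes, but many more end up lost, never acquiring skills and habits they should have had from the start, which hurts their development in the long run. I have seen quite a few people who, even in graduate school, still work in astonishingly inefficient ways. This easily traps them in a pit of repetitive work, badly hurts productivity and output, and can even leave a research project with nothing to show. Here I strongly recommend a course from MIT, aimed at first-year students, that is very simple yet immensely useful: <a href="https://missing.csail.mit.edu/">The Missing Semester of Your CS Education</a>. It systematically teaches you <strong>how to use common computing tools to maximize your efficiency in computing practice (be it engineering or research)</strong>.</p>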

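<h2 id="注重可复现性">Paying Attention to Reproducibility</h2>

<p>Regrettably, software engineering researchers often fail to make even their own research projects follow software engineering best practices. Too many papers end up releasing code that is a mess: nobody knows how many bugs are hidden in it, or whether it will still run years from now. Some papers are even rushed out although the authors know parts of their code may be wrong, which ends up hurting those who try to follow up on the work. Fortunately, more and more high-quality conferences now require transparency and reproducibility and ask authors to submit a corresponding Artifact.</p>

<p>Conferences usually state explicit quality standards for Artifacts. Here I quote the ESEC/FSE 2021 <a href="https://2021.esec-fse.org/track/fse-2021-artifacts">requirements</a> for Artifacts:</p>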

<blockquote>

  <ul>
    <li><strong>Artifacts Evaluated - Functional</strong>: This badge is applied to papers whose associated artifacts have successfully completed an independent audit. Artifacts need not be made publicly available to be considered for this badge. However, they do need to be made available to reviewers. Two levels are distinguished, only one of which should be applied in any instance. These artifacts need to be:
      <ul>
        <li><strong>documented</strong>: At minimum, an inventory of artifacts is included, and sufficient description provided to enable the artifacts to be exercised.</li>
        <li><strong>consistent</strong>: The artifacts are relevant to the associated paper, and contribute in some inherent way to the generation of its main results.</li>
        <li><strong>complete</strong>: To the extent possible, all components relevant to the paper in question are included. (Proprietary artifacts need not be included. If they are required to exercise the package then this should be documented, along with instructions on how to obtain them. Proxies for proprietary data should be included so as to demonstrate the analysis.)</li>
        <li><strong>exercisable</strong>: Included scripts and / or software used to generate the results in the associated paper can be successfully executed, and included data can be accessed and appropriately manipulated.</li>
      </ul>
    </li>
    <li><strong>Artifacts Evaluated - Reusable</strong>: The artifacts meet the requirements for the <strong>Artifacts Evaluated - Functional</strong> level and in addition they are of a quality that significantly exceeds the requirements set for the first level. Authors are strongly encouraged to target their artifact submissions for <strong>Artifacts Evaluated - Reusable</strong> as the purpose of artifact badges is, among other things, to facilitate reuse and repurposing, which may not be achieved at the <strong>Artifacts Evaluated - Functional</strong> level.</li>
  </ul>
</blockquote>

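<p>My personal advice is not to wait until the last minute to meet these requirements, but to follow the software engineering best practices for your technology stack from the very start of the project. This may feel like a hassle at first, but a research project is not a course project: you may spend months or even more than a year maintaining the same code, and research projects often involve very complex code. As your project grows more complex, you will feel the benefits of software engineering, and your long-term productivity will improve.</p>

<p>In addition, when running experiments, automate as much of the workflow as possible, configure the code through parameters, avoid copy and paste, and save all parameters and outputs of each experiment in a traceable way (for example, shell scripts and log files). This makes last-minute fine-tuning of experiments easier, speeds up iteration, and improves the reproducibility of the paper.</p>

<h2 id="风险管理">Risk Management</h2>

<p>No research is free of risk. In fact, doing research may be one of the riskiest undertakings in the world. Your results may not match expectations, the method you designed may not work at all, or the results may be impossible to explain with your theory. All of this is completely normal. When planning, embrace change and iterate quickly from the very beginning in order to get good results.</p>

<h2 id="参考文献">References</h2>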

<ol>
  <li>Alfadel, Mahmoud, et al. “On the Use of Dependabot Security Pull Requests.” 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR). IEEE, 2021.</li>
  <li>Zhang, Tianyi, et al. “An empirical study of common challenges in developing deep learning applications.” 2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE). IEEE, 2019.</li>
</ol>]]></content><author><name>Hao He</name><email>haohe@andrew.cmu.edu</email></author><category term="Chinese" /><category term="Research Methodology" /><summary type="html"><![CDATA[(From internal material I wrote for our lab. I found it useful, so I am keeping a copy here.)]]></summary></entry><entry><title type="html">Notes on Reading the XV6 Operating System Source (4): Virtual Memory</title><link href="https://hehao98.github.io/posts/2019/04/xv6-4/" rel="alternate" type="text/html" title="Notes on Reading the XV6 Operating System Source (4): Virtual Memory" /><published>2019-04-10T00:00:00-07:00</published><updated>2019-04-10T00:00:00-07:00</updated><id>https://hehao98.github.io/posts/2019/04/xv6-4</id><content type="html" xml:base="https://hehao98.github.io/posts/2019/04/xv6-4/"><![CDATA[<p>This article describes in detail how virtual memory is initialized in the Xv6 operating system.</p>

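<h2 id="基本概念">Basic Concepts</h2>

<p>The 32-bit x86 architecture manages virtual memory with a two-level page table. Two levels are used to save the memory occupied by page tables: second-level page tables that contain no mappings need not be allocated at all. In this structure, each page is 4KB, each page table is also 4KB, each page table entry is 4 bytes, and one page table holds 1024 entries. A first-level entry stores the address of a second-level page table, and a second-level entry stores the corresponding physical page address. The low 12 bits of a virtual address and of its physical address are always identical, so those 12 bits of a page table entry are free to hold flags. For a 32-bit virtual address, the top 10 bits index the first-level page table, whose entry gives the address of a second-level page table; the next 10 bits index that second-level table, whose entry gives the physical page address. Clearly, this turns one virtual memory access into three physical memory accesses. To minimize the performance impact, the CPU has a TLB that caches the page table entries of recently accessed virtual addresses. The translation from virtual to physical address is shown below.</p>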

<p><img src="https://hehao98.github.io/assets/xv6-pic/pagetable.png" alt="" /></p>
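
<p>As a rough illustration of the address split described above, the following sketch extracts the three fields of a 32-bit virtual address. The helper names <code class="language-plaintext highlighter-rouge">pdx</code>, <code class="language-plaintext highlighter-rouge">ptx</code>, and <code class="language-plaintext highlighter-rouge">pgoff</code> are made up for this example and only mirror the 10/10/12 bit layout; the real translation is done by hardware.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stdint.h&gt;

/* Layout of a 32-bit virtual address with 4KB pages:
 * bits 31..22: first-level index, bits 21..12: second-level index,
 * bits 11..0: offset inside the page. */
#define PDXSHIFT 22
#define PTXSHIFT 12

/* Index into the first-level page table (hypothetical helper). */
static inline uint32_t pdx(uint32_t va) { return (va &gt;&gt; PDXSHIFT) &amp; 0x3FF; }

/* Index into the second-level page table (hypothetical helper). */
static inline uint32_t ptx(uint32_t va) { return (va &gt;&gt; PTXSHIFT) &amp; 0x3FF; }

/* Offset within the 4KB page; identical in virtual and physical address. */
static inline uint32_t pgoff(uint32_t va) { return va &amp; 0xFFF; }
</code></pre></div></div>

<p>Each index is 10 bits wide, which matches the 1024 entries per page table mentioned above.</p>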

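<p>x86 additionally supports a 4MB large-page mode, in which one first-level entry maps directly to a 4MB page. In some cases this is more convenient. As described later, Xv6 uses 4MB large pages during system initialization.</p>

<p>Note that the translation from virtual to physical addresses is performed by hardware, not by any function. The hardware uses the first-level page table address in the <code class="language-plaintext highlighter-rouge">cr3</code> control register to fetch the relevant page table entries and translates virtual addresses automatically. The operating system is only responsible for initializing the page tables, setting the control registers, and writing correct values into the page table entries.</p>

<h2 id="main函数执行前内存的情况">Memory State Before <code class="language-plaintext highlighter-rouge">main()</code> Runs</h2>

<h3 id="物理地址的内容">Contents of Physical Memory</h3>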

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0x0000-0x7c00     boot loader stack
0x7c00-0x7e00     boot loader code (512 bytes)
0x10000-0x11000   kernel ELF header (4096 bytes)
0xA0000-0x100000  device area
0x100000-0x400000 Xv6 operating system (not fully used)
</code></pre></div></div>

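<p>When execution reaches the start of <code class="language-plaintext highlighter-rouge">main()</code> in <code class="language-plaintext highlighter-rouge">main.c</code>, physical memory is laid out as shown above. The boot loader is loaded into memory by the BIOS, the device area is a region reserved by the hardware, and the kernel ELF header and the Xv6 operating system are loaded into memory by the boot loader (bootmain.c).</p>

<h3 id="全局描述符表的内容">Contents of the Global Descriptor Table</h3>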

<table>
  <thead>
    <tr>
      <th>Index</th>
      <th>Entry contents</th>
      <th>Meaning</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>[0]</td>
      <td>0</td>
      <td>Null entry</td>
    </tr>
    <tr>
      <td>[1]</td>
      <td><code class="language-plaintext highlighter-rouge">SEG_ASM(STA_X|STA_R, 0x0, 0xffffffff)</code></td>
      <td>Kernel code segment</td>
    </tr>
    <tr>
      <td>[2]</td>
      <td><code class="language-plaintext highlighter-rouge">SEG_ASM(STA_W, 0x0, 0xffffffff)</code></td>
      <td>Kernel data segment</td>
    </tr>
    <tr>
      <td>[3]</td>
      <td>Not yet set</td>
      <td>User code segment</td>
    </tr>
    <tr>
      <td>[4]</td>
      <td>Not yet set</td>
      <td>User data segment</td>
    </tr>
    <tr>
      <td>[5]</td>
      <td>Not yet set</td>
      <td>Task State Segment</td>
    </tr>
  </tbody>
</table>

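<p>On x86, the Global Descriptor Table (GDT) is used for segment-based memory management. For portability, Unix-like systems generally use the GDT for segmentation only minimally. Before the initialization functions in main.c run, the GDT contents are as shown above. IA-32 uses the <code class="language-plaintext highlighter-rouge">cs</code>, <code class="language-plaintext highlighter-rouge">ds</code>, <code class="language-plaintext highlighter-rouge">ss</code>, and <code class="language-plaintext highlighter-rouge">es</code> segment registers to hold indices into the GDT. At this point <code class="language-plaintext highlighter-rouge">cs</code> holds index 1 and <code class="language-plaintext highlighter-rouge">ds,ss,es</code> hold index 2, which correspond to the kernel code segment and the kernel data segment respectively. Apart from their permissions, the two entries are identical: both set the base address to 0 and the limit to 4GB, so addressing behaves just like ordinary flat 32-bit addressing.</p>

<p>In main.c, the operating system also calls <code class="language-plaintext highlighter-rouge">seginit()</code> to set up the GDT again and fill in the entries that are not yet set. The Task State Segment is set when the first user process is created (specifically, in <code class="language-plaintext highlighter-rouge">switchuvm()</code>).</p>

<h3 id="页表的内容">Contents of the Page Table</h3>

<p>Before entering entry.S, the system runs with segment-based addressing only. entry.S sets up an initial page table and switches to page-table-based virtual addressing with 4MB pages. The initial first-level page table is declared as follows:</p>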

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">__attribute__</span><span class="p">((</span><span class="n">__aligned__</span><span class="p">(</span><span class="n">PGSIZE</span><span class="p">)))</span>
<span class="n">pde_t</span> <span class="n">entrypgdir</span><span class="p">[</span><span class="n">NPDENTRIES</span><span class="p">]</span> <span class="o">=</span> <span class="p">{</span>
  <span class="c1">// Map VA's [0, 4MB) to PA's [0, 4MB)</span>
  <span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o">|</span> <span class="n">PTE_P</span> <span class="o">|</span> <span class="n">PTE_W</span> <span class="o">|</span> <span class="n">PTE_PS</span><span class="p">,</span>
  <span class="c1">// Map VA's [KERNBASE, KERNBASE+4MB) to PA's [0, 4MB)</span>
  <span class="p">[</span><span class="n">KERNBASE</span><span class="o">&gt;&gt;</span><span class="n">PDXSHIFT</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o">|</span> <span class="n">PTE_P</span> <span class="o">|</span> <span class="n">PTE_W</span> <span class="o">|</span> <span class="n">PTE_PS</span><span class="p">,</span>
<span class="p">};</span>
</code></pre></div></div>

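<p>The comments explain the initial mapping from virtual to physical addresses. <code class="language-plaintext highlighter-rouge">KERNBASE</code> is 0x80000000. <code class="language-plaintext highlighter-rouge">PTE_P</code> marks the entry as present, <code class="language-plaintext highlighter-rouge">PTE_W</code> marks it writable, and <code class="language-plaintext highlighter-rouge">PTE_PS</code> marks it as a 4MB large page. <code class="language-plaintext highlighter-rouge">PTE_U</code> is not set, which makes it a kernel page. Note that only one page covers the kernel region, which limits the total size of the kernel code and data segments to at most 4MB (in practice 3MB, because physical addresses 0x0-0x100000 are used during boot and occupied by the device area, so the kernel actually starts at physical address 0x100000).</p>

<p>This is only an initial page table. Later, the main function builds a new page table and discards this one.</p>

<h2 id="xv6对虚拟内存页的管理">How Xv6 Manages Virtual Memory Pages</h2>

<p>The code that manages memory pages lives in <code class="language-plaintext highlighter-rouge">kalloc.c</code>. Its approach is to chain all available free pages into one big linked list. Whenever a page is freed, it is added to the list (<code class="language-plaintext highlighter-rouge">kfree()</code>); when a page is allocated, one is taken from the head of the list and returned (<code class="language-plaintext highlighter-rouge">kalloc()</code>). The allocator must know the range of memory it is responsible for, and during initialization all usable physical memory is placed under its management. As described later, the allocator initially manages the physical memory range [end, 0x400000] and is then extended to [end, 0xE000000]. This implies the assumption that physical memory up to 0xE000000 (<code class="language-plaintext highlighter-rouge">PHYSTOP</code> in the Xv6 source) exists, which requires the machine running Xv6 to have at least 224MB of memory.</p>

<p>The data structure used for page management is defined as follows:</p>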

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="p">{</span>
  <span class="k">struct</span> <span class="n">spinlock</span> <span class="n">lock</span><span class="p">;</span>
  <span class="kt">int</span> <span class="n">use_lock</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">run</span> <span class="o">*</span><span class="n">freelist</span><span class="p">;</span>
<span class="p">}</span> <span class="n">kmem</span><span class="p">;</span>
</code></pre></div></div>

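<p>Initially the lock is not in use. It is only used after <code class="language-plaintext highlighter-rouge">main()</code> calls <code class="language-plaintext highlighter-rouge">kinit2()</code>, because from then on multiple processes and processors may access this data structure concurrently. <code class="language-plaintext highlighter-rouge">struct run *freelist</code> is the declaration of the free list.</p>

<p>Because a free page holds no data, Xv6 can use its first 4 bytes to store the address of the next free page. A free page is therefore defined as follows:</p>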

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">run</span> <span class="p">{</span>
  <span class="k">struct</span> <span class="n">run</span> <span class="o">*</span><span class="n">next</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<p>The corresponding add and remove operations look like this (note the type casts):</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// In kfree()</span>
<span class="c1">// Add virtual page v to freelist</span>
<span class="n">r</span> <span class="o">=</span> <span class="p">(</span><span class="k">struct</span> <span class="n">run</span><span class="o">*</span><span class="p">)</span><span class="n">v</span><span class="p">;</span>
<span class="n">r</span><span class="o">-&gt;</span><span class="n">next</span> <span class="o">=</span> <span class="n">kmem</span><span class="p">.</span><span class="n">freelist</span><span class="p">;</span>
<span class="n">kmem</span><span class="p">.</span><span class="n">freelist</span> <span class="o">=</span> <span class="n">r</span><span class="p">;</span>

<span class="c1">// In kalloc()</span>
<span class="c1">// Return a free page r and remove r from list</span>
<span class="n">r</span> <span class="o">=</span> <span class="n">kmem</span><span class="p">.</span><span class="n">freelist</span><span class="p">;</span>
<span class="k">if</span><span class="p">(</span><span class="n">r</span><span class="p">)</span> <span class="n">kmem</span><span class="p">.</span><span class="n">freelist</span> <span class="o">=</span> <span class="n">r</span><span class="o">-&gt;</span><span class="n">next</span><span class="p">;</span>
</code></pre></div></div>

<p>The actual implementations of <code class="language-plaintext highlighter-rouge">kalloc()</code> and <code class="language-plaintext highlighter-rouge">kfree()</code> also contain details about locking and error checking, which are omitted here.</p>

<p>To use this allocator, call <code class="language-plaintext highlighter-rouge">kfree()</code> to add a free page to it and <code class="language-plaintext highlighter-rouge">kalloc()</code> to request a page from it.</p>
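
<p>The free-list idea can be tried out in user space. The sketch below is a simplified illustration, not the code in <code class="language-plaintext highlighter-rouge">kalloc.c</code>: the names <code class="language-plaintext highlighter-rouge">page_free</code>, <code class="language-plaintext highlighter-rouge">page_alloc</code>, and <code class="language-plaintext highlighter-rouge">page_freerange</code> are made up, and locking, page alignment of the start address, and sanity checks are all left out.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stddef.h&gt;

#define PGSIZE 4096

/* A free page stores the link to the next free page in its own first bytes. */
struct run {
  struct run *next;
};

static struct run *freelist;  /* head of the free-page list */

/* Push the page starting at v onto the free list (hypothetical kfree analogue). */
void page_free(void *v)
{
  struct run *r = (struct run *)v;
  r-&gt;next = freelist;
  freelist = r;
}

/* Pop one page from the free list; returns NULL when the list is empty. */
void *page_alloc(void)
{
  struct run *r = freelist;
  if (r)
    freelist = r-&gt;next;
  return r;
}

/* Hand every whole page in [start, end) to the allocator.
 * Assumes start is already suitably aligned. */
void page_freerange(char *start, char *end)
{
  for (char *p = start; p + PGSIZE &lt;= end; p += PGSIZE)
    page_free(p);
}
</code></pre></div></div>

<p>Because pages are pushed and popped at the head, the list behaves like a stack: the most recently freed page is the next one handed out.</p>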

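<h2 id="main函数中虚拟内存的初始化过程">Virtual Memory Initialization in <code class="language-plaintext highlighter-rouge">main()</code></h2>

<p>Xv6 uses the <code class="language-plaintext highlighter-rouge">end</code> symbol to mark where the kernel image loaded from the ELF file ends, so the physical pages in <code class="language-plaintext highlighter-rouge">[PGROUNDUP(end), 0x400000]</code> are available for page allocation. Xv6 first calls <code class="language-plaintext highlighter-rouge">kinit1(end, P2V(0x400000))</code> to place this memory under page management. Although this region was already mapped as a 4MB large page in the earlier page table, the goal is to build a new page table that uses 4KB pages. Because this memory is already mapped through a 4MB large page and the hardware already translates virtual addresses automatically, <code class="language-plaintext highlighter-rouge">P2V()</code> must be used to convert physical addresses to virtual addresses. The code that follows contains many such conversions between virtual and physical addresses.</p>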
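
<p>Since the kernel region is mapped at a fixed offset of <code class="language-plaintext highlighter-rouge">KERNBASE</code>, the conversion is plain arithmetic. The following is a sketch of that idea using made-up function names <code class="language-plaintext highlighter-rouge">p2v</code> and <code class="language-plaintext highlighter-rouge">v2p</code> on plain 32-bit integers; it is only meant to show the offset relationship and is not the macro definition from the Xv6 headers.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stdint.h&gt;

#define KERNBASE 0x80000000u  /* first kernel virtual address, as stated above */

/* Physical to kernel-virtual; only meaningful for memory mapped at KERNBASE + pa. */
static inline uint32_t p2v(uint32_t pa) { return pa + KERNBASE; }

/* Kernel-virtual to physical; only meaningful for va at or above KERNBASE. */
static inline uint32_t v2p(uint32_t va) { return va - KERNBASE; }
</code></pre></div></div>

<p>For example, the physical address 0x400000 passed to <code class="language-plaintext highlighter-rouge">kinit1()</code> corresponds to the kernel virtual address 0x80400000.</p>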

<p>Xv6's allocator must know the range of memory it manages. Because virtual memory is already enabled at this point and the page table has only two entries, Xv6 must build the new page table inside the virtual address space that is already mapped. This is what <code class="language-plaintext highlighter-rouge">kinit1()</code> and <code class="language-plaintext highlighter-rouge">kvmalloc()</code> do in <code class="language-plaintext highlighter-rouge">main()</code>.</p>

<p><code class="language-plaintext highlighter-rouge">kinit1()</code> calls <code class="language-plaintext highlighter-rouge">freerange()</code> to build, in the way described above, a linked list of all pages from <code class="language-plaintext highlighter-rouge">PGROUNDUP(end)</code> up to <code class="language-plaintext highlighter-rouge">0x400000</code>. This gives us the first batch of usable pages, and the kernel can then run <code class="language-plaintext highlighter-rouge">kvmalloc()</code> to use them. <code class="language-plaintext highlighter-rouge">kvmalloc()</code> obtains a page and initializes it as the first-level page table. The contents of this page table are defined by <code class="language-plaintext highlighter-rouge">kmap</code> in <code class="language-plaintext highlighter-rouge">vm.c</code>, as follows:</p>

<table>
  <thead>
    <tr>
      <th>Virtual address</th>
      <th>Mapped to physical address</th>
      <th>Contents</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>[0x80000000, 0x80100000]</td>
      <td>[0, 0x100000]</td>
      <td>I/O devices</td>
    </tr>
    <tr>
      <td>[0x80100000, 0x80000000+data]</td>
      <td>[0x100000, data]</td>
      <td>Kernel code and read-only data</td>
    </tr>
    <tr>
      <td>[0x80000000+data, 0x8E000000]</td>
      <td>[data, 0xE000000]</td>
      <td>Kernel data + usable physical memory</td>
    </tr>
    <tr>
      <td>[0xFE000000, 0]</td>
      <td>[0xFE000000, 0]</td>
      <td>Other memory-mapped I/O devices</td>
    </tr>
  </tbody>
</table>

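<p>Note that the mapping rules above are turned into the first-level and second-level page tables that x86 requires. When needed, <code class="language-plaintext highlighter-rouge">walkpgdir()</code>, which <code class="language-plaintext highlighter-rouge">kvmalloc()</code> calls indirectly, allocates new pages to serve as second-level page tables.</p>

<p>Next, <code class="language-plaintext highlighter-rouge">main()</code> calls <code class="language-plaintext highlighter-rouge">seginit()</code> to set up the GDT again. The main difference between the new GDT and the previous one is that the user data segment and user code segment are now set. Although these segments still map 32-bit offsets directly, their privilege level differs from that of the kernel segments. The TSS entry in the GDT is not set until the first user process is created, and its contents change as the current user process is switched.</p>

<p>Finally, <code class="language-plaintext highlighter-rouge">main()</code> calls <code class="language-plaintext highlighter-rouge">kinit2()</code> to place physical addresses in [0x400000, 0xE000000] under page management. At this point, Xv6's page management and kernel page table are fully set up. Note that this kernel page table (the <code class="language-plaintext highlighter-rouge">kpgdir</code> variable) is only used while the scheduler is running. Every user process has its own complete page table, which also contains an identical copy of the kernel mappings.</p>

<p>Let us now look at how the virtual address space of the first user process is initialized. Right after <code class="language-plaintext highlighter-rouge">kinit2()</code>, <code class="language-plaintext highlighter-rouge">main()</code> calls <code class="language-plaintext highlighter-rouge">userinit()</code> to initialize the first user process. After finishing the bookkeeping for the process data structure, <code class="language-plaintext highlighter-rouge">userinit()</code> initializes the process's own page table (<code class="language-plaintext highlighter-rouge">pgdir</code> in <code class="language-plaintext highlighter-rouge">struct proc</code>). First, <code class="language-plaintext highlighter-rouge">userinit()</code> uses <code class="language-plaintext highlighter-rouge">setupkvm()</code> to build kernel mappings identical to those described above. It then uses <code class="language-plaintext highlighter-rouge">inituvm()</code> to create the first user page (mapped at virtual address 0x0) and copies the user process initialization code into that page. This requires the initialization code, found in initcode.S, to be no larger than 4KB.</p>

<p>initcode.S contains an exec system call, which loads a real user program. The exec system call is implemented in exec.c. exec loads an ELF file from disk. The ELF file contains information about all code and data segments and describes the virtual addresses at which these segments should be loaded (these are fixed at build time, so the compiler must follow certain conventions when assigning them).</p>

<p>Finally, exec allocates two virtual memory pages: the first is made inaccessible and the second is used as the user stack. Because the stack grows downward, growing beyond one page (4KB) runs into the inaccessible page and triggers a fault, so a user process in Xv6 can use at most 4KB of stack.</p>

<h2 id="最终的虚拟内存布局">Final Virtual Memory Layout</h2>

<p>Here we list every virtual-to-physical mapping recorded in the page table of the init process. Every user process has a page table like this one. The kernel part (the last four entries) is the same for all user processes, while the earlier mappings differ; the entries in the table are derived from the ELF file of the init process and from the code of the exec call.</p>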

<table>
  <thead>
    <tr>
      <th>Virtual address</th>
      <th>Mapped to physical address</th>
      <th>Contents</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>[0x0, 0x1000]</td>
      <td>Address provided by the allocator</td>
      <td>Code and data of the user process</td>
    </tr>
    <tr>
      <td>[0x1000, 0x2000]</td>
      <td>Address provided by the allocator</td>
      <td>Inaccessible page, used to detect stack overflow</td>
    </tr>
    <tr>
      <td>[0x2000, 0x3000]</td>
      <td>Address provided by the allocator</td>
      <td>Stack of the user process</td>
    </tr>
    <tr>
      <td>[0x80000000, 0x80100000]</td>
      <td>[0, 0x100000]</td>
      <td>I/O devices</td>
    </tr>
    <tr>
      <td>[0x80100000, 0x80000000+data]</td>
      <td>[0x100000, data]</td>
      <td>Kernel code and read-only data</td>
    </tr>
    <tr>
      <td>[0x80000000+data, 0x8E000000]</td>
      <td>[data, 0xE000000]</td>
      <td>Kernel data + usable physical memory</td>
    </tr>
    <tr>
      <td>[0xFE000000, 0]</td>
      <td>[0xFE000000, 0]</td>
      <td>Other memory-mapped I/O devices</td>
    </tr>
  </tbody>
</table>

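<h2 id="中断进程调度与虚拟内存">Interrupts, Process Scheduling, and Virtual Memory</h2>

<p>When an interrupt occurs, the page table in use is still that of the current user process. Because every user process carries an identical copy of the kernel page table entries, the kernel code entered by the trap can still execute normally. Only when the interrupt handler decides to exit the current process or switch to another one is the current page table switched to the scheduler's page table (the global variable <code class="language-plaintext highlighter-rouge">kpgdir</code>), and then, inside the scheduler, to the page table of the new process.</p>]]></content><author><name>Hao He</name><email>haohe@andrew.cmu.edu</email></author><category term="Chinese" /><category term="C" /><category term="Operating System" /><summary type="html"><![CDATA[This article describes in detail how virtual memory is initialized in the Xv6 operating system.]]></summary></entry><entry><title type="html">Notes on Reading the XV6 Operating System Source (5): File System</title><link href="https://hehao98.github.io/posts/2019/04/xv6-5/" rel="alternate" type="text/html" title="Notes on Reading the XV6 Operating System Source (5): File System" /><published>2019-04-10T00:00:00-07:00</published><updated>2019-04-10T00:00:00-07:00</updated><id>https://hehao98.github.io/posts/2019/04/xv6-5</id><content type="html" xml:base="https://hehao98.github.io/posts/2019/04/xv6-5/"><![CDATA[<h2 id="unix文件系统">The Unix File System</h2>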

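<p>Today's Unix File System (UFS) originates from the Berkeley Fast File System. Like all file systems, it reads and writes the disk in units of blocks. A block is typically 512 bytes or 4KB. All file system data structures are stored on disk in units of blocks; typical kinds of blocks include the superblock, inode blocks, data blocks, directory blocks, and indirection blocks.</p>

<p>The superblock contains metadata about the whole file system, such as its type, size, state, and information about the other file system data structures. The superblock is critical to the file system, so Unix file system implementations keep several copies of it.</p>

<p>The inode is the abstract data structure that the Unix file system uses to represent a file. An inode does not only represent a “file” in the sense of a chunk of data on disk; directories and external I/O devices are also represented by inodes. An inode holds a file's metadata, such as its owner, access permissions, and file type. The file system maintains a list of inodes for all files in it, and this list may occupy one or more disk blocks.</p>

<p>Data blocks store the actual file data. Some file systems may have dedicated Directory Blocks and Indirection Blocks, but in the Unix file system these blocks are all treated as data and are manipulated by the upper layers through inodes; the only difference between them lies in the attributes recorded in the inode.</p>

<p>Xv6's file system follows largely the same design ideas as Unix, but with many simplifications in the implementation details. At the low level, Xv6 uses a layered approach similar to Linux, with each layer wrapping the one below, so that it can support a variety of devices and I/O methods. The Xv6 file system consists of a disk I/O layer, a Log layer, an Inode layer, a File layer, and a system call layer, whose implementations are described in turn below.</p>

<h2 id="xv6中的磁盘io">Disk I/O in Xv6</h2>

<p>Disk I/O in Xv6 lives in <code class="language-plaintext highlighter-rouge">ide.c</code>, a simple implementation for IDE disks based on programmed I/O. A disk read or write request in Xv6 is represented by the following data structure:</p>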

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">buf</span> <span class="p">{</span>
  <span class="kt">int</span> <span class="n">flags</span><span class="p">;</span>
  <span class="n">uint</span> <span class="n">dev</span><span class="p">;</span>
  <span class="n">uint</span> <span class="n">blockno</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">sleeplock</span> <span class="n">lock</span><span class="p">;</span>
  <span class="n">uint</span> <span class="n">refcnt</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">buf</span> <span class="o">*</span><span class="n">prev</span><span class="p">;</span> <span class="c1">// LRU cache list</span>
  <span class="k">struct</span> <span class="n">buf</span> <span class="o">*</span><span class="n">next</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">buf</span> <span class="o">*</span><span class="n">qnext</span><span class="p">;</span> <span class="c1">// disk queue</span>
  <span class="n">uchar</span> <span class="n">data</span><span class="p">[</span><span class="n">BSIZE</span><span class="p">];</span>
<span class="p">};</span>
</code></pre></div></div>

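<p>For the IDE disk, the fields that matter are <code class="language-plaintext highlighter-rouge">flags</code> (DIRTY, VALID), <code class="language-plaintext highlighter-rouge">dev</code> (device), <code class="language-plaintext highlighter-rouge">blockno</code> (disk block number), and <code class="language-plaintext highlighter-rouge">qnext</code> (pointer to the next member of the disk queue).</p>

<p>Disk reads and writes work as follows. Xv6 maintains a queue of disk operations requested by processes (<code class="language-plaintext highlighter-rouge">idequeue</code>). When a process requests a disk read or write, the request is appended to the queue and the process goes to sleep (<code class="language-plaintext highlighter-rouge">iderw()</code>). At any time, the head of the queue is the request the disk is currently serving. When a disk operation completes, it raises an interrupt, and the interrupt handler (<code class="language-plaintext highlighter-rouge">ideintr()</code>) removes the request at the head of the queue and wakes up the process waiting for it. If further requests are queued, the next one becomes the head of the queue and the disk starts processing it.</p>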
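
<p>The queueing discipline can be sketched without any hardware. The code below is a user-space illustration under simplifying assumptions: the type <code class="language-plaintext highlighter-rouge">struct req</code> and the functions <code class="language-plaintext highlighter-rouge">submit</code>, <code class="language-plaintext highlighter-rouge">complete</code>, and <code class="language-plaintext highlighter-rouge">start</code> are made up for this example, “starting” a request only records its block number, and sleeping, waking, and locking are reduced to comments.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stddef.h&gt;

/* Hypothetical, stripped-down request; only the queue-related fields remain. */
struct req {
  int blockno;
  int done;           /* set when the request has completed */
  struct req *qnext;  /* next request in the disk queue */
};

static struct req *queue;        /* head = request currently being served */
static int started_block = -1;   /* block number last issued to the "disk" */

/* Stand-in for issuing a command to the disk controller. */
static void start(struct req *r) { started_block = r-&gt;blockno; }

/* Append r to the queue; if it became the head, the disk was idle, so start it. */
void submit(struct req *r)
{
  struct req **pp;
  r-&gt;qnext = NULL;
  r-&gt;done = 0;
  for (pp = &amp;queue; *pp; pp = &amp;(*pp)-&gt;qnext)
    ;
  *pp = r;
  if (queue == r)
    start(r);
  /* A real implementation would now sleep until r-&gt;done is set. */
}

/* Stand-in for the completion interrupt: finish the head, start the next one. */
void complete(void)
{
  struct req *r = queue;
  if (r == NULL)
    return;
  queue = r-&gt;qnext;
  r-&gt;done = 1;  /* A real handler would wake up the sleeping process here. */
  if (queue)
    start(queue);
}
</code></pre></div></div>

<p>The pointer-to-pointer walk in <code class="language-plaintext highlighter-rouge">submit</code> avoids a special case for the empty queue, at the cost of an O(n) append.</p>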

<p>The disk request queue is declared as follows; access to it must of course be protected by a lock.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="k">struct</span> <span class="n">spinlock</span> <span class="n">idelock</span><span class="p">;</span>
<span class="k">static</span> <span class="k">struct</span> <span class="n">buf</span> <span class="o">*</span><span class="n">idequeue</span><span class="p">;</span>
</code></pre></div></div>

<p>The functions in <code class="language-plaintext highlighter-rouge">ide.c</code> and their purposes are as follows:</p>

<table>
  <thead>
    <tr>
      <th>Function</th>
      <th>Purpose</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">idewait()</code></td>
      <td>Wait for the disk to become idle</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">ideinit()</code></td>
      <td>Initialize IDE disk I/O</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">idestart()</code></td>
      <td>Start a disk read or write request</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">iderw()</code></td>
      <td>Disk I/O interface called by the upper file system layers</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">ideintr()</code></td>
      <td>Called by the interrupt handler when a disk request completes</td>
    </tr>
  </tbody>
</table>

<p>At boot, <code class="language-plaintext highlighter-rouge">main()</code> calls <code class="language-plaintext highlighter-rouge">ideinit()</code> to initialize the <code class="language-plaintext highlighter-rouge">ide</code> disk. The initialization function sets up the ide lock, enables the disk interrupt, and checks whether a second disk is present.</p>

<p><code class="language-plaintext highlighter-rouge">iderw()</code> is the interface offered to the upper file system modules. It serves both reads and writes; the DIRTY and VALID bits in <code class="language-plaintext highlighter-rouge">buf-&gt;flags</code> tell whether the request is a read or a write. If the queue was empty before the request was added, the disk is idle, so <code class="language-plaintext highlighter-rouge">iderw()</code> calls <code class="language-plaintext highlighter-rouge">idestart()</code> to issue the request to the disk right away. If the request is a write, <code class="language-plaintext highlighter-rouge">idestart()</code> sends the write command and the data to the disk. Afterwards, <code class="language-plaintext highlighter-rouge">iderw()</code> puts the caller to sleep.</p>

<p>When a disk read or write finishes, it raises an interrupt that enters <code class="language-plaintext highlighter-rouge">trap()</code> in <code class="language-plaintext highlighter-rouge">trap.c</code>, and <code class="language-plaintext highlighter-rouge">trap()</code> calls <code class="language-plaintext highlighter-rouge">ideintr()</code> to handle disk interrupts. In <code class="language-plaintext highlighter-rouge">ideintr()</code>, if the current request is a read, the data that is now ready in the disk's buffer is read in. Finally, <code class="language-plaintext highlighter-rouge">ideintr()</code> wakes up the process sleeping on the current request and, if more requests are queued, calls <code class="language-plaintext highlighter-rouge">idestart()</code> to process the next one.</p>

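<h2 id="buffer-cache的功能与实现">The Buffer Cache: Purpose and Implementation</h2>

<p>Within the file system, the Buffer Cache is the intermediate layer between the disk and the in-memory file system. Because reading from disk is very slow, keeping recently and frequently accessed disk blocks cached in memory pays off.</p>

<p>The Buffer Cache in Xv6 is implemented in bio.c. Its data structure is as follows (rev11):</p>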

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="p">{</span>
  <span class="k">struct</span> <span class="n">spinlock</span> <span class="n">lock</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">buf</span> <span class="n">buf</span><span class="p">[</span><span class="n">NBUF</span><span class="p">];</span>
  <span class="c1">// Linked list of all buffers, through prev/next.</span>
  <span class="c1">// head.next is most recently used.</span>
  <span class="k">struct</span> <span class="n">buf</span> <span class="n">head</span><span class="p">;</span>
<span class="p">}</span> <span class="n">bcache</span><span class="p">;</span>
</code></pre></div></div>

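<p>This data structure maintains a doubly linked list of <code class="language-plaintext highlighter-rouge">struct buf</code> on top of a fixed-length array and uses one lock to protect the linked-list structure of the Buffer Cache. Note that accessing the list structure and accessing an individual <code class="language-plaintext highlighter-rouge">struct buf</code> require different locks.</p>

<p>When the cache is set up, the system calls <code class="language-plaintext highlighter-rouge">binit()</code>. <code class="language-plaintext highlighter-rouge">binit()</code> initializes the sleep lock of every element in the cache and links the elements, from back to front, into a doubly linked list. At the beginning, all blocks in the cache are empty.</p>

<p>The upper file system layers use <code class="language-plaintext highlighter-rouge">bread()</code> and <code class="language-plaintext highlighter-rouge">bwrite()</code> to read and write cached disk blocks. All cache handling happens automatically inside <code class="language-plaintext highlighter-rouge">bread()</code> and <code class="language-plaintext highlighter-rouge">bwrite()</code> and needs no involvement from the upper layers.</p>

<p><code class="language-plaintext highlighter-rouge">bread()</code> first calls <code class="language-plaintext highlighter-rouge">bget()</code>, which checks whether the requested disk block is in the cache. If it is, the cached block is returned directly. If it is not, the block must first be loaded from disk into the cache with the lowest-level function <code class="language-plaintext highlighter-rouge">iderw()</code> and is then returned.</p>

<p>The implementation of <code class="language-plaintext highlighter-rouge">bget()</code> is somewhat tricky. The code that searches for a cached block is straightforward, but it must carefully handle synchronization when several processes access disk blocks at the same time. Xv6 rev7 has no sleep locks, so to guard against the awaited buffer changing its contents during the wait, the code has to rescan the buffer cache for a suitable block after waking up. rev11 has sleep locks, so once the matching cached block is found, it only needs to release the Buffer Cache lock and acquire the sleep lock belonging to that block.</p>

<p><code class="language-plaintext highlighter-rouge">bwrite()</code> writes the cached data straight to disk. The Buffer Cache layer does not attempt any delayed writes; when to call <code class="language-plaintext highlighter-rouge">bwrite()</code> is decided by the upper file system layers.</p>

<p>The upper layers call <code class="language-plaintext highlighter-rouge">brelse()</code> to release a buffer that is no longer in use. <code class="language-plaintext highlighter-rouge">brelse()</code> mainly manipulates the doubly linked list, so it is not discussed further here.</p>

<h2 id="log层的功能与实现">The Log Layer: Purpose and Implementation</h2>

<p>The Log layer is added so that the file system can cope with abnormal events such as power loss and avoid inconsistency in the on-disk file system. The idea is as follows: all disk operations from the upper layers are divided into transactions. Each transaction first writes the data and the corresponding block numbers to the Log area on disk, and only after the Log area has been written completely is the data copied from the Log area to its real location in the data area. With this design, if power is lost while the Log is being written, the file system treats those writes as if they never happened; if power is lost while the real locations are being written, the data in the Log area can be used to recover the file system. This prevents files in the file system from being corrupted.</p>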
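
<p>The ordering that makes this safe can be shown with an in-memory simulation. This is a sketch of the general write-ahead idea, not Xv6's log.c: the arrays standing in for the disk, the names <code class="language-plaintext highlighter-rouge">log_commit</code>, <code class="language-plaintext highlighter-rouge">log_install</code>, and <code class="language-plaintext highlighter-rouge">log_recover</code>, and the <code class="language-plaintext highlighter-rouge">crash_after</code> parameter used to simulate power loss are all made up for illustration.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;string.h&gt;

#define BSIZE   8   /* tiny block size for the demo */
#define NBLOCK  16  /* blocks on the simulated disk */
#define LOGSIZE 4   /* maximum blocks per transaction */

static char disk[NBLOCK][BSIZE];      /* real (home) locations */
static char logarea[LOGSIZE][BSIZE];  /* log data blocks on "disk" */
static struct { int n; int block[LOGSIZE]; } loghead;  /* log header on "disk" */

/* Copy logged blocks to their home locations, then clear the header. */
void log_install(void)
{
  for (int i = 0; i &lt; loghead.n; i++)
    memcpy(disk[loghead.block[i]], logarea[i], BSIZE);
  loghead.n = 0;
}

/* Commit n blocks in three steps: (1) write the data to the log,
 * (2) write the header, which is the commit point, (3) install.
 * crash_after == 1 or 2 stops after that step to simulate power loss. */
void log_commit(int *blocknos, char data[][BSIZE], int n, int crash_after)
{
  for (int i = 0; i &lt; n; i++)
    memcpy(logarea[i], data[i], BSIZE);
  if (crash_after == 1)
    return;  /* header still says n == 0, so the writes never happened */
  for (int i = 0; i &lt; n; i++)
    loghead.block[i] = blocknos[i];
  loghead.n = n;  /* commit point */
  if (crash_after == 2)
    return;
  log_install();
}

/* Run at "boot": replay whatever the header says was committed. */
void log_recover(void)
{
  log_install();
}
</code></pre></div></div>

<p>A crash before the header is written leaves the home locations untouched, and a crash after it is repaired by replaying the log, which matches the two cases described above.</p>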

<p>The file system in Xv6 rev7 does not allow multiple processes to run transactions against the Log layer concurrently, whereas rev11 does. The discussion of implementation details below is based on rev11.</p>

<p>Before using the log layer, the upper file system must call <code class="language-plaintext highlighter-rouge">begin_op()</code>, which records a new transaction. When it is done with the log layer, it must call <code class="language-plaintext highlighter-rouge">end_op()</code>. The log performs the actual disk writes only when no transaction is in progress. The actual disk writes happen in <code class="language-plaintext highlighter-rouge">commit()</code>, which is only called at the end of <code class="language-plaintext highlighter-rouge">end_op()</code> when <code class="language-plaintext highlighter-rouge">log.outstanding==0</code> (at boot time, a committed log left on disk is replayed by the recovery code). <code class="language-plaintext highlighter-rouge">commit()</code> first calls <code class="language-plaintext highlighter-rouge">write_log()</code> to write the cached disk blocks to the Log area on disk and then writes the Log Header to disk. Only once the Log Header on disk has been updated does this Log update count as complete. After the Log area is updated, <code class="language-plaintext highlighter-rouge">commit()</code> calls <code class="language-plaintext highlighter-rouge">install_trans()</code> to perform the real disk writes and then calls <code class="language-plaintext highlighter-rouge">write_head()</code> to clear the current Log data.</p>

<h2 id="xv6-文件系统的硬盘布局">Disk Layout of the XV6 File System</h2>

<p>The disk of the Xv6 operating system holds the following kinds of blocks, in this order. Blocks are indexed directly by an integer:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[boot block | super block | log | inode blocks | free bit map | data blocks]
</code></pre></div></div>

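<p>The first disk block, the boot block, is loaded into memory at boot time; its block number is 0. The second, the superblock, occupies one disk block with number 1 and is declared in Xv6 as follows:</p>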

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">superblock</span> <span class="p">{</span>
  <span class="n">uint</span> <span class="n">size</span><span class="p">;</span>         <span class="c1">// Size of file system image (blocks)</span>
  <span class="n">uint</span> <span class="n">nblocks</span><span class="p">;</span>      <span class="c1">// Number of data blocks</span>
  <span class="n">uint</span> <span class="n">ninodes</span><span class="p">;</span>      <span class="c1">// Number of inodes.</span>
  <span class="n">uint</span> <span class="n">nlog</span><span class="p">;</span>         <span class="c1">// Number of log blocks</span>
  <span class="n">uint</span> <span class="n">logstart</span><span class="p">;</span>     <span class="c1">// Block number of first log block</span>
  <span class="n">uint</span> <span class="n">inodestart</span><span class="p">;</span>   <span class="c1">// Block number of first inode block</span>
  <span class="n">uint</span> <span class="n">bmapstart</span><span class="p">;</span>    <span class="c1">// Block number of first free map block</span>
<span class="p">};</span>
</code></pre></div></div>

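<p>The superblock stores the metadata of the file system. The operating system must read the superblock first in order to learn the sizes and locations of the log blocks, inode blocks, bitmap blocks, and data blocks. The log blocks, the inode blocks, and the bitmap blocks follow the superblock in that order, and the rest of the disk holds the data blocks.</p>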
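<p>To illustrate how the start fields of the superblock are used, the sketch below maps an inode number and a block number to the disk blocks that hold them. It is a sketch under stated assumptions: a 512-byte block, a 64-byte on-disk inode, and the hypothetical names <code class="language-plaintext highlighter-rouge">struct sb_view</code>, <code class="language-plaintext highlighter-rouge">inode_block()</code>, and <code class="language-plaintext highlighter-rouge">bitmap_block()</code>.</p>

```c
#include <stdint.h>

#define BSIZE 512u                 /* block size in bytes (assumed) */
#define DINODE_SIZE 64u            /* on-disk inode size in bytes (assumed) */
#define IPB (BSIZE / DINODE_SIZE)  /* inodes per block */
#define BPB (BSIZE * 8u)           /* free-map bits per block */

/* Only the superblock fields this sketch needs. */
struct sb_view {
  uint32_t inodestart;  /* block number of first inode block */
  uint32_t bmapstart;   /* block number of first free map block */
};

/* Disk block that holds inode number inum. */
uint32_t inode_block(uint32_t inum, const struct sb_view *sb) {
  return inum / IPB + sb->inodestart;
}

/* Disk block that holds the free-map bit for block number b. */
uint32_t bitmap_block(uint32_t b, const struct sb_view *sb) {
  return b / BPB + sb->bmapstart;
}
```

<p>Because every region is contiguous, a single division plus the region's start block is enough to locate any inode or bitmap bit.</p>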

<h2 id="xv6中的文件">Files in Xv6</h2>

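<p>Every file in Xv6, including every directory, is represented by an inode data structure, and the inodes of all files are stored on disk. When the system or a process needs an inode, that inode is loaded into the inode cache. The in-memory inode carries some runtime information in addition to what is stored in the on-disk inode. The in-memory inode is declared as follows.</p>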

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// in-memory copy of an inode</span>
<span class="k">struct</span> <span class="n">inode</span> <span class="p">{</span>
  <span class="n">uint</span> <span class="n">dev</span><span class="p">;</span>           <span class="c1">// Device number</span>
  <span class="n">uint</span> <span class="n">inum</span><span class="p">;</span>          <span class="c1">// Inode number</span>
  <span class="kt">int</span> <span class="n">ref</span><span class="p">;</span>            <span class="c1">// Reference count</span>
  <span class="k">struct</span> <span class="n">sleeplock</span> <span class="n">lock</span><span class="p">;</span> <span class="c1">// protects everything below here</span>
  <span class="kt">int</span> <span class="n">valid</span><span class="p">;</span>          <span class="c1">// inode has been read from disk?</span>

  <span class="kt">short</span> <span class="n">type</span><span class="p">;</span>         <span class="c1">// copy of disk inode</span>
  <span class="kt">short</span> <span class="n">major</span><span class="p">;</span>
  <span class="kt">short</span> <span class="n">minor</span><span class="p">;</span>
  <span class="kt">short</span> <span class="n">nlink</span><span class="p">;</span>
  <span class="n">uint</span> <span class="n">size</span><span class="p">;</span>
  <span class="n">uint</span> <span class="n">addrs</span><span class="p">[</span><span class="n">NDIRECT</span><span class="o">+</span><span class="mi">1</span><span class="p">];</span>
<span class="p">};</span>
</code></pre></div></div>

<p>Here <code class="language-plaintext highlighter-rouge">inode.type</code> gives the type of the file. In Xv6 the type can be a regular file, a directory, or a special file.</p>

<p>The kernel maintains an inode cache in memory. The cache is declared as follows:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="p">{</span>
  <span class="k">struct</span> <span class="n">spinlock</span> <span class="n">lock</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">inode</span> <span class="n">inode</span><span class="p">[</span><span class="n">NINODE</span><span class="p">];</span>
<span class="p">}</span> <span class="n">icache</span><span class="p">;</span>
</code></pre></div></div>

<p>The basic operations on inodes are listed below.</p>

<table>
  <thead>
    <tr>
      <th>Function</th>
      <th>Purpose</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">iinit()</code></td>
      <td>Reads the superblock and initializes the inode-related locks</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">ialloc()</code></td>
      <td>Allocates an inode on disk</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">iupdate()</code></td>
      <td>Writes an in-memory inode to disk</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">iget()</code></td>
      <td>Gets the specified inode and updates the cache</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">iput()</code></td>
      <td>Drops one reference to an in-memory inode; frees the inode when the count reaches 0</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">ilock()</code></td>
      <td>Acquires the lock of the specified inode</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">iunlock()</code></td>
      <td>Releases the lock of the specified inode</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">readi()</code></td>
      <td>Reads data from an inode</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">writei()</code></td>
      <td>Writes data to an inode</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">bmap()</code></td>
      <td>Returns the disk address of the n-th data block of an inode</td>
    </tr>
  </tbody>
</table>

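<p>An inode has 12 (<code class="language-plaintext highlighter-rouge">NDIRECT</code>) directly mapped disk blocks and 128 indirectly mapped disk blocks, 140 blocks in total, so the largest file Xv6 supports is 140 * 512 B = 71680 B = 70 KB.</p>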
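<p>The arithmetic above can be checked with a few lines of C. This is a sketch that hard-codes the numbers quoted in the text (512-byte blocks, 12 direct and 128 indirect entries); <code class="language-plaintext highlighter-rouge">max_file_bytes()</code> and <code class="language-plaintext highlighter-rouge">block_kind()</code> are hypothetical helper names, not Xv6 functions.</p>

```c
#define BSIZE 512u      /* block size in bytes */
#define NDIRECT 12u     /* directly mapped blocks per inode */
#define NINDIRECT 128u  /* entries in the single indirect block */
#define MAXFILE (NDIRECT + NINDIRECT)

/* Largest file size in bytes: 140 blocks * 512 bytes. */
unsigned int max_file_bytes(void) {
  return MAXFILE * BSIZE;
}

/* Classify logical block n of a file:
   0 = reached through a direct entry,
   1 = reached through the indirect block,
  -1 = beyond the maximum file size. */
int block_kind(unsigned int n) {
  if (n < NDIRECT)
    return 0;
  if (n < MAXFILE)
    return 1;
  return -1;
}
```

<p>Logical blocks 0 to 11 are direct, blocks 12 to 139 go through the indirect block, and block 140 is already out of range.</p>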

<h2 id="xv6系统中的文件描述符">File Descriptors in Xv6</h2>

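<p>A famous piece of Unix design philosophy is "Everything is a file", which is more accurately stated as "Everything is a file descriptor". The inode data structure described above abstracts the files and directories of the file system. A file descriptor, besides abstracting files, can also abstract other kinds of I/O such as pipes and sockets, which makes it a general-purpose I/O interface.</p>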

<p>In Xv6, an open file is represented by the following data structure:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">file</span> <span class="p">{</span>
  <span class="k">enum</span> <span class="p">{</span> <span class="n">FD_NONE</span><span class="p">,</span> <span class="n">FD_PIPE</span><span class="p">,</span> <span class="n">FD_INODE</span> <span class="p">}</span> <span class="n">type</span><span class="p">;</span>
  <span class="kt">int</span> <span class="n">ref</span><span class="p">;</span> <span class="c1">// reference count</span>
  <span class="kt">char</span> <span class="n">readable</span><span class="p">;</span>
  <span class="kt">char</span> <span class="n">writable</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">pipe</span> <span class="o">*</span><span class="n">pipe</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">inode</span> <span class="o">*</span><span class="n">ip</span><span class="p">;</span>
  <span class="n">uint</span> <span class="n">off</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<p>As the declaration shows, a file structure can represent either an inode or a pipe. Several file structures can refer to the same inode while keeping different offsets.</p>
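<p>The point about independent offsets can be made concrete with a toy model. The sketch below is an assumption-laden simplification: <code class="language-plaintext highlighter-rouge">struct inode_sim</code>, <code class="language-plaintext highlighter-rouge">struct ofile_sim</code>, and <code class="language-plaintext highlighter-rouge">file_read_sim()</code> are hypothetical names, and the file contents live in a plain memory buffer rather than on disk.</p>

```c
/* Hypothetical stand-in for an inode: the contents are a memory buffer. */
struct inode_sim {
  const char *data;
  unsigned int size;
};

/* Hypothetical stand-in for struct file: a pointer to the inode plus an offset. */
struct ofile_sim {
  struct inode_sim *ip;
  unsigned int off;
};

/* Read up to n bytes at the file's own offset and advance that offset.
   Returns the number of bytes actually read. */
int file_read_sim(struct ofile_sim *f, char *dst, unsigned int n) {
  unsigned int i = 0;
  while (i < n && f->off < f->ip->size)
    dst[i++] = f->ip->data[f->off++];
  return (int)i;
}
```

<p>Two <code class="language-plaintext highlighter-rouge">ofile_sim</code> values that point at the same <code class="language-plaintext highlighter-rouge">inode_sim</code> read the same bytes, but each read advances only the offset of the structure it was called on.</p>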

<p>All open files in the system live in the global file table <code class="language-plaintext highlighter-rouge">ftable</code>, which is declared as follows:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="p">{</span>
  <span class="k">struct</span> <span class="n">spinlock</span> <span class="n">lock</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">file</span> <span class="n">file</span><span class="p">[</span><span class="n">NFILE</span><span class="p">];</span>
<span class="p">}</span> <span class="n">ftable</span><span class="p">;</span>
</code></pre></div></div>

<p>This shows that Xv6 supports at most 100 (<code class="language-plaintext highlighter-rouge">NFILE</code>) open files system-wide, and <code class="language-plaintext highlighter-rouge">struct proc</code> shows that each process can have at most 16 (<code class="language-plaintext highlighter-rouge">NOFILE</code>) files open at the same time.</p>
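<p>The per-process limit follows from the size of the descriptor table in each process. As a rough sketch of how such a table can hand out descriptors, the code below scans for the lowest free slot. <code class="language-plaintext highlighter-rouge">struct file_sim</code>, <code class="language-plaintext highlighter-rouge">struct proc_sim</code>, and <code class="language-plaintext highlighter-rouge">fdalloc_sim()</code> are hypothetical names chosen for illustration.</p>

```c
#include <stddef.h>

#define NOFILE 16  /* open files per process, as stated above */

struct file_sim { int ref; };                         /* hypothetical stand-in for struct file */
struct proc_sim { struct file_sim *ofile[NOFILE]; };  /* hypothetical stand-in for struct proc */

/* Bind f to the lowest free descriptor of p.
   Returns the descriptor, or -1 if all NOFILE slots are in use. */
int fdalloc_sim(struct proc_sim *p, struct file_sim *f) {
  for (int fd = 0; fd < NOFILE; fd++) {
    if (p->ofile[fd] == NULL) {
      p->ofile[fd] = f;
      return fd;
    }
  }
  return -1;
}
```

<p>A file descriptor is then simply an index into this array, which is why the seventeenth open in one process fails even when the global table still has room.</p>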

<p>The basic operations on the file structure are <code class="language-plaintext highlighter-rouge">filealloc()</code>, <code class="language-plaintext highlighter-rouge">filedup()</code>, <code class="language-plaintext highlighter-rouge">fileclose()</code>, <code class="language-plaintext highlighter-rouge">fileread()</code>, <code class="language-plaintext highlighter-rouge">filewrite()</code>, and <code class="language-plaintext highlighter-rouge">filestat()</code>. The naming follows the interfaces that Unix provides, so the purpose of each function is easy to tell from its name.</p>

<p>For files of the inode type, these operations are implemented on top of basic inode operations such as <code class="language-plaintext highlighter-rouge">readi()</code> and <code class="language-plaintext highlighter-rouge">writei()</code>.</p>

<h2 id="xv6中文件相关的系统调用">File-Related System Calls in Xv6</h2>

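<p>Building on the layer below, most system calls are fairly straightforward to implement. The file-related system calls that Xv6 supports are listed below.</p>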

<table>
  <thead>
    <tr>
      <th>Name</th>
      <th>Purpose</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">sys_link()</code></td>
      <td>Creates a new name for an existing inode</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">sys_unlink()</code></td>
      <td>Removes a name from an existing inode, possibly removing the inode itself</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">sys_open()</code></td>
      <td>Opens the file at a given path and returns a file descriptor</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">sys_mkdir()</code></td>
      <td>Creates a new directory</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">sys_mknod()</code></td>
      <td>Creates a new device file (special file)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">sys_chdir()</code></td>
      <td>Changes the current directory of the process</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">sys_fstat()</code></td>
      <td>Gets the status information of a file</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">sys_read()</code></td>
      <td>Reads from a file descriptor</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">sys_write()</code></td>
      <td>Writes to a file descriptor</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">sys_dup()</code></td>
      <td>Duplicates a file descriptor, adding a reference to the open file</td>
    </tr>
  </tbody>
</table>

<p>The semantics of the vast majority of these system calls match the Unix standard.</p>]]></content><author><name>Hao He</name><email>haohe@andrew.cmu.edu</email></author><category term="Chinese" /><category term="C" /><category term="Operating System" /><summary type="html"><![CDATA[The Unix file system]]></summary></entry><entry><title type="html">Notes on Reading the XV6 Operating System Code (3): Locks</title><link href="https://hehao98.github.io/posts/2019/04/xv6-3/" rel="alternate" type="text/html" title="XV6操作系统代码阅读心得（三）：锁" /><published>2019-04-09T00:00:00-07:00</published><updated>2019-04-09T00:00:00-07:00</updated><id>https://hehao98.github.io/posts/2019/04/xv6-3</id><content type="html" xml:base="https://hehao98.github.io/posts/2019/04/xv6-3/"><![CDATA[<p>Locks are an important mechanism for process synchronization in an operating system.</p>

<h2 id="基本概念">Basic Concepts</h2>

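<p>A critical section is a region of code that accesses and operates on shared data. Shared data is data that several concurrently running flows of execution may access at the same time.</p>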

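<p>Synchronization means making two or more processes or threads coordinate their order of execution in the way the programmer intends, for example by requiring process A to finish some operation before process B may run. Mutual exclusion means preventing several threads from accessing certain data at the same time: one must finish its access before another may begin.</p>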

<p>When several processes or threads run concurrently and access the same data, and the result depends on the order in which they run, we call the situation a race condition.</p>

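<p>Xv6 requires interrupts to be disabled while the kernel is inside a critical section. If interrupts were enabled, the following deadlock could occur. Process A is running in kernel mode and holds lock p when an interrupt fires and control enters the interrupt handler. The handler, also running in kernel mode, requests lock p. The lock is held by A and can be released only when A runs again, which in turn requires the handler to return. The handler therefore never gets the lock, spins forever, and the system deadlocks.</p>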

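<p>Xv6 implements spinlocks for synchronization and mutual exclusion in kernel critical sections. The defining trait of a spinlock is that a process that cannot get the lock loops until it acquires it. Because <code class="language-plaintext highlighter-rouge">acquire()</code> disables interrupts before it starts spinning, the timer interrupt and the round-robin scheduler cannot preempt a spinning CPU; the loop ends only when the holder, running on another CPU, releases the lock, so critical sections have to be short. Spinlocks clearly look inefficient, and it is easy to think of a more efficient approach based on wait queues, where a waiting process blocks instead of looping. However, Xv6 can run on several CPU cores at once, and implementing wait queues on a multi-core CPU is quite complex, so spinlocks are a comparatively simple solution that still works correctly.</p>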

<h2 id="xv6的spinlock">The Xv6 Spinlock</h2>

<p>Xv6 defines the lock as follows:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Mutual exclusion lock.</span>
<span class="k">struct</span> <span class="n">spinlock</span> <span class="p">{</span>
  <span class="n">uint</span> <span class="n">locked</span><span class="p">;</span>       <span class="c1">// Is the lock held?</span>

  <span class="c1">// For debugging:</span>
  <span class="kt">char</span> <span class="o">*</span><span class="n">name</span><span class="p">;</span>        <span class="c1">// Name of lock.</span>
  <span class="k">struct</span> <span class="n">cpu</span> <span class="o">*</span><span class="n">cpu</span><span class="p">;</span>   <span class="c1">// The cpu holding the lock.</span>
  <span class="n">uint</span> <span class="n">pcs</span><span class="p">[</span><span class="mi">10</span><span class="p">];</span>      <span class="c1">// The call stack (an array of program counters)</span>
                     <span class="c1">// that locked the lock.</span>
<span class="p">};</span>
</code></pre></div></div>

<p>The only essential variable is <code class="language-plaintext highlighter-rouge">locked</code>. A value of 1 means the lock is held, and a value of 0, which is also the initial value, means it is free.</p>

<p>A lock must be initialized before it is used.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">initlock</span><span class="p">(</span><span class="k">struct</span> <span class="n">spinlock</span> <span class="o">*</span><span class="n">lk</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span><span class="n">name</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">lk</span><span class="o">-&gt;</span><span class="n">name</span> <span class="o">=</span> <span class="n">name</span><span class="p">;</span>
  <span class="n">lk</span><span class="o">-&gt;</span><span class="n">locked</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="n">lk</span><span class="o">-&gt;</span><span class="n">cpu</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

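<p>The hard part is operating on the <code class="language-plaintext highlighter-rouge">locked</code> variable atomically when taking and releasing the lock. These two steps are implemented by the <code class="language-plaintext highlighter-rouge">acquire()</code> and <code class="language-plaintext highlighter-rouge">release()</code> functions. (The v7 and v11 versions differ slightly in their implementation; this article uses v11.)</p>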

<h3 id="acquire函数">The <code class="language-plaintext highlighter-rouge">acquire()</code> Function</h3>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Acquire the lock.</span>
<span class="c1">// Loops (spins) until the lock is acquired.</span>
<span class="c1">// Holding a lock for a long time may cause</span>
<span class="c1">// other CPUs to waste time spinning to acquire it.</span>
<span class="kt">void</span> <span class="nf">acquire</span><span class="p">(</span><span class="k">struct</span> <span class="n">spinlock</span> <span class="o">*</span><span class="n">lk</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">pushcli</span><span class="p">();</span> <span class="c1">// disable interrupts to avoid deadlock.</span>
  <span class="k">if</span><span class="p">(</span><span class="n">holding</span><span class="p">(</span><span class="n">lk</span><span class="p">))</span>
    <span class="n">panic</span><span class="p">(</span><span class="s">"acquire"</span><span class="p">);</span>

  <span class="c1">// The xchg is atomic.</span>
  <span class="k">while</span><span class="p">(</span><span class="n">xchg</span><span class="p">(</span><span class="o">&amp;</span><span class="n">lk</span><span class="o">-&gt;</span><span class="n">locked</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span>
    <span class="p">;</span>

  <span class="c1">// Tell the C compiler and the processor to not move loads or stores</span>
  <span class="c1">// past this point, to ensure that the critical section's memory</span>
  <span class="c1">// references happen after the lock is acquired.</span>
  <span class="n">__sync_synchronize</span><span class="p">();</span>

  <span class="c1">// Record info about lock acquisition for debugging.</span>
  <span class="n">lk</span><span class="o">-&gt;</span><span class="n">cpu</span> <span class="o">=</span> <span class="n">mycpu</span><span class="p">();</span>
  <span class="n">getcallerpcs</span><span class="p">(</span><span class="o">&amp;</span><span class="n">lk</span><span class="p">,</span> <span class="n">lk</span><span class="o">-&gt;</span><span class="n">pcs</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">acquire()</code> first disables interrupts, using the dedicated <code class="language-plaintext highlighter-rouge">pushcli()</code> function. This function guarantees that if two calls to <code class="language-plaintext highlighter-rouge">acquire()</code> have disabled interrupts, then <code class="language-plaintext highlighter-rouge">popcli()</code> must be called twice, once in each <code class="language-plaintext highlighter-rouge">release()</code>, before interrupts are enabled again. Next, <code class="language-plaintext highlighter-rouge">acquire()</code> uses the <code class="language-plaintext highlighter-rouge">xchg</code> instruction to set <code class="language-plaintext highlighter-rouge">locked</code> to 1 and obtain its previous value in a single operation. The C code wraps the instruction in an <code class="language-plaintext highlighter-rouge">xchg()</code> function that uses GCC inline assembly and is implemented as follows:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kr">inline</span> <span class="n">uint</span> <span class="nf">xchg</span><span class="p">(</span><span class="k">volatile</span> <span class="n">uint</span> <span class="o">*</span><span class="n">addr</span><span class="p">,</span> <span class="n">uint</span> <span class="n">newval</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">uint</span> <span class="n">result</span><span class="p">;</span>
  <span class="c1">// The + in "+m" denotes a read-modify-write operand.</span>
  <span class="n">asm</span> <span class="k">volatile</span><span class="p">(</span><span class="s">"lock; xchgl %0, %1"</span> <span class="o">:</span>
               <span class="s">"+m"</span> <span class="p">(</span><span class="o">*</span><span class="n">addr</span><span class="p">),</span> <span class="s">"=a"</span> <span class="p">(</span><span class="n">result</span><span class="p">)</span> <span class="o">:</span>
               <span class="s">"1"</span> <span class="p">(</span><span class="n">newval</span><span class="p">)</span> <span class="o">:</span>
               <span class="s">"cc"</span><span class="p">);</span>
  <span class="k">return</span> <span class="n">result</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Here the <code class="language-plaintext highlighter-rouge">volatile</code> qualifier keeps gcc from applying certain optimizations to the statement; <code class="language-plaintext highlighter-rouge">"+m" (*addr), "=a" (result)</code> after the first colon are the two outputs of the instruction; and <code class="language-plaintext highlighter-rouge">newval</code> is its input. Assuming <code class="language-plaintext highlighter-rouge">newval</code> is in the <code class="language-plaintext highlighter-rouge">eax</code> register and <code class="language-plaintext highlighter-rouge">addr</code> is in the <code class="language-plaintext highlighter-rouge">rdx</code> register, gcc produces the following assembly instruction:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> lock; xchgl (%rdx), %eax
</code></pre></div></div>

<p>Because the <code class="language-plaintext highlighter-rouge">xchg</code> function is <code class="language-plaintext highlighter-rouge">inline</code>, it is embedded directly into the code that calls <code class="language-plaintext highlighter-rouge">xchg</code>, so the registers actually used may differ.</p>

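<p>Let us now look at the semantics of this instruction. <code class="language-plaintext highlighter-rouge">lock</code> is an instruction prefix that gives the instruction exclusive ownership of the bus and the cache, which means that while it executes, no other CPU and no other instruction on the same CPU accesses the cache or memory. Modern CPUs are usually multiple-issue, pipelined, and out-of-order, so without the prefix this guarantee does not hold in general. <code class="language-plaintext highlighter-rouge">xchgl</code> is an old x86 instruction that swaps the 4-byte values of two registers or of a register and a memory location. The two operands cannot both be memory locations, and the instruction does not set the condition codes.</p>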

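<p>On reflection, this single <code class="language-plaintext highlighter-rouge">xchg</code> instruction swaps the value of <code class="language-plaintext highlighter-rouge">locked</code> with 1, and checking the <code class="language-plaintext highlighter-rouge">eax</code> register afterwards tells us whether <code class="language-plaintext highlighter-rouge">locked</code> was 0. The whole operation is atomic, which guarantees that exactly one process obtains the 0 value of <code class="language-plaintext highlighter-rouge">locked</code> and enters the critical section.</p>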

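<p>Finally, <code class="language-plaintext highlighter-rouge">acquire()</code> calls <code class="language-plaintext highlighter-rouge">__sync_synchronize</code> to keep the compiler from reordering instructions across this point and to keep the CPU from applying out-of-order execution to this code.</p>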
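<p>The same exchange-based idea can be tried outside the kernel with C11 atomics. The sketch below is a user-space analogue, not the Xv6 implementation: it assumes a C11 compiler with <code class="language-plaintext highlighter-rouge">&lt;stdatomic.h&gt;</code>, it does not disable interrupts, and <code class="language-plaintext highlighter-rouge">struct spin_sim</code> and the <code class="language-plaintext highlighter-rouge">spin_*</code> functions are hypothetical names. The acquire and release memory orders play the role that <code class="language-plaintext highlighter-rouge">__sync_synchronize</code> plays in the code above.</p>

```c
#include <stdatomic.h>

/* Hypothetical user-space analogue of a spinlock; it does not touch interrupts. */
struct spin_sim {
  atomic_uint locked;  /* 0 = free, 1 = held */
};

void spin_init(struct spin_sim *lk) {
  atomic_init(&lk->locked, 0u);
}

/* Try once: atomically store 1 and inspect the previous value.
   Returns 1 if the lock was free and is now ours, 0 otherwise. */
int spin_try_acquire(struct spin_sim *lk) {
  return atomic_exchange_explicit(&lk->locked, 1u, memory_order_acquire) == 0u;
}

/* Spin until the exchange returns 0, that is, until we are the one who set it. */
void spin_acquire(struct spin_sim *lk) {
  while (!spin_try_acquire(lk))
    ;
}

/* Make the critical section's stores visible, then mark the lock free. */
void spin_release(struct spin_sim *lk) {
  atomic_store_explicit(&lk->locked, 0u, memory_order_release);
}
```

<p>A second <code class="language-plaintext highlighter-rouge">spin_try_acquire()</code> on a held lock returns 0 because the exchange hands back the 1 that the first caller stored, which is exactly the check the loop in <code class="language-plaintext highlighter-rouge">acquire()</code> relies on.</p>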

<h3 id="release函数">The <code class="language-plaintext highlighter-rouge">release()</code> Function</h3>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Release the lock.</span>
<span class="kt">void</span> <span class="nf">release</span><span class="p">(</span><span class="k">struct</span> <span class="n">spinlock</span> <span class="o">*</span><span class="n">lk</span><span class="p">)</span> <span class="p">{</span>
  <span class="k">if</span><span class="p">(</span><span class="o">!</span><span class="n">holding</span><span class="p">(</span><span class="n">lk</span><span class="p">))</span>
    <span class="n">panic</span><span class="p">(</span><span class="s">"release"</span><span class="p">);</span>

  <span class="n">lk</span><span class="o">-&gt;</span><span class="n">pcs</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="n">lk</span><span class="o">-&gt;</span><span class="n">cpu</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>

  <span class="c1">// Tell the C compiler and the processor to not move loads or stores</span>
  <span class="c1">// past this point, to ensure that all the stores in the critical</span>
  <span class="c1">// section are visible to other cores before the lock is released.</span>
  <span class="c1">// Both the C compiler and the hardware may re-order loads and</span>
  <span class="c1">// stores; __sync_synchronize() tells them both not to.</span>
  <span class="n">__sync_synchronize</span><span class="p">();</span>

  <span class="c1">// Release the lock, equivalent to lk-&gt;locked = 0.</span>
  <span class="c1">// This code can't use a C assignment, since it might</span>
  <span class="c1">// not be atomic. A real OS would use C atomics here.</span>
  <span class="n">asm</span> <span class="k">volatile</span><span class="p">(</span><span class="s">"movl $0, %0"</span> <span class="o">:</span> <span class="s">"+m"</span> <span class="p">(</span><span class="n">lk</span><span class="o">-&gt;</span><span class="n">locked</span><span class="p">)</span> <span class="o">:</span> <span class="p">);</span>

  <span class="n">popcli</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

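<p>To make sure that setting <code class="language-plaintext highlighter-rouge">locked</code> to 0 is atomic, <code class="language-plaintext highlighter-rouge">release</code> also uses inline assembly. Finally, it calls <code class="language-plaintext highlighter-rouge">popcli()</code> to enable interrupts again (or to pop one level of cli while interrupts stay disabled because other locks have not been released yet).</p>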
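<p>The nesting behaviour of <code class="language-plaintext highlighter-rouge">pushcli()</code> and <code class="language-plaintext highlighter-rouge">popcli()</code> can be modelled with a counter. The sketch below is a simulation under stated assumptions: the interrupt flag is an ordinary integer rather than the hardware flag, and <code class="language-plaintext highlighter-rouge">struct cpu_sim</code>, <code class="language-plaintext highlighter-rouge">pushcli_sim()</code>, and <code class="language-plaintext highlighter-rouge">popcli_sim()</code> are hypothetical names.</p>

```c
/* Simulated per-CPU state; a real kernel would read the hardware interrupt flag. */
struct cpu_sim {
  int ncli;        /* depth of pushcli nesting */
  int intena;      /* were interrupts enabled before the outermost pushcli? */
  int interrupts;  /* simulated interrupt-enable flag: 1 = on, 0 = off */
};

/* Disable interrupts and remember the previous state at the outermost level. */
void pushcli_sim(struct cpu_sim *c) {
  int was_enabled = c->interrupts;
  c->interrupts = 0;
  if (c->ncli == 0)
    c->intena = was_enabled;
  c->ncli++;
}

/* Undo one pushcli_sim(). Interrupts come back only when the last level is
   popped and they were enabled to begin with.
   Returns 0 on success, -1 if there is no matching pushcli_sim(). */
int popcli_sim(struct cpu_sim *c) {
  if (c->ncli <= 0)
    return -1;
  c->ncli--;
  if (c->ncli == 0 && c->intena)
    c->interrupts = 1;
  return 0;
}
```

<p>After two pushes, the first pop leaves the simulated flag at 0 and only the second pop restores it to 1, which matches the case of releasing one lock while another is still held.</p>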

<h2 id="在xv6中实现信号量">Implementing a Semaphore in Xv6</h2>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">semaphore</span> <span class="p">{</span>
  <span class="kt">int</span> <span class="n">value</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">spinlock</span> <span class="n">lock</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">proc</span> <span class="o">*</span><span class="n">queue</span><span class="p">[</span><span class="n">NPROC</span><span class="p">];</span>
  <span class="kt">int</span> <span class="n">end</span><span class="p">;</span>
  <span class="kt">int</span> <span class="n">start</span><span class="p">;</span>
<span class="p">};</span>

<span class="kt">void</span> <span class="nf">sem_init</span><span class="p">(</span><span class="k">struct</span> <span class="n">semaphore</span> <span class="o">*</span><span class="n">s</span><span class="p">,</span> <span class="kt">int</span> <span class="n">value</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">s</span><span class="o">-&gt;</span><span class="n">value</span> <span class="o">=</span> <span class="n">value</span><span class="p">;</span>
  <span class="n">initlock</span><span class="p">(</span><span class="o">&amp;</span><span class="n">s</span><span class="o">-&gt;</span><span class="n">lock</span><span class="p">,</span> <span class="s">"semaphore_lock"</span><span class="p">);</span>
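  <span class="n">s</span><span class="o">-&gt;</span><span class="n">end</span> <span class="o">=</span> <span class="n">s</span><span class="o">-&gt;</span><span class="n">start</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>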
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">sem_wait</span><span class="p">(</span><span class="k">struct</span> <span class="n">semaphore</span> <span class="o">*</span><span class="n">s</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">acquire</span><span class="p">(</span><span class="o">&amp;</span><span class="n">s</span><span class="o">-&gt;</span><span class="n">lock</span><span class="p">);</span>
  <span class="n">s</span><span class="o">-&gt;</span><span class="n">value</span><span class="o">--</span><span class="p">;</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">s</span><span class="o">-&gt;</span><span class="n">value</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">s</span><span class="o">-&gt;</span><span class="n">queue</span><span class="p">[</span><span class="n">s</span><span class="o">-&gt;</span><span class="n">end</span><span class="p">]</span> <span class="o">=</span> <span class="n">myproc</span><span class="p">();</span>
    <span class="n">s</span><span class="o">-&gt;</span><span class="n">end</span> <span class="o">=</span> <span class="p">(</span><span class="n">s</span><span class="o">-&gt;</span><span class="n">end</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">%</span> <span class="n">NPROC</span><span class="p">;</span>
    <span class="n">sleep</span><span class="p">(</span><span class="n">myproc</span><span class="p">(),</span> <span class="o">&amp;</span><span class="n">s</span><span class="o">-&gt;</span><span class="n">lock</span><span class="p">);</span>
  <span class="p">}</span>
  <span class="n">release</span><span class="p">(</span><span class="o">&amp;</span><span class="n">s</span><span class="o">-&gt;</span><span class="n">lock</span><span class="p">);</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">sem_signal</span><span class="p">(</span><span class="k">struct</span> <span class="n">semaphore</span> <span class="o">*</span><span class="n">s</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">acquire</span><span class="p">(</span><span class="o">&amp;</span><span class="n">s</span><span class="o">-&gt;</span><span class="n">lock</span><span class="p">);</span>
  <span class="n">s</span><span class="o">-&gt;</span><span class="n">value</span><span class="o">++</span><span class="p">;</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">s</span><span class="o">-&gt;</span><span class="n">value</span> <span class="o">&lt;=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">wakeup</span><span class="p">(</span><span class="n">s</span><span class="o">-&gt;</span><span class="n">queue</span><span class="p">[</span><span class="n">s</span><span class="o">-&gt;</span><span class="n">start</span><span class="p">]);</span>
    <span class="n">s</span><span class="o">-&gt;</span><span class="n">queue</span><span class="p">[</span><span class="n">s</span><span class="o">-&gt;</span><span class="n">start</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="n">s</span><span class="o">-&gt;</span><span class="n">start</span> <span class="o">=</span> <span class="p">(</span><span class="n">s</span><span class="o">-&gt;</span><span class="n">start</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">%</span> <span class="n">NPROC</span><span class="p">;</span>
  <span class="p">}</span>
  <span class="n">release</span><span class="p">(</span><span class="o">&amp;</span><span class="n">s</span><span class="o">-&gt;</span><span class="n">lock</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The code above implements a semaphore on top of the interfaces that Xv6 provides, with a format and naming similar to the POSIX standard. The implementation uses a wait queue. When a process blocks on the semaphore, it puts itself into the wait queue and goes to sleep (lines 18-22). When a process releases the semaphore, it takes one process out of the wait queue so that it can continue running (lines 29-33).</p>]]></content><author><name>Hao He</name><email>haohe@andrew.cmu.edu</email></author><category term="Chinese" /><category term="C" /><category term="Operating System" /><summary type="html"><![CDATA[Locks are an important mechanism for process synchronization in an operating system.]]></summary></entry><entry><title type="html">Mining GitHub Repository Information using the Official REST API</title><link href="https://hehao98.github.io/posts/2019/04/mining-github-repo/" rel="alternate" type="text/html" title="Mining GitHub Repository Information using the Official REST API" /><published>2019-04-04T00:00:00-07:00</published><updated>2019-04-04T00:00:00-07:00</updated><id>https://hehao98.github.io/posts/2019/04/mining-github-repos</id><content type="html" xml:base="https://hehao98.github.io/posts/2019/04/mining-github-repo/"><![CDATA[<p>GitHub provides an HTTP API for requesting information from GitHub, although it is not very convenient or well documented. The <code class="language-plaintext highlighter-rouge">https://api.github.com/search/repositories</code> endpoint returns repository information in JSON format. You can apply various search conditions and sort the results if necessary. For example, to collect the 1000 most-starred repositories whose language is Java, you can use the following request.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>https://api.github.com/search/repositories?q=language:java&amp;sort=stars&amp;order=desc
</code></pre></div></div>

<p>See the following links for the complete documentation.</p>

<ol>
  <li><a href="https://developer.github.com/v3/search/">https://developer.github.com/v3/search/</a></li>
  <li><a href="https://help.github.com/en/articles/searching-for-repositories">https://help.github.com/en/articles/searching-for-repositories</a></li>
</ol>

<p>However, there are several restrictions (restriction 1 is not documented):</p>

<ol>
  <li>Only one page of results (30 items) is returned for each request.</li>
  <li>You are limited to 10 requests per minute (30 requests per minute if authenticated).</li>
  <li>You can only get up to 1000 search results for one set of given conditions.</li>
</ol>

<p>Therefore, you cannot get more than 1000 results for a given search request, which limits the scale of possible analysis, and you cannot send more than 10 unauthenticated requests per minute. You also have to fetch the results page by page with the <code class="language-plaintext highlighter-rouge">page</code> parameter, using this list of URLs:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>https://api.github.com/search/repositories?q=language:java&amp;sort=stars&amp;order=desc&amp;page=1
https://api.github.com/search/repositories?q=language:java&amp;sort=stars&amp;order=desc&amp;page=2
...
https://api.github.com/search/repositories?q=language:java&amp;sort=stars&amp;order=desc&amp;page=34
</code></pre></div></div>

<p>Note that the maximum page number is 34 due to the 1000 result restriction.</p>

<p>If the request contains an error, the error message is in the <code class="language-plaintext highlighter-rouge">message</code> field of the returned JSON object. Otherwise, the array of repository information is in the <code class="language-plaintext highlighter-rouge">items</code> field.</p>

<h2 id="example-implementation-in-python">Example Implementation in Python</h2>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="s">'''
Returns a json object that contains information of GitHub repos returned by GitHub REST v3 API

Example search url: https://api.github.com/search/repositories?q=language:java&amp;sort=stars&amp;order=desc
This URL collects GitHub Java projects sorted by stars in descending order.
Remember that:
1. The results are returned in pages, so you have to fetch them page by page
2. You are limited to 10 requests per minute
3. You can only get up to 1000 search results

Reference Documentation: 
https://developer.github.com/v3/search/
https://help.github.com/en/articles/searching-for-repositories
'''</span>
<span class="k">def</span> <span class="nf">get_repolist_by_stars</span><span class="p">(</span><span class="n">num</span><span class="o">=</span><span class="mi">30</span><span class="p">,</span> <span class="n">lang</span><span class="o">=</span><span class="s">''</span><span class="p">):</span>
    <span class="n">url</span> <span class="o">=</span> <span class="s">'https://api.github.com/search/repositories'</span>
    <span class="n">params</span> <span class="o">=</span> <span class="p">{</span><span class="s">'q'</span><span class="p">:</span><span class="s">'stars:&gt;1000'</span><span class="p">,</span> <span class="s">'sort'</span><span class="p">:</span><span class="s">'stars'</span><span class="p">,</span> <span class="s">'order'</span><span class="p">:</span><span class="s">'desc'</span><span class="p">,</span> <span class="s">'page'</span><span class="p">:</span><span class="s">'1'</span><span class="p">}</span>
    <span class="n">repolist</span> <span class="o">=</span> <span class="p">[]</span>

    <span class="k">if</span> <span class="n">lang</span> <span class="o">!=</span> <span class="s">''</span><span class="p">:</span>
        <span class="n">params</span><span class="p">[</span><span class="s">'q'</span><span class="p">]</span> <span class="o">=</span> <span class="s">'language:'</span> <span class="o">+</span> <span class="n">lang</span>

    <span class="k">print</span><span class="p">(</span><span class="s">'Sending HTTP requests to GitHub, may need several minutes to complete...'</span><span class="p">)</span>

    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="p">(</span><span class="n">num</span> <span class="o">+</span> <span class="mi">29</span><span class="p">)</span> <span class="o">//</span> <span class="mi">30</span> <span class="o">+</span> <span class="mi">1</span><span class="p">):</span>  <span class="c1"># ceil(num / 30) pages of 30 results each</span>
        <span class="n">params</span><span class="p">[</span><span class="s">'page'</span><span class="p">]</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">i</span><span class="p">)</span>
        <span class="n">json</span> <span class="o">=</span> <span class="n">requests</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="n">params</span><span class="p">).</span><span class="n">json</span><span class="p">()</span>
        <span class="k">if</span> <span class="n">json</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">'items'</span><span class="p">)</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
            <span class="k">print</span><span class="p">(</span><span class="s">'Error: No result in page '</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="o">+</span> <span class="s">'!'</span><span class="p">)</span>
            <span class="k">print</span><span class="p">(</span><span class="s">'Message from GitHub: '</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">json</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">'message'</span><span class="p">)))</span>
            <span class="k">break</span>  <span class="c1"># Stop here: an error response has no 'items' to append</span>

        <span class="n">repolist</span><span class="p">.</span><span class="n">extend</span><span class="p">(</span><span class="n">json</span><span class="p">[</span><span class="s">'items'</span><span class="p">])</span>

        <span class="k">print</span><span class="p">(</span><span class="s">'Downloaded repository information in page '</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">i</span><span class="p">))</span>
        <span class="n">time</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">7</span><span class="p">)</span> <span class="c1"># This rate is imposed by GitHub
</span>
    <span class="k">return</span> <span class="n">repolist</span><span class="p">[</span><span class="mi">0</span><span class="p">:</span><span class="n">num</span><span class="p">]</span>
</code></pre></div></div>]]></content><author><name>Hao He</name><email>haohe@andrew.cmu.edu</email></author><category term="English" /><category term="Python" /><category term="Mining Software Repositories" /><summary type="html"><![CDATA[GitHub provides a (not very convenient or well-documented) HTTP API for requesting information from GitHub. We can use https://api.github.com/search/repositories for requesting repository information in JSON format. You can apply various search conditions and sort them if necessary. For example, if you want to collect 1000 most starred repositories whose language is Java, you can use the following request.]]></summary></entry><entry><title type="html">Implementing a RISC-V CPU Simulator Based on a Five-Stage Pipeline</title><link href="https://hehao98.github.io/posts/2019/03/riscv-simulator/" rel="alternate" type="text/html" title="Implementing a RISC-V CPU Simulator Based on a Five-Stage Pipeline" /><published>2019-03-27T00:00:00-07:00</published><updated>2019-03-27T00:00:00-07:00</updated><id>https://hehao98.github.io/posts/2019/03/riscv-simulator</id><content type="html" xml:base="https://hehao98.github.io/posts/2019/03/riscv-simulator/"><![CDATA[<p>RISC-V is an open architecture and instruction set standard that originated at Berkeley. This simulator implements the RV64I instruction set defined in RISC-V Specification 2.2. It is built on the standard five-stage pipeline and includes a branch prediction module and virtual memory simulation. Implementing a complete CPU simulator is good practice for systems programming and deepens one's understanding of computer architecture. Before starting, you should read and thoroughly understand Chapter 4 of Computer Systems: A Programmer’s Perspective, or the relevant chapters of Computer Organization and Design: The Hardware/Software Interface.</p>

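<p>The code for this simulator is on GitHub: <a href="https://github.com/hehao98/RISCV-Simulator">https://github.com/hehao98/RISCV-Simulator</a></p>

<h2 id="一开发环境">1. Development Environment</h2>

<h3 id="11-risc-v环境的安装与配置">1.1 Installing and Configuring the RISC-V Environment</h3>

<p>First, a RISC-V environment for compiling, running, and testing must be set up. For simplicity, this project is based entirely on the RV64I instruction set, and the reference standard is RISC-V Specification 2.2. I took the following steps to configure the environment.</p>

<ol>
  <li>Downloaded <code class="language-plaintext highlighter-rouge">riscv-tools</code> from GitHub, then configured, compiled, and installed <code class="language-plaintext highlighter-rouge">riscv-gnu-toolchain</code> from it for the Linux platform.</li>
  <li>To use the official simulator as a reference, downloaded, compiled, and installed <code class="language-plaintext highlighter-rouge">riscv-qemu</code> from GitHub.</li>
</ol>

<p>Note in particular that when building <code class="language-plaintext highlighter-rouge">riscv-gnu-toolchain</code>, you must specify RV64I as the instruction set used by the toolchain and the C standard library. Otherwise, the compiler will use extensions such as RV64C and RV64D. Even if the compiler is told to use only RV64I at compile time, it will still link in standard library functions that use the extensions. Therefore, to obtain ELF programs that use only the standard RV64I instruction set, <code class="language-plaintext highlighter-rouge">riscv-gnu-toolchain</code> must be rebuilt with the following options:</p>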

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir build; cd build
../configure --with-arch=rv64i --prefix=/path/to/riscv64i
make -j$(nproc)
</code></pre></div></div>

<p>When compiling, use <code class="language-plaintext highlighter-rouge">-march=rv64i</code> to have the compiler generate an ELF program that targets the standard RV64I instruction set.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>riscv64-unknown-elf-gcc -march=rv64i test/arithmetic.c test/lib.c -o riscv-elf/arithmetic.riscv
</code></pre></div></div>

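<h3 id="12-使用的测试程序和测试方法">1.2 Test Programs and Testing Method</h3>

<p>Testing an architecture simulator is somewhat difficult, mainly because the large number of instructions and the size of the code make it hard to reach 100% test coverage of the simulator. To make testing easier, this simulator uses a set of test programs ranging from simple to complex, and provides interfaces for single-stepping and for printing the CPU state. In addition, to help with debugging and performance analysis, it includes a module that records execution history, so that when a program fails, the complete instruction history and a memory snapshot are available for analysis.</p>

<p>To test the RISC-V simulator, I wrote the following programs (see the <code class="language-plaintext highlighter-rouge">test/</code> folder). The more complex ones are quicksort, matrix multiplication, and the Ackermann function. Quicksort and matrix multiplication involve a relatively large number of instructions and data, while computing the Ackermann function involves very deep recursion.</p>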

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>lib.c             # Custom system call implementation
helloworld.c      # The simplest program
test_arithmetic.c # Tests for arithmetic instructions
test_branch.c     # Tests for basic branches
test_syscall.c    # Tests for system calls
quicksort.c       # Quicksort
matrixmulti.c     # Matrix multiplication
ackermann.c       # Computes the Ackermann function
</code></pre></div></div>

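<p>All compiled binaries and their disassembled assembly code are stored in the <code class="language-plaintext highlighter-rouge">riscv-elf/</code> folder.</p>

<h2 id="二设计概述">2. Design Overview</h2>

<h3 id="21-开发环境">2.1 Development Environment</h3>

<p>I tested the simulator on Mac OS X. The programming language is C++11, the build system is CMake, the compiler is <code class="language-plaintext highlighter-rouge">Apple Clang 10.0.0</code>, and the compile flags are <code class="language-plaintext highlighter-rouge">-O2 -Wall</code>. The development tool is VS Code. The simulator code avoids platform-specific features outside the standard library as much as possible, so it should also compile and run on other platforms and with other compilers.</p>

<h3 id="22-设计考量">2.2 Design Considerations</h3>

<p>First, the simulator must be robust. Specifically, it must handle all kinds of illegal input, including abnormal memory accesses, malformed ELF files, illegal instructions, and invalid memory addresses. Writing careful and thorough error handling not only builds systems programming skill but also helps catch subtle bugs early.</p>

<p>Second, the implementation must be simple, easy to understand, and easy to debug. This is a course-project-level simulator with limited implementation time, so the code must be simple and the debugging facilities complete, in order to minimize the time spent writing and debugging the program.</p>

<p>In addition, the main purpose of the simulator is simple performance evaluation. It must therefore stay as close as possible to the pipeline hardware and be extensible with features such as branch prediction and cache simulation, so that pipeline performance and various branch prediction and cache simulation strategies can be tested and evaluated on real programs.</p>

<p>This simulator is not meant to be a mature, industrial-grade architecture simulator. In other words, it does not prioritize performance or completeness of features. In terms of performance, the simulator runs slowly on extremely complex and large programs and may consume too much memory; optimizing the simulator itself is outside the scope of this project. In terms of features, to keep the implementation simple, the simulator uses custom system calls rather than Linux-compatible ones, so it can only run RISC-V programs compiled specifically for it (see the <code class="language-plaintext highlighter-rouge">test/</code> folder for the source code).</p>

<h3 id="23-编译与运行">2.3 Building and Running</h3>

<p>The simulator builds like a typical CMake project, and CMake must be installed first. On Linux or Mac OS X, use the following commands:</p>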

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir build
cd build
cmake ..
make
</code></pre></div></div>

<p>The build produces the executable <code class="language-plaintext highlighter-rouge">Simulator</code>. The simulator is a command-line program, invoked as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./Simulator riscv-elf-file-name [-v] [-s] [-d] [-b param]
Parameters: 
        [-v] verbose output 
        [-s] single step
        [-d] dump memory and register trace to dump.txt
        [-b param] branch perdiction strategy, accepted param AT, NT, BTFNT, BPB
</code></pre></div></div>

<p>Here <code class="language-plaintext highlighter-rouge">riscv-elf-file-name</code> is an executable RISC-V ELF file, such as any of the <code class="language-plaintext highlighter-rouge">*.riscv</code> files in the <code class="language-plaintext highlighter-rouge">riscv-elf/</code> folder. A typical run and its output look like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hehaodeMacBook-Pro:build hehao$ ./Simulator ../riscv-elf/ackermann.riscv
Ackermann(0,0) = 1
Ackermann(0,1) = 2
Ackermann(0,2) = 3
Ackermann(0,3) = 4
Ackermann(0,4) = 5
Ackermann(1,0) = 2
Ackermann(1,1) = 3
Ackermann(1,2) = 4
Ackermann(1,3) = 5
Ackermann(1,4) = 6
Ackermann(2,0) = 3
Ackermann(2,1) = 5
Ackermann(2,2) = 7
Ackermann(2,3) = 9
Ackermann(2,4) = 11
Ackermann(3,0) = 5
Ackermann(3,1) = 13
Ackermann(3,2) = 29
Ackermann(3,3) = 61
Ackermann(3,4) = 125
Program exit from an exit() system call
------------ STATISTICS -----------
Number of Instructions: 430754
Number of Cycles: 574548
Avg Cycles per Instrcution: 1.3338
Branch Perdiction Accuacy: 0.5045 (Strategy: Always Not Taken)
Number of Control Hazards: 48010
Number of Data Hazards: 279916
Number of Memory Hazards: 47774
-----------------------------------
</code></pre></div></div>

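<p>With the default settings, the output of the executed program is printed first, followed by a set of statistics about CPU execution.</p>

<p>For single-step debugging, use the <code class="language-plaintext highlighter-rouge">-s</code> and <code class="language-plaintext highlighter-rouge">-v</code> options:</p>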

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./Simulator ../riscv-elf/ackermann.riscv -s -v
</code></pre></div></div>

<p>The output is as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hehaodeMacBook-Pro:build hehao$ ./Simulator ../riscv-elf/ackermann.riscv -s -v
==========ELF Information==========
Type: ELF64
Encoding: Little Endian
ISA: RISC-V(0xf3)
Number of Sections: 19
ID      Name            Address Size
[0]                     0x0     0
[1]     .text           0x100b0 3668
[2]     .rodata         0x10f08 29
[3]     .eh_frame       0x10f28 4
[4]     .init_array     0x11000 8
[5]     .fini_array     0x11008 8
[6]     .data           0x11010 1864
[7]     .sdata          0x11758 24
[8]     .sbss           0x11770 8
[9]     .bss            0x11778 72
[10]    .comment        0x0     26
[11]    .debug_aranges  0x0     48
[12]    .debug_info     0x0     46
[13]    .debug_abbrev   0x0     20
[14]    .debug_line     0x0     222
[15]    .debug_str      0x0     267
[16]    .symtab         0x0     2616
[17]    .strtab         0x0     913
[18]    .shstrtab       0x0     172
Number of Segments: 2
ID      Flags   Address FSize   MSize
[0]     0x5     0x10000 3884    3884
[1]     0x6     0x11000 1904    1984
===================================
Memory Pages: 
0x0-0x400000:
  0x10000-0x11000
  0x11000-0x12000
Fetched instruction 0x00002197 at address 0x100b0
Decode: Bubble
Execute: Bubble
Memory Access: Bubble
WriteBack: Bubble
------------ CPU STATE ------------
PC: 0x100b4
zero: 0x00000000(0) ra: 0x00000000(0) sp: 0x80000000(2147483648) gp: 0x00000000(0) 
tp: 0x00000000(0) t0: 0x00000000(0) t1: 0x00000000(0) t2: 0x00000000(0) 
s0: 0x00000000(0) s1: 0x00000000(0) a0: 0x00000000(0) a1: 0x00000000(0) 
a2: 0x00000000(0) a3: 0x00000000(0) a4: 0x00000000(0) a5: 0x00000000(0) 
a6: 0x00000000(0) a7: 0x00000000(0) s2: 0x00000000(0) s3: 0x00000000(0) 
s4: 0x00000000(0) s5: 0x00000000(0) s6: 0x00000000(0) s7: 0x00000000(0) 
s8: 0x00000000(0) s9: 0x00000000(0) s10: 0x00000000(0) s11: 0x00000000(0) 
t3: 0x00000000(0) t4: 0x00000000(0) t5: 0x00000000(0) t6: 0x00000000(0) 
-----------------------------------
Type d to dump memory in dump.txt, press ENTER to continue: 
</code></pre></div></div>

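<p>In single-step mode, you can type <code class="language-plaintext highlighter-rouge">d</code> to save a memory snapshot and press ENTER to advance to the next instruction. The information shown on the command line includes the ELF information, the pipeline state, and the CPU register state.</p>

<p>Using the <code class="language-plaintext highlighter-rouge">-v</code> option and redirecting standard output gives the complete history of pipeline and register states.</p>

<p>In addition, the <code class="language-plaintext highlighter-rouge">-b</code> option selects a branch prediction strategy, for example:</p>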

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./Simulator ../riscv-elf/ackermann.riscv -b AT
./Simulator ../riscv-elf/ackermann.riscv -b NT
./Simulator ../riscv-elf/ackermann.riscv -b BTFNT
./Simulator ../riscv-elf/ackermann.riscv -b BPB
</code></pre></div></div>

<p>Here AT stands for Always Taken, NT for Not Taken, BTFNT for Back Taken Forward Not Taken, and BPB for Branch Prediction Buffer.</p>
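
<p>To make the four options more concrete, here is a minimal sketch of how such strategies are commonly implemented. It is not the simulator's actual <code class="language-plaintext highlighter-rouge">BranchPredictor</code> code: the names <code class="language-plaintext highlighter-rouge">Strategy</code>, <code class="language-plaintext highlighter-rouge">PredictorSketch</code>, <code class="language-plaintext highlighter-rouge">predict</code>, and <code class="language-plaintext highlighter-rouge">update</code>, the 4096-entry table, and the use of 2-bit saturating counters for BPB are all assumptions made for illustration.</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;array&gt;
#include &lt;cstdint&gt;

// Hypothetical names; the real BranchPredictor class may look different.
enum class Strategy { AT, NT, BTFNT, BPB };

class PredictorSketch {
public:
  explicit PredictorSketch(Strategy s) : strategy(s) { counters.fill(1); }

  // Predict whether the branch at `pc` with byte offset `offset` is taken.
  bool predict(uint32_t pc, int64_t offset) const {
    switch (strategy) {
    case Strategy::AT:    return true;
    case Strategy::NT:    return false;
    case Strategy::BTFNT: return offset &lt; 0;  // backward branches are usually loops
    case Strategy::BPB:   return counters[index(pc)] &gt;= 2;
    }
    return false;
  }

  // Train the 2-bit saturating counter with the real outcome (used by BPB).
  void update(uint32_t pc, bool taken) {
    uint8_t &amp;c = counters[index(pc)];
    if (taken &amp;&amp; c &lt; 3) ++c;
    if (!taken &amp;&amp; c &gt; 0) --c;
  }

private:
  static constexpr uint32_t kEntries = 4096;  // assumed table size
  // Instructions are 4-byte aligned, so the low 2 bits of the PC are dropped.
  static uint32_t index(uint32_t pc) { return (pc &gt;&gt; 2) % kEntries; }

  Strategy strategy;
  std::array&lt;uint8_t, kEntries&gt; counters;  // 0,1 = not taken; 2,3 = taken
};
</code></pre></div></div>

<p>In this sketch the three static strategies need no state at all, while BPB only changes its prediction for a given branch after that branch has been observed via <code class="language-plaintext highlighter-rouge">update()</code>.</p>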

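<h3 id="24-代码架构">2.4 Code Architecture</h3>

<p><img src="https://hehao98.github.io/assets/riscv/arch.png" alt="" /></p>

<p>The figure above gives an overview of the simulator's code architecture. The entry point is <code class="language-plaintext highlighter-rouge">Main.cpp</code>, which parses arguments, loads the ELF file, initializes the simulator, and finally calls the simulator's <code class="language-plaintext highlighter-rouge">simulate()</code> function to start execution. Unless the simulator encounters an error, <code class="language-plaintext highlighter-rouge">simulate()</code> in principle never returns.</p>

<p>The simulator itself is designed as one large class, <code class="language-plaintext highlighter-rouge">class Simulator</code> in the code (see <code class="language-plaintext highlighter-rouge">Simulator.h</code> and <code class="language-plaintext highlighter-rouge">Simulator.cpp</code>). The data in the <code class="language-plaintext highlighter-rouge">Simulator</code> class includes the PC, the general-purpose registers, the pipeline registers, the execution history recorder, the memory module, and the branch prediction module. Because the memory module and the branch prediction module are relatively independent, they are implemented as two separate classes, <code class="language-plaintext highlighter-rouge">MemoryManager</code> and <code class="language-plaintext highlighter-rouge">BranchPredictor</code>.</p>

<p>The core function of the simulator is <code class="language-plaintext highlighter-rouge">simulate()</code>, which performs cycle-level simulation. In each simulated cycle it runs five functions: <code class="language-plaintext highlighter-rouge">fetch()</code>, <code class="language-plaintext highlighter-rouge">decode()</code>, <code class="language-plaintext highlighter-rouge">execute()</code>, <code class="language-plaintext highlighter-rouge">accessMemory()</code>, and <code class="language-plaintext highlighter-rouge">writeBack()</code>. Each function takes the pipeline registers of the previous cycle as input and writes to the pipeline registers of the next cycle. At the end of the cycle, the contents of the new registers are copied into the registers used as input. During execution, each function handles the relevant data, control, and memory access hazards, and records history information where appropriate. Because the interactions among them are fairly complex, they are not drawn in the figure above. The code of these functions is too long to include here, so see <code class="language-plaintext highlighter-rouge">src/Simulator.cpp</code> for more implementation details.</p>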
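
<p>The double-buffered pipeline registers described above can be illustrated with a stripped-down sketch. Everything here is a simplifying assumption: <code class="language-plaintext highlighter-rouge">StageReg</code>, <code class="language-plaintext highlighter-rouge">PipelineSketch</code>, and <code class="language-plaintext highlighter-rouge">step()</code> are hypothetical names, the stages only pass instructions along, and no hazards are modeled. The point is only to show why each stage reads the old registers and writes the new ones.</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cstdint&gt;

// Hypothetical, heavily simplified pipeline register. The real ones carry
// decoded operands, destination registers, and more.
struct StageReg {
  bool bubble = true;  // true means this stage holds no real instruction
  uint32_t inst = 0;
  uint64_t pc = 0;
};

struct PipelineSketch {
  StageReg fReg, dReg, eReg, mReg;              // read during this cycle
  StageReg fRegNew, dRegNew, eRegNew, mRegNew;  // written during this cycle
  uint64_t pc = 0;
  uint64_t cycles = 0;
  uint64_t retired = 0;

  // Each stage reads only the old registers and writes only the new ones,
  // so the order in which the stage functions run does not matter.
  void fetch() {
    fRegNew.bubble = false;
    fRegNew.inst = 0x00000013;  // placeholder nop; a real fetch reads memory at pc
    fRegNew.pc = pc;
    pc += 4;
  }
  void decode() { dRegNew = fReg; }
  void execute() { eRegNew = dReg; }
  void accessMemory() { mRegNew = eReg; }
  void writeBack() {
    if (!mReg.bubble) ++retired;
  }

  // One simulated cycle.
  void step() {
    fetch();
    decode();
    execute();
    accessMemory();
    writeBack();
    // End of cycle: the new contents become the next cycle's inputs.
    fReg = fRegNew;
    dReg = dRegNew;
    eReg = eRegNew;
    mReg = mRegNew;
    ++cycles;
  }
};
</code></pre></div></div>

<p>With this structure the first instruction reaches write-back in the fifth cycle, which is the expected fill latency of a five-stage pipeline.</p>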

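<h2 id="三具体设计和实现">3. Detailed Design and Implementation</h2>

<h3 id="31-内存管理模块memorymanager">3.1 The Memory Management Module <code class="language-plaintext highlighter-rouge">MemoryManager</code></h3>

<p><code class="language-plaintext highlighter-rouge">MemoryManager</code> provides the simulator with a simple, easy-to-use memory access interface. It must support accesses of any size and at any address, and must detect accesses to invalid addresses. In fact, this is very similar to the virtual memory mechanism of an operating system. <code class="language-plaintext highlighter-rouge">MemoryManager</code> is therefore implemented internally with a mechanism similar to the two-level page table used in the x86 architecture. Specifically, the 32-bit address space is logically divided into pages of 4KB (2^12 bytes). The top 10 bits of an address index the first-level page table, the next 10 bits index the second-level page table, and the last 12 bits are the offset within a page.</p>

<p>The page table structure can be declared as follows:</p>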

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">uint8_t</span> <span class="o">**</span><span class="n">memory</span><span class="p">[</span><span class="mi">1024</span><span class="p">];</span>
</code></pre></div></div>

<p>Here <code class="language-plaintext highlighter-rouge">memory</code> is the first-level page table, an array of length 1024; <code class="language-plaintext highlighter-rouge">memory[i]</code> points to a second-level page table array of length 1024; <code class="language-plaintext highlighter-rouge">memory[i][j]</code> points to an actual memory page; and <code class="language-plaintext highlighter-rouge">memory[i][j][k]</code> yields the byte at address <code class="language-plaintext highlighter-rouge">(i&lt;&lt;22)|(j&lt;&lt;12)|k</code>. The entries of <code class="language-plaintext highlighter-rouge">memory</code> can be dynamically allocated and freed as needed. An example of how the simulator accesses <code class="language-plaintext highlighter-rouge">memory</code> is shown below:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">uint8_t</span> <span class="n">MemoryManager</span><span class="o">::</span><span class="n">getByte</span><span class="p">(</span><span class="kt">uint32_t</span> <span class="n">addr</span><span class="p">)</span> <span class="p">{</span>
  <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="k">this</span><span class="o">-&gt;</span><span class="n">isAddrExist</span><span class="p">(</span><span class="n">addr</span><span class="p">))</span> <span class="p">{</span>
    <span class="n">dbgprintf</span><span class="p">(</span><span class="s">"Byte read to invalid addr 0x%x!</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">addr</span><span class="p">);</span>
    <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
  <span class="p">}</span>
  <span class="kt">uint32_t</span> <span class="n">i</span> <span class="o">=</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">getFirstEntryId</span><span class="p">(</span><span class="n">addr</span><span class="p">);</span>
  <span class="kt">uint32_t</span> <span class="n">j</span> <span class="o">=</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">getSecondEntryId</span><span class="p">(</span><span class="n">addr</span><span class="p">);</span>
  <span class="kt">uint32_t</span> <span class="n">k</span> <span class="o">=</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">getPageOffset</span><span class="p">(</span><span class="n">addr</span><span class="p">);</span>
  <span class="k">return</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">memory</span><span class="p">[</span><span class="n">i</span><span class="p">][</span><span class="n">j</span><span class="p">][</span><span class="n">k</span><span class="p">];</span>
<span class="p">}</span>
</code></pre></div></div>

<p>For more on the implementation of <code class="language-plaintext highlighter-rouge">MemoryManager</code>, see <code class="language-plaintext highlighter-rouge">src/MemoryManager.cpp</code>.</p>
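
<p>The 10/10/12 address split can be written out as three small helpers. The function names below match the ones called in <code class="language-plaintext highlighter-rouge">getByte()</code> above, but their bodies are an assumption derived from the description in this section rather than a copy of the actual source, and <code class="language-plaintext highlighter-rouge">makeAddr</code> is a hypothetical helper added for illustration. They are shown as free functions to keep the sketch self-contained.</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cstdint&gt;

// Top 10 bits: index into the first-level page table.
inline uint32_t getFirstEntryId(uint32_t addr) { return (addr &gt;&gt; 22) &amp; 0x3FF; }

// Middle 10 bits: index into the second-level page table.
inline uint32_t getSecondEntryId(uint32_t addr) { return (addr &gt;&gt; 12) &amp; 0x3FF; }

// Low 12 bits: byte offset inside the 4KB page.
inline uint32_t getPageOffset(uint32_t addr) { return addr &amp; 0xFFF; }

// Rebuild an address from its three parts, mirroring (i&lt;&lt;22)|(j&lt;&lt;12)|k.
inline uint32_t makeAddr(uint32_t i, uint32_t j, uint32_t k) {
  return (i &lt;&lt; 22) | (j &lt;&lt; 12) | k;
}
</code></pre></div></div>

<p>For example, the entry address <code class="language-plaintext highlighter-rouge">0x100b0</code> from the single-step output earlier splits into first-level index 0, second-level index 16, and page offset <code class="language-plaintext highlighter-rouge">0xb0</code>, which is consistent with the page <code class="language-plaintext highlighter-rouge">0x10000-0x11000</code> listed under <code class="language-plaintext highlighter-rouge">0x0-0x400000</code> in that output.</p>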

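<h3 id="32-可执行文件的装载初始化">3.2 Loading and Initializing the Executable</h3>

<p>The executable-loading part of this simulator uses ELFIO, an open-source library on GitHub (https://github.com/serge1/ELFIO). Because the library is header-only, adding it to the project is easy. The headers are in the <code class="language-plaintext highlighter-rouge">include/</code> folder.</p>

<p>Loading an ELF file with this library is straightforward:</p>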

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Read ELF file</span>
<span class="n">ELFIO</span><span class="o">::</span><span class="n">elfio</span> <span class="n">reader</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">reader</span><span class="p">.</span><span class="n">load</span><span class="p">(</span><span class="n">elfFile</span><span class="p">))</span> <span class="p">{</span>
  <span class="n">fprintf</span><span class="p">(</span><span class="n">stderr</span><span class="p">,</span> <span class="s">"Fail to load ELF file %s!</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">elfFile</span><span class="p">);</span>
  <span class="k">return</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

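<p>The code that loads the ELF file into memory is shown below. It simply copies each segment to its specified memory location according to the ELF headers. The only thing to watch for is that the data length in the file may be smaller than the specified length in memory, in which case the remainder must be filled with zeros. Note that the simulator was not designed to support memory beyond the 32-bit address space, because user programs with such a large memory footprint are rare and do not occur among our test programs.</p>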

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">loadElfToMemory</span><span class="p">(</span><span class="n">ELFIO</span><span class="o">::</span><span class="n">elfio</span> <span class="o">*</span><span class="n">reader</span><span class="p">,</span> <span class="n">MemoryManager</span> <span class="o">*</span><span class="n">memory</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">ELFIO</span><span class="o">::</span><span class="n">Elf_Half</span> <span class="n">seg_num</span> <span class="o">=</span> <span class="n">reader</span><span class="o">-&gt;</span><span class="n">segments</span><span class="p">.</span><span class="n">size</span><span class="p">();</span>
  <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">seg_num</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">const</span> <span class="n">ELFIO</span><span class="o">::</span><span class="n">segment</span> <span class="o">*</span><span class="n">pseg</span> <span class="o">=</span> <span class="n">reader</span><span class="o">-&gt;</span><span class="n">segments</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>

    <span class="kt">uint64_t</span> <span class="n">fullmemsz</span> <span class="o">=</span> <span class="n">pseg</span><span class="o">-&gt;</span><span class="n">get_memory_size</span><span class="p">();</span>
    <span class="kt">uint64_t</span> <span class="n">fulladdr</span> <span class="o">=</span> <span class="n">pseg</span><span class="o">-&gt;</span><span class="n">get_virtual_address</span><span class="p">();</span>
    <span class="c1">// Our 32bit simulator cannot handle this</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">fulladdr</span> <span class="o">+</span> <span class="n">fullmemsz</span> <span class="o">&gt;</span> <span class="mh">0xFFFFFFFF</span><span class="p">)</span> <span class="p">{</span>
      <span class="n">dbgprintf</span><span class="p">(</span>
          <span class="s">"ELF address space larger than 32bit! Seg %d has max addr of 0x%lx</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span>
          <span class="n">i</span><span class="p">,</span> <span class="n">fulladdr</span> <span class="o">+</span> <span class="n">fullmemsz</span><span class="p">);</span>
      <span class="n">exit</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="kt">uint32_t</span> <span class="n">filesz</span> <span class="o">=</span> <span class="n">pseg</span><span class="o">-&gt;</span><span class="n">get_file_size</span><span class="p">();</span>
    <span class="kt">uint32_t</span> <span class="n">memsz</span> <span class="o">=</span> <span class="n">pseg</span><span class="o">-&gt;</span><span class="n">get_memory_size</span><span class="p">();</span>
    <span class="kt">uint32_t</span> <span class="n">addr</span> <span class="o">=</span> <span class="p">(</span><span class="kt">uint32_t</span><span class="p">)</span><span class="n">pseg</span><span class="o">-&gt;</span><span class="n">get_virtual_address</span><span class="p">();</span>

    <span class="k">for</span> <span class="p">(</span><span class="kt">uint32_t</span> <span class="n">p</span> <span class="o">=</span> <span class="n">addr</span><span class="p">;</span> <span class="n">p</span> <span class="o">&lt;</span> <span class="n">addr</span> <span class="o">+</span> <span class="n">memsz</span><span class="p">;</span> <span class="o">++</span><span class="n">p</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">memory</span><span class="o">-&gt;</span><span class="n">isPageExist</span><span class="p">(</span><span class="n">p</span><span class="p">))</span> <span class="p">{</span>
        <span class="n">memory</span><span class="o">-&gt;</span><span class="n">addPage</span><span class="p">(</span><span class="n">p</span><span class="p">);</span>
      <span class="p">}</span>

      <span class="k">if</span> <span class="p">(</span><span class="n">p</span> <span class="o">&lt;</span> <span class="n">addr</span> <span class="o">+</span> <span class="n">filesz</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">memory</span><span class="o">-&gt;</span><span class="n">setByte</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">pseg</span><span class="o">-&gt;</span><span class="n">get_data</span><span class="p">()[</span><span class="n">p</span> <span class="o">-</span> <span class="n">addr</span><span class="p">]);</span>
      <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
        <span class="n">memory</span><span class="o">-&gt;</span><span class="n">setByte</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
      <span class="p">}</span>
    <span class="p">}</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

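<p>Finally, the PC must be set manually when the simulator is initialized. The simulator needs many other initialization steps as well; see <code class="language-plaintext highlighter-rouge">src/Main.cpp</code> for details.</p>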

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">simulator</span><span class="p">.</span><span class="n">pc</span> <span class="o">=</span> <span class="n">reader</span><span class="p">.</span><span class="n">get_entry</span><span class="p">();</span>
</code></pre></div></div>

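<h3 id="33-指令语义的解析和控制信号的处理">3.3 Instruction Semantics and Control Signal Handling</h3>

<p>The code involved in this subsection is generally too long and too interdependent to be understandable in isolation, so it is not shown here. See <code class="language-plaintext highlighter-rouge">src/Simulator.cpp</code> for details.</p>

<p>Instruction fetch is implemented in <code class="language-plaintext highlighter-rouge">Simulator::fetch()</code>. Because all RV64I instructions have a fixed length of 4 bytes, it is very simple to implement.</p>

<p>Instruction decoding is implemented in <code class="language-plaintext highlighter-rouge">Simulator::decode()</code>, most of which is a direct translation of the instruction encodings specified in RISC-V Specification 2.2. To aid debugging, <code class="language-plaintext highlighter-rouge">decode()</code> also renders each instruction as a string in RISC-V assembly format. In addition, <code class="language-plaintext highlighter-rouge">decode</code> mimics a hardware implementation by abstracting fields common to all instructions, such as <code class="language-plaintext highlighter-rouge">op1</code>, <code class="language-plaintext highlighter-rouge">op2</code>, and <code class="language-plaintext highlighter-rouge">dest</code>. The branch prediction module makes its prediction in the decode stage.</p>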
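
<p>As a rough idea of what "a direct translation of the instruction encodings" means, the sketch below extracts the fixed-position fields of a 32-bit RISC-V instruction. The bit positions follow the base instruction formats in the specification; the names <code class="language-plaintext highlighter-rouge">Fields</code>, <code class="language-plaintext highlighter-rouge">splitFields</code>, <code class="language-plaintext highlighter-rouge">immI</code>, and <code class="language-plaintext highlighter-rouge">immU</code> are hypothetical and are not taken from <code class="language-plaintext highlighter-rouge">Simulator.cpp</code>.</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cstdint&gt;

// Raw fields shared by the base instruction formats (hypothetical struct).
struct Fields {
  uint32_t opcode, rd, funct3, rs1, rs2, funct7;
};

inline Fields splitFields(uint32_t inst) {
  Fields f;
  f.opcode = inst &amp; 0x7F;           // bits 6..0
  f.rd     = (inst &gt;&gt; 7) &amp; 0x1F;    // bits 11..7
  f.funct3 = (inst &gt;&gt; 12) &amp; 0x7;    // bits 14..12
  f.rs1    = (inst &gt;&gt; 15) &amp; 0x1F;   // bits 19..15
  f.rs2    = (inst &gt;&gt; 20) &amp; 0x1F;   // bits 24..20
  f.funct7 = (inst &gt;&gt; 25) &amp; 0x7F;   // bits 31..25
  return f;
}

// I-type immediate: bits 31..20, sign-extended from 12 bits.
inline int64_t immI(uint32_t inst) {
  int64_t v = (inst &gt;&gt; 20) &amp; 0xFFF;
  return (v ^ 0x800) - 0x800;  // portable sign extension
}

// U-type immediate (lui, auipc): bits 31..12 kept in place, sign-extended
// from 32 bits.
inline int64_t immU(uint32_t inst) {
  int64_t v = inst &amp; 0xFFFFF000u;
  return (v ^ 0x80000000LL) - 0x80000000LL;
}
</code></pre></div></div>

<p>Applied to the instruction <code class="language-plaintext highlighter-rouge">0x00002197</code> fetched at <code class="language-plaintext highlighter-rouge">0x100b0</code> in the single-step output earlier, this gives opcode <code class="language-plaintext highlighter-rouge">0x17</code> (AUIPC), <code class="language-plaintext highlighter-rouge">rd</code> = 3 (the <code class="language-plaintext highlighter-rouge">gp</code> register), and a U-type immediate of <code class="language-plaintext highlighter-rouge">0x2000</code>.</p>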

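<p>Instruction execution is implemented in <code class="language-plaintext highlighter-rouge">Simulator::execute()</code>, which simply performs the behavior corresponding to each instruction type. At the end, based on the current instruction and the state of the decode stage, it detects data hazards, control hazards, and memory access hazards and acts accordingly. In this stage a branch instruction learns whether it is actually taken, and on a misprediction the corresponding Bubbles are inserted into the pipeline registers.</p>

<p>Memory access is implemented in <code class="language-plaintext highlighter-rouge">Simulator::memoryAccess()</code>, which first performs the memory read or write, then detects data hazards and forwards data. When detecting data hazards, it must consider both ordinary data hazards and the case where the pipeline stalled in the previous cycle because of a memory access hazard. It must also respect forwarding priority: because <code class="language-plaintext highlighter-rouge">memoryAccess()</code> is a later pipeline stage and therefore holds an older instruction, its forwarding priority is lower than that of <code class="language-plaintext highlighter-rouge">execute()</code>. Otherwise, older data could be forwarded and overwrite newer data.</p>

<p>Write-back is implemented in <code class="language-plaintext highlighter-rouge">Simulator::writeBack()</code>, which writes the execution result back to the register file and handles the related data hazards in the same way as above.</p>

<p>The control signals of the pipeline registers are set as shown below. Note that fReg denotes the data passed from the fetch stage to the decode stage at the start of the next cycle, and so on for the other registers.</p>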

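<table>
  <thead>
    <tr>
      <th>Situation</th>
      <th>fReg</th>
      <th>dReg</th>
      <th>eReg</th>
      <th>mReg</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Branch misprediction</td>
      <td>Bubble</td>
      <td>Bubble</td>
      <td>Normal</td>
      <td>Normal</td>
    </tr>
    <tr>
      <td>Memory access hazard</td>
      <td>Stall</td>
      <td>Stall</td>
      <td>Bubble</td>
      <td>Normal</td>
    </tr>
    <tr>
      <td>Branch predicted taken</td>
      <td>Bubble</td>
      <td>Normal</td>
      <td>Normal</td>
      <td>Normal</td>
    </tr>
  </tbody>
</table>

<p>One case needs special explanation: the branch predictor. In the current design, the existence of a branch instruction is only known at the end of the decode stage, so if the branch is predicted taken, a Bubble must be inserted into the pipeline to ensure that the fetch stage fetches the instruction at the branch target. This does not increase the cost of a misprediction, but it adds one cycle to the cost of a correct taken prediction. Improving this design would require moving the branch prediction module into the fetch stage.</p>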

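<h3 id="34-系统调用和库函数接口的处理">3.4 System Calls and the Library Interface</h3>

<p>This simulator uses a custom system call interface. The <code class="language-plaintext highlighter-rouge">ecall</code> instruction used for system calls uses the <code class="language-plaintext highlighter-rouge">a0</code> and <code class="language-plaintext highlighter-rouge">a7</code> registers: <code class="language-plaintext highlighter-rouge">a7</code> holds the system call number, <code class="language-plaintext highlighter-rouge">a0</code> holds the system call argument, and the return value is stored in <code class="language-plaintext highlighter-rouge">a0</code>. So that the system call instruction can be integrated into the current pipeline, <code class="language-plaintext highlighter-rouge">ecall</code> supports only one argument and one return value. The semantics of all system calls are listed in the table below.</p>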

<table>
  <thead>
    <tr>
      <th>System call</th>
      <th>Number</th>
      <th>Argument</th>
      <th>Return value</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Print string</td>
      <td>0</td>
      <td>Start address of the string</td>
      <td>None</td>
    </tr>
    <tr>
      <td>Print character</td>
      <td>1</td>
      <td>Value of the character</td>
      <td>None</td>
    </tr>
    <tr>
      <td>Print number</td>
      <td>2</td>
      <td>Value of the number</td>
      <td>None</td>
    </tr>
    <tr>
      <td>Exit program</td>
      <td>3</td>
      <td>None</td>
      <td>None</td>
    </tr>
    <tr>
      <td>Read character</td>
      <td>4</td>
      <td>None</td>
      <td>The character read</td>
    </tr>
    <tr>
      <td>Read number</td>
      <td>5</td>
      <td>None</td>
      <td>The number read</td>
    </tr>
  </tbody>
</table>

<p>The corresponding system call interface is as follows:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">print_d</span><span class="p">(</span><span class="kt">int</span> <span class="n">num</span><span class="p">);</span>
<span class="kt">void</span> <span class="nf">print_s</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">str</span><span class="p">);</span>
<span class="kt">void</span> <span class="nf">print_c</span><span class="p">(</span><span class="kt">char</span> <span class="n">ch</span><span class="p">);</span>
<span class="kt">void</span> <span class="nf">exit_proc</span><span class="p">();</span>
<span class="kt">char</span> <span class="nf">read_char</span><span class="p">();</span>
<span class="kt">long</span> <span class="kt">long</span> <span class="nf">read_num</span><span class="p">();</span>
</code></pre></div></div>

<p>The actual implementation requires inline assembly; see <code class="language-plaintext highlighter-rouge">test/lib.c</code>.</p>
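
<p>On the simulator side, the system call table above amounts to a dispatch on the value of <code class="language-plaintext highlighter-rouge">a7</code>. The following is only a sketch of that idea under several assumptions: <code class="language-plaintext highlighter-rouge">handleSyscall</code>, the enumerator names, the <code class="language-plaintext highlighter-rouge">readString</code> callback (standing in for reading a NUL-terminated string from simulated memory), and the choice to collect output in a <code class="language-plaintext highlighter-rouge">std::string</code> are all hypothetical and do not describe the real code.</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cstdint&gt;
#include &lt;cstdio&gt;
#include &lt;string&gt;

// System call numbers from the table above; the enumerator names are made up.
enum Syscall : int64_t {
  kPrintString = 0,
  kPrintChar = 1,
  kPrintNum = 2,
  kExit = 3,
  kReadChar = 4,
  kReadNum = 5
};

// `a7` selects the call and `a0` is its single argument. The return value is
// what would be written back to a0. Output is appended to `out` so that the
// sketch is easy to test.
template &lt;typename ReadString&gt;
int64_t handleSyscall(int64_t a7, int64_t a0, ReadString readString,
                      std::string &amp;out, bool &amp;exited) {
  switch (a7) {
  case kPrintString:
    out += readString(static_cast&lt;uint32_t&gt;(a0));
    return a0;
  case kPrintChar:
    out += static_cast&lt;char&gt;(a0);
    return a0;
  case kPrintNum:
    out += std::to_string(a0);
    return a0;
  case kExit:
    exited = true;
    return a0;
  case kReadChar:
    return std::getchar();
  case kReadNum: {
    long long v = 0;
    if (std::scanf("%lld", &amp;v) != 1) v = 0;
    return v;
  }
  default:
    exited = true;  // unknown number: treated as fatal in this sketch
    return a0;
  }
}
</code></pre></div></div>

<p>Keeping every call to one argument and one return value, as the text describes, is what lets the whole dispatch fit into a single function of <code class="language-plaintext highlighter-rouge">a7</code> and <code class="language-plaintext highlighter-rouge">a0</code>.</p>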

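<h3 id="35-性能计数相关模块">3.5 Performance Counters</h3>

<p>Under the current simulator architecture, collecting performance statistics only requires adding counting code at the appropriate places. The statistics module is defined as follows:</p>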

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">History</span> <span class="p">{</span>
  <span class="kt">uint32_t</span> <span class="n">instCount</span><span class="p">;</span>
  <span class="kt">uint32_t</span> <span class="n">cycleCount</span><span class="p">;</span>
  <span class="kt">uint32_t</span> <span class="n">predictedBranch</span><span class="p">;</span> <span class="c1">// Number of branch that is predicted successfully</span>
  <span class="kt">uint32_t</span> <span class="n">unpredictedBranch</span><span class="p">;</span> <span class="c1">// Number of branch that is not predicted successfully</span>
  <span class="kt">uint32_t</span> <span class="n">dataHazardCount</span><span class="p">;</span>
  <span class="kt">uint32_t</span> <span class="n">controlHazardCount</span><span class="p">;</span>
  <span class="kt">uint32_t</span> <span class="n">memoryHazardCount</span><span class="p">;</span>
  <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">string</span><span class="o">&gt;</span> <span class="n">instRecord</span><span class="p">;</span>
  <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">string</span><span class="o">&gt;</span> <span class="n">regRecord</span><span class="p">;</span>
  <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">memoryDump</span><span class="p">;</span>
<span class="p">}</span> <span class="n">history</span><span class="p">;</span>
</code></pre></div></div>

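<p>Of these, the last three fields record the CPU's execution history for use during debugging. To keep the simulator from using too much memory, <code class="language-plaintext highlighter-rouge">instRecord</code> and <code class="language-plaintext highlighter-rouge">regRecord</code> are cleared once they hold more than 100000 entries, and <code class="language-plaintext highlighter-rouge">memoryDump</code> is only used when a memory snapshot is requested.</p>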
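
<p>The clearing rule can be sketched as a small helper. This is only an illustration of the policy described above: the helper name <code class="language-plaintext highlighter-rouge">appendRecord</code> and the constant <code class="language-plaintext highlighter-rouge">kMaxRecords</code> are hypothetical, and the simulator may apply the check at a different point.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Cap taken from the text: records are cleared once they exceed 100000 entries.
constexpr std::size_t kMaxRecords = 100000;

// Append one history entry, clearing the buffer first if it has grown past the cap.
// Hypothetical helper, not the simulator's actual function.
void appendRecord(std::vector<std::string> &records, const std::string &entry) {
  if (records.size() > kMaxRecords) {
    records.clear(); // drop old history to bound memory use
  }
  records.push_back(entry);
}
```

<p>Clearing everything at once is simpler than a ring buffer, at the cost of losing all earlier history at the moment of the clear.</p>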

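<h3 id="36-调试接口">3.6 Debugging Interface</h3>

<p>Because debugging a CPU simulator is relatively hard, its debugging and error-handling interfaces must be designed very carefully so that bugs in the program are found as early as possible. The current simulator code contains many checks on the simulator state and on the validity of input values to catch errors early. The <code class="language-plaintext highlighter-rouge">Simulator</code> class has a dedicated error-handling function, <code class="language-plaintext highlighter-rouge">panic()</code>.</p>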

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="n">Simulator</span><span class="o">::</span><span class="n">panic</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">format</span><span class="p">,</span> <span class="p">...)</span> <span class="p">{</span>
  <span class="kt">char</span> <span class="n">buf</span><span class="p">[</span><span class="n">BUFSIZ</span><span class="p">];</span>
  <span class="kt">va_list</span> <span class="n">args</span><span class="p">;</span>
  <span class="n">va_start</span><span class="p">(</span><span class="n">args</span><span class="p">,</span> <span class="n">format</span><span class="p">);</span>
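  <span class="n">vsnprintf</span><span class="p">(</span><span class="n">buf</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">buf</span><span class="p">),</span> <span class="n">format</span><span class="p">,</span> <span class="n">args</span><span class="p">);</span>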
  <span class="n">fprintf</span><span class="p">(</span><span class="n">stderr</span><span class="p">,</span> <span class="s">"%s"</span><span class="p">,</span> <span class="n">buf</span><span class="p">);</span>
  <span class="n">va_end</span><span class="p">(</span><span class="n">args</span><span class="p">);</span>
  <span class="k">this</span><span class="o">-&gt;</span><span class="n">dumpHistory</span><span class="p">();</span>
  <span class="n">fprintf</span><span class="p">(</span><span class="n">stderr</span><span class="p">,</span> <span class="s">"Execution history and memory dump in dump.txt</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
  <span class="n">exit</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

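<p>In addition, the simulator supports single-step debugging and <code class="language-plaintext highlighter-rouge">verbose</code> output, enabled with the <code class="language-plaintext highlighter-rouge">-s</code> and <code class="language-plaintext highlighter-rouge">-v</code> flags, respectively. Running with <code class="language-plaintext highlighter-rouge">-v</code> and redirecting standard output yields the complete execution history of register and pipeline state for later analysis. A typical CPU execution state record looks like this:</p>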

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Fetched instruction 0x00000593 at address 0x100c4
Decoded instruction 0x40a60633 as sub a2,a2,a0
Execute: addi
  Forward Data a2 to Decode op1
Memory Access: addi
  Forward Data a0 to Decode op2
WriteBack: addi
------------ CPU STATE ------------
PC: 0x100c8
zero: 0x00000000(0) ra: 0x00000000(0) sp: 0x80000000(2147483648) gp: 0x00011f58(73560) 
tp: 0x00000000(0) t0: 0x00000000(0) t1: 0x00000000(0) t2: 0x00000000(0) 
s0: 0x00000000(0) s1: 0x00000000(0) a0: 0x00000000(0) a1: 0x00000000(0) 
a2: 0x00000000(0) a3: 0x00000000(0) a4: 0x00000000(0) a5: 0x00000000(0) 
a6: 0x00000000(0) a7: 0x00000000(0) s2: 0x00000000(0) s3: 0x00000000(0) 
s4: 0x00000000(0) s5: 0x00000000(0) s6: 0x00000000(0) s7: 0x00000000(0) 
s8: 0x00000000(0) s9: 0x00000000(0) s10: 0x00000000(0) s11: 0x00000000(0) 
t3: 0x00000000(0) t4: 0x00000000(0) t5: 0x00000000(0) t6: 0x00000000(0) 
-----------------------------------
</code></pre></div></div>

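<h3 id="37-实现中遇到的坑">3.7 Pitfalls Encountered During Implementation</h3>

<p>Throughout the implementation, the single-cycle instruction-level simulation in the first stage went smoothly, but the pipeline simulation ran into several rather subtle bugs.</p>

<ol>
  <li>A fundamental difficulty is that our pipeline simulation program still executes sequentially and cannot run multiple stages in parallel the way hardware does. The control flow of the five stages and their accesses to data structures must therefore be designed very carefully to reproduce the hardware's behavior.</li>
  <li>When several stages detect a data hazard and forward data, the newest data must take priority. Because the relevant stages run in the order Execute -&gt; Memory Access -&gt; Write Back in the simulator, data forwarded by an earlier stage may be overwritten by older data from a later stage. The simulator needs a special check for this case.</li>
  <li>The branch prediction module should modify the PC in the decode stage according to the prediction. But what if the jump instruction was fetched by mistake and should later be turned into a bubble? The modified PC must somehow be restored, or the write of the predicted PC must be delayed.</li>
  <li>Also because the code executes sequentially, when the execute stage sees a memory access instruction and the instruction in the decode stage depends on the loaded data, causing a memory hazard, the whole execution process and data access flow must be designed very carefully to produce the correct result.</li>
  <li>The <code class="language-plaintext highlighter-rouge">ecall</code> instruction used for system calls also causes data hazards! Which registers are involved depends on the number of arguments the system call takes and their corresponding registers. The current system calls depend on two registers, <code class="language-plaintext highlighter-rouge">a0</code> and <code class="language-plaintext highlighter-rouge">a7</code>, so they happen to fit into the pipeline as <code class="language-plaintext highlighter-rouge">op1</code> and <code class="language-plaintext highlighter-rouge">op2</code>; if a system call needed more arguments, the implementation would become more complicated.</li>
  <li>The <code class="language-plaintext highlighter-rouge">zero</code> register is a special case: in theory its value is always 0, so data forwarding must special-case the zero register everywhere. Forwarding a value into the zero register causes bugs that are very hard to find.</li>
</ol>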
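
<p>Pitfalls 2 and 6 can be sketched as a single operand-selection function. This is a simplified illustration, not the simulator's actual structure: the <code class="language-plaintext highlighter-rouge">Forward</code> struct and <code class="language-plaintext highlighter-rouge">selectOperand</code> are hypothetical names, and the sketch gathers all forwarding candidates in one place instead of letting each stage write into the decode registers in turn. It also ignores the load-use case from pitfall 4, where the execute stage does not yet have the loaded value.</p>

```cpp
#include <cassert>
#include <cstdint>

constexpr int REG_ZERO = 0; // x0 always reads as 0 in RISC-V

// One forwarding candidate produced by a later pipeline stage (hypothetical type).
struct Forward {
  bool valid;    // the stage holds an instruction that writes a register
  int destReg;   // destination register of that instruction
  int64_t value; // value it will write
};

// Choose the value for source register `rs`.
// Priority: execute (newest) > memory access > write back (oldest) > register file.
int64_t selectOperand(int rs, int64_t regValue, const Forward &fromExecute,
                      const Forward &fromMemory, const Forward &fromWriteBack) {
  if (rs == REG_ZERO) {
    return 0; // never forward into the zero register
  }
  const Forward *sources[] = {&fromExecute, &fromMemory, &fromWriteBack};
  for (const Forward *f : sources) {
    if (f->valid && f->destReg == rs) {
      return f->value; // first match is the most recent producer
    }
  }
  return regValue; // no hazard: use the value read from the register file
}
```

<p>Checking the candidates from newest to oldest means an older stage can never overwrite a newer value, which is the property the special check in pitfall 2 has to guarantee.</p>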

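<h2 id="四功能测试与性能评测">4. Functional Testing and Performance Evaluation</h2>

<h3 id="41-模拟器的功能正确性测试">4.1 Functional Correctness Tests for the Simulator</h3>

<p>The test programs I wrote are listed in the table below. Note that every program must be compiled together with <code class="language-plaintext highlighter-rouge">test/lib.c</code>.</p>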

<table>
  <thead>
    <tr>
      <th>Source file</th>
      <th>Corresponding ELF file</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">test/helloworld.c</code></td>
      <td><code class="language-plaintext highlighter-rouge">riscv-elf/helloworld.riscv</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">test/test_arithmetic.c</code></td>
      <td><code class="language-plaintext highlighter-rouge">riscv-elf/test_arithmetic.riscv</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">test/test_syscall.c</code></td>
      <td><code class="language-plaintext highlighter-rouge">riscv-elf/test_syscall.riscv</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">test/test_branch.c</code></td>
      <td><code class="language-plaintext highlighter-rouge">riscv-elf/test_branch.riscv</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">test/quicksort.c</code></td>
      <td><code class="language-plaintext highlighter-rouge">riscv-elf/quicksort.riscv</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">test/matrixmulti.c</code></td>
      <td><code class="language-plaintext highlighter-rouge">riscv-elf/matrixmulti.riscv</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">test/ackermann.c</code></td>
      <td><code class="language-plaintext highlighter-rouge">riscv-elf/ackermann.riscv</code></td>
    </tr>
  </tbody>
</table>

<p>Each source file does the following:</p>

<table>
  <thead>
    <tr>
      <th>Source file</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">test/helloworld.c</code></td>
      <td>The simplest Hello, World</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">test/test_arithmetic.c</code></td>
      <td>Tests a set of arithmetic operations</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">test/test_syscall.c</code></td>
      <td>Tests all system calls</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">test/test_branch.c</code></td>
      <td>Tests conditional and loop statements</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">test/quicksort.c</code></td>
      <td>Quicksorts 10 and 100 elements, respectively</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">test/matrixmulti.c</code></td>
      <td>10*10 matrix multiplication</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">test/ackermann.c</code></td>
      <td>Computes a set of Ackermann function values</td>
    </tr>
  </tbody>
</table>

<p>If the simulator binary <code class="language-plaintext highlighter-rouge">Simulator</code> is in the project's <code class="language-plaintext highlighter-rouge">build/</code> directory, the following commands can be run to obtain results and verify the simulator's correctness. Note that <code class="language-plaintext highlighter-rouge">test_syscall.riscv</code> requires user input.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./Simulator ../riscv-elf/helloworld.riscv
./Simulator ../riscv-elf/test_arithmetic.riscv
./Simulator ../riscv-elf/test_syscall.riscv
./Simulator ../riscv-elf/test_branch.riscv
./Simulator ../riscv-elf/quicksort.riscv
./Simulator ../riscv-elf/matrixmulti.riscv
./Simulator ../riscv-elf/ackermann.riscv
</code></pre></div></div>

<p>The execution results are as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hehaodeMacBook-Pro:build hehao$ ./Simulator ../riscv-elf/helloworld.riscv
Hello, World!
Program exit from an exit() system call
------------ STATISTICS -----------
Number of Instructions: 141
Number of Cycles: 188
Avg Cycles per Instrcution: 1.3333
Branch Perdiction Accuacy: 0.5833 (Strategy: Always Not Taken)
Number of Control Hazards: 23
Number of Data Hazards: 73
Number of Memory Hazards: 1
-----------------------------------
hehaodeMacBook-Pro:build hehao$ ./Simulator ../riscv-elf/test_arithmetic.riscv
30
-10
370350
411
49380
771
Program exit from an exit() system call
------------ STATISTICS -----------
Number of Instructions: 508
Number of Cycles: 703
Avg Cycles per Instrcution: 1.3839
Branch Perdiction Accuacy: 0.4268 (Strategy: Always Not Taken)
Number of Control Hazards: 91
Number of Data Hazards: 224
Number of Memory Hazards: 13
-----------------------------------
hehaodeMacBook-Pro:build hehao$ ./Simulator ../riscv-elf/test_syscall.riscv
This is string from print_s()
123456abc
Enter a number: 123456
The number is: 123456
Enter a character: g
The character is: g
Program exit from an exit() system call
------------ STATISTICS -----------
Number of Instructions: 350
Number of Cycles: 461
Avg Cycles per Instrcution: 1.3171
Branch Perdiction Accuacy: 0.5833 (Strategy: Always Not Taken)
Number of Control Hazards: 53
Number of Data Hazards: 178
Number of Memory Hazards: 5
-----------------------------------
hehaodeMacBook-Pro:build hehao$ ./Simulator ../riscv-elf/quicksort.riscv
Prev A: 5 3 5 6 7 1 3 5 6 1 
Sorted A: 1 1 3 3 5 5 5 6 6 7 
Prev B: 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 
Sorted B: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 
Program exit from an exit() system call
------------ STATISTICS -----------
Number of Instructions: 103671
Number of Cycles: 141697
Avg Cycles per Instrcution: 1.3668
Branch Perdiction Accuacy: 0.4926 (Strategy: Always Not Taken)
Number of Control Hazards: 7314
Number of Data Hazards: 86448
Number of Memory Hazards: 23398
-----------------------------------
hehaodeMacBook-Pro:build hehao$ ./Simulator ../riscv-elf/matrixmulti.riscv
The content of A is: 
0 0 0 0 0 0 0 0 0 0 
1 1 1 1 1 1 1 1 1 1 
2 2 2 2 2 2 2 2 2 2 
3 3 3 3 3 3 3 3 3 3 
4 4 4 4 4 4 4 4 4 4 
5 5 5 5 5 5 5 5 5 5 
6 6 6 6 6 6 6 6 6 6 
7 7 7 7 7 7 7 7 7 7 
8 8 8 8 8 8 8 8 8 8 
9 9 9 9 9 9 9 9 9 9 
The content of B is: 
0 1 2 3 4 5 6 7 8 9 
0 1 2 3 4 5 6 7 8 9 
0 1 2 3 4 5 6 7 8 9 
0 1 2 3 4 5 6 7 8 9 
0 1 2 3 4 5 6 7 8 9 
0 1 2 3 4 5 6 7 8 9 
0 1 2 3 4 5 6 7 8 9 
0 1 2 3 4 5 6 7 8 9 
0 1 2 3 4 5 6 7 8 9 
0 1 2 3 4 5 6 7 8 9 
The content of C=A*B is: 
0 0 0 0 0 0 0 0 0 0 
0 10 20 30 40 50 60 70 80 90 
0 20 40 60 80 100 120 140 160 180 
0 30 60 90 120 150 180 210 240 270 
0 40 80 120 160 200 240 280 320 360 
0 50 100 150 200 250 300 350 400 450 
0 60 120 180 240 300 360 420 480 540 
0 70 140 210 280 350 420 490 560 630 
0 80 160 240 320 400 480 560 640 720 
0 90 180 270 360 450 540 630 720 810 
Program exit from an exit() system call
------------ STATISTICS -----------
Number of Instructions: 225441
Number of Cycles: 318532
Avg Cycles per Instrcution: 1.4129
Branch Perdiction Accuacy: 0.3765 (Strategy: Always Not Taken)
Number of Control Hazards: 40678
Number of Data Hazards: 110957
Number of Memory Hazards: 11735
-----------------------------------
hehaodeMacBook-Pro:build hehao$ ./Simulator ../riscv-elf/ackermann.riscv
Ackermann(0,0) = 1
Ackermann(0,1) = 2
Ackermann(0,2) = 3
Ackermann(0,3) = 4
Ackermann(0,4) = 5
Ackermann(1,0) = 2
Ackermann(1,1) = 3
Ackermann(1,2) = 4
Ackermann(1,3) = 5
Ackermann(1,4) = 6
Ackermann(2,0) = 3
Ackermann(2,1) = 5
Ackermann(2,2) = 7
Ackermann(2,3) = 9
Ackermann(2,4) = 11
Ackermann(3,0) = 5
Ackermann(3,1) = 13
Ackermann(3,2) = 29
Ackermann(3,3) = 61
Ackermann(3,4) = 125
Program exit from an exit() system call
------------ STATISTICS -----------
Number of Instructions: 430754
Number of Cycles: 574548
Avg Cycles per Instrcution: 1.3338
Branch Perdiction Accuacy: 0.5045 (Strategy: Always Not Taken)
Number of Control Hazards: 48010
Number of Data Hazards: 279916
Number of Memory Hazards: 47774
-----------------------------------
</code></pre></div></div>

<h3 id="42-运行给定的5个测试程序">4.2 Running the Five Given Test Programs</h3>

<h4 id="421-原始的执行结果">4.2.1 Raw Execution Results</h4>

<p>The five given programs are in the <code class="language-plaintext highlighter-rouge">test-inclass/</code> folder:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>add.c
mul-div.c
n!.c
qsort.c
simple-function.c
</code></pre></div></div>

<p>Running them in the same way as before gives the following raw results:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hehaodeMacBook-Pro:build hehao$ ./Simulator ../test-inclass/add.riscv
Program exit from an exit() system call
------------ STATISTICS -----------
Number of Instructions: 876
Number of Cycles: 1183
Avg Cycles per Instrcution: 1.3505
Branch Perdiction Accuacy: 0.4639 (Strategy: Always Not Taken)
Number of Control Hazards: 124
Number of Data Hazards: 433
Number of Memory Hazards: 58
-----------------------------------
hehaodeMacBook-Pro:build hehao$ ./Simulator ../test-inclass/mul-div.riscv
Program exit from an exit() system call
------------ STATISTICS -----------
Number of Instructions: 901
Number of Cycles: 1208
Avg Cycles per Instrcution: 1.3407
Branch Perdiction Accuacy: 0.4639 (Strategy: Always Not Taken)
Number of Control Hazards: 124
Number of Data Hazards: 463
Number of Memory Hazards: 58
-----------------------------------
hehaodeMacBook-Pro:build hehao$ ./Simulator ../test-inclass/n\!.riscv
Program exit from an exit() system call
------------ STATISTICS -----------
Number of Instructions: 1112
Number of Cycles: 1525
Avg Cycles per Instrcution: 1.3714
Branch Perdiction Accuacy: 0.4661 (Strategy: Always Not Taken)
Number of Control Hazards: 189
Number of Data Hazards: 515
Number of Memory Hazards: 34
-----------------------------------
hehaodeMacBook-Pro:build hehao$ ./Simulator ../test-inclass/qsort.riscv
Program exit from an exit() system call
------------ STATISTICS -----------
Number of Instructions: 19427
Number of Cycles: 25328
Avg Cycles per Instrcution: 1.3038
Branch Perdiction Accuacy: 0.4701 (Strategy: Always Not Taken)
Number of Control Hazards: 1363
Number of Data Hazards: 14156
Number of Memory Hazards: 3174
-----------------------------------
hehaodeMacBook-Pro:build hehao$ ./Simulator ../test-inclass/simple-function.riscv
Program exit from an exit() system call
------------ STATISTICS -----------
Number of Instructions: 886
Number of Cycles: 1197
Avg Cycles per Instrcution: 1.3510
Branch Perdiction Accuacy: 0.4639 (Strategy: Always Not Taken)
Number of Control Hazards: 126
Number of Data Hazards: 438
Number of Memory Hazards: 58
-----------------------------------
</code></pre></div></div>

<p>The required results can be derived from this raw data; they are summarized below.</p>

<h4 id="422-动态执行的指令数">4.2.2 Dynamic Instruction Count</h4>

<table>
  <thead>
    <tr>
      <th>Program</th>
      <th>Instructions executed</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">add.riscv</code></td>
      <td>876</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">mul-div.riscv</code></td>
      <td>901</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">n!.riscv</code></td>
      <td>1112</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">qsort.riscv</code></td>
      <td>19427</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">simple-function.riscv</code></td>
      <td>886</td>
    </tr>
  </tbody>
</table>

<h4 id="423-执行周期数和平均cpi">4.2.3 Cycle Count and Average CPI</h4>

<table>
  <thead>
    <tr>
      <th>Program</th>
      <th>Cycles</th>
      <th>Average CPI</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">add.riscv</code></td>
      <td>1183</td>
      <td>1.3505</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">mul-div.riscv</code></td>
      <td>1208</td>
      <td>1.3407</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">n!.riscv</code></td>
      <td>1525</td>
      <td>1.3714</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">qsort.riscv</code></td>
      <td>25328</td>
      <td>1.3038</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">simple-function.riscv</code></td>
      <td>1197</td>
      <td>1.3510</td>
    </tr>
  </tbody>
</table>

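<p>Across these different kinds of programs, the pipelined implementation in this simulator has an average CPI of about 1.33, which gives roughly 3.76 times the instruction throughput of a single-cycle design.</p>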
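
<p>The arithmetic behind these figures can be checked with a short sketch. The 3.76 figure is consistent with dividing five pipeline stages by a CPI of 1.33; this assumes that one single-cycle instruction takes as long as five pipeline cycles, which the text does not state explicitly. The function names are hypothetical.</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Average cycles per instruction for one run.
double cpi(uint64_t cycles, uint64_t instructions) {
  return static_cast<double>(cycles) / static_cast<double>(instructions);
}

// Assumption: a single-cycle design spends one long cycle per instruction,
// and that cycle lasts as long as `stages` pipeline cycles.
double speedupOverSingleCycle(double pipelinedCpi, int stages = 5) {
  return static_cast<double>(stages) / pipelinedCpi;
}
```

<p>For example, <code class="language-plaintext highlighter-rouge">cpi(1183, 876)</code> reproduces the 1.3505 reported for <code class="language-plaintext highlighter-rouge">add.riscv</code>, and <code class="language-plaintext highlighter-rouge">speedupOverSingleCycle(1.33)</code> gives about 3.76.</p>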

<h4 id="424-不同类型的冒险统计">4.2.4 Hazard Counts by Type</h4>

<table>
  <thead>
    <tr>
      <th>Program</th>
      <th>Data hazards</th>
      <th>Control hazards</th>
      <th>Memory access hazards</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">add.riscv</code></td>
      <td>433</td>
      <td>124</td>
      <td>58</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">mul-div.riscv</code></td>
      <td>463</td>
      <td>124</td>
      <td>58</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">n!.riscv</code></td>
      <td>515</td>
      <td>189</td>
      <td>34</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">qsort.riscv</code></td>
      <td>14156</td>
      <td>1363</td>
      <td>3174</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">simple-function.riscv</code></td>
      <td>438</td>
      <td>126</td>
      <td>58</td>
    </tr>
  </tbody>
</table>

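<h2 id="五其它内容">5. Other Content</h2>

<h3 id="51-分支预测模块">5.1 Branch Prediction Module</h3>

<p>The branch prediction module is relatively independent, so it is implemented separately as the <code class="language-plaintext highlighter-rouge">BranchPredictor</code> class. A <code class="language-plaintext highlighter-rouge">BranchPredictor</code> must be given a branch prediction strategy and keeps the data structures that the strategy needs. This simulator implements the following strategies:</p>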

<table>
  <thead>
    <tr>
      <th>Strategy</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>NT</td>
      <td>Always Not Taken</td>
    </tr>
    <tr>
      <td>AT</td>
      <td>Always Taken</td>
    </tr>
    <tr>
      <td>BTFNT</td>
      <td>Backward Taken, Forward Not Taken</td>
    </tr>
    <tr>
      <td>BPB</td>
      <td>Branch Prediction Buffer</td>
    </tr>
  </tbody>
</table>

<p>Here, the Branch Prediction Buffer uses the four-state, two-bit history scheme described in “Computer Organization and Design: The Hardware/Software Interface”. Specifically, the low 12 bits of the instruction address index a direct-mapped cache of 4096 entries for branch instructions. Each cache entry is in one of four states: <code class="language-plaintext highlighter-rouge">Strong Taken</code>, <code class="language-plaintext highlighter-rouge">Weak Taken</code>, <code class="language-plaintext highlighter-rouge">Weak Not Taken</code>, <code class="language-plaintext highlighter-rouge">Strong Not Taken</code>. The state transition diagram is shown below.</p>

<p><img src="https://hehao98.github.io/assets/riscv/state.png" alt="" /></p>

<p>The implementation is as follows:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">bool</span> <span class="n">BranchPredictor</span><span class="o">::</span><span class="n">predict</span><span class="p">(</span><span class="kt">uint32_t</span> <span class="n">pc</span><span class="p">,</span> <span class="kt">uint32_t</span> <span class="n">insttype</span><span class="p">,</span> <span class="kt">int64_t</span> <span class="n">op1</span><span class="p">,</span>
                              <span class="kt">int64_t</span> <span class="n">op2</span><span class="p">,</span> <span class="kt">int64_t</span> <span class="n">offset</span><span class="p">)</span> <span class="p">{</span>
  <span class="k">switch</span> <span class="p">(</span><span class="k">this</span><span class="o">-&gt;</span><span class="n">strategy</span><span class="p">)</span> <span class="p">{</span>
  <span class="k">case</span> <span class="n">NT</span><span class="p">:</span>
    <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
  <span class="k">case</span> <span class="n">AT</span><span class="p">:</span>
    <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
  <span class="k">case</span> <span class="n">BTFNT</span><span class="p">:</span> <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">offset</span> <span class="o">&gt;=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
    <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
      <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
    <span class="p">}</span>
  <span class="p">}</span>
  <span class="k">break</span><span class="p">;</span>
  <span class="k">case</span> <span class="n">BPB</span><span class="p">:</span> <span class="p">{</span>
    <span class="n">PredictorState</span> <span class="n">state</span> <span class="o">=</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">predbuf</span><span class="p">[</span><span class="n">pc</span> <span class="o">%</span> <span class="n">PRED_BUF_SIZE</span><span class="p">];</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">state</span> <span class="o">==</span> <span class="n">STRONG_TAKEN</span> <span class="o">||</span> <span class="n">state</span> <span class="o">==</span> <span class="n">WEAK_TAKEN</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
    <span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">state</span> <span class="o">==</span> <span class="n">STRONG_NOT_TAKEN</span> <span class="o">||</span> <span class="n">state</span> <span class="o">==</span> <span class="n">WEAK_NOT_TAKEN</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
    <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
      <span class="n">dbgprintf</span><span class="p">(</span><span class="s">"Strange Prediction Buffer!</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
    <span class="p">}</span>   
  <span class="p">}</span>
  <span class="k">break</span><span class="p">;</span>
  <span class="nl">default:</span>
    <span class="n">dbgprintf</span><span class="p">(</span><span class="s">"Unknown Branch Prediction Strategy!</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
    <span class="k">break</span><span class="p">;</span>
  <span class="p">}</span>
  <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="n">BranchPredictor</span><span class="o">::</span><span class="n">update</span><span class="p">(</span><span class="kt">uint32_t</span> <span class="n">pc</span><span class="p">,</span> <span class="kt">bool</span> <span class="n">branch</span><span class="p">)</span> <span class="p">{</span>
  <span class="kt">int</span> <span class="n">id</span> <span class="o">=</span> <span class="n">pc</span> <span class="o">%</span> <span class="n">PRED_BUF_SIZE</span><span class="p">;</span>
  <span class="n">PredictorState</span> <span class="n">state</span> <span class="o">=</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">predbuf</span><span class="p">[</span><span class="n">id</span><span class="p">];</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">branch</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">state</span> <span class="o">==</span> <span class="n">STRONG_NOT_TAKEN</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">this</span><span class="o">-&gt;</span><span class="n">predbuf</span><span class="p">[</span><span class="n">id</span><span class="p">]</span> <span class="o">=</span> <span class="n">WEAK_NOT_TAKEN</span><span class="p">;</span>
    <span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">state</span> <span class="o">==</span> <span class="n">WEAK_NOT_TAKEN</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">this</span><span class="o">-&gt;</span><span class="n">predbuf</span><span class="p">[</span><span class="n">id</span><span class="p">]</span> <span class="o">=</span> <span class="n">WEAK_TAKEN</span><span class="p">;</span>
    <span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">state</span> <span class="o">==</span> <span class="n">WEAK_TAKEN</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">this</span><span class="o">-&gt;</span><span class="n">predbuf</span><span class="p">[</span><span class="n">id</span><span class="p">]</span> <span class="o">=</span> <span class="n">STRONG_TAKEN</span><span class="p">;</span>
    <span class="p">}</span> <span class="c1">// do nothing if STRONG_TAKEN</span>
  <span class="p">}</span> <span class="k">else</span> <span class="p">{</span> <span class="c1">// not taken</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">state</span> <span class="o">==</span> <span class="n">STRONG_TAKEN</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">this</span><span class="o">-&gt;</span><span class="n">predbuf</span><span class="p">[</span><span class="n">id</span><span class="p">]</span> <span class="o">=</span> <span class="n">WEAK_TAKEN</span><span class="p">;</span>
    <span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">state</span> <span class="o">==</span> <span class="n">WEAK_TAKEN</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">this</span><span class="o">-&gt;</span><span class="n">predbuf</span><span class="p">[</span><span class="n">id</span><span class="p">]</span> <span class="o">=</span> <span class="n">WEAK_NOT_TAKEN</span><span class="p">;</span>
    <span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">state</span> <span class="o">==</span> <span class="n">WEAK_NOT_TAKEN</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">this</span><span class="o">-&gt;</span><span class="n">predbuf</span><span class="p">[</span><span class="n">id</span><span class="p">]</span> <span class="o">=</span> <span class="n">STRONG_NOT_TAKEN</span><span class="p">;</span>
    <span class="p">}</span> <span class="c1">// do nothing if STRONG_NOT_TAKEN</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
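
<p>To see how the four-state scheme behaves on a typical loop branch, here is a small standalone sketch of a single buffer entry. It follows the same transitions as <code class="language-plaintext highlighter-rouge">update()</code>, but the numeric enum values, the <code class="language-plaintext highlighter-rouge">TwoBitCounter</code> struct, the <code class="language-plaintext highlighter-rouge">countCorrect</code> helper and the initial state are assumptions made for illustration.</p>

```cpp
#include <cassert>
#include <vector>

// Assumption: states are numbered so that "more taken" means a larger value.
enum PredictorState {
  STRONG_NOT_TAKEN = 0,
  WEAK_NOT_TAKEN = 1,
  WEAK_TAKEN = 2,
  STRONG_TAKEN = 3,
};

// One prediction buffer entry, modelled as a two-bit saturating counter.
struct TwoBitCounter {
  PredictorState state = WEAK_TAKEN; // assumed initial state

  bool predict() const { return state >= WEAK_TAKEN; }

  // Move one step towards the observed outcome, saturating at both ends.
  void update(bool taken) {
    if (taken && state != STRONG_TAKEN) {
      state = static_cast<PredictorState>(state + 1);
    } else if (!taken && state != STRONG_NOT_TAKEN) {
      state = static_cast<PredictorState>(state - 1);
    }
  }
};

// Count how many outcomes in the sequence the counter predicts correctly.
int countCorrect(TwoBitCounter &counter, const std::vector<bool> &outcomes) {
  int correct = 0;
  for (bool taken : outcomes) {
    if (counter.predict() == taken) {
      ++correct;
    }
    counter.update(taken);
  }
  return correct;
}
```

<p>For a loop branch that is taken nine times and then falls through, the counter mispredicts only the exit: the single not-taken outcome moves it from strong to weak taken, so the next run of the loop is still predicted correctly from its first iteration.</p>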

<p>Branch prediction code is also added to the decode and execute stages:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Simulator::decode()</span>
<span class="kt">bool</span> <span class="n">predictedBranch</span> <span class="o">=</span> <span class="nb">false</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">isBranch</span><span class="p">(</span><span class="n">insttype</span><span class="p">))</span> <span class="p">{</span>
  <span class="n">predictedBranch</span> <span class="o">=</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">branchPredictor</span><span class="o">-&gt;</span><span class="n">predict</span><span class="p">(</span><span class="k">this</span><span class="o">-&gt;</span><span class="n">fReg</span><span class="p">.</span><span class="n">pc</span><span class="p">,</span> <span class="n">insttype</span><span class="p">,</span>
                                                   <span class="n">op1</span><span class="p">,</span> <span class="n">op2</span><span class="p">,</span> <span class="n">offset</span><span class="p">);</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">predictedBranch</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">this</span><span class="o">-&gt;</span><span class="n">predictedPC</span> <span class="o">=</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">fReg</span><span class="p">.</span><span class="n">pc</span> <span class="o">+</span> <span class="n">offset</span><span class="p">;</span>
    <span class="k">this</span><span class="o">-&gt;</span><span class="n">anotherPC</span> <span class="o">=</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">fReg</span><span class="p">.</span><span class="n">pc</span> <span class="o">+</span> <span class="mi">4</span><span class="p">;</span>
    <span class="k">this</span><span class="o">-&gt;</span><span class="n">fRegNew</span><span class="p">.</span><span class="n">bubble</span> <span class="o">=</span> <span class="nb">true</span><span class="p">;</span>
  <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
    <span class="k">this</span><span class="o">-&gt;</span><span class="n">anotherPC</span> <span class="o">=</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">fReg</span><span class="p">.</span><span class="n">pc</span> <span class="o">+</span> <span class="n">offset</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Simulator::execute()</span>
<span class="k">if</span> <span class="p">(</span><span class="n">isBranch</span><span class="p">(</span><span class="n">inst</span><span class="p">))</span> <span class="p">{</span>
  <span class="p">...</span>
  <span class="c1">// this-&gt;dReg.pc: fetch original inst addr, not the modified one</span>
  <span class="k">this</span><span class="o">-&gt;</span><span class="n">branchPredictor</span><span class="o">-&gt;</span><span class="n">update</span><span class="p">(</span><span class="k">this</span><span class="o">-&gt;</span><span class="n">dReg</span><span class="p">.</span><span class="n">pc</span><span class="p">,</span> <span class="n">branch</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

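<p>Pay attention to when the PC is overwritten with the predictor's predicted PC: this must happen at the end of a cycle, that is, at the end of the loop body in the <code class="language-plaintext highlighter-rouge">simulate()</code> function.</p>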

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// The branch prediction happens here to avoid strange bugs in branch prediction</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="k">this</span><span class="o">-&gt;</span><span class="n">dReg</span><span class="p">.</span><span class="n">bubble</span> <span class="o">&amp;&amp;</span> <span class="o">!</span><span class="k">this</span><span class="o">-&gt;</span><span class="n">dReg</span><span class="p">.</span><span class="n">stall</span> <span class="o">&amp;&amp;</span> <span class="o">!</span><span class="k">this</span><span class="o">-&gt;</span><span class="n">fReg</span><span class="p">.</span><span class="n">stall</span> <span class="o">&amp;&amp;</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">dReg</span><span class="p">.</span><span class="n">predictedBranch</span><span class="p">)</span> <span class="p">{</span>
  <span class="k">this</span><span class="o">-&gt;</span><span class="n">pc</span> <span class="o">=</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">predictedPC</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This completes the branch prediction module, and new prediction strategies can easily be added to it.</p>
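<p>To make the extension point concrete, here is a minimal C sketch of the three strategies evaluated in this post. It is not the simulator's actual <code class="language-plaintext highlighter-rouge">branchPredictor</code> class: the names <code class="language-plaintext highlighter-rouge">predictor_init</code>, <code class="language-plaintext highlighter-rouge">predictor_predict</code> and <code class="language-plaintext highlighter-rouge">predictor_update</code>, the 4096-entry table, and the use of 2-bit saturating counters for the Prediction Buffer are all assumptions made for illustration. The only detail taken from the text is that the buffer's default prediction is "taken".</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; a sketch, not the simulator's real code. */
enum strategy { ALWAYS_TAKEN, BTFNT, PRED_BUFFER };

/* States of a 2-bit saturating counter (an assumed design). */
enum counter { STRONG_NOT_TAKEN, WEAK_NOT_TAKEN, WEAK_TAKEN, STRONG_TAKEN };

#define PRED_BUF_SIZE 4096 /* assumed table size */

struct predictor {
  enum strategy strategy;
  uint8_t buf[PRED_BUF_SIZE]; /* one counter per table slot */
};

void predictor_init(struct predictor *p, enum strategy s) {
  assert(p != 0);
  p->strategy = s;
  for (int i = 0; i < PRED_BUF_SIZE; i++)
    p->buf[i] = WEAK_TAKEN; /* default prediction: taken */
}

/* Predict whether the branch at pc, with the given offset, is taken. */
bool predictor_predict(const struct predictor *p, uint64_t pc, int64_t offset) {
  switch (p->strategy) {
  case ALWAYS_TAKEN:
    return true;
  case BTFNT: /* backward taken, forward not taken */
    return offset < 0;
  case PRED_BUFFER:
    return p->buf[pc % PRED_BUF_SIZE] >= WEAK_TAKEN;
  }
  return false;
}

/* Feed the real outcome back; only the history-based strategy learns. */
void predictor_update(struct predictor *p, uint64_t pc, bool taken) {
  if (p->strategy != PRED_BUFFER)
    return;
  uint8_t *c = &p->buf[pc % PRED_BUF_SIZE];
  if (taken && *c < STRONG_TAKEN)
    (*c)++;
  else if (!taken && *c > STRONG_NOT_TAKEN)
    (*c)--;
}
```

<p>Adding another strategy in this style means adding one enum value and one <code class="language-plaintext highlighter-rouge">case</code> in each function, which is the kind of extension the paragraph above has in mind.</p>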

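<h3 id="52-分支预测模块的性能评测">5.2 Performance Evaluation of the Branch Prediction Module</h3>

<p>With the branch prediction module in place, we can evaluate how the different prediction strategies perform. The table below is a simple tally of branch prediction accuracy, given as the fraction of branches predicted correctly.</p>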

<table>
  <thead>
    <tr>
      <th>Benchmark</th>
      <th>Always Taken</th>
      <th>BTFNT</th>
      <th>Prediction Buffer</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">helloworld.riscv</code></td>
      <td>0.4706</td>
      <td>0.7059</td>
      <td>0.4706</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">quicksort.riscv</code></td>
      <td>0.5075</td>
      <td>0.9506</td>
      <td>0.9587</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">matrixmult.riscv</code></td>
      <td>0.6235</td>
      <td>0.6325</td>
      <td>0.6275</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">ackermann.riscv</code></td>
      <td>0.4955</td>
      <td>0.5053</td>
      <td>0.9593</td>
    </tr>
  </tbody>
</table>

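<p>For the <code class="language-plaintext highlighter-rouge">helloworld</code> program, the program is so simple that the vast majority of its instructions execute only once. The history-based Prediction Buffer therefore degenerates into Always Taken (its default prediction is "taken"), while BTFNT, a heuristic based on program structure, achieves the highest accuracy.</p>

<p>For the quicksort benchmark, both Prediction Buffer and BTFNT reach very high accuracy. The input is fairly large (100 elements), and the program spends most of its time repeatedly executing a small piece of code. Most of the branches in that code are backward branches that are taken, which is exactly the case BTFNT predicts correctly, so BTFNT is very accurate. Because the loops run for many iterations (about 100), the history-based Prediction Buffer also achieves high accuracy.</p>

<p>For the matrix multiplication program, the three predictors perform almost identically. This is probably because each loop in the matrix multiplication runs for only a few iterations (10 elements), which limits how well BTFNT and Prediction Buffer can do.</p>

<p>The Ackermann program contains no loops at all, only recursive function calls and conditionals, and most of its branch instructions sit inside the recursive function. Here the history-based Prediction Buffer is at its best and achieves a very high accuracy, while BTFNT is much more constrained: if the recursive function happens to contain two if statements, one branching backward and the other forward, and both are taken most of the time, BTFNT's accuracy hovers around 50%.</p>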
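<p>The two-branch scenario above can be checked with a small experiment. The following C sketch is self-contained and purely illustrative: the <code class="language-plaintext highlighter-rouge">branch_event</code> trace format, the 16-slot table, and the 2-bit counter are assumptions, not the simulator's real data structures. It replays a recorded trace of branch outcomes and reports the fraction predicted correctly, the same metric as in the table.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One executed branch: which table slot it maps to, its offset, its outcome.
   This trace format is hypothetical. */
struct branch_event {
  unsigned slot;
  int offset;
  bool taken;
};

#define SLOTS 16 /* assumed prediction buffer size */

/* BTFNT predicts "taken" exactly when the branch goes backward. */
double btfnt_accuracy(const struct branch_event *trace, size_t n) {
  assert(n > 0);
  size_t correct = 0;
  for (size_t i = 0; i < n; i++)
    if ((trace[i].offset < 0) == trace[i].taken)
      correct++;
  return (double)correct / (double)n;
}

/* History-based prediction with one 2-bit saturating counter per slot
   (0..1 predict not taken, 2..3 predict taken; every slot starts at 2). */
double buffer_accuracy(const struct branch_event *trace, size_t n) {
  assert(n > 0);
  int counter[SLOTS];
  for (int i = 0; i < SLOTS; i++)
    counter[i] = 2;
  size_t correct = 0;
  for (size_t i = 0; i < n; i++) {
    int *c = &counter[trace[i].slot % SLOTS];
    bool predicted = *c >= 2;
    if (predicted == trace[i].taken)
      correct++;
    if (trace[i].taken) {
      if (*c < 3)
        (*c)++;
    } else {
      if (*c > 0)
        (*c)--;
    }
  }
  return (double)correct / (double)n;
}
```

<p>On a trace that alternates between a backward branch and a forward branch, both always taken, <code class="language-plaintext highlighter-rouge">btfnt_accuracy</code> gets exactly half of the predictions right while <code class="language-plaintext highlighter-rouge">buffer_accuracy</code> gets all of them. Real programs mix many more branches, so measured numbers will differ.</p>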

<h3 id="53-意见和建议">5.3 Comments and Suggestions</h3>

<ol>
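  <li>Writing the RISC-V CPU simulator greatly improved my systems programming skills. I ran into some hard-to-fix bugs along the way, but fixing them gave me a lot of debugging experience and a deeper appreciation of how important it is to write robust programs with thorough error handling.</li>
  <li>While setting up the RISC-V environment, I found that the documentation of the RISC-V toolchain has gaps, and I sometimes ran into odd default configurations or outdated parameters, which made installing the tools harder. It would help if a tutorial on setting up the environment were provided before each lab is released.</li>
  <li>The computer architecture course teaches MIPS, yet for some reason the lab requires RISC-V, which adds somewhat to the learning cost and to the time needed to finish the lab.</li>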
</ol>]]></content><author><name>Hao He</name><email>haohe@andrew.cmu.edu</email></author><category term="Chinese" /><category term="C++" /><category term="Computer Architecture" /><summary type="html"><![CDATA[RISC-V is an open architecture and instruction set standard that originated at Berkeley. This simulator implements the RV64I instruction set defined in RISC-V Specification 2.2 on a standard five-stage pipeline, and it also implements a branch prediction module and virtual memory simulation. Implementing a complete CPU simulator is good practice for systems programming and deepens one's understanding of computer architecture. Before starting, you should read and thoroughly understand Chapter 4 of Computer Systems: A Programmer’s Perspective, or the relevant chapters of Computer Organization and Design: The Hardware/Software Interface.]]></summary></entry><entry><title type="html">Notes on Reading the XV6 Operating System Source Code (Part 2): Processes</title><link href="https://hehao98.github.io/posts/2019/03/xv6-2/" rel="alternate" type="text/html" title="Notes on Reading the XV6 Operating System Source Code (Part 2): Processes" /><published>2019-03-25T00:00:00-07:00</published><updated>2019-03-25T00:00:00-07:00</updated><id>https://hehao98.github.io/posts/2019/03/xv6-2</id><content type="html" xml:base="https://hehao98.github.io/posts/2019/03/xv6-2/"><![CDATA[<p><a href="https://hehao98.github.io/posts/2019/03/xv6-1/">Link to the previous post</a></p>

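<h2 id="1-进程的基本概念">1. Basic Concepts of Processes</h2>

<p>In the abstract, a process is an instance of a running program, while a thread is the smallest unit of a CPU instruction stream. A process is the smallest unit of resource allocation in an operating system, and a thread is the smallest unit of scheduling. In terms of implementation, XV6 implements only processes and offers no additional support for threads; a user process always has exactly one user-visible flow of execution.</p>

<h2 id="2-进程管理的数据结构">2. Data Structures for Process Management</h2>

<p>According to [1], the data structure used for process management is called the Process Control Block (PCB). The PCB of a process must store the following two kinds of information:</p>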

<ol>
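  <li>The information the operating system needs to manage the running process, such as its priority, process ID, and process context.</li>
  <li>The complete environment an application needs in order to run, such as its virtual memory information and its open files and I/O devices.</li>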
</ol>

<h3 id="xv6中进程相关的数据结构">Process-Related Data Structures in XV6</h3>

<p>In XV6, the data structure that describes a process is as follows:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Per-process state</span>
<span class="k">struct</span> <span class="n">proc</span> <span class="p">{</span>
  <span class="n">uint</span> <span class="n">sz</span><span class="p">;</span>                     <span class="c1">// Size of process memory (bytes)</span>
  <span class="n">pde_t</span><span class="o">*</span> <span class="n">pgdir</span><span class="p">;</span>                <span class="c1">// Page table</span>
  <span class="kt">char</span> <span class="o">*</span><span class="n">kstack</span><span class="p">;</span>                <span class="c1">// Bottom of kernel stack for this process</span>
  <span class="k">enum</span> <span class="n">procstate</span> <span class="n">state</span><span class="p">;</span>        <span class="c1">// Process state</span>
  <span class="kt">int</span> <span class="n">pid</span><span class="p">;</span>                     <span class="c1">// Process ID</span>
  <span class="k">struct</span> <span class="n">proc</span> <span class="o">*</span><span class="n">parent</span><span class="p">;</span>         <span class="c1">// Parent process</span>
  <span class="k">struct</span> <span class="n">trapframe</span> <span class="o">*</span><span class="n">tf</span><span class="p">;</span>        <span class="c1">// Trap frame for current syscall</span>
  <span class="k">struct</span> <span class="n">context</span> <span class="o">*</span><span class="n">context</span><span class="p">;</span>     <span class="c1">// swtch() here to run process</span>
  <span class="kt">void</span> <span class="o">*</span><span class="n">chan</span><span class="p">;</span>                  <span class="c1">// If non-zero, sleeping on chan</span>
  <span class="kt">int</span> <span class="n">killed</span><span class="p">;</span>                  <span class="c1">// If non-zero, have been killed</span>
  <span class="k">struct</span> <span class="n">file</span> <span class="o">*</span><span class="n">ofile</span><span class="p">[</span><span class="n">NOFILE</span><span class="p">];</span>  <span class="c1">// Open files</span>
  <span class="k">struct</span> <span class="n">inode</span> <span class="o">*</span><span class="n">cwd</span><span class="p">;</span>           <span class="c1">// Current directory</span>
  <span class="kt">char</span> <span class="n">name</span><span class="p">[</span><span class="mi">16</span><span class="p">];</span>               <span class="c1">// Process name (debugging)</span>
<span class="p">};</span>
</code></pre></div></div>

<p>Its fields map onto the two kinds of information above as follows:</p>

<ol>
  <li>
    <p>Information the operating system uses to manage the process: the kernel stack <code class="language-plaintext highlighter-rouge">kstack</code>, the process state <code class="language-plaintext highlighter-rouge">state</code>, the process ID <code class="language-plaintext highlighter-rouge">pid</code>, the parent process <code class="language-plaintext highlighter-rouge">parent</code>, the trap frame <code class="language-plaintext highlighter-rouge">tf</code>, the process context <code class="language-plaintext highlighter-rouge">context</code>, and the <code class="language-plaintext highlighter-rouge">chan</code> and <code class="language-plaintext highlighter-rouge">killed</code> variables used by <code class="language-plaintext highlighter-rouge">sleep</code> and <code class="language-plaintext highlighter-rouge">kill</code>.</p>
  </li>
  <li>
    <p>The complete environment the process itself needs in order to run: the virtual memory information <code class="language-plaintext highlighter-rouge">sz</code> and <code class="language-plaintext highlighter-rouge">pgdir</code>, the open files <code class="language-plaintext highlighter-rouge">ofile</code>, and the current directory <code class="language-plaintext highlighter-rouge">cwd</code>.</p>
  </li>
</ol>

<p>In addition, <code class="language-plaintext highlighter-rouge">proc</code> has a <code class="language-plaintext highlighter-rouge">name</code> field that holds the process name and is used for debugging.</p>

<p>The operating system stores every <code class="language-plaintext highlighter-rouge">struct proc</code> in <code class="language-plaintext highlighter-rouge">ptable</code>, which is defined as follows:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="p">{</span>
  <span class="k">struct</span> <span class="n">spinlock</span> <span class="n">lock</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">proc</span> <span class="n">proc</span><span class="p">[</span><span class="n">NPROC</span><span class="p">];</span>
<span class="p">}</span> <span class="n">ptable</span><span class="p">;</span>
</code></pre></div></div>

<p>Besides the lock <code class="language-plaintext highlighter-rouge">lock</code>, which provides mutual exclusion, the notable point is that XV6 caps the number of processes that can exist at the same time. Here <code class="language-plaintext highlighter-rouge">NPROC</code> is 64, so XV6 allows at most 64 processes to exist simultaneously.</p>

<p>In <code class="language-plaintext highlighter-rouge">proc.c</code>, <code class="language-plaintext highlighter-rouge">userinit()</code> creates the first user process, while <code class="language-plaintext highlighter-rouge">allocproc()</code> looks for a free slot in <code class="language-plaintext highlighter-rouge">ptable</code> and creates a new process in it. During operating system initialization, <code class="language-plaintext highlighter-rouge">userinit()</code> calls <code class="language-plaintext highlighter-rouge">allocproc()</code> to create the first process, <code class="language-plaintext highlighter-rouge">init</code>. Most of the per-process information is initialized here. Because returning from a trap is the only way XV6 enters user mode from kernel mode, <code class="language-plaintext highlighter-rouge">allocproc()</code> builds the stack layout of a trap, and <code class="language-plaintext highlighter-rouge">userinit()</code> fills in its values so that it looks as if the process were returning from a real trap. Finally, in <code class="language-plaintext highlighter-rouge">mpmain()</code>, the system calls <code class="language-plaintext highlighter-rouge">scheduler()</code> to start scheduling user processes. Once the <code class="language-plaintext highlighter-rouge">init</code> process has been scheduled and started, it creates the <code class="language-plaintext highlighter-rouge">shell</code> process to interact with the user.</p>
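<p>The slot search performed by <code class="language-plaintext highlighter-rouge">allocproc()</code> can be sketched in a few lines of C. This is a simplified model rather than the XV6 source: <code class="language-plaintext highlighter-rouge">alloc_slot</code>, <code class="language-plaintext highlighter-rouge">free_slot</code> and the two-field <code class="language-plaintext highlighter-rouge">struct proc</code> are hypothetical stand-ins, and the locking and kernel stack setup of the real function are left out.</p>

```c
#include <assert.h>
#include <stddef.h>

#define NPROC 64 /* same limit as in XV6 */

enum procstate { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };

/* Cut-down stand-in for XV6's struct proc. */
struct proc {
  enum procstate state;
  int pid;
};

/* Zero-initialized, so every slot starts out UNUSED (== 0). */
static struct proc table[NPROC];
static int nextpid = 1;

/* Claim the first UNUSED slot, or return NULL when the table is full.
   The real allocproc() holds ptable.lock during the scan; omitted here. */
struct proc *alloc_slot(void) {
  for (struct proc *p = table; p < &table[NPROC]; p++) {
    if (p->state == UNUSED) {
      p->state = EMBRYO;
      p->pid = nextpid++;
      return p;
    }
  }
  return NULL;
}

/* Give a slot back, as wait() does when it reaps a zombie. */
void free_slot(struct proc *p) {
  assert(p >= table && p < &table[NPROC]);
  p->pid = 0;
  p->state = UNUSED;
}
```

<p>With this model, the 65th call to <code class="language-plaintext highlighter-rouge">alloc_slot()</code> fails until some earlier slot is released, which is the fixed upper bound described above.</p>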

<h3 id="linux中进程相关的数据结构">Process-Related Data Structures in Linux</h3>

<p>The Linux implementation does not deliberately distinguish processes from threads; it stores both in a data structure called <code class="language-plaintext highlighter-rouge">task_struct</code>. When two <code class="language-plaintext highlighter-rouge">task_struct</code>s share the same virtual address space, they are two threads of the same process. Most of the process-related data structures in Linux are defined in <code class="language-plaintext highlighter-rouge">/include/linux/sched.h</code>. <code class="language-plaintext highlighter-rouge">task_struct</code> is quite complex: on a 32-bit machine a single instance can take about 1.7 KiB. Its main components include <code class="language-plaintext highlighter-rouge">thread_struct</code>, which manages low-level processor state; <code class="language-plaintext highlighter-rouge">mm_struct</code>, which manages virtual memory; <code class="language-plaintext highlighter-rouge">files_struct</code>, which manages file descriptors; and <code class="language-plaintext highlighter-rouge">signal_struct</code>, which manages signals. As in XV6, every Linux process has its own kernel stack, and kernel-mode code runs on that stack.</p>

<p><img src="https://hehao98.github.io/assets/xv6-pic/linux_proc1.png" alt="" /></p>

<p>The operating system maintains several queues of <code class="language-plaintext highlighter-rouge">task_struct</code>s for different purposes, all implemented as doubly linked lists. One queue holds all processes; another holds all runnable processes (<code class="language-plaintext highlighter-rouge">struct runqueue</code> in <code class="language-plaintext highlighter-rouge">kernel/sched.c</code>); and for every event that can cause a process to block, there is a queue of the processes suspended while waiting for it (<code class="language-plaintext highlighter-rouge">wait_queue_t</code> in <code class="language-plaintext highlighter-rouge">include/linux/wait.h</code>).</p>

<p><img src="https://hehao98.github.io/assets/xv6-pic/linux_proc2.png" alt="" /></p>

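<p>Linux places the <code class="language-plaintext highlighter-rouge">thread_info</code> structure at the bottom of each process's kernel stack. <code class="language-plaintext highlighter-rouge">thread_info</code> holds a pointer to the process's <code class="language-plaintext highlighter-rouge">task_struct</code>, which is allocated separately. The name <code class="language-plaintext highlighter-rouge">thread_info</code> is somewhat misleading: what it actually stores are the lower-level, more architecture-specific attributes of a <code class="language-plaintext highlighter-rouge">task</code>. Process data structures are allocated by the Slab Allocator, a carefully optimized memory allocator that makes process management more efficient by caching and reusing objects.</p>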

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">thread_info</span> <span class="p">{</span>
    <span class="k">struct</span> <span class="n">task_struct</span> <span class="o">*</span><span class="n">task</span><span class="p">;</span>
    <span class="k">struct</span> <span class="n">exec_domain</span> <span class="o">*</span><span class="n">exec_domain</span><span class="p">;</span>
    <span class="n">__u32</span> <span class="n">flags</span><span class="p">;</span>
    <span class="n">__u32</span> <span class="n">status</span><span class="p">;</span>
    <span class="n">__u32</span> <span class="n">cpu</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">preempt_count</span><span class="p">;</span>
    <span class="n">mm_segment_t</span> <span class="n">addr_limit</span><span class="p">;</span>
    <span class="k">struct</span> <span class="n">restart_block</span> <span class="n">restart_block</span><span class="p">;</span>
    <span class="kt">void</span> <span class="o">*</span><span class="n">sysenter_return</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">uaccess_err</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>
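<p>Keeping <code class="language-plaintext highlighter-rouge">thread_info</code> at the bottom of the kernel stack means it can be located from nothing but the current stack pointer. The C sketch below shows the address arithmetic only. It assumes 8 KiB kernel stacks aligned to their own size (a common configuration on 32-bit x86 kernels of that era, but an assumption here), and <code class="language-plaintext highlighter-rouge">thread_info_base</code> is a hypothetical helper name, not a kernel function.</p>

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE 8192u /* assumed kernel stack size: 8 KiB */

/* Round a stack pointer down to the start of its kernel stack,
   which is where thread_info is stored. */
uintptr_t thread_info_base(uintptr_t sp) {
  /* The mask trick only works when the stack size is a power of two
     and every stack is aligned to its own size. */
  assert((THREAD_SIZE & (THREAD_SIZE - 1u)) == 0);
  return sp & ~((uintptr_t)THREAD_SIZE - 1u);
}
```

<p>For example, any stack pointer between <code class="language-plaintext highlighter-rouge">0xC0102000</code> and <code class="language-plaintext highlighter-rouge">0xC0103FFF</code> maps to the base <code class="language-plaintext highlighter-rouge">0xC0102000</code>; following the <code class="language-plaintext highlighter-rouge">task</code> pointer stored there then yields the <code class="language-plaintext highlighter-rouge">task_struct</code>.</p>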

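<h3 id="windows中进程相关的数据结构">Process-Related Data Structures in Windows</h3>

<p>In Windows NT and later versions of Windows, a process is represented by an <code class="language-plaintext highlighter-rouge">EPROCESS</code> object and a thread by an <code class="language-plaintext highlighter-rouge">ETHREAD</code> object. An <code class="language-plaintext highlighter-rouge">EPROCESS</code> object holds the resource-related information of the process, such as its handle table, virtual memory, security, debugging, exceptions, creation information, I/O transfer statistics, and process timing. Each <code class="language-plaintext highlighter-rouge">EPROCESS</code> object contains a linked list of <code class="language-plaintext highlighter-rouge">ETHREAD</code> structures. It is worth noting that both <code class="language-plaintext highlighter-rouge">EPROCESS</code> and <code class="language-plaintext highlighter-rouge">ETHREAD</code> are layered: the <code class="language-plaintext highlighter-rouge">KPROCESS</code> and <code class="language-plaintext highlighter-rouge">KTHREAD</code> member objects deal specifically with architecture-related details, while the Process Environment Block and Thread Environment Block objects are exposed for applications to access.</p>

<h2 id="3-进程的状态">3. Process States</h2>

<p>In the standard five-state process model used by most textbooks, a process is in one of five states: New, Ready, Running, Waiting, or Terminated. The state transition diagram is shown below (from Operating System Concepts, 7th Edition).</p>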

<p><img src="https://hehao98.github.io/assets/xv6-pic/process_state.png" alt="" /></p>

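<p>Apart from the <code class="language-plaintext highlighter-rouge">UNUSED</code> state, which marks a process slot that is not in use, the states in XV6 correspond exactly to the five-state model above: <code class="language-plaintext highlighter-rouge">EMBRYO</code> to New, <code class="language-plaintext highlighter-rouge">RUNNABLE</code> to Ready, <code class="language-plaintext highlighter-rouge">RUNNING</code> to Running, <code class="language-plaintext highlighter-rouge">SLEEPING</code> to Waiting, and <code class="language-plaintext highlighter-rouge">ZOMBIE</code> to Terminated. XV6 defines the states as</p>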

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">enum</span> <span class="n">procstate</span> <span class="p">{</span> <span class="n">UNUSED</span><span class="p">,</span> <span class="n">EMBRYO</span><span class="p">,</span> <span class="n">SLEEPING</span><span class="p">,</span> <span class="n">RUNNABLE</span><span class="p">,</span> <span class="n">RUNNING</span><span class="p">,</span> <span class="n">ZOMBIE</span> <span class="p">};</span>
</code></pre></div></div>

<p>The concrete transitions in the XV6 source code are as follows:</p>

<p><img src="https://hehao98.github.io/assets/xv6-pic/xv6-process.png" alt="" /></p>

<p>This diagram labels each state transition with the function or event that triggers it in XV6.</p>
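<p>For readers who cannot see the diagram, the transitions can also be written down as a lookup table. The C sketch below reflects my reading of the XV6 source (the function named in each comment is the one that performs the transition); the table and the helper <code class="language-plaintext highlighter-rouge">can_transition</code> are illustrative and do not exist in XV6 itself.</p>

```c
#include <assert.h>
#include <stdbool.h>

enum procstate { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };
#define NSTATE 6

/* allowed[from][to] is true when XV6 performs that transition. */
static const bool allowed[NSTATE][NSTATE] = {
  [UNUSED]   = { [EMBRYO] = true },            /* allocproc() */
  [EMBRYO]   = { [RUNNABLE] = true,            /* userinit(), fork() */
                 [UNUSED] = true },            /* allocation failed */
  [SLEEPING] = { [RUNNABLE] = true },          /* wakeup(), kill() */
  [RUNNABLE] = { [RUNNING] = true },           /* scheduler() */
  [RUNNING]  = { [RUNNABLE] = true,            /* yield() */
                 [SLEEPING] = true,            /* sleep() */
                 [ZOMBIE] = true },            /* exit() */
  [ZOMBIE]   = { [UNUSED] = true },            /* wait() in the parent */
};

/* Hypothetical helper: is this state change one that XV6 ever makes? */
bool can_transition(enum procstate from, enum procstate to) {
  assert((int)from >= 0 && (int)from < NSTATE);
  assert((int)to >= 0 && (int)to < NSTATE);
  return allowed[from][to];
}
```

<p>A check such as <code class="language-plaintext highlighter-rouge">can_transition(SLEEPING, RUNNING)</code> returns false: a sleeping process must first become <code class="language-plaintext highlighter-rouge">RUNNABLE</code> and then be picked by the scheduler.</p>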

<p><img src="https://hehao98.github.io/assets/xv6-pic/linux_proc_state.png" alt="" /></p>

<p>The process state transition diagram of Linux is broadly the same as that of Xv6, with the following differences:</p>

<ol>
  <li>Waiting is split into two states in Linux: Interruptible Waiting and Uninterruptible Waiting.</li>
  <li>Linux has two additional states, <code class="language-plaintext highlighter-rouge">TASK_TRACED</code> and <code class="language-plaintext highlighter-rouge">TASK_STOPPED</code>, used when a process is being debugged or has been stopped.</li>
</ol>

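<p>As for Windows process states, no particularly detailed explanation of the implementation is available online. A document on Windows processes [6] uses the following state transition diagram, but it does not explicitly say how the diagram relates to the actual Windows implementation.</p>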

<p><img src="https://hehao98.github.io/assets/xv6-pic/winproc.png" alt="" /></p>

<p>Judging from the diagram, Windows processes may have an extra Suspend state, which is used in one of the following situations:</p>

<ol>
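  <li>When the process is moved out to disk because memory is low</li>
  <li>When the operating system decides to suspend a background process</li>
  <li>When the user manually suspends a process for debugging or some other reason, or when a parent process suspends one of its children through a system call</li>
  <li>When a process is scheduled to run only at set times, it is suspended outside those times</li>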
</ol>

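<p>I believe operating systems are designed with these states because they need to manage and schedule multiple processes on limited system resources. If a CPU core only ever ran a single process, process states would be entirely unnecessary. A practical modern operating system, however, must let a large number of processes share one CPU and must support processes being created and terminated continually, in order to maximize resource utilization and offer diverse functionality. Operating systems have therefore settled on the five-state design, with finer-grained variants depending on the needs of each system.</p>

<h2 id="4-进程的调度算法">4. Process Scheduling Algorithms</h2>

<p>A process in Xv6 can create a new process with the <code class="language-plaintext highlighter-rouge">fork()</code> system call. To create a process, the operating system must allocate resources for it, including memory, CPU time, and files. It must also set up the bookkeeping needed to manage it, including its process ID, scheduling priority, virtual memory structures, and resource limits. To keep multiple processes running on one CPU, the operating system has to schedule them. There are many scheduling algorithms; from the simplest to the most complex:</p>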

<ol>
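  <li>First Come, First Served. The operating system maintains a queue of runnable processes and runs them in the order in which they entered the queue. The advantage is simplicity: any system that can implement a linked list can implement this algorithm. The problem is that it is suboptimal under many metrics, and how well it does depends entirely on the order in which processes enter the queue.</li>
  <li>Shortest Task First. The process with the shortest expected running time runs first. It can be proved that, for a set of processes with known running times, this algorithm minimizes the total waiting time of all processes. However, estimating the running time of a process is extremely hard, which limits the usefulness of the algorithm in practice. It also suffers from starvation: a steady stream of short processes can keep a long process from ever running.</li>
  <li>Round-Robin. Every runnable process is given a time slice, and the processes take turns executing. Although this algorithm is not efficient, it is the fairest of these and does not suffer from starvation.</li>
  <li>Priority Based Multilevel Queue. Each process is assigned a priority, and each priority level has its own process queue. Higher priorities run first, and processes within the same priority are scheduled by one of the algorithms above. To keep low-priority processes from starving, some form of dynamic priority adjustment is usually applied.</li>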
</ol>
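<p>The difference between the first two algorithms shows up clearly in a small worked example. The C sketch below assumes that all jobs arrive at time 0 with known run times and that jobs run back to back without preemption; <code class="language-plaintext highlighter-rouge">total_wait</code> and <code class="language-plaintext highlighter-rouge">total_wait_shortest_first</code> are names made up for this illustration.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Three-way comparison for qsort, written to avoid integer overflow. */
static int cmp_int(const void *a, const void *b) {
  int x = *(const int *)a, y = *(const int *)b;
  return (x > y) - (x < y);
}

/* Sum of waiting times when jobs run back to back in the given order
   (this is First Come, First Served for that arrival order). */
long total_wait(const int *burst, size_t n) {
  long elapsed = 0, wait = 0;
  for (size_t i = 0; i < n; i++) {
    assert(burst[i] >= 0);
    wait += elapsed;      /* job i waits for everything before it */
    elapsed += burst[i];
  }
  return wait;
}

/* Same jobs, but the shortest one always runs first.
   Returns -1 if memory for the sorted copy cannot be allocated. */
long total_wait_shortest_first(const int *burst, size_t n) {
  if (n == 0)
    return 0;
  int *sorted = malloc(n * sizeof *sorted);
  if (sorted == NULL)
    return -1;
  memcpy(sorted, burst, n * sizeof *sorted);
  qsort(sorted, n, sizeof *sorted, cmp_int);
  long wait = total_wait(sorted, n);
  free(sorted);
  return wait;
}
```

<p>For run times of 24, 3, and 3 in arrival order, First Come, First Served gives a total wait of 0 + 24 + 27 = 51, whereas running the shortest jobs first gives 0 + 3 + 6 = 9.</p>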

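<p>The scheduler of a modern operating system is usually a variant of Priority Based Multilevel Queue. Concretely, different classes of processes are given different priorities according to the needs of the system. In Windows, for example, the process that owns the window the user is currently working in has a very high priority. Each priority level keeps its own queue, and each level may use a different scheduling algorithm: high-priority foreground processes can use Round-Robin, while background processes can use First Come, First Served. If a process has not run for a long time, its priority can be raised, moving it from a low-priority queue to a higher one, which avoids starvation.</p>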
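<p>The priority-raising idea (often called aging) can be sketched as follows. This is a toy model rather than the algorithm of any real operating system: every task is assumed to be runnable, priority 0 is the highest, and <code class="language-plaintext highlighter-rouge">pick_next</code>, <code class="language-plaintext highlighter-rouge">schedule_round</code> and the <code class="language-plaintext highlighter-rouge">boost_after</code> threshold are hypothetical.</p>

```c
#include <assert.h>

struct task {
  int priority; /* 0 is the highest priority */
  int waited;   /* scheduling rounds spent waiting since last run or boost */
};

/* Index of the highest-priority task; among equals, the one that has
   waited longest wins, and remaining ties go to the lower index. */
int pick_next(const struct task *t, int n) {
  assert(n > 0);
  int best = 0;
  for (int i = 1; i < n; i++) {
    if (t[i].priority < t[best].priority ||
        (t[i].priority == t[best].priority && t[i].waited > t[best].waited))
      best = i;
  }
  return best;
}

/* One scheduling round: the chosen task runs, every other task ages and
   is promoted one level after waiting boost_after rounds. */
int schedule_round(struct task *t, int n, int boost_after) {
  int run = pick_next(t, n);
  for (int i = 0; i < n; i++) {
    if (i == run) {
      t[i].waited = 0;
      continue;
    }
    if (++t[i].waited >= boost_after && t[i].priority > 0) {
      t[i].priority--;
      t[i].waited = 0;
    }
  }
  return run;
}
```

<p>With one task at priority 0, another at priority 2, and <code class="language-plaintext highlighter-rouge">boost_after</code> set to 2, the low-priority task is promoted twice and gets the CPU in the sixth round instead of starving.</p>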

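<p>In general, because of limits on CPU resources and on kernel memory, an operating system caps the number of processes that may exist at once. In Xv6 at most 64 processes can exist at the same time: the operating system keeps an array of 64 <code class="language-plaintext highlighter-rouge">struct proc</code> entries and allocates new processes from it.</p>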

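<p>The context of a process contains everything needed to resume its execution, mainly register values and the run-time stack. In Xv6, a context switch saves the callee-saved registers <code class="language-plaintext highlighter-rouge">%edi %esi %ebx %ebp</code> of the old process, its stack pointer <code class="language-plaintext highlighter-rouge">%esp</code>, and its program counter <code class="language-plaintext highlighter-rouge">%eip</code>, and then loads the same registers for the new process. Notably, a process in Xv6 only ever switches to the kernel scheduler, which in turn switches to the next process.</p>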
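<p>The "process to scheduler to next process" pattern can be tried out in user space. The C sketch below uses the <code class="language-plaintext highlighter-rouge">getcontext</code>/<code class="language-plaintext highlighter-rouge">makecontext</code>/<code class="language-plaintext highlighter-rouge">swapcontext</code> functions from <code class="language-plaintext highlighter-rouge">ucontext.h</code> as a stand-in for the assembly routine in Xv6; this is an analogy I chose for illustration, not how Xv6 is implemented, and it assumes a platform such as Linux with glibc where these functions are available. All other names are made up.</p>

```c
#include <assert.h>
#include <ucontext.h>

#define NTASK 2
#define STACK_SIZE (64 * 1024)
#define MAX_LOG 16

static ucontext_t sched_ctx;          /* plays the role of the scheduler */
static ucontext_t task_ctx[NTASK];    /* one saved context per task */
static char stacks[NTASK][STACK_SIZE];
static int run_log[MAX_LOG];          /* which task ran, in order */
static int run_len;

/* Each task records that it ran, then switches back to the scheduler.
   A task never switches directly to another task. */
static void task_body(int id) {
  for (int step = 0; step < 2; step++) {
    assert(run_len < MAX_LOG);
    run_log[run_len++] = id;
    swapcontext(&task_ctx[id], &sched_ctx);
  }
}

/* Round-robin "scheduler": returns how many log entries were written. */
int run_tasks(void) {
  run_len = 0;
  for (int id = 0; id < NTASK; id++) {
    getcontext(&task_ctx[id]);
    task_ctx[id].uc_stack.ss_sp = stacks[id];
    task_ctx[id].uc_stack.ss_size = STACK_SIZE;
    task_ctx[id].uc_link = &sched_ctx; /* resume scheduler when task ends */
    makecontext(&task_ctx[id], (void (*)(void))task_body, 1, id);
  }
  for (int round = 0; round < 3; round++)
    for (int id = 0; id < NTASK; id++)
      swapcontext(&sched_ctx, &task_ctx[id]);
  return run_len;
}
```

<p>Each <code class="language-plaintext highlighter-rouge">swapcontext</code> call saves the registers of the caller and loads those of the target, so the two tasks interleave as 0, 1, 0, 1 even though neither knows about the other.</p>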

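<p>The official documentation describes the details of process scheduling very well, so they are not repeated here.</p>

<p>As for the relationship between multiple processes and multiple CPUs: each process appears to own a dedicated virtual CPU, but in reality the operating system runs many processes on one or more CPUs, and there is no direct relationship between the number of processes and the number of CPUs.</p>

<h2 id="5-内核态进程与用户态进程">5. Kernel-Mode and User-Mode Processes</h2>

<p>A kernel-mode process, as the name suggests, is a process that executes in the kernel mode of the operating system. Such processes generally carry out the lowest-level, most central functions of the operating system, those that cannot be done in user mode. For example, the scheduler is a kernel-mode process in Xv6, because process scheduling cannot be performed from user mode. User-mode processes, by contrast, can serve all kinds of purposes; they rely only on the system calls the operating system provides and do not need to manipulate kernel data structures directly. The init process and the shell process are user-mode processes in Xv6.</p>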

<h2 id="6-进程的内存布局">6. Process Memory Layout</h2>

<p><img src="https://hehao98.github.io/assets/xv6-pic/xv6mem.png" alt="" /></p>

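<p>The figure above shows the layout of an Xv6 process in virtual memory. The pages are most likely not arranged this way in physical memory, but the virtual memory system gives every process the same memory abstraction. The process stack holds run-time data and the record of execution: the chain of nested function calls, function arguments, and temporary variables. The stack is usually small and does not grow at run time, so it is unsuitable for storing large amounts of data. The heap, by contrast, holds dynamically allocated data. Its size can usually grow at run time, and it is generally used for large data and for data that must remain accessible throughout the execution of the program. Statically allocated global variables live in the data region rather than in the heap.</p>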
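<p>Heap growth in Xv6 comes down to changing the size of the process. The C sketch below models only that bookkeeping: <code class="language-plaintext highlighter-rouge">model_sbrk</code> and <code class="language-plaintext highlighter-rouge">struct uproc</code> are hypothetical, no pages are actually allocated, and the only rule taken from Xv6 is that user memory must stay below <code class="language-plaintext highlighter-rouge">KERNBASE</code>.</p>

```c
#include <assert.h>
#include <stdint.h>

#define KERNBASE 0x80000000u /* user addresses must stay below this */

/* Stand-in for the sz field of struct proc. */
struct uproc {
  uint32_t sz; /* size of process memory in bytes */
};

/* Model of sbrk(n): grow (n > 0) or shrink (n < 0) the process by n bytes.
   Returns the old size, which is where the new region starts,
   or -1 if the request would leave the valid user address range. */
int64_t model_sbrk(struct uproc *p, int32_t n) {
  assert(p != 0);
  int64_t old = p->sz;
  int64_t newsz = old + n;
  if (newsz < 0 || newsz >= KERNBASE)
    return -1;
  p->sz = (uint32_t)newsz;
  return old;
}
```

<p>A user-level allocator would call the real <code class="language-plaintext highlighter-rouge">sbrk()</code> in the same way whenever it runs out of memory, then hand out pieces of the newly added region.</p>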

<h2 id="7-forkwaitexit系统调用的实现">7. Implementation of the fork, wait, and exit System Calls</h2>

<h3 id="fork函数">The <code class="language-plaintext highlighter-rouge">fork()</code> Function</h3>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Create a new process copying p as the parent.</span>
<span class="c1">// Sets up stack to return as if from system call.</span>
<span class="c1">// Caller must set state of returned proc to RUNNABLE.</span>
<span class="kt">int</span>
<span class="nf">fork</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>
<span class="p">{</span>
  <span class="kt">int</span> <span class="n">i</span><span class="p">,</span> <span class="n">pid</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">proc</span> <span class="o">*</span><span class="n">np</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">proc</span> <span class="o">*</span><span class="n">curproc</span> <span class="o">=</span> <span class="n">myproc</span><span class="p">();</span>

  <span class="c1">// Allocate process.</span>
  <span class="k">if</span><span class="p">((</span><span class="n">np</span> <span class="o">=</span> <span class="n">allocproc</span><span class="p">())</span> <span class="o">==</span> <span class="mi">0</span><span class="p">){</span>
    <span class="k">return</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
  <span class="p">}</span>

  <span class="c1">// Copy process state from proc.</span>
  <span class="k">if</span><span class="p">((</span><span class="n">np</span><span class="o">-&gt;</span><span class="n">pgdir</span> <span class="o">=</span> <span class="n">copyuvm</span><span class="p">(</span><span class="n">curproc</span><span class="o">-&gt;</span><span class="n">pgdir</span><span class="p">,</span> <span class="n">curproc</span><span class="o">-&gt;</span><span class="n">sz</span><span class="p">))</span> <span class="o">==</span> <span class="mi">0</span><span class="p">){</span>
    <span class="n">kfree</span><span class="p">(</span><span class="n">np</span><span class="o">-&gt;</span><span class="n">kstack</span><span class="p">);</span>
    <span class="n">np</span><span class="o">-&gt;</span><span class="n">kstack</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="n">np</span><span class="o">-&gt;</span><span class="n">state</span> <span class="o">=</span> <span class="n">UNUSED</span><span class="p">;</span>
    <span class="k">return</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
  <span class="p">}</span>
  <span class="n">np</span><span class="o">-&gt;</span><span class="n">sz</span> <span class="o">=</span> <span class="n">curproc</span><span class="o">-&gt;</span><span class="n">sz</span><span class="p">;</span>
  <span class="n">np</span><span class="o">-&gt;</span><span class="n">parent</span> <span class="o">=</span> <span class="n">curproc</span><span class="p">;</span>
  <span class="o">*</span><span class="n">np</span><span class="o">-&gt;</span><span class="n">tf</span> <span class="o">=</span> <span class="o">*</span><span class="n">curproc</span><span class="o">-&gt;</span><span class="n">tf</span><span class="p">;</span>

  <span class="c1">// Clear %eax so that fork returns 0 in the child.</span>
  <span class="n">np</span><span class="o">-&gt;</span><span class="n">tf</span><span class="o">-&gt;</span><span class="n">eax</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>

  <span class="k">for</span><span class="p">(</span><span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">NOFILE</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
    <span class="k">if</span><span class="p">(</span><span class="n">curproc</span><span class="o">-&gt;</span><span class="n">ofile</span><span class="p">[</span><span class="n">i</span><span class="p">])</span>
      <span class="n">np</span><span class="o">-&gt;</span><span class="n">ofile</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">filedup</span><span class="p">(</span><span class="n">curproc</span><span class="o">-&gt;</span><span class="n">ofile</span><span class="p">[</span><span class="n">i</span><span class="p">]);</span>
  <span class="n">np</span><span class="o">-&gt;</span><span class="n">cwd</span> <span class="o">=</span> <span class="n">idup</span><span class="p">(</span><span class="n">curproc</span><span class="o">-&gt;</span><span class="n">cwd</span><span class="p">);</span>

  <span class="n">safestrcpy</span><span class="p">(</span><span class="n">np</span><span class="o">-&gt;</span><span class="n">name</span><span class="p">,</span> <span class="n">curproc</span><span class="o">-&gt;</span><span class="n">name</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">curproc</span><span class="o">-&gt;</span><span class="n">name</span><span class="p">));</span>

  <span class="n">pid</span> <span class="o">=</span> <span class="n">np</span><span class="o">-&gt;</span><span class="n">pid</span><span class="p">;</span>

  <span class="n">acquire</span><span class="p">(</span><span class="o">&amp;</span><span class="n">ptable</span><span class="p">.</span><span class="n">lock</span><span class="p">);</span>

  <span class="n">np</span><span class="o">-&gt;</span><span class="n">state</span> <span class="o">=</span> <span class="n">RUNNABLE</span><span class="p">;</span>

  <span class="n">release</span><span class="p">(</span><span class="o">&amp;</span><span class="n">ptable</span><span class="p">.</span><span class="n">lock</span><span class="p">);</span>

  <span class="k">return</span> <span class="n">pid</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

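<p>The code of <code class="language-plaintext highlighter-rouge">fork()</code> is shown above; the line numbers below count from the first line of the listing. <code class="language-plaintext highlighter-rouge">fork()</code> first calls <code class="language-plaintext highlighter-rouge">allocproc()</code> to obtain and initialize a process control block, <code class="language-plaintext highlighter-rouge">struct proc</code> (lines 12-14). <code class="language-plaintext highlighter-rouge">allocproc()</code> also initializes the kernel stack of the process: it sets up a Trap Frame on the kernel stack and zeroes the context part. Then <code class="language-plaintext highlighter-rouge">fork()</code> uses <code class="language-plaintext highlighter-rouge">copyuvm()</code> to copy the virtual memory of the original process (lines 17-24). So that the child returns in exactly the same state as the parent, the Trap Frame is copied as well (line 25; note the operator precedence here, since <code class="language-plaintext highlighter-rouge">*np-&gt;tf</code> means <code class="language-plaintext highlighter-rouge">*(np-&gt;tf)</code>). To make the system call return 0 in the child, the <code class="language-plaintext highlighter-rouge">eax</code> register of the child is set to 0 (line 28). Next, all of the file descriptors opened by the parent are copied to the child (lines 30-32), along with the current directory of the parent (line 33). These operations increase the reference counts of the files and of the directory. Finally, <code class="language-plaintext highlighter-rouge">fork()</code> copies the name of the parent, sets the state of the child to <code class="language-plaintext highlighter-rouge">RUNNABLE</code>, and returns the <code class="language-plaintext highlighter-rouge">pid</code> of the child to the parent. Later, when the child is scheduled to run, <code class="language-plaintext highlighter-rouge">fork()</code> returns a second time, in the child, with a return value of 0.</p>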
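<p>The Trap Frame copy is worth isolating, because the precedence is easy to misread. In the C sketch below, the three-field <code class="language-plaintext highlighter-rouge">struct trapframe</code> and the helper <code class="language-plaintext highlighter-rouge">copy_trapframe_for_child</code> are simplified stand-ins invented for illustration; only the two assignments mirror the listing.</p>

```c
#include <assert.h>

/* Simplified stand-ins; the real Xv6 structures have many more fields. */
struct trapframe {
  unsigned eax; /* holds the system call return value */
  unsigned eip;
  unsigned esp;
};

struct proc {
  struct trapframe *tf;
  int pid;
};

/* Give the child a copy of the parent's saved user registers,
   then make fork() return 0 in the child. */
void copy_trapframe_for_child(struct proc *np, const struct proc *curproc) {
  assert(np->tf != 0 && curproc->tf != 0 && np->tf != curproc->tf);
  /* -> binds tighter than unary *, so this is *(np->tf) = *(curproc->tf):
     the whole structure is copied, and the pointers stay distinct. */
  *np->tf = *curproc->tf;
  np->tf->eax = 0;
}
```

<p>After the call the child has its own copy of every saved register, so clearing its <code class="language-plaintext highlighter-rouge">eax</code> leaves the return value of the parent untouched.</p>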

<h3 id="wait函数">The <code class="language-plaintext highlighter-rouge">wait()</code> Function</h3>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Wait for a child process to exit and return its pid.</span>
<span class="c1">// Return -1 if this process has no children.</span>
<span class="kt">int</span>
<span class="nf">wait</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>
<span class="p">{</span>
  <span class="k">struct</span> <span class="n">proc</span> <span class="o">*</span><span class="n">p</span><span class="p">;</span>
  <span class="kt">int</span> <span class="n">havekids</span><span class="p">,</span> <span class="n">pid</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">proc</span> <span class="o">*</span><span class="n">curproc</span> <span class="o">=</span> <span class="n">myproc</span><span class="p">();</span>
  
  <span class="n">acquire</span><span class="p">(</span><span class="o">&amp;</span><span class="n">ptable</span><span class="p">.</span><span class="n">lock</span><span class="p">);</span>
  <span class="k">for</span><span class="p">(;;){</span>
    <span class="c1">// Scan through table looking for exited children.</span>
    <span class="n">havekids</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">for</span><span class="p">(</span><span class="n">p</span> <span class="o">=</span> <span class="n">ptable</span><span class="p">.</span><span class="n">proc</span><span class="p">;</span> <span class="n">p</span> <span class="o">&lt;</span> <span class="o">&amp;</span><span class="n">ptable</span><span class="p">.</span><span class="n">proc</span><span class="p">[</span><span class="n">NPROC</span><span class="p">];</span> <span class="n">p</span><span class="o">++</span><span class="p">){</span>
      <span class="k">if</span><span class="p">(</span><span class="n">p</span><span class="o">-&gt;</span><span class="n">parent</span> <span class="o">!=</span> <span class="n">curproc</span><span class="p">)</span>
        <span class="k">continue</span><span class="p">;</span>
      <span class="n">havekids</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
      <span class="k">if</span><span class="p">(</span><span class="n">p</span><span class="o">-&gt;</span><span class="n">state</span> <span class="o">==</span> <span class="n">ZOMBIE</span><span class="p">){</span>
        <span class="c1">// Found one.</span>
        <span class="n">pid</span> <span class="o">=</span> <span class="n">p</span><span class="o">-&gt;</span><span class="n">pid</span><span class="p">;</span>
        <span class="n">kfree</span><span class="p">(</span><span class="n">p</span><span class="o">-&gt;</span><span class="n">kstack</span><span class="p">);</span>
        <span class="n">p</span><span class="o">-&gt;</span><span class="n">kstack</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
        <span class="n">freevm</span><span class="p">(</span><span class="n">p</span><span class="o">-&gt;</span><span class="n">pgdir</span><span class="p">);</span>
        <span class="n">p</span><span class="o">-&gt;</span><span class="n">pid</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
        <span class="n">p</span><span class="o">-&gt;</span><span class="n">parent</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
        <span class="n">p</span><span class="o">-&gt;</span><span class="n">name</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
        <span class="n">p</span><span class="o">-&gt;</span><span class="n">killed</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
        <span class="n">p</span><span class="o">-&gt;</span><span class="n">state</span> <span class="o">=</span> <span class="n">UNUSED</span><span class="p">;</span>
        <span class="n">release</span><span class="p">(</span><span class="o">&amp;</span><span class="n">ptable</span><span class="p">.</span><span class="n">lock</span><span class="p">);</span>
        <span class="k">return</span> <span class="n">pid</span><span class="p">;</span>
      <span class="p">}</span>
    <span class="p">}</span>

    <span class="c1">// No point waiting if we don't have any children.</span>
    <span class="k">if</span><span class="p">(</span><span class="o">!</span><span class="n">havekids</span> <span class="o">||</span> <span class="n">curproc</span><span class="o">-&gt;</span><span class="n">killed</span><span class="p">){</span>
      <span class="n">release</span><span class="p">(</span><span class="o">&amp;</span><span class="n">ptable</span><span class="p">.</span><span class="n">lock</span><span class="p">);</span>
      <span class="k">return</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="c1">// Wait for children to exit.  (See wakeup1 call in proc_exit.)</span>
    <span class="n">sleep</span><span class="p">(</span><span class="n">curproc</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">ptable</span><span class="p">.</span><span class="n">lock</span><span class="p">);</span>  <span class="c1">//DOC: wait-sleep</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The code of <code class="language-plaintext highlighter-rouge">wait()</code> is shown above. <code class="language-plaintext highlighter-rouge">wait()</code> must first acquire the <code class="language-plaintext highlighter-rouge">ptable</code> lock (line 10), because it may modify <code class="language-plaintext highlighter-rouge">ptable</code>. It then scans <code class="language-plaintext highlighter-rouge">ptable</code> looking for its own children (lines 14-32). If it finds a zombie child, it reaps it: it frees the child's kernel stack and virtual memory and sets the slot's state to <code class="language-plaintext highlighter-rouge">UNUSED</code> (lines 18-30). Interestingly, <code class="language-plaintext highlighter-rouge">wait()</code> does not close the child's open file descriptors at all, because they were already closed in <code class="language-plaintext highlighter-rouge">exit()</code>, and only a process that has gone through <code class="language-plaintext highlighter-rouge">exit()</code> can be in the <code class="language-plaintext highlighter-rouge">ZOMBIE</code> state. If the process has no children (or has itself been killed), <code class="language-plaintext highlighter-rouge">wait()</code> releases the lock and returns -1 immediately; otherwise it calls <code class="language-plaintext highlighter-rouge">sleep()</code>, passing the <code class="language-plaintext highlighter-rouge">ptable</code> lock as an argument. Holding the <code class="language-plaintext highlighter-rouge">ptable</code> lock until <code class="language-plaintext highlighter-rouge">sleep()</code> has marked the process as <code class="language-plaintext highlighter-rouge">SLEEPING</code> closes a race: without it, a child could become a zombie and call <code class="language-plaintext highlighter-rouge">wakeup1()</code> in <code class="language-plaintext highlighter-rouge">exit()</code> after the parent's scan but before the parent is actually asleep. The parent would then miss the wakeup and could sleep forever (a lost wakeup).</p>
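<p>The control flow of one scan pass can be sketched in user space. This is only a simplified model, not xv6 code: <code class="language-plaintext highlighter-rouge">struct mproc</code>, <code class="language-plaintext highlighter-rouge">wait_scan_once()</code>, and the two return codes are hypothetical names, and locking, memory freeing, and the <code class="language-plaintext highlighter-rouge">killed</code> check are left out.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for xv6's process table entries. */
enum procstate { UNUSED, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };

struct mproc {
  enum procstate state;
  int pid;
  struct mproc *parent;
};

#define WAIT_NO_CHILDREN (-1)  /* wait() would return -1 */
#define WAIT_MUST_SLEEP  (-2)  /* wait() would call sleep() and rescan later */

/* One pass of the scan inside wait(): reap a zombie child if there is one. */
int wait_scan_once(struct mproc *table, size_t n, struct mproc *self)
{
  int havekids = 0;

  assert(table != NULL && self != NULL);
  for (size_t i = 0; i < n; i++) {
    struct mproc *p = &table[i];
    if (p->parent != self)
      continue;
    havekids = 1;
    if (p->state == ZOMBIE) {
      int pid = p->pid;
      p->pid = 0;
      p->parent = NULL;
      p->state = UNUSED;  /* the slot can be reused by a later fork */
      return pid;
    }
  }
  return havekids ? WAIT_MUST_SLEEP : WAIT_NO_CHILDREN;
}
```

<p>In the real kernel the whole pass runs with the <code class="language-plaintext highlighter-rouge">ptable</code> lock held, which is what makes the "scan, then sleep" sequence safe.</p>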

<h3 id="exit函数">The <code class="language-plaintext highlighter-rouge">exit()</code> Function</h3>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Exit the current process.  Does not return.</span>
<span class="c1">// An exited process remains in the zombie state</span>
<span class="c1">// until its parent calls wait() to find out it exited.</span>
<span class="kt">void</span>
<span class="nf">exit</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>
<span class="p">{</span>
  <span class="k">struct</span> <span class="n">proc</span> <span class="o">*</span><span class="n">curproc</span> <span class="o">=</span> <span class="n">myproc</span><span class="p">();</span>
  <span class="k">struct</span> <span class="n">proc</span> <span class="o">*</span><span class="n">p</span><span class="p">;</span>
  <span class="kt">int</span> <span class="n">fd</span><span class="p">;</span>

  <span class="k">if</span><span class="p">(</span><span class="n">curproc</span> <span class="o">==</span> <span class="n">initproc</span><span class="p">)</span>
    <span class="n">panic</span><span class="p">(</span><span class="s">"init exiting"</span><span class="p">);</span>

  <span class="c1">// Close all open files.</span>
  <span class="k">for</span><span class="p">(</span><span class="n">fd</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">fd</span> <span class="o">&lt;</span> <span class="n">NOFILE</span><span class="p">;</span> <span class="n">fd</span><span class="o">++</span><span class="p">){</span>
    <span class="k">if</span><span class="p">(</span><span class="n">curproc</span><span class="o">-&gt;</span><span class="n">ofile</span><span class="p">[</span><span class="n">fd</span><span class="p">]){</span>
      <span class="n">fileclose</span><span class="p">(</span><span class="n">curproc</span><span class="o">-&gt;</span><span class="n">ofile</span><span class="p">[</span><span class="n">fd</span><span class="p">]);</span>
      <span class="n">curproc</span><span class="o">-&gt;</span><span class="n">ofile</span><span class="p">[</span><span class="n">fd</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="p">}</span>
  <span class="p">}</span>

  <span class="n">begin_op</span><span class="p">();</span>
  <span class="n">iput</span><span class="p">(</span><span class="n">curproc</span><span class="o">-&gt;</span><span class="n">cwd</span><span class="p">);</span>
  <span class="n">end_op</span><span class="p">();</span>
  <span class="n">curproc</span><span class="o">-&gt;</span><span class="n">cwd</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>

  <span class="n">acquire</span><span class="p">(</span><span class="o">&amp;</span><span class="n">ptable</span><span class="p">.</span><span class="n">lock</span><span class="p">);</span>

  <span class="c1">// Parent might be sleeping in wait().</span>
  <span class="n">wakeup1</span><span class="p">(</span><span class="n">curproc</span><span class="o">-&gt;</span><span class="n">parent</span><span class="p">);</span>

  <span class="c1">// Pass abandoned children to init.</span>
  <span class="k">for</span><span class="p">(</span><span class="n">p</span> <span class="o">=</span> <span class="n">ptable</span><span class="p">.</span><span class="n">proc</span><span class="p">;</span> <span class="n">p</span> <span class="o">&lt;</span> <span class="o">&amp;</span><span class="n">ptable</span><span class="p">.</span><span class="n">proc</span><span class="p">[</span><span class="n">NPROC</span><span class="p">];</span> <span class="n">p</span><span class="o">++</span><span class="p">){</span>
    <span class="k">if</span><span class="p">(</span><span class="n">p</span><span class="o">-&gt;</span><span class="n">parent</span> <span class="o">==</span> <span class="n">curproc</span><span class="p">){</span>
      <span class="n">p</span><span class="o">-&gt;</span><span class="n">parent</span> <span class="o">=</span> <span class="n">initproc</span><span class="p">;</span>
      <span class="k">if</span><span class="p">(</span><span class="n">p</span><span class="o">-&gt;</span><span class="n">state</span> <span class="o">==</span> <span class="n">ZOMBIE</span><span class="p">)</span>
        <span class="n">wakeup1</span><span class="p">(</span><span class="n">initproc</span><span class="p">);</span>
    <span class="p">}</span>
  <span class="p">}</span>

  <span class="c1">// Jump into the scheduler, never to return.</span>
  <span class="n">curproc</span><span class="o">-&gt;</span><span class="n">state</span> <span class="o">=</span> <span class="n">ZOMBIE</span><span class="p">;</span>
  <span class="n">sched</span><span class="p">();</span>
  <span class="n">panic</span><span class="p">(</span><span class="s">"zombie exit"</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

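<p><code class="language-plaintext highlighter-rouge">exit()</code> first closes every file descriptor the process has open (lines 15-20) and then drops its reference to the current working directory (lines 22-25). The <code class="language-plaintext highlighter-rouge">iput()</code> call is wrapped in <code class="language-plaintext highlighter-rouge">begin_op()</code> and <code class="language-plaintext highlighter-rouge">end_op()</code>, because changes to file-system data structures must happen inside a file-system transaction. Dropping these references lets the file system manage, and eventually reclaim, the corresponding cached objects. If the parent is waiting for a child to finish, the exiting process must wake it up (line 30); only then can the parent eventually reap its zombie child. If the exiting process has children of its own, they are all handed over to the <code class="language-plaintext highlighter-rouge">init</code> process, which becomes responsible for reaping them (lines 33-39). Finally, the process's state is set to <code class="language-plaintext highlighter-rouge">ZOMBIE</code> and the scheduler switches to another process (lines 42-44).</p>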
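<p>The hand-over of orphaned children can be modelled in a few lines of user-space C. This is a sketch under assumptions: <code class="language-plaintext highlighter-rouge">struct mproc</code> and <code class="language-plaintext highlighter-rouge">exit_model()</code> are hypothetical names, and file closing, locking, and the actual context switch are omitted.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for xv6's process table entries. */
enum procstate { UNUSED, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };

struct mproc {
  enum procstate state;
  int pid;
  struct mproc *parent;
};

/* Reparent self's children to init and mark self as a zombie.
   Returns how many times init would have to be woken up. */
int exit_model(struct mproc *table, size_t n,
               struct mproc *self, struct mproc *init)
{
  int wakeups = 0;

  assert(self != init);  /* xv6 panics with "init exiting" in this case */
  for (size_t i = 0; i < n; i++) {
    struct mproc *p = &table[i];
    if (p->parent == self) {
      p->parent = init;
      if (p->state == ZOMBIE)
        wakeups++;       /* init may be sleeping in wait() */
    }
  }
  self->state = ZOMBIE;  /* stays a zombie until its parent calls wait() */
  return wakeups;
}
```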

<h2 id="参考资料">References</h2>

<ol>
  <li>
    <p>Operating System Concepts, 7th Edition</p>
  </li>
  <li>
    <p>Computer Systems: a Programmer’s Perspective, 3rd Edition</p>
  </li>
  <li>
    <p>Process in Linux, https://www.cs.columbia.edu/~junfeng/10sp-w4118/lectures/l07-proc-linux.pdf</p>
  </li>
  <li>
    <p>10 Things Every Linux Programmer Should Know, http://www.mulix.org/lectures/kernel_workshop_mar_2004/things.pdf</p>
  </li>
  <li>
    <p>Introduction to Linux Kernel, Chapter 3 Process Management, https://notes.shichao.io/lkd/ch3/#chapter-3-process-management</p>
  </li>
  <li>
    <p>A Complete Introduction to Windows Processes, Threads and Related Resources, https://www.tenouk.com/ModuleT.html</p>
  </li>
  <li>
    <p>Windows进程数据结构及创建流程 (Windows process data structures and creation flow), https://blog.csdn.net/cuit/article/details/9200097</p>
  </li>
</ol>]]></content><author><name>Hao He</name><email>haohe@andrew.cmu.edu</email></author><category term="Chinese" /><category term="C" /><category term="Operating System" /><summary type="html"><![CDATA[Link to the previous post]]></summary></entry><entry><title type="html">Notes on Reading the XV6 Operating System Source Code (1): Booting, Interrupts, and System Calls</title><link href="https://hehao98.github.io/posts/2019/03/xv6-1/" rel="alternate" type="text/html" title="Notes on Reading the XV6 Operating System Source Code (1): Booting, Interrupts, and System Calls" /><published>2019-03-15T00:00:00-07:00</published><updated>2019-03-15T00:00:00-07:00</updated><id>https://hehao98.github.io/posts/2019/03/xv6-1</id><content type="html" xml:base="https://hehao98.github.io/posts/2019/03/xv6-1/"><![CDATA[<p>XV6 is the teaching operating system used in MIT's 6.828 course, a reimplementation of Unix V6 for modern hardware. At only a little over ten thousand lines in total, it is well suited for beginners who want to learn and practice operating-system concepts.</p>

<p>The MIT 6.828 course website (https://pdos.csail.mit.edu/6.828/) hosts the official XV6 book in English. A Chinese translation is available at https://th0ar.gitbooks.io/xv6-chinese/content/ as well.</p>

<p><a href="https://hehao98.github.io/files/Xv6中断与系统调用.pdf">Slides covering this material are also available</a></p>

<h2 id="前置知识">Prerequisites</h2>

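<p>Before reading the XV6 source code, you should be fluent in C and have a basic understanding of the x86 architecture, core operating-system concepts, and compiling and linking. For the theory, I recommend references [1] and [2] listed at the end of this post. You will probably run into many unfamiliar concepts along the way, so being comfortable with Google and Stack Overflow is also a must; the OS Dev Wiki and the x86 instruction set manuals are especially useful. Finally, it helps to be proficient with a code editor of your choice so that you can read the code efficiently.</p>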

<h2 id="相关知识总结">Background Summary</h2>

<h3 id="1-内核态与用户态">1. Kernel Mode and User Mode</h3>

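<p>In an operating system, kernel mode is the state the system is in while the kernel is running; in this mode, kernel code may access any hardware and execute any instruction. User mode is the state the system is in while a user process is running; in this mode, the process can execute only a subset of the instructions and must go through the system calls provided by the operating system to access hardware or interact with other processes. Separating kernel mode from user mode improves the security and robustness of the whole system by keeping malicious or buggy processes from damaging it.</p>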

<h3 id="2-中断与系统调用">2. Interrupts and System Calls</h3>

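<p>An interrupt is a mechanism that lets the operating system respond to external hardware. For example, while one user process is running, a disk read requested by another process may complete; an interrupt signal then notifies the operating system, which suspends the current process and handles the event. A system call, in contrast, is a mechanism that lets a user process trap into kernel mode to request a system service: the process executes a trap instruction (for example syscall on some platforms; XV6 uses int), the kernel performs the input or output that requires kernel privilege on the process's behalf, and execution then returns to user mode, where the process continues.</p>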

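<p>At run time, the CPU knows whether it is in kernel mode or user mode from privilege bits in certain registers. On x86, the current privilege level (CPL) is stored in the low two bits of the %cs register, and the CPU checks it to decide what the current instruction is allowed to do. In XV6, CPL 0 is kernel mode and CPL 3 is user mode. If an instruction requires more privilege than the CPL grants, the CPU raises a general protection fault.</p>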
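<p>As a small illustration of where the privilege bits live, the sketch below builds a segment selector and reads the privilege level back out of it. The helper names and <code class="language-plaintext highlighter-rouge">DPL_KERNEL</code> are hypothetical; <code class="language-plaintext highlighter-rouge">DPL_USER</code> and the shift by 3 follow the way XV6 builds its selectors.</p>

```c
#include <assert.h>
#include <stdint.h>

#define DPL_KERNEL 0u  /* hypothetical name for privilege level 0 */
#define DPL_USER   3u

/* Build a segment selector: table index in bits 3..15, privilege level
   in bits 0..1. Bit 2 (table indicator) is left at 0, meaning the GDT. */
static inline uint16_t make_selector(unsigned index, unsigned rpl)
{
  assert(index < 8192 && rpl <= 3);
  return (uint16_t)((index << 3) | rpl);
}

/* Privilege level encoded in the low two bits of a %cs value. */
static inline unsigned privilege_of(uint16_t cs)
{
  return cs & 3u;
}

/* The same test that xv6's trap() writes as (tf->cs & 3) == 0. */
static inline int from_kernel(uint16_t cs)
{
  return privilege_of(cs) == DPL_KERNEL;
}
```

<p>For example, a user code selector built from GDT index 3 with privilege level 3 has the value 0x1B.</p>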

<h3 id="3-elf文件">3. ELF Files</h3>

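<p>ELF is the main executable file format on Unix systems; see https://en.wikipedia.org/wiki/Executable_and_Linkable_Format for details. The bootmain() function relies on two important ELF concepts: the ELF header and the program headers. The ELF header records basic information about the file, including the location and number of the program headers; each program header records the position, size, and other basic information of one segment of code or data in the file. The segments described by the program headers contain sections such as .text and .data. bootmain() first reads the program header information from the ELF header loaded at memory address 0x10000 and then loads these segments from disk into memory one by one. The readelf command shows how the kernel image is laid out. Note that the listing below shows the kernel's section headers, which the program headers group into loadable segments:</p>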

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        80100000 001000 008111 00  AX  0   0  4
  [ 2] .rodata           PROGBITS        80108114 009114 000672 00   A  0   0  4
  [ 3] .stab             PROGBITS        80108786 009786 000001 0c  WA  4   0  1
  [ 4] .stabstr          STRTAB          80108787 009787 000001 00  WA  0   0  1
  [ 5] .data             PROGBITS        80109000 00a000 002596 00  WA  0   0 4096
  [ 6] .bss              NOBITS          8010b5a0 00c596 00715c 00  WA  0   0 32
  [ 7] .debug_line       PROGBITS        00000000 00c596 001f8c 00      0   0  1
  [ 8] .debug_info       PROGBITS        00000000 00e522 00a965 00      0   0  1
  [ 9] .debug_abbrev     PROGBITS        00000000 018e87 0026ed 00      0   0  1
  [10] .debug_aranges    PROGBITS        00000000 01b578 0003a0 00      0   0  8
  [11] .debug_loc        PROGBITS        00000000 01b918 002f30 00      0   0  1
  [12] .debug_str        PROGBITS        00000000 01e848 000cdc 01  MS  0   0  1
  [13] .comment          PROGBITS        00000000 01f524 00001c 01  MS  0   0  1
  [14] .debug_ranges     PROGBITS        00000000 01f540 000018 00      0   0  1
  [15] .shstrtab         STRTAB          00000000 01f558 0000a5 00      0   0  1
  [16] .symtab           SYMTAB          00000000 01f8d0 0023d0 10     17 138  4
  [17] .strtab           STRTAB          00000000 021ca0 0012d0 00      0   0  1
</code></pre></div></div>
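<p>To make the role of the ELF header concrete, here is a minimal sketch that copies a 32-bit ELF header out of a raw buffer and checks its magic number. The struct follows the field layout of XV6's <code class="language-plaintext highlighter-rouge">struct elfhdr</code>, but the names <code class="language-plaintext highlighter-rouge">struct elfhdr32</code> and <code class="language-plaintext highlighter-rouge">read_elf_header()</code> are hypothetical, and the code assumes a little-endian host.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ELF_MAGIC 0x464C457FU  /* "\x7F" "ELF" read as a little-endian word */

/* 32-bit ELF header, 52 bytes, in the field order of xv6's elf.h. */
struct elfhdr32 {
  uint32_t magic;
  uint8_t  elf[12];
  uint16_t type;
  uint16_t machine;
  uint32_t version;
  uint32_t entry;      /* address of the entry point */
  uint32_t phoff;      /* file offset of the program header table */
  uint32_t shoff;
  uint32_t flags;
  uint16_t ehsize;
  uint16_t phentsize;
  uint16_t phnum;      /* number of program headers */
  uint16_t shentsize;
  uint16_t shnum;
  uint16_t shstrndx;
};

/* Copy the header out of buf. Returns 0 on success, -1 if buf is too
   short or does not start with the ELF magic number. */
int read_elf_header(const uint8_t *buf, size_t len, struct elfhdr32 *out)
{
  assert(buf != NULL && out != NULL);
  if (len < sizeof *out)
    return -1;
  memcpy(out, buf, sizeof *out);  /* avoids unaligned access through a cast */
  return out->magic == ELF_MAGIC ? 0 : -1;
}
```

<p>bootmain() performs the same magic check directly on the header it has read to address 0x10000, then uses phoff and phnum to find the program headers.</p>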

<h2 id="xv6系统的启动过程">The XV6 Boot Process</h2>

<p><img src="https://hehao98.github.io/assets/1-boot.png" alt="a" /></p>

<p>The figure shows the path that booting takes through the XV6 source code. The boot process consists of the following steps:</p>

<ol>
  <li>
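    <p>First, in <code class="language-plaintext highlighter-rouge">bootasm.S</code>, the system must initialize the CPU state. Specifically, it loads an initial GDT (see https://wiki.osdev.org/Global_Descriptor_Table for details) that maps virtual addresses one-to-one onto physical addresses, and switches the x86 CPU from the 16-bit real mode it starts in (inherited from the Intel 8088) to the 32-bit protected mode introduced with the 80386. Finally, it calls the <code class="language-plaintext highlighter-rouge">bootmain()</code> function in <code class="language-plaintext highlighter-rouge">bootmain.c</code>.</p>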
  </li>
  <li>
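    <p>The main job of <code class="language-plaintext highlighter-rouge">bootmain()</code> is to load the kernel's ELF file from disk into memory and hand control over to the kernel. Specifically, it first reads the first 4096 bytes of the ELF file (one memory page) from disk, then uses the program headers recorded in the ELF header to load each segment of the kernel into memory. Finally, it transfers control to XV6 through the entry point recorded in the ELF file.</p>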
  </li>
  <li>
    <p>The main job of <code class="language-plaintext highlighter-rouge">entry.S</code> is to set up a page table so that the paging hardware works, and then to jump to <code class="language-plaintext highlighter-rouge">main()</code> in <code class="language-plaintext highlighter-rouge">main.c</code>, where the operating system proper starts running.</p>
  </li>
  <li>
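    <p><code class="language-plaintext highlighter-rouge">main()</code> first initializes the modules for memory management, process management, interrupt control, and file management, and then starts the first user process, called <code class="language-plaintext highlighter-rouge">initcode</code>. At this point XV6 has finished booting.</p>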
  </li>
</ol>

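<p>Booting XV6 differs from booting a real system in a few ways. First, as a teaching operating system, XV6 has a comparatively simple boot process: it does not perform a comprehensive check of the hardware at startup, whereas a real boot chain checks the state of the hardware attached to the machine. In addition, the XV6 boot loader is small enough to fit in a single 512-byte boot sector, so it can be loaded directly at memory address 0x7c00. A real operating system usually uses a two-stage process: a small first-stage program is loaded at 0x7c00, it loads the complete, full-featured boot loader, and that boot loader in turn loads the kernel.</p>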
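<p>The 512-byte limit comes with one more constraint that the text above does not mention: a legacy BIOS only boots a sector whose last two bytes are 0x55 and 0xAA, which leaves 510 bytes for code. The check can be sketched as follows; <code class="language-plaintext highlighter-rouge">is_boot_sector()</code> is a hypothetical helper, not part of XV6.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512

/* Returns 1 if the buffer is exactly one sector long and ends with the
   boot signature 0x55, 0xAA; returns 0 otherwise. */
int is_boot_sector(const uint8_t *sector, size_t len)
{
  assert(sector != NULL);
  if (len != SECTOR_SIZE)
    return 0;
  return sector[510] == 0x55 && sector[511] == 0xAA;
}
```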

<h2 id="bootmain函数详解"><code class="language-plaintext highlighter-rouge">bootmain()</code> in Detail</h2>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">bootmain</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>
<span class="p">{</span>
  <span class="k">struct</span> <span class="n">elfhdr</span> <span class="o">*</span><span class="n">elf</span><span class="p">;</span>
  <span class="k">struct</span> <span class="n">proghdr</span> <span class="o">*</span><span class="n">ph</span><span class="p">,</span> <span class="o">*</span><span class="n">eph</span><span class="p">;</span>
  <span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">entry</span><span class="p">)(</span><span class="kt">void</span><span class="p">);</span>
  <span class="n">uchar</span><span class="o">*</span> <span class="n">pa</span><span class="p">;</span>

  <span class="n">elf</span> <span class="o">=</span> <span class="p">(</span><span class="k">struct</span> <span class="n">elfhdr</span><span class="o">*</span><span class="p">)</span><span class="mh">0x10000</span><span class="p">;</span>  <span class="c1">// scratch space</span>

  <span class="c1">// Read 1st page off disk</span>
  <span class="n">readseg</span><span class="p">((</span><span class="n">uchar</span><span class="o">*</span><span class="p">)</span><span class="n">elf</span><span class="p">,</span> <span class="mi">4096</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>

  <span class="c1">// Is this an ELF executable?</span>
  <span class="k">if</span><span class="p">(</span><span class="n">elf</span><span class="o">-&gt;</span><span class="n">magic</span> <span class="o">!=</span> <span class="n">ELF_MAGIC</span><span class="p">)</span>
    <span class="k">return</span><span class="p">;</span>  <span class="c1">// let bootasm.S handle error</span>

  <span class="c1">// Load each program segment (ignores ph flags).</span>
  <span class="n">ph</span> <span class="o">=</span> <span class="p">(</span><span class="k">struct</span> <span class="n">proghdr</span><span class="o">*</span><span class="p">)((</span><span class="n">uchar</span><span class="o">*</span><span class="p">)</span><span class="n">elf</span> <span class="o">+</span> <span class="n">elf</span><span class="o">-&gt;</span><span class="n">phoff</span><span class="p">);</span>
  <span class="n">eph</span> <span class="o">=</span> <span class="n">ph</span> <span class="o">+</span> <span class="n">elf</span><span class="o">-&gt;</span><span class="n">phnum</span><span class="p">;</span>
  <span class="k">for</span><span class="p">(;</span> <span class="n">ph</span> <span class="o">&lt;</span> <span class="n">eph</span><span class="p">;</span> <span class="n">ph</span><span class="o">++</span><span class="p">){</span>
    <span class="n">pa</span> <span class="o">=</span> <span class="p">(</span><span class="n">uchar</span><span class="o">*</span><span class="p">)</span><span class="n">ph</span><span class="o">-&gt;</span><span class="n">paddr</span><span class="p">;</span>
    <span class="n">readseg</span><span class="p">(</span><span class="n">pa</span><span class="p">,</span> <span class="n">ph</span><span class="o">-&gt;</span><span class="n">filesz</span><span class="p">,</span> <span class="n">ph</span><span class="o">-&gt;</span><span class="n">off</span><span class="p">);</span>
    <span class="k">if</span><span class="p">(</span><span class="n">ph</span><span class="o">-&gt;</span><span class="n">memsz</span> <span class="o">&gt;</span> <span class="n">ph</span><span class="o">-&gt;</span><span class="n">filesz</span><span class="p">)</span>
      <span class="n">stosb</span><span class="p">(</span><span class="n">pa</span> <span class="o">+</span> <span class="n">ph</span><span class="o">-&gt;</span><span class="n">filesz</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">ph</span><span class="o">-&gt;</span><span class="n">memsz</span> <span class="o">-</span> <span class="n">ph</span><span class="o">-&gt;</span><span class="n">filesz</span><span class="p">);</span>
  <span class="p">}</span>

  <span class="c1">// Call the entry point from the ELF header.</span>
  <span class="c1">// Does not return!</span>
  <span class="n">entry</span> <span class="o">=</span> <span class="p">(</span><span class="kt">void</span><span class="p">(</span><span class="o">*</span><span class="p">)(</span><span class="kt">void</span><span class="p">))(</span><span class="n">elf</span><span class="o">-&gt;</span><span class="n">entry</span><span class="p">);</span>
  <span class="n">entry</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

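<p>The <code class="language-plaintext highlighter-rouge">bootmain()</code> function in <code class="language-plaintext highlighter-rouge">bootmain.c</code> is the core of the XV6 boot code. It first reads the first page from disk (line 11) and checks whether that page starts with an ELF header (lines 14-15). If it does, the function walks the program headers recorded in the ELF header and reads each segment into memory according to its recorded size (lines 18-25). Finally, it takes the entry point from the ELF header and jumps there (lines 29-30). The <code class="language-plaintext highlighter-rouge">readelf</code> command can show the program headers of the ELF file in detail. In short, the boot loader's role in booting XV6 is to load the kernel's ELF file from disk into memory and hand control over to the kernel.</p>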

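<p>From the position and count of the <code class="language-plaintext highlighter-rouge">struct proghdr</code> entries recorded in <code class="language-plaintext highlighter-rouge">struct elfhdr</code> (lines 18-19, <code class="language-plaintext highlighter-rouge">elf-&gt;phoff</code> and <code class="language-plaintext highlighter-rouge">elf-&gt;phnum</code>), the code learns where the kernel's program headers are and how many there are. It then advances the <code class="language-plaintext highlighter-rouge">ph</code> pointer one entry at a time and loads the corresponding segment. For each segment, <code class="language-plaintext highlighter-rouge">ph-&gt;filesz</code> and <code class="language-plaintext highlighter-rouge">ph-&gt;off</code> give its size and its offset in the file, and <code class="language-plaintext highlighter-rouge">readseg()</code> loads it: it reads whole disk sectors, advancing the <code class="language-plaintext highlighter-rouge">pa</code> pointer until it reaches the end address <code class="language-plaintext highlighter-rouge">epa</code>, so slightly more than the requested bytes may be read. If a segment occupies more memory than it does in the file (<code class="language-plaintext highlighter-rouge">memsz</code> &gt; <code class="language-plaintext highlighter-rouge">filesz</code>), the remaining bytes are filled with zeros using <code class="language-plaintext highlighter-rouge">stosb()</code>.</p>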
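<p>The "copy, then zero the tail" step can be tried out in user space. The sketch below is an assumption-laden model: the file image is an in-memory array rather than a disk, <code class="language-plaintext highlighter-rouge">memcpy()</code> and <code class="language-plaintext highlighter-rouge">memset()</code> stand in for <code class="language-plaintext highlighter-rouge">readseg()</code> and <code class="language-plaintext highlighter-rouge">stosb()</code>, and <code class="language-plaintext highlighter-rouge">load_segment()</code> is a hypothetical name.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy filesz bytes of a segment from the file image to dst, then zero
   the rest of the segment up to memsz (this is how .bss gets cleared). */
void load_segment(uint8_t *dst, const uint8_t *file, uint32_t off,
                  uint32_t filesz, uint32_t memsz)
{
  assert(dst != NULL && file != NULL);
  memcpy(dst, file + off, filesz);
  if (memsz > filesz)
    memset(dst + filesz, 0, memsz - filesz);
}
```

<p>The caller must make sure that <code class="language-plaintext highlighter-rouge">dst</code> has room for <code class="language-plaintext highlighter-rouge">memsz</code> bytes; the boot loader simply trusts the addresses in the kernel's program headers.</p>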

<h2 id="xv6系统的中断管理">Interrupt Management in XV6</h2>

<h3 id="1-中断描述符与中断描述符表">1. Interrupt Descriptors and the Interrupt Descriptor Table</h3>

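<p>The interrupt descriptor table (IDT) is the data structure that the x86 architecture uses in protected mode to store information about interrupt service routines; its entries are called interrupt descriptors. XV6 uses the following data structures for it:</p>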

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Gate descriptors for interrupts and traps</span>
<span class="k">struct</span> <span class="n">gatedesc</span> <span class="p">{</span>
  <span class="n">uint</span> <span class="n">off_15_0</span> <span class="o">:</span> <span class="mi">16</span><span class="p">;</span>   <span class="c1">// low 16 bits of offset in segment</span>
  <span class="n">uint</span> <span class="n">cs</span> <span class="o">:</span> <span class="mi">16</span><span class="p">;</span>         <span class="c1">// code segment selector</span>
  <span class="n">uint</span> <span class="n">args</span> <span class="o">:</span> <span class="mi">5</span><span class="p">;</span>        <span class="c1">// # args, 0 for interrupt/trap gates</span>
  <span class="n">uint</span> <span class="n">rsv1</span> <span class="o">:</span> <span class="mi">3</span><span class="p">;</span>        <span class="c1">// reserved(should be zero I guess)</span>
  <span class="n">uint</span> <span class="n">type</span> <span class="o">:</span> <span class="mi">4</span><span class="p">;</span>        <span class="c1">// type(STS_{IG32,TG32})</span>
  <span class="n">uint</span> <span class="n">s</span> <span class="o">:</span> <span class="mi">1</span><span class="p">;</span>           <span class="c1">// must be 0 (system)</span>
  <span class="n">uint</span> <span class="n">dpl</span> <span class="o">:</span> <span class="mi">2</span><span class="p">;</span>         <span class="c1">// descriptor(meaning new) privilege level</span>
  <span class="n">uint</span> <span class="n">p</span> <span class="o">:</span> <span class="mi">1</span><span class="p">;</span>           <span class="c1">// Present</span>
  <span class="n">uint</span> <span class="n">off_31_16</span> <span class="o">:</span> <span class="mi">16</span><span class="p">;</span>  <span class="c1">// high bits of offset in segment</span>
<span class="p">};</span>
<span class="k">struct</span> <span class="n">gatedesc</span> <span class="n">idt</span><span class="p">[</span><span class="mi">256</span><span class="p">];</span>
<span class="k">extern</span> <span class="n">uint</span> <span class="n">vectors</span><span class="p">[];</span> 
</code></pre></div></div>

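<p>The layout of struct gatedesc is exactly what the x86 architecture requires (see https://wiki.osdev.org/Interrupt_Descriptor_Table). For the $i$-th interrupt descriptor, the cs field holds the selector of the kernel code segment (SEG_KCODE&lt;&lt;3), and the offset fields hold the address stored in vectors[i]. In XV6, every vectors[i] entry is a short stub in vectors.S that pushes the trap number and then jumps to alltraps in trapasm.S.</p>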
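<p>How a 32-bit handler address ends up in the two offset fields can be shown with a flattened version of the descriptor. This is only a sketch: <code class="language-plaintext highlighter-rouge">struct gate_fields</code>, <code class="language-plaintext highlighter-rouge">make_gate()</code>, and <code class="language-plaintext highlighter-rouge">gate_handler()</code> are hypothetical and use ordinary integer fields instead of the bit-fields of the real struct gatedesc, so the in-memory layout is not the hardware layout.</p>

```c
#include <assert.h>
#include <stdint.h>

#define SEG_KCODE 1
#define STS_IG32  0xE  /* 32-bit interrupt gate */
#define STS_TG32  0xF  /* 32-bit trap gate */

/* Hypothetical flat view of the fields that SETGATE fills in. */
struct gate_fields {
  uint16_t off_15_0;   /* low 16 bits of the handler address */
  uint16_t cs;         /* code segment selector */
  uint8_t  type;       /* STS_IG32 or STS_TG32 */
  uint8_t  dpl;        /* privilege level needed to invoke via int */
  uint16_t off_31_16;  /* high 16 bits of the handler address */
};

struct gate_fields make_gate(int istrap, uint16_t sel,
                             uint32_t handler, uint8_t dpl)
{
  struct gate_fields g;

  assert(dpl <= 3);
  g.off_15_0  = (uint16_t)(handler & 0xFFFF);
  g.cs        = sel;
  g.type      = istrap ? STS_TG32 : STS_IG32;
  g.dpl       = dpl;
  g.off_31_16 = (uint16_t)(handler >> 16);
  return g;
}

/* Reassemble the handler address the way the CPU does on a trap. */
uint32_t gate_handler(const struct gate_fields *g)
{
  return ((uint32_t)g->off_31_16 << 16) | g->off_15_0;
}
```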

<h3 id="2-xv6中断管理的初始化">2. Initializing Interrupt Management in XV6</h3>

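<p>Because interrupts are supported by the CPU hardware, they are already enabled by the BIOS when the machine starts running. However, the very first instruction of the XV6 boot code is cli, which disables interrupts, and they stay disabled until the first process is scheduled, when scheduler() executes sti to enable hardware interrupts. The interrupt descriptor table must be set up before hardware interrupts are enabled; this is done in tvinit() and idtinit():</p>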

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">tvinit</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span>
  <span class="kt">int</span> <span class="n">i</span><span class="p">;</span>

  <span class="k">for</span><span class="p">(</span><span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="mi">256</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
    <span class="n">SETGATE</span><span class="p">(</span><span class="n">idt</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="mi">0</span><span class="p">,</span> <span class="n">SEG_KCODE</span><span class="o">&lt;&lt;</span><span class="mi">3</span><span class="p">,</span> <span class="n">vectors</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="mi">0</span><span class="p">);</span>
  <span class="n">SETGATE</span><span class="p">(</span><span class="n">idt</span><span class="p">[</span><span class="n">T_SYSCALL</span><span class="p">],</span> <span class="mi">1</span><span class="p">,</span> <span class="n">SEG_KCODE</span><span class="o">&lt;&lt;</span><span class="mi">3</span><span class="p">,</span> <span class="n">vectors</span><span class="p">[</span><span class="n">T_SYSCALL</span><span class="p">],</span> <span class="n">DPL_USER</span><span class="p">);</span>

  <span class="n">initlock</span><span class="p">(</span><span class="o">&amp;</span><span class="n">tickslock</span><span class="p">,</span> <span class="s">"time"</span><span class="p">);</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">idtinit</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">lidt</span><span class="p">(</span><span class="n">idt</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">idt</span><span class="p">));</span>
<span class="p">}</span>
</code></pre></div></div>

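<p>In XV6, interrupts and system calls are the only mechanism for crossing between user mode and kernel mode. So even to start the first user process, XV6 builds a trap frame by hand in kernel mode, sets the privilege bits of the saved CS in that trap frame, and then runs the return-from-trap path to enter user mode. Hardware interrupts in XV6 are switched off and on with the cli and sti instructions. On a real machine, interrupts are divided into external and internal interrupts. External interrupts include those from I/O devices, the timer, power-failure signals, and so on, and are further divided into maskable and non-maskable interrupts. Internal interrupts include those triggered by software executing the INT instruction and those triggered by errors inside the CPU (such as division by zero).</p>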

<h3 id="3-xv6中断处理过程举例">3. An Example of Interrupt Handling in XV6</h3>

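<p>Take a divide-by-zero error as an example. When an instruction running on XV6 divides by zero, the CPU hardware detects the error and triggers the interrupt handling mechanism, in which the hardware performs the following steps:</p>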

<ol>
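  <li>Fetch the n-th descriptor from the IDT, where n is the vector number (the operand of int for a software interrupt; 0 for a divide error).</li>
  <li>Check that the CPL in CS is &lt;= DPL, where DPL is the privilege level recorded in the descriptor (this check applies to interrupts raised by the int instruction).</li>
  <li>If the target segment selector's PL &lt; CPL, save ESP and SS in CPU-internal registers.</li>
  <li>Load SS and ESP from a task segment descriptor.</li>
  <li>Push SS.</li>
  <li>Push ESP.</li>
  <li>Push EFLAGS.</li>
  <li>Push CS.</li>
  <li>Push EIP.</li>
  <li>Clear some bits of EFLAGS.</li>
  <li>Set CS and EIP to the values in the descriptor.</li>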
</ol>
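<p>The order of the pushes in the steps above determines what the kernel stack looks like when the handler starts. The following sketch models it with an array; <code class="language-plaintext highlighter-rouge">struct kstack</code>, <code class="language-plaintext highlighter-rouge">push()</code>, and <code class="language-plaintext highlighter-rouge">hw_trap_entry()</code> are hypothetical, and the model covers only a trap arriving from user mode, where SS and ESP are saved as well.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a kernel stack that grows toward lower indices. */
struct kstack {
  uint32_t slot[8];
  int top;  /* index of the most recently pushed word */
};

static void push(struct kstack *s, uint32_t v)
{
  assert(s->top > 0);
  s->slot[--s->top] = v;
}

/* Push the words the CPU saves when a trap arrives from user mode. */
void hw_trap_entry(struct kstack *s, uint32_t ss, uint32_t esp,
                   uint32_t eflags, uint32_t cs, uint32_t eip)
{
  s->top = 8;       /* start with an empty stack */
  push(s, ss);
  push(s, esp);
  push(s, eflags);
  push(s, cs);
  push(s, eip);     /* ends up on top, so iret pops it first */
}
```

<p>The trap frame that alltraps builds sits directly below these words, which is why the end of struct trapframe mirrors this order.</p>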

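<p>Because CS now holds the value from the descriptor (the SEG_KCODE selector), the CPU is running in kernel mode, and EIP points to the vector stub that jumps to alltraps in trapasm.S. alltraps pushes the user registers to build the trap frame, sets the data segment registers to the kernel data segment, and then calls the trap function in trap.c. trap first examines the trap number, finds that it is neither a system call nor an external hardware interrupt, and therefore reaches the following code:</p>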

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">if</span><span class="p">(</span><span class="n">myproc</span><span class="p">()</span> <span class="o">==</span> <span class="mi">0</span> <span class="o">||</span> <span class="p">(</span><span class="n">tf</span><span class="o">-&gt;</span><span class="n">cs</span><span class="o">&amp;</span><span class="mi">3</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">){</span>
      <span class="c1">// In kernel, it must be our mistake.</span>
      <span class="n">cprintf</span><span class="p">(</span><span class="s">"unexpected trap %d from cpu %d eip %x (cr2=0x%x)</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span>
              <span class="n">tf</span><span class="o">-&gt;</span><span class="n">trapno</span><span class="p">,</span> <span class="n">cpuid</span><span class="p">(),</span> <span class="n">tf</span><span class="o">-&gt;</span><span class="n">eip</span><span class="p">,</span> <span class="n">rcr2</span><span class="p">());</span>
      <span class="n">panic</span><span class="p">(</span><span class="s">"trap"</span><span class="p">);</span>
    <span class="p">}</span>
    <span class="c1">// In user space, assume process misbehaved.</span>
    <span class="n">cprintf</span><span class="p">(</span><span class="s">"pid %d %s: trap %d err %d on cpu %d "</span>
            <span class="s">"eip 0x%x addr 0x%x--kill proc</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span>
            <span class="n">myproc</span><span class="p">()</span><span class="o">-&gt;</span><span class="n">pid</span><span class="p">,</span> <span class="n">myproc</span><span class="p">()</span><span class="o">-&gt;</span><span class="n">name</span><span class="p">,</span> <span class="n">tf</span><span class="o">-&gt;</span><span class="n">trapno</span><span class="p">,</span>
            <span class="n">tf</span><span class="o">-&gt;</span><span class="n">err</span><span class="p">,</span> <span class="n">cpuid</span><span class="p">(),</span> <span class="n">tf</span><span class="o">-&gt;</span><span class="n">eip</span><span class="p">,</span> <span class="n">rcr2</span><span class="p">());</span>
    <span class="n">myproc</span><span class="p">()</span><span class="o">-&gt;</span><span class="n">killed</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
</code></pre></div></div>

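<p>The handling depends on whether the trap came from kernel mode or from a user process. If a user process is at fault, the system marks it as killed; if the kernel itself is at fault, the system prints an error message and then panics, which leaves the whole system in an infinite loop.</p>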

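<p>If the error is recoverable, such as a page fault, the system handles it, returns from trap(), and continues into trapret, which restores the process's execution context so that the system returns to the place and state where the trap occurred.</p>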

<h3 id="4-如何在xv6中添加新的系统调用以setrlimit为例">4. Adding a New System Call to XV6 (Using setrlimit as an Example)</h3>

<p>In Linux, the setrlimit system call sets resource usage limits. Using setrlimit as an example, adding a new system call to XV6 starts with defining a new system call number in syscall.h:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define SYS_setrlimit  22
</span></code></pre></div></div>

<p>Next, add the function pointer for the new system call to the table in syscall.c:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">int</span> <span class="p">(</span><span class="o">*</span><span class="n">syscalls</span><span class="p">[])(</span><span class="kt">void</span><span class="p">)</span> <span class="o">=</span> <span class="p">{</span>
        <span class="p">...</span>
    <span class="p">[</span><span class="n">SYS_setrlimit</span><span class="p">]</span> <span class="n">sys_setrlimit</span><span class="p">,</span>
<span class="p">};</span>
</code></pre></div></div>

<p>Of course, the symbol sys_setrlimit does not exist yet, so declare and implement this function in sysproc.c:</p>

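<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">sys_setrlimit</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span>  <span class="c1">// handlers take no C arguments</span>
    <span class="kt">int</span> <span class="n">resource</span><span class="p">;</span> <span class="k">struct</span> <span class="n">rlimit</span> <span class="o">*</span><span class="n">rlim</span><span class="p">;</span>
    <span class="k">if</span><span class="p">(</span><span class="n">argint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">resource</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">0</span> <span class="o">||</span> <span class="n">argptr</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="p">(</span><span class="kt">void</span><span class="o">*</span><span class="p">)</span><span class="o">&amp;</span><span class="n">rlim</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="o">*</span><span class="n">rlim</span><span class="p">))</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span>
        <span class="k">return</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>  <span class="c1">// arguments are fetched from the user stack</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>  <span class="c1">// set max memory for this process, etc</span>
<span class="p">}</span>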
</code></pre></div></div>

<p>Finally, declare the user-facing setrlimit() function in user.h, and add the matching system call stub in usys.S.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// usys.S: generates the user-space stub that traps into the kernel</span>
<span class="n">SYSCALL</span><span class="p">(</span><span class="n">setrlimit</span><span class="p">)</span>

<span class="c1">// user.h: prototype seen by user programs</span>
<span class="kt">int</span> <span class="nf">setrlimit</span><span class="p">(</span><span class="kt">int</span> <span class="n">resource</span><span class="p">,</span> <span class="k">const</span> <span class="k">struct</span> <span class="n">rlimit</span> <span class="o">*</span><span class="n">rlim</span><span class="p">);</span>
</code></pre></div></div>
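<p>To see how these pieces fit together, here is a user-space sketch of the dispatch path: a stub places the arguments, a table indexed by system call number selects the handler, and the handler fetches its arguments itself. Everything with a <code>_model</code> suffix, the <code>fake_user_args</code> array, and the rule that only resource id 0 is accepted are assumptions made for illustration. The second argument is also simplified to a plain int instead of a pointer to struct rlimit.</p>

```c
#include <stdio.h>

#define SYS_setrlimit 22
#define NELEM(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical stand-in for the user stack slots the kernel reads from. */
static int fake_user_args[2];

/* Simplified argument fetch: copy the n-th argument, or fail on a bad index. */
static int argint_model(int n, int *ip)
{
    if (n < 0 || n >= (int)NELEM(fake_user_args))
        return -1;
    *ip = fake_user_args[n];
    return 0;
}

static int last_limit = -1;   /* illustrative state only */

/* Handler: takes no parameters, so it matches the table's element type. */
static int sys_setrlimit_model(void)
{
    int resource, limit;
    if (argint_model(0, &resource) < 0 || argint_model(1, &limit) < 0)
        return -1;
    if (resource != 0 || limit < 0)   /* only one made-up resource id is accepted */
        return -1;
    last_limit = limit;
    return 0;
}

/* Dispatch table indexed by system call number; unlisted slots are null. */
static int (*syscalls_model[])(void) = {
    [SYS_setrlimit] = sys_setrlimit_model,
};

/* Look up the number and call the handler, rejecting unknown numbers. */
int dispatch(int num)
{
    if (num > 0 && num < (int)NELEM(syscalls_model) && syscalls_model[num])
        return syscalls_model[num]();
    printf("unknown sys call %d\n", num);
    return -1;
}

/* Plays the role of the usys.S stub: place the arguments, then "trap". */
int setrlimit_model(int resource, int limit)
{
    fake_user_args[0] = resource;
    fake_user_args[1] = limit;
    return dispatch(SYS_setrlimit);
}
```

<p>With this model, <code>setrlimit_model(0, 4096)</code> returns 0 and records the limit, while <code>dispatch(7)</code> returns -1 because slot 7 of the table was never filled in.</p>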

<h2 id="一些问题">Some Questions</h2>

<h3 id="1-在中断描述符表里存放了一个cs寄存器的值为什么要有这个cs寄存器">1. The interrupt descriptor table stores a CS register value. Why is this CS value needed?</h3>

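<p>This question touches on many low-level details of the x86 implementation. Starting with the 80386, the hardware supports a protected mode for memory access. In 32-bit protected mode, the CPU uses the Global Descriptor Table (GDT) to store information about memory segments, and the CS register holds a selector that indexes into the GDT. Because every access goes through a descriptor, the permission bits in the GDT entry control what may be done with that segment. Note that this mechanism is independent of the operating system's paging-based virtual memory. In XV6 the GDT has only five descriptors in use (index 0 is the null descriptor that x86 requires): kernel code, kernel data, user code, user data, and the TSS. They are defined as follows:</p>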

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>   <span class="c1">// various segment selectors.</span>
   <span class="cp">#define SEG_KCODE 1  // kernel code
</span>   <span class="cp">#define SEG_KDATA 2  // kernel data+stack
</span>   <span class="cp">#define SEG_UCODE 3  // user code
</span>   <span class="cp">#define SEG_UDATA 4  // user data+stack
</span>   <span class="cp">#define SEG_TSS   5  // this process's task state
</span></code></pre></div></div>

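<p>When an interrupt switches execution from the user code segment to the kernel code segment, the old CS value has to be saved so that it can be popped again on return from the interrupt. In addition, the CS value stored in an interrupt descriptor table entry tells the CPU which CS value, and therefore which memory segment, the interrupt handler should run with.</p>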
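<p>The save-and-restore of CS can be modeled in a few lines. This is a deliberately reduced sketch: the struct and function names are hypothetical, a real gate descriptor packs its fields into 8 bytes, and the hardware also saves EFLAGS (plus SS and ESP when the privilege level changes), which is left out here.</p>

```c
#include <stdint.h>

#define SEG_KCODE 1
#define SEG_UCODE 3
#define DPL_USER  3

/* Hypothetical, simplified gate: the selector and entry point of a handler. */
struct gate_model { uint16_t cs; uint32_t offset; };

/* Only the two CPU registers this sketch cares about. */
struct cpu_model { uint16_t cs; uint32_t eip; };

/* What gets saved so that the switch can be undone on return. */
struct saved_model { uint16_t cs; uint32_t eip; };

/* Interrupt entry: save the old CS:EIP, then load CS:EIP from the gate. */
struct saved_model interrupt_enter(struct cpu_model *cpu,
                                   const struct gate_model *g)
{
    struct saved_model saved = { cpu->cs, cpu->eip };
    cpu->cs = g->cs;
    cpu->eip = g->offset;
    return saved;
}

/* Interrupt return: restore the saved CS:EIP. */
void interrupt_return(struct cpu_model *cpu, struct saved_model saved)
{
    cpu->cs = saved.cs;
    cpu->eip = saved.eip;
}
```

<p>Starting from a user-mode CS of <code>(SEG_UCODE &lt;&lt; 3) | DPL_USER</code> and a gate whose selector is <code>SEG_KCODE &lt;&lt; 3</code>, entering the interrupt leaves CS at 0x08 (ring 0), and returning puts the original 0x1B back.</p>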

<h3 id="2-在从用户态和内核态之间切换的时候代码的执行权限是如何被设置的">2. When switching between user mode and kernel mode, how is the code's execution privilege set?</h3>

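<p>The execution privilege of the running code is marked by the privilege bits in the CS register. On an interrupt, the INT instruction saves the old CS and loads a new one; when the handler returns, the saved CS is restored, so the interrupted code resumes with the same privilege it had before the interrupt. The first user process is a special case: the privilege bits of its CS must be set by hand before it starts. The relevant code is:</p>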

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="n">p</span><span class="o">-&gt;</span><span class="n">tf</span><span class="o">-&gt;</span><span class="n">cs</span> <span class="o">=</span> <span class="p">(</span><span class="n">SEG_UCODE</span> <span class="o">&lt;&lt;</span> <span class="mi">3</span><span class="p">)</span> <span class="o">|</span> <span class="n">DPL_USER</span><span class="p">;</span>
    <span class="n">p</span><span class="o">-&gt;</span><span class="n">tf</span><span class="o">-&gt;</span><span class="n">ds</span> <span class="o">=</span> <span class="p">(</span><span class="n">SEG_UDATA</span> <span class="o">&lt;&lt;</span> <span class="mi">3</span><span class="p">)</span> <span class="o">|</span> <span class="n">DPL_USER</span><span class="p">;</span>
</code></pre></div></div>
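<p>The expression <code>(SEG_UCODE &lt;&lt; 3) | DPL_USER</code> follows the x86 segment selector layout: the descriptor index sits in bits 15..3, bit 2 selects the table (0 for the GDT), and bits 1..0 hold the requested privilege level. A short sketch with hypothetical helper names makes the arithmetic explicit:</p>

```c
#include <stdint.h>

#define SEG_KCODE 1
#define SEG_UCODE 3
#define SEG_UDATA 4
#define DPL_USER  3

/* Build a selector: descriptor index in bits 15..3, privilege in bits 1..0. */
static inline uint16_t make_selector(unsigned index, unsigned rpl)
{
    return (uint16_t)((index << 3) | (rpl & 3u));
}

/* Descriptor index encoded in the selector. */
static inline unsigned selector_index(uint16_t sel) { return sel >> 3; }

/* Requested privilege level: 0 is the kernel, 3 is user mode. */
static inline unsigned selector_rpl(uint16_t sel) { return sel & 3u; }

/* Bit 2 picks the table: 0 = GDT, 1 = LDT. */
static inline unsigned selector_table(uint16_t sel) { return (sel >> 2) & 1u; }
```

<p>So the user code selector is 0x1B (index 3, ring 3), the user data selector is 0x23 (index 4, ring 3), and the kernel code selector is 0x08 (index 1, ring 0).</p>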

<h2 id="参考文献">References</h2>

<ol>
  <li>Bryant, Randal E., and David R. O’Hallaron. Computer systems: a programmer’s perspective. Vol. 281. Upper Saddle River: Prentice Hall, 2003.</li>
  <li>Silberschatz, Abraham, Greg Gagne, and Peter B. Galvin. Operating system concepts. Wiley, 2018.</li>
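</ol>]]></content><author><name>Hao He</name><email>haohe@andrew.cmu.edu</email></author><category term="Chinese" /><category term="C" /><category term="Operating System" /><summary type="html"><![CDATA[XV6 is the teaching operating system used in MIT's 6.828 course, a reimplementation of Unix V6 for modern hardware. At only a little over ten thousand lines in total, XV6 is well suited for beginners who want to study and practice operating system concepts.]]></summary></entry></feed>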