Yay!!!<div><br></div><div>Thanks so much for doing this!</div><div><br></div><div>--Sean Silva</div><div><br><div class="gmail_quote">On Tue, Jun 19, 2012 at 7:57 PM, Bill Wendling <span dir="ltr"><<a href="mailto:isanbard@gmail.com" target="_blank">isanbard@gmail.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: void<br>

Date: Tue Jun 19 21:57:56 2012<br>

New Revision: 158786<br>

<br>

URL: <a href="http://llvm.org/viewvc/llvm-project?rev=158786&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=158786&view=rev</a><br>

Log:<br>

Sphinxify the CodingStandard documentation.<br>

<br>

Added:<br>

    llvm/trunk/docs/CodingStandards.rst<br>

Removed:<br>

    llvm/trunk/docs/CodingStandards.html<br>

Modified:<br>

    llvm/trunk/docs/development_process.rst<br>

<br>

Removed: llvm/trunk/docs/CodingStandards.html<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/CodingStandards.html?rev=158785&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/CodingStandards.html?rev=158785&view=auto</a><br>


==============================================================================<br>

--- llvm/trunk/docs/CodingStandards.html (original)<br>

+++ llvm/trunk/docs/CodingStandards.html (removed)<br>

@@ -1,1568 +0,0 @@<br>

-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"<br>

-                      "http://www.w3.org/TR/html4/strict.dtd"><br>

-<html><br>

-<head><br>

-  <meta http-equiv="Content-Type" content="text/html; charset=utf-8"><br>

-  <link rel="stylesheet" href="_static/llvm.css" type="text/css"><br>

-  <title>LLVM Coding Standards</title><br>

-</head><br>

-<body><br>

-<br>

-<h1><br>

-  LLVM Coding Standards<br>

-</h1><br>

-<br>

-<ol><br>

-  <li><a href="#introduction">Introduction</a></li><br>

-  <li><a href="#mechanicalissues">Mechanical Source Issues</a><br>

-    <ol><br>

-      <li><a href="#sourceformating">Source Code Formatting</a><br>

-        <ol><br>

-          <li><a href="#scf_commenting">Commenting</a></li><br>

-          <li><a href="#scf_commentformat">Comment Formatting</a></li><br>

-          <li><a href="#scf_includes"><tt>#include</tt> Style</a></li><br>

-          <li><a href="#scf_codewidth">Source Code Width</a></li><br>

-          <li><a href="#scf_spacestabs">Use Spaces Instead of Tabs</a></li><br>

-          <li><a href="#scf_indentation">Indent Code Consistently</a></li><br>

-        </ol></li><br>

-      <li><a href="#compilerissues">Compiler Issues</a><br>

-        <ol><br>

-          <li><a href="#ci_warningerrors">Treat Compiler Warnings Like<br>

-              Errors</a></li><br>

-          <li><a href="#ci_portable_code">Write Portable Code</a></li><br>

-          <li><a href="#ci_rtti_exceptions">Do not use RTTI or Exceptions</a></li><br>

-          <li><a href="#ci_static_ctors">Do not use Static Constructors</a></li><br>

-          <li><a href="#ci_class_struct">Use of <tt>class</tt>/<tt>struct</tt> Keywords</a></li><br>

-        </ol></li><br>

-    </ol></li><br>

-  <li><a href="#styleissues">Style Issues</a><br>

-    <ol><br>

-      <li><a href="#macro">The High-Level Issues</a><br>

-        <ol><br>

-          <li><a href="#hl_module">A Public Header File <b>is</b> a<br>

-              Module</a></li><br>

-          <li><a href="#hl_dontinclude"><tt>#include</tt> as Little as Possible</a></li><br>

-          <li><a href="#hl_privateheaders">Keep "internal" Headers<br>

-              Private</a></li><br>

-          <li><a href="#hl_earlyexit">Use Early Exits and <tt>continue</tt> to Simplify<br>

-              Code</a></li><br>

-          <li><a href="#hl_else_after_return">Don't use <tt>else</tt> after a<br>

-              <tt>return</tt></a></li><br>

-          <li><a href="#hl_predicateloops">Turn Predicate Loops into Predicate<br>

-              Functions</a></li><br>

-        </ol></li><br>

-      <li><a href="#micro">The Low-Level Issues</a><br>

-        <ol><br>

-          <li><a href="#ll_naming">Name Types, Functions, Variables, and Enumerators Properly</a></li><br>

-          <li><a href="#ll_assert">Assert Liberally</a></li><br>

-          <li><a href="#ll_ns_std">Do not use '<tt>using namespace std</tt>'</a></li><br>

-          <li><a href="#ll_virtual_anch">Provide a virtual method anchor for<br>

-              classes in headers</a></li><br>

-          <li><a href="#ll_end">Don't evaluate <tt>end()</tt> every time through a<br>

-              loop</a></li><br>

-          <li><a href="#ll_iostream"><tt>#include &lt;iostream&gt;</tt> is<br>

-              <em>forbidden</em></a></li><br>

-          <li><a href="#ll_raw_ostream">Use <tt>raw_ostream</tt></a></li><br>

-          <li><a href="#ll_avoidendl">Avoid <tt>std::endl</tt></a></li><br>

-        </ol></li><br>

-<br>

-      <li><a href="#nano">Microscopic Details</a><br>

-        <ol><br>

-          <li><a href="#micro_spaceparen">Spaces Before Parentheses</a></li><br>

-          <li><a href="#micro_preincrement">Prefer Preincrement</a></li><br>

-          <li><a href="#micro_namespaceindent">Namespace Indentation</a></li><br>

-          <li><a href="#micro_anonns">Anonymous Namespaces</a></li><br>

-        </ol></li><br>

-<br>

-<br>

-    </ol></li><br>

-  <li><a href="#seealso">See Also</a></li><br>

-</ol><br>

-<br>

-<div class="doc_author"><br>

-  <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p><br>

-</div><br>

-<br>

-<br>

-<!-- *********************************************************************** --><br>

-<h2><a name="introduction">Introduction</a></h2><br>

-<!-- *********************************************************************** --><br>

-<br>

-<div><br>

-<br>

-<p>This document attempts to describe a few coding standards that are being used<br>

-in the LLVM source tree.  Although no coding standards should be regarded as<br>

-absolute requirements to be followed in all instances, coding standards are<br>

-particularly important for large-scale code bases that follow a library-based<br>

-design (like LLVM).</p><br>

-<br>

-<p>This document intentionally does not prescribe fixed standards for religious<br>

-issues such as brace placement and space usage.  For issues like this, follow<br>

-the golden rule:</p><br>

-<br>

-<blockquote><br>

-<br>

-<p><b><a name="goldenrule">If you are extending, enhancing, or bug fixing<br>

-already implemented code, use the style that is already being used so that the<br>

-source is uniform and easy to follow.</a></b></p><br>

-<br>

-</blockquote><br>

-<br>

-<p>Note that some code bases (e.g. libc++) have really good reasons to deviate<br>

-from the coding standards.  In the case of libc++, this is because the naming<br>

-and other conventions are dictated by the C++ standard.  If you think there is<br>

-a specific good reason to deviate from the standards here, please bring it up<br>

-on the LLVMdev mailing list.</p><br>

-<br>

-<p>There are some conventions that are not uniformly followed in the code base<br>

-(e.g. the naming convention).  This is because they are relatively new, and a<br>

-lot of code was written before they were put in place.  Our long term goal is<br>

-for the entire codebase to follow the convention, but we explicitly <em>do<br>

-not</em> want patches that do large-scale reformatting of existing code.  OTOH,<br>

-it is reasonable to rename the methods of a class if you're about to change it<br>

-in some other way.  Just do the reformatting as a separate commit from the<br>

-functionality change. </p><br>

-<br>

-<p>The ultimate goal of these guidelines is to increase readability and<br>

-maintainability of our common source base. If you have suggestions for topics to<br>

-be included, please mail them to <a<br>

-href="mailto:sabre@nondot.org">Chris</a>.</p><br>

-<br>

-</div><br>

-<br>

-<!-- *********************************************************************** --><br>

-<h2><br>

-  <a name="mechanicalissues">Mechanical Source Issues</a><br>

-</h2><br>

-<!-- *********************************************************************** --><br>

-<br>

-<div><br>

-<br>

-<!-- ======================================================================= --><br>

-<h3><br>

-  <a name="sourceformating">Source Code Formatting</a><br>

-</h3><br>

-<br>

-<div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="scf_commenting">Commenting</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>Comments are one critical part of readability and maintainability.  Everyone<br>

-knows they should comment their code, and so should you.  When writing comments,<br>

-write them as English prose, which means they should use proper capitalization,<br>

-punctuation, etc.  Aim to describe what the code is trying to do and why, not<br>

-"how" it does it at a micro level. Here are a few critical things to<br>

-document:</p><br>

-<br>

-<h5>File Headers</h5><br>

-<br>

-<div><br>

-<br>

-<p>Every source file should have a header on it that describes the basic<br>

-purpose of the file.  If a file does not have a header, it should not be<br>

-checked into the tree.  The standard header looks like this:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-//===-- llvm/Instruction.h - Instruction class definition -------*- C++ -*-===//<br>

-//<br>

-//                     The LLVM Compiler Infrastructure<br>

-//<br>

-// This file is distributed under the University of Illinois Open Source<br>

-// License. See LICENSE.TXT for details.<br>

-//<br>

-//===----------------------------------------------------------------------===//<br>

-//<br>

-// This file contains the declaration of the Instruction class, which is the<br>

-// base class for all of the VM instructions.<br>

-//<br>

-//===----------------------------------------------------------------------===//<br>

-</pre><br>

-</div><br>

-<br>

-<p>A few things to note about this particular format:  The "<tt>-*- C++<br>

--*-</tt>" string on the first line is there to tell Emacs that the source file<br>

-is a C++ file, not a C file (Emacs assumes <tt>.h</tt> files are C files by default).<br>

-Note that this tag is not necessary in <tt>.cpp</tt> files.  The name of the file is also<br>

-on the first line, along with a very short description of the purpose of the<br>

-file.  This is important when printing out code and flipping through lots of<br>

-pages.</p><br>

-<br>

-<p>The next section in the file is a concise note that defines the license<br>

-that the file is released under.  This makes it perfectly clear what terms the<br>

-source code can be distributed under and should not be modified in any way.</p><br>

-<br>

-<p>The main body of the description does not have to be very long in most cases.<br>

-Here it's only two lines.  If an algorithm is being implemented or something<br>

-tricky is going on, a reference to the paper where it is published should be<br>

-included, as well as any notes or "gotchas" in the code to watch out for.</p><br>

-<br>

-</div><br>

-<br>

-<h5>Class overviews</h5><br>

-<br>

-<p>Classes are one fundamental part of a good object oriented design.  As such,<br>

-a class definition should have a comment block that explains what the class is<br>

-used for and how it works.  Every non-trivial class is expected to have a<br>

-doxygen comment block.</p><br>

-<br>

-<br>

-<h5>Method information</h5><br>

-<br>

-<div><br>

-<br>

-<p>Methods defined in a class (as well as any global functions) should also be<br>

-documented properly.  A quick note about what it does and a description of the<br>

-borderline behaviour is all that is necessary here (unless something<br>

-particularly tricky or insidious is going on).  The hope is that people can<br>

-figure out how to use your interfaces without reading the code itself.</p><br>

-<br>

-<p>Good things to talk about here are what happens when something unexpected<br>

-happens: does the method return null?  Abort?  Format your hard disk?</p><br>

-<br>

-</div><br>

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="scf_commentformat">Comment Formatting</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>In general, prefer C++ style (<tt>//</tt>) comments.  They take less space,<br>

-require less typing, don't have nesting problems, etc.  There are a few cases<br>

-when it is useful to use C style (<tt>/* */</tt>) comments however:</p><br>

-<br>

-<ol><br>

-  <li>When writing C code: Obviously if you are writing C code, use C style<br>

-      comments.</li><br>

-  <li>When writing a header file that may be <tt>#include</tt>d by a C source<br>

-      file.</li><br>

-  <li>When writing a source file that is used by a tool that only accepts C<br>

-      style comments.</li><br>

-</ol><br>

-<br>

-<p>To comment out a large block of code, use <tt>#if 0</tt> and <tt>#endif</tt>.<br>

-These nest properly and are better behaved in general than C style comments.</p><br>

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="scf_includes"><tt>#include</tt> Style</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>Immediately after the <a href="#scf_commenting">header file comment</a> (and<br>

-include guards if working on a header file), the <a<br>

-href="#hl_dontinclude">minimal</a> list of <tt>#include</tt>s required by the<br>

-file should be listed.  We prefer these <tt>#include</tt>s to be listed in this<br>

-order:</p><br>

-<br>

-<ol><br>

-  <li><a href="#mmheader">Main Module Header</a></li><br>

-  <li><a href="#hl_privateheaders">Local/Private Headers</a></li><br>

-  <li><tt>llvm/*</tt></li><br>

-  <li><tt>llvm/Analysis/*</tt></li><br>

-  <li><tt>llvm/Assembly/*</tt></li><br>

-  <li><tt>llvm/Bitcode/*</tt></li><br>

-  <li><tt>llvm/CodeGen/*</tt></li><br>

-  <li>...</li><br>

-  <li><tt>Support/*</tt></li><br>

-  <li><tt>Config/*</tt></li><br>

-  <li>System <tt>#includes</tt></li><br>

-</ol><br>

-<br>

-<p>and each category should be sorted by name.</p><br>

-<br>

-<p><a name="mmheader">The "Main Module Header"</a> file applies to <tt>.cpp</tt> files<br>

-which implement an interface defined by a <tt>.h</tt> file.  This <tt>#include</tt><br>

-should always be included <b>first</b> regardless of where it lives on the file<br>

-system.  By including a header file first in the <tt>.cpp</tt> files that implement the<br>

-interfaces, we ensure that the header does not have any hidden dependencies<br>

-which are not explicitly #included in the header, but should be.  It is also a<br>

-form of documentation in the <tt>.cpp</tt> file to indicate where the interfaces it<br>

-implements are defined.</p><br>

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="scf_codewidth">Source Code Width</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>Write your code to fit within 80 columns of text.  This helps those of us who<br>

-like to print out code and look at your code in an xterm without resizing<br>

-it.</p><br>

-<br>

-<p>The longer answer is that there must be some limit to the width of the code<br>

-in order to reasonably allow developers to have multiple files side-by-side in<br>

-windows on a modest display.  If you are going to pick a width limit, it is<br>

-somewhat arbitrary but you might as well pick something standard.  Going with<br>

-90 columns (for example) instead of 80 columns wouldn't add any significant<br>

-value and would be detrimental to printing out code.  Also many other projects<br>

-have standardized on 80 columns, so some people have already configured their<br>

-editors for it (vs something else, like 90 columns).</p><br>

-<br>

-<p>This is one of many contentious issues in coding standards, but it is not up<br>

-for debate.</p><br>

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="scf_spacestabs">Use Spaces Instead of Tabs</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>In all cases, prefer spaces to tabs in source files.  People have different<br>

-preferred indentation levels, and different styles of indentation that they<br>

-like; this is fine.  What isn't fine is that different editors/viewers expand<br>

-tabs out to different tab stops.  This can cause your code to look completely<br>

-unreadable, and it is not worth dealing with.</p><br>

-<br>

-<p>As always, follow the <a href="#goldenrule">Golden Rule</a> above: follow the<br>

-style of existing code if you are modifying and extending it.  If you like four<br>

-spaces of indentation, <b>DO NOT</b> do that in the middle of a chunk of code<br>

-with two spaces of indentation.  Also, do not reindent a whole source file: it<br>

-makes for incredible diffs that are absolutely worthless.</p><br>

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="scf_indentation">Indent Code Consistently</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>Okay, in your first year of programming you were told that indentation is<br>

-important.  If you didn't believe and internalize this then, now is the time.<br>

-Just do it.</p><br>

-<br>

-</div><br>

-<br>

-</div><br>

-<br>

-<!-- ======================================================================= --><br>

-<h3><br>

-  <a name="compilerissues">Compiler Issues</a><br>

-</h3><br>

-<br>

-<div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="ci_warningerrors">Treat Compiler Warnings Like Errors</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>If your code has compiler warnings in it, something is wrong &mdash; you<br>

-aren't casting values correctly, you have "questionable" constructs in your<br>

-code, or you are doing something legitimately wrong.  Compiler warnings can<br>

-cover up legitimate errors in output and make dealing with a translation unit<br>

-difficult.</p><br>

-<br>

-<p>It is not possible to prevent all warnings from all compilers, nor is it<br>

-desirable.  Instead, pick a standard compiler (like <tt>gcc</tt>) that provides<br>

-a good thorough set of warnings, and stick to it.  At least in the case of<br>

-<tt>gcc</tt>, it is possible to work around any spurious errors by changing the<br>

-syntax of the code slightly.  For example, a warning that annoys me occurs when<br>

-I write code like this:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-if (V = getValue()) {<br>

-  ...<br>

-}<br>

-</pre><br>

-</div><br>

-<br>

-<p><tt>gcc</tt> will warn me that I probably want to use the <tt>==</tt><br>

-operator, and that I probably mistyped it.  In most cases, I haven't, and I<br>

-really don't want the spurious errors.  To fix this particular problem, I<br>

-rewrite the code like this:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-if ((V = getValue())) {<br>

-  ...<br>

-}<br>

-</pre><br>

-</div><br>

-<br>

-<p>which shuts <tt>gcc</tt> up.  Any <tt>gcc</tt> warning that annoys you can<br>

-be fixed by massaging the code appropriately.</p><br>

-<br>

-</div><br>
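A minimal, self-contained sketch of the parenthesized-assignment idiom described in the removed text above. `getValue` and `Storage` are hypothetical stand-ins (the original does not define them), and `-Wparentheses` is named only as the gcc option that usually reports this pattern.

```cpp
#include <cassert>

// Hypothetical stand-in for the getValue() call in the text.
static int Storage = 42;
int *getValue() { return &Storage; }

int readThroughAssignment() {
  int *V;
  // The extra parentheses mark the assignment as intentional, which is what
  // keeps gcc's -Wparentheses diagnostic quiet for this condition.
  if ((V = getValue())) {
    assert(V && "the branch is only taken for a non-null pointer");
    return *V;
  }
  return -1;
}
```

Calling `readThroughAssignment()` takes the branch whenever `getValue()` returns a non-null pointer; the single-parenthesis form behaves the same way and differs only in whether the compiler warns.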

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="ci_portable_code">Write Portable Code</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>In almost all cases, it is possible and within reason to write completely<br>

-portable code.  If there are cases where it isn't possible to write portable<br>

-code, isolate it behind a well defined (and well documented) interface.</p><br>

-<br>

-<p>In practice, this means that you shouldn't assume much about the host<br>

-compiler, and Visual Studio tends to be the lowest common denominator.<br>

-If advanced features are used, they should only be an implementation detail of<br>

-a library which has a simple exposed API, and preferably be buried in<br>

-libSystem.</p><br>

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-<a name="ci_rtti_exceptions">Do not use RTTI or Exceptions</a><br>

-</h4><br>

-<div><br>

-<br>

-<p>In an effort to reduce code and executable size, LLVM does not use RTTI<br>

-(e.g. <tt>dynamic_cast&lt;&gt;</tt>) or exceptions.  These two language features<br>

-violate the general C++ principle of <i>"you only pay for what you use"</i>,<br>

-causing executable bloat even if exceptions are never used in the code base, or<br>

-if RTTI is never used for a class.  Because of this, we turn them off globally<br>

-in the code.</p><br>

-<br>

-<p>That said, LLVM does make extensive use of a hand-rolled form of RTTI that<br>

-uses templates like <a href="ProgrammersManual.html#isa"><tt>isa&lt;&gt;</tt>,<br>

-<tt>cast&lt;&gt;</tt>, and <tt>dyn_cast&lt;&gt;</tt></a>.  This form of RTTI is<br>

-opt-in and can be added to any class.  It is also substantially more efficient<br>

-than <tt>dynamic_cast&lt;&gt;</tt>.</p><br>

-<br>

-</div><br>
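To illustrate what an opt-in, hand-rolled form of RTTI can look like, here is a simplified sketch. It is not LLVM's implementation: `Shape`, `Circle`, `Square`, `isaSketch`, and `dynCastSketch` are hypothetical names, and the sketch assumes each class exposes a static `classof` predicate keyed on a kind tag stored in the base class.

```cpp
#include <cassert>

// Base class carries a kind tag instead of relying on compiler RTTI.
class Shape {
public:
  enum ShapeKind { SK_Circle, SK_Square };
  explicit Shape(ShapeKind K) : Kind(K) {}
  ShapeKind getKind() const { return Kind; }

private:
  const ShapeKind Kind;
};

class Circle : public Shape {
public:
  Circle() : Shape(SK_Circle) {}
  // Opt-in hook: answers "is this Shape really a Circle?"
  static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }
};

class Square : public Shape {
public:
  Square() : Shape(SK_Square) {}
  static bool classof(const Shape *S) { return S->getKind() == SK_Square; }
};

// Simplified analogue of isa<>: a type test with no dynamic_cast involved.
template <typename To, typename From> bool isaSketch(const From *V) {
  return To::classof(V);
}

// Simplified analogue of dyn_cast<>: checked downcast, null on mismatch.
template <typename To, typename From> To *dynCastSketch(From *V) {
  return isaSketch<To>(V) ? static_cast<To *>(V) : nullptr;
}
```

Only classes that define `classof` participate, which is the sense in which this style is opt-in; classes that never need a checked downcast pay nothing for it.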

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-<a name="ci_static_ctors">Do not use Static Constructors</a><br>

-</h4><br>

-<div><br>

-<br>

-<p>Static constructors and destructors (e.g. global variables whose types have<br>

-a constructor or destructor) should not be added to the code base, and should be<br>

-removed wherever possible.  Besides <a<br>

-href="http://yosefk.com/c++fqa/ctors.html#fqa-10.12">well known problems</a><br>

-where the order of initialization is undefined between globals in different<br>

-source files, the entire concept of static constructors is at odds with the<br>

-common use case of LLVM as a library linked into a larger application.</p><br>

-<br>

-<p>Consider the use of LLVM as a JIT linked into another application (perhaps<br>

-for <a href="http://llvm.org/Users.html">OpenGL, custom languages</a>,<br>

-<a href="http://llvm.org/devmtg/2010-11/Gritz-OpenShadingLang.pdf">shaders in<br>

-movies</a>, etc).  Due to the design of static constructors, they must be<br>

-executed at startup time of the entire application, regardless of whether or<br>

-how LLVM is used in that larger application.  There are two problems with<br>

-this:</p><br>

-<br>

-<ol><br>

-  <li>The time to run the static constructors impacts startup time of<br>

-    applications &mdash; a critical time for GUI apps, among others.</li><br>

-<br>

-  <li>The static constructors cause the app to pull many extra pages of memory<br>

-    off the disk: both the code for the constructor in each <tt>.o</tt> file and<br>

-    the small amount of data that gets touched. In addition, touched/dirty pages<br>

-    put more pressure on the VM system on low-memory machines.</li><br>

-</ol><br>

-<br>

-<p>We would really like for there to be zero cost for linking in an additional<br>

-LLVM target or other library into an application, but static constructors<br>

-violate this goal.</p><br>

-<br>

-<p>That said, LLVM unfortunately does contain static constructors.  It would be<br>

-a <a href="http://llvm.org/PR11944">great project</a> for someone to purge all<br>

-static constructors from LLVM, and then enable the<br>

-<tt>-Wglobal-constructors</tt> warning flag (when building with Clang) to ensure<br>

-we do not regress in the future.<br>

-</p><br>

-<br>

-</div><br>
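One common way to keep a non-trivial object out of namespace scope is a function-local static. The sketch below is an illustration under that assumption, not a technique the removed text prescribes; `getToolName` and the string value are hypothetical.

```cpp
#include <string>

// Avoided: a namespace-scope object such as
//   static std::string ToolName("sketch");
// has its constructor run at program startup, whether or not it is ever used.

// Sketch: the object is constructed the first time the function is called.
const std::string &getToolName() {
  static const std::string ToolName("sketch");
  return ToolName;
}
```

An application that links this code but never calls `getToolName()` never runs the `std::string` constructor, so the startup cost described above is not paid.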

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-<a name="ci_class_struct">Use of <tt>class</tt> and <tt>struct</tt> Keywords</a><br>

-</h4><br>

-<div><br>

-<br>

-<p>In C++, the <tt>class</tt> and <tt>struct</tt> keywords can be used almost<br>

-interchangeably. The only difference is when they are used to declare a class:<br>

-<tt>class</tt> makes all members private by default while <tt>struct</tt> makes<br>

-all members public by default.</p><br>

-<br>

-<p>Unfortunately, not all compilers follow the rules and some will generate<br>

-different symbols based on whether <tt>class</tt> or <tt>struct</tt> was used to<br>

-declare the symbol.  This can lead to problems at link time.</p><br>

-<br>

-<p>So, the rule for LLVM is to always use the <tt>class</tt> keyword, unless<br>

-<b>all</b> members are public and the type is a C++<br>

-<a href="http://en.wikipedia.org/wiki/Plain_old_data_structure">POD</a> type, in<br>

-which case <tt>struct</tt> is allowed.</p><br>

-<br>

-</div><br>

-<br>

-</div><br>

-<br>

-</div><br>

-<br>

-<!-- *********************************************************************** --><br>

-<h2><br>

-  <a name="styleissues">Style Issues</a><br>

-</h2><br>

-<!-- *********************************************************************** --><br>

-<br>

-<div><br>

-<br>

-<!-- ======================================================================= --><br>

-<h3><br>

-  <a name="macro">The High-Level Issues</a><br>

-</h3><br>

-<!-- ======================================================================= --><br>

-<br>

-<div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="hl_module">A Public Header File <b>is</b> a Module</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>C++ doesn't do too well in the modularity department.  There is no real<br>

-encapsulation or data hiding (unless you use expensive protocol classes), but it<br>

-is what we have to work with.  When you write a public header file (in the LLVM<br>

-source tree, they live in the top level "<tt>include</tt>" directory), you are<br>

-defining a module of functionality.</p><br>

-<br>

-<p>Ideally, modules should be completely independent of each other, and their<br>

-header files should only <tt>#include</tt> the absolute minimum number of<br>

-headers possible. A module is not just a class, a function, or a<br>

-namespace: <a href="http://www.cuj.com/articles/2000/0002/0002c/0002c.htm">it's<br>

-a collection of these</a> that defines an interface.  This interface may be<br>

-several functions, classes, or data structures, but the important issue is how<br>

-they work together.</p><br>

-<br>

-<p>In general, a module should be implemented by one or more <tt>.cpp</tt><br>

-files.  Each of these <tt>.cpp</tt> files should include the header that defines<br>

-their interface first.  This ensures that all of the dependences of the module<br>

-header have been properly added to the module header itself, and are not<br>

-implicit.  System headers should be included after user headers for a<br>

-translation unit.</p><br>

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="hl_dontinclude"><tt>#include</tt> as Little as Possible</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p><tt>#include</tt> hurts compile time performance.  Don't do it unless you<br>

-have to, especially in header files.</p><br>

-<br>

-<p>But wait! Sometimes you need to have the definition of a class to use it, or<br>

-to inherit from it.  In these cases go ahead and <tt>#include</tt> that header<br>

-file.  Be aware however that there are many cases where you don't need to have<br>

-the full definition of a class.  If you are using a pointer or reference to a<br>

-class, you don't need the header file.  If you are simply returning a class<br>

-instance from a prototyped function or method, you don't need it.  In fact, for<br>

-most cases, you simply don't need the definition of a class. And not<br>

-<tt>#include</tt>'ing speeds up compilation.</p><br>

-<br>

-<p>It is easy to go overboard on this recommendation, however.  You<br>

-<b>must</b> include all of the header files that you are using &mdash; you can<br>

-include them either directly or indirectly (through another header file).  To<br>

-make sure that you don't accidentally forget to include a header file in your<br>

-module header, make sure to include your module header <b>first</b> in the<br>

-implementation file (as mentioned above).  This way there won't be any hidden<br>

-dependencies that you'll find out about later.</p><br>

-<br>

-</div><br>
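The point about pointers and references not needing a full class definition can be sketched in a single file. `Registry` and `countEntries` are hypothetical names, and the comments mark which part would live in a header and which in the implementation file.

```cpp
#include <cassert>

// --- What the header would contain ---
class Registry;                            // Forward declaration, no #include.
unsigned countEntries(const Registry &R);  // A reference needs only the declaration.

// --- What the implementation file would contain ---
class Registry {
public:
  explicit Registry(unsigned N) : NumEntries(N) {}
  unsigned size() const { return NumEntries; }

private:
  unsigned NumEntries;
};

// The full definition is required here, because members are accessed.
unsigned countEntries(const Registry &R) {
  assert(R.size() < 1000 && "sketch assumes a small registry");
  return R.size();
}
```

Clients that only pass `Registry` around by reference or pointer can include the lightweight header and never see the class definition.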

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="hl_privateheaders">Keep "Internal" Headers Private</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>Many modules have a complex implementation that causes them to use more than<br>

-one implementation (<tt>.cpp</tt>) file.  It is often tempting to put the<br>

-internal communication interface (helper classes, extra functions, etc) in the<br>

-public module header file.  Don't do this!</p><br>

-<br>

-<p>If you really need to do something like this, put a private header file in<br>

-the same directory as the source files, and include it locally.  This ensures<br>

-that your private interface remains private and undisturbed by outsiders.</p><br>

-<br>

-<p>Note however, that it's okay to put extra implementation methods in a public<br>

-class itself. Just make them private (or protected) and all is well.</p><br>

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="hl_earlyexit">Use Early Exits and <tt>continue</tt> to Simplify Code</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>When reading code, keep in mind how much state and how many previous<br>

-decisions have to be remembered by the reader to understand a block of code.<br>

-Aim to reduce indentation where possible when it doesn't make it more difficult<br>

-to understand the code.  One great way to do this is by making use of early<br>

-exits and the <tt>continue</tt> keyword in long loops.  As an example of using<br>

-an early exit from a function, consider this "bad" code:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-Value *DoSomething(Instruction *I) {<br>

-  if (!isa&lt;TerminatorInst&gt;(I) &amp;&amp;<br>

-      I-&gt;hasOneUse() &amp;&amp; SomeOtherThing(I)) {<br>

-    ... some long code ....<br>

-  }<br>

-<br>

-  return 0;<br>

-}<br>

-</pre><br>

-</div><br>

-<br>

-<p>This code has several problems if the body of the '<tt>if</tt>' is large.<br>

-First, when you're looking at the top of the function, it isn't immediately clear that<br>

-this <em>only</em> does interesting things with non-terminator instructions, and<br>

-only applies to things with the other predicates.  Second, it is relatively<br>

-difficult to describe (in comments) why these predicates are important because<br>

-the <tt>if</tt> statement makes it difficult to lay out the comments.  Third,<br>

-when you're deep within the body of the code, it is indented an extra level.<br>

-Finally, when reading the top of the function, it isn't clear what the result is<br>

-if the predicate isn't true; you have to read to the end of the function to know<br>

-that it returns null.</p><br>

-<br>

-<p>It is much preferred to format the code like this:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-Value *DoSomething(Instruction *I) {<br>

-  // Terminators never need 'something' done to them because ...<br>

-  if (isa&lt;TerminatorInst&gt;(I))<br>

-    return 0;<br>

-<br>

-  // We conservatively avoid transforming instructions with multiple uses<br>

-  // because goats like cheese.<br>

-  if (!I-&gt;hasOneUse())<br>

-    return 0;<br>

-<br>

-  // This is really just here for example.<br>

-  if (!SomeOtherThing(I))<br>

-    return 0;<br>

-<br>

-  ... some long code ....<br>

-}<br>

-</pre><br>

-</div><br>

-<br>

-<p>This fixes these problems.  A similar problem frequently happens in <tt>for</tt><br>

-loops.  A silly example is something like this:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-  for (BasicBlock::iterator II = BB-&gt;begin(), E = BB-&gt;end(); II != E; ++II) {<br>

-    if (BinaryOperator *BO = dyn_cast&lt;BinaryOperator&gt;(II)) {<br>

-      Value *LHS = BO-&gt;getOperand(0);<br>

-      Value *RHS = BO-&gt;getOperand(1);<br>

-      if (LHS != RHS) {<br>

-        ...<br>

-      }<br>

-    }<br>

-  }<br>

-</pre><br>

-</div><br>

-<br>

-<p>When you have very, very small loops, this sort of structure is fine. But if<br>

-it grows beyond 10-15 lines, it becomes difficult for people to read and<br>

-understand at a glance. The problem with this sort of code is that it gets very<br>

-nested very quickly, so the reader of the code has to keep a lot of<br>

-context in their head to remember what is going on in the loop,<br>

-because they don't know if or when the <tt>if</tt> conditions will have <tt>else</tt>s.<br>

-It is strongly preferred to structure the loop like this:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-  for (BasicBlock::iterator II = BB-&gt;begin(), E = BB-&gt;end(); II != E; ++II) {<br>

-    BinaryOperator *BO = dyn_cast&lt;BinaryOperator&gt;(II);<br>

-    if (!BO) continue;<br>

-<br>

-    Value *LHS = BO-&gt;getOperand(0);<br>

-    Value *RHS = BO-&gt;getOperand(1);<br>

-    if (LHS == RHS) continue;<br>

-<br>

-    ...<br>

-  }<br>

-</pre><br>

-</div><br>

-<br>

-<p>This has all the benefits of using early exits for functions: it reduces<br>

-nesting of the loop, it makes it easier to describe why the conditions are true,<br>

-and it makes it obvious to the reader that there is no <tt>else</tt> coming up<br>

-that they have to push context into their brain for.  If a loop is large, this<br>

-can be a big understandability win.</p><br>
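
-<br>

-<p>As a self-contained sketch that combines both techniques (the function<br>
-<tt>sumEvenPositiveSquares</tt> is a hypothetical example, not LLVM code), an early<br>
-return and two <tt>continue</tt>s keep the loop body at a single level of nesting:</p><br>

```cpp
#include <vector>

// Hypothetical example (not LLVM code): sums the squares of the positive,
// even numbers in Values.
int sumEvenPositiveSquares(const std::vector<int> &Values) {
  // Early exit: an empty list has nothing to sum.
  if (Values.empty())
    return 0;

  int Sum = 0;
  for (std::vector<int>::const_iterator I = Values.begin(), E = Values.end();
       I != E; ++I) {
    // Non-positive values never contribute to the result.
    if (*I <= 0)
      continue;

    // Odd values are skipped as well.
    if (*I % 2 != 0)
      continue;

    Sum += *I * *I;
  }
  return Sum;
}
```

-<p>Each rejected case is handled and documented where it is tested, so the last<br>
-statement of the loop is the only one that needs the full context.</p><br>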

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="hl_else_after_return">Don't use <tt>else</tt> after a <tt>return</tt></a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>For reasons similar to those above (reduction of indentation and easier reading),<br>

-please do not use '<tt>else</tt>' or '<tt>else if</tt>' after something that<br>

-interrupts control flow &mdash; like <tt>return</tt>, <tt>break</tt>,<br>

-<tt>continue</tt>, <tt>goto</tt>, etc. For example, this is <em>bad</em>:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-  case 'J': {<br>

-    if (Signed) {<br>

-      Type = Context.getsigjmp_bufType();<br>

-      if (Type.isNull()) {<br>

-        Error = ASTContext::GE_Missing_sigjmp_buf;<br>

-        return QualType();<br>

-      <b>} else {<br>

-        break;<br>

-      }</b><br>

-    } else {<br>

-      Type = Context.getjmp_bufType();<br>

-      if (Type.isNull()) {<br>

-        Error = ASTContext::GE_Missing_jmp_buf;<br>

-        return QualType();<br>

-      <b>} else {<br>

-        break;<br>

-      }</b><br>

-    }<br>

-  }<br>

-  }<br>

-</pre><br>

-</div><br>

-<br>

-<p>It is better to write it like this:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-  case 'J':<br>

-    if (Signed) {<br>

-      Type = Context.getsigjmp_bufType();<br>

-      if (Type.isNull()) {<br>

-        Error = ASTContext::GE_Missing_sigjmp_buf;<br>

-        return QualType();<br>

-      }<br>

-    } else {<br>

-      Type = Context.getjmp_bufType();<br>

-      if (Type.isNull()) {<br>

-        Error = ASTContext::GE_Missing_jmp_buf;<br>

-        return QualType();<br>

-      }<br>

-    }<br>

-    <b>break;</b><br>

-</pre><br>

-</div><br>

-<br>

-<p>Or better yet (in this case) as:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-  case 'J':<br>

-    if (Signed)<br>

-      Type = Context.getsigjmp_bufType();<br>

-    else<br>

-      Type = Context.getjmp_bufType();<br>

-<br>

-    if (Type.isNull()) {<br>

-      Error = Signed ? ASTContext::GE_Missing_sigjmp_buf :<br>

-                       ASTContext::GE_Missing_jmp_buf;<br>

-      return QualType();<br>

-    }<br>

-    <b>break;</b><br>

-</pre><br>

-</div><br>

-<br>

-<p>The idea is to reduce indentation and the amount of code you have to keep<br>

-track of when reading the code.</p><br>
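
-<br>

-<p>The same rule applies outside of <tt>switch</tt> statements.  A minimal sketch,<br>
-using a hypothetical <tt>describeSign</tt> function that is not part of LLVM:</p><br>

```cpp
#include <string>

// Hypothetical example (not LLVM code): names the sign of Value.
std::string describeSign(int Value) {
  if (Value < 0)
    return "negative";

  // No 'else' is needed here: the 'return' above already left the function.
  if (Value == 0)
    return "zero";

  return "positive";
}
```

-<p>Every branch sits at the same indentation level, and none of them needs an<br>
-<tt>else</tt> to be correct.</p><br>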

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="hl_predicateloops">Turn Predicate Loops into Predicate Functions</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>It is very common to write small loops that just compute a boolean value.<br>

-There are a number of ways that people commonly write these, but an example of<br>

-this sort of thing is:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-  <b>bool FoundFoo = false;</b><br>

-  for (unsigned i = 0, e = BarList.size(); i != e; ++i)<br>

-    if (BarList[i]-&gt;isFoo()) {<br>

-      <b>FoundFoo = true;</b><br>

-      break;<br>

-    }<br>

-<br>

-  <b>if (FoundFoo) {</b><br>

-    ...<br>

-  }<br>

-</pre><br>

-</div><br>

-<br>

-<p>This sort of code is awkward to write, and is almost always a bad sign.<br>

-Instead of this sort of loop, we strongly prefer to use a predicate function<br>

-(which may be <a href="#micro_anonns">static</a>) that uses<br>

-<a href="#hl_earlyexit">early exits</a> to compute the predicate.  We prefer<br>

-the code to be structured like this:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-/// ListContainsFoo - Return true if the specified list has an element that is<br>

-/// a foo.<br>

-static bool ListContainsFoo(const std::vector&lt;Bar*&gt; &amp;List) {<br>

-  for (unsigned i = 0, e = List.size(); i != e; ++i)<br>

-    if (List[i]-&gt;isFoo())<br>

-      return true;<br>

-  return false;<br>

-}<br>

-...<br>

-<br>

-  <b>if (ListContainsFoo(BarList)) {</b><br>

-    ...<br>

-  }<br>

-</pre><br>

-</div><br>

-<br>

-<p>There are many reasons for doing this: it reduces indentation and factors out<br>

-code which can often be shared by other code that checks for the same predicate.<br>

-More importantly, it <em>forces you to pick a name</em> for the function, and<br>

-forces you to write a comment for it.  In this silly example, this doesn't add<br>

-much value.  However, if the condition is complex, this can make it a lot easier<br>

-for the reader to understand the code that queries for this predicate.  Instead<br>

-of being faced with the in-line details of how we check to see if the BarList<br>

-contains a foo, we can trust the function name and continue reading with better<br>

-locality.</p><br>
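
-<br>

-<p>A complete, compilable variant of the same idea is sketched here.  The names<br>
-<tt>containsNegative</tt> and <tt>sumIfValid</tt> are hypothetical and chosen only for<br>
-illustration:</p><br>

```cpp
#include <vector>

/// containsNegative - Return true if any element of Values is less than zero.
/// (Hypothetical example, not LLVM code.)
static bool containsNegative(const std::vector<int> &Values) {
  for (std::vector<int>::const_iterator I = Values.begin(), E = Values.end();
       I != E; ++I)
    if (*I < 0)
      return true;
  return false;
}

/// sumIfValid - Return the sum of Values, or -1 if any value is negative.
int sumIfValid(const std::vector<int> &Values) {
  // The predicate's name documents the check; no flag variable is needed.
  if (containsNegative(Values))
    return -1;

  int Sum = 0;
  for (std::vector<int>::const_iterator I = Values.begin(), E = Values.end();
       I != E; ++I)
    Sum += *I;
  return Sum;
}
```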

-<br>

-</div><br>

-<br>

-</div><br>

-<br>

-<!-- ======================================================================= --><br>

-<h3><br>

-  <a name="micro">The Low-Level Issues</a><br>

-</h3><br>

-<!-- ======================================================================= --><br>

-<br>

-<div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="ll_naming"><br>

-    Name Types, Functions, Variables, and Enumerators Properly<br>

-  </a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>Poorly-chosen names can mislead the reader and cause bugs. We cannot stress<br>

-enough how important it is to use <em>descriptive</em> names.  Pick names that<br>

-match the semantics and role of the underlying entities, within reason.  Avoid<br>

-abbreviations unless they are well known.  After picking a good name, make sure<br>

-to use consistent capitalization for the name, as inconsistency requires clients<br>

-to either memorize the APIs or to look it up to find the exact spelling.</p><br>

-<br>

-<p>In general, names should be in camel case (e.g. <tt>TextFileReader</tt><br>

-and <tt>isLValue()</tt>).  Different kinds of declarations have different<br>

-rules:</p><br>

-<br>

-<ul><br>

-<li><p><b>Type names</b> (including classes, structs, enums, typedefs, etc.)<br>

-    should be nouns and start with an upper-case letter (e.g.<br>

-    <tt>TextFileReader</tt>).</p></li><br>

-<br>

-<li><p><b>Variable names</b> should be nouns (as they represent state).  The<br>

-    name should be camel case, and start with an upper case letter (e.g.<br>

-    <tt>Leader</tt> or <tt>Boats</tt>).</p></li><br>

-<br>

-<li><p><b>Function names</b> should be verb phrases (as they represent<br>

-    actions), and command-like functions should be imperative.  The name should<br>

-    be camel case, and start with a lower case letter (e.g. <tt>openFile()</tt><br>

-    or <tt>isFoo()</tt>).</p></li><br>

-<br>

-<li><p><b>Enum declarations</b> (e.g. <tt>enum Foo {...}</tt>) are types, so<br>

-    they should follow the naming conventions for types.  A common use for enums<br>

-    is as a discriminator for a union, or an indicator of a subclass.  When an<br>

-    enum is used for something like this, it should have a <tt>Kind</tt> suffix<br>

-    (e.g. <tt>ValueKind</tt>).</p></li><br>

-<br>

-<li><p><b>Enumerators</b> (e.g. <tt>enum { Foo, Bar }</tt>) and <b>public member<br>

-    variables</b> should start with an upper-case letter, just like types.<br>

-    Unless the enumerators are defined in their own small namespace or inside a<br>

-    class, enumerators should have a prefix corresponding to the enum<br>

-    declaration name.  For example, <tt>enum ValueKind { ... };</tt> may contain<br>

-    enumerators like <tt>VK_Argument</tt>, <tt>VK_BasicBlock</tt>, etc.<br>

-    Enumerators that are just convenience constants are exempt from the<br>

-    requirement for a prefix.  For instance:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-enum {<br>

-  MaxSize = 42,<br>

-  Density = 12<br>

-};<br>

-</pre><br>

-</div><br>

-</li><br>

-<br>

-</ul><br>

-<br>

-<p>As an exception, classes that mimic STL classes can have member names in<br>

-STL's style of lower-case words separated by underscores (e.g. <tt>begin()</tt>,<br>

-<tt>push_back()</tt>, and <tt>empty()</tt>).</p><br>

-<br>

-<p>Here are some examples of good and bad names:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-class VehicleMaker {<br>

-  ...<br>

-  Factory&lt;Tire&gt; F;            // Bad -- abbreviation and non-descriptive.<br>

-  Factory&lt;Tire&gt; Factory;      // Better.<br>

-  Factory&lt;Tire&gt; TireFactory;  // Even better -- if VehicleMaker has more than one<br>

-                              // kind of factory.<br>

-};<br>

-<br>

-Vehicle makeVehicle(VehicleType Type) {<br>

-  VehicleMaker M;                         // Might be OK if it has a short lifespan.<br>

-  Tire Tmp1 = M.makeTire();               // Bad -- 'Tmp1' provides no information.<br>

-  Light Headlight = M.makeLight("head");  // Good -- descriptive.<br>

-  ...<br>

-}<br>

-</pre><br>

-</div><br>
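
-<br>

-<p>Putting the rules together, a small self-contained sketch might look like<br>
-this.  All of the names (<tt>ShapeKind</tt>, <tt>ShapeCounter</tt>, and so on) are<br>
-hypothetical and exist only to show the capitalization and prefix rules:</p><br>

```cpp
// Enum declaration: a type, so it is a noun starting with an upper-case
// letter.  It discriminates between cases, so it gets a 'Kind' suffix, and
// its enumerators carry the matching 'SK_' prefix.
enum ShapeKind {
  SK_Circle,
  SK_Square
};

// Convenience constants are exempt from the prefix requirement.
enum {
  MaxSides = 4
};

// Type name: a noun starting with an upper-case letter.
class ShapeCounter {
  // Variable names: nouns starting with an upper-case letter.
  unsigned NumCircles;
  unsigned NumSquares;

public:
  ShapeCounter() : NumCircles(0), NumSquares(0) {}

  // Function names: verb phrases starting with a lower-case letter.
  void addShape(ShapeKind Kind) {
    if (Kind == SK_Circle) {
      ++NumCircles;
      return;
    }
    ++NumSquares;
  }

  unsigned getNumCircles() const { return NumCircles; }
  bool isEmpty() const { return NumCircles + NumSquares == 0; }
};
```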

-<br>

-</div><br>

-<br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="ll_assert">Assert Liberally</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>Use the "<tt>assert</tt>" macro to its fullest.  Check all of your<br>

-preconditions and assumptions; you never know when a bug (not necessarily even<br>

-yours) might be caught early by an assertion, which reduces debugging time<br>

-dramatically.  The "<tt>&lt;cassert&gt;</tt>" header file is probably already<br>

-included by the header files you are using, so it doesn't cost anything to use<br>

-it.</p><br>

-<br>

-<p>To further assist with debugging, make sure to put some kind of error message<br>

-in the assertion statement, which is printed if the assertion is tripped. This<br>

-helps the poor debugger make sense of why an assertion is being made and<br>

-enforced, and hopefully what to do about it.  Here is one complete example:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-inline Value *getOperand(unsigned i) {<br>

-  assert(i &lt; Operands.size() &amp;&amp; "getOperand() out of range!");<br>

-  return Operands[i];<br>

-}<br>

-</pre><br>

-</div><br>

-<br>

-<p>Here are more examples:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-assert(Ty-&gt;isPointerType() &amp;&amp; "Can't allocate a non pointer type!");<br>

-<br>

-assert((Opcode == Shl || Opcode == Shr) &amp;&amp; "ShiftInst Opcode invalid!");<br>

-<br>

-assert(idx &lt; getNumSuccessors() &amp;&amp; "Successor # out of range!");<br>

-<br>

-assert(V1.getType() == V2.getType() &amp;&amp; "Constant types must be identical!");<br>

-<br>

-assert(isa&lt;PHINode&gt;(Succ-&gt;front()) &amp;&amp; "Only works on PHId BBs!");<br>

-</pre><br>

-</div><br>

-<br>

-<p>You get the idea.</p><br>

-<br>

-<p>Please be aware that, when adding assert statements, not all compilers are aware of<br>

-the semantics of the assert.  In some places, asserts are used to indicate a piece of<br>

-code that should not be reached.  These are typically of the form:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-assert(0 &amp;&amp; "Some helpful error message");<br>

-</pre><br>

-</div><br>

-<br>

-<p>When used in a function that returns a value, they should be followed with a return<br>

-statement and a comment indicating that this line is never reached.  This will prevent<br>

-a compiler which is unable to deduce that the assert statement never returns from<br>

-generating a warning.</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-assert(0 &amp;&amp; "Some helpful error message");<br>

-// Not reached<br>

-return 0;<br>

-</pre><br>

-</div><br>

-<br>

-<p>Another issue is that values used only by assertions will produce an "unused<br>

-value" warning when assertions are disabled.  For example, this code will<br>

-warn:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-unsigned Size = V.size();<br>

-assert(Size &gt; 42 &amp;&amp; "Vector smaller than it should be");<br>

-<br>

-bool NewToSet = Myset.insert(Value);<br>

-assert(NewToSet &amp;&amp; "The value shouldn't be in the set yet");<br>

-</pre><br>

-</div><br>

-<br>

-<p>These are two interestingly different cases. In the first case, the call to<br>

-V.size() is only useful for the assert, and we don't want it executed when<br>

-assertions are disabled.  Code like this should move the call into the assert<br>

-itself.  In the second case, the side effects of the call must happen whether<br>

-the assert is enabled or not.  In this case, the value should be cast to void to<br>

-disable the warning.  To be specific, it is preferred to write the code like<br>

-this:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-assert(V.size() &gt; 42 &amp;&amp; "Vector smaller than it should be");<br>

-<br>

-bool NewToSet = Myset.insert(Value); (void)NewToSet;<br>

-assert(NewToSet &amp;&amp; "The value shouldn't be in the set yet");<br>

-</pre><br>

-</div><br>
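
-<br>

-<p>As a compilable sketch of both cases, assuming a hypothetical helper<br>
-<tt>insertAllUnique</tt> built on <tt>std::set</tt> (whose <tt>insert</tt> returns a pair,<br>
-so the example reads the pair's <tt>second</tt> member):</p><br>

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical example (not LLVM code): inserts every element of Values into
// Seen and returns the resulting set size.
unsigned insertAllUnique(const std::vector<int> &Values, std::set<int> &Seen) {
  // This check is only useful to the assert, so it lives inside the assert
  // and disappears entirely when assertions are disabled.
  assert(!Values.empty() && "Expected at least one value to insert!");

  for (std::vector<int>::const_iterator I = Values.begin(), E = Values.end();
       I != E; ++I) {
    // The insertion must happen whether or not assertions are enabled, so it
    // stays outside the assert and the result is cast to void.
    bool NewToSet = Seen.insert(*I).second;
    (void)NewToSet;
    assert(NewToSet && "The value shouldn't be in the set yet");
  }
  return static_cast<unsigned>(Seen.size());
}
```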

-<br>

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="ll_ns_std">Do Not Use '<tt>using namespace std</tt>'</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>In LLVM, we prefer to explicitly prefix all identifiers from the standard<br>

-namespace with an "<tt>std::</tt>" prefix, rather than rely on<br>

-"<tt>using namespace std;</tt>".</p><br>

-<br>

-<p> In header files, adding a '<tt>using namespace XXX</tt>' directive pollutes<br>

-the namespace of any source file that <tt>#include</tt>s the header.  This is<br>

-clearly a bad thing.</p><br>

-<br>

-<p>In implementation files (e.g. <tt>.cpp</tt> files), the rule is more of a stylistic<br>

-rule, but is still important.  Basically, using explicit namespace prefixes<br>

-makes the code <b>clearer</b>, because it is immediately obvious what facilities<br>

-are being used and where they are coming from, and <b>more portable</b>, because<br>

-namespace clashes cannot occur between LLVM code and other namespaces.  The<br>

-portability rule is important because different standard library implementations<br>

-expose different symbols (potentially ones they shouldn't), and future revisions<br>

-to the C++ standard will add more symbols to the <tt>std</tt> namespace.  As<br>

-such, we never use '<tt>using namespace std;</tt>' in LLVM.</p><br>

-<br>

-<p>The exception to the general rule (i.e. it's not an exception for<br>

-the <tt>std</tt> namespace) is for implementation files.  For example, all of<br>

-the code in the LLVM project implements code that lives in the 'llvm' namespace.<br>

-As such, it is ok, and actually clearer, for the <tt>.cpp</tt> files to have a<br>

-'<tt>using namespace llvm;</tt>' directive at the top, after the<br>

-<tt>#include</tt>s.  This reduces indentation in the body of the file for source<br>

-editors that indent based on braces, and keeps the conceptual context cleaner.<br>

-The general form of this rule is that any <tt>.cpp</tt> file that implements<br>

-code in any namespace may use that namespace (and its parents'), but should not<br>

-use any others.</p><br>
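
-<br>

-<p>A sketch of the pattern follows, with a hypothetical <tt>textutil</tt> namespace<br>
-standing in for <tt>llvm</tt>.  The header and the <tt>.cpp</tt> file are collapsed into<br>
-one block so that the example compiles on its own:</p><br>

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Would normally live in a header of the (hypothetical) textutil library.
namespace textutil {
std::size_t countNonEmpty(const std::vector<std::string> &Lines);
} // end namespace textutil

// Would normally be at the top of the matching .cpp file, after the
// #includes.  The file implements code in 'textutil', so it may use it.
using namespace textutil;

// Standard library names keep their explicit std:: prefix.
std::size_t textutil::countNonEmpty(const std::vector<std::string> &Lines) {
  std::size_t Count = 0;
  for (std::vector<std::string>::const_iterator I = Lines.begin(),
                                                E = Lines.end();
       I != E; ++I)
    if (!I->empty())
      ++Count;
  return Count;
}
```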

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="ll_virtual_anch"><br>

-    Provide a Virtual Method Anchor for Classes in Headers<br>

-  </a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>If a class is defined in a header file and has a vtable (either it has<br>

-virtual methods or it derives from classes with virtual methods), it must<br>

-always have at least one out-of-line virtual method in the class.  Without<br>

-this, the compiler will copy the vtable and RTTI into every <tt>.o</tt> file<br>

-that <tt>#include</tt>s the header, bloating <tt>.o</tt> file sizes and<br>

-increasing link times.</p><br>
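
-<br>

-<p>One common way to satisfy this rule is a private virtual method that does<br>
-nothing and is defined in a single <tt>.cpp</tt> file.  The sketch below uses a<br>
-hypothetical <tt>Shape</tt> hierarchy and the method name <tt>anchor</tt>; on ABIs that<br>
-pick a "key function" (such as the Itanium C++ ABI), the vtable is then emitted<br>
-only where that method is defined:</p><br>

```cpp
// Would normally live in a header (hypothetical example, not LLVM code).
class Shape {
  virtual void anchor(); // Declared here, defined out of line below.

public:
  virtual ~Shape() {}
  virtual int getNumSides() const = 0;
};

class Triangle : public Shape {
  virtual void anchor(); // Each class with a vtable gets its own anchor.

public:
  virtual int getNumSides() const { return 3; }
};

// Would normally live in exactly one .cpp file.  These are the only
// out-of-line virtual methods, so they decide where the vtables are emitted.
void Shape::anchor() {}
void Triangle::anchor() {}
```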

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="ll_end">Don't evaluate <tt>end()</tt> every time through a loop</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>Because C++ doesn't have a standard "<tt>foreach</tt>" loop (though it can be<br>

-emulated with macros and may be coming in C++'0x) we end up writing a lot of<br>

-loops that manually iterate from begin to end on a variety of containers or<br>

-through other data structures.  One common mistake is to write a loop in this<br>

-style:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-  BasicBlock *BB = ...<br>

-  for (BasicBlock::iterator I = BB->begin(); I != <b>BB->end()</b>; ++I)<br>

-     ... use I ...<br>

-</pre><br>

-</div><br>

-<br>

-<p>The problem with this construct is that it evaluates "<tt>BB->end()</tt>"<br>

-every time through the loop.  Instead of writing the loop like this, we strongly<br>

-prefer loops to be written so that they evaluate it once before the loop starts.<br>

-A convenient way to do this is like so:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-  BasicBlock *BB = ...<br>

-  for (BasicBlock::iterator I = BB->begin(), E = <b>BB->end()</b>; I != E; ++I)<br>

-     ... use I ...<br>

-</pre><br>

-</div><br>

-<br>

-<p>The observant may quickly point out that these two loops may have different<br>

-semantics: if the container (a basic block in this case) is being mutated, then<br>

-"<tt>BB->end()</tt>" may change its value every time through the loop and the<br>

-second loop may not in fact be correct.  If you actually do depend on this<br>

-behavior, please write the loop in the first form and add a comment indicating<br>

-that you did it intentionally.</p><br>

-<br>

-<p>Why do we prefer the second form (when correct)?  Writing the loop in the<br>

-first form has two problems. First, it may be less efficient than evaluating it<br>

-at the start of the loop.  In this case, the cost is probably minor &mdash; a<br>

-few extra loads every time through the loop.  However, if the base expression is<br>

-more complex, then the cost can rise quickly.  We have seen loops where the end<br>

-expression was actually something like "<tt>SomeMap[x]->end()</tt>", and map<br>

-lookups really aren't cheap.  By writing it in the second form consistently, you<br>

-eliminate the issue entirely and don't even have to think about it.</p><br>

-<br>

-<p>The second (even bigger) issue is that writing the loop in the first form<br>

-hints to the reader that the loop is mutating the container (a fact that a<br>

-comment would handily confirm!).  If you write the loop in the second form, it<br>

-is immediately obvious without even looking at the body of the loop that the<br>

-container isn't being modified, which makes it easier to read the code and<br>

-understand what it does.</p><br>

-<br>

-<p>While the second form of the loop is a few extra keystrokes, we do strongly<br>

-prefer it.</p><br>
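
-<br>

-<p>To make the difference observable, the following sketch wraps a vector in a<br>
-hypothetical <tt>CountingList</tt> class that counts calls to <tt>end()</tt>.  The class<br>
-and both functions are illustrative assumptions, not LLVM code:</p><br>

```cpp
#include <vector>

// Hypothetical wrapper that records how often end() is called.
class CountingList {
  std::vector<int> Items;
  mutable unsigned NumEndCalls;

public:
  typedef std::vector<int>::const_iterator const_iterator;

  explicit CountingList(const std::vector<int> &Init)
      : Items(Init), NumEndCalls(0) {}

  const_iterator begin() const { return Items.begin(); }
  const_iterator end() const {
    ++NumEndCalls;
    return Items.end();
  }
  unsigned getNumEndCalls() const { return NumEndCalls; }
};

// First form: end() is re-evaluated before every iteration.
int sumReevaluatingEnd(const CountingList &List) {
  int Sum = 0;
  for (CountingList::const_iterator I = List.begin(); I != List.end(); ++I)
    Sum += *I;
  return Sum;
}

// Second (preferred) form: end() is evaluated once, before the loop starts.
int sumHoistingEnd(const CountingList &List) {
  int Sum = 0;
  for (CountingList::const_iterator I = List.begin(), E = List.end(); I != E;
       ++I)
    Sum += *I;
  return Sum;
}
```

-<p>For a list of three elements, the first form calls <tt>end()</tt> four times (once<br>
-per comparison) while the second calls it exactly once.</p><br>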

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="ll_iostream"><tt>#include &lt;iostream&gt;</tt> is Forbidden</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>The use of <tt>#include &lt;iostream&gt;</tt> in library files is<br>

-hereby <b><em>forbidden</em></b>, because many common implementations<br>

-transparently inject a <a href="#ci_static_ctors">static constructor</a> into<br>

-every translation unit that includes it.</p><br>

-<br>

-<p>Note that using the other stream headers (<tt>&lt;sstream&gt;</tt> for<br>

-example) is not problematic in this regard &mdash;<br>

-just <tt>&lt;iostream&gt;</tt>. However, <tt>raw_ostream</tt> provides various<br>

-APIs that are better performing for almost every use than <tt>std::ostream</tt><br>

-style APIs. <b>Therefore new code should always<br>

-use <a href="#ll_raw_ostream"><tt>raw_ostream</tt></a> for writing, or<br>

-the <tt>llvm::MemoryBuffer</tt> API for reading files.</b></p><br>

-<br>

-</div><br>

-<br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="ll_raw_ostream">Use <tt>raw_ostream</tt></a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>LLVM includes a lightweight, simple, and efficient stream implementation<br>

-in <tt>llvm/Support/raw_ostream.h</tt>, which provides all of the common<br>

-features of <tt>std::ostream</tt>.  All new code should use <tt>raw_ostream</tt><br>

-instead of <tt>ostream</tt>.</p><br>

-<br>

-<p>Unlike <tt>std::ostream</tt>, <tt>raw_ostream</tt> is not a template and can<br>

-be forward declared as <tt>class raw_ostream</tt>.  Public headers should<br>

-generally not include the <tt>raw_ostream</tt> header, but use forward<br>

-declarations and constant references to <tt>raw_ostream</tt> instances.</p><br>

-<br>

-</div><br>

-<br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="ll_avoidendl">Avoid <tt>std::endl</tt></a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>The <tt>std::endl</tt> modifier, when used with <tt>iostreams</tt>, outputs a<br>

-newline to the output stream specified.  In addition to doing this, however, it<br>

-also flushes the output stream.  In other words, these are equivalent:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-std::cout &lt;&lt; std::endl;<br>

-std::cout &lt;&lt; '\n' &lt;&lt; std::flush;<br>

-</pre><br>

-</div><br>

-<br>

-<p>Most of the time, you probably have no reason to flush the output stream, so<br>

-it's better to use a literal <tt>'\n'</tt>.</p><br>

-<br>

-</div><br>

-<br>

-</div><br>

-<br>

-<!-- ======================================================================= --><br>

-<h3><br>

-  <a name="nano">Microscopic Details</a><br>

-</h3><br>

-<!-- ======================================================================= --><br>

-<br>

-<div><br>

-<br>

-<p>This section describes preferred low-level formatting guidelines along with<br>

-reasoning on why we prefer them.</p><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="micro_spaceparen">Spaces Before Parentheses</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>We prefer to put a space before an open parenthesis only in control flow<br>

-statements, but not in normal function call expressions and function-like<br>

-macros.  For example, this is good:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-<b>if (</b>x) ...<br>

-<b>for (</b>i = 0; i != 100; ++i) ...<br>

-<b>while (</b>llvm_rocks) ...<br>

-<br>

-<b>somefunc(</b>42);<br>

-<b><a href="#ll_assert">assert</a>(</b>3 != 4 &amp;&amp; "laws of math are failing me");<br>

-<br>

-a = <b>foo(</b>42, 92) + <b>bar(</b>x);<br>

-</pre><br>

-</div><br>

-<br>

-<p>and this is bad:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-<b>if(</b>x) ...<br>

-<b>for(</b>i = 0; i != 100; ++i) ...<br>

-<b>while(</b>llvm_rocks) ...<br>

-<br>

-<b>somefunc (</b>42);<br>

-<b><a href="#ll_assert">assert</a> (</b>3 != 4 &amp;&amp; "laws of math are failing me");<br>

-<br>

-a = <b>foo (</b>42, 92) + <b>bar (</b>x);<br>

-</pre><br>

-</div><br>

-<br>

-<p>The reason for doing this is not completely arbitrary.  This style makes<br>

-control flow operators stand out more, and makes expressions flow better. The<br>

-function call operator binds very tightly as a postfix operator.  Putting a<br>

-space after a function name (as in the last example) makes it appear that the<br>

-argument list of the function on the left-hand side of a binary operator might<br>

-bind with the name of the function on the right-hand side.  More<br>

-specifically, it is easy to misread the "a" example as:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-a = foo <b>(</b>(42, 92) + bar<b>)</b> (x);<br>

-</pre><br>

-</div><br>

-<br>

-<p>when skimming through the code.  By avoiding a space in a function call, we avoid<br>

-this misinterpretation.</p><br>

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="micro_preincrement">Prefer Preincrement</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>Hard and fast rule: preincrement (<tt>++X</tt>) is no slower than<br>

-postincrement (<tt>X++</tt>) and could very well be a lot faster.  Use<br>

-preincrement whenever possible.</p><br>

-<br>

-<p>The semantics of postincrement include making a copy of the value being<br>

-incremented, returning it, and then preincrementing the "work value".  For<br>

-primitive types, this isn't a big deal, but for iterators, it can be a huge<br>

-issue (for example, some iterators contain stack and set objects, and<br>

-copying an iterator could invoke the copy constructors of these as well).  In general,<br>

-get in the habit of always using preincrement, and you won't have a problem.</p><br>
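
-<br>

-<p>The extra copy can be made visible with a hypothetical iterator-like class<br>
-that counts its own copies.  <tt>CountingCursor</tt> below is an illustrative<br>
-assumption, not an LLVM type:</p><br>

```cpp
// Hypothetical iterator-like class (not an LLVM type) that counts how many
// times it is copied.
class CountingCursor {
  int Position;

public:
  static unsigned NumCopies;

  CountingCursor() : Position(0) {}
  CountingCursor(const CountingCursor &Other) : Position(Other.Position) {
    ++NumCopies;
  }

  // Preincrement: advance in place and return a reference; no copy is made.
  CountingCursor &operator++() {
    ++Position;
    return *this;
  }

  // Postincrement: the old value must be saved first, which costs a copy.
  CountingCursor operator++(int) {
    CountingCursor Old(*this);
    ++*this;
    return Old;
  }

  int getPosition() const { return Position; }
};

unsigned CountingCursor::NumCopies = 0;
```

-<p>After <tt>++Cursor</tt> the copy count is still zero; after <tt>Cursor++</tt> it is at<br>
-least one, with the exact number depending on how much copying the compiler<br>
-elides.</p><br>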

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="micro_namespaceindent">Namespace Indentation</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p><br>

-In general, we strive to reduce indentation wherever possible.  This is useful<br>

-because we want code to <a href="#scf_codewidth">fit into 80 columns</a> without<br>

-wrapping horribly, but also because it makes it easier to understand the code.<br>

-Namespaces are a funny thing: we often want to put<br>

-lots of stuff into them, so they can be large.  Other times they are tiny,<br>

-because they just hold an enum or something similar.  In order to balance this,<br>

-we use different approaches for small versus large namespaces.<br>

-</p><br>

-<br>

-<p><br>

-If a namespace definition is small and <em>easily</em> fits on a screen (say,<br>

-less than 35 lines of code), then you should indent its body.  Here's an<br>

-example:<br>

-</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-namespace llvm {<br>

-  namespace X86 {<br>

-    /// RelocationType - An enum for the x86 relocation codes. Note that<br>

-    /// the terminology here doesn't follow x86 convention - word means<br>

-    /// 32-bit and dword means 64-bit.<br>

-    enum RelocationType {<br>

-      /// reloc_pcrel_word - PC relative relocation, add the relocated value to<br>

-      /// the value already in memory, after we adjust it for where the PC is.<br>

-      reloc_pcrel_word = 0,<br>

-<br>

-      /// reloc_picrel_word - PIC base relative relocation, add the relocated<br>

-      /// value to the value already in memory, after we adjust it for where the<br>

-      /// PIC base is.<br>

-      reloc_picrel_word = 1,<br>

-<br>

-      /// reloc_absolute_word, reloc_absolute_dword - Absolute relocation, just<br>

-      /// add the relocated value to the value already in memory.<br>

-      reloc_absolute_word = 2,<br>

-      reloc_absolute_dword = 3<br>

-    };<br>

-  }<br>

-}<br>

-</pre><br>

-</div><br>

-<br>

-<p>Since the body is small, indenting adds value because it makes it very clear<br>

-where the namespace starts and ends, and it is easy to take the whole thing in<br>

-in one "gulp" when reading the code.  If the blob of code in the namespace is<br>

-larger (as it typically is in a header in the <tt>llvm</tt> or <tt>clang</tt> namespaces), do not<br>

-indent the code, and add a comment indicating what namespace is being closed.<br>

-For example:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-namespace llvm {<br>

-namespace knowledge {<br>

-<br>

-/// Grokable - This class represents things that Smith can have an intimate<br>

-/// understanding of and contains the data associated with it.<br>

-class Grokable {<br>

-...<br>

-public:<br>

-  explicit Grokable() { ... }<br>

-  virtual ~Grokable() = 0;<br>

-<br>

-  ...<br>

-<br>

-};<br>

-<br>

-} // end namespace knowledge<br>

-} // end namespace llvm<br>

-</pre><br>

-</div><br>

-<br>

-<p>Because the class is large, we don't expect that the reader can easily<br>

-understand the entire concept at a glance, and the end of the file (where the<br>

-namespaces end) may be a long way from the place they open.  As such,<br>

-indenting the contents of the namespace doesn't add any value, and detracts from<br>

-the readability of the class.  In these cases it is best to <em>not</em> indent<br>

-the contents of the namespace.</p><br>

-<br>

-</div><br>

-<br>

-<!-- _______________________________________________________________________ --><br>

-<h4><br>

-  <a name="micro_anonns">Anonymous Namespaces</a><br>

-</h4><br>

-<br>

-<div><br>

-<br>

-<p>After talking about namespaces in general, you may be wondering about<br>

-anonymous namespaces in particular.<br>

-Anonymous namespaces are a great language feature that tells the C++ compiler<br>

-that the contents of the namespace are only visible within the current<br>

-translation unit, allowing more aggressive optimization and eliminating the<br>

-possibility of symbol name collisions.  Anonymous namespaces are to C++ as<br>

-"static" is to C functions and global variables.  While "static" is available<br>

-in C++, anonymous namespaces are more general: they can make entire classes<br>

-private to a file.</p><br>

-<br>

-<p>The problem with anonymous namespaces is that they naturally want to<br>

-encourage indentation of their body, and they reduce locality of reference: if<br>

-you see a random function definition in a C++ file, it is easy to see if it is<br>

-marked static, but seeing if it is in an anonymous namespace requires scanning<br>

-a big chunk of the file.</p><br>

-<br>

-<p>Because of this, we have a simple guideline: make anonymous namespaces as<br>

-small as possible, and only use them for class declarations.  For example, this<br>

-is good:</p><br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-<b>namespace {</b><br>

-  class StringSort {<br>

-  ...<br>

-  public:<br>

-    StringSort(...)<br>

-    bool operator&lt;(const char *RHS) const;<br>

-  };<br>

-<b>} // end anonymous namespace</b><br>

-<br>

-static void Helper() {<br>

-  ...<br>

-}<br>

-<br>

-bool StringSort::operator&lt;(const char *RHS) const {<br>

-  ...<br>

-}<br>

-<br>

-</pre><br>

-</div><br>

-<br>

-<p>This is bad:</p><br>

-<br>

-<br>

-<div class="doc_code"><br>

-<pre><br>

-<b>namespace {</b><br>

-class StringSort {<br>

-...<br>

-public:<br>

-  StringSort(...)<br>

-  bool operator&lt;(const char *RHS) const;<br>

-};<br>

-<br>

-void Helper() {<br>

-  ...<br>

-}<br>

-<br>

-bool StringSort::operator&lt;(const char *RHS) const {<br>

-  ...<br>

-}<br>

-<br>

-<b>} // end anonymous namespace</b><br>

-<br>

-</pre><br>

-</div><br>

-<br>

-<br>

-<p>This is bad specifically because if you're looking at "Helper" in the middle<br>

-of a large C++ file, you have no immediate way to tell if it is local to<br>

-the file.  When it is marked static explicitly, this is immediately obvious.<br>

-Also, there is no reason to enclose the definition of "operator&lt;" in the<br>

-namespace just because it was declared there.<br>

-</p><br>

-<br>

-</div><br>

-<br>

-</div><br>

-<br>

-</div><br>

-<br>

-<!-- *********************************************************************** --><br>

-<h2><br>

-  <a name="seealso">See Also</a><br>

-</h2><br>

-<!-- *********************************************************************** --><br>

-<br>

-<div><br>

-<br>

-<p>A lot of these comments and recommendations have been culled from other<br>

-sources.  Two particularly important books for our work are:</p><br>

-<br>

-<ol><br>

-<br>

-<li><a href="<a href="http://www.amazon.com/Effective-Specific-Addison-Wesley-Professional-Computing/dp/0321334876" target="_blank">http://www.amazon.com/Effective-Specific-Addison-Wesley-Professional-Computing/dp/0321334876</a>">Effective<br>


-C++</a> by Scott Meyers.  Also<br>

-interesting and useful are "More Effective C++" and "Effective STL" by the same<br>

-author.</li><br>

-<br>

-<li>Large-Scale C++ Software Design by John Lakos</li><br>

-<br>

-</ol><br>

-<br>

-<p>If you get some free time, and you haven't read them: do so, you might learn<br>

-something.</p><br>

-<br>

-</div><br>

-<br>

-<!-- *********************************************************************** --><br>

-<br>

-<hr><br>

-<address><br>

-  <a href="<a href="http://jigsaw.w3.org/css-validator/check/referer" target="_blank">http://jigsaw.w3.org/css-validator/check/referer</a>"><img<br>

-  src="<a href="http://jigsaw.w3.org/css-validator/images/vcss-blue" target="_blank">http://jigsaw.w3.org/css-validator/images/vcss-blue</a>" alt="Valid CSS"></a><br>

-  <a href="<a href="http://validator.w3.org/check/referer" target="_blank">http://validator.w3.org/check/referer</a>"><img<br>

-  src="<a href="http://www.w3.org/Icons/valid-html401-blue" target="_blank">http://www.w3.org/Icons/valid-html401-blue</a>" alt="Valid HTML 4.01"></a><br>

-<br>

-  <a href="mailto:<a href="mailto:sabre@nondot.org">sabre@nondot.org</a>">Chris Lattner</a><br><br>

-  <a href="<a href="http://llvm.org/" target="_blank">http://llvm.org/</a>">LLVM Compiler Infrastructure</a><br><br>

-  Last modified: $Date$<br>

-</address><br>

-<br>

-</body><br>

-</html><br>

<br>

Added: llvm/trunk/docs/CodingStandards.rst<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/CodingStandards.rst?rev=158786&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/CodingStandards.rst?rev=158786&view=auto</a><br>


==============================================================================<br>

--- llvm/trunk/docs/CodingStandards.rst (added)<br>

+++ llvm/trunk/docs/CodingStandards.rst Tue Jun 19 21:57:56 2012<br>

@@ -0,0 +1,1148 @@<br>

+.. _coding_standards:<br>

+<br>

+=====================<br>

+LLVM Coding Standards<br>

+=====================<br>

+<br>

+.. contents::<br>

+   :local:<br>

+<br>

+Introduction<br>

+============<br>

+<br>

+This document attempts to describe a few coding standards that are being used in<br>

+the LLVM source tree.  Although no coding standards should be regarded as<br>

+absolute requirements to be followed in all instances, coding standards are<br>

+particularly important for large-scale code bases that follow a library-based<br>

+design (like LLVM).<br>

+<br>

+This document intentionally does not prescribe fixed standards for religious<br>

+issues such as brace placement and space usage.  For issues like this, follow<br>

+the golden rule:<br>

+<br>

+.. _Golden Rule:<br>

+<br>

+    **If you are extending, enhancing, or bug fixing already implemented code,<br>

+    use the style that is already being used so that the source is uniform and<br>

+    easy to follow.**<br>

+<br>

+Note that some code bases (e.g. ``libc++``) have really good reasons to deviate<br>

+from the coding standards.  In the case of ``libc++``, this is because the<br>

+naming and other conventions are dictated by the C++ standard.  If you think<br>

+there is a specific good reason to deviate from the standards here, please bring<br>

+it up on the LLVMdev mailing list.<br>

+<br>

+There are some conventions that are not uniformly followed in the code base<br>

+(e.g. the naming convention).  This is because they are relatively new, and a<br>

+lot of code was written before they were put in place.  Our long term goal is<br>

+for the entire codebase to follow the convention, but we explicitly *do not*<br>

+want patches that do large-scale reformatting of existing code.  On the other<br>

+hand, it is reasonable to rename the methods of a class if you're about to<br>

+change it in some other way.  Just do the reformatting as a separate commit from<br>

+the functionality change.<br>

+<br>

+The ultimate goal of these guidelines is to increase readability and<br>

+maintainability of our common source base. If you have suggestions for topics to<br>

+be included, please mail them to `Chris <mailto:<a href="mailto:sabre@nondot.org">sabre@nondot.org</a>>`_.<br>

+<br>

+Mechanical Source Issues<br>

+========================<br>

+<br>

+Source Code Formatting<br>

+----------------------<br>

+<br>

+Commenting<br>

+^^^^^^^^^^<br>

+<br>

+Comments are one critical part of readability and maintainability.  Everyone<br>

+knows they should comment their code, and so should you.  When writing comments,<br>

+write them as English prose, which means they should use proper capitalization,<br>

+punctuation, etc.  Aim to describe what the code is trying to do and why, not<br>

+*how* it does it at a micro level. Here are a few critical things to document:<br>

+<br>

+.. _header file comment:<br>

+<br>

+File Headers<br>

+""""""""""""<br>

+<br>

+Every source file should have a header on it that describes the basic purpose of<br>

+the file.  If a file does not have a header, it should not be checked into the<br>

+tree.  The standard header looks like this:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  //===-- llvm/Instruction.h - Instruction class definition -------*- C++ -*-===//<br>

+  //<br>

+  //                     The LLVM Compiler Infrastructure<br>

+  //<br>

+  // This file is distributed under the University of Illinois Open Source<br>

+  // License. See LICENSE.TXT for details.<br>

+  //<br>

+  //===----------------------------------------------------------------------===//<br>

+  //<br>

+  // This file contains the declaration of the Instruction class, which is the<br>

+  // base class for all of the VM instructions.<br>

+  //<br>

+  //===----------------------------------------------------------------------===//<br>

+<br>

+A few things to note about this particular format: The "``-*- C++ -*-``" string<br>

+on the first line is there to tell Emacs that the source file is a C++ file, not<br>

+a C file (Emacs assumes ``.h`` files are C files by default).<br>

+<br>

+.. note::<br>

+<br>

+    This tag is not necessary in ``.cpp`` files.  The name of the file is also<br>

+    on the first line, along with a very short description of the purpose of the<br>

+    file.  This is important when printing out code and flipping through lots of<br>

+    pages.<br>

+<br>

+The next section in the file is a concise note that defines the license that the<br>

+file is released under.  This makes it perfectly clear what terms the source<br>

+code can be distributed under.  The license note should not be modified in any<br>

+way.<br>

+<br>

+The main body of the description does not have to be very long in most cases.<br>

+Here it's only two lines.  If an algorithm is being implemented or something<br>

+tricky is going on, a reference to the paper where it is published should be<br>

+included, as well as any notes or *gotchas* in the code to watch out for.<br>

+<br>

+Class overviews<br>

+"""""""""""""""<br>

+<br>

+Classes are one fundamental part of a good object oriented design.  As such, a<br>

+class definition should have a comment block that explains what the class is<br>

+used for and how it works.  Every non-trivial class is expected to have a<br>

+``doxygen`` comment block.<br>

+<br>

+Method information<br>

+""""""""""""""""""<br>

+<br>

+Methods defined in a class (as well as any global functions) should also be<br>

+documented properly.  A quick note about what it does and a description of the<br>

+borderline behaviour is all that is necessary here (unless something<br>

+particularly tricky or insidious is going on).  The hope is that people can<br>

+figure out how to use your interfaces without reading the code itself.<br>

+<br>

+Good things to talk about here are what happens when something unexpected<br>

+happens: does the method return null?  Abort?  Format your hard disk?<br>

+<br>
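
+As a sketch of the kind of comment meant here, the hypothetical function below<br>

+(``findFirstNegative`` is an illustrative name, not an LLVM API) states what it<br>

+does and spells out the borderline cases:<br>

+<br>

```cpp
#include <cstddef>
#include <vector>

/// Return the index of the first negative element in \p Values.
///
/// Borderline behavior: if \p Values is empty or contains no negative
/// element, this returns Values.size() rather than aborting, so callers can
/// compare the result against the size to detect "not found".
///
/// NOTE: hypothetical helper for illustration; it is not part of LLVM.
std::size_t findFirstNegative(const std::vector<int> &Values) {
  for (std::size_t I = 0, E = Values.size(); I != E; ++I)
    if (Values[I] < 0)
      return I;
  return Values.size();
}
```

+<br>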

+Comment Formatting<br>

+^^^^^^^^^^^^^^^^^^<br>

+<br>

+In general, prefer C++ style (``//``) comments.  They take less space, require<br>

+less typing, don't have nesting problems, etc.  There are a few cases when it is<br>

+useful to use C style (``/* */``) comments however:<br>

+<br>

+#. When writing C code: Obviously if you are writing C code, use C style<br>

+   comments.<br>

+<br>

+#. When writing a header file that may be ``#include``\d by a C source file.<br>

+<br>

+#. When writing a source file that is used by a tool that only accepts C style<br>

+   comments.<br>

+<br>

+To comment out a large block of code, use ``#if 0`` and ``#endif``. These nest<br>

+properly and are better behaved in general than C style comments.<br>

+<br>
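
+A small sketch of the nesting behavior (the function ``answer`` is purely<br>

+illustrative): the inner ``#if 0``/``#endif`` pair does not end the outer<br>

+disabled region, whereas a ``/* */`` comment would end at the first ``*/``:<br>

+<br>

```cpp
// Hypothetical example: disabling a region that already contains a disabled
// region.  Preprocessor conditionals nest, so the outer #endif is the one that
// re-enables compilation.
int answer() {
#if 0
  return 1;   // Disabled by the outer #if 0.
#if 0
  return 2;   // Also disabled; this inner pair does not close the outer one.
#endif
  return 3;   // Still disabled: we are back inside the outer region only.
#endif
  return 42;  // The only statement the compiler actually sees.
}
```

+<br>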

+``#include`` Style<br>

+^^^^^^^^^^^^^^^^^^<br>

+<br>

+Immediately after the `header file comment`_ (and include guards if working on a<br>

+header file), the `minimal list of #includes`_ required by the file should be<br>

+listed.  We prefer these ``#include``\s to be listed in this order:<br>

+<br>

+.. _Main Module Header:<br>

+.. _Local/Private Headers:<br>

+<br>

+#. Main Module Header<br>

+#. Local/Private Headers<br>

+#. ``llvm/*``<br>

+#. ``llvm/Analysis/*``<br>

+#. ``llvm/Assembly/*``<br>

+#. ``llvm/Bitcode/*``<br>

+#. ``llvm/CodeGen/*``<br>

+#. ...<br>

+#. ``llvm/Support/*``<br>

+#. ``llvm/Config/*``<br>

+#. System ``#include``\s<br>

+<br>

+and each category should be sorted by name.<br>

+<br>

+The `Main Module Header`_ file applies to ``.cpp`` files which implement an<br>

+interface defined by a ``.h`` file.  This ``#include`` should always be included<br>

+**first** regardless of where it lives on the file system.  By including a<br>

+header file first in the ``.cpp`` files that implement the interfaces, we ensure<br>

+that the header does not have any hidden dependencies which are not explicitly<br>

+``#include``\d in the header, but should be. It is also a form of documentation<br>

+in the ``.cpp`` file to indicate where the interfaces it implements are defined.<br>

+<br>

+.. _fit into 80 columns:<br>

+<br>

+Source Code Width<br>

+^^^^^^^^^^^^^^^^^<br>

+<br>

+Write your code to fit within 80 columns of text.  This helps those of us who<br>

+like to print out code and look at your code in an ``xterm`` without resizing<br>

+it.<br>

+<br>

+The longer answer is that there must be some limit to the width of the code in<br>

+order to reasonably allow developers to have multiple files side-by-side in<br>

+windows on a modest display.  If you are going to pick a width limit, it is<br>

+somewhat arbitrary but you might as well pick something standard.  Going with 90<br>

+columns (for example) instead of 80 columns wouldn't add any significant value<br>

+and would be detrimental to printing out code.  Also many other projects have<br>

+standardized on 80 columns, so some people have already configured their editors<br>

+for it (vs something else, like 90 columns).<br>

+<br>

+This is one of many contentious issues in coding standards, but it is not up for<br>

+debate.<br>

+<br>

+Use Spaces Instead of Tabs<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+In all cases, prefer spaces to tabs in source files.  People have different<br>

+preferred indentation levels, and different styles of indentation that they<br>

+like; this is fine.  What isn't fine is that different editors/viewers expand<br>

+tabs out to different tab stops.  This can cause your code to look completely<br>

+unreadable, and it is not worth dealing with.<br>

+<br>

+As always, follow the `Golden Rule`_ above: follow the style of<br>

+existing code if you are modifying and extending it.  If you like four spaces of<br>

+indentation, **DO NOT** do that in the middle of a chunk of code with two spaces<br>

+of indentation.  Also, do not reindent a whole source file: it makes for<br>

+incredible diffs that are absolutely worthless.<br>

+<br>

+Indent Code Consistently<br>

+^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+Okay, in your first year of programming you were told that indentation is<br>

+important.  If you didn't believe and internalize this then, now is the time.<br>

+Just do it.<br>

+<br>

+Compiler Issues<br>

+---------------<br>

+<br>

+Treat Compiler Warnings Like Errors<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+If your code has compiler warnings in it, something is wrong --- you aren't<br>

+casting values correctly, you have "questionable" constructs in your code, or<br>

+you are doing something legitimately wrong.  Compiler warnings can cover up<br>

+legitimate errors in output and make dealing with a translation unit difficult.<br>

+<br>

+It is not possible to prevent all warnings from all compilers, nor is it<br>

+desirable.  Instead, pick a standard compiler (like ``gcc``) that provides a<br>

+good thorough set of warnings, and stick to it.  At least in the case of<br>

+``gcc``, it is possible to work around any spurious warnings by changing the<br>

+syntax of the code slightly.  For example, a warning that annoys me occurs when<br>

+I write code like this:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  if (V = getValue()) {<br>

+    ...<br>

+  }<br>

+<br>

+``gcc`` will warn me that I probably want to use the ``==`` operator, and that I<br>

+probably mistyped it.  In most cases, I haven't, and I really don't want the<br>

+spurious warnings.  To fix this particular problem, I rewrite the code like<br>

+this:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  if ((V = getValue())) {<br>

+    ...<br>

+  }<br>

+<br>

+which shuts ``gcc`` up.  Any ``gcc`` warning that annoys you can be fixed by<br>

+massaging the code appropriately.<br>

+<br>

+Write Portable Code<br>

+^^^^^^^^^^^^^^^^^^^<br>

+<br>

+In almost all cases, it is possible and within reason to write completely<br>

+portable code.  If there are cases where it isn't possible to write portable<br>

+code, isolate it behind a well defined (and well documented) interface.<br>

+<br>

+In practice, this means that you shouldn't assume much about the host compiler<br>

+(and Visual Studio tends to be the lowest common denominator).  If advanced<br>

+features are used, they should only be an implementation detail of a library<br>

+which has a simple exposed API, and preferably be buried in ``libSystem``.<br>

+<br>

+Do not use RTTI or Exceptions<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+In an effort to reduce code and executable size, LLVM does not use RTTI<br>

+(e.g. ``dynamic_cast<>``) or exceptions.  These two language features violate<br>

+the general C++ principle of *"you only pay for what you use"*, causing<br>

+executable bloat even if exceptions are never used in the code base, or if RTTI<br>

+is never used for a class.  Because of this, we turn them off globally in the<br>

+code.<br>

+<br>

+That said, LLVM does make extensive use of a hand-rolled form of RTTI that uses<br>

+templates like `isa<>, cast<>, and dyn_cast<> <ProgrammersManual.html#isa>`_.<br>

+This form of RTTI is opt-in and can be added to any class.  It is also<br>

+substantially more efficient than ``dynamic_cast<>``.<br>

+<br>
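
+The sketch below shows the general shape of such opt-in RTTI.  It is a<br>

+simplified stand-in written for this example, not LLVM's implementation: the<br>

+``Shape`` hierarchy is hypothetical, and the toy ``isa``, ``cast``, and<br>

+``dyn_cast`` templates only handle plain pointers.<br>

+<br>

```cpp
#include <cassert>

// Simplified stand-in for the pattern; LLVM's real isa<>/cast<>/dyn_cast<>
// templates live in llvm/Support/Casting.h and handle many more cases.
class Shape {
public:
  enum ShapeKind { SK_Circle, SK_Square };

  explicit Shape(ShapeKind K) : Kind(K) {}
  ShapeKind getKind() const { return Kind; }

private:
  const ShapeKind Kind;
};

class Circle : public Shape {
public:
  explicit Circle(double R) : Shape(SK_Circle), Radius(R) {}
  double getRadius() const { return Radius; }

  // Each class opts in by saying how to recognize its own instances.
  static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }

private:
  double Radius;
};

class Square : public Shape {
public:
  explicit Square(double S) : Shape(SK_Square), Side(S) {}
  double getSide() const { return Side; }

  static bool classof(const Shape *S) { return S->getKind() == SK_Square; }

private:
  double Side;
};

// Toy versions of the three templates, for plain pointers only.
template <typename To, typename From> bool isa(const From *Val) {
  return To::classof(Val);
}

// Checked downcast: asserts that the dynamic kind matches.
template <typename To, typename From> To *cast(From *Val) {
  assert(isa<To>(Val) && "cast<> argument of incompatible type!");
  return static_cast<To *>(Val);
}

// Checking downcast: returns null when the dynamic kind does not match.
template <typename To, typename From> To *dyn_cast(From *Val) {
  return isa<To>(Val) ? static_cast<To *>(Val) : nullptr;
}
```

+<br>

+No compiler-generated type information is involved: a class pays for the kind<br>

+tag only if it chooses to take part in the scheme.<br>

+<br>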

+.. _static constructor:<br>

+<br>

+Do not use Static Constructors<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+Static constructors and destructors (e.g. global variables whose types have a<br>

+constructor or destructor) should not be added to the code base, and should be<br>

+removed wherever possible.  Besides `well known problems<br>

+<<a href="http://yosefk.com/c++fqa/ctors.html#fqa-10.12" target="_blank">http://yosefk.com/c++fqa/ctors.html#fqa-10.12</a>>`_ where the order of<br>

+initialization is undefined between globals in different source files, the<br>

+entire concept of static constructors is at odds with the common use case of<br>

+LLVM as a library linked into a larger application.<br>

+<br>

+Consider the use of LLVM as a JIT linked into another application (perhaps for<br>

+`OpenGL, custom languages <<a href="http://llvm.org/Users.html" target="_blank">http://llvm.org/Users.html</a>>`_, `shaders in movies<br>

+<<a href="http://llvm.org/devmtg/2010-11/Gritz-OpenShadingLang.pdf" target="_blank">http://llvm.org/devmtg/2010-11/Gritz-OpenShadingLang.pdf</a>>`_, etc). Due to the<br>

+design of static constructors, they must be executed at startup time of the<br>

+entire application, regardless of whether or how LLVM is used in that larger<br>

+application.  There are two problems with this:<br>

+<br>

+* The time to run the static constructors impacts startup time of applications<br>

+  --- a critical time for GUI apps, among others.<br>

+<br>

+* The static constructors cause the app to pull many extra pages of memory off<br>

+  the disk: both the code for the constructor in each ``.o`` file and the small<br>

+  amount of data that gets touched. In addition, touched/dirty pages put more<br>

+  pressure on the VM system on low-memory machines.<br>

+<br>

+We would really like for there to be zero cost for linking in an additional LLVM<br>

+target or other library into an application, but static constructors violate<br>

+this goal.<br>

+<br>

+That said, LLVM unfortunately does contain static constructors.  It would be a<br>

+`great project <<a href="http://llvm.org/PR11944" target="_blank">http://llvm.org/PR11944</a>>`_ for someone to purge all static<br>

+constructors from LLVM, and then enable the ``-Wglobal-constructors`` warning<br>

+flag (when building with Clang) to ensure we do not regress in the future.<br>

+<br>
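
+To make the startup cost concrete, here is a minimal sketch using a<br>

+hypothetical ``Registry`` class.  The function-local static is one common way<br>

+to defer construction until first use; it is an assumption of this example,<br>

+not a technique this document mandates.<br>

+<br>

```cpp
#include <string>

// Counts how many Registry objects have been constructed so far.  A plain int
// with a constant initializer does not need a static constructor.
static int NumRegistriesBuilt = 0;

// Hypothetical class with a non-trivial constructor.
class Registry {
  std::string Name;

public:
  explicit Registry(const char *N) : Name(N) { ++NumRegistriesBuilt; }
  const std::string &getName() const { return Name; }
};

// What the text warns against: the line below would run Registry's
// constructor during program startup, whether or not the object is ever used.
//
//   static Registry EagerRegistry("eager");

// Deferred alternative (an assumption of this sketch): a function-local
// static is constructed the first time control reaches its declaration.
Registry &getRegistry() {
  static Registry Lazy("lazy");
  return Lazy;
}
```

+<br>

+Note that the destructor of a function-local static still runs at program<br>

+exit, so this sketch only addresses the startup cost.<br>

+<br>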

+Use of ``class`` and ``struct`` Keywords<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+In C++, the ``class`` and ``struct`` keywords can be used almost<br>

+interchangeably. The only difference is when they are used to declare a class:<br>

+``class`` makes all members private by default while ``struct`` makes all<br>

+members public by default.<br>

+<br>

+Unfortunately, not all compilers follow the rules and some will generate<br>

+different symbols based on whether ``class`` or ``struct`` was used to declare<br>

+the symbol.  This can lead to problems at link time.<br>

+<br>

+So, the rule for LLVM is to always use the ``class`` keyword, unless **all**<br>

+members are public and the type is a C++ `POD<br>

+<<a href="http://en.wikipedia.org/wiki/Plain_old_data_structure" target="_blank">http://en.wikipedia.org/wiki/Plain_old_data_structure</a>>`_ type, in which case<br>

+``struct`` is allowed.<br>

+<br>
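
+As an illustration of the rule, with hypothetical ``Point`` and ``Counter``<br>

+types (the ``static_assert`` assumes a C++11 compiler):<br>

+<br>

```cpp
#include <type_traits>

// All members are public and the type is POD, so 'struct' is allowed.
struct Point {
  int X;
  int Y;
};

// Has private state and a constructor, so it is declared with 'class'.
class Counter {
  int Count;  // Private by default because of the 'class' keyword.

public:
  Counter() : Count(0) {}
  void increment() { ++Count; }
  int getCount() const { return Count; }
};

// Compile-time check that Point really is a POD type.
static_assert(std::is_trivial<Point>::value &&
                  std::is_standard_layout<Point>::value,
              "Point is expected to be POD");
```

+<br>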

+Style Issues<br>

+============<br>

+<br>

+The High-Level Issues<br>

+---------------------<br>

+<br>

+A Public Header File **is** a Module<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+C++ doesn't do too well in the modularity department.  There is no real<br>

+encapsulation or data hiding (unless you use expensive protocol classes), but it<br>

+is what we have to work with.  When you write a public header file (in the LLVM<br>

+source tree, they live in the top level "``include``" directory), you are<br>

+defining a module of functionality.<br>

+<br>

+Ideally, modules should be completely independent of each other, and their<br>

+header files should only ``#include`` the absolute minimum number of headers<br>

+possible. A module is not just a class, a function, or a namespace: it's a<br>

+collection of these that defines an interface.  This interface may be several<br>

+functions, classes, or data structures, but the important issue is how they work<br>

+together.<br>

+<br>

+In general, a module should be implemented by one or more ``.cpp`` files.  Each<br>

+of these ``.cpp`` files should include the header that defines their interface<br>

+first.  This ensures that all of the dependences of the module header have been<br>

+properly added to the module header itself, and are not implicit.  System<br>

+headers should be included after user headers for a translation unit.<br>

+<br>

+.. _minimal list of #includes:<br>

+<br>

+``#include`` as Little as Possible<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+``#include`` hurts compile time performance.  Don't do it unless you have to,<br>

+especially in header files.<br>

+<br>

+But wait! Sometimes you need to have the definition of a class to use it, or to<br>

+inherit from it.  In these cases go ahead and ``#include`` that header file.  Be<br>

+aware however that there are many cases where you don't need to have the full<br>

+definition of a class.  If you are using a pointer or reference to a class, you<br>

+don't need the header file.  If you are simply returning a class instance from a<br>

+prototyped function or method, you don't need it.  In fact, for most cases, you<br>

+simply don't need the definition of a class. And not ``#include``\ing speeds up<br>

+compilation.<br>

+<br>
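
+A minimal sketch of getting by with a forward declaration, assuming a<br>

+hypothetical ``Widget`` class whose definition would normally live in its own<br>

+header (everything is shown in one file here so the example is self-contained):<br>

+<br>

```cpp
// Forward declaration: enough for pointers, references, and function
// prototypes.  No #include of the (hypothetical) Widget header is needed here.
class Widget;

// Declarations that only mention Widget by reference or as a return type
// compile against the incomplete type.
int getWidgetSize(const Widget &W);
Widget makeWidget(int Size);

class WidgetUser {
  Widget *Current;  // A pointer member does not need the full definition.

public:
  WidgetUser() : Current(nullptr) {}
  void setCurrent(Widget *W) { Current = W; }
  Widget *getCurrent() const { return Current; }
};

// --- What would normally live in the Widget header and source file ---
// The full definition is only required where members are accessed or objects
// are created.
class Widget {
  int Size;

public:
  explicit Widget(int S) : Size(S) {}
  int getSize() const { return Size; }
};

int getWidgetSize(const Widget &W) { return W.getSize(); }
Widget makeWidget(int Size) { return Widget(Size); }
```

+<br>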

+It is easy to go overboard on this recommendation, however.  You<br>

+**must** include all of the header files that you are using --- you can include<br>

+them either directly or indirectly through another header file.  To make sure<br>

+that you don't accidentally forget to include a header file in your module<br>

+header, make sure to include your module header **first** in the implementation<br>

+file (as mentioned above).  This way there won't be any hidden dependencies that<br>

+you'll find out about later.<br>

+<br>

+Keep "Internal" Headers Private<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+Many modules have a complex implementation that causes them to use more than one<br>

+implementation (``.cpp``) file.  It is often tempting to put the internal<br>

+communication interface (helper classes, extra functions, etc) in the public<br>

+module header file.  Don't do this!<br>

+<br>

+If you really need to do something like this, put a private header file in the<br>

+same directory as the source files, and include it locally.  This ensures that<br>

+your private interface remains private and undisturbed by outsiders.<br>

+<br>

+.. note::<br>

+<br>

+    It's okay to put extra implementation methods in a public class itself. Just<br>

+    make them private (or protected) and all is well.<br>

+<br>

+.. _early exits:<br>

+<br>

+Use Early Exits and ``continue`` to Simplify Code<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+When reading code, keep in mind how much state and how many previous decisions<br>

+have to be remembered by the reader to understand a block of code.  Aim to<br>

+reduce indentation where possible when it doesn't make it more difficult to<br>

+understand the code.  One great way to do this is by making use of early exits<br>

+and the ``continue`` keyword in long loops.  As an example of using an early<br>

+exit from a function, consider this "bad" code:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  Value *DoSomething(Instruction *I) {<br>

+    if (!isa<TerminatorInst>(I) &&<br>

+        I->hasOneUse() && SomeOtherThing(I)) {<br>

+      ... some long code ....<br>

+    }<br>

+<br>

+    return 0;<br>

+  }<br>

+<br>

+This code has several problems if the body of the ``'if'`` is large.  First,<br>

+when you're looking at the top of the function, it isn't immediately clear that<br>

+this *only* does interesting things with non-terminator instructions, and only<br>

+applies to things with the other predicates.  Second, it is relatively difficult<br>

+to describe (in comments) why these predicates are important because the ``if``<br>

+statement makes it difficult to lay out the comments.  Third, when you're deep<br>

+within the body of the code, it is indented an extra level.  Finally, when<br>

+reading the top of the function, it isn't clear what the result is if the<br>

+predicate isn't true; you have to read to the end of the function to know that<br>

+it returns null.<br>

+<br>

+It is much preferred to format the code like this:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  Value *DoSomething(Instruction *I) {<br>

+    // Terminators never need 'something' done to them because ...<br>

+    if (isa<TerminatorInst>(I))<br>

+      return 0;<br>

+<br>

+    // We conservatively avoid transforming instructions with multiple uses<br>

+    // because goats like cheese.<br>

+    if (!I->hasOneUse())<br>

+      return 0;<br>

+<br>

+    // This is really just here for example.<br>

+    if (!SomeOtherThing(I))<br>

+      return 0;<br>

+<br>

+    ... some long code ....<br>

+  }<br>

+<br>

+This fixes these problems.  A similar problem frequently happens in ``for``<br>

+loops.  A silly example is something like this:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) {<br>

+    if (BinaryOperator *BO = dyn_cast<BinaryOperator>(II)) {<br>

+      Value *LHS = BO->getOperand(0);<br>

+      Value *RHS = BO->getOperand(1);<br>

+      if (LHS != RHS) {<br>

+        ...<br>

+      }<br>

+    }<br>

+  }<br>

+<br>

+When you have very, very small loops, this sort of structure is fine. But if it<br>

+exceeds 10-15 lines, it becomes difficult for people to read and understand at<br>

+a glance. The problem with this sort of code is that it gets very nested very<br>

+quickly, meaning that the reader of the code has to keep a lot of context in<br>

+their brain to remember what is going on immediately in the loop, because they<br>

+don't know if/when the ``if`` conditions will have ``else``\s etc.<br>

+It is strongly preferred to structure the loop like this:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) {<br>

+    BinaryOperator *BO = dyn_cast<BinaryOperator>(II);<br>

+    if (!BO) continue;<br>

+<br>

+    Value *LHS = BO->getOperand(0);<br>

+    Value *RHS = BO->getOperand(1);<br>

+    if (LHS == RHS) continue;<br>

+<br>

+    ...<br>

+  }<br>

+<br>

+This has all the benefits of using early exits for functions: it reduces nesting<br>

+of the loop, it makes it easier to describe why the conditions are true, and it<br>

+makes it obvious to the reader that there is no ``else`` coming up that they<br>

+have to push context into their brain for.  If a loop is large, this can be a<br>

+big understandability win.<br>

+<br>

+Don't use ``else`` after a ``return``<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+For reasons similar to those above (reduction of indentation and easier<br>

+reading), please do not use ``'else'`` or ``'else if'`` after something that<br>

+interrupts control flow --- like ``return``, ``break``, ``continue``, ``goto``,<br>

+etc. For example, this is *bad*:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  case 'J': {<br>

+    if (Signed) {<br>

+      Type = Context.getsigjmp_bufType();<br>

+      if (Type.isNull()) {<br>

+        Error = ASTContext::GE_Missing_sigjmp_buf;<br>

+        return QualType();<br>

+      } else {<br>

+        break;<br>

+      }<br>

+    } else {<br>

+      Type = Context.getjmp_bufType();<br>

+      if (Type.isNull()) {<br>

+        Error = ASTContext::GE_Missing_jmp_buf;<br>

+        return QualType();<br>

+      } else {<br>

+        break;<br>

+      }<br>

+    }<br>

+  }<br>

+  }<br>

+<br>

+It is better to write it like this:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  case 'J':<br>

+    if (Signed) {<br>

+      Type = Context.getsigjmp_bufType();<br>

+      if (Type.isNull()) {<br>

+        Error = ASTContext::GE_Missing_sigjmp_buf;<br>

+        return QualType();<br>

+      }<br>

+    } else {<br>

+      Type = Context.getjmp_bufType();<br>

+      if (Type.isNull()) {<br>

+        Error = ASTContext::GE_Missing_jmp_buf;<br>

+        return QualType();<br>

+      }<br>

+    }<br>

+    break;<br>

+<br>

+Or better yet (in this case) as:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  case 'J':<br>

+    if (Signed)<br>

+      Type = Context.getsigjmp_bufType();<br>

+    else<br>

+      Type = Context.getjmp_bufType();<br>

+<br>

+    if (Type.isNull()) {<br>

+      Error = Signed ? ASTContext::GE_Missing_sigjmp_buf :<br>

+                       ASTContext::GE_Missing_jmp_buf;<br>

+      return QualType();<br>

+    }<br>

+    break;<br>

+<br>

+The idea is to reduce indentation and the amount of code you have to keep track<br>

+of when reading the code.<br>

+<br>

+Turn Predicate Loops into Predicate Functions<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+It is very common to write small loops that just compute a boolean value.  There<br>

+are a number of ways that people commonly write these, but an example of this<br>

+sort of thing is:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  bool FoundFoo = false;<br>

+  for (unsigned i = 0, e = BarList.size(); i != e; ++i)<br>

+    if (BarList[i]->isFoo()) {<br>

+      FoundFoo = true;<br>

+      break;<br>

+    }<br>

+<br>

+  if (FoundFoo) {<br>

+    ...<br>

+  }<br>

+<br>

+This sort of code is awkward to write, and is almost always a bad sign.  Instead<br>

+of this sort of loop, we strongly prefer to use a predicate function (which may<br>

+be `static`_) that uses `early exits`_ to compute the predicate.  We prefer the<br>

+code to be structured like this:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  /// ListContainsFoo - Return true if the specified list has an element that is<br>

+  /// a foo.<br>

+  static bool ListContainsFoo(const std::vector<Bar*> &List) {<br>

+    for (unsigned i = 0, e = List.size(); i != e; ++i)<br>

+      if (List[i]->isFoo())<br>

+        return true;<br>

+    return false;<br>

+  }<br>

+  ...<br>

+<br>

+  if (ListContainsFoo(BarList)) {<br>

+    ...<br>

+  }<br>

+<br>

+There are many reasons for doing this: it reduces indentation and factors out<br>

+code which can often be shared by other code that checks for the same predicate.<br>

+More importantly, it *forces you to pick a name* for the function, and forces<br>

+you to write a comment for it.  In this silly example, this doesn't add much<br>

+value.  However, if the condition is complex, this can make it a lot easier for<br>

+the reader to understand the code that queries for this predicate.  Instead of<br>

+being faced with the in-line details of how we check to see if the BarList<br>

+contains a foo, we can trust the function name and continue reading with better<br>

+locality.<br>

+<br>

+The Low-Level Issues<br>

+--------------------<br>

+<br>

+Name Types, Functions, Variables, and Enumerators Properly<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+Poorly-chosen names can mislead the reader and cause bugs. We cannot stress<br>

+enough how important it is to use *descriptive* names.  Pick names that match<br>

+the semantics and role of the underlying entities, within reason.  Avoid<br>

+abbreviations unless they are well known.  After picking a good name, make sure<br>

+to use consistent capitalization for the name, as inconsistency requires clients<br>

+to either memorize the APIs or to look them up to find the exact spelling.<br>

+<br>

+In general, names should be in camel case (e.g. ``TextFileReader`` and<br>

+``isLValue()``).  Different kinds of declarations have different rules:<br>

+<br>

+* **Type names** (including classes, structs, enums, typedefs, etc.) should be<br>

+  nouns and start with an upper-case letter (e.g. ``TextFileReader``).<br>

+<br>

+* **Variable names** should be nouns (as they represent state).  The name should<br>

+  be camel case, and start with an upper case letter (e.g. ``Leader`` or<br>

+  ``Boats``).<br>

+<br>

+* **Function names** should be verb phrases (as they represent actions), and<br>

+  command-like functions should be imperative.  The name should be camel case,<br>

+  and start with a lower case letter (e.g. ``openFile()`` or ``isFoo()``).<br>

+<br>

+* **Enum declarations** (e.g. ``enum Foo {...}``) are types, so they should<br>

+  follow the naming conventions for types.  A common use for enums is as a<br>

+  discriminator for a union, or an indicator of a subclass.  When an enum is<br>

+  used for something like this, it should have a ``Kind`` suffix<br>

+  (e.g. ``ValueKind``).<br>

+<br>

+* **Enumerators** (e.g. ``enum { Foo, Bar }``) and **public member variables**<br>

+  should start with an upper-case letter, just like types.  Unless the<br>

+  enumerators are defined in their own small namespace or inside a class,<br>

+  enumerators should have a prefix corresponding to the enum declaration name.<br>

+  For example, ``enum ValueKind { ... };`` may contain enumerators like<br>

+  ``VK_Argument``, ``VK_BasicBlock``, etc.  Enumerators that are just<br>

+  convenience constants are exempt from the requirement for a prefix.  For<br>

+  instance:<br>

+<br>

+  .. code-block:: c++<br>

+<br>

+      enum {<br>

+        MaxSize = 42,<br>

+        Density = 12<br>

+      };<br>

+<br>

+As an exception, classes that mimic STL classes can have member names in STL's<br>

+style of lower-case words separated by underscores (e.g. ``begin()``,<br>

+``push_back()``, and ``empty()``).<br>

+<br>

+Here are some examples of good and bad names:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  class VehicleMaker {<br>

+    ...<br>

+    Factory<Tire> F;            // Bad -- abbreviation and non-descriptive.<br>

+    Factory<Tire> Factory;      // Better.<br>

+    Factory<Tire> TireFactory;  // Even better -- if VehicleMaker has more than one<br>

+                                // kind of factory.<br>

+  };<br>

+<br>

+  Vehicle MakeVehicle(VehicleType Type) {<br>

+    VehicleMaker M;                         // Might be OK if having a short life-span.<br>

+    Tire tmp1 = M.makeTire();               // Bad -- 'tmp1' provides no information.<br>

+    Light headlight = M.makeLight("head");  // Good -- descriptive.<br>

+    ...<br>

+  }<br>

+<br>

+Assert Liberally<br>

+^^^^^^^^^^^^^^^^<br>

+<br>

+Use the "``assert``" macro to its fullest.  Check all of your preconditions and<br>

+assumptions; you never know when a bug (not necessarily even yours) might be<br>

+caught early by an assertion, which reduces debugging time dramatically.  The<br>

+"``<cassert>``" header file is probably already included by the header files you<br>

+are using, so it doesn't cost anything to use it.<br>

+<br>

+To further assist with debugging, make sure to put some kind of error message in<br>

+the assertion statement, which is printed if the assertion is tripped. This<br>

+helps the poor debugger make sense of why an assertion is being made and<br>

+enforced, and hopefully what to do about it.  Here is one complete example:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  inline Value *getOperand(unsigned i) {<br>

+    assert(i < Operands.size() && "getOperand() out of range!");<br>

+    return Operands[i];<br>

+  }<br>

+<br>

+Here are more examples:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  assert(Ty->isPointerType() && "Can't allocate a non pointer type!");<br>

+<br>

+  assert((Opcode == Shl || Opcode == Shr) && "ShiftInst Opcode invalid!");<br>

+<br>

+  assert(idx < getNumSuccessors() && "Successor # out of range!");<br>

+<br>

+  assert(V1.getType() == V2.getType() && "Constant types must be identical!");<br>

+<br>

+  assert(isa<PHINode>(Succ->front()) && "Only works on PHId BBs!");<br>

+<br>

+You get the idea.<br>

+<br>

+Please be aware that, when adding assert statements, not all compilers are aware<br>

+of the semantics of the assert.  In some places, asserts are used to indicate a<br>

+piece of code that should not be reached.  These are typically of the form:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  assert(0 && "Some helpful error message");<br>

+<br>

+When used in a function that returns a value, they should be followed with a<br>

+return statement and a comment indicating that this line is never reached.  This<br>

+will prevent a compiler which is unable to deduce that the assert statement<br>

+never returns from generating a warning.<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  assert(0 && "Some helpful error message");<br>

+  return 0;<br>

+<br>

+Another issue is that values used only by assertions will produce an "unused<br>

+value" warning when assertions are disabled.  For example, this code will warn:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  unsigned Size = V.size();<br>

+  assert(Size > 42 && "Vector smaller than it should be");<br>

+<br>

+  bool NewToSet = Myset.insert(Value);<br>

+  assert(NewToSet && "The value shouldn't be in the set yet");<br>

+<br>

+These are two interesting different cases. In the first case, the call to<br>

+``V.size()`` is only useful for the assert, and we don't want it executed when<br>

+assertions are disabled.  Code like this should move the call into the assert<br>

+itself.  In the second case, the side effects of the call must happen whether<br>

+the assert is enabled or not.  In this case, the value should be cast to void to<br>

+disable the warning.  To be specific, it is preferred to write the code like<br>

+this:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  assert(V.size() > 42 && "Vector smaller than it should be");<br>

+<br>

+  bool NewToSet = Myset.insert(Value); (void)NewToSet;<br>

+  assert(NewToSet && "The value shouldn't be in the set yet");<br>

+<br>

+Do Not Use ``using namespace std``<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+In LLVM, we prefer to explicitly prefix all identifiers from the standard<br>

+namespace with an "``std::``" prefix, rather than rely on "``using namespace<br>

+std;``".<br>

+<br>

+In header files, adding a ``'using namespace XXX'`` directive pollutes the<br>

+namespace of any source file that ``#include``\s the header.  This is clearly a<br>

+bad thing.<br>

+<br>

+In implementation files (e.g. ``.cpp`` files), the rule is more of a stylistic<br>

+rule, but is still important.  Basically, using explicit namespace prefixes<br>

+makes the code **clearer**, because it is immediately obvious what facilities<br>

+are being used and where they are coming from, and **more portable**, because<br>

+namespace clashes cannot occur between LLVM code and other namespaces.  The<br>

+portability rule is important because different standard library implementations<br>

+expose different symbols (potentially ones they shouldn't), and future revisions<br>

+to the C++ standard will add more symbols to the ``std`` namespace.  As such, we<br>

+never use ``'using namespace std;'`` in LLVM.<br>

+<br>

+The exception to the general rule (i.e. it's not an exception for the ``std``<br>

+namespace) is for implementation files.  For example, all of the code in the<br>

+LLVM project implements code that lives in the 'llvm' namespace.  As such, it is<br>

+ok, and actually clearer, for the ``.cpp`` files to have a ``'using namespace<br>

+llvm;'`` directive at the top, after the ``#include``\s.  This reduces<br>

+indentation in the body of the file for source editors that indent based on<br>

+braces, and keeps the conceptual context cleaner.  The general form of this rule<br>

+is that any ``.cpp`` file that implements code in any namespace may use that<br>

+namespace (and its parents'), but should not use any others.<br>

+<br>

+Provide a Virtual Method Anchor for Classes in Headers<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+If a class is defined in a header file and has a vtable (either it has virtual<br>

+methods or it derives from classes with virtual methods), it must always have at<br>

+least one out-of-line virtual method in the class.  Without this, the compiler<br>

+will copy the vtable and RTTI into every ``.o`` file that ``#include``\s the<br>

+header, bloating ``.o`` file sizes and increasing link times.<br>
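+<br>

+A minimal sketch of the idiom, assuming a hypothetical class ``Shape`` split<br>

+across a header and a ``.cpp`` file (both shown in one block here).  The names<br>

+``Shape``, ``anchor()``, and ``describe()`` are illustrative only.  On<br>

+toolchains that follow the Itanium C++ ABI, the vtable is emitted in the file<br>

+that defines the first out-of-line virtual method:<br>

```cpp
#include <string>

// --- Shape.h (hypothetical header) ---
class Shape {
public:
  virtual ~Shape() {}
  virtual std::string getName() const { return "shape"; }

private:
  // Declared in the class but defined out of line (see below).  Because every
  // other virtual method is inline, this is the one that pins the vtable to a
  // single object file.
  virtual void anchor();
};

// --- Shape.cpp (hypothetical implementation file) ---
// The out-of-line definition; it intentionally does nothing.
void Shape::anchor() {}

// Small helper (also hypothetical) that exercises a virtual call.
std::string describe(const Shape &S) { return S.getName(); }
```

+The empty ``anchor()`` method exists only for its placement; it is never meant<br>

+to be called.<br>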

+<br>

+Don't evaluate ``end()`` every time through a loop<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+Because C++ doesn't have a standard "``foreach``" loop (though it can be<br>

+emulated with macros and may be coming in C++0x) we end up writing a lot of<br>

+loops that manually iterate from begin to end on a variety of containers or<br>

+through other data structures.  One common mistake is to write a loop in this<br>

+style:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  BasicBlock *BB = ...<br>

+  for (BasicBlock::iterator I = BB->begin(); I != BB->end(); ++I)<br>

+    ... use I ...<br>

+<br>

+The problem with this construct is that it evaluates "``BB->end()``" every time<br>

+through the loop.  Instead of writing the loop like this, we strongly prefer<br>

+loops to be written so that they evaluate it once before the loop starts.  A<br>

+convenient way to do this is like so:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  BasicBlock *BB = ...<br>

+  for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; ++I)<br>

+    ... use I ...<br>

+<br>

+The observant may quickly point out that these two loops may have different<br>

+semantics: if the container (a basic block in this case) is being mutated, then<br>

+"``BB->end()``" may change its value every time through the loop and the second<br>

+loop may not in fact be correct.  If you actually do depend on this behavior,<br>

+please write the loop in the first form and add a comment indicating that you<br>

+did it intentionally.<br>

+<br>

+Why do we prefer the second form (when correct)?  Writing the loop in the first<br>

+form has two problems. First it may be less efficient than evaluating it at the<br>

+start of the loop.  In this case, the cost is probably minor --- a few extra<br>

+loads every time through the loop.  However, if the base expression is more<br>

+complex, then the cost can rise quickly.  I've seen loops where the end<br>

+expression was actually something like: "``SomeMap[x]->end()``" and map lookups<br>

+really aren't cheap.  By writing it in the second form consistently, you<br>

+eliminate the issue entirely and don't even have to think about it.<br>

+<br>

+The second (even bigger) issue is that writing the loop in the first form hints<br>

+to the reader that the loop is mutating the container (a fact that a comment<br>

+would handily confirm!).  If you write the loop in the second form, it is<br>

+immediately obvious without even looking at the body of the loop that the<br>

+container isn't being modified, which makes it easier to read the code and<br>

+understand what it does.<br>

+<br>

+While the second form of the loop is a few extra keystrokes, we do strongly<br>

+prefer it.<br>
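+<br>

+To make the cost difference concrete, here is a small sketch that counts how<br>

+often ``end()`` is evaluated in each form.  ``CountingList`` and the two helper<br>

+functions are hypothetical names introduced for this illustration; they are not<br>

+part of LLVM:<br>

```cpp
#include <vector>

// Hypothetical container wrapper that records every call to end().
struct CountingList {
  std::vector<int> Items;
  mutable unsigned EndCalls;

  CountingList() : EndCalls(0) {}

  std::vector<int>::const_iterator begin() const { return Items.begin(); }
  std::vector<int>::const_iterator end() const {
    ++EndCalls;
    return Items.end();
  }
};

// First form: end() is re-evaluated on every test of the loop condition.
unsigned countEndCallsFirstForm(unsigned N) {
  CountingList L;
  L.Items.assign(N, 0);
  for (std::vector<int>::const_iterator I = L.begin(); I != L.end(); ++I)
    (void)*I; // ... use I ...
  return L.EndCalls;
}

// Second form: end() is evaluated once, before the loop starts.
unsigned countEndCallsSecondForm(unsigned N) {
  CountingList L;
  L.Items.assign(N, 0);
  for (std::vector<int>::const_iterator I = L.begin(), E = L.end(); I != E; ++I)
    (void)*I; // ... use I ...
  return L.EndCalls;
}
```

+For a list of ``N`` elements the first form calls ``end()`` once per condition<br>

+check (``N + 1`` times), while the second form calls it exactly once.<br>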

+<br>

+``#include <iostream>`` is Forbidden<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+The use of ``#include <iostream>`` in library files is hereby **forbidden**,<br>

+because many common implementations transparently inject a `static constructor`_<br>

+into every translation unit that includes it.<br>

+<br>

+Note that using the other stream headers (``<sstream>`` for example) is not<br>

+problematic in this regard --- just ``<iostream>``. However, ``raw_ostream``<br>

+provides various APIs that are better performing for almost every use than<br>

+``std::ostream`` style APIs.<br>

+<br>

+.. note::<br>

+<br>

+  New code should always use `raw_ostream`_ for writing, or the<br>

+  ``llvm::MemoryBuffer`` API for reading files.<br>

+<br>

+.. _raw_ostream:<br>

+<br>

+Use ``raw_ostream``<br>

+^^^^^^^^^^^^^^^^^^^<br>

+<br>

+LLVM includes a lightweight, simple, and efficient stream implementation in<br>

+``llvm/Support/raw_ostream.h``, which provides all of the common features of<br>

+``std::ostream``.  All new code should use ``raw_ostream`` instead of<br>

+``ostream``.<br>

+<br>

+Unlike ``std::ostream``, ``raw_ostream`` is not a template and can be forward<br>

+declared as ``class raw_ostream``.  Public headers should generally not include<br>

+the ``raw_ostream`` header, but use forward declarations and constant references<br>

+to ``raw_ostream`` instances.<br>

+<br>

+Avoid ``std::endl``<br>

+^^^^^^^^^^^^^^^^^^^<br>

+<br>

+The ``std::endl`` modifier, when used with ``iostreams``, outputs a newline to<br>

+the output stream specified.  In addition to doing this, however, it also<br>

+flushes the output stream.  In other words, these are equivalent:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  std::cout << std::endl;<br>

+  std::cout << '\n' << std::flush;<br>

+<br>

+Most of the time, you probably have no reason to flush the output stream, so<br>

+it's better to use a literal ``'\n'``.<br>

+<br>

+Microscopic Details<br>

+-------------------<br>

+<br>

+This section describes preferred low-level formatting guidelines along with<br>

+reasoning on why we prefer them.<br>

+<br>

+Spaces Before Parentheses<br>

+^^^^^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+We prefer to put a space before an open parenthesis only in control flow<br>

+statements, but not in normal function call expressions and function-like<br>

+macros.  For example, this is good:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  if (x) ...<br>

+  for (i = 0; i != 100; ++i) ...<br>

+  while (llvm_rocks) ...<br>

+<br>

+  somefunc(42);<br>

+  assert(3 != 4 && "laws of math are failing me");<br>

+<br>

+  a = foo(42, 92) + bar(x);<br>

+<br>

+and this is bad:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  if(x) ...<br>

+  for(i = 0; i != 100; ++i) ...<br>

+  while(llvm_rocks) ...<br>

+<br>

+  somefunc (42);<br>

+  assert (3 != 4 && "laws of math are failing me");<br>

+<br>

+  a = foo (42, 92) + bar (x);<br>

+<br>

+The reason for doing this is not completely arbitrary.  This style makes control<br>

+flow operators stand out more, and makes expressions flow better. The function<br>

+call operator binds very tightly as a postfix operator.  Putting a space after a<br>

+function name (as in the last example) makes it appear that the code might bind<br>

+the arguments of the left-hand-side of a binary operator with the argument list<br>

+of a function and the name of the right side.  More specifically, it is easy to<br>

+misread the "``a``" example as:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  a = foo ((42, 92) + bar) (x);<br>

+<br>

+when skimming through the code.  By avoiding a space in a function, we avoid<br>

+this misinterpretation.<br>

+<br>

+Prefer Preincrement<br>

+^^^^^^^^^^^^^^^^^^^<br>

+<br>

+Hard and fast rule: Preincrement (``++X``) may be no slower than postincrement<br>

+(``X++``) and could very well be a lot faster than it.  Use preincrementation<br>

+whenever possible.<br>

+<br>

+The semantics of postincrement include making a copy of the value being<br>

+incremented, returning it, and then preincrementing the "work value".  For<br>

+primitive types, this isn't a big deal. But for iterators, it can be a huge<br>

+issue (for example, some iterators contain stack and set objects in them...<br>

+copying an iterator could invoke the copy constructors of these as well).  In general,<br>

+get in the habit of always using preincrement, and you won't have a problem.<br>
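+<br>

+The copy made by postincrement can be observed with a small sketch.<br>

+``CountingIter`` is a hypothetical iterator-like type, introduced here only to<br>

+count copy constructions; real iterators do not expose such a counter:<br>

```cpp
// Hypothetical iterator-like type that counts its copy constructions.
struct CountingIter {
  int Pos;
  static unsigned Copies;

  explicit CountingIter(int P) : Pos(P) {}
  CountingIter(const CountingIter &RHS) : Pos(RHS.Pos) { ++Copies; }

  // Preincrement: advance in place and return a reference; no copy is made.
  CountingIter &operator++() {
    ++Pos;
    return *this;
  }

  // Postincrement: must save the old value, so at least one copy is made.
  CountingIter operator++(int) {
    CountingIter Old(*this);
    ++Pos;
    return Old; // The compiler may or may not elide a second copy here.
  }
};

unsigned CountingIter::Copies = 0;

// Advance an iterator Steps times with ++I and report the copies made.
unsigned copiesForPreincrement(int Steps) {
  CountingIter::Copies = 0;
  CountingIter I(0);
  for (int S = 0; S != Steps; ++S)
    ++I;
  return CountingIter::Copies;
}

// Advance an iterator Steps times with I++ and report the copies made.
unsigned copiesForPostincrement(int Steps) {
  CountingIter::Copies = 0;
  CountingIter I(0);
  for (int S = 0; S != Steps; ++S)
    I++;
  return CountingIter::Copies;
}
```

+The preincrement loop makes no copies at all, while the postincrement loop<br>

+makes at least one copy per step even though the result is discarded.<br>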

+<br>

+<br>

+Namespace Indentation<br>

+^^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+In general, we strive to reduce indentation wherever possible.  This is useful<br>

+because we want code to `fit into 80 columns`_ without wrapping horribly, but<br>

+also because it makes it easier to understand the code.  Namespaces are a funny<br>

+thing: they are often large, and we often desire to put lots of stuff into them<br>

+(so they can be large).  Other times they are tiny, because they just hold an<br>

+enum or something similar.  In order to balance this, we use different<br>

+approaches for small versus large namespaces.<br>

+<br>

+If a namespace definition is small and *easily* fits on a screen (say, less than<br>

+35 lines of code), then you should indent its body.  Here's an example:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  namespace llvm {<br>

+    namespace X86 {<br>

+      /// RelocationType - An enum for the x86 relocation codes. Note that<br>

+      /// the terminology here doesn't follow x86 convention - word means<br>

+      /// 32-bit and dword means 64-bit.<br>

+      enum RelocationType {<br>

+        /// reloc_pcrel_word - PC relative relocation, add the relocated value to<br>

+        /// the value already in memory, after we adjust it for where the PC is.<br>

+        reloc_pcrel_word = 0,<br>

+<br>

+        /// reloc_picrel_word - PIC base relative relocation, add the relocated<br>

+        /// value to the value already in memory, after we adjust it for where the<br>

+        /// PIC base is.<br>

+        reloc_picrel_word = 1,<br>

+<br>

+        /// reloc_absolute_word, reloc_absolute_dword - Absolute relocation, just<br>

+        /// add the relocated value to the value already in memory.<br>

+        reloc_absolute_word = 2,<br>

+        reloc_absolute_dword = 3<br>

+      };<br>

+    }<br>

+  }<br>

+<br>

+Since the body is small, indenting adds value because it makes it very clear<br>

+where the namespace starts and ends, and it is easy to take the whole thing in<br>

+in one "gulp" when reading the code.  If the blob of code in the namespace is<br>

+larger (as it typically is in a header in the ``llvm`` or ``clang`` namespaces),<br>

+do not indent the code, and add a comment indicating what namespace is being<br>

+closed.  For example:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  namespace llvm {<br>

+  namespace knowledge {<br>

+<br>

+  /// Grokable - This class represents things that Smith can have an intimate<br>

+  /// understanding of and contains the data associated with it.<br>

+  class Grokable {<br>

+  ...<br>

+  public:<br>

+    explicit Grokable() { ... }<br>

+    virtual ~Grokable() = 0;<br>

+<br>

+    ...<br>

+<br>

+  };<br>

+<br>

+  } // end namespace knowledge<br>

+  } // end namespace llvm<br>

+<br>

+Because the class is large, we don't expect that the reader can easily<br>

+understand the entire concept in a glance, and the end of the file (where the<br>

+namespaces end) may be a long way away from the place they open.  As such,<br>

+indenting the contents of the namespace doesn't add any value, and detracts from<br>

+the readability of the class.  In these cases it is best to *not* indent the<br>

+contents of the namespace.<br>

+<br>

+.. _static:<br>

+<br>

+Anonymous Namespaces<br>

+^^^^^^^^^^^^^^^^^^^^<br>

+<br>

+After talking about namespaces in general, you may be wondering about anonymous<br>

+namespaces in particular.  Anonymous namespaces are a great language feature<br>

+that tells the C++ compiler that the contents of the namespace are only visible<br>

+within the current translation unit, allowing more aggressive optimization and<br>

+eliminating the possibility of symbol name collisions.  Anonymous namespaces are<br>

+to C++ as "static" is to C functions and global variables.  While "``static``"<br>

+is available in C++, anonymous namespaces are more general: they can make entire<br>

+classes private to a file.<br>

+<br>

+The problem with anonymous namespaces is that they naturally want to encourage<br>

+indentation of their body, and they reduce locality of reference: if you see a<br>

+random function definition in a C++ file, it is easy to see if it is marked<br>

+static, but seeing if it is in an anonymous namespace requires scanning a big<br>

+chunk of the file.<br>

+<br>

+Because of this, we have a simple guideline: make anonymous namespaces as small<br>

+as possible, and only use them for class declarations.  For example, this is<br>

+good:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  namespace {<br>

+    class StringSort {<br>

+    ...<br>

+    public:<br>

+      StringSort(...);<br>

+      bool operator<(const char *RHS) const;<br>

+    };<br>

+  } // end anonymous namespace<br>

+<br>

+  static void Helper() {<br>

+    ...<br>

+  }<br>

+<br>

+  bool StringSort::operator<(const char *RHS) const {<br>

+    ...<br>

+  }<br>

+<br>

+This is bad:<br>

+<br>

+.. code-block:: c++<br>

+<br>

+  namespace {<br>

+  class StringSort {<br>

+  ...<br>

+  public:<br>

+    StringSort(...);<br>

+    bool operator<(const char *RHS) const;<br>

+  };<br>

+<br>

+  void Helper() {<br>

+    ...<br>

+  }<br>

+<br>

+  bool StringSort::operator<(const char *RHS) const {<br>

+    ...<br>

+  }<br>

+<br>

+  } // end anonymous namespace<br>

+<br>

+This is bad specifically because if you're looking at "``Helper``" in the middle<br>

+of a large C++ file, you have no immediate way to tell if it is local to<br>

+the file.  When it is marked static explicitly, this is immediately obvious.<br>

+Also, there is no reason to enclose the definition of "``operator<``" in the<br>

+namespace just because it was declared there.<br>

+<br>

+See Also<br>

+========<br>

+<br>

+A lot of these comments and recommendations have been culled from other sources.<br>

+Two particularly important books for our work are:<br>

+<br>

+#. `Effective C++<br>

+   <<a href="http://www.amazon.com/Effective-Specific-Addison-Wesley-Professional-Computing/dp/0321334876" target="_blank">http://www.amazon.com/Effective-Specific-Addison-Wesley-Professional-Computing/dp/0321334876</a>>`_<br>


+   by Scott Meyers.  Also interesting and useful are "More Effective C++" and<br>

+   "Effective STL" by the same author.<br>

+<br>

+#. `Large-Scale C++ Software Design<br>

+   <<a href="http://www.amazon.com/Large-Scale-Software-Design-John-Lakos/dp/0201633620/ref=sr_1_1" target="_blank">http://www.amazon.com/Large-Scale-Software-Design-John-Lakos/dp/0201633620/ref=sr_1_1</a>>`_<br>

+   by John Lakos.<br>

+<br>

+If you get some free time, and you haven't read them: do so, you might learn<br>

+something.<br>

<br>

Modified: llvm/trunk/docs/development_process.rst<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/development_process.rst?rev=158786&r1=158785&r2=158786&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/development_process.rst?rev=158786&r1=158785&r2=158786&view=diff</a><br>


==============================================================================<br>

--- llvm/trunk/docs/development_process.rst (original)<br>

+++ llvm/trunk/docs/development_process.rst Tue Jun 19 21:57:56 2012<br>

@@ -7,6 +7,7 @@<br>

    :hidden:<br>

<br>

    Projects<br>

+   CodingStandards<br>

<br>

 \<br>

<br>

@@ -17,6 +18,12 @@<br>

    tree) allow the project code to be located outside (or inside) the ``llvm/``<br>

    tree, while using LLVM header files and libraries.<br>

<br>

+ * :ref:`coding_standards`<br>

+<br>

+   Describes a few coding standards that are used in the LLVM source tree. All<br>

+   code submissions must follow the coding standards before being allowed into<br>

+   the source tree.<br>

+<br>

  * `LLVMBuild Documentation <LLVMBuild.html>`_<br>

<br>

    Describes the LLVMBuild organization and files used by LLVM to specify<br>

<br>

<br>


</blockquote></div><br></div>