[llvm-commits] CVS: llvm/docs/BytecodeFormat.html

Tue May 25 10:51:02 PDT 2004

Changes in directory llvm/docs:

BytecodeFormat.html updated: 1.6 -> 1.7

---
Log message:

Added a bit on slot numbers.


---
Diffs of the changes:  (+39 -1)

Index: llvm/docs/BytecodeFormat.html
diff -u llvm/docs/BytecodeFormat.html:1.6 llvm/docs/BytecodeFormat.html:1.7

--- llvm/docs/BytecodeFormat.html:1.6	Mon May 24 00:35:17 2004
+++ llvm/docs/BytecodeFormat.html	Tue May 25 10:47:57 2004
@@ -19,6 +19,7 @@
       <li><a href="#blocks">Blocks</a></li>
       <li><a href="#lists">Lists</a></li>
       <li><a href="#fields">Fields</a></li>
+      <li><a href="#slots">Slots</a></li>
       <li><a href="#encoding">Encoding Rules</a></li>
       <li><a href="#align">Alignment</a></li>
     </ol>
@@ -120,6 +121,43 @@
 written and how the bits are to be interpreted.</p>
 </div>
 <!-- _______________________________________________________________________ -->
+<div class="doc_subsection"><a name="slots">Slots</a> </div>
+<div class="doc_text">
+<p>The bytecode format uses the notion of a "slot" to reference Types and
+Values. Since the bytecode file is a <em>direct</em> representation of LLVM's
+intermediate representation, there is a need to represent pointers in the file.
+Slots are used for this purpose. For example, if one has the following assembly:
+</p>
+<pre><code>
+  %MyType = type { int, sbyte };
+  %MyVar = external global %MyType ;
+</code></pre>
+<p>there are two definitions. The definition of %MyVar refers to %MyType. In
+the C++ IR this link between %MyVar and %MyType is made explicitly with C++
+pointers. In bytecode, however, memory addresses cannot be stored. Instead,
+we compute and write out a slot number for every Type and Value written to
+the file.</p>
+<p>A slot number is simply an unsigned 32-bit integer encoded in the variable
+bit rate scheme (see <a href="#encoding">encoding</a> below). This ensures that
+low slot numbers are encoded in one byte. LLVM uses several techniques to keep
+slot numbers low. The first is to associate slot numbers with their "type
+plane". That is, Values of the same type are written to the bytecode file
+sequentially in a list, and a Value's position in that list determines its slot
+number. This means that slot #1 doesn't mean anything unless you also specify
+the type for which you want slot #1. Types are handled specially: they are
+always written to the file first (in the Global Type Pool), in such a way that
+both forward and backward references between types can be resolved with a
+single pass through the type pool.</p>
+<p>Slot numbers are also kept small by rearranging their order. Because of the
+structure of LLVM, certain values are much more likely to be used frequently
+in the body of a function. For this reason, a compaction table is provided in
+the body of a function if its use would make the function body smaller. 
+Suppose you have a function body that uses just the types "int*" and "{double}"
+but uses them thousands of times. It's worthwhile to ensure that the slot
+numbers for these types are low so they can be encoded in a single byte (via
+vbr). This is exactly what the compaction table does.</p>
+</div>
+<!-- _______________________________________________________________________ -->
 <div class="doc_subsection"><a name="encoding">Encoding Primitives</a> </div>
 <div class="doc_text">
 <p>Each field that can be put out is encoded into the file using a small set 
@@ -471,7 +509,7 @@
   <a href="mailto:rspencer at x10sys.com">Reid Spencer</a> and 
   <a href="mailto:sabre at nondot.org">Chris Lattner</a><br>
   <a href="http://llvm.cs.uiuc.edu">The LLVM Compiler Infrastructure</a><br>
-  Last modified: $Date: 2004/05/24 05:35:17 $
+  Last modified: $Date: 2004/05/25 15:47:57 $
 </address>
 </body>
 </html>
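
---
Illustrative sketch (not part of the commit):

The per-type slot numbering and the one-byte encoding of low slot numbers
described in the new Slots section can be sketched roughly as below. This is a
minimal sketch in Python, not the LLVM writer itself. The names encode_vbr and
assign_slots are hypothetical. The byte layout (7 payload bits per byte, low
bits first, high bit set when more bytes follow) and the choice to start slot
numbers at 0 are assumptions; the text above only states that slot numbers are
unsigned 32-bit integers, that low ones fit in one byte, and that a Value's
order within its type plane determines its slot.

```python
def encode_vbr(value):
    """Encode an unsigned 32-bit integer in a variable number of bytes.

    Assumed layout: 7 payload bits per byte, least significant bits first,
    with the high bit set on every byte except the last.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("slot numbers are unsigned 32-bit integers")
    out = bytearray()
    while value > 0x7F:
        out.append(0x80 | (value & 0x7F))  # more bytes follow
        value >>= 7
    out.append(value)  # final byte, high bit clear
    return bytes(out)


def assign_slots(values):
    """Assign slot numbers per type plane.

    `values` is an iterable of (type_name, value_name) pairs in the order
    they would be written. Returns {value_name: (type_name, slot)}, where
    the slot is the value's position within its own type's list
    (numbering from 0 is an assumption of this sketch).
    """
    planes = {}  # type_name -> list of value names in that plane
    slots = {}
    for type_name, value_name in values:
        plane = planes.setdefault(type_name, [])
        slots[value_name] = (type_name, len(plane))
        plane.append(value_name)
    return slots
```

Under these assumptions, two values of different types can share the same slot
number, which is why a slot number is only meaningful together with its type,
and any slot number up to 127 occupies a single byte.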