[LLVMbugs] [Bug 402] NEW: Bytecode Format Enhancements Needed

bugzilla-daemon at cs.uiuc.edu
Wed Jul 7 08:02:12 PDT 2004


http://llvm.cs.uiuc.edu/bugs/show_bug.cgi?id=402

           Summary: Bytecode Format Enhancements Needed
           Product: libraries
           Version: trunk
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Bytecode Writer
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: rspencer at x10sys.com


This bug is just to capture some enhancements to the bytecode format that 
we're planning for 1.3 so that we cover as many changes as possible in release 
1.3 and not disrupt users again in 1.4.

Encode Types As 24-bit Quantities
=================================
We need a new primitive, uint24_vbr, to encode types. Because of the use of 
bit fields for global variables and elsewhere, types are currently not fully 
32-bit quantities (see bug 392). The recommended plan is to always encode 
types into 24-bit fields, but provide for extension by using the value (2^24)-1
as an indicator that what follows is a uint64_vbr that contains the type. It 
is unlikely that many, if any, bytecode files will need more than 16 million 
distinct types.
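
The escape scheme above can be sketched as follows. This is only an 
illustration, not the planned implementation: the helper names (writeVBR, 
readVBR, writeTypeID, readTypeID) are hypothetical, and the VBR layout is 
assumed to be 7 payload bits per byte, low-order group first, with the high 
bit marking continuation. Note that a type whose number is exactly (2^24)-1 
also has to go through the escape path, since that value is reserved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sentinel from the proposal: (2^24)-1 means "a uint64_vbr follows".
constexpr uint64_t kTypeEscape = (1u << 24) - 1;

// Assumed VBR layout: 7 payload bits per byte, low-order group first,
// high bit set on every byte except the last.
void writeVBR(std::vector<uint8_t> &Out, uint64_t Value) {
  while (Value >= 0x80) {
    Out.push_back(static_cast<uint8_t>((Value & 0x7F) | 0x80));
    Value >>= 7;
  }
  Out.push_back(static_cast<uint8_t>(Value));
}

// Decode one VBR value starting at Pos and advance Pos past it.
uint64_t readVBR(const std::vector<uint8_t> &In, std::size_t &Pos) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  while (true) {
    assert(Pos < In.size() && "truncated VBR value");
    assert(Shift < 64 && "VBR value too long");
    uint8_t Byte = In[Pos++];
    Value |= static_cast<uint64_t>(Byte & 0x7F) << Shift;
    if ((Byte & 0x80) == 0)
      return Value;
    Shift += 7;
  }
}

// Hypothetical uint24_vbr writer: small type numbers are written directly;
// anything at or above the sentinel is written as sentinel + uint64_vbr.
void writeTypeID(std::vector<uint8_t> &Out, uint64_t TypeID) {
  if (TypeID < kTypeEscape) {
    writeVBR(Out, TypeID);
  } else {
    writeVBR(Out, kTypeEscape);
    writeVBR(Out, TypeID);
  }
}

// Matching reader: the sentinel tells us the real value follows.
uint64_t readTypeID(const std::vector<uint8_t> &In, std::size_t &Pos) {
  uint64_t Value = readVBR(In, Pos);
  if (Value != kTypeEscape)
    return Value;
  return readVBR(In, Pos);
}
```

With this layout a small type number such as 5 costs one byte, and only 
files that actually exceed the 24-bit range pay for the escape.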

VBRize the Block Headers
========================
While block headers are currently only 8 bytes, in very small files (say, 
containing a few types), their overhead becomes quite large. We can skip the 
alignment of these fields (possibly saving a few bytes) and pack both the block 
type and block length into a single uint32_vbr. This will provide 8 bits for 
the block type (it's doubtful we'll ever need more than 256 block types) and 
24 bits for the block length. Similarly, it's doubtful we'd ever need a 
single block longer than 16 MBytes.
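
A rough sketch of such a packed header is below. The proposal does not say 
which end of the word holds the block type, so the layout here (type in the 
low 8 bits, length in the upper 24) is an assumption, as are all of the names. 
It was picked because short blocks then produce small integers, which VBR 
encodes in fewer bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Largest length that fits in the 24-bit field (just under 16 MBytes).
constexpr uint32_t kMaxBlockLength = (1u << 24) - 1;

struct BlockHeader {
  uint8_t Type;
  uint32_t Length;
};

// Assumed layout: length in the upper 24 bits, type in the low 8 bits.
uint32_t packBlockHeader(uint8_t BlockType, uint32_t Length) {
  assert(Length <= kMaxBlockLength && "block too long for a 24-bit length");
  return (Length << 8) | BlockType;
}

BlockHeader unpackBlockHeader(uint32_t Packed) {
  return BlockHeader{static_cast<uint8_t>(Packed & 0xFF), Packed >> 8};
}

// Assumed VBR layout: 7 payload bits per byte, low-order group first,
// high bit set on every byte except the last.
void writeVBR32(std::vector<uint8_t> &Out, uint32_t Value) {
  while (Value >= 0x80) {
    Out.push_back(static_cast<uint8_t>((Value & 0x7F) | 0x80));
    Value >>= 7;
  }
  Out.push_back(static_cast<uint8_t>(Value));
}
```

Under these assumptions a block of type 1 and length 10 packs to 2561 and 
takes two bytes on disk instead of eight; the worst case (a full 32-bit 
value) takes five.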

CDRize Binary Data Content
==========================
We should use a standard for representing various binary quantities in the 
bytecode file. Integers are pretty much handled by VBR. However, float and 
double types should be regularized to IEEE format and written according to a 
cross-platform standard such as CDR (CORBA), NDR (DCE RPC), or XDR (Sun RPC). 
CDR is the most modern but has its shortcomings. There might be other 
applicable standards too. Strings should be regularized to a standard format 
as well.
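
To make the floating-point part concrete, here is a minimal sketch of writing 
a double as its IEEE 754 bit pattern in a fixed byte order, independent of 
the host. It does not implement CDR, NDR, or XDR; the function names are 
hypothetical, and little-endian order is chosen arbitrarily for the example 
(XDR, for instance, specifies big-endian). It also assumes the host stores 
doubles in IEEE 754 format with the same byte order it uses for integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <vector>

// Assumption: the host double is a 64-bit IEEE 754 value.
static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
              "this sketch assumes 64-bit IEEE 754 doubles");

// Write the IEEE 754 bit pattern least-significant byte first, regardless
// of host byte order.
void writeDouble(std::vector<uint8_t> &Out, double Value) {
  uint64_t Bits;
  std::memcpy(&Bits, &Value, sizeof Bits);
  for (unsigned I = 0; I < 8; ++I)
    Out.push_back(static_cast<uint8_t>(Bits >> (8 * I)));
}

// Read back eight bytes written by writeDouble, starting at Pos.
double readDouble(const std::vector<uint8_t> &In, std::size_t Pos) {
  assert(Pos + 8 <= In.size() && "truncated double");
  uint64_t Bits = 0;
  for (unsigned I = 0; I < 8; ++I)
    Bits |= static_cast<uint64_t>(In[Pos + I]) << (8 * I);
  double Value;
  std::memcpy(&Value, &Bits, sizeof Value);
  return Value;
}
```

Going through the integer bit pattern rather than dumping the in-memory bytes 
is what makes the file contents identical on big-endian and little-endian 
hosts.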


