[LLVMbugs] [Bug 402] NEW: Bytecode Format Enhancements Needed
bugzilla-daemon at cs.uiuc.edu
Wed Jul 7 08:02:12 PDT 2004
http://llvm.cs.uiuc.edu/bugs/show_bug.cgi?id=402
Summary: Bytecode Format Enhancements Needed
Product: libraries
Version: trunk
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: Bytecode Writer
AssignedTo: unassignedbugs at nondot.org
ReportedBy: rspencer at x10sys.com
This bug is just to capture some enhancements to the bytecode format that
we're planning for 1.3 so that we cover as many changes as possible in release
1.3 and not disrupt users again in 1.4.
Encode Types As 24-bit Quantities.
==================================
We need a new primitive, uint24_vbr, to encode types. Because of the use of
bit fields for global variables and elsewhere, types are currently not fully
32-bit quantities (see bug 392). The recommended plan is to always encode
types into 24-bit fields, but provide for extension by using the value (2^24)-1
as an indicator that what follows is a uint64_vbr that contains the type. It
is unlikely that many, if any, bytecode files will need more than 16 million
distinct types.
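
As an illustration only, the escape scheme above could be sketched as follows.
The VBR layout used here (7 data bits per byte, low-order group first, high bit
set when more bytes follow) and the helper names write_vbr, read_vbr,
write_type_id and read_type_id are assumptions made for this sketch, not the
bytecode writer's actual API.

```python
ESCAPE = (1 << 24) - 1  # (2^24)-1 signals that a uint64_vbr follows


def write_vbr(value):
    """Encode an unsigned integer as VBR bytes (assumed layout: 7 data bits
    per byte, low-order group first, high bit set when more bytes follow)."""
    if value < 0:
        raise ValueError("VBR encodes unsigned values only")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def read_vbr(buf, pos=0):
    """Decode one VBR value starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def write_type_id(type_id):
    """Write a type as a uint24_vbr, escaping to a uint64_vbr when needed."""
    if not 0 <= type_id < (1 << 64):
        raise ValueError("type id out of range")
    if type_id < ESCAPE:
        return write_vbr(type_id)
    # The escape value itself, and anything larger, takes the long form.
    return write_vbr(ESCAPE) + write_vbr(type_id)


def read_type_id(buf, pos=0):
    """Read a type written by write_type_id; return (type_id, next_pos)."""
    value, pos = read_vbr(buf, pos)
    if value == ESCAPE:
        value, pos = read_vbr(buf, pos)
    return value, pos
```

Under these assumptions a small type number still costs a single byte, and
only files with more than roughly 16 million types pay for the escape prefix.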
VBRize the Block Headers
========================
While block headers are only 8 bytes currently, in very small files (say
containing a few types), their overhead becomes quite large. We can skip the
alignment of these fields (possibly saving a few bytes) and pack both the block
type and block length into a single uint32_vbr. This will provide 8 bits for
the block type (it's doubtful we'll ever need more than 256 block types) and
24 bits for the block length. Similarly, it's doubtful we'd ever need a
single block longer than 16 MB.
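
A rough sketch of the packed header, for illustration. The bit placement (block
type in the low 8 bits, length in the upper 24) is an assumption chosen so that
short blocks produce small values and therefore short VBR encodings; the report
does not fix the layout. The VBR byte layout and the names pack_block_header
and unpack_block_header are likewise hypothetical.

```python
def write_vbr(value):
    """Assumed VBR layout: 7 data bits per byte, low-order group first,
    high bit set when more bytes follow."""
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def read_vbr(buf, pos=0):
    """Decode one VBR value starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def pack_block_header(block_type, length):
    """Pack an 8-bit block type and a 24-bit length into one uint32_vbr."""
    if not 0 <= block_type < (1 << 8):
        raise ValueError("block type must fit in 8 bits")
    if not 0 <= length < (1 << 24):
        raise ValueError("block length must fit in 24 bits")
    # Assumed placement: type in the low byte, length above it.
    return write_vbr((length << 8) | block_type)


def unpack_block_header(buf, pos=0):
    """Return (block_type, length, next_pos) for a packed header."""
    word, pos = read_vbr(buf, pos)
    return word & 0xFF, word >> 8, pos
```

With this placement, a 10-byte block of type 1 needs a 2-byte header instead of
8, and the worst case (type 255, length (2^24)-1) needs 5 bytes.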
CDRize Binary Data Content
==========================
We should use a standard for representing various binary quantities in the
bytecode file. Integers are pretty much handled by VBR. However, float and
double types should be regularized to IEEE format and written according to a
cross-platform standard such as CDR (CORBA), NDR (DCE RPC), or XDR (Sun RPC). CDR is
the most modern but has its shortcomings. There might be other applicable
standards too. Strings should be regularized to a standard format as well.
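
To make the idea concrete, here is a minimal sketch of what an XDR-style
encoding would look like: IEEE 754 floats and doubles in big-endian byte order,
and strings as a 4-byte big-endian length followed by the bytes, zero-padded to
a multiple of four. XDR is used only as one of the candidates named above, not
as the chosen standard; the function names and the use of UTF-8 for string
bytes are assumptions of this sketch.

```python
import struct


def write_float(x):
    """IEEE 754 single precision, big-endian (XDR-style)."""
    return struct.pack(">f", x)


def write_double(x):
    """IEEE 754 double precision, big-endian (XDR-style)."""
    return struct.pack(">d", x)


def read_double(buf, pos=0):
    """Return (value, next_pos) for a double written by write_double."""
    (value,) = struct.unpack_from(">d", buf, pos)
    return value, pos + 8


def write_string(s):
    """4-byte big-endian length, then the bytes, zero-padded to 4 bytes."""
    data = s.encode("utf-8")  # assumption: strings are stored as UTF-8
    padding = -len(data) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def read_string(buf, pos=0):
    """Return (string, next_pos) for a string written by write_string."""
    (length,) = struct.unpack_from(">I", buf, pos)
    start = pos + 4
    end = start + length
    text = buf[start:end].decode("utf-8")
    return text, end + (-length % 4)
```

Because the byte order is fixed by the format rather than by the host, a file
written on one platform reads back identically on another.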