[LLVMdev] Bitcode format
Joshua Haberman
joshua at reverberate.org
Mon Sep 3 14:34:53 PDT 2007
Greetings,
I am working on a project (unrelated to LLVM) that needed a
bytecode-like format. I found Bitcode and it seems to fit the bill
really nicely.
I am writing an independent implementation of Bitcode in C (I really
want to keep my runtime pure C). In the process of doing this, I've
discovered a few things I want to ask about.
I notice the Bitcode format documentation [0] is somewhat incomplete --
there have been a few questions I had to resolve by looking at LLVM
source code. This is totally understandable (I'm actually impressed
that it's documented so well for something so new), but I wonder if you
would accept a patch that clarified some things. For example, the
document doesn't mention endianness at all, but from the source code I
discover that bytes are read in order and bits are read
least-significant first.
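
A minimal sketch of that bit order, assuming an in-memory buffer. The names `bit_reader` and `read_fixed` are made up for illustration and are not part of LLVM's API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state; these names are illustrative, not LLVM's. */
typedef struct {
    const uint8_t *data; /* bytes, consumed in file order */
    size_t len;          /* number of bytes in data */
    size_t bitpos;       /* absolute offset of the next unread bit */
} bit_reader;

/* Read a fixed-width field of 1..32 bits, least-significant bit first.
 * Returns 0 on success, -1 on a bad width or a read past the end. */
static int read_fixed(bit_reader *r, unsigned width, uint32_t *out)
{
    uint32_t value = 0;
    unsigned i;

    assert(r != NULL && out != NULL);
    if (width == 0 || width > 32)
        return -1;
    if (width > r->len * 8 - r->bitpos)
        return -1;

    for (i = 0; i < width; i++) {
        size_t byte = r->bitpos >> 3;              /* bytes in order */
        unsigned bit = (unsigned)(r->bitpos & 7u); /* bit 0 of each byte first */
        value |= (uint32_t)((r->data[byte] >> bit) & 1u) << i;
        r->bitpos++;
    }
    *out = value;
    return 0;
}
```

With the two bytes 0xB5 0x01 as input, reading a 3-bit field and then a 5-bit field yields 5 and then 22: the first field comes from the low bits of the first byte, the second from its high bits.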
I also have a few questions about the format:
- it appears that the only magic number in the file is
application-specific. This seems unfortunate, because it means that
application-neutral tools cannot be built that process bitcode files,
since they could not reliably detect that the file is a bitcode file.
It might seem like there is little room for application-neutral tools
since almost all the data in the file is application-specific, but off
the top of my head I can think of a few, like a tool to suggest
abbreviations that would give a file better compression.
- the LLVM code assumes that several VBR fields can be at most 32 bits
(block ids, number of elements in an array, etc). These assumptions
seem quite reasonable: can they be considered part of the format and
added to the document?
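
To make the 32-bit question concrete, here is a rough sketch of a VBR decoder that enforces such a limit. The chunk layout (high bit of each chunk is a continuation flag, remaining bits are payload, least-significant chunk first) follows the Bitcode format document. Rejecting values that do not fit in 32 bits is this sketch's own policy, not a statement about what the LLVM reader does, and `vbr_reader`, `vbr_take_bits` and `read_vbr32` are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state; not LLVM's API. */
typedef struct {
    const uint8_t *data;
    size_t len;    /* bytes in data */
    size_t bitpos; /* absolute offset of the next unread bit */
} vbr_reader;

/* Read `width` bits (1..32), least-significant bit first.
 * Returns -1 if the read would run past the end of the buffer. */
static int vbr_take_bits(vbr_reader *r, unsigned width, uint32_t *out)
{
    uint32_t value = 0;
    unsigned i;

    if (width > r->len * 8 - r->bitpos)
        return -1;
    for (i = 0; i < width; i++, r->bitpos++)
        value |= (uint32_t)((r->data[r->bitpos >> 3] >> (r->bitpos & 7u)) & 1u) << i;
    *out = value;
    return 0;
}

/* Decode one VBR field made of `width`-bit chunks (2..32).
 * Returns 0 on success, -1 on truncated input or if the value
 * does not fit in 32 bits (an assumed policy, see above). */
static int read_vbr32(vbr_reader *r, unsigned width, uint32_t *out)
{
    uint64_t value = 0;
    unsigned shift = 0;
    uint32_t chunk, high_bit;

    assert(r != NULL && out != NULL);
    if (width < 2 || width > 32)
        return -1;
    high_bit = (uint32_t)1 << (width - 1);

    for (;;) {
        if (vbr_take_bits(r, width, &chunk) != 0)
            return -1; /* ran out of input mid-field */
        if (shift >= 32)
            return -1; /* chunk lies entirely above bit 31 */
        value |= (uint64_t)(chunk & (high_bit - 1)) << shift;
        if (value > UINT32_MAX)
            return -1; /* payload spills past 32 bits */
        if ((chunk & high_bit) == 0)
            break;     /* continuation flag clear: last chunk */
        shift += width - 1;
    }
    *out = (uint32_t)value;
    return 0;
}
```

For example, the single byte 0x3B decodes as a vbr4 field to 27: the low nibble 1011 carries payload 3 with the continuation flag set, and the high nibble 0011 carries payload 3 shifted up by three bits.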
Cheers,
Josh
[0] http://llvm.org/releases/2.0/docs/BitCodeFormat.html