[LLVMdev] Integer questions

OvermindDL1 overminddl1 at gmail.com
Fri Sep 5 12:42:21 PDT 2008


First off, most of what I know about the integer representation in
LLVM comes from http://llvm.org/docs/LangRef.html#t_integer, and I
could use a few points cleared up.

First, I assume that smaller integer sizes, say i1 (boolean), are
widened to a full storage unit on the CPU being compiled for (so 8
bits, 32 bits, or whatever).
Suppose someone declares an i4, compiles on 32- or 64-bit
Windows/nix/BSD on a standard x86 or x64 system, and sets the value
to 15 (the maximum of an unsigned i4).  If the type is rounded up to
the next native size (i8, i32, or what have you), what happens when 1
is added to that 15: is it represented in memory as 16, or, if all
but the low 4 bits are ignored, as zero?  Is the range of a given
integer enforced at all?  In other words, regardless of architecture,
is an i4 constrained to 0 through 15, or is that the realm of the
language to enforce?  I would expect the latter, since enforcing it
at the LLVM level would add noticeable overhead for non-machine-size
integers; but if it is left to the language to deal with, how can the
language be certain which widths are directly appropriate for the
architecture it is being compiled for?
As a quick guess, I would say that something like an i4 is compiled
as if it were an i8 and treated identically to an i8 in all
circumstances.  Is this correct?
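To make the question concrete, here is the kind of IR I have in mind
(my own sketch, not taken from the docs):

    define i4 @add_i4(i4 %a, i4 %b) {
      ; if %a is 15 and %b is 1, does %sum end up as 16 in whatever
      ; storage i4 gets widened to, or does it wrap around to 0?
      %sum = add i4 %a, %b
      ret i4 %sum
    }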

Second, what if the specified integer size is rather large, say
i512?  Would that cause a compile error (something along the lines of
the specified integer size being too large to fit the machine
architecture), or would LLVM instead compile in the code needed to do
bignum math on it?  Or is that again the realm of the language
designer, although having it at the LLVM level would also make sense;
after all, what knows best how to compile something for speed on the
target system other than the compiler itself?
As a quick guess, I would say that specifying an integer bit size too
large for the machine would cause a compile error, but the docs do
not hint at that (especially given the example of i1942652, "a really
big integer of over 1 million bits").  Is this correct?
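Again, to make it concrete, this is the sort of thing I am asking
about (my own sketch):

    define i512 @add_i512(i512 %a, i512 %b) {
      ; is this rejected on a 32- or 64-bit target, or does the
      ; backend emit multi-word (bignum-style) code for it?
      %sum = add i512 %a, %b
      ret i512 %sum
    }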

Third, assuming either or both of the above have to be
enforced/implemented by the language designer, what is the best way
for the language to ask LLVM what the appropriate machine integer
sizes are?  That way, if an i64 is specified, the language would do
bignum math on a 32-bit compile but use a native int on a 64-bit
compile.  The reason I ask this instead of just testing the CPU width
(32-bit, 64-bit, whatever) is that some processors allow double-width
integers, so a 64-bit integer on some 32-bit CPUs is just fine, as is
a 128-bit int on a 64-bit CPU.  So how can I query what the best
appropriate integer sizes are?
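For example (my own sketch, and I may be looking in the wrong place
entirely), a module's target data layout string encodes the pointer
size, which seems like one thing a front end could consult:

    ; e = little endian; p:64:64:64 = 64-bit pointers with 64-bit
    ; alignment.  Presumably a front end could read this (or query
    ; whatever interface produces it) to pick its native word size.
    target datalayout = "e-p:64:64:64"

Is that the intended mechanism, or is there a proper API for this?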

Some background on the questions: I am making a JIT'd,
speed-critical 'scripting language' for a certain app of mine.  Its
integer types have a bit-size part, like how LLVM does it: i4/s4 is a
signed integer of 4 bits, u4 is an unsigned integer of 4 bits, etc.


