[LLVMdev] The definition of getTypeSize

Mon Oct 8 12:04:34 PDT 2007

On Oct 7, 2007, at 11:23 AM, Duncan Sands wrote:

> Now that I'm working on codegen support for arbitrary precision
> integers (think i36 or i129), I've hit the problem of what
> getTypeSize and friends should return for such integers (the
> current implementations seem to me to be wrong).  However it's
> not clear to me how getTypeSize and friends are defined.
>
> There seem to be several possible meanings for the size of a type
> (only talking about primitive types here):
>
> (1) The minimum number of bits needed to hold all values of the type.
> (2) The minimum number of bits read by a load (maybe a better way of
> saying this: if you load a value and store it somewhere else, how many
> bits are correctly copied?)
> (3) The maximum number of bits that may be overwritten by a store.
> (4) The amount of memory allocated by an alloca or malloc for a
> variable of this type.
> (5) The spacing between successive variables of this type in an
> array or struct.
>
> For example, take i36.  For this type (1) is 36; (2) is also 36
> (a load typically expands to two 32 bit loads, but any bits beyond 36
> are discarded); (3) is 64 given my current implementation (a store writes
> two 32 bit values; bits beyond bit 36 hold some rubbish which overwrites
> whatever was originally at that memory location); (4) needs to be at
> least 64; (5) will also be 64.

Why is (5) 64? Can it be 40?

Should (4) be the same as (5) since alloca / malloc are allocating an  
array of the specific type?
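
To make the trade-off behind the "can it be 40?" question concrete, here
is a small sketch (not LLVM code). It assumes the store scheme described
above, where an i36 store writes two 32-bit words (8 bytes), and checks
whether storing to element 0 of an array touches element 1 for a given
stride. The function name and the 0xAA marker byte are made up for
illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model, not an LLVM API: returns true if an 8-byte store to
// element 0 modifies the first byte of element 1, given the array stride.
inline bool storeClobbersNextElement(std::size_t strideBytes) {
  std::array<unsigned char, 16> memory{};
  memory.fill(0xAA);  // marker pattern standing in for existing data

  // An i36 value split into two 32-bit words: low 32 bits, then the
  // remaining 4 bits plus 28 bits of padding.
  const std::uint32_t words[2] = {0x12345678u, 0x0000000Fu};
  std::memcpy(memory.data(), words, sizeof(words));  // the store writes 8 bytes

  // Element 1 begins at offset strideBytes; only inspect it if the store
  // could have reached that far.
  return strideBytes < sizeof(words) && memory[strideBytes] != 0xAA;
}
```

With a 5-byte (40-bit) stride the store spills into the next element, so
40 would only be safe if stores were narrowed to 5 bytes; with an 8-byte
stride the neighbour is untouched.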

>
> In general (1) and (2) will be the same.  (4) needs to be at least
> as big as (3).  (5) needs to be at least as big as (4).

Do you really need all these "sizes"? What about just "size in bits",
"storage size in bits", and "abi size"? The first is the exact size
of the type (e.g. 36); the second is the size rounded up to some
natural boundary for load / store (e.g. 64); the last one is the size
including alignment padding when it's part of a larger object (e.g.
40?).
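
As a rough sketch of those three quantities (all names here are
hypothetical, not existing LLVM functions), each can be derived from the
bit width plus two target parameters: an assumed load/store unit and an
assumed ABI alignment. The values 32 and 8 used in the note below are
assumptions chosen to reproduce the 36 / 64 / 40 numbers, not something
the target description is known to say.

```cpp
#include <cassert>
#include <cstdint>

// Round value up to the next multiple of `multiple` (which must be > 0).
inline std::uint64_t roundUpTo(std::uint64_t value, std::uint64_t multiple) {
  assert(multiple > 0 && "rounding unit must be positive");
  return (value + multiple - 1) / multiple * multiple;
}

// Hypothetical bundle of the three sizes discussed above, all in bits.
struct IntSizes {
  std::uint64_t sizeInBits;         // exact width of the type
  std::uint64_t storageSizeInBits;  // bits touched by a load or store
  std::uint64_t abiSizeInBits;      // spacing inside an array or struct
};

// storeUnitBits: assumed width of the loads/stores the target emits.
// abiAlignBits:  assumed ABI alignment of the type.
inline IntSizes sizesFor(std::uint64_t widthBits,
                         std::uint64_t storeUnitBits,
                         std::uint64_t abiAlignBits) {
  IntSizes s;
  s.sizeInBits = widthBits;
  s.storageSizeInBits = roundUpTo(widthBits, storeUnitBits);
  s.abiSizeInBits = roundUpTo(widthBits, abiAlignBits);
  return s;
}
```

Under those assumptions sizesFor(36, 32, 8) gives 36, 64 and 40. If the
ABI alignment were 32 bits instead, the abi size would come out as 64,
which is the value given for (5) in the quoted message.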

>
> Another example is 80-bit floating point types.  Here (1), (2)
> and (3) are presumably 80 bits.  On my machine (5) is 96 bits.
> I'm not sure what (4) is, presumably 80 or 96.
>
> Which (if any) of these should getTypeSize, getABITypeSize,
> getTypeSizeInBits and getABITypeSizeInBits correspond to?

TypeSize == "real size", ABITypeSize == "abi size". You will need  
another pair for the storage size?

>
> It seems clear that getTypeSizeInBits corresponds to (1) and (2), as
> shown by it returning 36 for i36.  This is like gcc's TYPE_PRECISION,
> and is a useful concept - but I think the name should be changed, since
> right now it implicitly suggests it returns 8*getTypeSize.  If no one
> objects, I will rename it to getBitsUsedByType.

Isn't it the other way around? Type information should be specified  
in bits, not in bytes. So getTypeSizeInBits returns the exact size in  
bits. I don't see how the new name is any clearer. I actually prefer  
the current name.

>
> Currently getTypeSize doesn't seem to correspond to any of these
> possibilities, at least for APInts: the current implementation returns
> the APInt bitwidth rounded up to a multiple of the alignment.  That makes
> it sound like it's trying to be (5).  I think getTypeSize should be
> defined to be (3), the maximum number of bits that may be overwritten by
> a store [except that it's in bytes].  This means changing the
> implementation for APInts, but not for other types.

To me getTypeSize is getTypeSizeInBits divided by 8 and rounded up. I
think renaming it to getTypeSizeInBytes makes sense.
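
In other words, something along these lines (a minimal sketch; the
function below is a hypothetical stand-in, not the actual LLVM
implementation):

```cpp
#include <cstdint>

// Hypothetical helper: number of whole bytes needed to hold `bits` bits,
// i.e. the bit size divided by 8 and rounded up.
inline std::uint64_t typeSizeInBytes(std::uint64_t bits) {
  return (bits + 7) / 8;
}
```

Under this reading i36 takes 5 bytes, i129 takes 17 bytes, and an 80-bit
float takes 10 bytes, with no alignment padding included.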

>
> Clearly getABITypeSize corresponds to (5) [except that it's in bytes].
> This corresponds to gcc's TYPE_SIZE.

Right.

>
> Currently getABITypeSizeInBits returns 36 for i36, and otherwise
> 8*getABITypeSize.  It seems to me that this is clearly wrong for APInts,
> and that it should always return 8*getABITypeSize (in which case it can
> be eliminated).  If no one objects, I will delete it as redundant.

I'd suggest not deleting anything for now. Let these evolve and
see what really makes sense after a while.

Evan

>
> Finally, looking through various users of getTypeSize, it seems that
> they assume that getTypeSize returns (4), the amount of memory allocated
> for a variable of this type.  That seems reasonable.  If everyone agrees,
> I will document that for LLVM (3) and (4) coincide, and are what
> getTypeSize returns [except that it returns the number of bytes, rather
> than the number of bits].
>
> Best wishes,
>
> Duncan.