[LLVMdev] types in load/store

Duncan Sands baldrick at free.fr
Fri Jul 9 00:44:46 PDT 2010

Hi Jianzhou,

> I misunderstood C99 ISO, such behaviors are defined not when types
> have the same sizes, but when they are same (compatible)  types with
> signed or qualified extension (this is much stronger than being of
> same sizes), or reading char by char:
> 7 An object shall have its stored value accessed only by an lvalue
> expression that has one of
> the following types:
> — a type compatible with the effective type of the object,
> [...]
> — a type that is the signed or unsigned type corresponding to the
> effective type of the
> object,
> [...]
> — an aggregate or union type that includes one of the aforementioned
> types among its
> members (including, recursively, a member of a subaggregate or
> contained union), or
> — a character type.
> (sec 6.5, items 6 and 7, page 67-68,
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf)

LLVM does not have any such restrictions.

> If LLVM IR is weaker than these C restrictions, then I have the
> following questions about when GEP is undefined:

In your examples, it is not GEP that would be undefined, but a load or
store from the GEP.  GEP just offsets the memory address.  In C too it
is not invalid to offset or cast a pointer; it is loading from or storing
to the cast or offset pointer that may be invalid.
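
A minimal sketch in C of the distinction above, assuming a platform where
float has no trap representations (e.g. IEEE 754). The helper names
offset_bytes and load_float_at are made up for illustration. The first helper
only computes an address, much as a GEP does; the second performs the actual
access, using memcpy so that the read goes through character types and is
well defined in C.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: compute the address `byte_offset` bytes past `base`.
   Only pointer arithmetic happens here; nothing is loaded or stored. */
static const unsigned char *offset_bytes(const void *base, size_t byte_offset)
{
    return (const unsigned char *)base + byte_offset;
}

/* Hypothetical helper: read a float from an arbitrary address.
   memcpy copies the bytes as characters, so the access itself does not
   depend on the effective type of the underlying object. */
static float load_float_at(const void *addr)
{
    float f;
    memcpy(&f, addr, sizeof f);
    return f;
}
```

Calling offset_bytes(words, 4) on an int32_t array is harmless on its own;
whether anything can go wrong is decided only at the point where the
resulting address is read or written.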

> 1) Can I load a value partially or overlapped with other stored
> values? For example, if the stored values are of type [10*i32], and we
> cast i32* to  {i8, i4, float} *,  can we successfully load each fields
> via the addresses from GEPs?

Yes, except that, as previously mentioned, this is invalid for the i4 if
the original value was not set by performing an i4 store.

> Since IR allows to define data layout of
> targets (size and alignment for types), does whether such GEPs
> undefined depend on its data layout?

As I mentioned, the GEPs themselves are never the problem: a GEP only
computes an address.  Only a load or store through that address can be
invalid.
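
As a rough C analogy for question 1 (a sketch under stated assumptions, not
LLVM semantics): the struct below stands in for {i8, i4, float}. C has no
4-bit integer, so the i4 field is modeled as a whole byte, and the names
struct fields and view_as_fields are made up for illustration. The field
offsets are chosen by the compiler for the target, which plays the role that
the target data layout plays in LLVM IR.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C stand-in for the LLVM type {i8, i4, float}. */
struct fields {
    uint8_t a;   /* stands in for i8 */
    uint8_t b;   /* stands in for i4 (assumption: modeled as one byte) */
    float   c;   /* stands in for float */
};

/* Copy the leading bytes of an int32_t array into a `struct fields`.
   The caller must supply at least sizeof(struct fields) bytes.
   Each field ends up holding whatever bytes sit at its offset. */
static struct fields view_as_fields(const int32_t *words)
{
    struct fields out;
    memcpy(&out, words, sizeof out);
    return out;
}
```

offsetof(struct fields, c) reports where the float lands; on many common
ABIs it is 4, so the float overlaps the second array element, but that
value is target dependent.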

> 2) C allows characters as the least granularity when loading. Does
> LLVM have the same assumption?

LLVM doesn't have a notion of "character".  Currently all processors that LLVM
targets are capable of addressing an octet (8 bits), but nothing smaller.  This
means that the smallest granularity is currently i8.
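
For comparison, here is a small C sketch of byte-granular access, the C
counterpart of reading memory one i8 at a time. The helper name sum_bytes is
made up for illustration, and the sketch assumes a platform that provides
int32_t.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add up the bytes of a 32-bit value by reading it
   one unsigned char at a time, the finest access C offers. */
static unsigned sum_bytes(const int32_t *value)
{
    const unsigned char *p = (const unsigned char *)value;
    unsigned total = 0;
    for (size_t i = 0; i < sizeof *value; i++)
        total += p[i];
    return total;
}
```

The sum does not depend on byte order, so for the value 0x01020304 it is
1 + 2 + 3 + 4 on both little-endian and big-endian targets.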


