[LLVMdev] dragonegg svn still broken

Mon Jul 25 21:57:15 PDT 2011

On Jul 25, 2011, at 8:25 AM, Duncan Sands wrote:
> Hi Chris,
> 
>>> Chris made major changes to the LLVM type system, which requires major changes
>>> to dragonegg (c.f. the patch Chris applied to llvm-gcc).  I'm on holiday which
>>> is why I haven't taken care of it yet.
>> 
>> Yeah, sorry about that Duncan.  I can't directly hack on the code, but I'm happy to answer questions when you get back.
> 
> there seems to be a nasty flaw in the scheme you devised for llvm-gcc, involving
> arrays.

I believe it.  I just did the minimal possible thing to get llvm-gcc to bootstrap for me, because some people have trouble letting go of the past, I'm sorry for the impact that had on dragonegg.  Clang is at least working now :)

>  Consider the following mutually recursive arrays of structs:
> 
>  struct S has a field which is a pointer to array type A.
>  The element type of array type A is struct T.
>  struct T has a field which is a pointer to array type B.
>  The element type of array type B is struct S.
> 
> When trying to convert one of the structs or arrays, with the current scheme you
> inevitably end up forming an array of opaque struct types.  This immediately
> blows up dragonegg because it wants to compute the array size to see if it needs
> to append padding.  It can blow up llvm-gcc too, because it also wants to get
> the size of arrays in more exotic circumstances.
> 
> Do you have any suggestions on the best way to handle this?

Right, there are similar cases involving function pointers etc.  For example, here's an evil case where converting a struct converts a function pointer that wants the struct being converted (kudos to Eli):

struct foo {
  void (*FP)(struct foo);
} g;

If you run this through clang, you get:

%struct.foo = type { {}* }
@g = common global %struct.foo zeroinitializer, align 8

and any uses of "FP" do a bitcast to the right type.  The way this works in clang (and the attempt in llvm-gcc which probably is pretty close just in need of debugging) is that when a struct is being converted, if we go to convert a pointer underneath it, we disable converting the pointee (just returning {}).  At one stage in its development, this was pretty hardcore: we were lowering stuff like:

struct x { int a; };
struct y {
  struct x *P;
} h;

to y = "{ {}* }".

This is effectively your #3 below.

> (3) convert all pointers to i8*.  I'm not sure how much this would get in
> the way of the optimizers.  Note that dragonegg distinguishes between
> register types and in-memory types (I think clang does the same), and pointer
> types used for registers would still be sensible (struct S* rather than i8*).

Yep, I did this to beat out all of the assumptions in the clang IRGen that was assuming that the type return by a field access load was sane (they'd blow up when they got an unexpected {}* instead of a function pointer or whatever).

Once everything worked, I added the 'isSafeToConvert' logic in clang/lib/CodeGen/CodeGenTypes.cpp. The idea is that when converting the first "g" example, there is no way we can do better than {}*.  OTOH, in the second "h" example there is no problem recursively converting X to a nice type, and isSafeToConvert returns true for it.  Because we can recurse into it, we get pretty IR for that case (and this handles almost every other common case):

%struct.y = type { %struct.x* }
%struct.x = type { i32 }

@h = common global %struct.y zeroinitializer, align 8

This gets clang the benefit of getting pretty types that are maximally friendly to the optimizer (and the compiler hacking reading IR!) while still being able to fall back to an ugly type when it is required by conversion.

-Chris