[llvm-commits] PATCH: A new SROA implementation

Duncan Sands baldrick at free.fr
Tue Aug 21 01:19:57 PDT 2012

Hi Chandler,

> Funny thing? My assumption was based on the existing SROA pass which also gets
> this wrong:
>     #include <stdio.h>
>     struct S {
>        int x;
>        int __attribute__((vector_size(16))) v;
>        int y;
>     };
>     struct S f(int a, int b, int c, int i) __attribute__((noinline)) {
>        struct S s;
>        volatile int q = c;
>        s.x = a;
>        ((int*)&s.v)[i] = b;
>        s.y = q;
>        return s;
>     }
>     int main() {
>        struct S s = f(1, -1, 3, -1);
>        printf("%d %d\n", s.x, s.y);
>     }
> Clang miscompiles this program at -O2 because of SROA I think... At least, the
> LLVM IR produced seems valid according to the spec of the IR, and the transform
> applied by SROA seems invalid -- the index of -1 doesn't overflow anything, it
> just walks past the end of a sequential type, which everything indicates is allowed.
> We happen to be getting away with this because this transform in the existing
> SROA is only ever applied to vectors, never to arrays or pointers, or anything
> that happens to be lowered as an array or pointer.
> I'm inclined to completely remove this transform. It seems fundamentally unsafe
> given the current spec of GEPs, and provides fairly limited benefit if the
> frontend consistently lowers to insertelement and extractelement (which it did
> in most of my experiments). If we want to support this transform, I think we
> need to either extend GEP to have an optional constraint on inbounds indexing
> within array and/or vector aggregates, or we need to change the semantics of the
> existing 'inbounds' keyword to mean that.

I agree that it should be removed.  Maybe instcombine could do it instead, when
it sees an inbounds GEP into an object that consists of only a single vector
(maybe instcombine already does this...).

The dragonegg front-end also produces insertelement/extractelement directly in
most situations.  However, GCC sometimes views a vector as a block of memory to
be poked around in, and that is lowered into a GEP on the vector.  The frontend
could work harder, though, and directly produce insertelement/extractelement
when the GEP corresponds to accessing a vector element.
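
To make the two access styles concrete, here is a minimal sketch in C using the
GCC/Clang vector extension.  The typedef and function names are made up for
illustration, and which IR a given frontend emits for each form (a GEP plus a
store for the first, insertelement for the second) is an assumption rather than
something verified here.  For an in-range index both functions give the same
result; they differ only in how the access is expressed.

```c
/* Hypothetical names; a sketch of the two access styles only. */
typedef int v4si __attribute__((vector_size(16)));

/* Memory-style access: the vector is treated as a block of ints and
   indexed through a pointer.  A frontend would typically lower this
   to address arithmetic on the vector followed by a store. */
static v4si set_via_pointer(v4si v, int i, int b) {
    ((int *)&v)[i] = b;
    return v;
}

/* Element-style access: the vector is subscripted directly, which a
   frontend can lower to a single element insertion. */
static v4si set_via_subscript(v4si v, int i, int b) {
    v[i] = b;
    return v;
}
```

With an out-of-range index such as the -1 in the quoted program, the
pointer-style version addresses memory outside the vector, which is exactly the
case the SROA transform mishandles.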

Ciao, Duncan.
