[llvm-commits] PATCH: A new SROA implementation
Duncan Sands
baldrick at free.fr
Tue Aug 21 01:19:57 PDT 2012
Hi Chandler,
> Funny thing? My assumption was based on the existing SROA pass which also gets
> this wrong:
>
> #include <stdio.h>
> struct S {
>   int x;
>   int __attribute__((vector_size(16))) v;
>   int y;
> };
> struct S f(int a, int b, int c, int i) __attribute__((noinline)) {
>   struct S s;
>   volatile int q = c;
>   s.x = a;
>   ((int*)&s.v)[i] = b;
>   s.y = q;
>   return s;
> }
> int main() {
>   struct S s = f(1, -1, 3, -1);
>   printf("%d %d\n", s.x, s.y);
> }
>
>
> Clang miscompiles this program at -O2 because of SROA I think... At least, the
> LLVM IR produced seems valid according to the spec of the IR, and the transform
> applied by SROA seems invalid -- the index of -1 doesn't overflow anything, it
> just walks past the end of a sequential type, which everything indicates is allowed.
>
> We happen to be getting away with this because this transform in the existing
> SROA is only ever applied to vectors, never to arrays or pointers, or anything
> that happens to be lowered as an array or pointer.
>
> I'm inclined to completely remove this transform. It seems fundamentally unsafe
> given the current spec of GEPs, and provides fairly limited benefit if the
> frontend consistently lowers to insertelement and extractelement (which it did
> in most of my experiments). If we want to support this transform, I think we
> need to either extend GEP to have an optional constraint on inbounds indexing
> within array and/or vector aggregates, or we need to change the semantics of the
> existing 'inbounds' keyword to mean that.

I agree that it should be removed. Maybe instcombine can do it instead when it
sees an inbounds GEP into an object that consists only of a single vector (maybe
instcombine already does this...).

The dragonegg front-end also produces insertelement/extractelement directly in
most situations. However, GCC sometimes views a vector as a block of memory to
be poked around in, and that is lowered into a GEP on the vector. The front-end
could work harder, though, and directly produce insertelement/extractelement
when the GEP corresponds to accessing a vector element.

Ciao, Duncan.