[PATCH] Fix SROA regression causing data corruption, fixes PR19250

Mon Jun 9 23:39:16 PDT 2014

On 2014.06.09 18:30:53 -0700, Duncan P. N. Exon Smith wrote:
> +aprantl
> 
> > On 2014-Apr-21, at 03:18, Björn Steinbrink <bsteinbr at gmail.com> wrote:
> > 
> > dotdash added you to the CC list for the revision "Fix SROA regression causing data corruption, fixes PR19250".
> > 
> > r199771 accidently broke the logic that makes sure that SROA only splits load
> > on byte boundaries. If such a split happens, some bits get lost when
> > reassembling loads of wider types, causing data corruption.
> > 
> > Moving the width check up to reject such splits early fixes the problem.
> > 
> 
> Thanks for working on this.  A few comments inline.

Thanks for the review!

> > 
> > http://reviews.llvm.org/D3297
> > 
> > Files:
> >  lib/Transforms/Scalar/SROA.cpp
> >  test/Transforms/SROA/basictest.ll
> > 
> > Index: lib/Transforms/Scalar/SROA.cpp
> > ===================================================================
> > --- lib/Transforms/Scalar/SROA.cpp
> > +++ lib/Transforms/Scalar/SROA.cpp
> > @@ -1031,11 +1031,6 @@
> >       UserTy = SI->getValueOperand()->getType();
> >     }
> > 
> > -    if (!UserTy || (Ty && Ty != UserTy))
> > -      TyIsCommon = false; // Give up on anything but an iN type.
> 
> I feel like this check (and assignment to `false`) should still be
> *here*, before the early `continue`.  Can you explain why you moved it?

Because moving only the "else" part would make the results depend on the
order of slices, which is something the code explicitly tries to avoid
by always looking at all slices. Moving the "if" block also restores the
behaviour from before r199771 WRT usage of UserTy vs. ITy.

For example:

define i32 @test26({ i16*, i32 }*) {
entry-block:
  %arg = alloca { i16*, i32 }, align 8
  %1 = bitcast { i16*, i32 }* %0 to i8*
  %2 = bitcast { i16*, i32 }* %arg to i8*
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %2, i8* %1, i32 16, i32 8, i1 false)
  %b = getelementptr inbounds { i16*, i32 }* %arg, i64 0, i32 0
  %pb0 = bitcast i16** %b to i8**
  %b0 = load i8** %pb0
  %pb1 = bitcast i16** %b to i63*
  %b1 = load i63* %pb1
  ret i32 0
}

With only the "else" part moved after the early continue, this will
have TyIsCommon set to false and return null, thus the new alloca will
have type i16*. But if you move %pb0 and %b0 to appear after %b1, then
TyIsCommon will be true and i8* is returned, which will then be used as
the type for the new alloca.

With the whole if-else block moved, this consistently returns i8*,
regardless of order.

Another approach would be to set TyIsCommon to false when the early
continue is taken. If that is preferable, I'll update the patch to do
that instead.

> 
> > -    else
> > -      Ty = UserTy;
> 
> (I understand moving the `else` clause.)
> 
> > -
> >     if (IntegerType *UserITy = dyn_cast_or_null<IntegerType>(UserTy)) {
> >       // If the type is larger than the partition, skip it. We only encounter
> >       // this for split integer operations where we want to use the type of the
> > @@ -1050,6 +1045,11 @@
> >       if (!ITy || ITy->getBitWidth() < UserITy->getBitWidth())
> >         ITy = UserITy;
> >     }
> > +
> > +    if (!UserTy || (Ty && Ty != UserTy))
> > +      TyIsCommon = false; // Give up on anything but an iN type.
> > +    else
> > +      Ty = UserTy;
> >   }
> > 
> >   return TyIsCommon ? Ty : ITy;
> > Index: test/Transforms/SROA/basictest.ll
> > ===================================================================
> > --- test/Transforms/SROA/basictest.ll
> > +++ test/Transforms/SROA/basictest.ll
> > @@ -1440,3 +1440,30 @@
> >   ret void
> > }
> > 
> > +define i32 @test25({ i1, i32 }*) {
> 
> I think you should create a new file for this testcase.  Name it after
> the problem that's being fixed, and give the function a good name too.

Will do once I get feedback on the above.

> 
> > +; This tests that splits only happen on integer types that are byte width multiple
> > +; CHECK-LABEL: @test25(
> > +; CHECK: case0:
> > +; CHECK-NOT: zext i1
> > +; CHECK: case1:
> > +; CHECK: ret i32
> > +entry-block:
> > +  %arg = alloca { i1, i32 }, align 8
> > +  %1 = bitcast { i1, i32 }* %0 to i8*
> > +  %2 = bitcast { i1, i32 }* %arg to i8*
> > +  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %2, i8* %1, i32 8, i32 4, i1 false)
> > +  %discrp = getelementptr inbounds { i1, i32 }* %arg, i64 0, i32 0
> > +  %s = getelementptr inbounds { i1, i32 }* %arg, i64 0, i32 1
> > +  %discr = load i1* %discrp, align 1
> > +  br i1 %discr, label %case0, label %case1
> > +
> > +case0:                                     ; preds = %entry-block
> > +  %r0 = load i32* %s
> > +  ret i32 %r0
> > +
> > +case1:
> > +  %p1 = bitcast i32* %s to i1*
> > +  %t1 = load i1* %p1
> > +  %r1 = zext i1 %t1 to i32
> > +  ret i32 %r1
> > +}
> > <D3297.1.patch>_______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>