[llvm] 7abefc4 - [instcombine] Fold away memset/memmove from otherwise unused alloca

Philip Reames via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 22 14:46:02 PDT 2022


I've got one buildbot which might be failing on this: 
https://lab.llvm.org/buildbot#builders/127/builds/26790

However, I can't make heads or tails of the failure message, and the 
test isn't obviously unstable.  Waiting to see if any Linux bots fail with 
something I can understand.

Philip

On 3/22/22 13:49, Philip Reames via llvm-commits wrote:
> Author: Philip Reames
> Date: 2022-03-22T13:48:48-07:00
> New Revision: 7abefc42220b74551b433083ece33be31e48700f
>
> URL: https://github.com/llvm/llvm-project/commit/7abefc42220b74551b433083ece33be31e48700f
> DIFF: https://github.com/llvm/llvm-project/commit/7abefc42220b74551b433083ece33be31e48700f.diff
>
> LOG: [instcombine] Fold away memset/memmove from otherwise unused alloca
>
> The motivation for this is that while both memcpyopt and dse will catch this case, both are limited by MSSA's walk back threshold when finding clobbers.  As such, if you have a memcpy of an otherwise dead alloca placed towards the end of a long basic block with lots of other memory instructions, it would be missed.  This is a bit undesirable for such an "obviously" useless bit of code.
>
> As noted in comments, we should probably generalize instcombine's escape analysis peephole (see visitAllocInst) to allow read xor write.  Doing that would subsume this code in a more general way, but is also a more involved change.  For the moment, I went with the easiest fix.
>
> Added:
>      
>
> Modified:
>      llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
>      llvm/test/Transforms/Inline/byval-tail-call.ll
>      llvm/test/Transforms/InstCombine/memcpy_alloca.ll
>
> Removed:
>      
>
>
> ################################################################################
> diff  --git a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
> index 9327dda3924dc..b3edec04fbcef 100644
> --- a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
> +++ b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
> @@ -104,6 +104,19 @@ static Type *getPromotedType(Type *Ty) {
>     return Ty;
>   }
>   
> +/// Recognize a memcpy/memmove from a trivially otherwise unused alloca.
> +/// TODO: This should probably be integrated with visitAllocSites, but that
> +/// requires a deeper change to allow either unread or unwritten objects.
> +static bool hasUndefSource(AnyMemTransferInst *MI) {
> +  auto *Src = MI->getRawSource();
> +  while (isa<GetElementPtrInst>(Src) || isa<BitCastInst>(Src)) {
> +    if (!Src->hasOneUse())
> +      return false;
> +    Src = cast<Instruction>(Src)->getOperand(0);
> +  }
> +  return isa<AllocaInst>(Src) && Src->hasOneUse();
> +}
> +
>   Instruction *InstCombinerImpl::SimplifyAnyMemTransfer(AnyMemTransferInst *MI) {
>     Align DstAlign = getKnownAlignment(MI->getRawDest(), DL, MI, &AC, &DT);
>     MaybeAlign CopyDstAlign = MI->getDestAlign();
> @@ -128,6 +141,14 @@ Instruction *InstCombinerImpl::SimplifyAnyMemTransfer(AnyMemTransferInst *MI) {
>       return MI;
>     }
>   
> +  // If the source is provably undef, the memcpy/memmove doesn't do anything
> +  // (unless the transfer is volatile).
> +  if (hasUndefSource(MI) && !MI->isVolatile()) {
> +    // Set the size of the copy to 0, it will be deleted on the next iteration.
> +    MI->setLength(Constant::getNullValue(MI->getLength()->getType()));
> +    return MI;
> +  }
> +
>     // If MemCpyInst length is 1/2/4/8 bytes then replace memcpy with
>     // load/store.
>     ConstantInt *MemOpLength = dyn_cast<ConstantInt>(MI->getLength());
>
> diff  --git a/llvm/test/Transforms/Inline/byval-tail-call.ll b/llvm/test/Transforms/Inline/byval-tail-call.ll
> index a820bcb427dcf..19be5e5827f04 100644
> --- a/llvm/test/Transforms/Inline/byval-tail-call.ll
> +++ b/llvm/test/Transforms/Inline/byval-tail-call.ll
> @@ -92,14 +92,11 @@ define void @foobar(i32* %x) {
>   define void @barfoo() {
>   ; CHECK-LABEL: @barfoo(
>   ; CHECK-NEXT:    [[X1:%.*]] = alloca i32, align 4
> -; CHECK-NEXT:    [[X:%.*]] = alloca i32, align 4
>   ; CHECK-NEXT:    [[TMP1:%.*]] = bitcast i32* [[X1]] to i8*
>   ; CHECK-NEXT:    call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull [[TMP1]])
> -; CHECK-NEXT:    [[TMP2:%.*]] = load i32, i32* [[X]], align 4
> -; CHECK-NEXT:    store i32 [[TMP2]], i32* [[X1]], align 4
>   ; CHECK-NEXT:    tail call void @ext2(i32* nonnull byval(i32) [[X1]])
> -; CHECK-NEXT:    [[TMP3:%.*]] = bitcast i32* [[X1]] to i8*
> -; CHECK-NEXT:    call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull [[TMP3]])
> +; CHECK-NEXT:    [[TMP2:%.*]] = bitcast i32* [[X1]] to i8*
> +; CHECK-NEXT:    call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull [[TMP2]])
>   ; CHECK-NEXT:    ret void
>   ;
>     %x = alloca i32
>
> diff  --git a/llvm/test/Transforms/InstCombine/memcpy_alloca.ll b/llvm/test/Transforms/InstCombine/memcpy_alloca.ll
> index b7288d9f07476..fabf920c6e68d 100644
> --- a/llvm/test/Transforms/InstCombine/memcpy_alloca.ll
> +++ b/llvm/test/Transforms/InstCombine/memcpy_alloca.ll
> @@ -4,9 +4,6 @@
>   ; Memcpy is copying known-undef, and is thus removable
>   define void @test(i8* %dest) {
>   ; CHECK-LABEL: @test(
> -; CHECK-NEXT:    [[A:%.*]] = alloca [7 x i8], align 1
> -; CHECK-NEXT:    [[SRC:%.*]] = getelementptr inbounds [7 x i8], [7 x i8]* [[A]], i64 0, i64 0
> -; CHECK-NEXT:    call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 1 dereferenceable(7) [[DEST:%.*]], i8* noundef nonnull align 1 dereferenceable(7) [[SRC]], i64 7, i1 false)
>   ; CHECK-NEXT:    ret void
>   ;
>     %a = alloca [7 x i8]
> @@ -47,9 +44,6 @@ define void @test3(i8* %dest) {
>   
>   define void @test4(i8* %dest) {
>   ; CHECK-LABEL: @test4(
> -; CHECK-NEXT:    [[A1:%.*]] = alloca [7 x i8], align 1
> -; CHECK-NEXT:    [[A1_SUB:%.*]] = getelementptr inbounds [7 x i8], [7 x i8]* [[A1]], i64 0, i64 0
> -; CHECK-NEXT:    call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 1 dereferenceable(7) [[DEST:%.*]], i8* noundef nonnull align 1 dereferenceable(7) [[A1_SUB]], i64 7, i1 false)
>   ; CHECK-NEXT:    ret void
>   ;
>     %a = alloca [7 x i8]
> @@ -60,9 +54,6 @@ define void @test4(i8* %dest) {
>   
>   define void @test5(i8* %dest) {
>   ; CHECK-LABEL: @test5(
> -; CHECK-NEXT:    [[A:%.*]] = alloca [7 x i8], align 1
> -; CHECK-NEXT:    [[P2:%.*]] = getelementptr inbounds [7 x i8], [7 x i8]* [[A]], i64 0, i64 4
> -; CHECK-NEXT:    call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 1 dereferenceable(3) [[DEST:%.*]], i8* noundef nonnull align 1 dereferenceable(3) [[P2]], i64 3, i1 false)
>   ; CHECK-NEXT:    ret void
>   ;
>     %a = alloca [7 x i8]
>
>
>          
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
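
For anyone reading the diff without an LLVM checkout handy, the use-chain walk in hasUndefSource can be modelled outside the compiler roughly as below. This is only a sketch: the `Node` struct, the `Kind` enum, and the `numUses` field are hypothetical stand-ins for llvm::Value and its use list, not LLVM API, and the real helper takes an AnyMemTransferInst rather than a bare pointer.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::Value; none of this is LLVM API.
enum class Kind { Alloca, GEP, BitCast, Argument };

struct Node {
  Kind kind;
  int numUses;          // number of users of this value
  const Node *operand;  // pointer operand for GEP/BitCast, nullptr otherwise
};

// Same shape as the helper in the patch: walk up through GEPs and bitcasts,
// bailing out if any link has more than one user, and accept only when the
// chain ends at an alloca whose sole user is that chain.
bool hasUndefSource(const Node *src) {
  while (src->kind == Kind::GEP || src->kind == Kind::BitCast) {
    if (src->numUses != 1)
      return false;
    src = src->operand;
  }
  return src->kind == Kind::Alloca && src->numUses == 1;
}
```

The single-use requirement is what makes the check cheap and conservative: if the only path to the alloca is the transfer's source operand, nothing can have stored to it, so the bytes being copied are undef. Any second user (a store, a call, a lifetime marker) makes the model return false, which appears to match the patch's behaviour of simply leaving such transfers alone.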

