[llvm] 7abefc4 - [instcombine] Fold away memset/memmove from otherwise unused alloca

Wed Mar 23 07:30:48 PDT 2022

Reid, Matthew,

You two are the bot owner and the last person to touch this test 
respectively.  Any chance I could ask for some help here?

I'm fairly sure the bot failure mentioned below is a real failure, but 
a) no other bot has failed, and b) I can't translate how removing a 
memcpy could lead to "LINK : fatal error LNK1318: Unexpected PDB error; 
FORMAT (11) ''".  Any idea what might be going on with this test?  Or 
ideas on how to reproduce it which don't require a windows build?

I *suspect* this is a case where the test needs adjusted to prevent 
something being optimized away - that's the typical asan test pattern 
for new memcpy opts - but I might be wrong about that.

Philip

On 3/22/22 14:46, Philip Reames via llvm-commits wrote:
> I've got one buildbot which might be failing on this: 
> https://lab.llvm.org/buildbot#builders/127/builds/26790
>
> However, I can't make heads or tails out the failure message and the 
> test isn't obvious unstable.  Waiting to see if any Linux bots fail 
> with something I can understand.
>
> Philip
>
> On 3/22/22 13:49, Philip Reames via llvm-commits wrote:
>> Author: Philip Reames
>> Date: 2022-03-22T13:48:48-07:00
>> New Revision: 7abefc42220b74551b433083ece33be31e48700f
>>
>> URL: 
>> https://github.com/llvm/llvm-project/commit/7abefc42220b74551b433083ece33be31e48700f
>> DIFF: 
>> https://github.com/llvm/llvm-project/commit/7abefc42220b74551b433083ece33be31e48700f.diff
>>
>> LOG: [instcombine] Fold away memset/memmove from otherwise unused alloca
>>
>> The motivation for this is that while both memcpyopt and dse will 
>> catch this case, both are limited by MSSA's walk back threshold when 
>> finding clobbers.  As such, if you have a memcpy of an otherwise dead 
>> alloca placed towards the end of a long basic block with lots of 
>> other memory instructions, it would be missed.  This is a bit 
>> undesirable for such an "obviously" useless bit of code.
>>
>> As noted in comments, we should probably generalize instcombine's 
>> escape analysis peephole (see visitAllocInst) to allow read xor 
>> write.  Doing that would subsume this code in a more general way, but 
>> is also a more involved change.  For the moment, I went with the 
>> easiest fix.
>>
>> Added:
>>
>> Modified:
>>      llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
>>      llvm/test/Transforms/Inline/byval-tail-call.ll
>>      llvm/test/Transforms/InstCombine/memcpy_alloca.ll
>>
>> Removed:
>>
>>
>> ################################################################################ 
>>
>> diff  --git a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp 
>> b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
>> index 9327dda3924dc..b3edec04fbcef 100644
>> --- a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
>> +++ b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
>> @@ -104,6 +104,19 @@ static Type *getPromotedType(Type *Ty) {
>>     return Ty;
>>   }
>>   +/// Recognize a memcpy/memmove from a trivially otherwise unused 
>> alloca.
>> +/// TODO: This should probably be integrated with visitAllocSites, 
>> but that
>> +/// requires a deeper change to allow either unread or unwritten 
>> objects.
>> +static bool hasUndefSource(AnyMemTransferInst *MI) {
>> +  auto *Src = MI->getRawSource();
>> +  while (isa<GetElementPtrInst>(Src) || isa<BitCastInst>(Src)) {
>> +    if (!Src->hasOneUse())
>> +      return false;
>> +    Src = cast<Instruction>(Src)->getOperand(0);
>> +  }
>> +  return isa<AllocaInst>(Src) && Src->hasOneUse();
>> +}
>> +
>>   Instruction 
>> *InstCombinerImpl::SimplifyAnyMemTransfer(AnyMemTransferInst *MI) {
>>     Align DstAlign = getKnownAlignment(MI->getRawDest(), DL, MI, &AC, 
>> &DT);
>>     MaybeAlign CopyDstAlign = MI->getDestAlign();
>> @@ -128,6 +141,14 @@ Instruction 
>> *InstCombinerImpl::SimplifyAnyMemTransfer(AnyMemTransferInst *MI) {
>>       return MI;
>>     }
>>   +  // If the source is provably undef, the memcpy/memmove doesn't 
>> do anything
>> +  // (unless the transfer is volatile).
>> +  if (hasUndefSource(MI) && !MI->isVolatile()) {
>> +    // Set the size of the copy to 0, it will be deleted on the next 
>> iteration.
>> + MI->setLength(Constant::getNullValue(MI->getLength()->getType()));
>> +    return MI;
>> +  }
>> +
>>     // If MemCpyInst length is 1/2/4/8 bytes then replace memcpy with
>>     // load/store.
>>     ConstantInt *MemOpLength = dyn_cast<ConstantInt>(MI->getLength());
>>
>> diff  --git a/llvm/test/Transforms/Inline/byval-tail-call.ll 
>> b/llvm/test/Transforms/Inline/byval-tail-call.ll
>> index a820bcb427dcf..19be5e5827f04 100644
>> --- a/llvm/test/Transforms/Inline/byval-tail-call.ll
>> +++ b/llvm/test/Transforms/Inline/byval-tail-call.ll
>> @@ -92,14 +92,11 @@ define void @foobar(i32* %x) {
>>   define void @barfoo() {
>>   ; CHECK-LABEL: @barfoo(
>>   ; CHECK-NEXT:    [[X1:%.*]] = alloca i32, align 4
>> -; CHECK-NEXT:    [[X:%.*]] = alloca i32, align 4
>>   ; CHECK-NEXT:    [[TMP1:%.*]] = bitcast i32* [[X1]] to i8*
>>   ; CHECK-NEXT:    call void @llvm.lifetime.start.p0i8(i64 4, i8* 
>> nonnull [[TMP1]])
>> -; CHECK-NEXT:    [[TMP2:%.*]] = load i32, i32* [[X]], align 4
>> -; CHECK-NEXT:    store i32 [[TMP2]], i32* [[X1]], align 4
>>   ; CHECK-NEXT:    tail call void @ext2(i32* nonnull byval(i32) [[X1]])
>> -; CHECK-NEXT:    [[TMP3:%.*]] = bitcast i32* [[X1]] to i8*
>> -; CHECK-NEXT:    call void @llvm.lifetime.end.p0i8(i64 4, i8* 
>> nonnull [[TMP3]])
>> +; CHECK-NEXT:    [[TMP2:%.*]] = bitcast i32* [[X1]] to i8*
>> +; CHECK-NEXT:    call void @llvm.lifetime.end.p0i8(i64 4, i8* 
>> nonnull [[TMP2]])
>>   ; CHECK-NEXT:    ret void
>>   ;
>>     %x = alloca i32
>>
>> diff  --git a/llvm/test/Transforms/InstCombine/memcpy_alloca.ll 
>> b/llvm/test/Transforms/InstCombine/memcpy_alloca.ll
>> index b7288d9f07476..fabf920c6e68d 100644
>> --- a/llvm/test/Transforms/InstCombine/memcpy_alloca.ll
>> +++ b/llvm/test/Transforms/InstCombine/memcpy_alloca.ll
>> @@ -4,9 +4,6 @@
>>   ; Memcpy is copying known-undef, and is thus removable
>>   define void @test(i8* %dest) {
>>   ; CHECK-LABEL: @test(
>> -; CHECK-NEXT:    [[A:%.*]] = alloca [7 x i8], align 1
>> -; CHECK-NEXT:    [[SRC:%.*]] = getelementptr inbounds [7 x i8], [7 x 
>> i8]* [[A]], i64 0, i64 0
>> -; CHECK-NEXT:    call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef 
>> nonnull align 1 dereferenceable(7) [[DEST:%.*]], i8* noundef nonnull 
>> align 1 dereferenceable(7) [[SRC]], i64 7, i1 false)
>>   ; CHECK-NEXT:    ret void
>>   ;
>>     %a = alloca [7 x i8]
>> @@ -47,9 +44,6 @@ define void @test3(i8* %dest) {
>>     define void @test4(i8* %dest) {
>>   ; CHECK-LABEL: @test4(
>> -; CHECK-NEXT:    [[A1:%.*]] = alloca [7 x i8], align 1
>> -; CHECK-NEXT:    [[A1_SUB:%.*]] = getelementptr inbounds [7 x i8], 
>> [7 x i8]* [[A1]], i64 0, i64 0
>> -; CHECK-NEXT:    call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef 
>> nonnull align 1 dereferenceable(7) [[DEST:%.*]], i8* noundef nonnull 
>> align 1 dereferenceable(7) [[A1_SUB]], i64 7, i1 false)
>>   ; CHECK-NEXT:    ret void
>>   ;
>>     %a = alloca [7 x i8]
>> @@ -60,9 +54,6 @@ define void @test4(i8* %dest) {
>>     define void @test5(i8* %dest) {
>>   ; CHECK-LABEL: @test5(
>> -; CHECK-NEXT:    [[A:%.*]] = alloca [7 x i8], align 1
>> -; CHECK-NEXT:    [[P2:%.*]] = getelementptr inbounds [7 x i8], [7 x 
>> i8]* [[A]], i64 0, i64 4
>> -; CHECK-NEXT:    call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef 
>> nonnull align 1 dereferenceable(3) [[DEST:%.*]], i8* noundef nonnull 
>> align 1 dereferenceable(3) [[P2]], i64 3, i1 false)
>>   ; CHECK-NEXT:    ret void
>>   ;
>>     %a = alloca [7 x i8]
>>
>>
>>          _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits