[llvm] r348496 - [GVN] Don't perform scalar PRE on GEPs
Philip Reames via llvm-commits
llvm-commits at lists.llvm.org
Fri Dec 7 11:29:17 PST 2018
Did a bit more thinking about this, and came up with an example of one
case where this is a pessimization. I am specifically not asking for
any action now, just noting this for later in case we find cases where
patterns like this matter.
define i32* @ret_phi(i1 %c, i64 %offset, i32* %p) {
entry:
  br i1 %c, label %if.then, label %if.else
if.then:
  %addr1 = getelementptr i32, i32* %p, i64 %offset
  store i32 5, i32* %addr1
  br label %return
if.else:
  br label %return
return:
  %addr2 = getelementptr i32, i32* %p, i64 %offset
  ret i32* %addr2
}
Previously, this generated the following x86-64 code:
# %bb.0:                        # %entry
        testb   $1, %dil
        je      .LBB4_2
# %bb.1:                        # %if.then
        leaq    (%rdx,%rsi,4), %rax
        movl    $5, (%rdx,%rsi,4)
        retq
.LBB4_2:                        # %if.else
        leaq    (%rdx,%rsi,4), %rax
        retq
Now, it yields:
# %bb.0:                        # %entry
        testb   $1, %dil
        je      .LBB4_2
# %bb.1:                        # %if.then
        movl    $5, (%rdx,%rsi,4)
.LBB4_2:                        # %return
        leaq    (%rdx,%rsi,4), %rax
        retq
(i.e. no tail duplication is applied, and we don't duplicate the address
computation into each predecessor)
This isn't a major difference in this case, but it is the type of thing
which *might* be problematic in combination with some more complicated
pattern we haven't found yet. It's just enough to make me a bit nervous,
without giving me clear grounds for an objection.
On 12/7/18 10:18 AM, Philip Reames wrote:
>
> Well, it looks like I'm wrong here. As you correctly noted, I should
> have provided a test case which broke. When trying to construct one,
> I realized that the existing address translation code inside
> performLoadPRE handles the case I had in mind. Worse, I took another
> look at your patch, and you even call this out in comment form. Sorry
> for what was clearly a rushed response in retrospect.
>
>
> I'm still a bit concerned there's a case we haven't thought of here,
> but my level of concern has sharply dropped. Given that, I retract my
> request for a revert, and we'll identify any problems which crop up
> through normal process.
>
>
> Philip
>
>
> On 12/7/18 1:43 AM, Alexandros Lamprineas wrote:
>>
>> Hi Philip,
>>
>>
>> Thanks for the feedback. If I understand correctly, this patch
>> prevents the PRE of loads in some cases. Could you give me an example
>> (an IR reproducer) that demonstrates the problem? Ideally we would
>> want it in the llvm regression tests.
>>
>>
>> Cheers,
>>
>> Alexandros
>>
>> ------------------------------------------------------------------------
>> *From:* Philip Reames <listmail at philipreames.com>
>> *Sent:* Thursday, December 6, 2018 8:11:08 PM
>> *To:* Alexandros Lamprineas; llvm-commits at lists.llvm.org
>> *Subject:* Re: [llvm] r348496 - [GVN] Don't perform scalar PRE on GEPs
>> I think this patch is a bad idea and needs to be reverted.
>>
>> Specifically, this will effectively break all LoadPRE for loads from
>> fixed offsets off distinct base objects, or where an address is not
>> globally available along all relevant paths. The LoadPRE code calls
>> into performScalarPRE for GEPs specifically to make this work.
>>
>> While I understand the desire not to make CGP even more complicated, I
>> think this is a case where doing so is called for.
>>
>> Philip
>>
>> On 12/6/18 8:11 AM, Alexandros Lamprineas via llvm-commits wrote:
>> > Author: alelab01
>> > Date: Thu Dec 6 08:11:58 2018
>> > New Revision: 348496
>> >
>> > URL: http://llvm.org/viewvc/llvm-project?rev=348496&view=rev
>> > Log:
>> > [GVN] Don't perform scalar PRE on GEPs
>> >
>> > Partial Redundancy Elimination of GEPs prevents CodeGenPrepare from
>> > sinking the addressing mode computation of memory instructions back
>> > to its uses. The problem comes from the insertion of PHIs, which
>> > confuse CGP and make it bail.
>> >
>> > I've autogenerated the check lines of an existing test and added a
>> > store instruction to demonstrate the motivation behind this change.
>> > The store is now using the gep instead of a phi.
>> >
>> > Differential Revision: https://reviews.llvm.org/D55009
>> >
>> > Modified:
>> > llvm/trunk/lib/Transforms/Scalar/GVN.cpp
>> > llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll
>> >
>> > Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp
>> > URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=348496&r1=348495&r2=348496&view=diff
>> >
>> ==============================================================================
>> > --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original)
>> > +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Thu Dec 6 08:11:58 2018
>> > @@ -2156,6 +2156,16 @@ bool GVN::performScalarPRE(Instruction *
>> > if (isa<CmpInst>(CurInst))
>> > return false;
>> >
>> > + // Don't do PRE on GEPs. The inserted PHI would prevent CodeGenPrepare
>> > + // from sinking the addressing mode computation back to its uses.
>> > + // Extending the GEP's live range increases the register pressure, and
>> > + // therefore it can introduce unnecessary spills.
>> > + //
>> > + // This doesn't prevent Load PRE. PHI translation will make the GEP
>> > + // available to the load by moving it to the predecessor block if
>> > + // necessary.
>> > + if (isa<GetElementPtrInst>(CurInst))
>> > + return false;
>> > +
>> > // We don't currently value number ANY inline asm calls.
>> > if (CallInst *CallI = dyn_cast<CallInst>(CurInst))
>> > if (CallI->isInlineAsm())
>> >
>> > Modified: llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll
>> > URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll?rev=348496&r1=348495&r2=348496&view=diff
>> >
>> ==============================================================================
>> > --- llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll (original)
>> > +++ llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll Thu Dec 6
>> 08:11:58 2018
>> > @@ -1,3 +1,4 @@
>> > +; NOTE: Assertions have been autogenerated by
>> utils/update_test_checks.py
>> > ; RUN: opt < %s -basicaa -gvn -enable-load-pre -S | FileCheck %s
>> > ; RUN: opt < %s -aa-pipeline=basic-aa -passes=gvn
>> -enable-load-pre -S | FileCheck %s
>> >
>> > @@ -6,11 +7,49 @@ target triple = "aarch64--linux-gnu"
>> >
>> > define double @foo(i32 %stat, i32 %i, double** %p) {
>> > ; CHECK-LABEL: @foo(
>> > +; CHECK-NEXT: entry:
>> > +; CHECK-NEXT: switch i32 [[STAT:%.*]], label [[SW_DEFAULT:%.*]] [
>> > +; CHECK-NEXT: i32 0, label [[SW_BB:%.*]]
>> > +; CHECK-NEXT: i32 1, label [[SW_BB]]
>> > +; CHECK-NEXT: i32 2, label [[ENTRY_SW_BB2_CRIT_EDGE:%.*]]
>> > +; CHECK-NEXT: ]
>> > +; CHECK: entry.sw.bb2_crit_edge:
>> > +; CHECK-NEXT: [[DOTPRE:%.*]] = load double*, double**
>> [[P:%.*]], align 8
>> > +; CHECK-NEXT: [[DOTPRE1:%.*]] = sext i32 [[I:%.*]] to i64
>> > +; CHECK-NEXT: [[ARRAYIDX5_PHI_TRANS_INSERT:%.*]] = getelementptr
>> inbounds double, double* [[DOTPRE]], i64 [[DOTPRE1]]
>> > +; CHECK-NEXT: [[DOTPRE2:%.*]] = load double, double*
>> [[ARRAYIDX5_PHI_TRANS_INSERT]], align 8
>> > +; CHECK-NEXT: br label [[SW_BB2:%.*]]
>> > +; CHECK: sw.bb:
>> > +; CHECK-NEXT: [[IDXPROM:%.*]] = sext i32 [[I]] to i64
>> > +; CHECK-NEXT: [[TMP0:%.*]] = load double*, double** [[P]], align 8
>> > +; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds
>> double, double* [[TMP0]], i64 [[IDXPROM]]
>> > +; CHECK-NEXT: [[TMP1:%.*]] = load double, double*
>> [[ARRAYIDX1]], align 8
>> > +; CHECK-NEXT: [[SUB:%.*]] = fsub double [[TMP1]], 1.000000e+00
>> > +; CHECK-NEXT: [[CMP:%.*]] = fcmp olt double [[SUB]], 0.000000e+00
>> > +; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label
>> [[IF_END:%.*]]
>> > +; CHECK: if.then:
>> > +; CHECK-NEXT: br label [[RETURN:%.*]]
>> > +; CHECK: if.end:
>> > +; CHECK-NEXT: br label [[SW_BB2]]
>> > +; CHECK: sw.bb2:
>> > +; CHECK-NEXT: [[TMP2:%.*]] = phi double [ [[DOTPRE2]],
>> [[ENTRY_SW_BB2_CRIT_EDGE]] ], [ [[TMP1]], [[IF_END]] ]
>> > +; CHECK-NEXT: [[IDXPROM3_PRE_PHI:%.*]] = phi i64 [ [[DOTPRE1]],
>> [[ENTRY_SW_BB2_CRIT_EDGE]] ], [ [[IDXPROM]], [[IF_END]] ]
>> > +; CHECK-NEXT: [[TMP3:%.*]] = phi double* [ [[DOTPRE]],
>> [[ENTRY_SW_BB2_CRIT_EDGE]] ], [ [[TMP0]], [[IF_END]] ]
>> > +; CHECK-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds
>> double, double* [[TMP3]], i64 [[IDXPROM3_PRE_PHI]]
>> > +; CHECK-NEXT: [[SUB6:%.*]] = fsub double 3.000000e+00, [[TMP2]]
>> > +; CHECK-NEXT: store double [[SUB6]], double* [[ARRAYIDX5]]
>> > +; CHECK-NEXT: br label [[RETURN]]
>> > +; CHECK: sw.default:
>> > +; CHECK-NEXT: br label [[RETURN]]
>> > +; CHECK: return:
>> > +; CHECK-NEXT: [[RETVAL_0:%.*]] = phi double [ 0.000000e+00,
>> [[SW_DEFAULT]] ], [ [[SUB6]], [[SW_BB2]] ], [ [[SUB]], [[IF_THEN]] ]
>> > +; CHECK-NEXT: ret double [[RETVAL_0]]
>> > +;
>> > entry:
>> > switch i32 %stat, label %sw.default [
>> > - i32 0, label %sw.bb
>> > - i32 1, label %sw.bb
>> > - i32 2, label %sw.bb2
>> > + i32 0, label %sw.bb
>> > + i32 1, label %sw.bb
>> > + i32 2, label %sw.bb2
>> > ]
>> >
>> > sw.bb: ; preds = %entry,
>> %entry
>> > @@ -35,11 +74,8 @@ sw.bb2:
>> > %2 = load double*, double** %arrayidx4, align 8
>> > %arrayidx5 = getelementptr inbounds double, double* %2, i64
>> %idxprom3
>> > %3 = load double, double* %arrayidx5, align 8
>> > -; CHECK: sw.bb2:
>> > -; CHECK-NOT: sext
>> > -; CHECK: phi double [
>> > -; CHECK-NOT: load
>> > %sub6 = fsub double 3.000000e+00, %3
>> > + store double %sub6, double* %arrayidx5
>> > br label %return
>> >
>> > sw.default: ; preds = %entry
>> > @@ -55,12 +91,24 @@ return:
>> > ; actually processed. Make sure we can deal with the situation.
>> >
>> > define void @test_shortcut_safe(i1 %tst, i32 %p1, i32* %a) {
>> > -; CHECK-LABEL: define void @test_shortcut_safe
>> > -; CHECK: [[SEXT1:%.*]] = sext i32 %p1 to i64
>> > -; CHECK: [[PHI1:%.*]] = phi i64 [ [[SEXT1]], {{%.*}} ], [
>> [[PHI2:%.*]], {{%.*}} ]
>> > -; CHECK: [[SEXT2:%.*]] = sext i32 %p1 to i64
>> > -; CHECK: [[PHI2]] = phi i64 [ [[SEXT2]], {{.*}} ], [ [[PHI1]],
>> {{%.*}} ]
>> > -; CHECK: getelementptr inbounds i32, i32* %a, i64 [[PHI2]]
>> > +; CHECK-LABEL: @test_shortcut_safe(
>> > +; CHECK-NEXT: br i1 [[TST:%.*]], label [[SEXT1:%.*]], label
>> [[DOTPRE_DEST_CRIT_EDGE:%.*]]
>> > +; CHECK: .pre.dest_crit_edge:
>> > +; CHECK-NEXT: [[DOTPRE1:%.*]] = sext i32 [[P1:%.*]] to i64
>> > +; CHECK-NEXT: br label [[PRE_DEST:%.*]]
>> > +; CHECK: pre.dest:
>> > +; CHECK-NEXT: [[DOTPRE_PRE_PHI:%.*]] = phi i64 [ [[DOTPRE1]],
>> [[DOTPRE_DEST_CRIT_EDGE]] ], [ [[IDXPROM2_PRE_PHI:%.*]],
>> [[SEXT_USE:%.*]] ]
>> > +; CHECK-NEXT: br label [[SEXT_USE]]
>> > +; CHECK: sext1:
>> > +; CHECK-NEXT: [[IDXPROM:%.*]] = sext i32 [[P1]] to i64
>> > +; CHECK-NEXT: br label [[SEXT_USE]]
>> > +; CHECK: sext.use:
>> > +; CHECK-NEXT: [[IDXPROM2_PRE_PHI]] = phi i64 [ [[IDXPROM]],
>> [[SEXT1]] ], [ [[DOTPRE_PRE_PHI]], [[PRE_DEST]] ]
>> > +; CHECK-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i32,
>> i32* [[A:%.*]], i64 [[IDXPROM2_PRE_PHI]]
>> > +; CHECK-NEXT: [[VAL:%.*]] = load i32, i32* [[ARRAYIDX3]], align 4
>> > +; CHECK-NEXT: tail call void @g(i32 [[VAL]])
>> > +; CHECK-NEXT: br label [[PRE_DEST]]
>> > +;
>> >
>> > br i1 %tst, label %sext1, label %pre.dest
>> >
>> >
>> >
>> > _______________________________________________
>> > llvm-commits mailing list
>> > llvm-commits at lists.llvm.org
>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits