[llvm] r348496 - [GVN] Don't perform scalar PRE on GEPs
Philip Reames via llvm-commits
llvm-commits at lists.llvm.org
Fri Dec 7 11:29:17 PST 2018
Did a bit more thinking about this, and came up with an example of one
case where this is a pessimization. I am specifically not asking for
any action now, just noting this for later in case we find cases where
patterns like this matter.
define i32* @ret_phi(i1 %c, i64 %offset, i32* %p) {
entry:
  br i1 %c, label %if.then, label %if.else
if.then:
  %addr1 = getelementptr i32, i32* %p, i64 %offset
  store i32 5, i32* %addr1
  br label %return
if.else:
  br label %return
return:
  %addr2 = getelementptr i32, i32* %p, i64 %offset
  ret i32* %addr2
}
Previously, this generated the following x86-64 code:
# %bb.0:                        # %entry
        testb   $1, %dil
        je      .LBB4_2
# %bb.1:                        # %if.then
        leaq    (%rdx,%rsi,4), %rax
        movl    $5, (%rdx,%rsi,4)
        retq
.LBB4_2:                        # %if.else
        leaq    (%rdx,%rsi,4), %rax
        retq
Now, it yields:
# %bb.0:                        # %entry
        testb   $1, %dil
        je      .LBB4_2
# %bb.1:                        # %if.then
        movl    $5, (%rdx,%rsi,4)
.LBB4_2:                        # %return
        leaq    (%rdx,%rsi,4), %rax
        retq
(i.e. no tail duplication is applied, and we don't duplicate the address
computation into each predecessor)
This isn't a major difference in this case, but it is the type of thing
which *might* be problematic in combination with some more complicated
pattern we haven't found yet. It's just enough to make me a bit nervous,
without giving me clear grounds for an objection.
On 12/7/18 10:18 AM, Philip Reames wrote:
>
> Well, it looks like I'm wrong here. As you correctly noted, I should
> have provided a test case which broke. When trying to construct one,
> I realized that the existing address translation code inside
> performLoadPRE handles the case I had in mind. Worse, I took another
> look at your patch, and you even call this out in comment form. Sorry
> for what was clearly a rushed response in retrospect.
>
>
> I'm still a bit concerned there's a case we haven't thought of here,
> but my level of concern has sharply dropped. Given that, I retract my
> request for a revert, and we'll identify any problems which crop up
> through normal process.
>
>
> Philip
>
>
> On 12/7/18 1:43 AM, Alexandros Lamprineas wrote:
>>
>> Hi Philip,
>>
>>
>> Thanks for the feedback. If I understand correctly, this patch
>> prevents the PRE of loads in some cases. Could you give me an example
>> (an IR reproducer) that demonstrates the problem? Ideally we would
>> want it in the llvm regression tests.
>>
>>
>> Cheers,
>>
>> Alexandros
>>
>> ------------------------------------------------------------------------
>> *From:* Philip Reames <listmail at philipreames.com>
>> *Sent:* Thursday, December 6, 2018 8:11:08 PM
>> *To:* Alexandros Lamprineas; llvm-commits at lists.llvm.org
>> *Subject:* Re: [llvm] r348496 - [GVN] Don't perform scalar PRE on GEPs
>> I think this patch is a bad idea and needs to be reverted.
>>
>> Specifically, this will effectively break all LoadPRE for loads from
>> fixed offsets off distinct base objects, or where an address is not
>> globally available along all relevant paths. The LoadPRE code calls
>> into performScalarPRE for GEPs specifically to make this work.
>>
>> While I understand the desire not to make CGP even more complicated, I
>> think this is a case where doing so is called for.
>>
>> Philip
>>
>> On 12/6/18 8:11 AM, Alexandros Lamprineas via llvm-commits wrote:
>> > Author: alelab01
>> > Date: Thu Dec 6 08:11:58 2018
>> > New Revision: 348496
>> >
>> > URL: http://llvm.org/viewvc/llvm-project?rev=348496&view=rev
>> > Log:
>> > [GVN] Don't perform scalar PRE on GEPs
>> >
>> > Partial Redundancy Elimination of GEPs prevents CodeGenPrepare from
>> > sinking the addressing mode computation of memory instructions back
>> > to its uses. The problem comes from the insertion of PHIs, which
>> > confuse CGP and make it bail.
>> >
>> > I've autogenerated the check lines of an existing test and added a
>> > store instruction to demonstrate the motivation behind this change.
>> > The store is now using the gep instead of a phi.
>> >
>> > Differential Revision: https://reviews.llvm.org/D55009
>> >
>> > Modified:
>> > llvm/trunk/lib/Transforms/Scalar/GVN.cpp
>> > llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll
>> >
>> > Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp
>> > URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=348496&r1=348495&r2=348496&view=diff
>> >
>> ==============================================================================
>> > --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original)
>> > +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Thu Dec 6 08:11:58 2018
>> > @@ -2156,6 +2156,16 @@ bool GVN::performScalarPRE(Instruction *
>> > if (isa<CmpInst>(CurInst))
>> > return false;
>> >
>> > + // Don't do PRE on GEPs. The inserted PHI would prevent CodeGenPrepare
>> > + // from sinking the addressing mode computation back to its uses.
>> > + // Extending the GEP's live range increases the register pressure, and
>> > + // therefore it can introduce unnecessary spills.
>> > + //
>> > + // This doesn't prevent Load PRE. PHI translation will make the GEP
>> > + // available to the load by moving it to the predecessor block if
>> > + // necessary.
>> > + if (isa<GetElementPtrInst>(CurInst))
>> > + return false;
>> > +
>> > // We don't currently value number ANY inline asm calls.
>> > if (CallInst *CallI = dyn_cast<CallInst>(CurInst))
>> > if (CallI->isInlineAsm())
>> >
>> > Modified: llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll
>> > URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll?rev=348496&r1=348495&r2=348496&view=diff
>> >
>> ==============================================================================
>> > --- llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll (original)
>> > +++ llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll Thu Dec 6
>> 08:11:58 2018
>> > @@ -1,3 +1,4 @@
>> > +; NOTE: Assertions have been autogenerated by
>> utils/update_test_checks.py
>> > ; RUN: opt < %s -basicaa -gvn -enable-load-pre -S | FileCheck %s
>> > ; RUN: opt < %s -aa-pipeline=basic-aa -passes=gvn
>> -enable-load-pre -S | FileCheck %s
>> >
>> > @@ -6,11 +7,49 @@ target triple = "aarch64--linux-gnu"
>> >
>> > define double @foo(i32 %stat, i32 %i, double** %p) {
>> > ; CHECK-LABEL: @foo(
>> > +; CHECK-NEXT: entry:
>> > +; CHECK-NEXT: switch i32 [[STAT:%.*]], label [[SW_DEFAULT:%.*]] [
>> > +; CHECK-NEXT: i32 0, label [[SW_BB:%.*]]
>> > +; CHECK-NEXT: i32 1, label [[SW_BB]]
>> > +; CHECK-NEXT: i32 2, label [[ENTRY_SW_BB2_CRIT_EDGE:%.*]]
>> > +; CHECK-NEXT: ]
>> > +; CHECK: entry.sw.bb2_crit_edge:
>> > +; CHECK-NEXT: [[DOTPRE:%.*]] = load double*, double**
>> [[P:%.*]], align 8
>> > +; CHECK-NEXT: [[DOTPRE1:%.*]] = sext i32 [[I:%.*]] to i64
>> > +; CHECK-NEXT: [[ARRAYIDX5_PHI_TRANS_INSERT:%.*]] = getelementptr
>> inbounds double, double* [[DOTPRE]], i64 [[DOTPRE1]]
>> > +; CHECK-NEXT: [[DOTPRE2:%.*]] = load double, double*
>> [[ARRAYIDX5_PHI_TRANS_INSERT]], align 8
>> > +; CHECK-NEXT: br label [[SW_BB2:%.*]]
>> > +; CHECK: sw.bb:
>> > +; CHECK-NEXT: [[IDXPROM:%.*]] = sext i32 [[I]] to i64
>> > +; CHECK-NEXT: [[TMP0:%.*]] = load double*, double** [[P]], align 8
>> > +; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds
>> double, double* [[TMP0]], i64 [[IDXPROM]]
>> > +; CHECK-NEXT: [[TMP1:%.*]] = load double, double*
>> [[ARRAYIDX1]], align 8
>> > +; CHECK-NEXT: [[SUB:%.*]] = fsub double [[TMP1]], 1.000000e+00
>> > +; CHECK-NEXT: [[CMP:%.*]] = fcmp olt double [[SUB]], 0.000000e+00
>> > +; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label
>> [[IF_END:%.*]]
>> > +; CHECK: if.then:
>> > +; CHECK-NEXT: br label [[RETURN:%.*]]
>> > +; CHECK: if.end:
>> > +; CHECK-NEXT: br label [[SW_BB2]]
>> > +; CHECK: sw.bb2:
>> > +; CHECK-NEXT: [[TMP2:%.*]] = phi double [ [[DOTPRE2]],
>> [[ENTRY_SW_BB2_CRIT_EDGE]] ], [ [[TMP1]], [[IF_END]] ]
>> > +; CHECK-NEXT: [[IDXPROM3_PRE_PHI:%.*]] = phi i64 [ [[DOTPRE1]],
>> [[ENTRY_SW_BB2_CRIT_EDGE]] ], [ [[IDXPROM]], [[IF_END]] ]
>> > +; CHECK-NEXT: [[TMP3:%.*]] = phi double* [ [[DOTPRE]],
>> [[ENTRY_SW_BB2_CRIT_EDGE]] ], [ [[TMP0]], [[IF_END]] ]
>> > +; CHECK-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds
>> double, double* [[TMP3]], i64 [[IDXPROM3_PRE_PHI]]
>> > +; CHECK-NEXT: [[SUB6:%.*]] = fsub double 3.000000e+00, [[TMP2]]
>> > +; CHECK-NEXT: store double [[SUB6]], double* [[ARRAYIDX5]]
>> > +; CHECK-NEXT: br label [[RETURN]]
>> > +; CHECK: sw.default:
>> > +; CHECK-NEXT: br label [[RETURN]]
>> > +; CHECK: return:
>> > +; CHECK-NEXT: [[RETVAL_0:%.*]] = phi double [ 0.000000e+00,
>> [[SW_DEFAULT]] ], [ [[SUB6]], [[SW_BB2]] ], [ [[SUB]], [[IF_THEN]] ]
>> > +; CHECK-NEXT: ret double [[RETVAL_0]]
>> > +;
>> > entry:
>> > switch i32 %stat, label %sw.default [
>> > - i32 0, label %sw.bb
>> > - i32 1, label %sw.bb
>> > - i32 2, label %sw.bb2
>> > + i32 0, label %sw.bb
>> > + i32 1, label %sw.bb
>> > + i32 2, label %sw.bb2
>> > ]
>> >
>> > sw.bb: ; preds = %entry,
>> %entry
>> > @@ -35,11 +74,8 @@ sw.bb2:
>> > %2 = load double*, double** %arrayidx4, align 8
>> > %arrayidx5 = getelementptr inbounds double, double* %2, i64
>> %idxprom3
>> > %3 = load double, double* %arrayidx5, align 8
>> > -; CHECK: sw.bb2:
>> > -; CHECK-NOT: sext
>> > -; CHECK: phi double [
>> > -; CHECK-NOT: load
>> > %sub6 = fsub double 3.000000e+00, %3
>> > + store double %sub6, double* %arrayidx5
>> > br label %return
>> >
>> > sw.default: ; preds = %entry
>> > @@ -55,12 +91,24 @@ return:
>> > ; actually processed. Make sure we can deal with the situation.
>> >
>> > define void @test_shortcut_safe(i1 %tst, i32 %p1, i32* %a) {
>> > -; CHECK-LABEL: define void @test_shortcut_safe
>> > -; CHECK: [[SEXT1:%.*]] = sext i32 %p1 to i64
>> > -; CHECK: [[PHI1:%.*]] = phi i64 [ [[SEXT1]], {{%.*}} ], [
>> [[PHI2:%.*]], {{%.*}} ]
>> > -; CHECK: [[SEXT2:%.*]] = sext i32 %p1 to i64
>> > -; CHECK: [[PHI2]] = phi i64 [ [[SEXT2]], {{.*}} ], [ [[PHI1]],
>> {{%.*}} ]
>> > -; CHECK: getelementptr inbounds i32, i32* %a, i64 [[PHI2]]
>> > +; CHECK-LABEL: @test_shortcut_safe(
>> > +; CHECK-NEXT: br i1 [[TST:%.*]], label [[SEXT1:%.*]], label
>> [[DOTPRE_DEST_CRIT_EDGE:%.*]]
>> > +; CHECK: .pre.dest_crit_edge:
>> > +; CHECK-NEXT: [[DOTPRE1:%.*]] = sext i32 [[P1:%.*]] to i64
>> > +; CHECK-NEXT: br label [[PRE_DEST:%.*]]
>> > +; CHECK: pre.dest:
>> > +; CHECK-NEXT: [[DOTPRE_PRE_PHI:%.*]] = phi i64 [ [[DOTPRE1]],
>> [[DOTPRE_DEST_CRIT_EDGE]] ], [ [[IDXPROM2_PRE_PHI:%.*]],
>> [[SEXT_USE:%.*]] ]
>> > +; CHECK-NEXT: br label [[SEXT_USE]]
>> > +; CHECK: sext1:
>> > +; CHECK-NEXT: [[IDXPROM:%.*]] = sext i32 [[P1]] to i64
>> > +; CHECK-NEXT: br label [[SEXT_USE]]
>> > +; CHECK: sext.use:
>> > +; CHECK-NEXT: [[IDXPROM2_PRE_PHI]] = phi i64 [ [[IDXPROM]],
>> [[SEXT1]] ], [ [[DOTPRE_PRE_PHI]], [[PRE_DEST]] ]
>> > +; CHECK-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i32,
>> i32* [[A:%.*]], i64 [[IDXPROM2_PRE_PHI]]
>> > +; CHECK-NEXT: [[VAL:%.*]] = load i32, i32* [[ARRAYIDX3]], align 4
>> > +; CHECK-NEXT: tail call void @g(i32 [[VAL]])
>> > +; CHECK-NEXT: br label [[PRE_DEST]]
>> > +;
>> >
>> > br i1 %tst, label %sext1, label %pre.dest
>> >
>> >
>> >
>> > _______________________________________________
>> > llvm-commits mailing list
>> > llvm-commits at lists.llvm.org
>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits