<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;
      charset=windows-1252">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>Did a bit more thinking about this, and came up with an example
      of one case where this is a pessimization.  I am specifically not
      asking for any action now, just noting this for later in case we
      find cases where patterns like this matter.  <br>
    </p>
    <p><br>
    </p>
    <p>define i32* @ret_phi(i1 %c, i64 %offset, i32* %p) {<br>
      entry:<br>
        br i1 %c, label %if.then, label %if.else<br>
      <br>
      if.then:<br>
        %addr1 = getelementptr i32, i32* %p, i64 %offset<br>
        store i32 5, i32* %addr1<br>
        br label %return<br>
      <br>
      if.else:<br>
        br label %return<br>
      <br>
      return:<br>
        %addr2 = getelementptr i32, i32* %p, i64 %offset<br>
        ret i32* %addr2<br>
      }<br>
      <br>
    </p>
    <p>Previously, this generated the following x86-64 code:</p>
    <p># %bb.0:                                # %entry<br>
          testb    $1, %dil<br>
          je    .LBB4_2<br>
      # %bb.1:                                # %if.then<br>
          leaq    (%rdx,%rsi,4), %rax<br>
          movl    $5, (%rdx,%rsi,4)<br>
          retq<br>
      .LBB4_2:                                # %if.else<br>
          leaq    (%rdx,%rsi,4), %rax<br>
          retq<br>
    </p>
    <p><br>
    </p>
    <p>Now, it yields:</p>
    <p># %bb.0:                                # %entry<br>
          testb    $1, %dil<br>
          je    .LBB4_2<br>
      # %bb.1:                                # %if.then<br>
          movl    $5, (%rdx,%rsi,4)<br>
      .LBB4_2:                                # %return<br>
          leaq    (%rdx,%rsi,4), %rax<br>
          retq<br>
      <br>
    </p>
    <p>(i.e no tail duplication is applied, and we don't pull out the
      address computation)</p>
    <p><br>
    </p>
    <p>This isn't a major difference in this case, but it is the type of
      thing which *might* be problematic in combination with some more
      complicated pattern we haven't found yet.  It's just enough to
      make me a bit nervous without any clear motivation for an
      objection.  <br>
    </p>
    <p><br>
    </p>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 12/7/18 10:18 AM, Philip Reames
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:3f6cbc62-b21d-4e92-0351-1c487d7109b5@philipreames.com">
      <meta http-equiv="Content-Type" content="text/html;
        charset=windows-1252">
      <p>Well, it looks like I'm wrong here.  As you correctly noted, I
        should have provided a test case which broke.  When trying to
        construct one, I realized that the existing address translation
        code inside performLoadPRE handles the case I had in mind. 
        Worse, I took another look at your patch, and you even call this
        out in comment form.  Sorry for what was clearly a rushed
        response in retrospect.  <br>
      </p>
      <p><br>
      </p>
      <p>I'm still a bit concerned there's a case we haven't thought of
        here, but my level of concern has sharply dropped.  Given that,
        I retract my request for a revert, and we'll identify any
        problems which crop up through normal process.  <br>
      </p>
      <p><br>
      </p>
      <p>Philip<br>
      </p>
      <p><br>
      </p>
      <div class="moz-cite-prefix">On 12/7/18 1:43 AM, Alexandros
        Lamprineas wrote:<br>
      </div>
      <blockquote type="cite"
cite="mid:DB7PR08MB386566AC530BBB9E880DB6D2E2AA0@DB7PR08MB3865.eurprd08.prod.outlook.com">
        <meta http-equiv="Content-Type" content="text/html;
          charset=windows-1252">
        <style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>
        <div id="divtagdefaultwrapper"
style="font-size:12pt;color:#000000;font-family:Calibri,Helvetica,sans-serif;"
          dir="ltr">
          <div id="divtagdefaultwrapper" dir="ltr" style="font-size:
            12pt;&#xA; color: rgb(0, 0, 0); font-family: Calibri,
            Helvetica,&#xA; sans-serif, EmojiFont, "Apple Color
            Emoji",&#xA; "Segoe UI Emoji",
            NotoColorEmoji, "Segoe UI&#xA; Symbol",
            "Android Emoji", EmojiSymbols;">
            <p style="margin-top:0; margin-bottom:0">Hi Philip,</p>
            <p style="margin-top:0; margin-bottom:0"><br>
            </p>
            <p style="margin-top:0; margin-bottom:0">Thanks for the
              feedback. If I understand correct this patch prevents the
              PRE of loads in some cases. Could you give me an example
              (an IR reproducer) that demonstrates the problem? Ideally
              we would want it in the llvm regression tests.</p>
            <p style="margin-top:0; margin-bottom:0"><br>
            </p>
            <p style="margin-top:0; margin-bottom:0">Cheers,</p>
            <p style="margin-top:0; margin-bottom:0">Alexandros</p>
          </div>
          <hr tabindex="-1" style="display:inline-block; width:98%">
          <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt"
              face="Calibri, sans-serif" color="#000000"><b>From:</b>
              Philip Reames <a class="moz-txt-link-rfc2396E"
                href="mailto:listmail@philipreames.com"
                moz-do-not-send="true"><listmail@philipreames.com></a><br>
              <b>Sent:</b> Thursday, December 6, 2018 8:11:08 PM<br>
              <b>To:</b> Alexandros Lamprineas; <a
                class="moz-txt-link-abbreviated"
                href="mailto:llvm-commits@lists.llvm.org"
                moz-do-not-send="true">llvm-commits@lists.llvm.org</a><br>
              <b>Subject:</b> Re: [llvm] r348496 - [GVN] Don't perform
              scalar PRE on GEPs</font>
            <div> </div>
          </div>
          <div class="BodyFragment"><font size="2"><span
                style="font-size:11pt">
                <div class="PlainText">I think this patch is a bad idea
                  and needs to be reverted.<br>
                  <br>
                  Specifically, this will effectively break all LoadPRE
                  for loads from <br>
                  fixed offsets off distinct base objects or where an
                  address is not <br>
                  globally available along all relevant paths.  The code
                  for LoadPRE <br>
                  specifically calls into performScalarPRE for GEPs
                  specifically to make <br>
                  this work.<br>
                  <br>
                  While I understand the desire not to make CGP even
                  more complicated, I <br>
                  think this is a case where doing so is called for.<br>
                  <br>
                  Philip<br>
                  <br>
                  On 12/6/18 8:11 AM, Alexandros Lamprineas via
                  llvm-commits wrote:<br>
                  > Author: alelab01<br>
                  > Date: Thu Dec  6 08:11:58 2018<br>
                  > New Revision: 348496<br>
                  ><br>
                  > URL: <a
                    href="http://llvm.org/viewvc/llvm-project?rev=348496&view=rev"
                    moz-do-not-send="true">http://llvm.org/viewvc/llvm-project?rev=348496&view=rev</a><br>
                  > Log:<br>
                  > [GVN] Don't perform scalar PRE on GEPs<br>
                  ><br>
                  > Partial Redundancy Elimination of GEPs prevents
                  CodeGenPrepare from<br>
                  > sinking the addressing mode computation of memory
                  instructions back<br>
                  > to its uses. The problem comes from the insertion
                  of PHIs, which<br>
                  > confuse CGP and make it bail.<br>
                  ><br>
                  > I've autogenerated the check lines of an existing
                  test and added a<br>
                  > store instruction to demonstrate the motivation
                  behind this change.<br>
                  > The store is now using the gep instead of a phi.<br>
                  ><br>
                  > Differential Revision: <a
                    href="https://reviews.llvm.org/D55009"
                    moz-do-not-send="true">https://reviews.llvm.org/D55009</a><br>
                  ><br>
                  > Modified:<br>
                  >      llvm/trunk/lib/Transforms/Scalar/GVN.cpp<br>
                  >     
                  llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll<br>
                  ><br>
                  > Modified:
                  llvm/trunk/lib/Transforms/Scalar/GVN.cpp<br>
                  > URL: <a
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=348496&r1=348495&r2=348496&view=diff"
                    moz-do-not-send="true">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=348496&r1=348495&r2=348496&view=diff</a><br>
                  >
==============================================================================<br>
                  > --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp
                  (original)<br>
                  > +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Thu
                  Dec  6 08:11:58 2018<br>
                  > @@ -2156,6 +2156,16 @@ bool
                  GVN::performScalarPRE(Instruction *<br>
                  >     if (isa<CmpInst>(CurInst))<br>
                  >       return false;<br>
                  >   <br>
                  > +  // Don't do PRE on GEPs. The inserted PHI
                  would prevent CodeGenPrepare from<br>
                  > +  // sinking the addressing mode computation
                  back to its uses. Extending the<br>
                  > +  // GEP's live range increases the register
                  pressure, and therefore it can<br>
                  > +  // introduce unnecessary spills.<br>
                  > +  //<br>
                  > +  // This doesn't prevent Load PRE. PHI
                  translation will make the GEP available<br>
                  > +  // to the load by moving it to the predecessor
                  block if necessary.<br>
                  > +  if (isa<GetElementPtrInst>(CurInst))<br>
                  > +    return false;<br>
                  > +<br>
                  >     // We don't currently value number ANY inline
                  asm calls.<br>
                  >     if (CallInst *CallI =
                  dyn_cast<CallInst>(CurInst))<br>
                  >       if (CallI->isInlineAsm())<br>
                  ><br>
                  > Modified:
                  llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll<br>
                  > URL: <a
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll?rev=348496&r1=348495&r2=348496&view=diff"
                    moz-do-not-send="true">
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll?rev=348496&r1=348495&r2=348496&view=diff</a><br>
                  >
==============================================================================<br>
                  > ---
                  llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll
                  (original)<br>
                  > +++
                  llvm/trunk/test/Transforms/GVN/PRE/pre-gep-load.ll Thu
                  Dec  6 08:11:58 2018<br>
                  > @@ -1,3 +1,4 @@<br>
                  > +; NOTE: Assertions have been autogenerated by
                  utils/update_test_checks.py<br>
                  >   ; RUN: opt < %s -basicaa -gvn
                  -enable-load-pre -S | FileCheck %s<br>
                  >   ; RUN: opt < %s -aa-pipeline=basic-aa
                  -passes=gvn -enable-load-pre -S | FileCheck %s<br>
                  >   <br>
                  > @@ -6,11 +7,49 @@ target triple =
                  "aarch64--linux-gnu"<br>
                  >   <br>
                  >   define double @foo(i32 %stat, i32 %i, double**
                  %p) {<br>
                  >   ; CHECK-LABEL: @foo(<br>
                  > +; CHECK-NEXT:  entry:<br>
                  > +; CHECK-NEXT:    switch i32 [[STAT:%.*]], label
                  [[SW_DEFAULT:%.*]] [<br>
                  > +; CHECK-NEXT:    i32 0, label [[SW_BB:%.*]]<br>
                  > +; CHECK-NEXT:    i32 1, label [[SW_BB]]<br>
                  > +; CHECK-NEXT:    i32 2, label
                  [[ENTRY_SW_BB2_CRIT_EDGE:%.*]]<br>
                  > +; CHECK-NEXT:    ]<br>
                  > +; CHECK:       entry.sw.bb2_crit_edge:<br>
                  > +; CHECK-NEXT:    [[DOTPRE:%.*]] = load double*,
                  double** [[P:%.*]], align 8<br>
                  > +; CHECK-NEXT:    [[DOTPRE1:%.*]] = sext i32
                  [[I:%.*]] to i64<br>
                  > +; CHECK-NEXT:   
                  [[ARRAYIDX5_PHI_TRANS_INSERT:%.*]] = getelementptr
                  inbounds double, double* [[DOTPRE]], i64 [[DOTPRE1]]<br>
                  > +; CHECK-NEXT:    [[DOTPRE2:%.*]] = load double,
                  double* [[ARRAYIDX5_PHI_TRANS_INSERT]], align 8<br>
                  > +; CHECK-NEXT:    br label [[SW_BB2:%.*]]<br>
                  > +; CHECK:       sw.bb:<br>
                  > +; CHECK-NEXT:    [[IDXPROM:%.*]] = sext i32
                  [[I]] to i64<br>
                  > +; CHECK-NEXT:    [[TMP0:%.*]] = load double*,
                  double** [[P]], align 8<br>
                  > +; CHECK-NEXT:    [[ARRAYIDX1:%.*]] =
                  getelementptr inbounds double, double* [[TMP0]], i64
                  [[IDXPROM]]<br>
                  > +; CHECK-NEXT:    [[TMP1:%.*]] = load double,
                  double* [[ARRAYIDX1]], align 8<br>
                  > +; CHECK-NEXT:    [[SUB:%.*]] = fsub double
                  [[TMP1]], 1.000000e+00<br>
                  > +; CHECK-NEXT:    [[CMP:%.*]] = fcmp olt double
                  [[SUB]], 0.000000e+00<br>
                  > +; CHECK-NEXT:    br i1 [[CMP]], label
                  [[IF_THEN:%.*]], label [[IF_END:%.*]]<br>
                  > +; CHECK:       if.then:<br>
                  > +; CHECK-NEXT:    br label [[RETURN:%.*]]<br>
                  > +; CHECK:       if.end:<br>
                  > +; CHECK-NEXT:    br label [[SW_BB2]]<br>
                  > +; CHECK:       sw.bb2:<br>
                  > +; CHECK-NEXT:    [[TMP2:%.*]] = phi double [
                  [[DOTPRE2]], [[ENTRY_SW_BB2_CRIT_EDGE]] ], [ [[TMP1]],
                  [[IF_END]] ]<br>
                  > +; CHECK-NEXT:    [[IDXPROM3_PRE_PHI:%.*]] = phi
                  i64 [ [[DOTPRE1]], [[ENTRY_SW_BB2_CRIT_EDGE]] ], [
                  [[IDXPROM]], [[IF_END]] ]<br>
                  > +; CHECK-NEXT:    [[TMP3:%.*]] = phi double* [
                  [[DOTPRE]], [[ENTRY_SW_BB2_CRIT_EDGE]] ], [ [[TMP0]],
                  [[IF_END]] ]<br>
                  > +; CHECK-NEXT:    [[ARRAYIDX5:%.*]] =
                  getelementptr inbounds double, double* [[TMP3]], i64
                  [[IDXPROM3_PRE_PHI]]<br>
                  > +; CHECK-NEXT:    [[SUB6:%.*]] = fsub double
                  3.000000e+00, [[TMP2]]<br>
                  > +; CHECK-NEXT:    store double [[SUB6]], double*
                  [[ARRAYIDX5]]<br>
                  > +; CHECK-NEXT:    br label [[RETURN]]<br>
                  > +; CHECK:       sw.default:<br>
                  > +; CHECK-NEXT:    br label [[RETURN]]<br>
                  > +; CHECK:       return:<br>
                  > +; CHECK-NEXT:    [[RETVAL_0:%.*]] = phi double [
                  0.000000e+00, [[SW_DEFAULT]] ], [ [[SUB6]], [[SW_BB2]]
                  ], [ [[SUB]], [[IF_THEN]] ]<br>
                  > +; CHECK-NEXT:    ret double [[RETVAL_0]]<br>
                  > +;<br>
                  >   entry:<br>
                  >     switch i32 %stat, label %sw.default [<br>
                  > -    i32 0, label %sw.bb<br>
                  > -    i32 1, label %sw.bb<br>
                  > -    i32 2, label %sw.bb2<br>
                  > +  i32 0, label %sw.bb<br>
                  > +  i32 1, label %sw.bb<br>
                  > +  i32 2, label %sw.bb2<br>
                  >     ]<br>
                  >   <br>
                  >  
                  sw.bb:                                            ;
                  preds = %entry, %entry<br>
                  > @@ -35,11 +74,8 @@ sw.bb2:<br>
                  >     %2 = load double*, double** %arrayidx4, align
                  8<br>
                  >     %arrayidx5 = getelementptr inbounds double,
                  double* %2, i64 %idxprom3<br>
                  >     %3 = load double, double* %arrayidx5, align 8<br>
                  > -; CHECK: sw.bb2:<br>
                  > -; CHECK-NOT: sext<br>
                  > -; CHECK: phi double [<br>
                  > -; CHECK-NOT: load<br>
                  >     %sub6 = fsub double 3.000000e+00, %3<br>
                  > +  store double %sub6, double* %arrayidx5<br>
                  >     br label %return<br>
                  >   <br>
                  >  
                  sw.default:                                       ;
                  preds = %entry<br>
                  > @@ -55,12 +91,24 @@ return:<br>
                  >   ; actually processed. Make sure we can deal
                  with the situation.<br>
                  >   <br>
                  >   define void @test_shortcut_safe(i1 %tst, i32
                  %p1, i32* %a) {<br>
                  > -; CHECK-LABEL: define void @test_shortcut_safe<br>
                  > -; CHECK: [[SEXT1:%.*]] = sext i32 %p1 to i64<br>
                  > -; CHECK: [[PHI1:%.*]] = phi i64 [ [[SEXT1]],
                  {{%.*}} ], [ [[PHI2:%.*]], {{%.*}} ]<br>
                  > -; CHECK: [[SEXT2:%.*]] = sext i32 %p1 to i64<br>
                  > -; CHECK: [[PHI2]] = phi i64 [ [[SEXT2]], {{.*}}
                  ], [ [[PHI1]], {{%.*}} ]<br>
                  > -; CHECK: getelementptr inbounds i32, i32* %a,
                  i64 [[PHI2]]<br>
                  > +; CHECK-LABEL: @test_shortcut_safe(<br>
                  > +; CHECK-NEXT:    br i1 [[TST:%.*]], label
                  [[SEXT1:%.*]], label [[DOTPRE_DEST_CRIT_EDGE:%.*]]<br>
                  > +; CHECK:       .pre.dest_crit_edge:<br>
                  > +; CHECK-NEXT:    [[DOTPRE1:%.*]] = sext i32
                  [[P1:%.*]] to i64<br>
                  > +; CHECK-NEXT:    br label [[PRE_DEST:%.*]]<br>
                  > +; CHECK:       pre.dest:<br>
                  > +; CHECK-NEXT:    [[DOTPRE_PRE_PHI:%.*]] = phi
                  i64 [ [[DOTPRE1]], [[DOTPRE_DEST_CRIT_EDGE]] ], [
                  [[IDXPROM2_PRE_PHI:%.*]], [[SEXT_USE:%.*]] ]<br>
                  > +; CHECK-NEXT:    br label [[SEXT_USE]]<br>
                  > +; CHECK:       sext1:<br>
                  > +; CHECK-NEXT:    [[IDXPROM:%.*]] = sext i32
                  [[P1]] to i64<br>
                  > +; CHECK-NEXT:    br label [[SEXT_USE]]<br>
                  > +; CHECK:       sext.use:<br>
                  > +; CHECK-NEXT:    [[IDXPROM2_PRE_PHI]] = phi i64
                  [ [[IDXPROM]], [[SEXT1]] ], [ [[DOTPRE_PRE_PHI]],
                  [[PRE_DEST]] ]<br>
                  > +; CHECK-NEXT:    [[ARRAYIDX3:%.*]] =
                  getelementptr inbounds i32, i32* [[A:%.*]], i64
                  [[IDXPROM2_PRE_PHI]]<br>
                  > +; CHECK-NEXT:    [[VAL:%.*]] = load i32, i32*
                  [[ARRAYIDX3]], align 4<br>
                  > +; CHECK-NEXT:    tail call void @g(i32 [[VAL]])<br>
                  > +; CHECK-NEXT:    br label [[PRE_DEST]]<br>
                  > +;<br>
                  >   <br>
                  >     br i1 %tst, label %sext1, label %pre.dest<br>
                  >   <br>
                  ><br>
                  ><br>
                  > _______________________________________________<br>
                  > llvm-commits mailing list<br>
                  > <a class="moz-txt-link-abbreviated"
                    href="mailto:llvm-commits@lists.llvm.org"
                    moz-do-not-send="true">llvm-commits@lists.llvm.org</a><br>
                  > <a
                    href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits"
                    moz-do-not-send="true">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><br>
                </div>
              </span></font></div>
        </div>
        IMPORTANT NOTICE: The contents of this email and any attachments
        are confidential and may also be privileged. If you are not the
        intended recipient, please notify the sender immediately and do
        not disclose the contents to any other person, use it for any
        purpose, or store or copy the information in any medium. Thank
        you. </blockquote>
    </blockquote>
  </body>
</html>