<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p>I've put up <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D26867">https://reviews.llvm.org/D26867</a> for review which
      mostly solves the WZR/XZR spilling issue.  Please take a look when
      you get a chance.</p>
    <p>Thanks,</p>
    <p>-Geoff<br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 11/15/2016 8:19 PM, Quentin Colombet
      wrote:<br>
    </div>
    <blockquote
      cite="mid:454B7A18-FEA9-4EAE-B789-8C4A43DA3438@apple.com"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      In the meantime, I pushed the base version (r287070) given this is
      general goodness.
      <div class=""><br class="">
        <div>
          <blockquote type="cite" class="">
            <div class="">On Nov 15, 2016, at 12:51 PM, Geoff Berry <<a
                moz-do-not-send="true"
                href="mailto:gberry@codeaurora.org" class="">gberry@codeaurora.org</a>>
              wrote:</div>
            <br class="Apple-interchange-newline">
            <div class="">
              <meta content="text/html; charset=utf-8"
                http-equiv="Content-Type" class="">
              <div bgcolor="#FFFFFF" text="#000000" class="">
                <p class="">I think I have a change building on
                  Quentin's patch that catches this case with only a few
                  extra changes.  I should have a rough draft ready for
                  feedback in a day or two.<br class="">
                </p>
                <br class="">
                <div class="moz-cite-prefix">On 11/15/2016 3:46 PM,
                  Matthias Braun wrote:<br class="">
                </div>
                <blockquote
                  cite="mid:86AC25FA-47A7-490B-8C50-260C3117FD10@braunis.de"
                  type="cite" class="">
                  <meta http-equiv="Content-Type" content="text/html;
                    charset=utf-8" class="">
                  <div class="">Ah right,</div>
                  <div class=""><br class="">
                  </div>
                  <div class="">that is indeed a different issue. I had
                    that issue on my TODO list as well (but somewhere
                    deep down). Without actually trying it, I hope we
                    can do that by having a smarter spill creation
                    callback which matches a spill(copy(zero)) pattern.</div>
                  <div class=""><br class="">
                  </div>
                  <div class="">- Matthias</div>
                  <br class="">
                  <div class="">
                    <blockquote type="cite" class="">
                      <div class="">On Nov 15, 2016, at 12:13 PM, Geoff
                        Berry <<a moz-do-not-send="true"
                          href="mailto:gberry@codeaurora.org" class="">gberry@codeaurora.org</a>>
                        wrote:</div>
                      <br class="Apple-interchange-newline">
                      <div class="">
                        <meta content="text/html; charset=utf-8"
                          http-equiv="Content-Type" class="">
                        <div bgcolor="#FFFFFF" text="#000000" class="">
                          <p class="">Hi Matthias,</p>
                          <p class="">Those changes don't catch the
                            particular cases I'm looking at.  The
                            problem is that the xzr/wzr copies don't
                            become candidates for coalescing until the
                            middle of register allocation when some live
                            ranges have been split/spilled.  Here's an
                            example test case:<br class="">
                          </p>
                          declare i32 @bar()<br class="">
                          declare i32 @baz()<br class="">
                          ; Check that the spill of the zero value gets
                          stored directly instead<br class="">
                          ; of being copied from wzr and then stored.<br
                            class="">
                          define i32 @test_zr_spill_copyprop(i1 %c) {<br
                            class="">
                          ; CHECK-LABEL: test_zr_spill_copyprop:<br
                            class="">
                          entry:<br class="">
                            br i1 %c, label %if.else, label %if.then<br
                            class="">
                          <br class="">
                          if.else:<br class="">
                          ; CHECK: bl bar<br class="">
                          ; CHECK-NEXT: str w0, [sp, #[[SLOT:[0-9]+]]]<br
                            class="">
                            %call1 = tail call i32 @bar()<br class="">
                            br label %if.end<br class="">
                          <br class="">
                          if.then:<br class="">
                          ; CHECK: bl baz<br class="">
                          ; CHECK-NEXT: str wzr, [sp, #[[SLOT]]]<br
                            class="">
                            %call2 = tail call i32 @baz()<br class="">
                            br label %if.end<br class="">
                          <br class="">
                          if.end:<br class="">
                            %x.0 = phi i32 [ 0, %if.then ], [ %call1,
                          %if.else ]<br class="">
                            call void asm sideeffect "",
"~{x0},~{x1},~{x2},~{x3},~{x4},~{x5},~{x6},~{x7},~{x8},~{x9},~{x10},~{x11},~{x12},~{x13},~{x14},~{x15},~{x16},~{x17},~{x18},~{x19},~{x20},~{x21},~{x22},~{x23},~{x24},~{x25},~{x26},~{x27},~{x28},~{fp},~{lr},~{sp}"()
                          nounwind<br class="">
                            %x.1 = add i32 %x.0, 1<br class="">
                            ret i32 %x.1<br class="">
                          }<br class="">
                          <br class="">
                          <br class="">
                          <div class="moz-cite-prefix">On 11/15/2016
                            2:58 PM, Matthias Braun wrote:<br class="">
                          </div>
                          <blockquote
                            cite="mid:61E456A4-F98C-43B7-ADE4-AFCF426B1BCB@braunis.de"
                            type="cite" class="">
                            <meta http-equiv="Content-Type"
                              content="text/html; charset=utf-8"
                              class="">
                            <div class="">I would xzr/wzr problems
                              expect to be handled by the changes in <a
                                moz-do-not-send="true"
                                href="https://reviews.llvm.org/D26106"
                                class="">https://reviews.llvm.org/D26106</a> as
                              soon as <a moz-do-not-send="true"
                                href="https://reviews.llvm.org/D26111"
                                class="">https://reviews.llvm.org/D26111</a> is
                              committed.</div>
                            <div class=""><br class="">
                            </div>
                            <div class="">- Matthias</div>
                            <div class=""><br class="">
                            </div>
                            <div class="">
                              <blockquote type="cite" class="">
                                <div class="">On Nov 15, 2016, at 11:01
                                  AM, Geoff Berry <<a
                                    moz-do-not-send="true"
                                    href="mailto:gberry@codeaurora.org"
                                    class="">gberry@codeaurora.org</a>>
                                  wrote:</div>
                                <br class="Apple-interchange-newline">
                                <div class="">
                                  <meta content="text/html;
                                    charset=utf-8"
                                    http-equiv="Content-Type" class="">
                                  <div bgcolor="#FFFFFF" text="#000000"
                                    class="">
                                    <p class="">Hi Quentin,</p>
                                    <p class="">Thanks for taking a
                                      look.  I would greatly prefer a
                                      one line change :)  My original
                                      motivation for looking into this
                                      was a case involving XZR/WZR copys
                                      on aarch64, so I would like to
                                      catch those cases as well.  I'll
                                      take a look into adding constant
                                      reg support to the register
                                      allocator unless you already know
                                      how you'd like to add that
                                      functionality.  Once we have that
                                      I can compare the two approaches
                                      to make sure there aren't more
                                      cases being missed.<br class="">
                                    </p>
                                    <div class="moz-cite-prefix">On
                                      11/14/2016 9:37 PM, Quentin
                                      Colombet wrote:<br class="">
                                    </div>
                                    <blockquote
                                      cite="mid:3B5F4700-B001-437E-A6D3-8DF06F99D7E5@apple.com"
                                      type="cite" class="">
                                      <pre class="" wrap="">Hi Geoff,

I had a closer look at the problem and in particular checked why the “hint fix-up” thing was not catching those cases.
The reason why it didn’t catch them is because we were doing the fix-up only for live-range that regalloc split not all missed hints.
Fixing that seems to solve your motivating cases (but the wzr case because this is a reserved reg and we don’t touch it, though I could improve the code to handle constant reg… let me know if you want me to have a look).

Could you try the attached patch and tell me how it works for you?
Assuming it solves the cases you wanted to catch, I’ll push it and we can drop that patch.

Cheers,
-Quentin
</pre>
                                      <br class="">
                                      <fieldset
                                        class="mimeAttachmentHeader"></fieldset>
                                      <br class="">
                                      <blockquote type="cite" class="">
                                        <pre class="" wrap="">On Nov 10, 2016, at 4:37 PM, Quentin Colombet <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:qcolombet@apple.com"><qcolombet@apple.com></a> wrote:

qcolombet added inline comments.


================
Comment at: lib/CodeGen/VirtRegMap.cpp:158
+// Map from physical regunit to list of def SlotIndexes.
+using PhysDefsMap = DenseMap<unsigned, SmallVector<SlotIndex, 4>>;
+
----------------
For consistency with the rest of LLVM source code I would use typedef.


================
Comment at: lib/CodeGen/VirtRegMap.cpp:158
+// Map from physical regunit to list of def SlotIndexes.
+using PhysDefsMap = DenseMap<unsigned, SmallVector<SlotIndex, 4>>;
+
----------------
qcolombet wrote:
</pre>
                                        <blockquote type="cite" class="">
                                          <pre class="" wrap="">For consistency with the rest of LLVM source code I would use typedef.
</pre>
                                        </blockquote>
                                        <pre class="" wrap="">Having read the code now, I believe it would be more efficient to map PhysReg to LI.
It would greatly simplify add/removeMappedPhysReg.


================
Comment at: lib/CodeGen/VirtRegMap.cpp:221
  AU.addRequired<VirtRegMap>();
+  AU.addRequired<MachineBlockFrequencyInfo>();
+  AU.addPreserved<MachineBlockFrequencyInfo>();
----------------
We could gate that dependency on DisableRenameCopys.
That being said, if we move all that code in its own pass, that's not going to be a problem :).


================
Comment at: lib/CodeGen/VirtRegMap.cpp:296
+      bool FoundDef = false;
+      for (auto DI = DefIndexes.begin(); DI != DefIndexes.end(); ++DI)
+        if (*DI == DefIndex) {
----------------
Seems inefficient to use a vector here.
At the very least it should be sorted.


================
Comment at: lib/CodeGen/VirtRegMap.cpp:336
+                                    const TargetRegisterInfo &TRI,
+                                    const VirtRegMap &VRM) {
+
----------------
Looks to me like this is reimplementing functionalities already available in the LiveRegMatrix class.
Reuse that instead.


================
Comment at: lib/CodeGen/VirtRegMap.cpp:428
+    // Check the cost/benefit of renaming.
+    BlockFrequency DefFreq = MBFI->getBlockFreq(MI->getParent());
+    BlockFrequency Benefit(DefFreq);
----------------
Wasn't the goal to propagate useless copies, i.e., that should always be beneficial, right?

If you want to do more, then please add test cases covering that. Indeed, as far as I can tell, the changes in the test cases only cover the simple case of propagating useless copies.


================
Comment at: lib/CodeGen/VirtRegMap.cpp:491
+    // reg so kill flags computation will be correct.
+    assert(LI.getNumValNums() == 1);
+    SlotIndex SrcDefPrevSlot = LI.getValNumInfo(0)->def.getPrevSlot();
----------------
Add && "Msg" in the assert.

Also explain in the comment before the assert what would it take to extend that to multiple VNs.


<a moz-do-not-send="true" class="moz-txt-link-freetext" href="https://reviews.llvm.org/D25347">https://reviews.llvm.org/D25347</a>



</pre>
                                      </blockquote>
                                    </blockquote>
                                    <br class="">
                                    <pre class="moz-signature" cols="72">-- 
Geoff Berry
Employee of Qualcomm Datacenter Technologies, Inc.
 Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.</pre>
                                  </div>
                                </div>
                              </blockquote>
                            </div>
                            <br class="">
                          </blockquote>
                          <br class="">
                          <pre class="moz-signature" cols="72">-- 
Geoff Berry
Employee of Qualcomm Datacenter Technologies, Inc.
 Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.</pre>
                        </div>
                      </div>
                    </blockquote>
                  </div>
                  <br class="">
                </blockquote>
                <br class="">
                <pre class="moz-signature" cols="72">-- 
Geoff Berry
Employee of Qualcomm Datacenter Technologies, Inc.
 Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.</pre>
              </div>
            </div>
          </blockquote>
        </div>
        <br class="">
      </div>
    </blockquote>
    <br>
    <pre class="moz-signature" cols="72">-- 
Geoff Berry
Employee of Qualcomm Datacenter Technologies, Inc.
 Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.</pre>
  </body>
</html>