[PATCH] D88020: [SplitKit] In addDeadDef tolerate parent range that defines more lanes

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 21 05:28:07 PDT 2020


foad added a comment.

In the test case, `RAGreedy::tryLocalSplit` splits %72 at 7376r. First we have this:

  %72 [1072r,7376r:0)[7376r,7440r:1)  0@1072r 1@7376r L000000000000000C [1072r,1072d:0)[7376r,7440r:1)  0@1072r 1@7376r L00000000000000F3 [1072r,7440r:0)  0@1072r weight:5.969267e-04
  ...
  1072B	  %72:sgpr_128 = COPY %71:sgpr_128
  ...
  7376B	  %72.sub1:sgpr_128 = COPY %912:sreg_32_xm0_xexec
  7440B	  %558:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM %72:sgpr_128, 0, 0, 0 :: (dereferenceable invariant load 4)

Then `SplitEditor::enterIntvBefore` creates %914 and inserts the bundled copies (annoyingly, the SlotIndexes get renumbered at this point):

  %72 [1072r,7388r:0)[7388r,7452r:1)  0@1072r 1@7388r L000000000000000C [1072r,1072d:0)[7388r,7452r:1)  0@1072r 1@7388r L00000000000000F3 [1072r,7452r:0)  0@1072r weight:5.969267e-04
  %914 EMPTY  0@7380r L00000000000000F0 [7380r,7380d:0)  0@7380r L0000000000000003 [7380r,7380d:0)  0@7380r L000000000000000C EMPTY weight:0.000000e+00
  ...
  1072B	  %72:sgpr_128 = COPY %71:sgpr_128
  ...
  7380B	  undef %914.sub2_sub3:sgpr_128 = COPY %72.sub2_sub3:sgpr_128 {
  	    internal %914.sub0:sgpr_128 = COPY %72.sub0:sgpr_128
  	  }
  7388B	  %72.sub1:sgpr_128 = COPY %912:sreg_32_xm0_xexec
  7452B	  %558:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM %72:sgpr_128, 0, 0, 0 :: (dereferenceable invariant load 4)

Then `SplitEditor::finish` tries to add dead defs for the %914 subranges L00000000000000F0 and L0000000000000003, but fails to find matching lane masks among the subranges of the original %72, which covers those lanes with a single subrange, L00000000000000F3.
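
To make the mismatch concrete, here is a minimal sketch in Python. It is not the SplitKit code: the lane masks are modelled as plain integers, and `find_exact` and `find_covering` are hypothetical helpers invented for illustration. It only shows that an exact-match lookup finds nothing for F0 or 03, while a lookup that tolerates a parent subrange defining more lanes finds F3 for both.

```python
# Lane masks taken from the dumps above, written as plain integers.
PARENT_MASKS = [0x0C, 0xF3]  # subranges of the original %72
NEW_MASKS = [0xF0, 0x03]     # subranges of %914 that need a dead def


def find_exact(parent_masks, mask):
    """Return the parent mask equal to `mask`, or None if there is none."""
    return next((m for m in parent_masks if m == mask), None)


def find_covering(parent_masks, mask):
    """Return the first parent mask containing every lane in `mask`, or None."""
    return next((m for m in parent_masks if m & mask == mask), None)


for mask in NEW_MASKS:
    exact = find_exact(PARENT_MASKS, mask)
    covering = find_covering(PARENT_MASKS, mask)
    print(hex(mask), "exact:", exact, "covering:", covering and hex(covering))
```

The covering test is the usual subset check on bitmasks: `m & mask == mask` holds exactly when every lane set in `mask` is also set in `m`.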

As an alternative fix, I wonder if //something// should have spotted that the live ranges for %914:

  %914 EMPTY  0@7380r L00000000000000F0 [7380r,7380d:0)  0@7380r L0000000000000003 [7380r,7380d:0)  0@7380r L000000000000000C EMPTY weight:0.000000e+00

should be simplified to:

  %914 EMPTY  0@7380r L00000000000000F3 [7380r,7380d:0)  0@7380r L000000000000000C EMPTY weight:0.000000e+00

since the subranges for L00000000000000F0 and L0000000000000003 are identical.
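
As a rough illustration of that simplification (a sketch only, not an existing LLVM pass or API), the following Python models each subrange as a lane mask plus a list of segments and ORs together the masks of subranges whose segments are identical. The function name `merge_identical_subranges` and the tuple representation are assumptions made for this example.

```python
def merge_identical_subranges(subranges):
    """Merge subranges whose segment lists are identical.

    `subranges` is a list of (lane_mask, segments) pairs, where `segments`
    is a list of (start, end, value_number) tuples. Subranges with equal
    segments are combined by OR-ing their lane masks. The order of first
    appearance is preserved.
    """
    merged = {}
    for mask, segments in subranges:
        key = tuple(segments)
        merged[key] = merged.get(key, 0) | mask
    return [(mask, list(key)) for key, mask in merged.items()]


# The subranges of %914 from the dump above.
subranges_914 = [
    (0xF0, [("7380r", "7380d", 0)]),
    (0x03, [("7380r", "7380d", 0)]),
    (0x0C, []),  # EMPTY
]

for mask, segments in merge_identical_subranges(subranges_914):
    print(hex(mask), segments)
```

A real implementation would also have to check that the value numbers of the merged subranges agree (here both are 0@7380r), which this sketch folds into the segment tuples.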


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88020/new/

https://reviews.llvm.org/D88020
