[PATCH] D88020: [SplitKit] In addDeadDef tolerate parent range that defines more lanes
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 21 05:28:07 PDT 2020
foad added a comment.
In the test case, `RAGreedy::tryLocalSplit` splits %72 at 7376r. First we have this:
%72 [1072r,7376r:0)[7376r,7440r:1) 0@1072r 1@7376r L000000000000000C [1072r,1072d:0)[7376r,7440r:1) 0@1072r 1@7376r L00000000000000F3 [1072r,7440r:0) 0@1072r weight:5.969267e-04
...
1072B %72:sgpr_128 = COPY %71:sgpr_128
...
7376B %72.sub1:sgpr_128 = COPY %912:sreg_32_xm0_xexec
7440B %558:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM %72:sgpr_128, 0, 0, 0 :: (dereferenceable invariant load 4)
Then `SplitEditor::enterIntvBefore` creates %914 and inserts the bundled copies (annoyingly, the SlotIndexes get renumbered at this point):
%72 [1072r,7388r:0)[7388r,7452r:1) 0@1072r 1@7388r L000000000000000C [1072r,1072d:0)[7388r,7452r:1) 0@1072r 1@7388r L00000000000000F3 [1072r,7452r:0) 0@1072r weight:5.969267e-04
%914 EMPTY 0@7380r L00000000000000F0 [7380r,7380d:0) 0@7380r L0000000000000003 [7380r,7380d:0) 0@7380r L000000000000000C EMPTY weight:0.000000e+00
...
1072B %72:sgpr_128 = COPY %71:sgpr_128
...
7380B undef %914.sub2_sub3:sgpr_128 = COPY %72.sub2_sub3:sgpr_128 {
internal %914.sub0:sgpr_128 = COPY %72.sub0:sgpr_128
7388B }
%72.sub1:sgpr_128 = COPY %912:sreg_32_xm0_xexec
7452B %558:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM %72:sgpr_128, 0, 0, 0 :: (dereferenceable invariant load 4)
Then `SplitEditor::finish` tries to add dead defs for the %914 subranges L00000000000000F0 and L0000000000000003, but fails to find matching lane masks in the original live ranges for %72, which cover those lanes with a single subrange, L00000000000000F3.
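The mismatch can be illustrated with plain bit masks. The following is a minimal standalone sketch, not LLVM code: `LaneMask`, `findExact` and `findCovering` are hypothetical names introduced only for illustration, and the mask values are the ones from the dumps above. It assumes the lookup is done per lane mask over the parent's subranges; how `addDeadDef` actually performs the lookup is not shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a lane mask; not the LLVM type.
using LaneMask = std::uint64_t;

// Index of the parent subrange whose mask is exactly M, if there is one.
std::optional<std::size_t> findExact(const std::vector<LaneMask> &Parent,
                                     LaneMask M) {
  for (std::size_t I = 0; I < Parent.size(); ++I)
    if (Parent[I] == M)
      return I;
  return std::nullopt;
}

// Index of the first parent subrange that defines at least the lanes in M,
// i.e. the parent mask may contain more lanes than M does.
std::optional<std::size_t> findCovering(const std::vector<LaneMask> &Parent,
                                        LaneMask M) {
  for (std::size_t I = 0; I < Parent.size(); ++I)
    if ((Parent[I] & M) == M)
      return I;
  return std::nullopt;
}
```

With the parent masks `{0x0C, 0xF3}` from %72, `findExact` finds nothing for `0xF0` or `0x03`, while `findCovering` returns the `0xF3` subrange for both, since 0xF3 = 0xF0 | 0x03.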
As an alternative fix, I wonder if //something// should have spotted that the live ranges for %914:
%914 EMPTY 0@7380r L00000000000000F0 [7380r,7380d:0) 0@7380r L0000000000000003 [7380r,7380d:0) 0@7380r L000000000000000C EMPTY weight:0.000000e+00
should be simplified to:
%914 EMPTY 0@7380r L00000000000000F3 [7380r,7380d:0) 0@7380r L000000000000000C EMPTY weight:0.000000e+00
since the subranges for L00000000000000F0 and L0000000000000003 are identical.
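The simplification suggested above could look roughly like the following. This is a standalone sketch, not LLVM code: `SubRange`, `Segment` and `mergeIdentical` are hypothetical names, slot indexes are kept as strings such as "7380r", and value numbers are ignored, so it only compares segment lists. Whether such a merge would be valid or worthwhile inside LLVM is exactly the open question.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a live subrange: a lane mask plus [start, end) segments.
using LaneMask = std::uint64_t;
using Segment = std::pair<std::string, std::string>;

struct SubRange {
  LaneMask Mask;
  std::vector<Segment> Segs;
};

// Merge subranges whose segment lists are identical by OR-ing their lane
// masks. The first occurrence keeps its position in the output.
std::vector<SubRange> mergeIdentical(std::vector<SubRange> In) {
  std::vector<SubRange> Out;
  for (SubRange &S : In) {
    auto It = std::find_if(Out.begin(), Out.end(), [&](const SubRange &O) {
      return O.Segs == S.Segs;
    });
    if (It != Out.end())
      It->Mask |= S.Mask; // same liveness: fold the lanes together
    else
      Out.push_back(std::move(S));
  }
  return Out;
}
```

Applied to the three %914 subranges (0xF0 and 0x03 with the segment [7380r,7380d), and 0x0C empty), this yields two subranges: 0xF3 with that segment, and 0x0C still empty.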
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D88020/new/
https://reviews.llvm.org/D88020