[PATCH] D87757: [SplitKit] Only copy live lanes
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 16 05:07:26 PDT 2020
foad created this revision.
Herald added subscribers: llvm-commits, kerbowa, hiraditya, tpr, nhaehnle, jvesely, qcolombet.
Herald added a project: LLVM.
foad requested review of this revision.
When splitting a live interval with subranges, only insert copies for
the lanes that are live at the point of the split. This avoids some
unnecessary copies and fixes a problem where copying dead lanes was
generating MIR that failed verification. The test case for this is
test/CodeGen/AMDGPU/splitkit-copy-live-lanes.mir.
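The idea can be sketched with a small standalone model. This is not the SplitKit code from the patch. The names `SubRange`, `SubReg`, `liveLanesAt`, `subRegsToCopy` and `exampleSubRanges432` are hypothetical, slot indexes are reduced to plain integers, and the lane masks 0x03 (.sub0) and 0x30 (.sub2) are read off the interval dumps quoted later in this message.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using LaneMask = std::uint64_t;

// Slots within one instruction, in the order B, e, r, d.
enum Slot : unsigned { Block = 0, EarlyClobber = 1, Register = 2, Dead = 3 };

// Simplified encoding (an assumption of this sketch): "844r" is 844 + Register.
constexpr unsigned slotIndex(unsigned base, Slot slot) { return base + slot; }

// Half-open live segment [start, end).
struct Segment {
  unsigned start;
  unsigned end;
};

// Liveness of one group of lanes of a virtual register.
struct SubRange {
  LaneMask lanes;
  std::vector<Segment> segments;
};

// A subregister index and the lanes it covers.
struct SubReg {
  std::string name;
  LaneMask lanes;
};

bool liveAt(const SubRange &sr, unsigned idx) {
  for (const Segment &seg : sr.segments) {
    assert(seg.start < seg.end && "segments must be non-empty");
    if (seg.start <= idx && idx < seg.end)
      return true;
  }
  return false;
}

// Union of the lane masks of all subranges that are live at idx.
LaneMask liveLanesAt(const std::vector<SubRange> &subRanges, unsigned idx) {
  LaneMask live = 0;
  for (const SubRange &sr : subRanges)
    if (liveAt(sr, idx))
      live |= sr.lanes;
  return live;
}

// Select only the subregisters whose lanes are all live.
std::vector<std::string> subRegsToCopy(const std::vector<SubReg> &subRegs,
                                       LaneMask live) {
  std::vector<std::string> copies;
  for (const SubReg &sr : subRegs)
    if (sr.lanes != 0 && (sr.lanes & live) == sr.lanes)
      copies.push_back(sr.name);
  return copies;
}

// Subranges of %432 as dumped: a single subrange 0x30 covering [256r,844r).
std::vector<SubRange> exampleSubRanges432() {
  SubRange sub2{0x30, {Segment{slotIndex(256, Register),
                               slotIndex(844, Register)}}};
  return {sub2};
}
```

In this model, querying the subranges of %432 at 844B gives the mask 0x30, so only .sub2 is selected for copying and .sub0 is skipped.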
Without this fix, some earlier live range splitting would create %430:
%430 [256r,848r:0)[848r,2584r:1) 0@256r 1@848r L0000000000000003 [848r,2584r:0) 0@848r L0000000000000030 [256r,2584r:0) 0@256r weight:1.480938e-03
...
256B undef %430.sub2:vreg_128 = V_LSHRREV_B32_e32 16, %20.sub1:vreg_128, implicit $exec
...
848B %430.sub0:vreg_128 = V_AND_B32_e32 %92:sreg_32, %20.sub1:vreg_128, implicit $exec
...
2584B %431:vreg_128 = COPY %430:vreg_128
Then RAGreedy::tryLocalSplit would split %430 into %432 and %433 just
before 848B giving:
%432 [256r,844r:0) 0@256r L0000000000000030 [256r,844r:0) 0@256r weight:3.066802e-03
%433 [844r,848r:0)[848r,2584r:1) 0@844r 1@848r L0000000000000030 [844r,2584r:0) 0@844r L0000000000000003 [844r,844d:0)[848r,2584r:1) 0@844r 1@848r weight:2.831776e-03
...
256B undef %432.sub2:vreg_128 = V_LSHRREV_B32_e32 16, %20.sub1:vreg_128, implicit $exec
...
844B undef %433.sub0:vreg_128 = COPY %432.sub0:vreg_128 {
internal %433.sub2:vreg_128 = COPY %432.sub2:vreg_128
}
848B %433.sub0:vreg_128 = V_AND_B32_e32 %92:sreg_32, %20.sub1:vreg_128, implicit $exec
...
2584B %431:vreg_128 = COPY %433:vreg_128
Note that the copy from %432 to %433 at 844B is a curious
bundle-without-a-BUNDLE-instruction that SplitKit creates deliberately.
It includes a copy of .sub0, which is not live at this point, and that
causes it to fail verification:
*** Bad machine code: No live subrange at use ***
- function: zextload_global_v64i16_to_v64i64
- basic block: %bb.0 (0x7faed48) [0B;2848B)
- instruction: 844B undef %433.sub0:vreg_128 = COPY %432.sub0:vreg_128
- operand 1: %432.sub0:vreg_128
- interval: %432 [256r,844r:0) 0@256r L0000000000000030 [256r,844r:0) 0@256r weight:3.066802e-03
- at: 844B
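One way to read this diagnostic: the COPY at 844B reads the lanes of .sub0 (mask 0x03 in the dumps above), but %432 only has a subrange for mask 0x30. The following is a rough standalone sketch of that kind of check, assuming the rule is "at least one of the lanes read must be live at the use". It is not the MachineVerifier implementation, and `checkUse`, `SubRange` and `exampleSubRanges432` are hypothetical names. Slot indexes are simplified to integers, with the register slot written as base + 2.

```cpp
#include <cstdint>
#include <string>
#include <vector>

using LaneMask = std::uint64_t;

// Half-open live segment [start, end) over simplified integer slot indexes.
struct Segment {
  unsigned start;
  unsigned end;
};

struct SubRange {
  LaneMask lanes;
  std::vector<Segment> segments;
};

// Returns an empty string if some lane read by the operand is live at useIdx,
// otherwise the diagnostic text quoted in this message.
std::string checkUse(const std::vector<SubRange> &subRanges,
                     LaneMask lanesRead, unsigned useIdx) {
  LaneMask liveIn = 0;
  for (const SubRange &sr : subRanges) {
    if ((sr.lanes & lanesRead) == 0)
      continue; // this subrange does not describe any lane being read
    for (const Segment &seg : sr.segments)
      if (seg.start <= useIdx && useIdx < seg.end)
        liveIn |= sr.lanes;
  }
  if ((liveIn & lanesRead) == 0)
    return "No live subrange at use";
  return "";
}

// %432 as dumped: one subrange 0x30 covering [256r,844r), i.e. [258, 846).
std::vector<SubRange> exampleSubRanges432() {
  SubRange sub2{0x30, {Segment{256 + 2, 844 + 2}}};
  return {sub2};
}
```

Under these assumptions, a read of mask 0x03 at index 844 is rejected, while a read of mask 0x30 at the same index is accepted.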
Using real bundles with a BUNDLE instruction might also fix this
problem, but the current fix is less invasive and also avoids some
unnecessary copies.
https://bugs.llvm.org/show_bug.cgi?id=47492
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D87757
Files:
llvm/lib/CodeGen/SplitKit.cpp
llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll
llvm/test/CodeGen/AMDGPU/splitkit-copy-live-lanes.mir
llvm/test/CodeGen/AMDGPU/subreg-split-live-in-error.mir