[PATCH] D131884: Fix subrange liveness checking at rematerialization

Nicolas Miller via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 15 05:01:22 PDT 2022


npmiller created this revision.
npmiller added reviewers: arsenm, rampitec.
Herald added subscribers: kosarev, kerbowa, hiraditya, tpr, jvesely, MatzeB.
Herald added a project: All.
npmiller requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.

This patch fixes an issue where an instruction reading a whole register would be moved during register allocation into a spot where one of the subregisters was dead.

The code to check whether an instruction can be rematerialized at a given point or not was already checking for subranges to ensure that subregisters are live, but only when the instruction being moved was using a subregister, this patch changes that so the subranges are checked even when the moved instruction uses the full register.

This patch also adds a case to the original test for the subrange checking that trigger the issue described above.

The original subrange checking code was introduced in this revision: https://reviews.llvm.org/D115278

And I've encountered this issue on AMDGPUs while working with DPC++: https://github.com/intel/llvm/issues/6209

Essentially the greedy register allocator attempts to move the following instruction:

  %3961:vreg_64 = V_LSHLREV_B64_e64 3, %3078:vreg_64, implicit $exec

>From `@3440` into the body of a loop `@16312`, but `%3078` has the following live ranges:

  %3078 [2224r,2240r:0)[2240r,3488B:1)[16192B,38336B:1) 0 at 2224r 1 at 2240r  L0000000000000003 [2224r,3440r:0) 0 at 2224r  L000000000000000C [2240r,3488B:0)[16192B,38336B:0) 0 at 2240r

So `@16312e` `%3078.sub1` is alive but `%3078.sub0` is dead, so this instruction being moved there leads to invalid memory accesses as `3078.sub0` ends up being trashed and the result of this instruction is used as part of an address calculation for a load.

On the original ticket this issue showed up on gfx906 and gfx90a but not on gfx908, this turned out to be because on gfx908 instead of moving the shift instruction into the loop, its value is spilled into an ACC register, gfx906 doesn't have ACC registers and for gfx90a ACC registers are used like regular vector registers and so aren't used for spilling.

With this patch the original application from the DPC++ ticket works properly on gfx906, and the result of the shift instruction is correctly spilled instead of moving the instruction in the loop.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D131884

Files:
  llvm/lib/CodeGen/LiveRangeEdit.cpp
  llvm/test/CodeGen/AMDGPU/remat-dead-subreg.mir


Index: llvm/test/CodeGen/AMDGPU/remat-dead-subreg.mir
===================================================================
--- llvm/test/CodeGen/AMDGPU/remat-dead-subreg.mir
+++ llvm/test/CodeGen/AMDGPU/remat-dead-subreg.mir
@@ -79,3 +79,28 @@
     %6:vreg_64 = V_MOV_B64_PSEUDO %2, implicit $exec
     S_NOP 0, implicit %1.sub0, implicit %1.sub3
 ...
+---
+name:            dead_subreg_whole_reg
+tracksRegLiveness: true
+body:             |
+  bb.0.entry:
+    ; GCN-LABEL: name: dead_subreg_whole_reg
+    ; GCN: $m0 = IMPLICIT_DEF
+    ; GCN-NEXT: renamable $sgpr0_sgpr1 = S_MOV_B64 1, implicit $m0
+    ; GCN-NEXT: renamable $sgpr2_sgpr3 = S_MOV_B64 renamable $sgpr0_sgpr1
+    ; GCN-NEXT: SI_SPILL_S64_SAVE killed renamable $sgpr2_sgpr3, %stack.0, implicit $exec, implicit $sp_reg :: (store (s64) into %stack.0, align 4, addrspace 5)
+    ; GCN-NEXT: renamable $sgpr4_sgpr5 = S_MOV_B64 2, implicit $m0
+    ; GCN-NEXT: renamable $sgpr2_sgpr3 = S_MOV_B64 3, implicit $m0
+    ; GCN-NEXT: dead %4:vgpr_32 = V_MOV_B32_e32 $sgpr0, implicit $exec, implicit killed $sgpr4_sgpr5, implicit killed $sgpr2_sgpr3
+    ; GCN-NEXT: renamable $sgpr2_sgpr3 = SI_SPILL_S64_RESTORE %stack.0, implicit $exec, implicit $sp_reg :: (load (s64) from %stack.0, align 4, addrspace 5)
+    ; GCN-NEXT: dead %5:vreg_64 = V_MOV_B64_PSEUDO killed $sgpr2_sgpr3, implicit $exec
+    ; GCN-NEXT: S_NOP 0, implicit killed renamable $sgpr0
+    $m0 = IMPLICIT_DEF
+    %0:sreg_64_xexec = S_MOV_B64 1, implicit $m0
+    %1:sreg_64 = S_MOV_B64 %0:sreg_64_xexec
+    %2:sreg_64 = S_MOV_B64 2, implicit $m0
+    %3:sreg_64 = S_MOV_B64 3, implicit $m0
+    %4:vgpr_32 = V_MOV_B32_e32 %0.sub0:sreg_64_xexec, implicit $exec, implicit %2, implicit %3
+    %5:vreg_64 = V_MOV_B64_PSEUDO %1, implicit $exec
+    S_NOP 0, implicit %0.sub0
+...
Index: llvm/lib/CodeGen/LiveRangeEdit.cpp
===================================================================
--- llvm/lib/CodeGen/LiveRangeEdit.cpp
+++ llvm/lib/CodeGen/LiveRangeEdit.cpp
@@ -134,9 +134,11 @@
       return false;
 
     // Check that subrange is live at UseIdx.
-    if (MO.getSubReg()) {
+    if (li.hasSubRanges()) {
       const TargetRegisterInfo *TRI = MRI.getTargetRegisterInfo();
-      LaneBitmask LM = TRI->getSubRegIndexLaneMask(MO.getSubReg());
+      unsigned SubReg = MO.getSubReg();
+      LaneBitmask LM = SubReg ? TRI->getSubRegIndexLaneMask(SubReg)
+                              : MRI.getMaxLaneMaskForVReg(MO.getReg());
       for (LiveInterval::SubRange &SR : li.subranges()) {
         if ((SR.LaneMask & LM).none())
           continue;


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D131884.452611.patch
Type: text/x-patch
Size: 2581 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220815/53ea9399/attachment.bin>


More information about the llvm-commits mailing list