[PATCH] D21189: Create subranges for new intervals resulting from live interval splitting

Krzysztof Parzyszek via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 12 12:27:18 PDT 2016


kparzysz marked 5 inline comments as done.
kparzysz added a comment.

Regarding the "shrinkToUses" in RenameIndependentSubregs.cpp, here's the failing testcase (a lit test from AMDGPU): vreg12 is the subject to optimization here.

  # *** IR Dump Before Rename Disconnected Subregister Components ***:
  # Machine code for function insertelement_v3f32_1: Properties: <Post SSA, tracking liveness, HasVRegs>
  Function Live Ins: %SGPR0_SGPR1 in %vreg0
  
  0B      BB#0: derived from LLVM BB %0
              Live Ins: %SGPR0_SGPR1
  16B             %vreg0<def> = COPY %SGPR0_SGPR1; SReg_64:%vreg0
  32B             %vreg10:sub0_sub1<def,read-undef> = S_LOAD_DWORDX2_IMM %vreg0, 9; mem:LD8[undef(addrspace=2)](nontemporal)(invariant) SReg_128:%vreg10 SReg_64:%vreg0
  48B             %vreg5<def> = S_LOAD_DWORDX4_IMM %vreg0, 13; mem:LD16[undef(addrspace=2)](nontemporal)(invariant) SReg_128:%vreg5 SReg_64:%vreg0
  96B             %vreg10:sub3<def> = S_MOV_B32 61440; SReg_128:%vreg10
  112B            %vreg10:sub2<def> = S_MOV_B32 -1; SReg_128:%vreg10
  192B            %vreg20<def> = V_MOV_B32_e32 1084227584, %EXEC<imp-use>; VGPR_32:%vreg20
  208B            %vreg12<def> = COPY %vreg5; VReg_128:%vreg12 SReg_128:%vreg5
  224B            %vreg12:sub1<def> = COPY %vreg20<undef>; VReg_128:%vreg12 VGPR_32:%vreg20
  256B            BUFFER_STORE_DWORD_OFFSET %vreg12:sub2, %vreg10, 0, 8, 0, 0, 0, %EXEC<imp-use>; mem:ST4[%out(addrspace=1)+8](align=8) VReg_128:%vreg12 SReg_128:%vreg10
  320B            %vreg12:sub1<def> = COPY %vreg20; VReg_128:%vreg12 VGPR_32:%vreg20
  336B            BUFFER_STORE_DWORDX2_OFFSET %vreg12:sub0_sub1, %vreg10, 0, 0, 0, 0, 0, %EXEC<imp-use>; mem:ST8[%out(addrspace=1)](align=16) VReg_128:%vreg12 SReg_128:%vreg10
  352B            S_ENDPGM



  # After Rename Disconnected Subregister Components
  ********** INTERVALS **********
  SGPR0 [0B,16r:0)  0 at 0B-phi
  SGPR1 [0B,16r:0)  0 at 0B-phi
  %vreg0 [16r,48r:0)  0 at 16r
  %vreg5 [48r,208r:0)  0 at 48r
  %vreg10 [32r,96r:0)[96r,112r:1)[112r,336r:2)  0 at 32r 1 at 96r 2 at 112r L00000001 [32r,336r:0)  0 at 32r L00000002 [32r,336r:0)  0 at 32r L00000004 [112r,336r:0)  0 at 112r  undef at 32r,96r L00000008 [96r,336r:0)  0 at 96r  undef at 32r
  %vreg12 [208r,320r:0)[320r,336r:1)  0 at 208r 1 at 320r L00000001 [208r,336r:0)  0 at 208r L00000004 [208r,256r:0)  0 at 208r  undef at 320r L00000002 [208r,208d:0)[320r,336r:1)  0 at 208r 1 at 320r L00000008 [208r,224r:0)  0 at 208r  undef at 224r,320r
  %vreg20 [192r,320r:0)  0 at 192r
  %vreg24 [224r,224d:0)  0 at 224r L00000002 [224r,224d:0)  0 at 224r
  RegMasks:
  ********** MACHINEINSTRS **********
  # Machine code for function insertelement_v3f32_1: Properties: <Post SSA, tracking liveness, HasVRegs>
  Function Live Ins: %SGPR0_SGPR1 in %vreg0
  
  0B      BB#0: derived from LLVM BB %0
              Live Ins: %SGPR0_SGPR1
  16B             %vreg0<def> = COPY %SGPR0_SGPR1; SReg_64:%vreg0
  32B             %vreg10:sub0_sub1<def,read-undef> = S_LOAD_DWORDX2_IMM %vreg0, 9; mem:LD8[undef(addrspace=2)](nontemporal)(invariant) SReg_128:%vreg10 SReg_64:%vreg0
  48B             %vreg5<def> = S_LOAD_DWORDX4_IMM %vreg0, 13; mem:LD16[undef(addrspace=2)](nontemporal)(invariant) SReg_128:%vreg5 SReg_64:%vreg0
  96B             %vreg10:sub3<def> = S_MOV_B32 61440; SReg_128:%vreg10
  112B            %vreg10:sub2<def> = S_MOV_B32 -1; SReg_128:%vreg10
  192B            %vreg20<def> = V_MOV_B32_e32 1084227584, %EXEC<imp-use>; VGPR_32:%vreg20
  208B            %vreg12<def> = COPY %vreg5; VReg_128:%vreg12 SReg_128:%vreg5
  224B            %vreg24:sub1<def,read-undef,dead> = COPY %vreg20<undef>; VReg_128:%vreg24 VGPR_32:%vreg20
  256B            BUFFER_STORE_DWORD_OFFSET %vreg12:sub2, %vreg10, 0, 8, 0, 0, 0, %EXEC<imp-use>; mem:ST4[%out(addrspace=1)+8](align=8) VReg_128:%vreg12 SReg_128:%vreg10
  320B            %vreg12:sub1<def> = COPY %vreg20; VReg_128:%vreg12 VGPR_32:%vreg20
  336B            BUFFER_STORE_DWORDX2_OFFSET %vreg12:sub0_sub1, %vreg10, 0, 0, 0, 0, 0, %EXEC<imp-use>; mem:ST8[%out(addrspace=1)](align=16) VReg_128:%vreg12 SReg_128:%vreg10
  352B            S_ENDPGM
  
  # End machine code for function insertelement_v3f32_1.
  
  *** Bad machine code: Instruction ending live segment doesn't read the register ***
  - function:    insertelement_v3f32_1
  - basic block: BB#0  (0x35d33a8) [0B;368B)
  - instruction: 224B     %vreg24:sub1<def,read-undef,dead> = COPY
  - liverange:   [208r,224r:0)  0 at 208r  undef at 224r,320r
  - register:    %vreg12
  - lanemask:    00000008
  - segment:     [208r,224r:0)
  LLVM ERROR: Found 1 machine code errors.


================
Comment at: lib/CodeGen/LiveIntervalAnalysis.cpp:1557-1558
@@ -1540,2 +1556,4 @@
 void LiveIntervals::removeVRegDefAt(LiveInterval &LI, SlotIndex Pos) {
+  // LI may not have the main range computed yet, but its subranges may
+  // be present.
   VNInfo *VNI = LI.getVNInfoAt(Pos);
----------------
MatzeB wrote:
> This can only be the case during LiveRangCalc, do we really need to handle this case?
This can also happen in SplitKit.cpp, in hoistCopies.  In SplitKit, the subranges are created first, and the main ranges are created later on (in rewriteAssigned).

================
Comment at: lib/CodeGen/RenameIndependentSubregs.cpp:381-385
@@ -356,2 +380,7 @@
     LIS->constructMainRangeFromSubranges(LI);
+    // A def of a subregister may be a use of other register lanes.
+    // Removing such a def may leave some subranges to extend beyond
+    // actual live ranges.
+    LIS->shrinkToUses(&LI);
+    LI.recalculateSubRangeUndefs();
   }
----------------
MatzeB wrote:
> This is odd, why do we need to redo shrinkToUses()? Can you give an example or an detailed explanation, the comment isn't clear to me right now.
Will add a top-level comment with a failing example.

================
Comment at: lib/CodeGen/SplitKit.cpp:992
@@ -948,3 +991,3 @@
       // Check for a simply defined value that can be blitted directly.
-      ValueForcePair VFP = Values.lookup(std::make_pair(RegIdx, ParentVNI->id));
+      ValueForcePair VFP = Values.lookup({RegIdx, (unsigned)ParentVNI->id});
       if (VNInfo *VNI = VFP.getPointer()) {
----------------
MatzeB wrote:
> Isn't `ParentVNI->id` already unsigned?
Yeah, this is a leftover from some previous approach.


Repository:
  rL LLVM

http://reviews.llvm.org/D21189





More information about the llvm-commits mailing list