[PATCH] D29105: Fix regalloc assignment of overlapping registers
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 24 16:59:27 PST 2017
rampitec created this revision.
Herald added a subscriber: wdng.
SplitEditor::defFromParent() can create a register copy. If the register is a tuple of other registers and not all of its lanes are used, the copy is still emitted for the full tuple. Later, the register unit for an unused lane is considered free, so another overlapping register tuple can be assigned to a different value even though the first register is live at that point. This happens because interference checking only looks at liveness info, while a full register copy clobbers all lanes, including unused ones.
This patch restricts the copy to the used lanes.
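To make the mismatch concrete, here is a toy model of it. This is not LLVM code: UnitSet, unitsFor and overlaps are hypothetical names, and the 64-entry register file and the register numbers in the usage note are made up for illustration. It only assumes that a tuple starting at register Base covers one register unit per lane.

```cpp
#include <bitset>
#include <cassert>

// Toy register file: one bit per 32-bit register unit (hypothetical size).
using UnitSet = std::bitset<64>;

// Register units touched by the lanes in LaneMask of a tuple that starts at
// register Base and is NumLanes units wide. Bit i of LaneMask selects lane i.
UnitSet unitsFor(unsigned Base, unsigned NumLanes, unsigned LaneMask) {
  assert(NumLanes <= 32 && Base + NumLanes <= 64 &&
         "tuple must fit in the toy register file");
  UnitSet Units;
  for (unsigned Lane = 0; Lane < NumLanes; ++Lane)
    if (LaneMask & (1u << Lane))
      Units.set(Base + Lane);
  return Units;
}

// Two unit sets conflict if they share at least one register unit.
bool overlaps(const UnitSet &A, const UnitSet &B) { return (A & B).any(); }
```

With value A in tuple 10_11 using only lane 0 and value B in tuple 11_12 using both lanes, overlaps(unitsFor(10, 2, 0x1), unitsFor(11, 2, 0x3)) is false, so a check based on live lanes sees no interference. A copy that writes the full tuple touches unitsFor(10, 2, 0x3), which does overlap B's units. Restricting the copy to lane 0 makes the written units match the live ones again.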
This is how it happens in the app I was debugging:
Before Virtual Register Rewriter:
187344B %vreg16749:sub0<def,read-undef> = V_ADD_I32_e32 %vreg12357, %vreg16754, %VCC<imp-def>, %EXEC<imp-use>; VReg_64:%vreg16749 SReg_32_XM0:%vreg12357 VGPR_32:%vreg16754
187360B %vreg16749:sub1<def> = V_ADDC_U32_e32 %vreg16742, %vreg15898, %VCC<imp-def,dead>, %VCC<imp-use>, %EXEC<imp-use>; VReg_64:%vreg16749 VGPR_32:%vreg16742,%vreg15898
...
197648B %vreg21888<def> = COPY %vreg21887; VReg_64:%vreg21888,%vreg21887
...
197988B %vreg12551<def> = FLAT_LOAD_DWORDX2 %vreg16749, 0, 0, 0, %EXEC<imp-use>, %FLAT_SCR<imp-use>; mem:LD8[%arrayidx112.i18013006(addrspace=2)](tbaa=!11) VReg_64:%vr
After Virtual Register Rewriter:
187344B %VGPR119<def> = V_ADD_I32_e32 %SGPR3, %VGPR1<kill>, %VCC<imp-def>, %EXEC<imp-use>
187360B %VGPR120<def> = V_ADDC_U32_e32 %VGPR8<kill>, %VGPR19, %VCC<imp-def,dead>, %VCC<imp-use>, %EXEC<imp-use>
...
197648B %VGPR118_VGPR119<def> = COPY %VGPR33_VGPR34
...
197988B %VGPR52_VGPR53<def> = FLAT_LOAD_DWORDX2 %VGPR119_VGPR120, 0, 0, 0, %EXEC<imp-use>, %FLAT_SCR<imp-use>; mem:LD8[%arrayidx112.i18013006(addrspace=2)](tbaa=!11)
The RA debug log excerpt:
selectOrSplit VReg_64:%vreg21888 [197648r,200388r:0) 0 at 197648r L00000001 [197648r,200388r:0) 0 at 197648r w=4.824841e-04
assigning %vreg21888 to %VGPR118_VGPR119: VGPR118 [197648r,200388r:0) 0 at 197648r
selectOrSplit VReg_64:%vreg16749 [187344r,187360r:0)[187360r,197988r:1) 0 at 187344r 1 at 187360r L00000001 [187344r,197988r:0) 0 at 187344r L00000002 [187360r,197988r:0) 0 at 187360r
assigning %vreg16749 to %VGPR119_VGPR120: VGPR119 [187344r,197988r:0) 0 at 187344r VGPR120 [187360r,197988r:0) 0 at 187360r
[%vreg16749 -> %VGPR119_VGPR120] VReg_64
[%vreg21888 -> %VGPR118_VGPR119] VReg_64
One can see that the live intervals of vreg21888 and vreg16749 overlap, but only lane 0 of %vreg21888 is used, so VGPR119 is considered free. This allows the register pair VGPR119_VGPR120 to be assigned to vreg16749. VGPR119 is then clobbered by %VGPR118_VGPR119<def> = COPY %VGPR33_VGPR34.
After the fix, the copy becomes %vreg21888:sub0<def,read-undef> = COPY %vreg21887:sub0.
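The lookup that picks the subregister for the copy can be sketched outside of LLVM as follows. This is a simplified stand-alone illustration, not the patch itself: SubRegInfo, subRegTable, usedLanes and findSubRegIndex are hypothetical names, and the table contents are an invented four-lane example rather than any real target's subregister indices. As in the patch, index 0 means "no subregister" and the search starts at 1.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one entry of a target's subregister index table.
struct SubRegInfo {
  const char *Name;
  std::uint32_t LaneMask;
};

// Invented table for illustration; entry 0 is the reserved "none" index.
const std::vector<SubRegInfo> &subRegTable() {
  static const std::vector<SubRegInfo> Table = {
      {"<none>", 0x0}, {"sub0", 0x1}, {"sub1", 0x2},
      {"sub2", 0x4},   {"sub3", 0x8}, {"sub0_sub1", 0x3}};
  return Table;
}

// Union of the lane masks of all subranges, i.e. the lanes actually used.
std::uint32_t usedLanes(const std::vector<std::uint32_t> &SubRangeMasks) {
  std::uint32_t LM = 0;
  for (std::uint32_t Mask : SubRangeMasks)
    LM |= Mask;
  return LM;
}

// Returns the index whose lane mask equals LM exactly, or 0 if none does.
unsigned findSubRegIndex(std::uint32_t LM) {
  assert(LM != 0 && "expected at least one used lane");
  const std::vector<SubRegInfo> &Table = subRegTable();
  for (unsigned I = 1; I < Table.size(); ++I)
    if (Table[I].LaneMask == LM)
      return I;
  return 0;
}
```

For the case in the log, the only subrange has lane mask L00000001, so usedLanes({0x1}) is 0x1 and the lookup returns the index for sub0 in this toy table. A set of used lanes with no exactly matching index, such as 0x5 here, returns 0, which is the situation the assert in the patch would catch.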
I’m struggling to create a small and robust testcase so far. The original testcase is more than 12000 lines of IR, and MIR does not give a reproducible result. If/when I have a smaller and better testcase, I will add it.
Repository:
rL LLVM
https://reviews.llvm.org/D29105
Files:
lib/CodeGen/SplitKit.cpp
Index: lib/CodeGen/SplitKit.cpp
===================================================================
--- lib/CodeGen/SplitKit.cpp
+++ lib/CodeGen/SplitKit.cpp
@@ -522,6 +522,26 @@
     Def = LIS.getSlotIndexes()
               ->insertMachineInstrInMaps(*CopyMI, Late)
               .getRegSlot();
+    if (LI->hasSubRanges()) {
+      LaneBitmask LM = LaneBitmask::getNone();
+      for (LiveInterval::SubRange &S : LI->subranges())
+        LM |= S.LaneMask;
+
+      if (MRI.getMaxLaneMaskForVReg(LI->reg) != LM) {
+        // Find subreg for the lane mask.
+        unsigned SubIdx = 0;
+        for (unsigned I = 1, E = TRI.getNumSubRegIndices(); I < E; ++I) {
+          if (TRI.getSubRegIndexLaneMask(I) == LM) {
+            SubIdx = I;
+            break;
+          }
+        }
+        assert (SubIdx != 0 && "Cannot find subreg index");
+        CopyMI->getOperand(0).setSubReg(SubIdx);
+        CopyMI->getOperand(1).setSubReg(SubIdx);
+        CopyMI->getOperand(0).setIsUndef(true);
+      }
+    }
     ++NumCopies;
   }