[PATCH] D82258: [RegisterCoalescer] Fix IMPLICIT_DEF init removal for a register on joining
Valery Pykhtin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Jun 20 07:23:07 PDT 2020
vpykhtin created this revision.
vpykhtin added reviewers: MatzeB, arsenm.
Herald added subscribers: llvm-commits, kerbowa, tpr, hiraditya, nhaehnle, wdng, jvesely, qcolombet.
Herald added a project: LLVM.
This is actually a revert of 9d7bc0874cf20f44cd331c77f5a003b4c4b262bd:
RegisterCoalescer: Do not remove IMPLICIT_DEFS if they are required for subranges.
The register coalescer used to remove implicit_defs when they are
covered by the main range anyway. With subreg liveness tracking we can't
do that anymore in places where the IMPLICIT_DEF is required as begin of
a subregister liverange.
Without this patch, bb.2 of the test looks like this after coalescing:
bb.2:
; predecessors: %bb.0
successors: %bb.3(0x80000000); %bb.3(100.00%)
%0.sub1:sgpr_64 = IMPLICIT_DEF
Since there is no undef flag, %0 is considered uninitialized in bb.2, which leads to an error in MIR verification. The debug dump (manually enhanced) shows what happened:
0B bb.0:
successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)
16B S_CBRANCH_SCC0 %bb.2, implicit undef $scc
32B bb.1:
; predecessors: %bb.0
successors: %bb.3(0x80000000); %bb.3(100.00%)
48B undef %0.sub0:sgpr_64 = S_MOV_B32 1
64B %0.sub1:sgpr_64 = S_MOV_B32 2
80B %1:sgpr_32 = COPY %0.sub0:sgpr_64
96B S_BRANCH %bb.3
112B bb.2:
; predecessors: %bb.0
successors: %bb.3(0x80000000); %bb.3(100.00%)
128B %1:sgpr_32 = IMPLICIT_DEF
144B undef %0.sub0:sgpr_64 = IMPLICIT_DEF
160B %0.sub1:sgpr_64 = IMPLICIT_DEF
176B bb.3:
; predecessors: %bb.1, %bb.2
192B S_NOP 0, implicit %1:sgpr_32
208B S_NOP 0, implicit %0:sgpr_64
# End machine code for function coalescing_makes_lane_undefined.
********** SIMPLE REGISTER COALESCING **********
********** Function: coalescing_makes_lane_undefined
********** JOINING INTERVALS ***********
:
:
80B %1:sgpr_32 = COPY %0.sub0:sgpr_64
Considering merging to SGPR_64 with %1 in %0:sub0
RHS = %1 [80r,112B:1) 1 at 80r
[128r,176B:0) 0 at 128r
[176B,192r:2) 2 at 176B-phi weight:0.000000e+00
LHS = %0 [48r,64r:1) 1 at 48r
[64r,112B:3) 3 at 64r
[144r,160r:0) 0 at 144r
[160r,176B:2) 2 at 160r
[176B,208r:4) 4 at 176B-phi
L0000000000000003
[48r,112B:1) 1 at 48r
[144r,176B:0) 0 at 144r
[176B,208r:2) 2 at 176B-phi
L000000000000000C
[64r,112B:1) 1 at 64r
[160r,176B:0) 0 at 160r
[176B,208r:2) 2 at 176B-phi weight:0.000000e+00
merge %0:0 at 144r into %1:0 at 128r --> @128r
merge %1:1 at 80r into %0:3 at 64r --> @64r
merge %1:2 at 176B into %0:4 at 176B --> @176B
RHSVals %1:sub0:
0 at 128r Write:0000000000000003 Valid:0000000000000000 Keep ImpDef Pruned -> 0 at 128r,
1 at 80r Write:0000000000000003 Valid:0000000000000003 Erase Other:3 at 64r -> 3 at 64r,
2 at 176B-phi Write:0000000000000003 Valid:0000000000000003 Merge Other:4 at 176B-phi -> 4 at 176B-phi
LHSVals %0:
0 at 144r Write:0000000000000003 Valid:0000000000000003 Erase Other:0 at 128r ImpDef -> 0 at 128r,
1 at 48r Write:0000000000000003 Valid:0000000000000003 Keep -> 1 at 48r,
2 at 160r Write:000000000000000C Valid:000000000000000F Replace Other:0 at 128r Redef:0 at 144r ImpDef -> 2 at 160r,
3 at 64r Write:000000000000000C Valid:000000000000000F Keep Redef:1 at 48r -> 3 at 64r,
4 at 176B-phi Write:FFFFFFFFFFFFFFFF Valid:FFFFFFFFFFFFFFFF Keep Other:2 at 176B-phi -> 4 at 176B-phi
LHST = %0 %0 [48r,64r:1)[64r,112B:3)[144r,160r:0)[160r,176B:2)[176B,208r:4) 0 at 144r 1 at 48r 2 at 160r 3 at 64r 4 at 176B-phi L0000000000000003 [48r,112B:1)[144r,176B:0)[176B,208r:2) 0 at 144r 1 at 48r 2 at 176B-phi L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2) 0 at 160r 1 at 64r 2 at 176B-phi weight:0.000000e+00
merge %0:0 at 144r into %1:0 at 128r --> @128r
merge %1:1 at 80r into %0:1 at 48r --> @48r
merge %1:2 at 176B into %0:2 at 176B --> @176B
RHSVals %1:sub0:0000000000000003:
0 at 128r Write:0000000000000001 Valid:0000000000000000 Keep ImpDef -> 0 at 128r,
1 at 80r Write:0000000000000001 Valid:0000000000000001 Erase Other:1 at 48r -> 1 at 48r,
2 at 176B-phi Write:0000000000000001 Valid:0000000000000001 Merge Other:2 at 176B-phi -> 2 at 176B-phi
LHSVals %0:0000000000000003:
0 at 144r Write:0000000000000001 Valid:0000000000000000 Erase Other:0 at 128r ImpDef -> 0 at 128r,
1 at 48r Write:0000000000000001 Valid:0000000000000001 Keep -> 1 at 48r,
2 at 176B-phi Write:0000000000000001 Valid:0000000000000001 Keep Other:2 at 176B-phi -> 2 at 176B-phi
joined lanes: 0000000000000003
[48r,112B:1) 1 at 48r
[128r,176B:0) 0 at 128r
[176B,208r:2) 2 at 176B-phi
Joined SubRanges %0 [48r,64r:1)[64r,112B:3)[144r,160r:0)[160r,176B:2)[176B,208r:4) 0 at 144r 1 at 48r 2 at 160r 3 at 64r 4 at 176B-phi L0000000000000003 [48r,112B:1)[128r,176B:0)[176B,208r:2) 0 at 128r 1 at 48r 2 at 176B-phi L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2) 0 at 160r 1 at 64r 2 at 176B-phi weight:0.000000e+00
Expecting instruction removal at 144r
Expecting instruction removal at 128r
Prune sublane 0000000000000003 at 128r
Expecting instruction removal at 80r
pruned all of %0 at 144r: [48r,64r:1)[64r,112B:3)[160r,176B:2)[176B,208r:4) 0 at 144r 1 at 48r 2 at 160r 3 at 64r 4 at 176B-phi
pruned %1 at 160r: [80r,112B:1)[128r,160r:0)[176B,192r:2) 0 at 128r 1 at 80r 2 at 176B-phi
erased: 144r undef %0.sub0:sgpr_64 = IMPLICIT_DEF
removed 0 at 128r: [80r,112B:1)[176B,192r:2) 0 at x 1 at 80r 2 at 176B-phi
erased: 128r %1:sgpr_32 = IMPLICIT_DEF
erased: 80r %1:sgpr_32 = COPY %0.sub0:sgpr_64
restoring liveness to 2 points: 160r,176B: %0 [48r,64r:1)[64r,112B:3)[160r,176B:2)[176B,208r:4) 0 at x 1 at 48r 2 at 160r 3 at 64r 4 at 176B-phi L0000000000000003 [48r,112B:1)[176B,208r:2) 0 at x 1 at 48r 2 at 176B-phi L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2) 0 at 160r 1 at 64r 2 at 176B-phi weight:0.000000e+00
# Machine code for function coalescing_makes_lane_undefined: NoPHIs, TracksLiveness
bb.0:
successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)
S_CBRANCH_SCC0 %bb.2, implicit undef $scc
bb.1:
; predecessors: %bb.0
successors: %bb.3(0x80000000); %bb.3(100.00%)
undef %0.sub0:sgpr_64 = S_MOV_B32 1
%0.sub1:sgpr_64 = S_MOV_B32 2
S_BRANCH %bb.3
bb.2:
; predecessors: %bb.0
successors: %bb.3(0x80000000); %bb.3(100.00%)
%0.sub1:sgpr_64 = IMPLICIT_DEF
bb.3:
; predecessors: %bb.1, %bb.2
S_NOP 0, implicit %1:sgpr_32
S_NOP 0, implicit %0:sgpr_64
# End machine code for function coalescing_makes_lane_undefined.
*** Bad machine code: Reading virtual register without a def ***
- function: coalescing_makes_lane_undefined
- basic block: %bb.3 (0x66758e8)
- instruction: S_NOP 0, implicit %1:sgpr_32
- operand 1: implicit %1:sgpr_32
*** Bad machine code: Virtual register defs don't dominate all uses. ***
- function: coalescing_makes_lane_undefined
- v. register: %0
LLVM ERROR: Found 2 machine code errors.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0. Program arguments: C:\work\git\llvm-project\build\Debug\bin\llc.exe -debug-only=regalloc -march=amdgcn -mcpu=gfx803 -run-pass simple-register-coalescing -verify-machineinstrs coalescing_makes_lanes_undef.mir
1. Running pass 'Function Pass Manager' on module 'coalescing_makes_lanes_undef.mir'.
2. Running pass 'Simple Register Coalescing' on function '@coalescing_makes_lane_undefined'
The coalescer erases the instructions at 144r and 128r, leaving only 160r. This happens because 160r replaces 128r: 128r is marked as pruned due to the replace, and since 128r is an IMPLICIT_DEF it is erased.
I think it's sufficient to erase an IMPLICIT_DEF whenever there is any other incoming value: in that case the register is initialized regardless of which subregisters are involved.
With the patch the dump looks like:
0B bb.0:
successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)
16B S_CBRANCH_SCC0 %bb.2, implicit undef $scc
32B bb.1:
; predecessors: %bb.0
successors: %bb.3(0x80000000); %bb.3(100.00%)
48B undef %0.sub0:sgpr_64 = S_MOV_B32 1
64B %0.sub1:sgpr_64 = S_MOV_B32 2
80B %1:sgpr_32 = COPY %0.sub0:sgpr_64
96B S_BRANCH %bb.3
112B bb.2:
; predecessors: %bb.0
successors: %bb.3(0x80000000); %bb.3(100.00%)
128B %1:sgpr_32 = IMPLICIT_DEF
144B undef %0.sub0:sgpr_64 = IMPLICIT_DEF
160B %0.sub1:sgpr_64 = IMPLICIT_DEF
176B bb.3:
; predecessors: %bb.1, %bb.2
192B S_NOP 0, implicit %1:sgpr_32
208B S_NOP 0, implicit %0:sgpr_64
# End machine code for function coalescing_makes_lane_undefined.
********** SIMPLE REGISTER COALESCING **********
********** Function: coalescing_makes_lane_undefined
********** JOINING INTERVALS ***********
:
:
80B %1:sgpr_32 = COPY %0.sub0:sgpr_64
Considering merging to SGPR_64 with %1 in %0:sub0
RHS = %1 [80r,112B:1)[128r,176B:0)[176B,192r:2) 0 at 128r 1 at 80r 2 at 176B-phi weight:0.000000e+00
LHS = %0 [48r,64r:1)[64r,112B:3)[144r,160r:0)[160r,176B:2)[176B,208r:4) 0 at 144r 1 at 48r 2 at 160r 3 at 64r 4 at 176B-phi L0000000000000003 [48r,112B:1)[144r,176B:0)[176B,208r:2) 0 at 144r 1 at 48r 2 at 176B-phi L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2) 0 at 160r 1 at 64r 2 at 176B-phi weight:0.000000e+00
merge %0:0 at 144r into %1:0 at 128r --> @128r
merge %0:2 at 160r into %1:0 at 128r --> @128r
merge %1:1 at 80r into %0:3 at 64r --> @64r
merge %1:2 at 176B into %0:4 at 176B --> @176B
RHSVals %1:sub0:
0 at 128r Write:0000000000000003 Valid:0000000000000000 Keep ImpDef -> 0 at 128r,
1 at 80r Write:0000000000000003 Valid:0000000000000003 Erase Other:3 at 64r -> 3 at 64r,
2 at 176B-phi Write:0000000000000003 Valid:0000000000000003 Merge Other:4 at 176B-phi -> 4 at 176B-phi
LHSVals %0:
0 at 144r Write:0000000000000003 Valid:0000000000000003 Erase Other:0 at 128r ImpDef -> 0 at 128r,
1 at 48r Write:0000000000000003 Valid:0000000000000003 Keep -> 1 at 48r,
2 at 160r Write:000000000000000C Valid:000000000000000F Erase Other:0 at 128r Redef:0 at 144r ImpDef -> 0 at 128r,
3 at 64r Write:000000000000000C Valid:000000000000000F Keep Redef:1 at 48r -> 3 at 64r,
4 at 176B-phi Write:FFFFFFFFFFFFFFFF Valid:FFFFFFFFFFFFFFFF Keep Other:2 at 176B-phi -> 4 at 176B-phi
LHST = %0 %0 [48r,64r:1)[64r,112B:3)[144r,160r:0)[160r,176B:2)[176B,208r:4) 0 at 144r 1 at 48r 2 at 160r 3 at 64r 4 at 176B-phi L0000000000000003 [48r,112B:1)[144r,176B:0)[176B,208r:2) 0 at 144r 1 at 48r 2 at 176B-phi L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2) 0 at 160r 1 at 64r 2 at 176B-phi weight:0.000000e+00
merge %0:0 at 144r into %1:0 at 128r --> @128r
merge %1:1 at 80r into %0:1 at 48r --> @48r
merge %1:2 at 176B into %0:2 at 176B --> @176B
RHSVals %1:sub0:0000000000000003:
0 at 128r Write:0000000000000001 Valid:0000000000000000 Keep ImpDef -> 0 at 128r,
1 at 80r Write:0000000000000001 Valid:0000000000000001 Erase Other:1 at 48r -> 1 at 48r,
2 at 176B-phi Write:0000000000000001 Valid:0000000000000001 Merge Other:2 at 176B-phi -> 2 at 176B-phi
LHSVals %0:0000000000000003:
0 at 144r Write:0000000000000001 Valid:0000000000000000 Erase Other:0 at 128r ImpDef -> 0 at 128r,
1 at 48r Write:0000000000000001 Valid:0000000000000001 Keep -> 1 at 48r,
2 at 176B-phi Write:0000000000000001 Valid:0000000000000001 Keep Other:2 at 176B-phi -> 2 at 176B-phi
joined lanes: 0000000000000003 [48r,112B:1)[128r,176B:0)[176B,208r:2) 0 at 128r 1 at 48r 2 at 176B-phi
Joined SubRanges %0 [48r,64r:1)[64r,112B:3)[144r,160r:0)[160r,176B:2)[176B,208r:4) 0 at 144r 1 at 48r 2 at 160r 3 at 64r 4 at 176B-phi L0000000000000003 [48r,112B:1)[128r,176B:0)[176B,208r:2) 0 at 128r 1 at 48r 2 at 176B-phi L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2) 0 at 160r 1 at 64r 2 at 176B-phi weight:0.000000e+00
Expecting instruction removal at 144r
Expecting instruction removal at 160r
Prune sublane 000000000000000C at 160r
Expecting instruction removal at 80r
erased: 144r undef %0.sub0:sgpr_64 = IMPLICIT_DEF
erased: 160r %0.sub1:sgpr_64 = IMPLICIT_DEF
erased: 80r %1:sgpr_32 = COPY %0.sub0:sgpr_64
AllocationOrder(SGPR_64) = [ $sgpr0_sgpr1 $sgpr2_sgpr3 $sgpr4_sgpr5 $sgpr6_sgpr7 $sgpr8_sgpr9 $sgpr10_sgpr11 $sgpr12_sgpr13 $sgpr14_sgpr15 $sgpr16_sgpr17 $sgpr18_sgpr19 $sgpr20_sgpr21 $sgpr22_sgpr23 $sgpr24_sgpr25 $sgpr26_sgpr27 $sgpr28_sgpr29 $sgpr30_sgpr31 $sgpr32_sgpr33 $sgpr34_sgpr35 $sgpr36_sgpr37 $sgpr38_sgpr39 $sgpr40_sgpr41 $sgpr42_sgpr43 $sgpr44_sgpr45 $sgpr46_sgpr47 $sgpr48_sgpr49 $sgpr50_sgpr51 $sgpr52_sgpr53 $sgpr54_sgpr55 $sgpr56_sgpr57 $sgpr58_sgpr59 $sgpr60_sgpr61 $sgpr62_sgpr63 $sgpr64_sgpr65 $sgpr66_sgpr67 $sgpr68_sgpr69 $sgpr70_sgpr71 $sgpr72_sgpr73 $sgpr74_sgpr75 $sgpr76_sgpr77 $sgpr78_sgpr79 $sgpr80_sgpr81 $sgpr82_sgpr83 $sgpr84_sgpr85 $sgpr86_sgpr87 $sgpr88_sgpr89 $sgpr90_sgpr91 $sgpr92_sgpr93 $sgpr94_sgpr95 $sgpr96_sgpr97 $sgpr98_sgpr99 $sgpr100_sgpr101 ]
updated: 128B undef %0.sub0:sgpr_64 = IMPLICIT_DEF
updated: 192B S_NOP 0, implicit %0.sub0:sgpr_64
Success: %1:sub0 -> %0
Result = %0 [48r,64r:1)[64r,112B:2)[128r,176B:0)[176B,208r:3) 0 at 128r 1 at 48r 2 at 64r 3 at 176B-phi L0000000000000003 [48r,112B:1)[128r,176B:0)[176B,208r:2) 0 at 128r 1 at 48r 2 at 176B-phi L000000000000000C [64r,112B:1)[176B,208r:2) 0 at x 1 at 64r 2 at 176B-phi weight:0.000000e+00
:
:
Trying to inflate 0 regs.
********** INTERVALS **********
%0 [48r,64r:1)[64r,112B:2)[128r,176B:0)[176B,208r:3) 0 at 128r 1 at 48r 2 at 64r 3 at 176B-phi L0000000000000003 [48r,112B:1)[128r,176B:0)[176B,208r:2) 0 at 128r 1 at 48r 2 at 176B-phi L000000000000000C [64r,112B:1)[176B,208r:2) 0 at x 1 at 64r 2 at 176B-phi weight:0.000000e+00
RegMasks:
********** MACHINEINSTRS **********
# Machine code for function coalescing_makes_lane_undefined: NoPHIs, TracksLiveness
0B bb.0:
successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)
16B S_CBRANCH_SCC0 %bb.2, implicit undef $scc
32B bb.1:
; predecessors: %bb.0
successors: %bb.3(0x80000000); %bb.3(100.00%)
48B undef %0.sub0:sgpr_64 = S_MOV_B32 1
64B %0.sub1:sgpr_64 = S_MOV_B32 2
96B S_BRANCH %bb.3
112B bb.2:
; predecessors: %bb.0
successors: %bb.3(0x80000000); %bb.3(100.00%)
128B undef %0.sub0:sgpr_64 = IMPLICIT_DEF
176B bb.3:
; predecessors: %bb.1, %bb.2
192B S_NOP 0, implicit %0.sub0:sgpr_64
208B S_NOP 0, implicit %0:sgpr_64
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D82258
Files:
llvm/lib/CodeGen/RegisterCoalescer.cpp
llvm/test/CodeGen/AMDGPU/coalescing_makes_lanes_undef.mir
Index: llvm/test/CodeGen/AMDGPU/coalescing_makes_lanes_undef.mir
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AMDGPU/coalescing_makes_lanes_undef.mir
@@ -0,0 +1,31 @@
+# RUN: llc -march=amdgcn -mcpu=gfx803 -run-pass simple-register-coalescing -verify-machineinstrs -o - %s | FileCheck %s
+
+---
+name: coalescing_makes_lane_undefined
+tracksRegLiveness: true
+body: |
+  ; CHECK-LABEL: name: coalescing_makes_lane_undefined
+  ; CHECK-LABEL: bb.2:
+  ; CHECK: undef %0.sub0:sgpr_64 = IMPLICIT_DEF
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_SCC0 %bb.2, implicit undef $scc
+
+  bb.1:
+    successors: %bb.3
+    undef %1.sub0:sgpr_64 = S_MOV_B32 1
+    %1.sub1:sgpr_64 = S_MOV_B32 2
+    %2:sgpr_32 = COPY %1.sub0
+    S_BRANCH %bb.3
+
+  bb.2:
+    successors: %bb.3
+    %2:sgpr_32 = IMPLICIT_DEF
+    undef %1.sub0:sgpr_64 = IMPLICIT_DEF
+    %1.sub1:sgpr_64 = IMPLICIT_DEF
+
+  bb.3:
+    S_NOP 0, implicit killed %2
+    S_NOP 0, implicit killed %1
+
+...
Index: llvm/lib/CodeGen/RegisterCoalescer.cpp
===================================================================
--- llvm/lib/CodeGen/RegisterCoalescer.cpp
+++ llvm/lib/CodeGen/RegisterCoalescer.cpp
@@ -2672,14 +2672,8 @@
       return CR_Replace;
 
   // Check for simple erasable conflicts.
-  if (DefMI->isImplicitDef()) {
-    // We need the def for the subregister if there is nothing else live at the
-    // subrange at this point.
-    if (TrackSubRegLiveness
-        && (V.WriteLanes & (OtherV.ValidLanes | OtherV.WriteLanes)).none())
-      return CR_Replace;
+  if (DefMI->isImplicitDef())
     return CR_Erase;
-  }
 
   // Include the non-conflict where DefMI is a coalescable copy that kills
   // OtherVNI. We still want the copy erased and value numbers merged.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D82258.272253.patch
Type: text/x-patch
Size: 1820 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200620/d83ee49c/attachment-0001.bin>