[PATCH] D106408: Allow rematerialization of virtual reg uses
Eli Friedman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 22 12:28:23 PDT 2021
efriedma added inline comments.
================
Comment at: llvm/test/CodeGen/Thumb2/LowOverheadLoops/memcall.ll:8
+; CHECK-NEXT: .save {r4, r5, r6, r7, r8, lr}
+; CHECK-NEXT: push.w {r4, r5, r6, r7, r8, lr}
; CHECK-NEXT: cmp r2, #1
----------------
rampitec wrote:
> efriedma wrote:
> > This looks like a regression. Not sure what's happening here; we're saving both r7 and r8, but they aren't used. Maybe something related to the hardware loop instructions?
> Here is what happens: these two instructions are now `isTriviallyReMaterializable()`:
> ```
> %16:rgpr = t2ADDri %8:rgpr, 15, 14, $noreg, $noreg
> %17:rgpr = t2LSRri %16:rgpr, 4, 14, $noreg, $noreg
> ```
> MachineLICM hoists these out of the loop because of that. RA then assigns r8 to the result, so r8 ends up in the frame setup:
> ```
> bb.0.entry:
> successors: %bb.1(0x50000000), %bb.2(0x30000000); %bb.1(62.50%), %bb.2(37.50%)
> liveins: $r0, $r1, $r2, $r3, $r4, $r5, $r6, $r7, $r8, $lr
> $sp = frame-setup t2STMDB_UPD $sp(tied-def 0), 14, $noreg, killed $r4, killed $r5, killed $r6, killed $r7, killed $r8, killed $lr
> frame-setup CFI_INSTRUCTION def_cfa_offset 24
> frame-setup CFI_INSTRUCTION offset $lr, -4
> frame-setup CFI_INSTRUCTION offset $r8, -8
> frame-setup CFI_INSTRUCTION offset $r7, -12
> frame-setup CFI_INSTRUCTION offset $r6, -16
> frame-setup CFI_INSTRUCTION offset $r5, -20
> frame-setup CFI_INSTRUCTION offset $r4, -24
> t2CMPri renamable $r2, 1, 14, $noreg, implicit-def $cpsr
> t2Bcc %bb.2, 11, killed $cpsr
> t2B %bb.1, 14, $noreg
>
> bb.1.for.body.preheader:
> ; predecessors: %bb.0
> successors: %bb.3(0x80000000); %bb.3(100.00%)
> liveins: $r0, $r1, $r2, $r3
> renamable $r12 = nsw t2LSLri renamable $r3, 2, 14, $noreg, $noreg
> renamable $r4 = t2MOVi 0, 14, $noreg, $noreg
> renamable $r7 = t2ADDri renamable $r3, 15, 14, $noreg, $noreg
> renamable $r8 = t2LSRri killed renamable $r7, 4, 14, $noreg, $noreg
> t2B %bb.3, 14, $noreg
> ```
> But it is eliminated by the ARM Low Overhead Loops pass:
> ```
> # *** IR Dump After ARM Low Overhead Loops pass (arm-low-overhead-loops) ***:
>
> bb.1.for.body.preheader:
> ; predecessors: %bb.0
> successors: %bb.2(0x80000000); %bb.2(100.00%)
> liveins: $r0, $r1, $r2, $r3
> renamable $r12 = nsw t2LSLri renamable $r3, 2, 14, $noreg, $noreg
> renamable $r4, dead $cpsr = tMOVi8 0, 14, $noreg
> tB %bb.2, 14, $noreg
> ```
> The frame setup, however, has already been emitted by then and is not updated.
>
> Maybe pass an extra argument into `isTriviallyReMaterializable()`? That could return false if any virtual registers are used for the purpose of MachineLICM and CalcSpillWeights and only return true for the regalloc/coalescer itself.
That's unfortunate.
In general, the hoisting is probably fine. The problem here is that if the instructions are used as input to the low-overhead loop pseudo-instructions, we don't want to hoist them: they're likely to be eliminated by the LowOverheadLoops pass, so it isn't profitable. (The low-overhead loop instructions get formed very late because they have odd restrictions on what branches are allowed.)
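To make the idea concrete, here is a rough standalone sketch of that kind of check. It is a toy model, not LLVM code: `ToyInstr`, `isLoopSetupPseudo`, `feedsLoopSetup`, `shouldHoist`, and the `LoopStartPseudo`/`LoopDecPseudo` opcode strings are all hypothetical names made up for illustration, and a real version would have to live in MachineLICM or a target hook and query the actual instruction info.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a machine instruction; NOT LLVM's MachineInstr.
struct ToyInstr {
  std::string opcode;     // opcode name as plain text
  int def;                // virtual register defined, or -1 if none
  std::vector<int> uses;  // virtual registers read
};

// Hypothetical predicate: does this opcode set up a low-overhead loop?
// The opcode names are placeholders, not real ARM backend opcodes.
bool isLoopSetupPseudo(const ToyInstr &I) {
  return I.opcode == "LoopStartPseudo" || I.opcode == "LoopDecPseudo";
}

// True if the register defined by Candidate is read directly by a
// loop-setup pseudo anywhere in Fn.
bool feedsLoopSetup(const ToyInstr &Candidate,
                    const std::vector<ToyInstr> &Fn) {
  if (Candidate.def < 0)
    return false;
  for (const ToyInstr &User : Fn) {
    if (!isLoopSetupPseudo(User))
      continue;
    for (int Reg : User.uses)
      if (Reg == Candidate.def)
        return true;
  }
  return false;
}

// Assumed profitability rule: do not hoist a value whose only purpose may
// be to feed a pseudo that a later pass is likely to fold away.
bool shouldHoist(const ToyInstr &Candidate,
                 const std::vector<ToyInstr> &Fn) {
  return !feedsLoopSetup(Candidate, Fn);
}
```

With a function modelled as `t2ADDri` defining %16, `t2LSRri` defining %17 from %16, and a `LoopStartPseudo` reading %17, `shouldHoist` rejects the `t2LSRri` but still accepts the `t2ADDri`, because this sketch only looks at direct uses. A usable heuristic would presumably need to follow the use chain so that the whole trip-count computation is treated as a unit.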
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D106408/new/
https://reviews.llvm.org/D106408