[PATCH] D106408: Allow rematerialization of virtual reg uses

Eli Friedman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 22 12:28:23 PDT 2021


efriedma added inline comments.


================
Comment at: llvm/test/CodeGen/Thumb2/LowOverheadLoops/memcall.ll:8
+; CHECK-NEXT:    .save {r4, r5, r6, r7, r8, lr}
+; CHECK-NEXT:    push.w {r4, r5, r6, r7, r8, lr}
 ; CHECK-NEXT:    cmp r2, #1
----------------
rampitec wrote:
> efriedma wrote:
> > This looks like a regression.  Not sure what's happening here; we're saving both r7 and r8, but they aren't used.  Maybe something related to the hardware loop instructions?
> Here is what happens, these two instructions are now `isTriviallyReMaterializable()`:
> ```
>   %16:rgpr = t2ADDri %8:rgpr, 15, 14, $noreg, $noreg
>   %17:rgpr = t2LSRri %16:rgpr, 4, 14, $noreg, $noreg
> ```
> MachineLICM hoists these out of the loop because of that. RA uses r8 and r8 ends up in the frame setup:
> ```
> bb.0.entry:
>   successors: %bb.1(0x50000000), %bb.2(0x30000000); %bb.1(62.50%), %bb.2(37.50%)
>   liveins: $r0, $r1, $r2, $r3, $r4, $r5, $r6, $r7, $r8, $lr
>   $sp = frame-setup t2STMDB_UPD $sp(tied-def 0), 14, $noreg, killed $r4, killed $r5, killed $r6, killed $r7, killed $r8, killed $lr
>   frame-setup CFI_INSTRUCTION def_cfa_offset 24
>   frame-setup CFI_INSTRUCTION offset $lr, -4
>   frame-setup CFI_INSTRUCTION offset $r8, -8
>   frame-setup CFI_INSTRUCTION offset $r7, -12
>   frame-setup CFI_INSTRUCTION offset $r6, -16
>   frame-setup CFI_INSTRUCTION offset $r5, -20
>   frame-setup CFI_INSTRUCTION offset $r4, -24
>   t2CMPri renamable $r2, 1, 14, $noreg, implicit-def $cpsr
>   t2Bcc %bb.2, 11, killed $cpsr
>   t2B %bb.1, 14, $noreg
> 
> bb.1.for.body.preheader:
> ; predecessors: %bb.0
>   successors: %bb.3(0x80000000); %bb.3(100.00%)
>   liveins: $r0, $r1, $r2, $r3
>   renamable $r12 = nsw t2LSLri renamable $r3, 2, 14, $noreg, $noreg
>   renamable $r4 = t2MOVi 0, 14, $noreg, $noreg
>   renamable $r7 = t2ADDri renamable $r3, 15, 14, $noreg, $noreg
>   renamable $r8 = t2LSRri killed renamable $r7, 4, 14, $noreg, $noreg
>   t2B %bb.3, 14, $noreg
> ```
> But it is eliminated by the ARM Low Overhead Loops pass:
> ```
> # *** IR Dump After ARM Low Overhead Loops pass (arm-low-overhead-loops) ***:
> 
> bb.1.for.body.preheader:
> ; predecessors: %bb.0
>   successors: %bb.2(0x80000000); %bb.2(100.00%)
>   liveins: $r0, $r1, $r2, $r3
>   renamable $r12 = nsw t2LSLri renamable $r3, 2, 14, $noreg, $noreg
>   renamable $r4, dead $cpsr = tMOVi8 0, 14, $noreg
>   tB %bb.2, 14, $noreg
> ```
> Frame setup however already done and not updated.
> 
> Maybe pass an extra argument into `isTriviallyReMaterializable()`? That could return false if any virtual registers are used for the purpose of MachineLICM and CalcSpillWeights and only return true for the regalloc/coalescer itself.
That's unfortunate.

In general, the hoisting is probably fine.  The problem here is that if the instructions are used as input to the low-overhead loop pseudo-instructions, we don't want to hoist them: they're likely to be eliminated by the LowOverheadLoops pass, so it isn't profitable.  (The low-overhead loop instructions get formed very late because they have odd restrictions on what branches are allowed.)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106408/new/

https://reviews.llvm.org/D106408



More information about the llvm-commits mailing list