[PATCH] D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks

Sam Tebbs via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 13 02:52:13 PST 2020


samtebbs added inline comments.


================
Comment at: llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp:1595
+      auto Next = ++MachineBasicBlock::iterator(VPST);
+      bool NextIsPredicated = getVPTInstrPredicate(*Next) != ARMVCC::None;
+      bool IntermediateInstrsUseVPR = false;
----------------
dmgreen wrote:
> I think this should always be true. Maybe turn it into an assert?
👍 


================
Comment at: llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp:1597
+      bool IntermediateInstrsUseVPR = false;
+      if (VCMP) {
+        for (auto I = ++MachineBasicBlock::iterator(VCMP),
----------------
dmgreen wrote:
> The `if VCMP` can probably be hoisted to the `Insts.front()->getOpcode() == ARM::MVE_VPST` if.
👍 


================
Comment at: llvm/test/CodeGen/Thumb2/LowOverheadLoops/vcmp-vpst-combination-across-blocks.mir:283
+    renamable $q0 = MVE_VORR renamable $q1, renamable $q1, 0, $noreg, killed renamable $q0
+    MVE_VPST 4, killed implicit $vpr
+    renamable $q1 = MVE_VORR killed renamable $q1, renamable $q0, 0, $noreg, renamable $q1
----------------
dmgreen wrote:
> `MVE_VPST 8` I think, for this block
👍 


================
Comment at: llvm/test/CodeGen/Thumb2/LowOverheadLoops/vcmp-vpst-combination-across-blocks.mir:369
+    renamable $r0, renamable $q1 = MVE_VLDRWU32_post killed renamable $r0, 16, 1, renamable $vpr
+    MVE_VPST 8, implicit $vpr
+    renamable $vpr = MVE_VCMPf32 renamable $q1, renamable $q0, 12, 1, killed renamable $vpr
----------------
dmgreen wrote:
> Should be `MVE_VPST 4` I think
👍 


================
Comment at: llvm/test/CodeGen/Thumb2/LowOverheadLoops/vcmp-vpst-combination-across-blocks.mir:371
+    renamable $vpr = MVE_VCMPf32 renamable $q1, renamable $q0, 12, 1, killed renamable $vpr
+    renamable $q0 = MVE_VORR renamable $q1, renamable $q1, 0, renamable $vpr, killed renamable $q0
+    MVE_VPST 8, implicit $vpr
----------------
dmgreen wrote:
> 0 -> 1 for "ARMVCC" operand
I made it 0 intentionally to simulate a non-predicated vpr use.


================
Comment at: llvm/test/CodeGen/Thumb2/LowOverheadLoops/vcmp-vpst-combination-across-blocks.mir:6
+  define void @combine_previous() {
+  entry:
+    br label %while.body.preheader
----------------
dmgreen wrote:
> If you remove the ".entry" from "bb.0.entry" below, you can probably remove these skeletons too.
That's much cleaner


================
Comment at: llvm/test/CodeGen/Thumb2/LowOverheadLoops/vcmp-vpst-combination-across-blocks.mir:170-182
+  hasPatchPoint:   false
+  stackSize:       8
+  offsetAdjustment: 0
+  maxAlignment:    4
+  adjustsStack:    false
+  hasCalls:        false
+  stackProtector:  ''
----------------
dmgreen wrote:
> A lot of this can often be removed.
Very nice


================
Comment at: llvm/test/CodeGen/Thumb2/LowOverheadLoops/vcmp-vpst-combination-across-blocks.mir:347-348
+    renamable $vpr = MVE_VCMPf32 renamable $q1, renamable $q0, 12, 1, killed renamable $vpr
+    renamable $q0 = MVE_VORR killed renamable $q1, renamable $q1, 1, renamable $vpr, killed renamable $q0
+    MVE_VPST 8, killed implicit $vpr
+    t2LoopEnd renamable $lr, %bb.6, implicit-def dead $cpsr
----------------
dmgreen wrote:
> I don't think this is valid input, with the VPST not creating a block with any instruction and the MVE_VORR not being in a the previous block. Same for some of the tests below.
> 
> Maybe make sure the last VPST block has some instruction in it, and the middle VPST has the correct block mask
That's true! I've fixed the blocks and renamed the functions to make it more clear what they are testing.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90935/new/

https://reviews.llvm.org/D90935



More information about the llvm-commits mailing list