[PATCH] D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 12 00:06:27 PST 2020


dmgreen added inline comments.


================
Comment at: llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp:1513
 
-    if (VPTState::isEntryPredicatedOnVCTP(Block, /*exclusive*/true)) {
+    auto ReplaceVCMPWithVPT = [&](MachineInstr *At) {
+      assert(VCMP && "Replacing a removed or non-existent VCMP");
----------------
It may be better if this takes VCMP as an argument - that would help keep the logic around here clearer where things are being used.


================
Comment at: llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp:1563
         // with the VPST This should be the divergent instruction
-        MachineInstr *VCMP =
-            VCMPOpcodeToVPT(Divergent->getOpcode()) != 0 ? Divergent : nullptr;
-
-        auto ReplaceVCMPWithVPT = [&]() {
-          // Replace the VCMP with a VPT
-          MachineInstrBuilder MIB = BuildMI(
-              *Divergent->getParent(), Divergent, Divergent->getDebugLoc(),
-              TII->get(VCMPOpcodeToVPT(VCMP->getOpcode())));
-          MIB.addImm(ARMVCC::Then);
-          // Register one
-          MIB.add(VCMP->getOperand(1));
-          // Register two
-          MIB.add(VCMP->getOperand(2));
-          // The comparison code, e.g. ge, eq, lt
-          MIB.add(VCMP->getOperand(3));
-          LLVM_DEBUG(dbgs()
-                     << "ARM Loops: Combining with VCMP to VPT: " << *MIB);
-          LoLoop.BlockMasksToRecompute.insert(MIB.getInstr());
-          LoLoop.ToRemove.insert(VCMP);
-        };
+        VCMP = VCMPOpcodeToVPT(Divergent->getOpcode()) != 0 ? Divergent : VCMP;
 
----------------
I feel like this should be using ` : nullptr`, even if just to keep it simpler. From what I can tell the older VCMP wouldn't ever be useful after the point that we have seen something that has altered the vpr predicate.


================
Comment at: llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp:1600
+             I != E; ++I) {
+          if (getVPTInstrPredicate(*I) != ARMVCC::None) {
+            IntermediateInstrsUseVPR = true;
----------------
Checking that an operand is vpr may be better.


================
Comment at: llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp:1618-1619
+      } else if (!NextIsPredicated) {
+        LLVM_DEBUG(dbgs() << "ARM Loops: Removing VPST: " << *VPST);
+        LoLoop.ToRemove.insert(VPST);
+      }
----------------
I _think_ this case would be invalid input, and not something we would have to worry about (it could be an assert - ideally there would be code in the verifier that checked that VPT blocks were always valid, but no one has written that yet).


================
Comment at: llvm/test/CodeGen/Thumb2/LowOverheadLoops/vcmp-vpst-combination-across-blocks.mir:6
+  define void @combine_previous() {
+  entry:
+    br label %while.body.preheader
----------------
If you remove the ".entry" from "bb.0.entry" below, you can probably remove these skeletons too.


================
Comment at: llvm/test/CodeGen/Thumb2/LowOverheadLoops/vcmp-vpst-combination-across-blocks.mir:170-182
+  hasPatchPoint:   false
+  stackSize:       8
+  offsetAdjustment: 0
+  maxAlignment:    4
+  adjustsStack:    false
+  hasCalls:        false
+  stackProtector:  ''
----------------
A lot of this can often be removed.


================
Comment at: llvm/test/CodeGen/Thumb2/LowOverheadLoops/vcmp-vpst-combination-across-blocks.mir:347-348
+    renamable $vpr = MVE_VCMPf32 renamable $q1, renamable $q0, 12, 1, killed renamable $vpr
+    renamable $q0 = MVE_VORR killed renamable $q1, renamable $q1, 1, renamable $vpr, killed renamable $q0
+    MVE_VPST 8, killed implicit $vpr
+    t2LoopEnd renamable $lr, %bb.6, implicit-def dead $cpsr
----------------
I don't think this is valid input, with the VPST not creating a block with any instruction and the MVE_VORR not being in a the previous block. Same for some of the tests below.

Maybe make sure the last VPST block has some instruction in it, and the middle VPST has the correct block mask


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90935/new/

https://reviews.llvm.org/D90935



More information about the llvm-commits mailing list