[PATCH] D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 10 10:59:38 PST 2020


dmgreen added inline comments.


================
Comment at: llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp:1564
         // with the VPST This should be the divergent instruction
-        MachineInstr *VCMP =
-            VCMPOpcodeToVPT(Divergent->getOpcode()) != 0 ? Divergent : nullptr;
-
-        auto ReplaceVCMPWithVPT = [&]() {
-          // Replace the VCMP with a VPT
-          MachineInstrBuilder MIB = BuildMI(
-              *Divergent->getParent(), Divergent, Divergent->getDebugLoc(),
-              TII->get(VCMPOpcodeToVPT(VCMP->getOpcode())));
-          MIB.addImm(ARMVCC::Then);
-          // Register one
-          MIB.add(VCMP->getOperand(1));
-          // Register two
-          MIB.add(VCMP->getOperand(2));
-          // The comparison code, e.g. ge, eq, lt
-          MIB.add(VCMP->getOperand(3));
-          LLVM_DEBUG(dbgs()
-                     << "ARM Loops: Combining with VCMP to VPT: " << *MIB);
-          LoLoop.BlockMasksToRecompute.insert(MIB.getInstr());
-          LoLoop.ToRemove.insert(VCMP);
-        };
+        VCMP = VCMPOpcodeToVPT(Divergent->getOpcode()) != 0 ? Divergent : VCMP;
 
----------------
samtebbs wrote:
> dmgreen wrote:
> > samtebbs wrote:
> > > dmgreen wrote:
> > > > I think we need to make sure this _is_ the last instruction in the block. Otherwise we can't remove it.
> > > Shouldn't it be removable as long as instructions after it in the block don't use the VCMP's output?
> > The instructions after the VCMP in the vpt block will use the VCMP's predicate, as opposed to the one at the start of the block.
> > So something like:
> >   VPST
> >   VCMP
> >   VORR
> >   ..
> >   VPST
> > 
> > The VORR should be using the vpr from the VCMP (and so it can't be removed), and there's probably not a lot of point combining it into the VPST, as the predicate is already set.
> > 
> > I couldn't spot anything that check for that already.
> Sure for that example the VCMP shouldn't be changed. But if the instructions after the VCMP don't use the vpr, then it can be removed right? That, rather than it being the last instruction, is the fundamental requirement I think you're getting at. This would be fine, for example:
> 
> 
> ```
> VPST
> VCMP
> ADD (scalar)
> SUB (scalar)
> VPST
> ```
Yep, that sounds fine. I meant last instruction in the VPT block, not last instruction in the basic block.

I guess put better - there cannot be any uses of vpr between the VCMP that i being removed and the VPST that is being converted into the VPT. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90935/new/

https://reviews.llvm.org/D90935



More information about the llvm-commits mailing list