[PATCH] D87616: [ARM][LowOverheadLoops] Combine a VCMP and VPST into a VPT

Sam Parker via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 15 04:32:24 PDT 2020


samparker added inline comments.


================
Comment at: llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp:1302
+          // Find the VCMP preceding the VPST
+          if (I->getOpcode() == ARM::MVE_VCMPs8 && ++I == E)
+            VCMP = &*(--I);
----------------
Shouldn't you be searching for any VCMP opcode? RDA would be a nicer way of finding the VPR def, but that shouldn't be necessary anyway - I'm pretty certain the VCMP should be the 'Divergent' instruction.
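
To illustrate the first point, here is a minimal standalone sketch of matching any VCMP variant rather than only MVE_VCMPs8. It does not use the LLVM API: the Opcode enum, isVCMP, findVCMPBeforeVPST and NotFound are all hypothetical names made up for this example, and the enum lists only a few stand-in opcodes. Like the quoted patch, it only accepts a VCMP that sits immediately before the VPST.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for a few MVE opcodes; the real backend defines
// many more VCMP variants than are listed here.
enum class Opcode {
  MVE_VCMPs8,
  MVE_VCMPs16,
  MVE_VCMPu32,
  MVE_VCMPf32,
  MVE_VADD,
  MVE_VPST
};

// Hypothetical helper: true for every VCMP variant in the toy enum above,
// not just the s8 form.
bool isVCMP(Opcode Op) {
  switch (Op) {
  case Opcode::MVE_VCMPs8:
  case Opcode::MVE_VCMPs16:
  case Opcode::MVE_VCMPu32:
  case Opcode::MVE_VCMPf32:
    return true;
  default:
    return false;
  }
}

// Sentinel returned when no suitable VCMP is found.
constexpr std::size_t NotFound = static_cast<std::size_t>(-1);

// Return the index of the instruction immediately preceding the VPST at
// VPSTIdx if that instruction is any VCMP; otherwise return NotFound.
std::size_t findVCMPBeforeVPST(const std::vector<Opcode> &Block,
                               std::size_t VPSTIdx) {
  assert(VPSTIdx < Block.size() && Block[VPSTIdx] == Opcode::MVE_VPST &&
         "VPSTIdx must point at a VPST");
  if (VPSTIdx == 0)
    return NotFound; // Nothing precedes the VPST in this block.
  if (isVCMP(Block[VPSTIdx - 1]))
    return VPSTIdx - 1;
  return NotFound;
}
```

With a block of {VADD, VCMPf32, VPST} this returns the index of the VCMPf32, which a check against MVE_VCMPs8 alone would miss. In the actual pass the predicate would need to cover the backend's full set of VCMP opcodes.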


================
Comment at: llvm/test/CodeGen/Thumb2/LowOverheadLoops/vcmp-vpst-combination.ll:1
+; RUN: llc -mtriple=thumbv8.1m.main-none-none-eabi -mattr=+mve.fp -O3 -tail-predication=force-enabled-no-reductions -o - %s | FileCheck %s
+
----------------
Probably best not to run at -O3, just in case upstream/downstream have different optimisation pipelines.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87616/new/

https://reviews.llvm.org/D87616


