[PATCH] D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 16 03:50:26 PST 2020


dmgreen added inline comments.


================
Comment at: llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp:1619
+          }
+          // If the instruction after the VCMP is predicated then a different code path is expected to have merged the VCMP and VPST already. This assertion protects against changed to that behaviour
+          assert(!IntermediateInstrsUseVPR && "Instructions between the VCMP and VPST are not expected to be predicated");
----------------
Can you clang-format this?


================
Comment at: llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp:1620
+          // If the instruction after the VCMP is predicated then a different code path is expected to have merged the VCMP and VPST already. This assertion protects against changed to that behaviour
+          assert(!IntermediateInstrsUseVPR && "Instructions between the VCMP and VPST are not expected to be predicated");
+          ReplaceVCMPWithVPT(VCMP, VPST);
----------------
Is it possible to have:
```
VPST
VORR vpr
VCMP vpr
..
VPST
VORR vpr
..
VPST
VORR vpr
```
And would this attempt to fold the VCMP into the last vptblock? Do we have a test for that case?



================
Comment at: llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp:1621
+          assert(!IntermediateInstrsUseVPR && "Instructions between the VCMP and VPST are not expected to be predicated");
+          ReplaceVCMPWithVPT(VCMP, VPST);
+          LLVM_DEBUG(dbgs() << "ARM Loops: Removing VPST: " << *VPST);
----------------
Do we need to do the same thing to remove kill flags as D90964?


================
Comment at: llvm/test/CodeGen/Thumb2/LowOverheadLoops/vcmp-vpst-combination-across-blocks.mir:187-190
+    MVE_VPST 4, implicit $vpr
+    renamable $vpr = MVE_VCMPf32 renamable $q1, renamable $q0, 12, 1, killed renamable $vpr
+    renamable $q2 = MVE_VORR renamable $q1, renamable $q1, 0, $noreg, killed renamable $q2
+    renamable $vpr = MVE_VCMPf32 renamable $q2, renamable $q1, 12, 1, killed renamable $vpr
----------------
Some of these still look a little funny, like they would not really be valid inputs. Are they to test the various edge cases?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90935/new/

https://reviews.llvm.org/D90935



More information about the llvm-commits mailing list