[PATCH] D76709: [Target][ARM] Adding MVE VPT Optimisation Pass

Tue Mar 24 16:17:48 PDT 2020

dmgreen added a comment.

Is it possible to split this into two patches? The pass and "replaces VCMPs with VPNOTs when possible" part, then the second part to replace the re-use with the not.  I think that would make each part easier to review, more manageable.

> The first optimisation (VCMPs into VPNOTs) could technically be done in the MVE VPT Block insertion pass, but I think it's better to do it in this new pass instead of overloading the block Insertion pass too much. 
>  The second optimisation (VPNOT Insertion for "spill prevention") can't be done in the Block Insertion pass as it needs to be done before register allocation (= before spill/reload instructions are emitted).
>  Overall, since both optimisations deal with VPNOTs and aren't directly related to VPT Block insertion/creation, I felt that it was a good idea to put them in a separate pass.

Plus (I think) in order to do #2, it's easier if you have already done #1.

================
Comment at: llvm/lib/Target/ARM/ARMTargetMachine.cpp:489
 void ARMPassConfig::addPreRegAlloc() {
+  addPass(createMVEVPTOptimisationsPass());
+
----------------
Perhaps put this into the below getOptLevel() != CodeGenOpt::None block? As it is an optimisation

================
Comment at: llvm/lib/Target/ARM/MVEVPTOptimisationsPass.cpp:7
+//
+//===----------------------------------------------------------------------===//
+
----------------
Can you add a file comment here like in other files, explaining what the pass does, like you have in the commit message.

================
Comment at: llvm/lib/Target/ARM/MVEVPTOptimisationsPass.cpp:54
+
+static bool IsVCMP(unsigned Opcode) {
+  switch (Opcode) {
----------------
Could you just use VCMPOpcodeToVPT, and check the return value isn't 0? To save this extra list being needed here.

================
Comment at: llvm/test/CodeGen/Thumb2/mve-vcmpf.ll:700
 ; CHECK-MVEFP-NEXT:    vpt.f32 le, q1, q0
-; CHECK-MVEFP-NEXT:    vcmpt.f32 lt, q0, q1
+; CHECK-MVEFP-NEXT:    vpnott
 ; CHECK-MVEFP-NEXT:    vpnot
----------------
What's going on here? I think it needs to be checking q1 <= q0 && q0 < q1. ord means ordered, which means no NaN's. (Floating point compares can be tricky like that).

================
Comment at: llvm/test/CodeGen/Thumb2/mve-vpt-optimisations.mir:1
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -run-pass arm-mve-vpt-opts %s -o - | FileCheck %s
----------------
I think this test would be simpler if there was a number of test functions inside it, each testing one thing (or maybe a small collection of things). Otherwise they run into each other a bit, and it will be hard to tell what's wrong if one of them starts to fail.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76709/new/

https://reviews.llvm.org/D76709