[PATCH] D89566: [LV] Epilogue Vectorization with Optimal Control Flow

Fri Nov 27 04:27:35 PST 2020

dmgreen added inline comments.

================
Comment at: llvm/test/Transforms/LoopVectorize/ARM/pointer_iv.ll:292

 define hidden void @pointer_phi_v16i8_add1(i8* noalias nocapture readonly %A, i8* noalias nocapture %B, i32 %y) {
 ; CHECK-LABEL: @pointer_phi_v16i8_add1(
----------------
I was surprised to see an MVE test like this chose to try and epilogue vectorize. I had presumed that would not happen on MVE - we only have a single vector width with no interleaving - the benefit of trying to do a single <8 x i8> iterations after a <16 x i8> main loop is not going to be worth the additional branching/setup we have to do, unfortunately.  I ran some extra tests and added a mve-qabs.ll test, where again the <16 x i8> loop is getting a remainder where it isn't beneficial.

I don't believe that MVE is a vector target that would ever benefit from epilogue vectorization, unfortunately. Can we get some sort of target hook that allows us to disable it? Perhaps something that sets a maximum epilogue vectorization factor given a VF * UF main loop? That would allow us to set it to none, whilst others tune it for their needs, like possibly always having the fallback as a 64bit vector under aarch64 (just a though, not sure if that's best idea or not but it at least allows targets to tune things).

================
Comment at: llvm/test/Transforms/LoopVectorize/optimal-epilog-vectorization.ll:10
+
+; Some simpler cases are found profitable even without tripple or user hints.
+; RUN: opt -passes='loop-vectorize' -epilogue-vectorization-force-VF=2 -S %s | FileCheck --check-prefix=CHECK-PROFITABLE-BY-DEFAULT %s
----------------
-> triple

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89566/new/

https://reviews.llvm.org/D89566