[PATCH] D146199: [LoopVectorize] Don't tail-fold for scalable VFs when there is no scalar tail

Tue Mar 21 10:15:45 PDT 2023

david-arm added a comment.

In D146199#4198834 <https://reviews.llvm.org/D146199#4198834>, @dmgreen wrote:

> I had written a very similar patch recently, but it would only use the fixed length if the scalable was unknown. The performance of it was pretty bad though, so I ended up dropping it. I had noticed that there is an xfail in llvm/test/Transforms/LoopVectorize/AArch64/eliminate-tail-predication.ll at the moment. Can it now be replaced with a check for `store <vscale x 4 x i32>`?
>
> TargetTransformInfo::isVScaleKnownToBeAPowerOfTwo isn't going to be useable from all the places that need it like instcombine. It might be best to add it to somewhere like vscale_range in the long run?

Hi @dmgreen, I don't have a strong objection to doing this as a new vscale_power_of_2 attribute, but I am trying to avoid changing the LangRef again until we have a compelling case to do so. This is what we did originally with the vscale max: we first added a TTI interface, and over time we saw more and more convincing arguments for moving it into a vscale_range attribute instead. I think there is nothing to stop us doing something similar in future, right? Of course, there is already a TLI hook of the same name that would need removing too.
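
For anyone following along, here is a rough standalone sketch of why the power-of-two property matters for deciding that there is no scalar tail. It is only an illustration, not the code in the patch: it assumes the relevant check is divisibility of the trip count by the largest possible VF, and the helper names (isPowerOfTwo, noScalarTailForAnyVScale, noScalarTailBruteForce) are hypothetical, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; none of these exist in LLVM.
constexpr bool isPowerOfTwo(std::uint64_t X) {
  return X != 0 && (X & (X - 1)) == 0;
}

// Assumption: vscale is a power of two in [1, MaxVScale] and MaxVScale is
// itself a power of two. Then every possible runtime VF (MinElts * vscale)
// divides MinElts * MaxVScale, so a single divisibility check on the trip
// count covers all of them.
bool noScalarTailForAnyVScale(std::uint64_t TripCount, std::uint64_t MinElts,
                              std::uint64_t MaxVScale) {
  assert(MinElts != 0 && "need at least one lane per vscale unit");
  assert(isPowerOfTwo(MaxVScale) && "sketch assumes a power-of-two maximum");
  return TripCount % (MinElts * MaxVScale) == 0;
}

// Exhaustive reference: checks each candidate vscale individually. With
// PowerOfTwoOnly == false it also considers values such as 3 or 5, which is
// where the single check above would stop being sufficient.
bool noScalarTailBruteForce(std::uint64_t TripCount, std::uint64_t MinElts,
                            std::uint64_t MaxVScale, bool PowerOfTwoOnly) {
  for (std::uint64_t VScale = 1; VScale <= MaxVScale; ++VScale) {
    if (PowerOfTwoOnly && !isPowerOfTwo(VScale))
      continue;
    if (TripCount % (MinElts * VScale) != 0)
      return false;
  }
  return true;
}
```

With a trip count of 64, `<vscale x 4 x i32>` and a maximum vscale of 16, both functions agree that no tail is needed when vscale is restricted to powers of two, whereas the brute-force check fails once vscale = 3 is allowed (64 is not a multiple of 12). That is the sense in which the decision depends on knowing vscale is a power of two, wherever that knowledge ends up living.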

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146199/new/

https://reviews.llvm.org/D146199