[PATCH] D158988: [LV] Choose the wider VF where they have same cost

Thu Aug 31 01:01:43 PDT 2023

david-arm added a reviewer: fhahn.
david-arm added a subscriber: fhahn.
david-arm added a comment.
Herald added a subscriber: StephenFan.

In D158988#4630317 <https://reviews.llvm.org/D158988#4630317>, @Allen wrote:

> In D158988#4625738 <https://reviews.llvm.org/D158988#4625738>, @kmclaughlin wrote:
>
>> Hi @Allen,
>> I recently submitted D157628 <https://reviews.llvm.org/D157628>, which lowers the cost of extends when they can be folded into a urhadd or srhadd instruction.
>> The tests I added are similar to the one in this patch, so I was wondering if D157628 <https://reviews.llvm.org/D157628> may have fixed the same issue as your changes?
>
> Thanks for the information. I tried your PR and found that it only affects the fixed-length VFs, so **vscale x 8** is still preferred with your PR.

Adding @fhahn as a reviewer.

Which CPU are you targeting and how did you build your example? I believe that @kmclaughlin used D157628 <https://reviews.llvm.org/D157628> to show that, for certain loops in x264, we choose a higher VF when tail-folding on some SVE2-enabled CPUs because of the lower cost of the zext and sext instructions. Regardless of that, I'm still a bit worried by this patch, because I believe it is a very significant change that will affect all targets across a wide range of CPUs. I'm not saying this change is wrong, but could you describe in the commit message which benchmarks you have run and on which targets?
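
To make the scope of the concern concrete, here is a minimal, self-contained sketch of the kind of tie-break the patch title describes. It is not the patch's code and not the vectorizer's real implementation: `Candidate`, `isBetter`, `selectVF` and the `PreferWiderOnTie` flag are hypothetical names used only for illustration, and the costs are made-up numbers. The per-lane costs are compared by cross-multiplication so that no division is needed.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one vectorization-factor candidate.
struct Candidate {
  unsigned Width; // lanes processed per vector iteration
  uint64_t Cost;  // estimated cost of one vector iteration
};

// Returns true if A should replace B as the chosen candidate.
// Cost-per-lane (A.Cost / A.Width vs. B.Cost / B.Width) is compared
// by cross-multiplying, which avoids rounding from integer division.
bool isBetter(const Candidate &A, const Candidate &B, bool PreferWiderOnTie) {
  uint64_t LHS = A.Cost * B.Width;
  uint64_t RHS = B.Cost * A.Width;
  if (LHS != RHS)
    return LHS < RHS;
  // Equal per-lane cost: optionally break the tie towards the wider VF.
  return PreferWiderOnTie && A.Width > B.Width;
}

// Picks the best candidate from a non-empty list.
Candidate selectVF(const std::vector<Candidate> &Cands, bool PreferWiderOnTie) {
  Candidate Best = Cands.front();
  for (const Candidate &C : Cands)
    if (isBetter(C, Best, PreferWiderOnTie))
      Best = C;
  return Best;
}
```

With candidates `{4, 8}`, `{8, 16}` and `{16, 40}`, the first two have the same per-lane cost, so the flag alone decides whether width 4 or width 8 is selected. Because a tie-break of this kind sits in target-independent code, it can change the chosen VF on any target where two candidates happen to have equal cost, which is why benchmark data matters here.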

https://reviews.llvm.org/D158988