[PATCH] D78847: [LV] Fix recording of BranchTakenCount for FoldTail

Ayal Zaks via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 12 04:47:54 PDT 2020


Ayal added a comment.

In D78847#2029339 <https://reviews.llvm.org/D78847#2029339>, @anhtuyen wrote:

[snip]

>> Right, VPWidenCanonicalIVRecipe::execute() also needs to treat VF==1 differently.
> 
> I looked at that, too. It still gives us the assert at a different location. We will need a little more work to do.

The above implies that, when VF==1, VStart should be set to CanonicalIV instead of a splat of it, and VStep to ConstantInt::get(STy, Part) instead of ConstantVector::get(Indices). Would doing so pass all your tests?
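
To make the suggestion concrete, here is a minimal standalone sketch of the values the recipe would produce per unroll part. It is plain C++ on integers, not LLVM API code and not the actual patch: the function name widenCanonicalIVModel is hypothetical, and it assumes each lane's value is the sum of VStart and VStep, with lane indices of Part * VF + Lane in the vector case.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model (not LLVM code): returns the lane values of the widened
// canonical IV for one unroll part, assuming lane value = VStart + VStep.
std::vector<uint64_t> widenCanonicalIVModel(uint64_t CanonicalIV, unsigned VF,
                                            unsigned Part) {
  std::vector<uint64_t> Lanes;
  if (VF == 1) {
    // Scalar case: VStart is CanonicalIV itself (no splat) and VStep is the
    // scalar constant Part, so the result is a single value.
    Lanes.push_back(CanonicalIV + Part);
    return Lanes;
  }
  // Vector case: VStart is a splat of CanonicalIV and VStep is the constant
  // vector of indices <Part * VF + Lane>.
  for (unsigned Lane = 0; Lane < VF; ++Lane)
    Lanes.push_back(CanonicalIV + uint64_t(Part) * VF + Lane);
  return Lanes;
}
```

With VF==1 the general index formula Part * VF + Lane already collapses to Part, which is why the scalar step can simply be the constant Part.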

Some of this issue stems from not using the overloaded getBroadcastInstrs().

The more general issues raised are (1) whether to apply foldTail when VF==1 in the absence of masked scalar loads/stores, and (2) whether to turn foldTail on internally for small loops (due to cost considerations) when the VF and/or UF are provided externally, bypassing their cost-based selection process.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78847/new/

https://reviews.llvm.org/D78847




