[PATCH] D78847: [LV] Fix recording of BranchTakenCount for FoldTail

Anh Tuyen Tran via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 12 16:09:56 PDT 2020


anhtuyen added a comment.

In D78847#2031205 <https://reviews.llvm.org/D78847#2031205>, @Ayal wrote:

> In D78847#2029339 <https://reviews.llvm.org/D78847#2029339>, @anhtuyen wrote:
>
> [snip]
>
> >> Right, VPWidenCanonicalIVRecipe::execute() also needs to treat VF==1 differently.
> > 
> > I looked at that, too. It still gives us the assert at a different location. We will need to do a little more work.
>
> The above implies setting both VStart to CanonicalIV instead of splatting, and VStep to ConstantInt::get(STy, Part) instead of ConstantVector::get(Indices), when VF==1. Would doing so pass all your tests?
>
> Some of this issue stems from not using the overloaded getBroadcastInstrs().
>
> The more general issues raised are whether to apply foldTail when VF==1 in the absence of masked scalar loads/stores, and/or whether to internally turn foldTail on for small loops (due to cost considerations) when the VF and/or UF are provided externally (bypassing their cost-based selection process).


Ah, yes! That should work for me. Thanks! 
This is only my personal preference, but I think fold-tail-by-masking should be restricted to VF>1.
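
To check that I read the suggestion correctly, here is a minimal stand-alone sketch of the values I would expect the widened canonical IV to hold for a given unroll part. It is plain C++, not the actual LLVM code: the function name widenedCanonicalIV is hypothetical, and it assumes the vector case uses lane indices Part * VF + Lane.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model, not LLVM code: the values of the widened canonical IV
// for one unroll part, given the scalar CanonicalIV, VF and Part.
std::vector<uint64_t> widenedCanonicalIV(uint64_t CanonicalIV, unsigned VF,
                                         unsigned Part) {
  if (VF == 1) {
    // Scalar case: VStart is CanonicalIV itself (no splat), and
    // VStep is the single constant Part rather than a constant vector.
    return {CanonicalIV + Part};
  }
  // Vector case: VStart is a splat of CanonicalIV, and VStep holds the
  // (assumed) lane indices Part * VF + Lane for Lane = 0 .. VF-1.
  std::vector<uint64_t> Lanes;
  Lanes.reserve(VF);
  for (unsigned Lane = 0; Lane < VF; ++Lane)
    Lanes.push_back(CanonicalIV + static_cast<uint64_t>(Part) * VF + Lane);
  return Lanes;
}
```

With VF==1 the general index Part * VF + Lane collapses to Part, so the scalar branch should yield the same value the vector formula would give for a single lane; it only avoids building vector constants.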


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78847/new/

https://reviews.llvm.org/D78847




