[PATCH] D78847: [LV] Fix recording of BackedgeTakenCount for FoldTail
Ayal Zaks via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 12 04:47:54 PDT 2020
Ayal added a comment.
In D78847#2029339 <https://reviews.llvm.org/D78847#2029339>, @anhtuyen wrote:
>
[snip]
>> Right, VPWidenCanonicalIVRecipe::execute() also needs to treat VF==1 differently.
>
> I looked at that, too. It still gives us the assert at a different location. We will need a little more work to do.
>
>
The above implies, when VF==1, setting VStart to CanonicalIV instead of splatting it, and setting VStep to ConstantInt::get(STy, Part) instead of ConstantVector::get(Indices). Would doing so pass all your tests?
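To make the suggestion concrete, here is a minimal standalone sketch of the intended values. It is a model only, not the LLVM implementation: ModelValue, makeStart, makeStep and widenedIV are hypothetical names, plain integers stand in for IR values, and it assumes the start and step are combined with a lane-wise add.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an IR value: a scalar (isVector == false, one
// lane) or a fixed-width vector. This is not an LLVM type.
struct ModelValue {
  bool isVector;
  std::vector<std::uint64_t> lanes;
};

// Start value: the canonical IV itself when VF == 1, otherwise a VF-wide splat.
ModelValue makeStart(std::uint64_t canonicalIV, unsigned VF) {
  if (VF == 1)
    return ModelValue{false, {canonicalIV}};
  return ModelValue{true, std::vector<std::uint64_t>(VF, canonicalIV)};
}

// Step value for unroll part `part`: the scalar constant Part when VF == 1,
// otherwise the constant vector <Part*VF + 0, ..., Part*VF + VF-1>.
ModelValue makeStep(unsigned part, unsigned VF) {
  if (VF == 1)
    return ModelValue{false, {static_cast<std::uint64_t>(part)}};
  std::vector<std::uint64_t> indices;
  for (unsigned lane = 0; lane < VF; ++lane)
    indices.push_back(static_cast<std::uint64_t>(part) * VF + lane);
  return ModelValue{true, indices};
}

// Assumed combination step: lane-wise add of start and step for one part.
ModelValue widenedIV(std::uint64_t canonicalIV, unsigned part, unsigned VF) {
  ModelValue start = makeStart(canonicalIV, VF);
  ModelValue step = makeStep(part, VF);
  assert(start.lanes.size() == step.lanes.size());
  for (std::size_t i = 0; i < start.lanes.size(); ++i)
    start.lanes[i] += step.lanes[i];
  return start;
}
```

With VF==1 both operands stay scalar, so no single-element vector type is ever formed; with VF>1 the model reduces to the usual splat plus per-lane index vector.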
Some of this issue stems from not using the overloaded getBroadcastInstrs().
The more general issues raised are (1) whether to apply foldTail when VF==1 in the absence of masked scalar loads/stores, and (2) whether to turn foldTail on internally for small loops (due to cost considerations) when the VF and/or UF are provided externally, bypassing their cost-based selection process.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D78847/new/
https://reviews.llvm.org/D78847