[PATCH] D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic
Valeriy Dmitriev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 23 11:13:12 PST 2020
vdmitrie added a comment.
Current SLP has significant drawback with regard to its cost modeling. And this patch highlights it.
Consider we have four scalar loads of i8 type. With prior approach (vectorization overhead) we had cost for such entry 4 (x86 target).
With this new approach we have two entries instead of one: ScatterVectorize loads + NeedToGather GEPs. And costs for these entries are 6 and 10 respectively, thus cost increased from 4 to 16.
And the problem here is once we put this pattern into the tree it pulls cost up for the entire tree. If we have multiple such patterns over the tree their effect is magnified. These entries finally outweigh possible profit of vectorization for remaining portion of the tree and we end up not vectorizing it at all (even if downstream optimizations could probably change it into optimal code). If SLP could make choice vectorization overhead vs gather intrinsic based in their costs while building vectorizable tree the outcome could be different.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D90445/new/
https://reviews.llvm.org/D90445
More information about the llvm-commits
mailing list