[PATCH] D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic

Mon Nov 23 11:13:12 PST 2020

vdmitrie added a comment.

The current SLP vectorizer has a significant drawback in its cost modeling, and this patch highlights it.
Consider four scalar loads of type i8. With the prior approach (vectorization overhead), the cost of such an entry was 4 (x86 target).
With the new approach we have two entries instead of one: ScatterVectorize loads plus NeedToGather GEPs. The costs of these entries are 6 and 10 respectively, so the cost increases from 4 to 16.
The problem is that once this pattern is in the tree, it pulls the cost up for the entire tree. If the tree contains several such patterns, their effect is magnified. These entries end up outweighing the possible profit of vectorizing the remaining portion of the tree, and we do not vectorize it at all (even if downstream optimizations could probably turn it into optimal code). If SLP could choose between vectorization overhead and the gather intrinsic based on their costs while building the vectorizable tree, the outcome could be different.
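
To make the arithmetic concrete, here is a minimal Python sketch of how the extra cost can flip the decision for a whole tree. It is not SLPVectorizer code: the helper names, the assumed savings of 10 for the remaining portion of the tree, and the "total cost below minus threshold" profitability rule are assumptions made for illustration. Only the entry costs 4, 6 and 10 come from the example above.

```python
# Illustrative only: these helpers are hypothetical, not SLPVectorizer code.

def tree_cost(entry_costs):
    """Total tree cost, modeled as the sum of the per-entry costs."""
    return sum(entry_costs)


def is_profitable(cost, threshold=0):
    """Assumed rule: vectorize only when the total cost is below -threshold."""
    return cost < -threshold


# Costs quoted above for four scalar i8 loads on x86.
OLD_LOADS = [4]      # one entry costed as vectorization overhead
NEW_LOADS = [6, 10]  # ScatterVectorize loads + NeedToGather GEPs

# Assumption: the remaining portion of the tree saves 10
# (a negative cost means vectorization is a win).
REST_OF_TREE = [-10]

old_total = tree_cost(OLD_LOADS + REST_OF_TREE)
new_total = tree_cost(NEW_LOADS + REST_OF_TREE)

# The suggested alternative: pick the cheaper modeling per pattern
# while the tree is being built.
best_loads = min(OLD_LOADS, NEW_LOADS, key=tree_cost)
best_total = tree_cost(best_loads + REST_OF_TREE)

print("old: ", old_total, is_profitable(old_total))
print("new: ", new_total, is_profitable(new_total))
print("best:", best_total, is_profitable(best_total))
```

Under these assumed numbers the tree is vectorized with the prior costing, rejected with the new one, and vectorized again if the cheaper alternative is chosen per pattern.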

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90445/new/

https://reviews.llvm.org/D90445