[PATCH] D57059: [SLP] Initial support for the vectorization of the non-power-of-2 vectors.

Fri Nov 20 11:18:38 PST 2020

ABataev added inline comments.

================
Comment at: llvm/test/Transforms/SLPVectorizer/X86/pr47623.ll:26
 ;
 ; AVX-LABEL: @foo(
+; AVX-NEXT:    [[TMP1:%.*]] = load i32, i32* getelementptr inbounds ([8 x i32], [8 x i32]* @b, i64 0, i64 0), align 16
----------------
craig.topper wrote:
> ABataev wrote:
> > xbolva00 wrote:
> > > Regression on avx?
> > Yes, it looks like an issue with the cost of `@llvm.masked.gather` when the mask contains some undefs.
> Gather is slow on CPUs prior to AVX512, and its cost is proportional to the number of elements. I don't think the value of the mask should be a factor.
True, but in some cases it can be optimized into `gather + shuffle` instead of a wide `gather` if there are undefs in the mask.
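
To illustrate the trade-off being discussed, here is a rough toy model, not LLVM's actual cost tables. The constants `PER_LANE_GATHER_COST` and `SHUFFLE_COST` and all function names are hypothetical, and it only handles the simple case where the undef lanes form a suffix that allows narrowing to a smaller power-of-2 width:

```python
# Toy cost model. The constants are illustrative assumptions,
# not values taken from LLVM's X86 cost tables.
PER_LANE_GATHER_COST = 4  # assumed cost of fetching one lane
SHUFFLE_COST = 1          # assumed cost of one widening shuffle


def next_pow2(n: int) -> int:
    """Return the smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def wide_gather_cost(mask):
    """Cost proportional to the total lane count, ignoring the mask value."""
    return len(mask) * PER_LANE_GATHER_COST


def narrow_gather_plus_shuffle_cost(mask):
    """Gather a power-of-2 prefix covering the defined lanes, then widen.

    Mask entries: True means the lane is loaded, None means an undef lane.
    """
    defined = [i for i, m in enumerate(mask) if m is not None]
    if not defined:
        return 0  # nothing needs to be loaded at all
    narrow_width = min(next_pow2(defined[-1] + 1), len(mask))
    if narrow_width == len(mask):
        return wide_gather_cost(mask)  # no narrowing is possible
    return narrow_width * PER_LANE_GATHER_COST + SHUFFLE_COST


# 8-lane gather where only the first 3 lanes are defined.
mask = [True, True, True, None, None, None, None, None]
print(wide_gather_cost(mask), narrow_gather_plus_shuffle_cost(mask))
```

Under these assumed constants the narrowed form is cheaper whenever the trailing undefs allow at least halving the width, which is the case where the mask value would matter to the cost.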

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57059/new/

https://reviews.llvm.org/D57059