[PATCH] D57059: [SLP] Initial support for the vectorization of the non-power-of-2 vectors.

Alexey Bataev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 20 11:18:38 PST 2020


ABataev added inline comments.


================
Comment at: llvm/test/Transforms/SLPVectorizer/X86/pr47623.ll:26
 ;
 ; AVX-LABEL: @foo(
+; AVX-NEXT:    [[TMP1:%.*]] = load i32, i32* getelementptr inbounds ([8 x i32], [8 x i32]* @b, i64 0, i64 0), align 16
----------------
craig.topper wrote:
> ABataev wrote:
> > xbolva00 wrote:
> > > Regression on avx?
> > Yes, this looks like an issue with the cost of `@llvm.masked.gather` when the mask contains some undefs.
> Gather is slow on CPUs prior to AVX512, and its cost is proportional to the number of elements. I don't think the value of the mask should be a factor.
True, but in some cases it can be optimized into `gather + shuffle` instead of a wide `gather` if there are undefs in the mask.
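
To make the trade-off concrete, here is a toy cost comparison in Python. It is only a sketch of the argument above, not the SLP vectorizer's or TTI's actual cost model: the constants `PER_LANE_GATHER_COST` and `SHUFFLE_COST` and all function names are hypothetical, and undef mask lanes are represented as `None`.

```python
# Toy model only: these constants are made up for illustration and are
# NOT the values LLVM's TargetTransformInfo reports for any target.
PER_LANE_GATHER_COST = 4  # hypothetical cost of gathering one lane
SHUFFLE_COST = 1          # hypothetical cost of one widening shuffle


def wide_gather_cost(mask):
    """Cost proportional to the full vector width, ignoring mask contents."""
    return PER_LANE_GATHER_COST * len(mask)


def narrow_gather_plus_shuffle_cost(mask):
    """Gather only the defined lanes, then shuffle them into the wide vector."""
    defined = sum(1 for lane in mask if lane is not None)
    return PER_LANE_GATHER_COST * defined + SHUFFLE_COST


def best_cost(mask):
    """Pick the cheaper of the two lowerings under this toy model."""
    return min(wide_gather_cost(mask), narrow_gather_plus_shuffle_cost(mask))


# 8-lane mask where the upper four lanes are undef (None).
half_undef = [True, True, True, True, None, None, None, None]
# 8-lane mask with every lane defined.
fully_defined = [True] * 8

print(wide_gather_cost(half_undef), narrow_gather_plus_shuffle_cost(half_undef))
print(best_cost(fully_defined))
```

Under these assumed numbers the narrower gather wins only when enough lanes are undef. With a fully defined mask the extra shuffle makes it strictly worse, so the cost stays proportional to the element count.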


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57059/new/

https://reviews.llvm.org/D57059


