[PATCH] D111174: [X86][Costmodel] Improve cost modelling for not-fully-interleaved load

Tue Oct 5 12:30:36 PDT 2021

lebedev.ri added inline comments.

================
Comment at: llvm/test/Transforms/LoopVectorize/X86/pr48340.ll:12
-; CHECK: vector.body:
-; CHECK:         [[WIDE_MASKED_GATHER0:%.*]] = call <4 x %0*> @llvm.masked.gather.v4p0s_s.v4p0p0s_s.0(<4 x %0**> [[TMP5:%.*]], i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x %0*> undef)
-; CHECK-NEXT:    [[WIDE_MASKED_GATHER1:%.*]] = call <4 x %0*> @llvm.masked.gather.v4p0s_s.v4p0p0s_s.0(<4 x %0**> [[TMP6:%.*]], i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x %0*> undef)
----------------
jeroen.dobbelaere wrote:
> lebedev.ri wrote:
> > @jeroen.dobbelaere this test broke. Any suggestions for how it can be made less fragile? :)
> Any idea on how to convince (force) loop-vectorize to do the vectorization?
Oh, it did vectorize all right; with the new costs it just chose an interleaved load instead of a gather, so the `llvm.masked.gather` CHECK lines no longer match.
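
For anyone skimming the thread, here is a rough model of the two lowerings being compared. It is plain Python, not LLVM code; the stride of 2 and the helper names are assumptions picked for illustration, and the real access pattern in pr48340.ll may differ. Only the vector factor of 4 is taken from the `<4 x ...>` types in the CHECK lines.

```python
# Illustrative model only: two ways to vectorize a strided read.
STRIDE = 2  # assumed stride, chosen for illustration
VF = 4      # vector factor, matching the <4 x ...> types in the CHECK lines


def load_via_gather(memory, base):
    """Compute one address per lane and load each lane independently,
    which is the shape of an llvm.masked.gather with an all-true mask."""
    addresses = [base + lane * STRIDE for lane in range(VF)]
    return [memory[address] for address in addresses]


def load_via_interleaved(memory, base):
    """Do one contiguous wide load covering VF * STRIDE elements, then
    keep every STRIDE-th element (the de-interleaving shuffle)."""
    wide = memory[base:base + VF * STRIDE]
    return wide[::STRIDE]


# Hypothetical memory contents used only to compare the two strategies.
memory = list(range(100, 116))
```

Both helpers return the same lanes, so either lowering is a valid vectorization of the loop. Which one the vectorizer picks depends only on the cost model, which is why a test that matches the gather calls literally is sensitive to cost changes like this one.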

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111174/new/

https://reviews.llvm.org/D111174