[PATCH] D111220: [X86][LV][TTi][Costmodel] LoopVectorizer: don't use `TTI::isLegalMaskedGather()` hook, introduce `TTI::shouldUseMaskedGatherForVectorization()`

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 6 04:56:34 PDT 2021


lebedev.ri created this revision.
lebedev.ri added reviewers: RKSimon, craig.topper, fhahn, efriedma, spatel.
lebedev.ri added a project: LLVM.
Herald added subscribers: pengfei, arphaman, hiraditya.
lebedev.ri requested review of this revision.

On X86, the gather/scatter story is sad. Native support appeared only in AVX2,
but even then, only on Skylake and newer is their performance not abysmal.
Even on Zen3 it's rather bad. So X86 says that masked gather/scatter
are not legal (except for `+avx512 || +fast-gather`),
and the `ScalarizeMaskedMemIntrin` pass expands them.

But at the same time, we can model the cost of the expanded form
of gather/scatter, via `X86TTIImpl::getGatherScatterOpCost()`,
and most often it's better than the LV's "scalarization" cost,
but since we say the gather is illegal, LV does not even query its cost.
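
To illustrate the problem, here is a minimal toy sketch of the decision, not the
actual LV code. `MemOpCosts`, `Widening` and `chooseWidening()` are hypothetical
names invented for this example, and any cost values passed in are made up.
The point is only that when the gate is false, the cheaper option is never considered.

```cpp
#include <cassert>

// Hypothetical toy model; none of these names are LLVM APIs.
enum class Widening { Scalarize, GatherScatter };

struct MemOpCosts {
  unsigned Scalarization;         // cost of LV's own scalarized form
  unsigned ExpandedGatherScatter; // cost of the expanded gather/scatter form
};

// Pick the cheaper strategy, but only look at the gather/scatter cost
// when the gate (today: legality) allows it.
inline Widening chooseWidening(const MemOpCosts &C, bool GatherScatterAllowed) {
  if (GatherScatterAllowed && C.ExpandedGatherScatter < C.Scalarization)
    return Widening::GatherScatter;
  return Widening::Scalarize;
}
```

With the gate tied to legality, `chooseWidening()` returns `Widening::Scalarize`
on most X86 targets even if the expanded gather/scatter form would be cheaper.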

I think this is not optimal. I propose to add a new TTI hook,
`shouldUseMaskedGatherForVectorization()`, which defaults to `isLegalMaskedGather()`,
but is overridden on X86 to unconditionally return true iff no variable mask is needed
(i.e. the gather/scatter sequence will not require branching).
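
Roughly, the shape of the hook could be sketched as follows. This is a toy
stand-in, not the code from the patch: `ToyTTI` and `ToyX86TTI` are hypothetical
classes, the `NeedsVariableMask` parameter is an assumed signature, and the
fallback to `isLegalMaskedGather()` when a variable mask is needed is an
assumption made for the example.

```cpp
#include <cassert>

// Toy stand-ins, not the real llvm::TargetTransformInfo classes.
struct ToyTTI {
  virtual ~ToyTTI() = default;

  // Targets report legality; the toy default is "not legal".
  virtual bool isLegalMaskedGather() const { return false; }

  // Proposed hook: by default it simply defers to legality.
  // The parameter is a hypothetical signature for this sketch.
  virtual bool
  shouldUseMaskedGatherForVectorization(bool NeedsVariableMask) const {
    (void)NeedsVariableMask;
    return isLegalMaskedGather();
  }
};

struct ToyX86TTI : ToyTTI {
  bool HasAVX512 = false;
  bool HasFastGather = false;

  bool isLegalMaskedGather() const override {
    return HasAVX512 || HasFastGather;
  }

  bool
  shouldUseMaskedGatherForVectorization(bool NeedsVariableMask) const override {
    // No variable mask: the expanded sequence needs no branching,
    // so let the vectorizer query its cost.
    if (!NeedsVariableMask)
      return true;
    // Assumption for this sketch: otherwise fall back to legality.
    return isLegalMaskedGather();
  }
};
```

The vectorizer would then call the new hook instead of `isLegalMaskedGather()`
when deciding whether to query the gather/scatter cost at all.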

I've updated the affected tests (other than `Analysis/CostModel/X86/interleaved-*`,
those are not going to be fun.)

If this makes sense, I can follow up with an SLP patch.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D111220

Files:
  llvm/include/llvm/Analysis/TargetTransformInfo.h
  llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
  llvm/lib/Analysis/TargetTransformInfo.cpp
  llvm/lib/Target/X86/X86TargetTransformInfo.cpp
  llvm/lib/Target/X86/X86TargetTransformInfo.h
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Analysis/CostModel/X86/gather-i16-with-i8-index.ll
  llvm/test/Analysis/CostModel/X86/gather-i32-with-i8-index.ll
  llvm/test/Analysis/CostModel/X86/gather-i64-with-i8-index.ll
  llvm/test/Analysis/CostModel/X86/gather-i8-with-i8-index.ll
  llvm/test/Analysis/CostModel/X86/scatter-i16-with-i8-index.ll
  llvm/test/Analysis/CostModel/X86/scatter-i32-with-i8-index.ll
  llvm/test/Analysis/CostModel/X86/scatter-i64-with-i8-index.ll
  llvm/test/Analysis/CostModel/X86/scatter-i8-with-i8-index.ll
  llvm/test/Transforms/LoopVectorize/X86/gather-cost.ll
  llvm/test/Transforms/LoopVectorize/X86/interleaving.ll
  llvm/test/Transforms/LoopVectorize/X86/load-deref-pred.ll
  llvm/test/Transforms/LoopVectorize/X86/parallel-loops.ll
  llvm/test/Transforms/LoopVectorize/X86/strided_load_cost.ll
  llvm/test/Transforms/LoopVectorize/X86/uniform_mem_op.ll
  llvm/test/Transforms/LoopVectorize/X86/vector_ptr_load_store.ll
  llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll
  llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll
  llvm/test/Transforms/LoopVectorize/X86/x86_fp80-vector-store.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D111220.377501.patch
Type: text/x-patch
Size: 368573 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20211006/8192053d/attachment-0001.bin>

