[PATCH] D111220: [X86][LV][TTi][Costmodel] LoopVectorizer: don't use `TTI::isLegalMaskedGather()` hook, introduce `TTI::shouldUseMaskedGatherForVectorization()`
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 6 04:56:34 PDT 2021
lebedev.ri created this revision.
lebedev.ri added reviewers: RKSimon, craig.topper, fhahn, efriedma, spatel.
lebedev.ri added a project: LLVM.
Herald added subscribers: pengfei, arphaman, hiraditya.
lebedev.ri requested review of this revision.
On X86, the gather/scatter story is sad. Native support appeared only in AVX2,
and even then, only on Skylake and newer is their performance not abysmal.
Even on Zen3 it's rather bad. So X86 says that masked gather/scatter
are not legal (except for `+avx512 || +fast-gather`),
and the `ScalarizeMaskedMemIntrin` pass expands them.
But at the same time, we can model the cost of the expanded form
of gather/scatter via `X86TTIImpl::getGatherScatterOpCost()`,
and most often it's better than LV's "scalarization" cost,
but since we say the gather is illegal, LV does not even query its cost.
I think this is not optimal. I propose to add a new TTI hook,
`shouldUseMaskedGatherForVectorization()`, which defaults to `isLegalMaskedGather()`,
but is overridden on X86 to unconditionally return true iff no variable mask is needed
(i.e. the gather/scatter sequence will not require branching).
I've updated the affected tests (other than `Analysis/CostModel/X86/interleaved-*`;
those are not going to be fun).
If this makes sense, I can follow up with an SLP patch.
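To make the proposed hook concrete, here is a minimal, standalone sketch of the idea. It is not the code from the patch: `BaseTTI`, `X86LikeTTI`, `SubtargetFeatures`, `WideningDecision` and `chooseWidening()` are hypothetical stand-ins, the real TTI hooks take type and alignment arguments that are omitted here, and the fallback to the legality check when a variable mask is needed is an assumption.

```cpp
#include <cassert>

// Hypothetical, heavily simplified subtarget description.
struct SubtargetFeatures {
  bool HasAVX512 = false;
  bool HasFastGather = false;
};

// Stand-in for the generic TTI interface (not the real LLVM class).
class BaseTTI {
public:
  virtual ~BaseTTI() = default;

  // Legality as seen by codegen; illegal gathers get expanded later.
  virtual bool isLegalMaskedGather() const { return false; }

  // Proposed hook: by default it simply mirrors the legality query.
  virtual bool shouldUseMaskedGatherForVectorization(bool VariableMask) const {
    (void)VariableMask;
    return isLegalMaskedGather();
  }
};

// Stand-in for the X86 override described above.
class X86LikeTTI : public BaseTTI {
  SubtargetFeatures Features;

public:
  explicit X86LikeTTI(SubtargetFeatures F) : Features(F) {}

  bool isLegalMaskedGather() const override {
    return Features.HasAVX512 || Features.HasFastGather;
  }

  bool shouldUseMaskedGatherForVectorization(bool VariableMask) const override {
    // No variable mask: the expanded sequence needs no branching,
    // so let the vectorizer cost it even if the gather is not legal.
    if (!VariableMask)
      return true;
    // Assumption: with a variable mask, fall back to plain legality.
    return isLegalMaskedGather();
  }
};

enum class WideningDecision { Scalarize, GatherScatter };

// Sketch of the vectorizer side: only compare costs when the hook allows it.
// The cost values are supplied by the caller and are purely illustrative.
WideningDecision chooseWidening(const BaseTTI &TTI, bool VariableMask,
                                unsigned GatherScatterCost,
                                unsigned ScalarizationCost) {
  if (!TTI.shouldUseMaskedGatherForVectorization(VariableMask))
    return WideningDecision::Scalarize;
  return GatherScatterCost < ScalarizationCost
             ? WideningDecision::GatherScatter
             : WideningDecision::Scalarize;
}

// Small sanity check of the behaviour sketched above.
void sanityCheck() {
  X86LikeTTI Avx2Only{SubtargetFeatures{}};
  assert(!Avx2Only.isLegalMaskedGather());
  assert(Avx2Only.shouldUseMaskedGatherForVectorization(/*VariableMask=*/false));
}
```

With this shape, a target without legal gathers still gets its gather/scatter cost compared against the scalarization cost, as long as no variable mask is involved; targets that do not override the hook keep the existing behaviour.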
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D111220
Files:
llvm/include/llvm/Analysis/TargetTransformInfo.h
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
llvm/lib/Analysis/TargetTransformInfo.cpp
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
llvm/lib/Target/X86/X86TargetTransformInfo.h
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
llvm/test/Analysis/CostModel/X86/gather-i16-with-i8-index.ll
llvm/test/Analysis/CostModel/X86/gather-i32-with-i8-index.ll
llvm/test/Analysis/CostModel/X86/gather-i64-with-i8-index.ll
llvm/test/Analysis/CostModel/X86/gather-i8-with-i8-index.ll
llvm/test/Analysis/CostModel/X86/scatter-i16-with-i8-index.ll
llvm/test/Analysis/CostModel/X86/scatter-i32-with-i8-index.ll
llvm/test/Analysis/CostModel/X86/scatter-i64-with-i8-index.ll
llvm/test/Analysis/CostModel/X86/scatter-i8-with-i8-index.ll
llvm/test/Transforms/LoopVectorize/X86/gather-cost.ll
llvm/test/Transforms/LoopVectorize/X86/interleaving.ll
llvm/test/Transforms/LoopVectorize/X86/load-deref-pred.ll
llvm/test/Transforms/LoopVectorize/X86/parallel-loops.ll
llvm/test/Transforms/LoopVectorize/X86/strided_load_cost.ll
llvm/test/Transforms/LoopVectorize/X86/uniform_mem_op.ll
llvm/test/Transforms/LoopVectorize/X86/vector_ptr_load_store.ll
llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll
llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll
llvm/test/Transforms/LoopVectorize/X86/x86_fp80-vector-store.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D111220.377501.patch
Type: text/x-patch
Size: 368573 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20211006/8192053d/attachment-0001.bin>