[all-commits] [llvm/llvm-project] 8cd782: [X86][LoopVectorize] "Fix" `X86TTIImpl::getAddress...
Roman Lebedev via All-commits
all-commits at lists.llvm.org
Mon Nov 29 23:48:19 PST 2021
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 8cd782487fe68082e57d24a576b77f529d77f96c
https://github.com/llvm/llvm-project/commit/8cd782487fe68082e57d24a576b77f529d77f96c
Author: Roman Lebedev <lebedev.ri at gmail.com>
Date: 2021-11-30 (Tue, 30 Nov 2021)
Changed paths:
M llvm/lib/Target/X86/X86TargetTransformInfo.cpp
M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
M llvm/test/Analysis/CostModel/X86/gather-i16-with-i8-index.ll
M llvm/test/Analysis/CostModel/X86/gather-i32-with-i8-index.ll
M llvm/test/Analysis/CostModel/X86/gather-i64-with-i8-index.ll
M llvm/test/Analysis/CostModel/X86/gather-i8-with-i8-index.ll
M llvm/test/Analysis/CostModel/X86/interleaved-load-i16-stride-5.ll
M llvm/test/Analysis/CostModel/X86/masked-interleaved-load-i16.ll
M llvm/test/Analysis/CostModel/X86/masked-interleaved-store-i16.ll
M llvm/test/Analysis/CostModel/X86/masked-scatter-i32-with-i8-index.ll
M llvm/test/Analysis/CostModel/X86/masked-scatter-i64-with-i8-index.ll
M llvm/test/Analysis/CostModel/X86/scatter-i16-with-i8-index.ll
M llvm/test/Analysis/CostModel/X86/scatter-i32-with-i8-index.ll
M llvm/test/Analysis/CostModel/X86/scatter-i64-with-i8-index.ll
M llvm/test/Analysis/CostModel/X86/scatter-i8-with-i8-index.ll
M llvm/test/Transforms/LoopVectorize/X86/gather_scatter.ll
M llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll
Log Message:
-----------
[X86][LoopVectorize] "Fix" `X86TTIImpl::getAddressComputationCost()`
We ask `TTI.getAddressComputationCost()` for the cost of computing a vector address,
and then multiply it by the vector width. This doesn't make sense:
it implies that we'd do a vector GEP and then scalarize the vector of pointers,
but there is no such thing in the vectorized IR; we perform scalar GEPs.
This is *especially* bad on X86, where it effectively prohibited any scalarized
vectorization of gathers/scatters, because `X86TTIImpl::getAddressComputationCost()`
says that the cost of a vector address computation is `10`, compared to `1` for a scalar one.
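
As a rough illustration of the arithmetic described above, here is a toy model in C++. It is not the LLVM code: `addressComputationCost`, `oldScalarizedAddressCost` and `newScalarizedAddressCost` are hypothetical names, the `10` and `1` are the costs quoted in this message, and it assumes the fix amounts to querying the scalar address cost once per lane.

```cpp
// Toy stand-in for TTI.getAddressComputationCost() on X86 (hypothetical):
// 10 for a vector pointer type, 1 for a scalar one, as quoted in the log.
constexpr unsigned addressComputationCost(bool IsVectorTy) {
  return IsVectorTy ? 10u : 1u;
}

// Old behaviour as described: ask for the *vector* address cost,
// then multiply it by the vector width (VF).
constexpr unsigned oldScalarizedAddressCost(unsigned VF) {
  return VF * addressComputationCost(/*IsVectorTy=*/true);
}

// Assumed new behaviour: the vectorized IR contains VF scalar GEPs,
// so each lane is charged the *scalar* address cost.
constexpr unsigned newScalarizedAddressCost(unsigned VF) {
  return VF * addressComputationCost(/*IsVectorTy=*/false);
}

// Compile-time check of the arithmetic for VF = 8.
static_assert(oldScalarizedAddressCost(8) == 80, "8 lanes * 10");
static_assert(newScalarizedAddressCost(8) == 8, "8 lanes * 1");
```

Under these assumptions the address part of a scalarized gather/scatter at VF = 8 drops from 80 to 8, which is the kind of gap that would make the scalarized form look unprofitable.
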
The computed costs are similar to the ones with D111222+D111220,
but we end up without masked memory intrinsics that we'd then have to
expand later on, without much luck (D111363).
Differential Revision: https://reviews.llvm.org/D111460