[all-commits] [llvm/llvm-project] c66620: [X86][Costmodel] getMaskedMemoryOpCost(): don't sc...
Roman Lebedev via All-commits
all-commits at lists.llvm.org
Mon May 24 10:10:18 PDT 2021
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: c666208f63802cd104db2689cd72eb7a86e64a06
https://github.com/llvm/llvm-project/commit/c666208f63802cd104db2689cd72eb7a86e64a06
Author: Roman Lebedev <lebedev.ri at gmail.com>
Date: 2021-05-24 (Mon, 24 May 2021)
Changed paths:
M llvm/lib/Target/X86/X86TargetTransformInfo.cpp
M llvm/test/Analysis/CostModel/X86/masked-intrinsic-cost.ll
Log Message:
-----------
[X86][Costmodel] getMaskedMemoryOpCost(): don't scalarize non-power-of-two vectors with legal element type
This follows in the footsteps of the similar `getMemoryOpCost()` changes, D100099/D100684.
Intel SDM, `VPMASKMOV — Conditional SIMD Integer Packed Loads and Stores`:
```
Faults occur only due to mask-bit required memory accesses that caused the faults. Faults will not occur due to
referencing any memory location if the corresponding mask bit for that memory location is 0. For example, no
faults will be detected if the mask bits are all zero.
```
I.e., if the mask is all-zeros, any address is fine.
The prime use case for masked load/store is e.g. tail-masking the loop remainder,
where in the last iteration only the first few elements of a vector exist.
Similarly, I don't see why we must scalarize non-power-of-two vectors,
as long as the element type is one we can masked-load/store.
We simply need to legalize the type, widen the mask, and be done with it.
And we already account for the cost of widening the mask.
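To illustrate the argument, here is a minimal scalar sketch (not the LLVM implementation) of why widening is safe: a 3-element masked load is performed as a 4-lane masked load whose extra mask lane is zero, so the out-of-bounds lane is never touched. The helper names `maskedLoad`, `widenMask3To4`, and `loadV3ViaV4` are hypothetical, and the zero fill for disabled lanes is an assumption chosen to mirror the load form of `VPMASKMOV`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar emulation of a masked vector load.
// A lane is only dereferenced when its mask bit is set; disabled
// lanes are filled with zero (assumption: mirrors VPMASKMOV loads).
template <std::size_t Lanes>
std::array<int, Lanes> maskedLoad(const int *base,
                                  const std::array<bool, Lanes> &mask) {
  assert(base != nullptr);
  std::array<int, Lanes> out{};
  for (std::size_t i = 0; i < Lanes; ++i)
    out[i] = mask[i] ? base[i] : 0; // base[i] is not read when mask[i] is false
  return out;
}

// Widen a 3-lane mask to the legal 4-lane width.
// The padding lane is always disabled.
std::array<bool, 4> widenMask3To4(const std::array<bool, 3> &mask) {
  std::array<bool, 4> wide{{mask[0], mask[1], mask[2], false}};
  return wide;
}

// Load a <3 x i32>-like value through a 4-lane masked load.
// Lane 3 lies past the end of a 3-element buffer, but its mask bit
// is zero, so no access to it happens.
std::array<int, 3> loadV3ViaV4(const int *base,
                               const std::array<bool, 3> &mask) {
  const std::array<int, 4> wide = maskedLoad<4>(base, widenMask3To4(mask));
  std::array<int, 3> narrow{{wide[0], wide[1], wide[2]}};
  return narrow;
}
```

Calling `loadV3ViaV4` on a 3-element buffer reads at most those three elements, whatever the original mask is. In this sketch the only extra work over a native 4-lane masked load is building the wider mask.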
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D102990