[all-commits] [llvm/llvm-project] 70995a: [ScalarizeMaskedMemIntr] Optimize splat non-constant masks (#104537)
Krzysztof Drewniak via All-commits
all-commits at lists.llvm.org
Fri Aug 16 14:24:46 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 70995a1a3379ed3c21b1c5da6723f04166cb0ae6
https://github.com/llvm/llvm-project/commit/70995a1a3379ed3c21b1c5da6723f04166cb0ae6
Author: Krzysztof Drewniak <Krzysztof.Drewniak at amd.com>
Date: 2024-08-16 (Fri, 16 Aug 2024)
Changed paths:
M llvm/lib/Transforms/Scalar/ScalarizeMaskedMemIntrin.cpp
M llvm/test/CodeGen/X86/bfloat.ll
M llvm/test/CodeGen/X86/shuffle-half.ll
M llvm/test/Transforms/ScalarizeMaskedMemIntrin/X86/expand-masked-load.ll
M llvm/test/Transforms/ScalarizeMaskedMemIntrin/X86/expand-masked-store.ll
Log Message:
-----------
[ScalarizeMaskedMemIntr] Optimize splat non-constant masks (#104537)
In cases (like the ones added in the tests) where the condition of a
masked load or store is a splat but not a constant (that is, a masked
operation is being used to implement patterns like "load if the current
lane is in-bounds, otherwise return 0"), optimize the 'scalarized' code
to perform an aligned vector load/store if the splatted condition is
true. Additionally, while here, take a few steps to preserve aliasing
information and value names when nothing is scalarized.
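For illustration, here is a minimal sketch of the transform on a masked
load. The function name, element type, vector width, and alignment are
invented for this example and are not taken from the commit's tests.
Before, the mask is a splat of the non-constant i1 %cond:

  define <4 x i32> @splat_mask_load(ptr %p, i1 %cond) {
    ; splat %cond across all four mask lanes
    %m0 = insertelement <4 x i1> poison, i1 %cond, i64 0
    %mask = shufflevector <4 x i1> %m0, <4 x i1> poison, <4 x i32> zeroinitializer
    %v = call <4 x i32> @llvm.masked.load.v4i32.p0(ptr %p, i32 16, <4 x i1> %mask, <4 x i32> zeroinitializer)
    ret <4 x i32> %v
  }

Rather than emitting one branch and scalar load per lane, the pass can
now branch once on %cond and keep the access as a single aligned vector
load, falling back to the passthrough value when the condition is false:

  define <4 x i32> @splat_mask_load(ptr %p, i1 %cond) {
  entry:
    br i1 %cond, label %load, label %join
  load:
    ; one whole-vector load instead of per-lane scalar loads
    %v = load <4 x i32>, ptr %p, align 16
    br label %join
  join:
    %res = phi <4 x i32> [ %v, %load ], [ zeroinitializer, %entry ]
    ret <4 x i32> %res
  }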
As motivation, some frontends that emit LLVM IR will generate masked
loads/stores in cases that map to this kind of predicated operation
(where either the whole vector is loaded/stored or it isn't) in order to
take advantage of hardware primitives. However, on AMDGPU, which has no
masked load or store instructions, this pass would scalarize a load or
store that was intended to be - and can be - vectorized, while also
introducing expensive branches.
Fixes #104520
Pre-commit tests at #104527