[all-commits] [llvm/llvm-project] 25d976: [ScalarizeMaskedMemIntr] Don't use a scalar mask o...
Krzysztof Drewniak via All-commits
all-commits at lists.llvm.org
Thu Aug 22 17:03:06 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 25d976b45cb5b3d222d3a9cd94caa8a54031bbb7
https://github.com/llvm/llvm-project/commit/25d976b45cb5b3d222d3a9cd94caa8a54031bbb7
Author: Krzysztof Drewniak <Krzysztof.Drewniak at amd.com>
Date: 2024-08-22 (Thu, 22 Aug 2024)
Changed paths:
M llvm/lib/Transforms/Scalar/ScalarizeMaskedMemIntrin.cpp
M llvm/test/Transforms/ScalarizeMaskedMemIntrin/AMDGPU/expamd-masked-load.ll
M llvm/test/Transforms/ScalarizeMaskedMemIntrin/AMDGPU/expand-masked-gather.ll
M llvm/test/Transforms/ScalarizeMaskedMemIntrin/AMDGPU/expand-masked-scatter.ll
M llvm/test/Transforms/ScalarizeMaskedMemIntrin/AMDGPU/expand-masked-store.ll
Log Message:
-----------
[ScalarizeMaskedMemIntr] Don't use a scalar mask on GPUs (#104842)
ScalarizedMaskedMemIntr contains an optimization where the <N x i1> mask
is bitcast into an iN and then bit-tests with powers of two are used to
determine whether to load/store/... or not.
However, on machines with branch divergence (mainly GPUs), this is a
mis-optimization, since each i1 in the mask will be stored in a
condition register - that is, ecah of these "i1"s is likely to be a word
or two wide, making these bit operations counterproductive.
Therefore, amend this pass to skip the optimizaiton on targets that it
pessimizes.
Pre-commit tests #104645
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list