[all-commits] [llvm/llvm-project] a1f5fe: [NVPTX] Optimize v2x16 BUILD_VECTORs to PRMT (#116...
Fraser Cormack via All-commits
all-commits at lists.llvm.org
Tue Dec 17 02:22:41 PST 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: a1f5fe8c851ba6a0070e4cab9e7436e962677ac6
https://github.com/llvm/llvm-project/commit/a1f5fe8c851ba6a0070e4cab9e7436e962677ac6
Author: Fraser Cormack <fraser at codeplay.com>
Date: 2024-12-17 (Tue, 17 Dec 2024)
Changed paths:
M llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
M llvm/test/CodeGen/NVPTX/bf16-instructions.ll
M llvm/test/CodeGen/NVPTX/fma-relu-contract.ll
M llvm/test/CodeGen/NVPTX/fma-relu-fma-intrinsic.ll
M llvm/test/CodeGen/NVPTX/fma-relu-instruction-flag.ll
M llvm/test/CodeGen/NVPTX/i16x2-instructions.ll
Log Message:
-----------
[NVPTX] Optimize v2x16 BUILD_VECTORs to PRMT (#116675)
When two 16-bit values are combined into a v2x16 vector, and those
values are truncated from 32-bit values, a PRMT instruction can
save registers by selecting bytes directly from the original 32-bit
values. We do this during a post-legalize DAG combine, as these
opportunities are typically only exposed after the BUILD_VECTOR's
operands have been legalized.
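As a rough illustration of the idea above, here is a host-side C++ sketch that models PRMT in its default (generic) mode and compares it with the truncate-and-pack sequence it would replace. The function names emulate_prmt and pack_truncated are hypothetical, and the selector 0x5410 is derived here from the PTX byte-numbering convention (bytes 0-3 from the first source, bytes 4-7 from the second); it is not copied from the patch. The sign-replication bit of each selector nibble is ignored in this model.

```cpp
#include <cstdint>

// Hypothetical software model of PTX "prmt.b32 d, a, b, sel" in default mode.
// Bytes 0-3 of the pool come from a, bytes 4-7 from b. Each selector nibble
// picks one pool byte for the corresponding result byte. The nibble's top bit
// (sign replication) is not modeled here.
uint32_t emulate_prmt(uint32_t a, uint32_t b, uint32_t sel) {
    uint64_t pool = (static_cast<uint64_t>(b) << 32) | a;
    uint32_t result = 0;
    for (int i = 0; i < 4; ++i) {
        uint32_t idx = (sel >> (4 * i)) & 0x7;          // which pool byte
        uint32_t byte = static_cast<uint32_t>((pool >> (8 * idx)) & 0xFF);
        result |= byte << (8 * i);                      // place into result byte i
    }
    return result;
}

// What BUILD_VECTOR(trunc(a), trunc(b)) computes when viewed as a 32-bit
// value: element 0 in the low half, element 1 in the high half.
uint32_t pack_truncated(uint32_t a, uint32_t b) {
    uint32_t lo = static_cast<uint16_t>(a);
    uint32_t hi = static_cast<uint16_t>(b);
    return lo | (hi << 16);
}
```

Under these assumptions, emulate_prmt(a, b, 0x5410) yields the same value as pack_truncated(a, b): the single permute reads the low two bytes of each 32-bit source directly, so the two intermediate truncated values never need their own registers.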
Additionally, if the 32-bit values are right-shifted, we can fold in the
shift by selecting higher bytes with PRMT. Only logical right-shifts by
16 are supported (for now) since those are the only situations seen in
practice. Right shifts by 16 often come up during the legalization of
EXTRACT_VECTOR_ELT.
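The shift folding described above can be sketched the same way. In this host-side C++ model, a logical right shift by 16 on either source is absorbed by pointing the selector at that source's upper two bytes instead of its lower two. The helper names selector_for and prmt_model are hypothetical, and the selector values are derived from the default-mode PRMT byte numbering rather than taken from the patch.

```cpp
#include <cstdint>

// Hypothetical helper: build a PRMT selector for packing two 16-bit halves.
// Each flag says whether that operand was logically right-shifted by 16
// before truncation, in which case the upper two bytes are selected.
uint32_t selector_for(bool lo_shifted, bool hi_shifted) {
    uint32_t lo_sel = lo_shifted ? 0x32u : 0x10u;  // bytes 2,3 or 0,1 of source a
    uint32_t hi_sel = hi_shifted ? 0x76u : 0x54u;  // bytes 6,7 or 4,5 (source b)
    return (hi_sel << 8) | lo_sel;
}

// Minimal model of default-mode prmt.b32 (sign-replication bit not modeled).
uint32_t prmt_model(uint32_t a, uint32_t b, uint32_t sel) {
    uint64_t pool = (static_cast<uint64_t>(b) << 32) | a;
    uint32_t result = 0;
    for (int i = 0; i < 4; ++i) {
        uint32_t idx = (sel >> (4 * i)) & 0x7;
        result |= static_cast<uint32_t>((pool >> (8 * idx)) & 0xFF) << (8 * i);
    }
    return result;
}

// Reference: BUILD_VECTOR(trunc(a >> sa), trunc(b >> sb)) as a 32-bit value,
// where sa and sb are each either 0 or 16.
uint32_t pack_reference(uint32_t a, bool a_shifted, uint32_t b, bool b_shifted) {
    uint32_t lo = static_cast<uint16_t>(a_shifted ? (a >> 16) : a);
    uint32_t hi = static_cast<uint16_t>(b_shifted ? (b >> 16) : b);
    return lo | (hi << 16);
}
```

In this model the four shifted/unshifted combinations map to the selectors 0x5410, 0x5432, 0x7610, and 0x7632, and in each case one permute replaces the shift, truncate, and pack sequence.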
This idea was brought up in a PR comment by @Artem-B.