[all-commits] [llvm/llvm-project] ba9bd2: [AMDGPU] Account for existing SDWA selections (#1...
Frederik Harwath via All-commits
all-commits at lists.llvm.org
Mon Mar 3 08:07:51 PST 2025
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: ba9bd22e1b535a1669e3918fa77f6edaf6851d9a
https://github.com/llvm/llvm-project/commit/ba9bd22e1b535a1669e3918fa77f6edaf6851d9a
Author: Frederik Harwath <frederik.harwath at amd.com>
Date: 2025-03-03 (Mon, 03 Mar 2025)
Changed paths:
M llvm/lib/Target/AMDGPU/SIPeepholeSDWA.cpp
M llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/uaddsat.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/usubsat.ll
M llvm/test/CodeGen/AMDGPU/buffer-fat-pointer-atomicrmw-fadd.ll
M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fadd.ll
M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fsub.ll
M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd.ll
M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fsub.ll
M llvm/test/CodeGen/AMDGPU/idot4u.ll
M llvm/test/CodeGen/AMDGPU/local-atomicrmw-fadd.ll
M llvm/test/CodeGen/AMDGPU/local-atomicrmw-fsub.ll
M llvm/test/CodeGen/AMDGPU/permute_i8.ll
A llvm/test/CodeGen/AMDGPU/sdwa-peephole-instr-combine-sel-dst.mir
A llvm/test/CodeGen/AMDGPU/sdwa-peephole-instr-combine-sel-src.mir
M llvm/test/CodeGen/AMDGPU/sdwa-peephole-instr-combine-sel.ll
M llvm/test/CodeGen/AMDGPU/sdwa-peephole-instr-combine-sel.mir
M llvm/test/CodeGen/AMDGPU/sdwa-peephole-instr-gfx10.mir
M llvm/test/CodeGen/AMDGPU/sdwa-peephole-instr.mir
M llvm/test/CodeGen/AMDGPU/sdwa-preserve.mir
M llvm/test/CodeGen/AMDGPU/v_sat_pk_u8_i16.ll
Log Message:
-----------
[AMDGPU] Account for existing SDWA selections (#123221)
The si-peephole-sdwa pass adjusts the selections on sdwa instructions to
the selections on their operands during its conversions. For instance,
if an instruction selects `BYTE_0` and its operand selects `WORD_1`, the
combined selection should be `BYTE_2`, i.e. "`BYTE_0` of `WORD_1`". The
existing implementation does not always handle this correctly in some
complex situations with instructions across different basic blocks as
demonstrated by the test cases included in this PR.
This PR adds an additional selection combination step to the conversion
to fix this issue. It reverts the changes made by PR #123942 which had
disabled the conversion of preexisting SDWA instructions completely as a
quick fix.
---------
Co-authored-by: Jeffrey Byrnes <Jeffrey.Byrnes at amd.com>
Co-authored-by: Matt Arsenault <arsenm2 at gmail.com>
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list