[llvm] [AMDGPU] Extend wave reduce intrinsics for i32 type (PR #126469)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Mon May 5 06:37:42 PDT 2025
================
@@ -4955,13 +4977,78 @@ static MachineBasicBlock *lowerWaveReduce(MachineInstr &MI,
   Register DstReg = MI.getOperand(0).getReg();
   MachineBasicBlock *RetBB = nullptr;
   if (isSGPR) {
-    // These operations with a uniform value i.e. SGPR are idempotent.
-    // Reduced value will be same as given sgpr.
-    // clang-format off
-    BuildMI(BB, MI, DL, TII->get(AMDGPU::S_MOV_B32), DstReg)
-        .addReg(SrcReg);
-    // clang-format on
-    RetBB = &BB;
+    switch (Opc) {
+    case AMDGPU::S_MIN_U32:
+    case AMDGPU::S_MIN_I32:
+    case AMDGPU::S_MAX_U32:
+    case AMDGPU::S_MAX_I32:
+    case AMDGPU::S_AND_B32:
+    case AMDGPU::S_OR_B32: {
+      // Idempotent operations.
+      BuildMI(BB, MI, DL, TII->get(AMDGPU::S_MOV_B32), DstReg).addReg(SrcReg);
+      RetBB = &BB;
----------------
arsenm wrote:
I don't really follow what RetBB is for; isn't it always identical to BB?
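
For readers following the hunk above, here is a minimal standalone sketch (plain C++, not LLVM code) of why the SGPR case can lower min, max, and, and or to a single S_MOV_B32: folding a uniform value across the active lanes with any of these operations returns the value unchanged. The names waveReduce, uniformLanes, and the op* helpers are hypothetical, and the lanes are modeled as a vector of unsigned 32-bit values as a simplifying assumption; signed min/max behaves the same way on uniform input.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an LLVM API): fold one value per active lane
// into a single result using the binary operation Combine.
template <typename Op>
uint32_t waveReduce(const std::vector<uint32_t> &Lanes, Op Combine) {
  assert(!Lanes.empty() && "need at least one active lane");
  uint32_t Acc = Lanes.front();
  for (std::size_t I = 1; I < Lanes.size(); ++I)
    Acc = Combine(Acc, Lanes[I]);
  return Acc;
}

// Models a uniform (SGPR-like) input: every active lane holds the same value.
inline std::vector<uint32_t> uniformLanes(uint32_t V, std::size_t NumActive) {
  return std::vector<uint32_t>(NumActive, V);
}

// Idempotent operations: op(x, x) == x, so the fold returns the input.
inline uint32_t opMin(uint32_t A, uint32_t B) { return std::min(A, B); }
inline uint32_t opMax(uint32_t A, uint32_t B) { return std::max(A, B); }
inline uint32_t opAnd(uint32_t A, uint32_t B) { return A & B; }
inline uint32_t opOr(uint32_t A, uint32_t B) { return A | B; }

// Not idempotent: shown only for contrast, since the result scales with
// the number of active lanes.
inline uint32_t opAdd(uint32_t A, uint32_t B) { return A + B; }
```

With a uniform value of 7 across 32 active lanes, the min, max, and, and or folds all return 7, while the add fold returns 224, so an operation like addition cannot take the plain-copy path.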
https://github.com/llvm/llvm-project/pull/126469