[llvm-branch-commits] [llvm] [AMDGPU] DPP wave reduction for double types - 2 (PR #189391)
Juan Manuel Martinez Caamaño via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Wed Apr 15 01:13:17 PDT 2026
================
@@ -6590,17 +6592,17 @@ static MachineBasicBlock *lowerWaveReduce(MachineInstr &MI,
}
FinalDPPResult = RowBcast31;
}
- if (Opc == AMDGPU::V_SUB_F32_e64) {
- Register NegatedValVGPR =
- MRI.createVirtualRegister(&AMDGPU::VGPR_32RegClass);
- BuildMI(*CurrBB, MI, DL, TII->get(AMDGPU::V_SUB_F32_e64),
+ if (MIOpc == AMDGPU::WAVE_REDUCE_FSUB_PSEUDO_F32 ||
+ MIOpc == AMDGPU::WAVE_REDUCE_FSUB_PSEUDO_F64) {
+ Register NegatedValVGPR = MRI.createVirtualRegister(SrcRegClass);
+ BuildMI(*CurrBB, MI, DL, TII->get(Opc),
NegatedValVGPR)
- .addImm(SISrcMods::NONE) // src0 mods
- .addReg(IdentityVGPR) // src0
- .addImm(SISrcMods::NONE) // src1 mods
- .addReg(IsWave32 ? RowBcast15 : RowBcast31) // src1
- .addImm(SISrcMods::NONE) // clamp
- .addImm(SISrcMods::NONE); // omod
+ .addImm(SISrcMods::NONE) // src0 mods
+ .addReg(IdentityVGPR) // src0
+ .addImm(is32BitOpc ? SISrcMods::NONE : SISrcMods::NEG) // src1 mods
+ .addReg(IsWave32 ? RowBcast15 : RowBcast31) // src1
----------------
jmmartinez wrote:
Can you add a comment noting that for 64-bit fsub the opcode used is an add? I was puzzled as to why we were negating src1 in the 64-bit case.
https://github.com/llvm/llvm-project/pull/189391
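
To illustrate the point behind the request, here is a minimal standalone sketch of why an add with a negated src1 can stand in for a subtract. It is plain host C++, not LLVM code. The helper names (subDirect, addWithNegSrc1, bitsOf) are hypothetical and chosen only for this example. The assumption is that the 32-bit path uses a real subtract opcode while the 64-bit path uses an add opcode with SISrcMods::NEG on src1, as the diff and the comment above suggest.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical model of the 32-bit path: a real subtract, src0 - src1.
inline double subDirect(double src0, double src1) { return src0 - src1; }

// Hypothetical model of the 64-bit path: an add whose src1 carries a NEG
// source modifier, i.e. src0 + (-src1).
inline double addWithNegSrc1(double src0, double src1) {
  return src0 + (-src1);
}

// Returns the raw bit pattern so that +0.0 and -0.0 can be told apart.
inline std::uint64_t bitsOf(double v) {
  std::uint64_t u;
  std::memcpy(&u, &v, sizeof u);
  return u;
}
```

For non-NaN inputs the two helpers should return bit-identical results, signed zeros included, since IEEE 754 subtraction is defined as addition of the negated operand. That is why the NEG modifier appears only on the 64-bit path, and a short comment next to the modifier would spare readers the same puzzlement.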