[llvm-branch-commits] [llvm] [AMDGPU] DPP wave reduction for double types - 2 (PR #189391)

Juan Manuel Martinez Caamaño via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Wed Apr 15 01:13:17 PDT 2026


================
@@ -6590,17 +6592,17 @@ static MachineBasicBlock *lowerWaveReduce(MachineInstr &MI,
         }
         FinalDPPResult = RowBcast31;
       }
-      if (Opc == AMDGPU::V_SUB_F32_e64) {
-        Register NegatedValVGPR =
-            MRI.createVirtualRegister(&AMDGPU::VGPR_32RegClass);
-        BuildMI(*CurrBB, MI, DL, TII->get(AMDGPU::V_SUB_F32_e64),
+      if (MIOpc == AMDGPU::WAVE_REDUCE_FSUB_PSEUDO_F32 ||
+          MIOpc == AMDGPU::WAVE_REDUCE_FSUB_PSEUDO_F64) {
+        Register NegatedValVGPR = MRI.createVirtualRegister(SrcRegClass);
+        BuildMI(*CurrBB, MI, DL, TII->get(Opc),
                 NegatedValVGPR)
-            .addImm(SISrcMods::NONE)                    // src0 mods
-            .addReg(IdentityVGPR)                       // src0
-            .addImm(SISrcMods::NONE)                    // src1 mods
-            .addReg(IsWave32 ? RowBcast15 : RowBcast31) // src1
-            .addImm(SISrcMods::NONE)                    // clamp
-            .addImm(SISrcMods::NONE);                   // omod
+            .addImm(SISrcMods::NONE)                               // src0 mods
+            .addReg(IdentityVGPR)                                  // src0
+            .addImm(is32BitOpc ? SISrcMods::NONE : SISrcMods::NEG) // src1 mods
+            .addReg(IsWave32 ? RowBcast15 : RowBcast31)            // src1
----------------
jmmartinez wrote:

Can you add a comment noting that for 64-bit fsub the opcode used is an add? I was puzzled about why we negate src1 in the 64-bit case.
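
To illustrate the point for other readers, here is a minimal standalone sketch of why the NEG modifier is needed only in the 64-bit case. It assumes the 32-bit path uses a real subtract and the 64-bit path uses an add, as the diff suggests. The names `SrcMod`, `applyMod`, `subF32` and `addF64` are hypothetical stand-ins for illustration, not LLVM APIs.

```cpp
#include <cassert>

// Hypothetical stand-in for a source modifier (not the real SISrcMods enum).
enum class SrcMod { None, Neg };

// Apply a source modifier to an operand before the operation consumes it.
double applyMod(double V, SrcMod M) { return M == SrcMod::Neg ? -V : V; }

// 32-bit path: the opcode is a genuine subtract, so src1 needs no modifier.
float subF32(float Src0, float Src1) { return Src0 - Src1; }

// 64-bit path (assumed): the opcode is an add, so src1 must carry NEG
// for the result to equal Src0 - Src1.
double addF64(double Src0, double Src1, SrcMod Src1Mod) {
  return Src0 + applyMod(Src1, Src1Mod);
}

// Both paths compute Src0 - Src1 for the same inputs.
void checkSameResult(double Src0, double Src1) {
  assert(addF64(Src0, Src1, SrcMod::Neg) == Src0 - Src1);
}
```

A one-line comment next to the src1 modifier along the lines of "the 64-bit fsub is lowered with an add opcode, so src1 is negated" would probably be enough.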

https://github.com/llvm/llvm-project/pull/189391

