[llvm-branch-commits] [llvm] [AMDGPU] Extending wave reduction intrinsics for `i64` types - 3 (PR #151310)

via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Thu Jul 31 01:24:59 PDT 2025


================
@@ -5189,10 +5196,54 @@ static MachineBasicBlock *lowerWaveReduce(MachineInstr &MI,
                 .addReg(NewAccumulator->getOperand(0).getReg())
                 .addImm(1)
                 .setOperandDead(3); // Dead scc
-            BuildMI(BB, MI, DL, TII->get(AMDGPU::S_MUL_I32), DstReg)
-                .addReg(SrcReg)
-                .addReg(ParityRegister);
-            break;
+            if (is32BitOpc) {
+              BuildMI(BB, MI, DL, TII->get(AMDGPU::S_MUL_I32), DstReg)
+                  .addReg(SrcReg)
+                  .addReg(ParityRegister);
+              break;
+            } else {
+              Register DestSub0 =
+                  MRI.createVirtualRegister(&AMDGPU::SReg_32RegClass);
+              Register DestSub1 =
+                  MRI.createVirtualRegister(&AMDGPU::SReg_32RegClass);
+              Register Op1H_Op0L_Reg =
+                  MRI.createVirtualRegister(&AMDGPU::SReg_32RegClass);
+              Register CarryReg =
----------------
easyonaadit wrote:

Right, I can do away with the carry.
Re: `s_bfe_i32` : either ways I will be emitting 2 instrs. I can try this tho.

https://github.com/llvm/llvm-project/pull/151310


More information about the llvm-branch-commits mailing list