[llvm-branch-commits] [llvm] [AMDGPU] Support Wave Reduction for true-16 types - 1 (PR #194809)

Matt Arsenault via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Thu Apr 30 06:06:51 PDT 2026


================
@@ -6154,9 +6165,18 @@ static MachineBasicBlock *lowerWaveReduce(MachineInstr &MI,
       BuildMI(*ComputeLoop, I, DL, TII->get(SFFOpc), FF1Reg)
           .addReg(ActiveBitsReg);
       if (is32BitOpc || is16BitOpc) {
+        Register ReadLaneSrc = SrcReg;
+        if (useRealTrue16) {
+          // Copy the 16-bit src to a 32-bit vgpr for the v_readlane
+          Register SrcReg32 =
+              MRI.createVirtualRegister(&AMDGPU::VGPR_32RegClass);
----------------
arsenm wrote:

This can't be a straight copy; you need to widen to the 32-bit register with REG_SEQUENCE + IMPLICIT_DEF, or with INSERT_SUBREG.
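
A rough sketch of the REG_SEQUENCE + IMPLICIT_DEF form, reusing the names visible in the hunk (`ComputeLoop`, `I`, `DL`, `TII`, `MRI`, `SrcReg`, `SrcReg32`). `UndefHi` is a hypothetical name, and the sketch assumes `SrcReg` is a 16-bit VGPR at this point. It is a fragment meant to sit inside `lowerWaveReduce`, not a standalone program, and is untested:

```cpp
// Assumption: SrcReg holds the 16-bit source in a VGPR_16 register.
// The high half of the widened register is never read, so leave it undefined.
Register UndefHi = MRI.createVirtualRegister(&AMDGPU::VGPR_16RegClass);
BuildMI(*ComputeLoop, I, DL, TII->get(TargetOpcode::IMPLICIT_DEF), UndefHi);

// Assemble the 32-bit VGPR from {lo16 = SrcReg, hi16 = undef}
// instead of emitting a COPY between registers of different widths.
Register SrcReg32 = MRI.createVirtualRegister(&AMDGPU::VGPR_32RegClass);
BuildMI(*ComputeLoop, I, DL, TII->get(TargetOpcode::REG_SEQUENCE), SrcReg32)
    .addReg(SrcReg)
    .addImm(AMDGPU::lo16)
    .addReg(UndefHi)
    .addImm(AMDGPU::hi16);

// SrcReg32 can then be used as the v_readlane source.
Register ReadLaneSrc = SrcReg32;
```

The INSERT_SUBREG alternative would express the same widening by inserting `SrcReg` at `AMDGPU::lo16` of an IMPLICIT_DEF'd 32-bit register.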

https://github.com/llvm/llvm-project/pull/194809

