[llvm-branch-commits] [llvm] [AMDGPU] Support Wave Reduction for true-16 types - 1 (PR #194809)
Matt Arsenault via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Thu Apr 30 06:06:51 PDT 2026
================
@@ -6154,9 +6165,18 @@ static MachineBasicBlock *lowerWaveReduce(MachineInstr &MI,
       BuildMI(*ComputeLoop, I, DL, TII->get(SFFOpc), FF1Reg)
           .addReg(ActiveBitsReg);
       if (is32BitOpc || is16BitOpc) {
+        Register ReadLaneSrc = SrcReg;
+        if (useRealTrue16) {
+          // Copy the 16-bit src to a 32-bit vgpr for the v_readlane
+          Register SrcReg32 =
+              MRI.createVirtualRegister(&AMDGPU::VGPR_32RegClass);
----------------
arsenm wrote:
This can't be a straight copy. You need to widen the 16-bit source to the 32-bit register with REG_SEQUENCE + IMPLICIT_DEF, or with INSERT_SUBREG.
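
For illustration, the REG_SEQUENCE + IMPLICIT_DEF variant might look roughly like the fragment below. This is a sketch under assumptions, not the change made in the PR. It assumes the code sits inside lowerWaveReduce with SrcReg, MRI, TII, ComputeLoop, I and DL in scope as in the quoted hunk, that SrcReg is a 16-bit VGPR, and that AMDGPU::VGPR_16RegClass with the lo16/hi16 subregister indices is the appropriate choice here. UndefHi is a name introduced only for this example. The fragment is not standalone and builds only inside the AMDGPU backend.

```cpp
// Sketch only: widen a 16-bit VGPR to a 32-bit VGPR for v_readlane.
// Assumes the surrounding lowerWaveReduce context (SrcReg, MRI, TII,
// ComputeLoop, I, DL). UndefHi is a hypothetical name.
Register ReadLaneSrc = SrcReg;
if (useRealTrue16) {
  // The high half carries no meaningful data, so define it as undef.
  Register UndefHi = MRI.createVirtualRegister(&AMDGPU::VGPR_16RegClass);
  BuildMI(*ComputeLoop, I, DL, TII->get(TargetOpcode::IMPLICIT_DEF), UndefHi);

  // Build the 32-bit value: source in lo16, undef in hi16.
  Register SrcReg32 = MRI.createVirtualRegister(&AMDGPU::VGPR_32RegClass);
  BuildMI(*ComputeLoop, I, DL, TII->get(TargetOpcode::REG_SEQUENCE), SrcReg32)
      .addReg(SrcReg)
      .addImm(AMDGPU::lo16)
      .addReg(UndefHi)
      .addImm(AMDGPU::hi16);
  ReadLaneSrc = SrcReg32;
}
```

Unlike a plain COPY between a 16-bit and a 32-bit register class, this makes explicit which half of the wide register holds the value and that the other half is undefined.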
https://github.com/llvm/llvm-project/pull/194809