[PATCH] D32343: AMDGPU: Move v_readlane lane select from VGPR to SGPR
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 21 11:57:39 PDT 2017
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/SIInstrInfo.cpp:2639
+ // Special case: V_READLANE_B32 accepts only immediate or SGPR operands for
+ // lane select.
+ if (Opc == AMDGPU::V_READLANE_B32 && Src1.isReg() &&
----------------
Also mention the source is assumed to be uniform?
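To illustrate why uniformity matters, here is a minimal sketch, not code from the patch. It is a toy C++ model of the legalized sequence, and it reads "the source" as the lane select value held in the VGPR. All names (`VGPR`, `readFirstLane`, `readLane`, `readLaneLegalized`) and the 64-lane wave size are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy model only: a VGPR is one 32-bit value per lane of a 64-wide wave.
constexpr int kWaveSize = 64;
using VGPR = std::array<uint32_t, kWaveSize>;

// Models V_READFIRSTLANE_B32: return the value held by the lowest active
// lane. This sketch falls back to lane 0 when no lane is active.
uint32_t readFirstLane(const VGPR &V, uint64_t Exec) {
  for (int Lane = 0; Lane < kWaveSize; ++Lane)
    if (Exec & (uint64_t{1} << Lane))
      return V[Lane];
  return V[0];
}

// Models V_READLANE_B32 with a scalar (SGPR or immediate) lane select.
uint32_t readLane(const VGPR &Src, uint32_t LaneSel) {
  return Src[LaneSel % kWaveSize];
}

// Models the legalized form: the VGPR lane select is first copied to a
// scalar with readfirstlane, then used by readlane. The result matches
// every lane's own select only if the select is uniform across the wave.
uint32_t readLaneLegalized(const VGPR &Src, const VGPR &LaneSel,
                           uint64_t Exec) {
  return readLane(Src, readFirstLane(LaneSel, Exec));
}
```

In this model, a lane select that differs between lanes silently collapses to the first active lane's value, which is why stating the uniformity assumption in the comment seems worthwhile.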
================
Comment at: lib/Target/AMDGPU/SIInstrInfo.cpp:2642
+ RI.isVGPR(MRI, Src1.getReg())) {
+ unsigned Reg = MRI.createVirtualRegister(&AMDGPU::SReg_32RegClass);
+ DebugLoc DL = MI.getDebugLoc();
----------------
Use SReg_32_XM0 here.
================
Comment at: lib/Target/AMDGPU/SIInstrInfo.cpp:2643
+ unsigned Reg = MRI.createVirtualRegister(&AMDGPU::SReg_32RegClass);
+ DebugLoc DL = MI.getDebugLoc();
+ BuildMI(*MI.getParent(), MI, DL, get(AMDGPU::V_READFIRSTLANE_B32), Reg)
----------------
Make DL a const reference.
================
Comment at: test/CodeGen/AMDGPU/llvm.amdgcn.readlane.ll:26
+define amdgpu_kernel void @test_readlane_vregs(i32 addrspace(1)* %out, <2 x i32> addrspace(1)* %in) #1 {
+ %args = load <2 x i32>, <2 x i32> addrspace(1)* %in
+ %value = extractelement <2 x i32> %args, i32 0
----------------
Can you add a GEP on the workitem ID to ensure the scalar load optimization won't ever trigger on this?
https://reviews.llvm.org/D32343