[PATCH] D32343: AMDGPU: Move v_readlane lane select from VGPR to SGPR

Fri Apr 21 11:57:39 PDT 2017

arsenm added inline comments.

================
Comment at: lib/Target/AMDGPU/SIInstrInfo.cpp:2639
+  // Special case: V_READLANE_B32 accepts only immediate or SGPR operands for
+  // lane select.
+  if (Opc == AMDGPU::V_READLANE_B32 && Src1.isReg() &&
----------------
Can you also mention in the comment that the source (the lane select) is assumed to be uniform?
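
To illustrate why the uniformity assumption is worth stating, here is a toy model of the legalized sequence (V_READFIRSTLANE_B32 on the lane select, then V_READLANE_B32). It is only a sketch: the 64-lane wave size, the `VGPR` alias, and the function names are illustrative assumptions, not LLVM or hardware APIs.

```cpp
#include <array>
#include <cstdint>

// Toy wavefront model. The 64-lane width and every name below are
// illustrative assumptions, not part of LLVM or the ISA.
constexpr unsigned kWaveSize = 64;
using VGPR = std::array<uint32_t, kWaveSize>;

// Models v_readfirstlane_b32: the value held by the lowest active lane
// (falls back to lane 0 when no lane is active).
uint32_t readFirstLane(const VGPR &Src, uint64_t Exec) {
  for (unsigned Lane = 0; Lane < kWaveSize; ++Lane)
    if (Exec & (uint64_t{1} << Lane))
      return Src[Lane];
  return Src[0];
}

// Models v_readlane_b32 with a scalar lane select.
uint32_t readLane(const VGPR &Src, uint32_t LaneSel) {
  return Src[LaneSel % kWaveSize];
}

// Models the legalized sequence: scalarize the per-lane select with
// readfirstlane, then do the readlane with the resulting scalar.
uint32_t readLaneLegalized(const VGPR &Src, const VGPR &LaneSel,
                           uint64_t Exec) {
  return readLane(Src, readFirstLane(LaneSel, Exec));
}
```

If every active lane holds the same select value, the result matches what each lane asked for. If the select is divergent, only the first active lane's choice is honored, which is why the comment should spell out the assumption.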

================
Comment at: lib/Target/AMDGPU/SIInstrInfo.cpp:2642
+      RI.isVGPR(MRI, Src1.getReg())) {
+    unsigned Reg = MRI.createVirtualRegister(&AMDGPU::SReg_32RegClass);
+    DebugLoc DL = MI.getDebugLoc();
----------------
This should use SReg_32_XM0.

================
Comment at: lib/Target/AMDGPU/SIInstrInfo.cpp:2643
+    unsigned Reg = MRI.createVirtualRegister(&AMDGPU::SReg_32RegClass);
+    DebugLoc DL = MI.getDebugLoc();
+    BuildMI(*MI.getParent(), MI, DL, get(AMDGPU::V_READFIRSTLANE_B32), Reg)
----------------
DL should be a const reference (const DebugLoc &DL).

================
Comment at: test/CodeGen/AMDGPU/llvm.amdgcn.readlane.ll:26
+define amdgpu_kernel void @test_readlane_vregs(i32 addrspace(1)* %out, <2 x i32> addrspace(1)* %in) #1 {
+  %args = load <2 x i32>, <2 x i32> addrspace(1)* %in
+  %value = extractelement <2 x i32> %args, i32 0
----------------
Can you add a GEP on the workitem ID to ensure the scalar load optimization won't ever trigger on this?

https://reviews.llvm.org/D32343