[PATCH] D22489: AMDGPU/SI: Implement readlane/readfirstlane intrinsics to expose the instructions.

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 18 16:59:29 PDT 2016


arsenm added inline comments.

================
Comment at: include/llvm/IR/IntrinsicsAMDGPU.td:395
@@ -394,1 +394,3 @@
 
+// llvm.amdgcn.readfirlane src
+def int_amdgcn_readfirstlane :
----------------
The comment has a typo (readfirlane), and it is not needed anyway.

================
Comment at: include/llvm/IR/IntrinsicsAMDGPU.td:400
@@ +399,3 @@
+
+// llvm.amdgcn.readlane
+def int_amdgcn_readlane :
----------------
Ditto

================
Comment at: lib/Target/AMDGPU/SIInstructions.td:2364-2365
@@ +2363,4 @@
+def : Pat <
+  (int_amdgcn_readfirstlane i32:$src),
+  (V_READFIRSTLANE_B32 $src)
+>;
----------------
This is a very simple pattern, so it should go with the instruction definition's pattern rather than in a separate Pat.

================
Comment at: lib/Target/AMDGPU/SIInstructions.td:2371-2373
@@ +2370,5 @@
+//===----------------------------------------------------------------------===//
+def : Pat <
+  (int_amdgcn_readlane i32:$src0, i32:$src1),
+  (V_READLANE_B32 $src0, $src1)
+>;
----------------
Ditto

================
Comment at: test/CodeGen/AMDGPU/llvm.amdgcn.readfirstlane.ll:7-11
@@ +6,7 @@
+; CHECK: v_readfirstlane_b32 s{{[0-9]+}}, v{{[0-9]+}}
+define void @test_readfirstlane(i32 addrspace(1)* %out, i32 %src) nounwind {
+  %readfirstlane = call i32 @llvm.amdgcn.readfirstlane(i32 %src) #0
+  store i32 %readfirstlane, i32 addrspace(1)* %out, align 4
+  ret void
+}
+
----------------
This should also include a test with an immediate source, to make sure the immediate is moved into a register. Another test that uses inline asm to put a value in m0 would also be useful (the same applies to the other intrinsic).
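
Something along these lines is what I have in mind. This is only a rough sketch: the function names, the attribute groups, and the "={M0}" constraint spelling are assumptions rather than anything taken from the patch, and the CHECK lines only assert that the source ends up in a VGPR, so they should be tightened against the actual llc output.

```llvm
; Sketch only: names, attribute groups, and the {M0} constraint spelling
; are assumptions, not taken from the patch under review.
declare i32 @llvm.amdgcn.readfirstlane(i32) #0

; Immediate source: the constant should be copied into a VGPR before the read.
; CHECK-LABEL: {{^}}test_readfirstlane_imm:
; CHECK: v_readfirstlane_b32 s{{[0-9]+}}, v{{[0-9]+}}
define void @test_readfirstlane_imm(i32 addrspace(1)* %out) #1 {
  %readfirstlane = call i32 @llvm.amdgcn.readfirstlane(i32 32)
  store i32 %readfirstlane, i32 addrspace(1)* %out, align 4
  ret void
}

; m0 source: inline asm defines m0, and that value feeds the intrinsic.
; CHECK-LABEL: {{^}}test_readfirstlane_m0:
; CHECK: s_mov_b32 m0, -1
; CHECK: v_readfirstlane_b32 s{{[0-9]+}}, v{{[0-9]+}}
define void @test_readfirstlane_m0(i32 addrspace(1)* %out) #1 {
  %m0 = call i32 asm "s_mov_b32 m0, -1", "={M0}"()
  %readfirstlane = call i32 @llvm.amdgcn.readfirstlane(i32 %m0)
  store i32 %readfirstlane, i32 addrspace(1)* %out, align 4
  ret void
}

attributes #0 = { nounwind readnone }
attributes #1 = { nounwind }
```

The readlane test file could get the same pair of cases for its first operand.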

================
Comment at: test/CodeGen/AMDGPU/llvm.amdgcn.readlane.ll:8
@@ +7,3 @@
+define void @readlane_sreg(i32 addrspace(1)* %out, i32 %src0, i32 %src1) nounwind {
+  %readlane = call i32 @llvm.amdgcn.readlane(i32 %src0, i32 %src1) #0
+  store i32 %readlane, i32 addrspace(1)* %out, align 4
----------------
The attributes are not needed on the call site.

================
Comment at: test/CodeGen/AMDGPU/llvm.amdgcn.readlane.ll:15
@@ +14,3 @@
+; CHECK: v_readlane_b32 s{{[0-9]+}}, v{{[0-9]+}}, 32
+define void @readlane_imm(i32 addrspace(1)* %out, i32 %src0) nounwind {
+  %readlane = call i32 @llvm.amdgcn.readlane(i32 %src0, i32 32) #0
----------------
Use an attribute group for the nounwind as well.
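
As a rough sketch of how the test could look with both attribute comments applied: the call site carries no attribute reference, and the function's nounwind moves into a group. The contents of #0 are not visible in the quoted diff, so the attribute sets below are assumptions.

```llvm
; Sketch only: the attribute sets are assumed, since the quoted diff does
; not show what #0 contains.
declare i32 @llvm.amdgcn.readlane(i32, i32) #0

; CHECK: v_readlane_b32 s{{[0-9]+}}, v{{[0-9]+}}, 32
define void @readlane_imm(i32 addrspace(1)* %out, i32 %src0) #1 {
  ; No #0 here: the call picks up the attributes from the declaration.
  %readlane = call i32 @llvm.amdgcn.readlane(i32 %src0, i32 32)
  store i32 %readlane, i32 addrspace(1)* %out, align 4
  ret void
}

attributes #0 = { nounwind readnone }
attributes #1 = { nounwind }
```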


https://reviews.llvm.org/D22489




