[PATCH] D12452: AMDGPU/SI: Add support for llvm.r600.local.size.* instrics when targeting HSA

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 5 11:37:27 PDT 2015


arsenm added inline comments.

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:476-477
@@ -474,3 +475,4 @@
                                          SDLoc SL, SDValue Chain,
-                                         unsigned Offset, bool Signed) const {
+                                         unsigned Offset, bool Signed,
+                                         unsigned BasePtrReg) const {
   const DataLayout &DL = DAG.getDataLayout();
----------------
I think Signed should stay the last parameter

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:1052-1053
@@ +1051,4 @@
+
+    switch (Dim) {
+      case 0:
+        // Clear the high bits.
----------------
The cases aren't supposed to be indented

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:1059-1060
@@ +1058,4 @@
+      case 1:
+        // Get local size y from the high bits.  We can use SRA here, because
+        // the max value range is 0-256, so the sign bit will always be zero.
+        Param = DAG.getNode(ISD::SRA, DL, MVT::i32, Param,
----------------
Why do you want to use SRA? I would expect SRL to be preferrable

================
Comment at: test/CodeGen/AMDGPU/work-item-intrinsics.ll:102-104
@@ +101,5 @@
+; VI-NOHSA: s_load_dword [[VAL:s[0-9]+]], s[0:1], 0x18
+; FIXME: Need to teach SelectionDAG about per-address space legalization
+; of ext loads.  We should be using SMRD.
+; HSA: buffer_load_ushort [[VVAL:v[0-9]+]], s[0:3], 0 offset:4
+; GCN-NOHSA: v_mov_b32_e32 [[VVAL:v[0-9]+]], [[VAL]]
----------------
For this case specifically, it already should not happen. I added shouldReduceLoadWidth to stop producing < 32-bit type extloads. Is there somewhere missing this?

================
Comment at: test/CodeGen/AMDGPU/work-item-intrinsics.ll:113
@@ -108,3 +112,3 @@
 }
 
 ; FUNC-LABEL: {{^}}local_size_y:
----------------
There should be tests that use all the combinations of local.size.x/y/z


http://reviews.llvm.org/D12452





More information about the llvm-commits mailing list