[PATCH] D12452: AMDGPU/SI: Add support for llvm.r600.local.size.* intrinsics when targeting HSA

Tom Stellard via llvm-commits llvm-commits at lists.llvm.org
Fri Aug 28 17:15:01 PDT 2015


tstellarAMD added inline comments.

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:1041
@@ +1040,3 @@
+    Offset = SI::DispatchPacketOffset::LOCAL_SIZE_X + (Dim * 4);
+    MemVT = MVT::i16;
+  } else {
----------------
arsenm wrote:
> Why is this an i16? We really don't want to have to do an argument extload, although from the tests it looks like that doesn't happen.
The extload isn't happening because LowerParameter only does extloads for floating-point types. I can fix that.

The local size values are stored in memory as i16 values.  We could use a 32-bit non-ext load for the z value, since the next 16 bits after the z value will always be 0.

For x and y, we always load both and then mask/shift to get the value we need.  I'm not sure whether a 32-bit load plus a mask or shift is faster than a 16-bit ext load.
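
To make the two options concrete, here is a rough host-side C++ sketch, not code from the patch. It assumes a hypothetical 8-byte slice holding x, y and z as packed 16-bit values followed by 16 zero bits, and a little-endian host. LocalSizeSlice, makeSlice and the two localSizeVia* helpers are names made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical slice: x, y, z as 16-bit values, then 16 bits that are always 0.
struct LocalSizeSlice {
  unsigned char bytes[8];
};

LocalSizeSlice makeSlice(uint16_t x, uint16_t y, uint16_t z) {
  LocalSizeSlice s{}; // zero-initialized, so the trailing 16 bits stay 0
  const uint16_t vals[3] = {x, y, z};
  std::memcpy(s.bytes, vals, sizeof(vals));
  return s;
}

// Option A: 16-bit load, zero-extended to 32 bits.
uint32_t localSizeViaExtLoad16(const LocalSizeSlice &s, unsigned dim) {
  assert(dim < 3);
  uint16_t v;
  std::memcpy(&v, s.bytes + dim * 2, sizeof(v));
  return v; // implicit zero extension
}

// Option B: plain 32-bit load, then mask or shift (assumes little-endian).
uint32_t localSizeViaLoad32(const LocalSizeSlice &s, unsigned dim) {
  assert(dim < 3);
  uint32_t word;
  if (dim == 2) {
    // z plus the zero padding: no mask or shift needed.
    std::memcpy(&word, s.bytes + 4, sizeof(word));
    return word;
  }
  // x and y share one 32-bit word: x in the low half, y in the high half.
  std::memcpy(&word, s.bytes, sizeof(word));
  return dim == 0 ? (word & 0xFFFFu) : (word >> 16);
}
```

Both helpers return the same value for every dimension under these assumptions; which form is cheaper on the hardware is exactly the open question above.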


http://reviews.llvm.org/D12452




