[PATCH] D12452: AMDGPU/SI: Add support for llvm.r600.local.size.* instrics when targeting HSA
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 5 11:37:27 PDT 2015
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:476-477
@@ -474,3 +475,4 @@
SDLoc SL, SDValue Chain,
- unsigned Offset, bool Signed) const {
+ unsigned Offset, bool Signed,
+ unsigned BasePtrReg) const {
const DataLayout &DL = DAG.getDataLayout();
----------------
I think Signed should stay the last parameter
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:1052-1053
@@ +1051,4 @@
+
+ switch (Dim) {
+ case 0:
+ // Clear the high bits.
----------------
The cases aren't supposed to be indented
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:1059-1060
@@ +1058,4 @@
+ case 1:
+ // Get local size y from the high bits. We can use SRA here, because
+ // the max value range is 0-256, so the sign bit will always be zero.
+ Param = DAG.getNode(ISD::SRA, DL, MVT::i32, Param,
----------------
Why do you want to use SRA? I would expect SRL to be preferrable
================
Comment at: test/CodeGen/AMDGPU/work-item-intrinsics.ll:102-104
@@ +101,5 @@
+; VI-NOHSA: s_load_dword [[VAL:s[0-9]+]], s[0:1], 0x18
+; FIXME: Need to teach SelectionDAG about per-address space legalization
+; of ext loads. We should be using SMRD.
+; HSA: buffer_load_ushort [[VVAL:v[0-9]+]], s[0:3], 0 offset:4
+; GCN-NOHSA: v_mov_b32_e32 [[VVAL:v[0-9]+]], [[VAL]]
----------------
For this case specifically, it already should not happen. I added shouldReduceLoadWidth to stop producing < 32-bit type extloads. Is there somewhere missing this?
================
Comment at: test/CodeGen/AMDGPU/work-item-intrinsics.ll:113
@@ -108,3 +112,3 @@
}
; FUNC-LABEL: {{^}}local_size_y:
----------------
There should be tests that use all the combinations of local.size.x/y/z
http://reviews.llvm.org/D12452
More information about the llvm-commits
mailing list