[PATCH] D75138: [WIP][AMDGPU] Eliminate the ScratchWaveOffset register from the calling convention

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 25 14:17:32 PST 2020


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIFrameLowering.cpp:602-603
+
+    // Save and restore SRSRC bits [48:63]. We only want to update the base
+    // address in bits [0:47].
+    BuildMI(MBB, I, DL, TII->get(AMDGPU::S_AND_B32), SavedWord)
----------------
arsenm wrote:
> Do we actually need these bits? I'm fairly confident these are always 0 in the HSA resource descriptor (or at least are a known constant we can just reproduce later)
According to this it's hardcoded: https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/master/src/core/runtime/amd_aql_queue.cpp#L1015

We just need to worry about SWIZZLE_ENABLE being set to 1. This is the high bit, so all it can do is trigger a carry on the second add. So I think you can get away with just doing the add, and then using s_bitset1_b32 to ensure the bit wasn't carried away.
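
To make the arithmetic concrete, here is a rough host-side C++ model of that sequence. It is only a sketch, not the patch's code. The names SRsrcLow, addScratchWaveOffset and checkExamples are hypothetical. It also assumes that the offset is added to the low dword, that only the carry is added to the second dword, and that SWIZZLE_ENABLE is bit 31 of the second dword.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the first two dwords of the scratch resource descriptor.
struct SRsrcLow {
  uint32_t Word0; // base address bits [0:31]
  uint32_t Word1; // base address bits [32:47], remaining fields in the upper half
};

// Assumption: SWIZZLE_ENABLE is the high bit of the second dword.
constexpr uint32_t SwizzleEnableBit = 31;

// Add a 32-bit offset to the base address without saving and restoring the
// upper bits of Word1; instead, force SWIZZLE_ENABLE back on afterwards.
constexpr SRsrcLow addScratchWaveOffset(SRsrcLow Rsrc, uint32_t Offset) {
  uint32_t Lo = Rsrc.Word0 + Offset;          // models s_add_u32
  uint32_t Carry = Lo < Rsrc.Word0 ? 1u : 0u; // models the carry left in SCC
  uint32_t Hi = Rsrc.Word1 + Carry;           // models s_addc_u32 with 0
  Hi |= (1u << SwizzleEnableBit);             // models s_bitset1_b32
  return SRsrcLow{Lo, Hi};
}

// Small usage example with made-up descriptor values.
inline void checkExamples() {
  // The low add wraps, so the carry bumps the high address bits by one.
  SRsrcLow A = addScratchWaveOffset(SRsrcLow{0xFFFFFF00u, 0x80000001u}, 0x200u);
  assert(A.Word0 == 0x00000100u);
  assert(A.Word1 == 0x80000002u);

  // If the second add wraps all the way around, the bit is still set again.
  SRsrcLow B = addScratchWaveOffset(SRsrcLow{0xFFFFFFFFu, 0xFFFFFFFFu}, 1u);
  assert(B.Word0 == 0u);
  assert(B.Word1 == 0x80000000u);
}
```

The model only shows that the high bit survives the add. It does not cover a carry out of address bit 47 into the neighbouring field, which would need the base address to overflow 48 bits.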


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75138/new/

https://reviews.llvm.org/D75138




