[PATCH] D75138: [WIP][AMDGPU] Eliminate the ScratchWaveOffset register from the calling convention

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 25 14:17:35 PST 2020


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIFrameLowering.cpp:602-603
+
+    // Save and restore SRSRC bits [48:63]. We only want to update the base
+    // address in bits [0:47].
+    BuildMI(MBB, I, DL, TII->get(AMDGPU::S_AND_B32), SavedWord)
----------------
arsenm wrote:
> arsenm wrote:
> > Do we actually need these bits? I'm fairly confident these are always 0 in the HSA resource descriptor (or at least are a known constant we can just reproduce later)
> According to this it's hardcoded: https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/master/src/core/runtime/amd_aql_queue.cpp#L1015
> 
> We just need to worry about SWIZZLE_ENABLE being set to 1. This is the high bit, so all it can do is trigger a carry on the second add. So I think that means you can get away with just doing the add, and then using s_bitset1_b32 to ensure it wasn't carried away
Actually, I don't think any add that fits in the 48-bit address space should ever touch the high bits (although I usually manage to be wrong about known bits optimizations with adds)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75138/new/

https://reviews.llvm.org/D75138





More information about the llvm-commits mailing list