[PATCH] D117132: AMDGPU/GlobalISel: Introduce pseudo to copy sp in call sequences

Thu Jan 13 09:59:10 PST 2022

sebastian-ne added inline comments.

================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -global-isel -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -verify-machineinstrs -o - %s | FileCheck -enable-var-scope -check-prefix=MUBUF %s
+
----------------
sebastian-ne wrote:
> arsenm wrote:
> > sebastian-ne wrote:
> > > Could you also run tests with `-amdgpu-enable-flat-scratch`? I guess we don’t want to multiply by wavesize then
> > I thought all the offsets in scratch instructions were unswizzled, so it would still be scaled
> If I remember correctly, the stack pointer is not scaled when flat scratch is enabled.
> 
> I.e. if I understand it right, for buffer instructions we have
> ```
> sp = n * wavesize
> buffer_store voffset = sp / wavesize  ; hardware internally swizzles, so we end up with voffset = n * wavesize + laneid
> ```
> 
> with flat scratch it is
> ```
> sp = n
> scratch_store voffset = sp  ; hardware internally swizzles, so we end up with voffset = n * wavesize + laneid
> ```
> I guess we don’t want to multiply by wavesize then

I meant divide by wavesize; I confused the shifts.
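
To spell out the arithmetic I have in mind, here is a rough Python model. It is not LLVM code: the function names are made up for illustration, and the swizzle is only the simplified `n * wavesize + laneid` picture from the quoted comment, not a precise description of the hardware.

```python
WAVESIZE = 64  # assumed wave64, as on gfx900
SHIFT = WAVESIZE.bit_length() - 1  # log2(wavesize) == 6


def mubuf_offset(sp):
    # Buffer path: sp is kept scaled (sp = n * wavesize), so the
    # offset handed to the instruction is sp >> log2(wavesize).
    return sp >> SHIFT


def flat_scratch_offset(sp):
    # Flat scratch path: sp is already unscaled (sp = n), so no shift.
    return sp


def swizzled(offset, laneid):
    # Simplified model of the address the hardware ends up using.
    return offset * WAVESIZE + laneid


n, lane = 16, 3
# Both paths should hand the same per-lane offset to the hardware.
assert mubuf_offset(n << SHIFT) == flat_scratch_offset(n) == n
assert swizzled(mubuf_offset(n << SHIFT), lane) == swizzled(flat_scratch_offset(n), lane)
```

If that model is right, the copy of sp in the call sequence needs the right shift only on the MUBUF path and should be a plain copy with flat scratch.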

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117132/new/

https://reviews.llvm.org/D117132