[PATCH] D117132: AMDGPU/GlobalISel: Introduce pseudo to copy sp in call sequences

Sebastian Neubauer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 13 09:57:51 PST 2022


sebastian-ne added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -global-isel -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -verify-machineinstrs -o - %s | FileCheck -enable-var-scope -check-prefix=MUBUF %s
+
----------------
arsenm wrote:
> sebastian-ne wrote:
> > Could you also run tests with `-amdgpu-enable-flat-scratch`? I guess we don’t want to multiply by wavesize then
> I thought all the offsets in scratch instructions were unswizzled, so it would still be scaled
If I remember correctly, the stack pointer is not scaled when flat scratch is enabled.

That is, if I understand it right, for buffer instructions we have:
```
sp = n * wavesize
buffer_store voffset = sp / wavesize  ; hardware internally swizzles, so we end up with effective offset = n * wavesize + laneid
```

With flat scratch it is:
```
sp = n
scratch_store voffset = sp  ; hardware internally swizzles, so we end up with effective offset = n * wavesize + laneid
```
```
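
To make the two cases easier to compare, here is a small Python sketch of the arithmetic as I described it above. It only models my understanding from this comment, not the actual hardware addressing or the LLVM lowering. The function names are made up for illustration, `WAVE_SIZE = 64` assumes wave64 (as on gfx900), and the swizzle formula is the simplified one from the snippets above.

```python
WAVE_SIZE = 64  # assumption: wave64, as on gfx900


def stack_pointer(n: int, flat_scratch: bool, wavesize: int = WAVE_SIZE) -> int:
    """SP value for a per-lane stack offset n (hypothetical helper).

    Buffer (MUBUF) case: sp is scaled by the wave size.
    Flat scratch case: sp is the unscaled per-lane offset.
    """
    return n if flat_scratch else n * wavesize


def voffset(sp: int, flat_scratch: bool, wavesize: int = WAVE_SIZE) -> int:
    """Offset handed to the store instruction (hypothetical helper)."""
    return sp if flat_scratch else sp // wavesize


def effective_offset(n: int, laneid: int, wavesize: int = WAVE_SIZE) -> int:
    """Simplified swizzled offset from the comment: n * wavesize + laneid."""
    return n * wavesize + laneid


if __name__ == "__main__":
    n = 16
    for flat in (False, True):
        sp = stack_pointer(n, flat)
        # Both modes should hand the same per-lane offset to the instruction.
        print(f"flat_scratch={flat}: sp={sp}, voffset={voffset(sp, flat)}")
```

The point is that both modes pass the same per-lane offset `n` to the instruction; only the representation of `sp` differs, which is why the multiply by wavesize should not be needed with `-amdgpu-enable-flat-scratch`.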


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117132/new/

https://reviews.llvm.org/D117132


