[PATCH] D153517: [AMDGPU] ISel for amdgpu_cs_chain[_preserve] functions

Diana Picus via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 28 04:12:11 PDT 2023


rovka added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-preserve-cc.ll:16
+
+define amdgpu_cs_chain_preserve void @amdgpu_cs_chain_preserve_cc(<4 x i32> inreg %a, <4 x i32> %b) {
+  ; GISEL-GFX11-LABEL: name: amdgpu_cs_chain_preserve_cc
----------------
nhaehnle wrote:
> rovka wrote:
> > nhaehnle wrote:
> > > These functions probably need way more scratch save/restore. Actually, I'm starting to suspect that we can't reliably code-generate amdgpu_cs_chain_preserve functions that call amdgpu_gfx functions, because the amdgpu_gfx calling convention has some caller-saves registers, which just doesn't mesh well.
> > > 
> > > This probably needs a bit more thought, though it shouldn't be a big problem in practice: the purpose / intended use of the _preserve calling convention is very narrow.
> > I'm not sure I understand your comment, I assume you mean it needs more scratch save/restore so it actually preserves things, like the name says? :) If that's what you had in mind, isn't that PEI's job? (I've only just started looking into that, since I thought it's not ISel's job - but I could be wrong)
> Okay, you may be right about that, which suggests it would be good to have some tests that actually look at final assembly as well.
> 
> But there's a bigger problem: let's say `v192` is a caller-saved register in the `amdgpu_gfx` calling convention. What do we do if there's an `amdgpu_gfx` call in a _preserve function?
> 
> * Save and restore `v192` around the call inside the _preserve function unconditionally. But does this now mean that we always have to run this code with an allocation of 192 VGPRs? That's certainly bad for occupancy....
> * Do nothing. But then if the callee does use that many VGPR and clobbers `v192`, the outcome is incorrect.
> * Something else...?
> Okay, you may be right about that, which suggests it would be good to have some tests that actually look at final assembly as well.
> 
> But there's a bigger problem: let's say `v192` is a caller-saved register in the `amdgpu_gfx` calling convention. What do we do if there's an `amdgpu_gfx` call in a _preserve function?
> 
> * Save and restore `v192` around the call inside the _preserve function unconditionally. But does this now mean that we always have to run this code with an allocation of 192 VGPRs? That's certainly bad for occupancy....
> * Do nothing. But then if the callee does use that many VGPR and clobbers `v192`, the outcome is incorrect.
> * Something else...?

Hm, I see your point. I'm probably going to implement the first option initially (and people can avoid calls to amdgpu_gfx functions in sensitive code...). I'll try to keep this in mind though, and maybe we can come up with something later on.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153517/new/

https://reviews.llvm.org/D153517



More information about the llvm-commits mailing list