[PATCH] D123525: [AMDGPU] On gfx908, reserve VGPR for AGPR copy based on register budget.
Mahesha S via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Apr 16 23:02:16 PDT 2022
hsmhsm added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/spill-agpr.ll:72
-; Should spill agprs to memory for both gfx908 and gfx90a.
-; GCN-LABEL: {{^}}max_5regs_used_8a:
----------------
rampitec wrote:
> No reason to split this file. You can just add new function here.
Actually, I am bit confused here - I am having difficulty in understanding about fixing it.
Basically, with change in this patch (for gfx908), @max_5regs_used_8a() crashes (asserts) while compiling for gfx908, and hence no output will be generated for the file while compiling for gfx908.
To fix this crash, we need to increase num vgprs to 6 (from 5) only for gfx908, but, not for gfx90a, which means, we need to keep two versions of the same function - one for gfx90a (existing one - @max_5regs_used_8a), and another for gfx908 (new one - @max_6regs_used_8a).
But, the problem is - since the function @max_5regs_used_8a still exist in the file, the compilation for gfx908 still crashes, and hence lit still fails for gfx908.
How to fix this without spilting the file?
================
Comment at: llvm/test/CodeGen/AMDGPU/spill-vgpr-to-agpr.ll:4
+; GFX908: register allocation failed: maximum depth for recoloring reached
; GFX900: couldn't allocate input reg for constraint 'a'
----------------
rampitec wrote:
> What exactly do we test here now? Plus this is an obvious regression.
I am confused here, not getting, what exactly is happening with this lit test.
First, forget this change, consider the original file, and focus on the compilation for gfx900. - the expectation here is that the functions @max_10_vgprs_used_9a and @max_10_vgprs_used_1a_partial_spill should fail to compile for gfx900 since there is agpr usage here, which we test in the below check line.
```
; GFX900: couldn't allocate input reg for constraint 'a'
```
But, the quirk here is, - we also test for expected assembly output for other functions (say, max_10_vgprs) while compiling for gfx900.
This is a mystery to me, because, I expected no assembly output when the compiler throws an error. However, llc has a quirk where on error, it will print the asm output if it's defaulting to stdout output from stdin, but won't if you specify -o (courtesy Matt), and we are taking advantage of this quirk nature of llc while testing this file for gfx900.
Whether above testing by taking advantage of the quirk nature of llc is right thing to do? Whether this quirk llc behavior is consistent with all types of compiler errors? which I am not sure, for example, with asserts, looks like llc will not exbibit this quirk behavior, meaning stops compilation when asserts.
Now, coming to new behavior of testing this file for gfx908 with this patch - basically llc throws below error since 10 vgprs is not enough.
```
register allocation failed: maximum depth for recoloring reached
```
To fix it, we need to increase the num of vgprs from 10 to 11 only for gfx908, but we need to keep 10 while compiling for gfx900. So, we need to keep two versions of same function one for gfx900, and another for gfx908. But, if we keep both the versions in the same file, then compilation for gfx908 still fails because vpgr 10 version is still exist in the file.
How to fix it by maintaining everything in a single file without spliting the file? For now, I took advantage of the quirk nature llc as explained above.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D123525/new/
https://reviews.llvm.org/D123525
More information about the llvm-commits
mailing list