[clang] [llvm] [mlir] [AMDGPU] Add a new amdgcn.load.to.lds intrinsic (PR #137425)
Krzysztof Drewniak via cfe-commits
cfe-commits at lists.llvm.org
Fri May 2 14:32:17 PDT 2025
krzysz00 wrote:
Re discussion on the other PR about "why is this even an intrinsic" - since this probably shouldn't just be in @jayfoad's DMs:
The reason I disagree with "just pattern-match it" is that you can't get the scheduling you want without the guarantee that the intrinsic gives you.
Namely, while
```
global_load_b32 v1, v0
ds_write_addtid_b32 v1, s0
```
is obviously equivalent to
```
s_mov_b32 m0, s0
global_load_lds_b32 v0
```
if we turn that first example into
```
pipelined_loop: {
global_load_b32 v2, v0
...
waitcnt(lds only) + barrier
ds_read v*, ...
mfmas(v)
waitcnt(lds)+s_barrier
waitcnt(vmem) ;; and not substantially earlier please
ds_write_addtid_b32 v2, s0
jle pipelined_loop
}
```
for example, we really don't want that match firing, because the LDS contents would get overwritten.
... *unless* we're double-buffering into LDS and so trying to do
```
pipelined_lds: {
waitcnt(vmem,lds)+barrier
load_lds(global1(iv), lds2)
do_compute(lds1)
waitcnt(vmem,lds)+barrier
load_lds(global2(iv), lds1)
do_compute(lds2) ;; We'd better not be waiting on LDS1 to settle at/before here
iv += 2
}
```
where, if the pattern match for the addtid load fails, say because of waitcnt insertion, that'll cause problems for the program.
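To make the double-buffered schedule above concrete, here is a toy Python model of it. This is only a sketch: the names `run_double_buffered`, `load_lds`, `wait_and_barrier`, and `do_compute` are hypothetical stand-ins for the pseudo-ops in the loop, the asynchronous load is modeled as a queued write that only lands at the next wait, and a prologue load (not shown in the loop above) is assumed to fill the first buffer.

```python
def run_double_buffered(tiles):
    """Toy model of the ping-pong loop: returns tiles in the order computed."""
    lds = {1: None, 2: None}  # two LDS buffers, None means "nothing valid"
    pending = []              # in-flight (buffer, value) writes from load_lds
    consumed = []

    def load_lds(buf, idx):
        # Asynchronous global -> LDS load: queued, not yet visible in LDS.
        if idx < len(tiles):
            pending.append((buf, tiles[idx]))

    def wait_and_barrier():
        # waitcnt(vmem,lds)+barrier: every in-flight load has now landed.
        for buf, val in pending:
            lds[buf] = val
        pending.clear()

    def do_compute(buf):
        # Consume the buffer; clear it so stale data is never reused.
        if lds[buf] is not None:
            consumed.append(lds[buf])
            lds[buf] = None

    load_lds(1, 0)  # assumed prologue: fill the first buffer
    iv = 0
    while iv < len(tiles):
        wait_and_barrier()
        load_lds(2, iv + 1)  # fill buffer 2 while computing on buffer 1
        do_compute(1)
        wait_and_barrier()
        load_lds(1, iv + 2)  # fill buffer 1 while computing on buffer 2
        do_compute(2)
        iv += 2
    return consumed
```

In this model each `do_compute` only ever sees data that landed at the preceding wait. The schedule relies on the load into the other buffer being genuinely in flight during the compute, which is what a guaranteed load-to-LDS instruction provides.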
Not to mention, because we don't have an intrinsic for ds_addtid, and because there are a *lot* of ways to spell the lane ID (mbcnt, workitem.id.x with annotations, a bunch of workitem IDs mod 64, etc.), any such pattern match will be quite fragile.
So in the context of GEMM stuff, I'd rather not leave this at "hope the compiler recognizes what we're trying to do". If the compiler can be made to reliably recognize what we're trying to do in the future, that'll be cool, but I can't be the one to write that patch, and I don't think there's infinite bandwidth among the AMDGPU crowd for this improvement either.
https://github.com/llvm/llvm-project/pull/137425