[llvm] [AMDGPU] Document GFX12 Memory Model (PR #98599)

Mon Aug 19 02:09:31 PDT 2024

Pierre-vh wrote:

> > Can you please review the changes I made for L1 as a buffer?
> 
> I'm confused that [c2625c2](https://github.com/llvm/llvm-project/commit/c2625c2c89529dbaffd7503c12d48ab071f4c561) changes the text about what `global_inv` are required, but does not update anything in the code sequences table.

Good catch, and you're right. I think we could replace those SCOPE_DEV with SCOPE_SE, but I'm not really convinced it's the right decision because:

- `global_inv` is meant as a release operation, and it makes sense that if we do an agent-scope release, we should have a `global_inv scope:SCOPE_DEV`.
- I prefer not to rely too much on the device configuration when it's not strictly needed. E.g., it's technically possible for L2 to have SCOPE_SE depending on mtype, device layout, etc., which means that for an agent-scope release we would have to invalidate it too, and a SCOPE_SE inv won't do it. (Though this is a bit of a "whataboutism".)
- AFAIK L1 forwards everything to L2, so SCOPE_SE vs. SCOPE_DEV isn't any less efficient; it'll reach L2 in any case.

I will bring this up with @t-tye at our next meeting. My intuition is that we should leave SCOPE_DEV as is, and then add a new paragraph to explain how we approach `global_inv`/`global_wb` emission.

https://github.com/llvm/llvm-project/pull/98599