[llvm] [AMDGPU] Document & Finalize GFX12 Memory Model (PR #98599)

Mon Aug 19 09:34:17 PDT 2024

t-tye wrote:

> > > Can you please review the changes I made for L1 as a buffer?
> > 
> > 
> > I'm confused that [c2625c2](https://github.com/llvm/llvm-project/commit/c2625c2c89529dbaffd7503c12d48ab071f4c561) changes the text about which `global_inv` instructions are required, but does not update anything in the code sequences table.
> 
> Good catch, and you're right. I think we could replace those SCOPE_DEV with SCOPE_SE, but I'm not really convinced it's the right decision because:
> 
>     * global_inv is meant as a release operation, and it makes sense that if we do an agent scope release, we should have a `global_inv scope:SCOPE_DEV`.
> 
>     * I prefer not to rely too much on the device configuration when it's not strictly needed. For example, it's technically possible for L2 to have SCOPE_SE depending on mtype, device layout, etc., which means that for an agent release we would have to invalidate it too, and a SCOPE_SE inv won't do it. (Though this is a bit of a "whataboutism".)
> 
>     * AFAIK L1 forwards everything to L2, so SCOPE_DEV isn't any less efficient than SCOPE_SE; it'll reach L2 in any case.
> 
> 
> I will bring this up with @t-tye at our next meeting. My intuition is that we should leave SCOPE_DEV, and then add a new paragraph to explain how we approach global_inv/wb emission.

Yes, let's discuss this, as there are several things here that seem questionable :-)

https://github.com/llvm/llvm-project/pull/98599