[llvm] [AMDGPU] Document & Finalize GFX12 Memory Model (PR #98599)

Mon Aug 19 09:49:35 PDT 2024

t-tye wrote:

> > I think we could replace those SCOPE_DEV with SCOPE_SE
> > [...]
> > AFAIK L1 forwards everything to L2, so SCOPE_SE vs DEV isn't any less efficient, it'll reach L2 in any case.
> 
> I'm not entirely sure that's true, and in any case I think we shouldn't rely on such details. Let's just have the scopes follow the semantics we want. So, like you said, an agent scope release should do a SCOPE_DEV writeback, and an agent scope acquire should do a SCOPE_DEV invalidate.

This makes more sense to me. Release is not about invalidating, it is about "writing back" and ensuring the write back has completed. The WB instructions do this. Even if the cache is write-through, we still have to confirm that the write-through has completed to the scope we want to release to. That is where the WB instruction comes in. It does more than just trigger a write back, it also confirms the write is complete at the specified scope.

We want the hardware instruction scopes to reflect the source language semantics. Unfortunately this is not always the case, so in some cases we have to adjust the scope according to the hardware's modal configuration. But where the scopes do reflect the language semantics, we can use them directly and not worry about which caches they manipulate: the hardware will control the appropriate caches for the source language semantic action, taking its modal configuration into account.
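To make the intent concrete, here is a minimal sketch of "scopes follow the semantics". It is not the code in the PR. All type and function names are hypothetical. Only the agent to SCOPE_DEV pairing comes from this thread; the other rows of the mapping, and the SCOPE_CU and SCOPE_SYS names, are assumptions, and the modal adjustments mentioned above are not modelled.

```cpp
#include <cassert>

// Source-language synchronization scopes (illustrative names only).
enum class SyncScope { Wavefront, Workgroup, Agent, System };

// Hardware cache scopes. SE and DEV are named in this thread;
// CU and SYS are assumed here for completeness.
enum class HwScope { CU, SE, DEV, SYS };

// The two cache actions discussed: release writes back, acquire invalidates.
enum class CacheOpKind { Writeback, Invalidate };

struct CacheOp {
  CacheOpKind Kind;
  HwScope Scope;
};

// Pick the hardware scope that matches the language scope directly,
// without reasoning about which physical caches end up being touched.
// Only Agent -> DEV is taken from the discussion; the rest is assumed.
inline HwScope hwScopeFor(SyncScope S) {
  switch (S) {
  case SyncScope::Wavefront: return HwScope::CU;
  case SyncScope::Workgroup: return HwScope::SE;
  case SyncScope::Agent:     return HwScope::DEV;
  case SyncScope::System:    return HwScope::SYS;
  }
  return HwScope::SYS; // conservative fallback for unknown values
}

// Release: write back to the scope (the WB also confirms completion there).
inline CacheOp releaseOp(SyncScope S) {
  return {CacheOpKind::Writeback, hwScopeFor(S)};
}

// Acquire: invalidate at the scope so later loads see released data.
inline CacheOp acquireOp(SyncScope S) {
  return {CacheOpKind::Invalidate, hwScopeFor(S)};
}
```

With this shape, `releaseOp(SyncScope::Agent)` yields a writeback at `HwScope::DEV` and `acquireOp(SyncScope::Agent)` yields an invalidate at `HwScope::DEV`, matching the agent scope case described above.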

https://github.com/llvm/llvm-project/pull/98599