[llvm] [AMDGPU] Set glc bit for nontemporal loads on GFX10/11 (PR #89739)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 25 02:14:30 PDT 2024


jayfoad wrote:

> From what I understand, HIT_EVICT means that an existing line is allowed to match, while MISS_EVICT means that no match is allowed. Why would HIT_EVICT be preferred for a non-temporal operation?

My take on this is: nontemporal means that no match is expected, but if we _do_ get a match then, hey, why not take advantage of it?
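
To make the distinction concrete, here is a toy C++ sketch of the two policies as described in the quoted question. It is not the hardware definition: `ToyCache`, `Policy` and `load` are hypothetical names, and the "line is not retained after the access" behaviour is my assumption about what the EVICT part means.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Toy model only; names and behaviour are illustrative assumptions.
enum class Policy { HitEvict, MissEvict };

struct ToyCache {
  std::unordered_map<std::uint64_t, int> lines;  // address -> cached value
  const std::unordered_map<std::uint64_t, int> *backing = nullptr;  // next level

  // Loads `addr` into `out`; returns true if the value came from the cache.
  bool load(std::uint64_t addr, Policy p, int &out) {
    assert(backing != nullptr);
    auto it = lines.find(addr);
    // HitEvict may match an existing line; MissEvict never matches.
    bool hit = (p == Policy::HitEvict) && it != lines.end();
    out = hit ? it->second : backing->at(addr);
    // Assumption: under both policies the line is dropped after the access.
    if (it != lines.end())
      lines.erase(it);
    return hit;
  }
};
```

In this model a line that happens to be present is served from the cache under `HitEvict` and refetched from the next level under `MissEvict`; when the line is absent the two policies behave the same.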

> So depending on what "non-temporal" means to different use-cases, either HIT_EVICT on L1 needs to be MISS_EVICT in general for non-temporal data, or the specific "low-latency protocol" use case needs a new kind of builtin to set that policy.

Nontemporal is only a performance hint, so it does not "need" to use MISS_EVICT. However, I have already been told that RCCL relies on nontemporal to do things that it is not strictly required to do.
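
As a source-level illustration of "only a hint", here is a minimal C++ sketch using Clang's `__builtin_nontemporal_load`. The wrapper name `load_hint` is hypothetical, and the plain-load fallback is only there so the sketch compiles with compilers that lack the builtin.

```cpp
// Fallback so the sketch also compiles where __has_builtin is unavailable.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical wrapper: a load carrying the nontemporal hint where supported.
inline int load_hint(int *p) {
#if __has_builtin(__builtin_nontemporal_load)
  return __builtin_nontemporal_load(p);  // Clang builtin
#else
  return *p;  // plain load; the value read is the same
#endif
}
```

Either branch reads the same value from `*p`. The hint only influences which cache policy bits the backend may choose, which is why code should not depend on it for correctness.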

https://github.com/llvm/llvm-project/pull/89739

