[llvm] [AMDGPU] Set glc/slc on volatile/nontemporal SMEM loads (PR #77443)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 11 08:40:48 PST 2024
================
@@ -5813,6 +5813,18 @@ in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx6-gfx9-table`.
be reordered by
hardware.
+ load *none* *none* - constant - !volatile & !nontemporal
+
+ 1. s_load/s_buffer_load
+
+ - !volatile & nontemporal
+
+ 1. s_load/s_buffer_load glc=1 slc=1
+
+ - volatile
+
+ 1. s_load/s_buffer_load glc=1
----------------
jayfoad wrote:
The VMEM case adds a waitcnt. Do we need to add `s_waitcnt lgkmcnt(0)` here? Even that wouldn't prevent reordering of two volatile accesses where one used `global_load` and the other used `s_load`.
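
For reference, the mapping in the quoted table row can be sketched roughly as below. This is only an illustration of the three cases listed in the hunk, not code from the PR: `SmemCachePolicy` and `smemLoadPolicy` are hypothetical names, and the sketch deliberately says nothing about whether a waitcnt is needed.

```cpp
#include <cassert>

// Hypothetical type, not an LLVM API: the two cache-policy bits the
// quoted table sets on s_load/s_buffer_load.
struct SmemCachePolicy {
  bool glc;
  bool slc;
};

// Hypothetical helper mirroring the three table cases:
//   volatile                  -> glc=1
//   !volatile & nontemporal   -> glc=1 slc=1
//   !volatile & !nontemporal  -> no bits set
// Volatile takes precedence, since the table's "volatile" case is not
// qualified by nontemporal.
constexpr SmemCachePolicy smemLoadPolicy(bool isVolatile, bool isNonTemporal) {
  return isVolatile      ? SmemCachePolicy{true, false}
         : isNonTemporal ? SmemCachePolicy{true, true}
                         : SmemCachePolicy{false, false};
}

// Compile-time checks of the plain and nontemporal cases.
static_assert(!smemLoadPolicy(false, false).glc, "plain load sets no glc");
static_assert(smemLoadPolicy(false, true).slc, "nontemporal load sets slc");
```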
https://github.com/llvm/llvm-project/pull/77443