[llvm] [AMDGPU] Set glc/slc on volatile/nontemporal SMEM loads (PR #77443)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 11 08:40:48 PST 2024


================
@@ -5813,6 +5813,18 @@ in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx6-gfx9-table`.
                                                               be reordered by
                                                               hardware.
 
+     load         *none*       *none*         - constant - !volatile & !nontemporal
+
+                                                           1. s_load/s_buffer_load
+
+                                                         - !volatile & nontemporal
+
+                                                           1. s_load/s_buffer_load glc=1 slc=1
+
+                                                         - volatile
+
+                                                           1. s_load/s_buffer_load glc=1
----------------
jayfoad wrote:

The VMEM case adds a waitcnt. Do we need to add `s_waitcnt lgkmcnt(0)` here? That still wouldn't prevent reordering of two volatile accesses where one uses `global_load` and the other uses `s_load`.
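For reference, the three rows the quoted hunk adds can be read as a small decision function. The following is only a sketch of that mapping, not the code in the PR: `CachePolicy`, `smem_load_policy`, and `render` are hypothetical names, and it assumes the unqualified "volatile" row takes precedence over nontemporal. It models the cache-policy bits only and says nothing about whether an `s_waitcnt lgkmcnt(0)` is needed.

```python
from typing import NamedTuple


class CachePolicy(NamedTuple):
    """Cache-policy bits on a scalar load (hypothetical helper type)."""
    glc: bool
    slc: bool


def smem_load_policy(is_volatile: bool, is_nontemporal: bool) -> CachePolicy:
    """Pick glc/slc for a constant-address-space load, per the quoted rows."""
    if is_volatile:
        # "volatile" row: glc=1 (assumed to win over nontemporal)
        return CachePolicy(glc=True, slc=False)
    if is_nontemporal:
        # "!volatile & nontemporal" row: glc=1 slc=1
        return CachePolicy(glc=True, slc=True)
    # "!volatile & !nontemporal" row: no cache-policy bits set
    return CachePolicy(glc=False, slc=False)


def render(op: str, policy: CachePolicy) -> str:
    """Format an opcode with its set bits, in the notation the table uses."""
    bits = [f"{name}=1" for name in ("glc", "slc") if getattr(policy, name)]
    return " ".join([op] + bits)
```

For example, `render("s_load", smem_load_policy(False, True))` gives the nontemporal form shown in the table.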

https://github.com/llvm/llvm-project/pull/77443

