[PATCH] D36862: AMDGPU: Handle non-temporal loads and stores

Tony Tye via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 17 23:33:01 PDT 2017


t-tye added inline comments.


================
Comment at: lib/Target/AMDGPU/SIMemoryLegalizer.cpp:123
+  bool enableSLCBit(const MachineBasicBlock::iterator &MI) const {
+    return enableNamedBit<AMDGPU::OpName::slc>(MI);
+  }
----------------
Do the vector buffer/flat/global/scratch and scalar buffer/flat instructions all have the slc bit? And do the DS instructions lack it?


================
Comment at: lib/Target/AMDGPU/SIMemoryLegalizer.cpp:336-337
+  if (MOI.IsNonTemporal) {
+    // FIXME: handle non-temporal atomic loads?
+    assert(!MOI.IsAtomic);
+
----------------
According to http://llvm.org/docs/LangRef.html#load-instruction :
```
!nontemporal does not have any defined semantics for atomic loads.
```

My reading of this is that nontemporal is allowed on atomic loads, but LLVM does not define what it means there. If so, an assert is not correct here.

My suggestion is that the AMDGPU target treat it the same as the non-atomic case.
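
To make the suggestion concrete, here is a rough standalone sketch of the behavior I have in mind. It is not the code in this patch: `MemOpInfo`, `CacheBits` and `expandNonTemporal` are hypothetical stand-ins for the MOI object and the named-bit helpers, and it assumes the non-temporal path sets both glc and slc, as the CHECK lines in the tests expect.

```cpp
// Hypothetical stand-in for the memory operation info (MOI) in the patch.
struct MemOpInfo {
  bool IsNonTemporal = false;
  bool IsAtomic = false;
};

// Hypothetical stand-in for the cache policy bits of one instruction.
struct CacheBits {
  bool GLC = false;
  bool SLC = false;
};

// Sets glc and slc for a non-temporal access and returns true if any bit
// changed. There is deliberately no assert on MOI.IsAtomic: an atomic
// access carrying !nontemporal takes the same path as a non-atomic one.
bool expandNonTemporal(const MemOpInfo &MOI, CacheBits &Bits) {
  if (!MOI.IsNonTemporal)
    return false;

  bool Changed = !Bits.GLC || !Bits.SLC;
  Bits.GLC = true;
  Bits.SLC = true;
  return Changed;
}
```

With this shape, a non-temporal atomic load ends up with the same glc/slc bits as a non-temporal non-atomic load, and an access without nontemporal is left untouched.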


================
Comment at: lib/Target/AMDGPU/SIMemoryLegalizer.cpp:376-377
+  if (MOI.IsNonTemporal) {
+    // FIXME: handle non-temporal atomic stores?
+    assert(!MOI.IsAtomic);
+
----------------
Same comment as for loads.


================
Comment at: test/CodeGen/AMDGPU/memory-legalizer-nontemporal-load.ll:1-12
+; RUN: llc -mtriple=amdgcn-amd- -mcpu=gfx803 -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx803 -verify-machineinstrs < %s | FileCheck %s
+
+; CHECK-LABEL: {{^}}nontemporal_load
+; CHECK:       flat_load_dword [[RET:v[0-9]+]], v[{{[0-9]+}}:{{[0-9]+}}] glc slc{{$}}
+define amdgpu_kernel void @nontemporal_load(
+    i32 addrspace(4)* %in, i32 addrspace(4)* %out) {
----------------
Add tests for private address space so scratch buffer instructions can be tested.

Add tests with uniform values so scalar instructions are generated.

For gfx9 could global/scratch instructions be generated?

Does the LLVM IR verifier allow nontemporal on load atomic? If so, add a test for that as well.


================
Comment at: test/CodeGen/AMDGPU/memory-legalizer-nontemporal-store.ll:1-11
+; RUN: llc -mtriple=amdgcn-amd- -mcpu=gfx803 -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx803 -verify-machineinstrs < %s | FileCheck %s
+
+; CHECK-LABEL: {{^}}nontemporal_store
+; CHECK:       flat_store_dword v[{{[0-9]+}}:{{[0-9]+}}], {{v[0-9]+}} glc slc{{$}}
+define amdgpu_kernel void @nontemporal_store(
+    i32 %in, i32 addrspace(4)* %out) {
----------------
Same comments as for the load test.


https://reviews.llvm.org/D36862




