[llvm] [NVPTX] Add Intrinsics for discard.* (PR #128404)

via llvm-commits llvm-commits at lists.llvm.org
Sun Feb 23 04:24:00 PST 2025


================
@@ -630,6 +630,31 @@ uses and eviction priority which can be accessed by the '``.level::eviction_prio
 For more information, refer to the PTX ISA
 `<https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-prefetch-prefetchu>`_.
 
+``llvm.nvvm.discard.*``'
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+.. code-block:: llvm
+
+  declare void  @llvm.nvvm.discard.global.L2(ptr addrspace(1) %global_ptr, i64 %size)
+  declare void  @llvm.nvvm.discard.L2(ptr %ptr, i64 %size)
+
+Overview:
+"""""""""
+
+The '``@llvm.nvvm.discard.*``'  invalidates the data at the address range [a .. a + (size - 1)] 
----------------
gonzalobg wrote:

This is not accurate. The **effects** of `llvm.nvvm.discard` are equivalent to those of an `llvm.memset` that writes `undef` to memory:

```llvm
call void @llvm.nvvm.discard.L2(ptr %p, i64 8) ; writes `undef` to [p, p+8)
%a = load i64, ptr %p   ; loads undef
%b = load i64, ptr %p   ; loads undef
%fa = freeze i64 %a     ; freezes undef to a stable bit pattern
%fb = freeze i64 %b     ; freezes undef to a stable bit pattern
; pseudocode: assert(%fa == %fb || %fa != %fb), i.e. %fa != %fb is a correct outcome
```

The `llvm.nvvm.discard` intrinsic writes `undef` and carries a **performance hint** to avoid write-backs at a particular cache level, but the hint is not really part of its effects.
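
To spell out the equivalence, here is a minimal sketch of how the effect could be modeled with the generic `llvm.memset` intrinsic. The function name `@discard_model` and the 8-byte size are assumptions for illustration only; this is a model of the semantics, not what the backend emits.

```llvm
declare void @llvm.memset.p0.i64(ptr, i8, i64, i1)

; Hypothetical helper: models the *effect* of a discard on [p, p+8).
define i64 @discard_model(ptr %p) {
  ; Every byte in the range becomes undef; no cache hint is modeled here.
  call void @llvm.memset.p0.i64(ptr %p, i8 undef, i64 8, i1 false)
  ; A load from the range yields undef until the memory is written again.
  %v = load i64, ptr %p
  ; freeze picks one arbitrary but fixed bit pattern for this use.
  %f = freeze i64 %v
  ret i64 %f
}
```

Two separate loads from the range, each frozen on its own, may still disagree, which is why the documentation should not describe the memory as holding any particular value after the discard.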

https://github.com/llvm/llvm-project/pull/128404

