[llvm] [NVPTX] Add Intrinsics for discard.* (PR #128404)
via llvm-commits
llvm-commits at lists.llvm.org
Sun Feb 23 04:24:00 PST 2025
================
@@ -630,6 +630,31 @@ uses and eviction priority which can be accessed by the '``.level::eviction_prio
For more information, refer to the PTX ISA
`<https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-prefetch-prefetchu>`_.
+``llvm.nvvm.discard.*``'
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+.. code-block:: llvm
+
+ declare void @llvm.nvvm.discard.global.L2(ptr addrspace(1) %global_ptr, i64 %size)
+ declare void @llvm.nvvm.discard.L2(ptr %ptr, i64 %size)
+
+Overview:
+"""""""""
+
+The '``@llvm.nvvm.discard.*``' invalidates the data at the address range [a .. a + (size - 1)]
----------------
gonzalobg wrote:
This is not accurate: the **effects** of `llvm.nvvm.discard` are equivalent to those of an `llvm.memset` that writes `undef` to memory:
```llvm
call void @llvm.nvvm.discard.L2(ptr %p, i64 8) ; writes `undef` to [p, p+8)
%a = load i64, ptr %p  ; loads undef
%b = load i64, ptr %p  ; loads undef
%fa = freeze i64 %a    ; freezes undef to a stable bit pattern
%fb = freeze i64 %b    ; freezes undef to a stable bit pattern
; assert %fa == %fb || %fa != %fb, i.e. %fa != %fb is a correct outcome
```
The `llvm.nvvm.discard` intrinsic writes `undef` with a **performance hint** to avoid write-backs at a particular cache level, but the hint is not really part of its effects.
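
To make the `llvm.memset` analogy concrete, here is a minimal sketch of the same sequence with the discard replaced by a `memset` of `undef`. The function name `@discard_model` is hypothetical and used only for illustration, and the sketch models just the memory effects, not the cache hint:

```llvm
; Illustrative model only: @discard_model is a made-up name, not part of the patch.
declare void @llvm.memset.p0.i64(ptr, i8, i64, i1)

define i1 @discard_model(ptr %p) {
  ; Stand-in for the discard: every byte in [p, p+8) becomes undef.
  call void @llvm.memset.p0.i64(ptr %p, i8 undef, i64 8, i1 false)
  %a = load i64, ptr %p
  %b = load i64, ptr %p
  ; Each freeze may settle on a different bit pattern.
  %fa = freeze i64 %a
  %fb = freeze i64 %b
  ; Both true and false are valid results here.
  %eq = icmp eq i64 %fa, %fb
  ret i1 %eq
}
```

An optimizer may well merge the two loads so that the function always returns `true`. That is one of the permitted outcomes, not a guarantee.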
https://github.com/llvm/llvm-project/pull/128404