[llvm] [LangRef] Update initializes definition (PR #134370)

Nikita Popov via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 7 04:54:20 PDT 2025


https://github.com/nikic updated https://github.com/llvm/llvm-project/pull/134370

From 95f23ad1595cab0b69e74a3aae7820ed77037d1a Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Fri, 4 Apr 2025 12:52:48 +0200
Subject: [PATCH 1/2] [LangRef] Update initializes definition

Specify the initializes attribute in terms of an "initialized"
shadow state, such that:

 * Loads prior to initialization return undef.
 * Bytes that are not explicitly initialized are written with
   undef on function return.

This is intended to preserve the core semantics of the attribute,
but adjusts the wording in a way that is compatible with existing
optimizations, such as insertion of spurious loads and removal
of uninitialized writes.

Fixes https://github.com/llvm/llvm-project/issues/133038.
Fixes https://github.com/llvm/llvm-project/issues/133059.
---
 llvm/docs/LangRef.rst | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index d242c945816cc..8b423971b94db 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -1690,10 +1690,24 @@ Currently, only the following parameter attributes are defined:
 
 ``initializes((Lo1, Hi1), ...)``
     This attribute indicates that the function initializes the ranges of the
-    pointer parameter's memory, ``[%p+LoN, %p+HiN)``. Initialization of memory
-    means the first memory access is a non-volatile, non-atomic write. The
-    write must happen before the function returns. If the function unwinds,
-    the write may not happen.
+    pointer parameter's memory ``[%p+LoN, %p+HiN)``. Colloquially, this means
+    that all bytes in the specified range are written before the function
+    returns, and not read prior to the initializing write. If the function
+    unwinds, the write may not happen.
+
+    Formally, this is specified in terms of an "initialized" shadow state for
+    all bytes in the range, which is set to "not initialized" at function entry.
+    If a memory access is performed through a pointer based on the argument,
+    and an accessed byte has not been marked as "initialized" yet, then:
+
+     * If the byte is stored with a non-volatile, non-atomic write, mark it as
+       "initialized".
+     * If the byte is stored with a volatile or atomic write, the behavior is
+       undefined.
+     * If the byte is loaded, return an undef value.
+
+    Additionally, if the function returns normally, write an undef value to all
+    bytes that are part of the range and have not been marked as "initialized".
 
     This attribute only holds for the memory accessed via this pointer
     parameter. Other arbitrary accesses to the same memory via other pointers

From ef47b3a7ddd1708b27633519bde5e89b765124f6 Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Mon, 7 Apr 2025 13:54:01 +0200
Subject: [PATCH 2/2] Return poison from load

---
 llvm/docs/LangRef.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 8b423971b94db..1f5992801764c 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -1704,7 +1704,7 @@ Currently, only the following parameter attributes are defined:
        "initialized".
      * If the byte is stored with a volatile or atomic write, the behavior is
        undefined.
-     * If the byte is loaded, return an undef value.
+     * If the byte is loaded, return a poison value.
 
     Additionally, if the function returns normally, write an undef value to all
     bytes that are part of the range and have not been marked as "initialized".
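
For readers who want to step through the formal rules, the following is a
minimal Python sketch of the "initialized" shadow state as worded after both
patches (loads of uninitialized bytes yield poison, and bytes left
uninitialized are overwritten with undef on normal return). It is an
illustrative model only, not LLVM code: the names InitializesModel, Special,
UndefinedBehavior and return_normally are hypothetical. It assumes that every
access goes through a pointer based on the argument, and it represents memory
as a Python list indexed by byte offset from %p.

```python
from enum import Enum


class Special(Enum):
    """Stand-ins for LLVM's poison and undef values (illustrative only)."""
    POISON = "poison"
    UNDEF = "undef"


class UndefinedBehavior(Exception):
    """Raised where the LangRef wording says the behavior is undefined."""


class InitializesModel:
    """Hypothetical model of initializes((Lo1, Hi1), ...) on one argument."""

    def __init__(self, memory, ranges):
        # memory: list of byte values, indexed by offset from %p.
        self.memory = memory
        # Every offset covered by a half-open range [Lo, Hi).
        self.tracked = {off for lo, hi in ranges for off in range(lo, hi)}
        # Shadow state: all tracked bytes start as "not initialized".
        self.initialized = set()

    def _uninitialized(self, offset):
        return offset in self.tracked and offset not in self.initialized

    def store(self, offset, value, *, volatile=False, atomic=False):
        if self._uninitialized(offset):
            if volatile or atomic:
                raise UndefinedBehavior("volatile/atomic write before init")
            # A plain write marks the byte as "initialized".
            self.initialized.add(offset)
        self.memory[offset] = value

    def load(self, offset):
        # Loading a byte that is not yet initialized returns poison.
        if self._uninitialized(offset):
            return Special.POISON
        return self.memory[offset]

    def return_normally(self):
        # On normal return, remaining uninitialized bytes receive undef.
        for offset in self.tracked - self.initialized:
            self.memory[offset] = Special.UNDEF
```

In this model, a function that never writes some byte of the range still
leaves the caller unable to rely on that byte's old contents, which appears to
be the property that keeps spurious loads and removal of uninitialized writes
compatible with the attribute.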


