[llvm] [LangRef] Do not make align imply dereferenceability (PR #158062)

Thu Sep 11 05:55:11 PDT 2025

https://github.com/nikic created https://github.com/llvm/llvm-project/pull/158062

We currently specify that something like `load i8, align 16384` implies that the object is actually dereferenceable up to 16384 bytes, rather than only the one byte implied by the load type.

We should stop doing that, because it makes it invalid to infer alignments larger than the load/store type, which is something we do (and want to do).
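To illustrate why alignment inference conflicts with the old wording: alignment is a fact about the numeric address, independent of how large the underlying object is. The following is a minimal C++ sketch, not LLVM code; the helper name `known_alignment` is hypothetical and the addresses are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): the largest power of two that
// divides a non-zero address, i.e. the strongest alignment that can be
// inferred from the address value alone.
constexpr std::uint64_t known_alignment(std::uint64_t addr) {
  return addr & (~addr + 1);  // isolate the lowest set bit
}

// A one-byte object that happens to live at 0x8000 is 32768-byte aligned,
// yet only a single byte of it is dereferenceable.
static_assert(known_alignment(0x8000) == 0x8000, "aligned to 32768");
static_assert(known_alignment(0x1FF0) == 16, "aligned to 16");
```

If a large inferred alignment also implied dereferenceability, attaching it to a one-byte load would assert something about memory the object does not own.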

There is some backend code that does make use of this property by widening accesses and extracting part of them. However, I believe we should be justifying that based on target-specific guarantees, rather than a generic IR property. (The reasoning goes something like this: Typically, memory protection has page granularity, so widening a load to the alignment will not trap, as long as the alignment is not larger than the page size, which is true for any practically interesting access size.)
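The page-granularity argument can be sketched as plain address arithmetic. This is an illustrative C++ sketch under stated assumptions, not backend code: the helper name `widened_access_stays_in_page` is hypothetical, both `align` and `page_size` are assumed to be powers of two, and the 4096-byte page size is only an example value.

```cpp
#include <cstdint>

// Hypothetical helper: returns true if the naturally aligned block of
// `align` bytes that contains `addr` lies entirely within one page of
// `page_size` bytes. Both sizes are assumed to be powers of two.
constexpr bool widened_access_stays_in_page(std::uint64_t addr,
                                            std::uint64_t align,
                                            std::uint64_t page_size) {
  const std::uint64_t first = addr & ~(align - 1);  // start of the block
  const std::uint64_t last = first + align - 1;     // last byte of the block
  return first / page_size == last / page_size;
}

// align <= page size: the widened access never leaves the page that
// already holds the originally accessed byte.
static_assert(widened_access_stays_in_page(0x1FF0, 16, 4096),
              "16-byte block stays inside one 4096-byte page");
// align > page size: the widened access spans several pages, so nothing
// guarantees that all of them are mapped.
static_assert(!widened_access_stays_in_page(0x4000, 16384, 4096),
              "16384-byte block spans multiple 4096-byte pages");
```

Whether staying within a single page is enough to rule out a trap is exactly the target-specific part of the argument, which is why it does not belong in the generic IR semantics.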

Fixes https://github.com/llvm/llvm-project/issues/90446.

From f86f767b266f3eddf8e6e667754e6073a23411c4 Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Thu, 11 Sep 2025 14:45:59 +0200
Subject: [PATCH] [LangRef] Do not make align imply dereferenceability

We currently specify that something like `load i8, align 16384`
implies that the object is actually dereferenceable up to 16384
bytes, rather than only the one byte implied by the load type.

We should stop doing that, because it makes it invalid to infer
alignments larger than the load/store type, which is something we
do (and want to do).

There is some backend code that does make use of this property
by widening accesses and extracting part of them. However,
I believe we should be justifying that based on target-specific
guarantees, rather than a generic IR property. (The reasoning
goes something like this: Typically, memory protection has page
granularity, so widening a load to the alignment will not trap,
as long as the alignment is not larger than the page size, which
is true for any practically interesting access size.)

Fixes https://github.com/llvm/llvm-project/issues/90446.
---
 llvm/docs/LangRef.rst | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 45ae2327323d6..a98fd351e54cd 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -11239,11 +11239,9 @@ responsibility of the code emitter to ensure that the alignment information is
 correct. Overestimating the alignment results in undefined behavior.
 Underestimating the alignment may produce less efficient code. An alignment of
 1 is always safe. The maximum possible alignment is ``1 << 32``. An alignment
-value higher than the size of the loaded type implies memory up to the
-alignment value bytes can be safely loaded without trapping in the default
-address space. Access of the high bytes can interfere with debugging tools, so
-should not be accessed if the function has the ``sanitize_thread`` or
-``sanitize_address`` attributes.
+value higher than the size of the loaded type does *not* imply (without
+target-specific knowledge) that memory up to the alignment value bytes can be
+safely loaded without trapping.
 
 The alignment is only optional when parsing textual IR; for in-memory IR, it is
 always present. An omitted ``align`` argument means that the operation has the
@@ -11379,12 +11377,10 @@ operation (that is, the alignment of the memory address). It is the
 responsibility of the code emitter to ensure that the alignment information is
 correct. Overestimating the alignment results in undefined behavior.
 Underestimating the alignment may produce less efficient code. An alignment of
-1 is always safe. The maximum possible alignment is ``1 << 32``. An alignment
-value higher than the size of the loaded type implies memory up to the
-alignment value bytes can be safely loaded without trapping in the default
-address space. Access of the high bytes can interfere with debugging tools, so
-should not be accessed if the function has the ``sanitize_thread`` or
-``sanitize_address`` attributes.
+1 is always safe. The maximum possible alignment is ``1 << 32``.  An alignment
+value higher than the size of the stored type does *not* imply (without
+target-specific knowledge) that memory up to the alignment value bytes can be
+safely accessed without trapping.
 
 The alignment is only optional when parsing textual IR; for in-memory IR, it is
 always present. An omitted ``align`` argument means that the operation has the