[llvm] 2de812f - [LangRef] Always allow getelementptr inbounds with zero offset
Nikita Popov via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 6 05:46:50 PDT 2023
Author: Nikita Popov
Date: 2023-07-06T14:46:32+02:00
New Revision: 2de812f3f4e5ffe6557ebcd891517551b08954ea
URL: https://github.com/llvm/llvm-project/commit/2de812f3f4e5ffe6557ebcd891517551b08954ea
DIFF: https://github.com/llvm/llvm-project/commit/2de812f3f4e5ffe6557ebcd891517551b08954ea.diff
LOG: [LangRef] Always allow getelementptr inbounds with zero offset
Currently, our GEP specification has a special case that makes
gep inbounds (null, 0) legal. This patch proposes to expand this
special case to all gep inbounds (ptr, 0), where ptr is no longer
required to point to an allocated object.
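As a rough illustration of the cases involved, the IR below shows the
previously special-cased null form, the newly allowed form with an arbitrary
base pointer, and a non-zero offset that stays subject to the inbounds rules.
The function names are made up for this sketch and do not come from the patch.

```llvm
; Already legal before this change: zero offset from null.
define ptr @zero_offset_null() {
  %p = getelementptr inbounds i8, ptr null, i64 0
  ret ptr %p
}

; Legal under the proposed wording: zero offset from any pointer,
; even if %ptr does not point to an allocated object.
define ptr @zero_offset_any(ptr %ptr) {
  %p = getelementptr inbounds i8, ptr %ptr, i64 0
  ret ptr %p
}

; Unchanged: with a non-zero offset the result is poison unless
; %ptr and %ptr + 1 are both in bounds of the same allocated object.
define ptr @nonzero_offset(ptr %ptr) {
  %p = getelementptr inbounds i8, ptr %ptr, i64 1
  ret ptr %p
}
```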
This was previously discussed in some detail at
https://discourse.llvm.org/t/question-about-getelementptr-inbounds-with-offset-0/62533.
The motivation for this change is twofold:
* Rust relies on getelementptr inbounds with zero offset being
legal for arbitrary pointers to support zero-sized types. The
current rules are unclear on whether this is legal or not
(saying that there is a zero-size "allocated object" at every
address may be consistent with our current rules, but more
clarity is desired here).
* The current semantics require us to drop the inbounds flag
when materializing zero-index GEPs, which is done by some
InstCombine transforms. Preserving the inbounds flag can
substantially improve optimization quality in some cases, as
illustrated in D154055.
As far as I know, the only analyses/transforms affected by this
semantic change are:
* A special case for comparisons with null in CaptureTracking,
which is fixed by D154054. As far as I can tell, that special
case is not particularly valuable and should be recovered by
other transforms.
* Folding gep inbounds undef, idx to poison. We now need to fold
to undef instead (D154215).
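A sketch of why the second fold has to change, assuming the reasoning is
that a variable index may turn out to be zero at run time (this is one
reading of the change, and the function name is hypothetical):

```llvm
; %idx may be 0 at run time. Under the new rule a zero-offset
; gep inbounds is not poison by itself: it yields the base pointer,
; which is undef here. Folding the instruction to poison would
; therefore be unsound, while folding to undef remains a valid
; refinement in both the zero and the non-zero case.
define ptr @gep_undef_base(i64 %idx) {
  %p = getelementptr inbounds i8, ptr undef, i64 %idx
  ret ptr %p
}
```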
Differential Revision: https://reviews.llvm.org/D154051
Added:
Modified:
llvm/docs/LangRef.rst
Removed:
################################################################################
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 52a34e49f0d68e..cf7f55c3c10f22 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -10923,14 +10923,12 @@ for the given testcase is equivalent to:
ret ptr %t5
}
-If the ``inbounds`` keyword is present, the result value of the
-``getelementptr`` is a :ref:`poison value <poisonvalues>` if one of the
-following rules is violated:
+If the ``inbounds`` keyword is present, the result value of a
+``getelementptr`` with any non-zero indices is a
+:ref:`poison value <poisonvalues>` if one of the following rules is violated:
* The base pointer has an *in bounds* address of an allocated object, which
- means that it points into an allocated object, or to its end. The only
- *in bounds* address for a null pointer in the default address-space is the
- null pointer itself.
+ means that it points into an allocated object, or to its end.
* If the type of an index is larger than the pointer index type, the
truncation to the pointer index type preserves the signed value.
* The multiplication of an index by the type size does not wrap the pointer
@@ -10945,6 +10943,11 @@ following rules is violated:
* In cases where the base is a vector of pointers, the ``inbounds`` keyword
applies to each of the computations element-wise.
+Note that ``getelementptr`` with all-zero indices is always considered to be
+``inbounds``, even if the base pointer does not point to an allocated object.
+As a corollary, the only pointer in bounds of the null pointer in the default
+address space is the null pointer itself.
+
These rules are based on the assumption that no allocated object may cross
the unsigned address space boundary, and no allocated object may be larger
than half the pointer index type space.