[llvm] c87c375 - [LangRef] Clarify GEP inbounds wrapping semantics

Nikita Popov via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 13 08:49:53 PST 2020


Author: Nikita Popov
Date: 2020-11-13T17:49:41+01:00
New Revision: c87c37509692c35c6923bd09b293eb550830ad85

URL: https://github.com/llvm/llvm-project/commit/c87c37509692c35c6923bd09b293eb550830ad85
DIFF: https://github.com/llvm/llvm-project/commit/c87c37509692c35c6923bd09b293eb550830ad85.diff

LOG: [LangRef] Clarify GEP inbounds wrapping semantics

Clarify the semantics of GEP inbounds, in particular with regard
to what it means for wrapping. This cleans up some confusion on
when it is legal to apply nuw/nsw flags to various parts of the
GEP calculation.

Differential Revision: https://reviews.llvm.org/D90708

Added: 
    

Modified: 
    llvm/docs/LangRef.rst

Removed: 
    


################################################################################
diff  --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 0e3ecf91d91f9..2c5e456e6c2f1 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -9791,17 +9791,30 @@ for the given testcase is equivalent to:
     }
 
 If the ``inbounds`` keyword is present, the result value of the
-``getelementptr`` is a :ref:`poison value <poisonvalues>` if the base
-pointer is not an *in bounds* address of an allocated object, or if any
-of the addresses that would be formed by successive addition of the
-offsets implied by the indices to the base address with infinitely
-precise signed arithmetic are not an *in bounds* address of that
-allocated object. The *in bounds* addresses for an allocated object are
-all the addresses that point into the object, plus the address one byte
-past the end. The only *in bounds* address for a null pointer in the
-default address-space is the null pointer itself. In cases where the
-base is a vector of pointers the ``inbounds`` keyword applies to each
-of the computations element-wise.
+``getelementptr`` is a :ref:`poison value <poisonvalues>` if one of the
+following rules is violated:
+
+*  The base pointer has an *in bounds* address of an allocated object, which
+   means that it points into an allocated object, or to its end. The only
+   *in bounds* address for a null pointer in the default address-space is the
+   null pointer itself.
+*  If the type of an index is larger than the pointer index type, the
+   truncation to the pointer index type preserves the signed value.
+*  The multiplication of an index by the type size does not wrap the pointer
+   index type in a signed sense (``nsw``).
+*  The successive addition of offsets (without adding the base address) does
+   not wrap the pointer index type in a signed sense (``nsw``).
+*  The successive addition of the current address, interpreted as an unsigned
+   number, and an offset, interpreted as a signed number, does not wrap the
+   unsigned address space and remains *in bounds* of the allocated object.
+   As a corollary, if the added offset is non-negative, the addition does not
+   wrap in an unsigned sense (``nuw``).
+*  In cases where the base is a vector of pointers, the ``inbounds`` keyword
+   applies to each of the computations element-wise.
+
+These rules are based on the assumption that no allocated object may cross
+the unsigned address space boundary, and no allocated object may be larger
+than half the pointer index type space.
 
 If the ``inbounds`` keyword is not present, the offsets are added to the
 base address with silently-wrapping two's complement arithmetic. If the


        


More information about the llvm-commits mailing list