[llvm] 8bd205b - [LangRef] Clarify the behavior of memory access instructions when pointers/sizes aren't well-defined

Juneyoung Lee via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 25 16:13:50 PDT 2020


Author: Juneyoung Lee
Date: 2020-09-26T08:13:27+09:00
New Revision: 8bd205bf1de486a32abd956390f6527da4c13e33

URL: https://github.com/llvm/llvm-project/commit/8bd205bf1de486a32abd956390f6527da4c13e33
DIFF: https://github.com/llvm/llvm-project/commit/8bd205bf1de486a32abd956390f6527da4c13e33.diff

LOG: [LangRef] Clarify the behavior of memory access instructions when pointers/sizes aren't well-defined

This patch to LangRef clarifies the behavior of load/store/memset/memcpy/memmove when their pointers or sizes are not well-defined values.
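
As a rough illustration of the load wording added in the diff below, the two hypothetical functions here (not part of the patch) mirror the ``undef & 31`` and ``undef & 15`` examples from the new LangRef text. The sketch assumes `%arr` points to a live, dereferenceable 16-byte object and uses the typed-pointer syntax that was current at this revision; newer LLVM releases spell these pointer types as `ptr`.

```llvm
; Hypothetical example: assumes %arr points to 16 dereferenceable bytes.
define i8 @load_may_be_out_of_bounds([16 x i8]* %arr) {
  ; %off is not well defined: it may take any value in 0..31.
  %off = and i64 undef, 31
  %p = getelementptr [16 x i8], [16 x i8]* %arr, i64 0, i64 %off
  ; Undefined behavior: offsets 16..31 are possible and not dereferenceable.
  %v = load i8, i8* %p
  ret i8 %v
}

define i8 @load_always_in_bounds([16 x i8]* %arr) {
  ; %off is still not well defined, but every possible value is in 0..15.
  %off = and i64 undef, 15
  %p = getelementptr [16 x i8], [16 x i8]* %arr, i64 0, i64 %off
  ; Defined: nondeterministically reads one of the 16 bytes.
  %v = load i8, i8* %p
  ret i8 %v
}
```

The same reasoning applies to a store through `%p`, per the store wording in the diff.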

When `-msan-check-access-address` is enabled, MSan detects cases where, e.g., only the lower bits of an address are garbage. This does not directly conflict with this patch, because a C program should not use a pointer with undef bits and reasonable optimizations do not convert a well-defined pointer into a pointer with undef bits.

This patch contains a definition of a well-defined value as well.
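
A minimal sketch of how that definition interacts with `freeze`, again using a hypothetical function that is not part of the patch and the typed-pointer syntax of this revision. It assumes `%arr` points to 16 dereferenceable bytes.

```llvm
; Hypothetical example: assumes %arr points to 16 dereferenceable bytes.
define i8 @load_with_frozen_offset([16 x i8]* %arr) {
  ; The result of freeze is well defined regardless of its operand:
  ; %fr is one arbitrary but fixed i64, and every use sees the same value.
  %fr = freeze i64 undef
  ; %off is therefore well defined and lies in 0..15.
  %off = and i64 %fr, 15
  %p = getelementptr [16 x i8], [16 x i8]* %arr, i64 0, i64 %off
  ; Defined: %p is a well-defined, in-bounds pointer.
  %v = load i8, i8* %p
  ret i8 %v
}
```

Which byte is read is still arbitrary, but the pointer itself is now a single well-defined value rather than a set of possible representations.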

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D87994

Added: 
    

Modified: 
    llvm/docs/LangRef.rst

Removed: 
    


################################################################################
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 923dae330d6d..117c298b5bd5 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3557,10 +3557,6 @@ uses with" concept would not hold.
 
 To ensure all uses of a given register observe the same value (even if
 '``undef``'), the :ref:`freeze instruction <i_freeze>` can be used.
-A value is frozen if its uses see the same value.
-An aggregate value or vector is frozen if its elements are frozen.
-The padding of an aggregate isn't considered, since it isn't visible
-without storing it into memory and loading it with a different type.
 
 .. code-block:: llvm
 
@@ -3733,6 +3729,23 @@ Here are some examples:
 
     end:
 
+.. _welldefinedvalues:
+
+Well-Defined Values
+-------------------
+
+Given a program execution, a value is *well defined* if the value does not
+have an undef bit and is not poison in the execution.
+An aggregate value or vector is well defined if its elements are well defined.
+The padding of an aggregate isn't considered, since it isn't visible
+without storing it into memory and loading it with a different type.
+
+A constant of a :ref:`single value <t_single_value>`, non-vector type is well
+defined if it is a non-undef constant. Note that there is no poison constant
+in LLVM.
+The result of :ref:`freeze instruction <i_freeze>` is well defined regardless
+of its operand.
+
 .. _blockaddress:
 
 Addresses of Basic Blocks
@@ -9248,6 +9261,12 @@ If the value being loaded is of aggregate type, the bytes that correspond to
 padding may be accessed but are ignored, because it is impossible to observe
 padding from the loaded aggregate value.
 
+If the pointer is not a well-defined value, all of its possible representations
+should be dereferenceable. For example, loading a byte from a pointer to an
+array of type ``[16 x i8]`` with offset ``undef & 31`` is undefined behavior.
+Loading a byte at offset ``undef & 15`` nondeterministically reads one of the
+bytes.
+
 Examples:
 """""""""
 
@@ -9339,6 +9358,12 @@ belong to the type, but they will typically be overwritten.
 If ``<value>`` is of aggregate type, padding is filled with
 :ref:`undef <undefvalues>`.
 
+If ``<pointer>`` is not a well-defined value, all of its possible
+representations should be dereferenceable. For example, storing a byte to a
+pointer to an array of type ``[16 x i8]`` with offset ``undef & 31`` is
+undefined behavior. Storing a byte to an offset ``undef & 15``
+nondeterministically stores to one of offsets from 0 to 15.
+
 Example:
 """"""""
 
@@ -12491,6 +12516,9 @@ argument.
 
 If "len" is 0, the pointers may be NULL, dangling, ``undef``, or ``poison``
 pointers. However, they must still be appropriately aligned.
+If "len" isn't a well-defined value, all of its possible representations should
+make the behavior of this ``llvm.memcpy`` defined, otherwise the behavior is
+undefined.
 
 .. _int_memcpy_inline:
 
@@ -12608,6 +12636,9 @@ the argument.
 
 If "len" is 0, the pointers may be NULL, dangling, ``undef``, or ``poison``
 pointers. However, they must still be appropriately aligned.
+If "len" isn't a well-defined value, all of its possible representations should
+make the behavior of this ``llvm.memmove`` defined, otherwise the behavior is
+undefined.
 
 .. _int_memset:
 
@@ -12663,6 +12694,9 @@ the argument.
 
 If "len" is 0, the pointer may be NULL, dangling, ``undef``, or ``poison``
 pointer. However, it must still be appropriately aligned.
+If "len" isn't a well-defined value, all of its possible representations should
+make the behavior of this ``llvm.memset`` defined, otherwise the behavior is
+undefined.
 
 '``llvm.sqrt.*``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^


        

