[PATCH] D87994: [LangRef] State that pointers and/or sizes of memory access instructions are well-defined

Juneyoung Lee via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Sep 20 15:27:18 PDT 2020


aqjune created this revision.
aqjune added reviewers: jdoerfert, efriedma, nikic, fhahn, spatel, eugenis, guiand.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.
aqjune requested review of this revision.

This patch updates LangRef to state that the pointer operand of load/store
must be well-defined, and that the size argument of memset/memcpy/memmove
must be well-defined as well.
If the size is non-zero, the pointer arguments of these intrinsics must be
well-defined too.
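
As a rough sketch of the intrinsic rule (illustration only; the function and
value names below are made up and are not part of the patch), the first call
is allowed because "len" is 0, while the second would be undefined behavior
under the proposed wording because "len" has undef bits:

  declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

  define void @example(i8* %dst, i8* %src) {
    ; "len" is 0, so the pointers may even be undef.
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* undef, i8* undef, i64 0, i1 false)
    ; "len" is either 0 or 4 depending on the undef bit: not well-defined.
    %len = and i64 undef, 4
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 %len, i1 false)
    ret void
  }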

For example, under this rule it is undefined behavior to do

  %p = alloca [8 x i8]
  %off = and i64 undef, 7
  %p2 = getelementptr inbounds [8 x i8], [8 x i8]* %p, i64 0, i64 %off
  store i8 0, i8* %p2

even though `undef & 7` is always less than 8.
This patch is beneficial for further no-undef analysis.
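
For comparison, here is a rough sketch of how such a store could be made
well-defined, assuming the existing freeze instruction is used (the value
names are only for illustration). Freezing the undef value first picks one
arbitrary but fixed value, so the masked offset and the resulting pointer
have no undef bits:

  %p = alloca [8 x i8]
  %f = freeze i64 undef
  %off = and i64 %f, 7
  %p2 = getelementptr inbounds [8 x i8], [8 x i8]* %p, i64 0, i64 %off
  store i8 0, i8* %p2

The store still writes to an unspecified byte of the array, but every use of
%f sees the same value, so the pointer is well-defined.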

IIUC, this is consistent with MSan's assumption.
With `-msan-check-access-address` enabled,
it detects cases where, e.g., only the lower bits of an address are garbage.

A remaining concern is whether there is any case where an offset with undef
bits like this is useful.
I'll send a mail to llvm-dev for further discussion.

BTW, this patch also contains a few extra changes, such as moving the
definition of a frozen value to the end of the poison section.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D87994

Files:
  llvm/docs/LangRef.rst


Index: llvm/docs/LangRef.rst
===================================================================
--- llvm/docs/LangRef.rst
+++ llvm/docs/LangRef.rst
@@ -3553,10 +3553,6 @@
 
 To ensure all uses of a given register observe the same value (even if
 '``undef``'), the :ref:`freeze instruction <i_freeze>` can be used.
-A value is frozen if its uses see the same value.
-An aggregate value or vector is frozen if its elements are frozen.
-The padding of an aggregate isn't considered, since it isn't visible
-without storing it into memory and loading it with a different type.
 
 .. code-block:: llvm
 
@@ -3729,6 +3725,14 @@
 
     end:
 
+A value is well-defined, or *frozen*, if the value never has undef bit
+and is never poison.
+If a variable has a frozen value, its uses should always see the same
+defined value.
+An aggregate value or vector is frozen if its elements are frozen.
+The padding of an aggregate isn't considered, since it isn't visible
+without storing it into memory and loading it with a different type.
+
 .. _blockaddress:
 
 Addresses of Basic Blocks
@@ -9233,7 +9237,13 @@
 Semantics:
 """"""""""
 
-The location of memory pointed to is loaded. If the value being loaded
+The location of memory pointed to is loaded.
+The pointer should be well-defined. For example, one cannot load a pointer
+to an array of type ``[16 x i8]`` with offset ``undef & 16``, even if
+``undef & 16`` is always less than 16. If the pointer isn't well-defined,
+the behavior is undefined.
+
+If the value being loaded
 is of scalar type then the number of bytes read does not exceed the
 minimum number of bytes needed to hold all bits of the type. For
 example, loading an ``i24`` reads at most three bytes. When loading a
@@ -9325,7 +9335,13 @@
 """"""""""
 
 The contents of memory are updated to contain ``<value>`` at the
-location specified by the ``<pointer>`` operand. If ``<value>`` is
+location specified by the ``<pointer>`` operand.
+``<pointer>`` should be well-defined. For example, one cannot store a value to
+a pointer to an array of type ``[16 x i8]`` with offset ``undef & 16``, even if
+``undef & 16`` is always less than 16. If ``<pointer>`` isn't well-defined,
+the behavior is undefined.
+
+If ``<value>`` is
 of scalar type then the number of bytes written does not exceed the
 minimum number of bytes needed to hold all bits of the type. For
 example, storing an ``i24`` writes at most three bytes. When writing a
@@ -12485,8 +12501,10 @@
 to be aligned to some boundary, this can be specified as an attribute on the
 argument.
 
+"len" should be well-defined, otherwise the behavior is undefined.
 If "len" is 0, the pointers may be NULL, dangling, ``undef``, or ``poison``
 pointers. However, they must still be appropriately aligned.
+If "len" is not 0, the pointers should be well-defined.
 
 .. _int_memcpy_inline:
 
@@ -12602,8 +12620,10 @@
 aligned to some boundary, this can be specified as an attribute on
 the argument.
 
+"len" should be well-defined, otherwise the behavior is undefined.
 If "len" is 0, the pointers may be NULL, dangling, ``undef``, or ``poison``
 pointers. However, they must still be appropriately aligned.
+If "len" is not 0, the pointers should be well-defined.
 
 .. _int_memset:
 
@@ -12657,8 +12677,10 @@
 aligned to some boundary, this can be specified as an attribute on
 the argument.
 
+"len" should be well-defined, otherwise the behavior is undefined.
 If "len" is 0, the pointer may be NULL, dangling, ``undef``, or ``poison``
 pointer. However, it must still be appropriately aligned.
+If "len" is not 0, the pointer should be well-defined.
 
 '``llvm.sqrt.*``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^

