[PATCH] D88860: [LangRef] Describe why the pointer aliasing rules are currently unsound.

Eli Friedman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 5 16:04:11 PDT 2020


efriedma created this revision.
efriedma added reviewers: nlopes, aqjune, lebedev.ri, chandlerc, nikic, spatel.
Herald added a reviewer: jdoerfert.
Herald added a project: LLVM.
efriedma requested review of this revision.

I'd like some short description in LangRef we can point to when questions come up, rather than the complicated discussion in https://bugs.llvm.org/show_bug.cgi?id=34548 .
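
For readers who want something concrete, here is a minimal C sketch of the two patterns the new LangRef text calls out. The function names `round_trip` and `store_then_load` are hypothetical and do not appear in the patch. It assumes a frontend that lowers the integer casts to `ptrtoint`/`inttoptr`; the exact IR depends on the compiler and optimization level.

```c
#include <stdint.h>

/* Round-trip a pointer through an integer. Under the strictest reading
 * described in the patch, the result may not simply be replaced by `p`,
 * because the integer-to-pointer cast depends on every escaped pointer. */
int *round_trip(int *p) {
    uintptr_t bits = (uintptr_t)(void *)p; /* pointer to integer */
    return (int *)(void *)bits;            /* integer back to pointer */
}

/* Store a pointer to memory and load it straight back. Under the same
 * strict reading, the loaded value may not simply be replaced by `p`. */
int *store_then_load(int **slot, int *p) {
    *slot = p;    /* store of a pointer value */
    return *slot; /* load of the pointer value just stored */
}
```

In both functions the returned address compares equal to `p`; the open question in the patch is whether an optimizer may treat the result as being based on `p` alone.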


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D88860

Files:
  llvm/docs/LangRef.rst


Index: llvm/docs/LangRef.rst
===================================================================
--- llvm/docs/LangRef.rst
+++ llvm/docs/LangRef.rst
@@ -2554,6 +2554,43 @@
 which specialized optimization passes may use to implement type-based
 alias analysis.
 
+The inttoptr hole
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+The pointer aliasing rules currently don't have a consistent interpretation in
+LLVM. The issue is the semantics of ``inttoptr`` and of loads that produce a
+pointer: it's not clear what it means for a pointer value to "contribute".
+
+Suppose the strictest possible interpretation: all computations and control
+flow are relevant to whether a pointer value "contributes". Then if an integer
+is converted to a pointer, it depends on all pointers which have escaped at
+that point.  This is true even if we can prove the pointer value is equal to
+the address of some specific object. This makes several transforms that LLVM
+currently performs illegal. For example, an ``inttoptr`` of a ``ptrtoint``
+can't be simplified to the operand of the ``ptrtoint``. Likewise, a store of a
+pointer followed by a load of the same pointer can't have the load simplified
+to the value operand of the store.
+
+There are various ways this could be relaxed. The most likely solution is some
+sort of invisible provenance indicator. At its core, this says that if a store
+writes a pointer value and a load then reads that pointer value, the result of
+the load is based only on the value operand of that store, not on any other
+escaped pointer. This description leaves a lot of open questions regarding the
+interaction between pointer operations and non-pointer operations.
+
+Another possibility is that we could drop the notion of "based on", and come
+up with some other approach for alias analysis focused around "inbounds".
+Suppose we had a stricter version of "inbounds" that didn't allow computing
+the address of the byte one past the end of an object.  Then we end up with a
+pretty simple model: pointers themselves are just integers, but GEPs still
+preserve something roughly equivalent to "based on".  The problem is that the
+current "inbounds" allows a pointer one byte past the end of an object; such a
+pointer could point to another object, so analyzing it is a lot more
+complicated.
+
+https://bugs.llvm.org/show_bug.cgi?id=34548 discusses various related issues
+in LLVM. http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2311.pdf goes over
+related issues in the C standard.
+
 .. _volatile:
 
 Volatile Memory Accesses


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D88860.296323.patch
Type: text/x-patch
Size: 2521 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20201005/aece0c90/attachment.bin>

