[llvm] 3bbf7f5 - [Docs] Update opaque pointer docs (NFC)

Nikita Popov via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 14 08:42:51 PST 2022


Author: Nikita Popov
Date: 2022-01-14T17:42:43+01:00
New Revision: 3bbf7f5ed86f74ed8a054ba6b8c2871616500e10

URL: https://github.com/llvm/llvm-project/commit/3bbf7f5ed86f74ed8a054ba6b8c2871616500e10
DIFF: https://github.com/llvm/llvm-project/commit/3bbf7f5ed86f74ed8a054ba6b8c2871616500e10.diff

LOG: [Docs] Update opaque pointer docs (NFC)

Mention -opaque-pointers, write a bit more about migration pitfalls
and update the open issues.

Added: 
    

Modified: 
    llvm/docs/LangRef.rst
    llvm/docs/OpaquePointers.rst

Removed: 
    


################################################################################
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 33ff3a8e85dbe..738d20018c2ca 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -1199,7 +1199,7 @@ Currently, only the following parameter attributes are defined:
 
     The ``elementtype`` argument attribute can be used to specify a pointer
     element type in a way that is compatible with `opaque pointers
-    <OpaquePointers.html>`.
+    <OpaquePointers.html>`_.
 
     The ``elementtype`` attribute by itself does not carry any specific
     semantics. However, certain intrinsics may require this attribute to be

diff --git a/llvm/docs/OpaquePointers.rst b/llvm/docs/OpaquePointers.rst
index 3c179a8272d27..8528ba4673f26 100644
--- a/llvm/docs/OpaquePointers.rst
+++ b/llvm/docs/OpaquePointers.rst
@@ -6,7 +6,7 @@ The Opaque Pointer Type
 =======================
 
 Traditionally, LLVM IR pointer types have contained a pointee type. For example,
-``i32 *`` is a pointer that points to an ``i32`` somewhere in memory. However,
+``i32*`` is a pointer that points to an ``i32`` somewhere in memory. However,
 due to a lack of pointee type semantics and various issues with having pointee
 types, there is a desire to remove pointee types from pointers.
 
@@ -29,7 +29,7 @@ actual underlying type in memory. In other words, the pointee type contains no
 real semantics.
 
 Lots of operations do not actually care about the underlying type. These
-operations, typically intrinsics, usually end up taking an ``i8 *``. This causes
+operations, typically intrinsics, usually end up taking an ``i8*``. This causes
 lots of redundant no-op bitcasts in the IR to and from a pointer with a
 different pointee type. The extra bitcasts take up space and require extra work
 to look through in optimizations. And more bitcasts increases the chances of
@@ -57,6 +57,35 @@ LLVM IR distinguished between unsigned and signed integer types. The transition
 from manifesting signedness in types to instructions happened early on in LLVM's
 life to the betterment of LLVM IR.
 
+Opaque Pointers Mode
+====================
+
+During the transition phase, LLVM can be used in two modes: In typed pointer
+mode (currently still the default) all pointer types have a pointee type and
+opaque pointers cannot be used. In opaque pointers mode, all pointers are
+opaque. The opaque pointer mode can be enabled using ``-opaque-pointers`` in
+LLVM tools like ``opt``, or ``-mllvm -opaque-pointers`` in clang.
+
+In opaque pointer mode, all typed pointers used in IR, bitcode, or created
+using ``PointerType::get()`` and similar APIs are automatically converted into
+opaque pointers. This simplifies migration and allows testing existing IR with
+opaque pointers.
+
+.. code-block:: llvm
+
+   define i8* @test(i8* %p) {
+     %p2 = getelementptr i8, i8* %p, i64 1
+     ret i8* %p2
+   }
+
+   ; Is automatically converted into the following if -opaque-pointers
+   ; is enabled:
+
+   define ptr @test(ptr %p) {
+     %p2 = getelementptr i8, ptr %p, i64 1
+     ret ptr %p2
+   }
+
 I Still Need Pointee Types!
 ===========================
 
@@ -87,73 +116,92 @@ indirectly.
 If you have use cases that this sort of fix doesn't cover, please email
 llvm-dev.
 
-Transition Plan
-===============
-
-LLVM currently has many places that depend on pointee types. Each dependency on
-pointee types needs to be resolved in some way or another. This essentially
-translates to figuring out how to remove all calls to
-``PointerType::getElementType`` and ``Type::getPointerElementType()``.
-
-Making everything use opaque pointers in one huge commit is infeasible. This
-needs to be done incrementally. The following steps need to be done, in no
-particular order:
-
-* Introduce the opaque pointer type
-
-  * Already done
-
-* Remove remaining in-tree users of pointee types
-
-  * There are many miscellaneous uses that should be cleaned up individually
+Migration Instructions
+======================
 
-  * Some of the larger use cases are mentioned below
+In order to support opaque pointers, two types of changes tend to be necessary.
+The first is the removal of all calls to ``PointerType::getElementType()`` and
+``Type::getPointerElementType()``.
 
-* Various ABI attributes and instructions that rely on pointee types need to be
-  modified to specify the type separately
+In the LLVM middle-end and backend, this is usually accomplished by inspecting
+the type of relevant operations instead. For example, memory access related
+analyses and optimizations should use the types encoded in the load and store
+instructions instead of querying the pointer type.
 
-  * This has already happened for all instructions like loads, stores, GEPs,
-    and various attributes like ``byval``
+Frontends need to be adjusted to track pointee types independently of LLVM,
+insofar as they are necessary for lowering. For example, clang now tracks the
+pointee type in the ``Address`` structure.
 
-  * More cases may be found as work continues
+While direct usage of pointer element types is immediately apparent in code,
+there is a more subtle issue that opaque pointers need to contend with: A lot
+of code assumes that pointer equality also implies that the used load/store
+type is the same. Consider the following examples with typed and opaque pointers:
 
-* Remove calls to and deprecate ``IRBuilder`` methods that rely on pointee types
-
-  * For example, some of the ``IRBuilder::CreateGEP()`` methods use the pointer
-    operand's pointee type to determine the GEP operand type
-
-  * Some methods are already deprecated with ``LLVM_ATTRIBUTE_DEPRECATED``, such
-    as some overloads of ``IRBuilder::CreateLoad()``
-
-* Allow bitcode auto-upgrade of legacy pointer type to the new opaque pointer
-  type (not to be turned on until ready)
-
-  * To support legacy bitcode, such as legacy stores/loads, we need to track
-    pointee types for all values since legacy instructions may infer the types
-    from a pointer operand's pointee type
-
-* Migrate frontends to not keep track of frontend pointee types via LLVM pointer
-  pointee types
-
-  * This is mostly Clang, see ``clang::CodeGen::Address::getElementType()``
-
-* Add option to internally treat all pointer types opaque pointers and see what
-  breaks, starting with LLVM tests, then run Clang over large codebases
-
-  * We don't want to start mass-updating tests until we're fairly confident that opaque pointers won't cause major issues
-
-* Replace legacy pointer types in LLVM tests with opaque pointer types
-
-Frontend Migration Steps
-========================
-
-If you have your own frontend, there are a couple of things to do after opaque
-pointer types fully work.
-
-* Don't rely on LLVM pointee types to keep track of frontend pointee types
-
-* Migrate away from LLVM IR instruction builders that rely on pointee types
+.. code-block:: llvm
 
-  * For example, ``IRBuilder::CreateGEP()`` has multiple overloads; make sure to
-    use one where the source element type is explicitly passed in, not inferred
-    from the pointer operand pointee type
+    define i64 @test(i32* %p) {
+      store i32 0, i32* %p
+      %bc = bitcast i32* %p to i64*
+      %v = load i64, i64* %bc
+      ret i64 %v
+    }
+
+    define i64 @test(ptr %p) {
+      store i32 0, ptr %p
+      %v = load i64, ptr %p
+      ret i64 %v
+    }
+
+Without opaque pointers, a check that the pointer operand of the load and
+store are the same also ensures that the accessed type is the same. Using a
+different type requires a bitcast, which will result in distinct pointer
+operands.
+
+With opaque pointers, the bitcast is not present, and this check is no longer
+sufficient. In the above example, it could result in store to load forwarding
+of an incorrect type. Code making such assumptions needs to be adjusted to
+check the accessed type explicitly:
+``LI->getType() == SI->getValueOperand()->getType()``.
+
+Frontends using the C API through an FFI interface should be aware that a
+number of C API functions are deprecated and will be removed as part of the
+opaque pointer transition::
+
+    LLVMBuildLoad -> LLVMBuildLoad2
+    LLVMBuildCall -> LLVMBuildCall2
+    LLVMBuildInvoke -> LLVMBuildInvoke2
+    LLVMBuildGEP -> LLVMBuildGEP2
+    LLVMBuildInBoundsGEP -> LLVMBuildInBoundsGEP2
+    LLVMBuildStructGEP -> LLVMBuildStructGEP2
+    LLVMConstGEP -> LLVMConstGEP2
+    LLVMConstInBoundsGEP -> LLVMConstInBoundsGEP2
+    LLVMAddAlias -> LLVMAddAlias2
+
+Additionally, it will no longer be possible to call ``LLVMGetElementType()``
+on a pointer type.
+
+Transition State
+================
+
+As of January 2022, large parts of LLVM support opaque pointers, but there are
+still some major open problems:
+
+* Bitcode already fully supports opaque pointers, and reading up-to-date
+  typed pointer bitcode in opaque pointers mode also works. However, we
+  currently do not support pointee type based auto-upgrade of old bitcode in
+  opaque pointer mode.
+
+* While clang has limited support for opaque pointers (sufficient to compile
+  CTMark on Linux), a major effort will be needed to systematically remove all
+  uses of ``getPointerElementType()`` and the deprecated ``Address()``
+  constructor.
+
+* We do not yet have a testing strategy for how we can test both typed and
+  opaque pointers during the migration. Currently, individual tests for
+  opaque pointers are being added, but the bulk of tests still uses typed
+  pointers.
+
+* Loop access analysis does not support opaque pointers yet, and is currently
+  the main source of assertion failures in optimized builds.
+
+* Miscellaneous uses of pointer element types remain everywhere.
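
Note on the accessed-type check described under "Migration Instructions" in
the patch: the following is a minimal, self-contained C++ sketch of the idea,
not LLVM code. ToyStore, ToyLoad, samePointer and canForwardStoreToLoad are
hypothetical names invented for illustration only. In LLVM itself the
corresponding comparison is the one quoted in the patch,
LI->getType() == SI->getValueOperand()->getType().

```cpp
#include <string>

// Toy stand-ins for IR instructions. These are NOT LLVM classes; they only
// model "which pointer is accessed" and "with which type".
struct ToyStore {
  const void *Pointer;    // identity of the pointer operand
  std::string ValueType;  // type of the stored value, e.g. "i32"
};

struct ToyLoad {
  const void *Pointer;    // identity of the pointer operand
  std::string ResultType; // type produced by the load, e.g. "i64"
};

// Check that was sufficient with typed pointers: a differently typed access
// needed a bitcast, so equal pointer operands implied equal accessed types.
inline bool samePointer(const ToyStore &SI, const ToyLoad &LI) {
  return SI.Pointer == LI.Pointer;
}

// Adjusted check for opaque pointers: pointer equality no longer says
// anything about the accessed type, so the types are compared explicitly.
inline bool canForwardStoreToLoad(const ToyStore &SI, const ToyLoad &LI) {
  return samePointer(SI, LI) && SI.ValueType == LI.ResultType;
}
```

In this model, an i32 store followed by an i64 load through the same pointer
passes samePointer but is rejected by canForwardStoreToLoad, which mirrors the
opaque pointer example in the patch.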


        

