[PATCH] D126309: [docs][OpaquePtr] Add detail to motivations behind opaque pointers

Reid Kleckner via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 25 11:59:08 PDT 2022


rnk added inline comments.


================
Comment at: llvm/docs/OpaquePointers.rst:51
 
-Lots of operations do not actually care about the underlying type. These
-operations, typically intrinsics, usually end up taking an ``i8*``. This causes
-lots of redundant no-op bitcasts in the IR to and from a pointer with a
-different pointee type. The extra bitcasts take up space and require extra work
-to look through in optimizations. And more bitcasts increase the chances of
-incorrect bitcasts, especially in regards to address spaces.
-
-Some instructions still need to know what type to treat the memory pointed to by
-the pointer as. For example, a load needs to know how many bytes to load from
-memory. In these cases, instructions themselves contain a type argument. For
-example the load instruction from older versions of LLVM
-
-.. code-block:: llvm
-
-  load i64* %p
-
-becomes
-
-.. code-block:: llvm
-
-  load i64, ptr %p
-
-A nice analogous transition that happened earlier in LLVM is integer signedness.
-There is no distinction between signed and unsigned integer types, rather the
-integer operations themselves contain what to treat the integer as. Initially,
-LLVM IR distinguished between unsigned and signed integer types. The transition
-from manifesting signedness in types to instructions happened early on in LLVM's
-life to the betterment of LLVM IR.
+Historically LLVM was some sort of type-safe subset of C. Having pointee types
+provided an extra layer of checks to make sure that the Clang frontend matched
----------------
I believe I provided this wording suggestion, but I think it needs work.

I did a bit of digging, and if you go back to the [original 2003 publication](https://llvm.org/pubs/2003-05-01-GCCSummit2003.html), it was explicit that the types were included with the intention that they would support optimization:

"The architecture that we propose is based on a new language-independent low-level code representation that preserves important type information from the source code. ...
However, the link-time optimizer can only perform meaningful optimizations on the program if it has enough high-level information about the program to prove that aggressive optimizations are safe. Because of this, the low-level code representation is typed (using a language-independent constructive type system) and directly exposes information about structure and array accesses to the optimizer.
...."

Originally, LLVM was a research project with a goal of enabling fancy optimizations (see the [DSA paper](https://llvm.org/pubs/2003-04-29-DataStructureAnalysisTR.html)).

As LLVM evolved into a production compiler, the community started to realize that the LLVM struct type system, or at least the way llvm-gcc used it, couldn't really be used as a sound basis for alias analysis. The DSA alias analysis was [removed from LLVM](https://lists.llvm.org/pipermail/llvm-dev/2006-December/007550.html) in 2006.

So with that in mind, here's a wording suggestion:

LLVM's type system was [originally designed](https://llvm.org/pubs/2003-05-01-GCCSummit2003.html) to support high-level optimization. However, years of LLVM implementation experience have demonstrated that the current pointee type system design does not effectively support optimization. Memory optimization algorithms, such as SROA, GVN, and AA, generally need to look through LLVM's struct types and reason about the underlying memory offsets. The community realized that pointee types are hindering LLVM development, rather than helping it.

Pointee types provide some value to frontends because the IR verifier uses types to detect straightforward type confusion bugs. However, frontends also have to deal with the complexity of inserting bitcasts everywhere that they might be required. The current community consensus is that the costs of pointee types outweigh the benefits, and that they should be removed.
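
To make the bitcast cost concrete, here is a minimal sketch, not taken from the patch. The functions `@consume` and `@f` are hypothetical names used only for illustration. With typed pointers, passing an `i64*` to a function that takes a type-agnostic `i8*` requires a no-op bitcast:

```llvm
; Typed pointers (older LLVM): the callee does not care about the pointee
; type, but the caller must still cast to i8* to satisfy the verifier.
declare void @consume(i8*)   ; hypothetical callee

define void @f(i64* %p) {
  %raw = bitcast i64* %p to i8*   ; no-op cast, present only for the type system
  call void @consume(i8* %raw)
  ret void
}
```

With opaque pointers, the same code should need no cast at all:

```llvm
; Opaque pointers: every pointer in a given address space has the same type.
declare void @consume(ptr)   ; hypothetical callee

define void @f(ptr %p) {
  call void @consume(ptr %p)
  ret void
}
```

The two snippets are separate modules, since a single module cannot mix typed and opaque pointers.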


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126309/new/

https://reviews.llvm.org/D126309


