[llvm-dev] [RFC] Introducing the opaque pointer type
David Chisnall via llvm-dev
llvm-dev at lists.llvm.org
Tue May 11 02:19:38 PDT 2021
On 11/05/2021 07:59, pawel k. via llvm-dev wrote:
> I am very much a beginner with opaque pointers, but I am also a
> minimalist in the sense that entities shouldn't be multiplied but rather
> divided where applicable.
>
> Can someone point me to article(s) describing what problems opaque
> pointers solve that can't be solved with forward declarations and typed
> pointers etc.?
>
> My first gut feeling, when learning about the idea of opaque pointers,
> was that they're not much more than void* with all its issues from a
> static analysis, compiler design, code readability, code quality, and
> code security perspective. Can someone correct a newbie? Very open to
> changing my mind.
There are a few problems with the current representation and they
largely mirror the old problem with signed vs unsigned integers in the
IR 15 years ago. In early versions of LLVM, integers were explicitly
signed. This meant that the IR was cluttered with bitcasts from signed
to unsigned integers, which slowed down analysis and didn't convey any
useful semantics. Worse, a bunch of things were conflated: for
example, does unsigned imply wrapping? Some time in the 2.x series (2.0?
My memory is fuzzy here), LLVM moved to just i{size} types for integers
and moved all of the semantics to the operations. It's now explicit
whether an operation is signed or unsigned, whether overflow wraps or
has undefined behaviour, and so on.
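As a rough illustration of what "semantics on the operations" looks like in today's IR (a sketch written for this point, with a made-up function name, not taken from any real code base): there is a single i32 type, and each instruction says how it interprets the bits.

```llvm
; Hypothetical example: one integer type, interpretation chosen per operation.
define i32 @int_semantics(i32 %a, i32 %b) {
  %s = sdiv i32 %a, %b      ; operands treated as signed
  %u = udiv i32 %a, %b      ; same type, operands treated as unsigned
  %w = add i32 %s, %u       ; plain add: overflow wraps
  %n = add nsw i32 %w, 1    ; nsw: signed overflow must not happen (result is poison)
  ret i32 %n
}
```

No cast is needed to switch between the signed and unsigned views of %a and %b.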
Pointers have a similar set of problems. Pointers carry a type, but
that type doesn't actually carry any semantics. There are a lot of
things that don't care about the type of the pointer, but they have no
way of specifying this and generally use i8*. This means that the IR is
full of bitcasts from {something}* to i8* and then back again.
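A small sketch of the pattern, using memcpy as the operation that doesn't care about the pointee type. The struct and function names are invented for illustration. With typed pointers:

```llvm
; Hypothetical example, typed pointers: both arguments must be cast to i8*.
%struct.S = type { i32, i32 }

declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

define void @copy_typed(%struct.S* %dst, %struct.S* %src) {
  %d = bitcast %struct.S* %dst to i8*
  %s = bitcast %struct.S* %src to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %d, i8* %s, i64 8, i1 false)
  ret void
}
```

The same function with opaque pointers, assuming the `ptr` spelling proposed in the RFC (this is a separate module; the two forms are not meant to be mixed):

```llvm
; Hypothetical example, opaque pointers: no casts are needed.
declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

define void @copy_opaque(ptr %dst, ptr %src) {
  call void @llvm.memcpy.p0.p0.i64(ptr %dst, ptr %src, i64 8, i1 false)
  ret void
}
```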
This is particularly important for code that wants to use non-zero
address spaces, because a lot of code does casts via i8* and forgets to
change this to i8*-in-another-address-space.
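To make the address-space problem concrete, here is a minimal sketch with typed pointers and an invented function name. The cast has to repeat the address space, because a bitcast is not allowed to change it:

```llvm
; Hypothetical example: %p lives in address space 1.
define i8 @first_byte(i32 addrspace(1)* %p) {
  ; Correct: the i8 pointer stays in address space 1.
  %b = bitcast i32 addrspace(1)* %p to i8 addrspace(1)*
  ; Code that hard-codes plain i8* would try to emit
  ;   bitcast i32 addrspace(1)* %p to i8*
  ; which is invalid IR, since bitcast cannot change the address space.
  %v = load i8, i8 addrspace(1)* %b
  ret i8 %v
}
```

With opaque pointers (again assuming the RFC's `ptr` spelling) the intermediate cast disappears, so there is nothing to get wrong: the body is just `load i8, ptr addrspace(1) %p`.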
The fact that a pointer is a pointer to some struct type currently
doesn't imply anything about whether the pointed-to data actually has
that type, and it's completely valid to bitcast a pointer to a random
type and back again in an optimisation. The real type info (where
applicable) is carried by TBAA metadata, dereferenceability info by
attributes, and so on.
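A sketch of where the information actually lives (function name invented; the metadata nodes follow the shape Clang typically emits for C, reproduced here as an assumption). The parameter is declared as float*, yet the memory is read as an int; the pointee type constrains nothing, while the attribute and the !tbaa tag carry the real facts:

```llvm
; Hypothetical example, typed pointers.
define i32 @read_bits(float* dereferenceable(4) %p) {
  %q = bitcast float* %p to i32*                 ; legal: pointee type has no semantics
  %v = load i32, i32* %q, align 4, !tbaa !0      ; access type comes from the load + TBAA
  ret i32 %v
}

!0 = !{!1, !1, i64 0}                  ; access tag: an "int" accessed as "int" at offset 0
!1 = !{!"int", !2, i64 0}
!2 = !{!"omnipotent char", !3, i64 0}
!3 = !{!"Simple C/C++ TBAA"}
```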
TL;DR: The pointee type has no (or worse, misleading) semantics and
forces a load of bitcasts. Opaque pointers remove this.
David