[llvm-dev] Is pointer tagging defined behavior?

Sat Mar 26 05:32:27 PDT 2016

On Sat, Mar 26, 2016 at 2:58 PM, Russell Wallace via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Dynamic languages commonly use an implementation technique where you take
> a pointer to an object (aligned on eight bytes so the lower three bits are
> zero), cast to intptr_t, change the lower three bits to a tag value
> indicating the type of the object, then later test the tag value, remove
> the tag, cast back to a pointer and dereference the pointer.
>

That doesn't sound exactly right.

In the implementations I've seen, pointers always have a tag of all zero
bits. So if the value is actually a pointer, you AND it with 0x7, find the
result is zero, and then just go ahead and use the original value as a
pointer.

If the tag bits are nonzero then you don't have a pointer at all; you have
an integer, a character, or a single-precision float.
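
As a rough illustration of that scheme, here is a minimal C sketch. The names
(value_t, TAG_FIXNUM, make_fixnum and so on) and the choice of tag 1 for
small integers are my own assumptions for the example, not taken from any
particular runtime, and it relies on the usual implementation-defined
behaviour of integer/pointer conversions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical value representation: the low three bits are the tag. */
typedef uintptr_t value_t;

#define TAG_MASK   ((uintptr_t)0x7)
#define TAG_FIXNUM ((uintptr_t)0x1)  /* assumed tag for small integers */

/* A pointer is stored unchanged; its low three bits are already zero. */
static value_t make_pointer(void *p) {
    assert(((uintptr_t)p & TAG_MASK) == 0);  /* must be 8-byte aligned */
    return (value_t)(uintptr_t)p;
}

static bool is_pointer(value_t v) { return (v & TAG_MASK) == 0; }

/* No untagging needed: the original value is the pointer. */
static void *as_pointer(value_t v) {
    assert(is_pointer(v));
    return (void *)v;
}

/* An immediate integer lives in the upper bits, with a nonzero tag. */
static value_t make_fixnum(intptr_t n) {
    return ((uintptr_t)n << 3) | TAG_FIXNUM;
}

static bool is_fixnum(value_t v) { return (v & TAG_MASK) == TAG_FIXNUM; }

/* Clear the tag, then divide by 8 to undo the shift. The conversion back
   to intptr_t is implementation-defined for negative values, but gives
   the expected two's-complement result on common targets. */
static intptr_t fixnum_value(value_t v) {
    assert(is_fixnum(v));
    return (intptr_t)(v & ~TAG_MASK) / 8;
}
```

Note that in this variant nothing that is not already a valid pointer is ever
converted to a pointer type, which is why there is little to object to.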

However, it's not out of the question that you might use some tag values to
indicate pointers to special kinds of objects that the runtime knows about,
such as strings or arrays. Even so, the tagged pointer is still guaranteed
to look like a pointer into somewhere in the first 8 bytes of the same
object (objects are never smaller than 8 bytes), so that's perfectly well
defined.
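
To make the "points into the first 8 bytes of the same object" argument
concrete, here is a small C sketch. The string_obj layout and the tag value 3
are hypothetical, chosen only for illustration. The tagged value is kept as a
char pointer, so it is ordinary pointer arithmetic within the object.

```c
#include <assert.h>
#include <stdint.h>

enum { TAG_MASK = 0x7, TAG_STRING = 0x3 /* assumed tag for strings */ };

/* Hypothetical runtime object: 8-byte aligned and at least 8 bytes long. */
typedef struct string_obj {
    _Alignas(8) uint64_t length;
    char data[8];
} string_obj;

/* Tagging is byte-pointer arithmetic that stays inside the object:
   the result addresses one of its first 8 bytes. */
static char *tag_string(string_obj *s) {
    assert(((uintptr_t)s & TAG_MASK) == 0);
    return (char *)s + TAG_STRING;
}

/* Read the tag back out of the low three address bits. */
static unsigned tag_of(const char *p) {
    return (unsigned)((uintptr_t)p & TAG_MASK);
}

/* Step back to the start of the object before using it. */
static string_obj *untag_string(char *p) {
    assert(tag_of(p) == TAG_STRING);
    return (string_obj *)(p - TAG_STRING);
}
```

Since objects are never smaller than 8 bytes, tags 1 through 7 all yield an
address strictly inside the object.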

The only possible objection is that the pointer will be misaligned. I
believe you can find a discussion here in the last several months in which
it was stated that misaligned pointers are always ok on any machine,
provided that they are not dereferenced. In the case of a tagged pointer,
the pointer is always aligned before being dereferenced, either by masking,
or subtracting, or as an immediate offset (possibly combined with a field
offset).
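
The three ways of re-aligning mentioned above might look like this in C. This
is only a sketch: the pair type, TAG_PAIR, and the helper names are made up
for the example, and converting the integer back to a pointer is
implementation-defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAG_MASK ((uintptr_t)0x7)
#define TAG_PAIR ((uintptr_t)0x1)  /* assumed tag for pairs */

/* Hypothetical heap object, 8-byte aligned. */
typedef struct pair {
    _Alignas(8) intptr_t car;
    intptr_t cdr;
} pair;

static uintptr_t tag_pair(pair *p) {
    assert(((uintptr_t)p & TAG_MASK) == 0);
    return (uintptr_t)p | TAG_PAIR;
}

/* 1. Masking: clear the low three bits, then dereference. */
static intptr_t cdr_by_mask(uintptr_t v) {
    pair *p = (pair *)(v & ~TAG_MASK);
    return p->cdr;
}

/* 2. Subtracting: the tag is known, so subtract it. */
static intptr_t cdr_by_sub(uintptr_t v) {
    pair *p = (pair *)(v - TAG_PAIR);
    return p->cdr;
}

/* 3. Immediate offset: fold the tag into the field offset, so the
   access is a single load at (field offset - tag) from the tagged value. */
static intptr_t cdr_by_offset(uintptr_t v) {
    return *(intptr_t *)(v + offsetof(pair, cdr) - TAG_PAIR);
}
```

In all three cases the address that is actually dereferenced is aligned; the
third form is the one that costs nothing extra on machines with
base-plus-displacement addressing.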