[PATCH] D71499: Add builtins for aligning and checking alignment of pointers and integers

Thu Jan 2 06:32:52 PST 2020

lebedev.ri added reviewers: nlopes, aqjune.
lebedev.ri added a subscriber: nlopes.
lebedev.ri added a comment.

(It would be good for @nlopes to comment; maybe i'm giving this too much weight.)

In D71499#1801119 <https://reviews.llvm.org/D71499#1801119>, @arichardson wrote:

> In D71499#1801104 <https://reviews.llvm.org/D71499#1801104>, @lebedev.ri wrote:
>
> > Looks ok to me now in principle.
> >  I have one more question about pointer variants though (see inline)
>

What i'm asking is:

- Are these builtins designed (as per `clang/docs/LanguageExtensions.rst`) to only be passed pointers that are in-bounds of an allocated memory chunk (`Logical pointer`*), or any arbitrary bag of bits cast to a pointer type (`Physical pointer`*)?
- If `Logical pointer`, are they designed to also produce a `Logical pointer`, or a `Physical pointer`?

> I am not sure the GEP can be inbounds since I have seen some cases
>  where aligning pointers is used to get a pointer to a different object.
>  In most cases it should be in-bounds (even when used to implement `malloc()`),
>  but I have seen some cases where aligning pointers is used to get a pointer
>  to a different object.

Object as in C++ object?
I'm specifically talking about memory region/chunk, as in what is returned by `malloc()`.

> For example, some versions of WebKit align pointers down by 64k to get a pointer to a structure that holds metadata for all objects allocated inside that region.

But that entire region is still a single memory region/chunk,
not just some other random memory chunk that *happens* to be nearby?
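
To make the question concrete, here is a minimal sketch of the pattern being described, assuming the metadata header and every object live inside one 64 KiB-aligned allocation. The names `region_header`, `region_create` and `region_of` are hypothetical and not taken from WebKit; the align-down is hand-rolled through `uintptr_t`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define REGION_SIZE ((size_t)64 * 1024) /* 64 KiB: region size and alignment */

/* Hypothetical metadata header stored at the start of every region. */
struct region_header {
    size_t live_objects;
};

/* Allocate one region whose base address is a multiple of REGION_SIZE. */
static struct region_header *region_create(void) {
    struct region_header *h = aligned_alloc(REGION_SIZE, REGION_SIZE);
    if (h != NULL)
        h->live_objects = 0;
    return h;
}

/* Hand-rolled align-down: clear the low 16 bits of the object's address. */
static struct region_header *region_of(void *obj) {
    uintptr_t addr = (uintptr_t)obj;
    return (struct region_header *)(addr & ~(uintptr_t)(REGION_SIZE - 1));
}
```

In this sketch the aligned-down pointer never leaves the chunk returned by `aligned_alloc()`, so it would still be in-bounds of that allocation. Whether the real WebKit code has the same property is exactly what is being asked here.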

> I am not sure what happens for those cases if we add inbounds (miscompilation?), so I haven't added it here.
>  I guess we could add it if alignment is a constant and is less than the object size, but there might already be a pass to infer if a GEP is inbounds?

I'm pushing on this because these intrinsics are supposed to be better than hand-rolled variants
(and in their current form, with a single non-inbounds GEP, they already are infinitely better
than a ptrtoint+inttoptr pair). But if we go with the current design (return a `Physical pointer`,
which is less optimizer-friendly than `gep inbounds`, which would require/produce `Logical pointer`s),
we'll be stuck with it.
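
For reference, the two hand-rolled shapes being compared look roughly like this in C. This is only a sketch: the function names are hypothetical, and how each form is lowered to IR depends on the compiler. Variant A is the integer round trip (ptrtoint+inttoptr); variant B computes the misalignment as an integer but derives the result from the original pointer by a byte offset, which is the GEP-style form.

```c
#include <assert.h>
#include <stdint.h>

/* Variant A: round-trip through an integer; the result is rebuilt from bits. */
static void *align_down_roundtrip(void *p, uintptr_t align) {
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two only */
    return (void *)((uintptr_t)p & ~(align - 1));
}

/* Variant B: step back from p by its misalignment; the result derives from p. */
static void *align_down_offset(void *p, uintptr_t align) {
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two only */
    uintptr_t misalignment = (uintptr_t)p & (align - 1);
    return (char *)p - misalignment;
}
```

Both return the same address. Note that C itself only defines the pointer arithmetic in variant B when the result stays inside the same object as `p`, which is close to the `gep inbounds` reading discussed here.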

Since there is no such builtin currently (certainly not in clang,
and i don't see one in GCC), we **do** get to dictate its semantics.
We don't **have** to define it to be the most lax/UB-free (`ptrtoint`+`inttoptr` or non-`inbounds` `gep`),
but we //can// define it to be the most optimizer-friendly (`gep inbounds`).

* https://web.ist.utl.pt/nuno.lopes/pubs/llvmmem-oopsla18.pdf

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71499/new/

https://reviews.llvm.org/D71499