[PATCH] D64128: [CodeGen] Generate llvm.ptrmask instead of inttoptr(and(ptrtoint, C)) if possible.

Florian Hahn via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Jul 4 07:26:17 PDT 2019


fhahn added a comment.

Thanks for the quick responses and the helpful comments. Thank you very much, Hal, for summarizing the argument from previous discussions. My initial understanding was indeed that by generating ptrmask directly for C/C++ expressions, we can circumvent the issues that come with ptrtoint/inttoptr in LLVM.

One key point that might not be too clear is that the question should be whether `(T*) ((intptr_t) x & N)` points to the same underlying object as `x`, iff the mask `N` preserves all 'relevant' bits of the pointer `x`. I am not sure if 'relevant' bits is the best term, but I use it to refer to all bits that do not have to be zero due to alignment requirements or pointer size restrictions. With that in mind, let me try to cover the possible cases in terms of C++'s safely-derived pointers, depending on `x`. (I'm referencing http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3690.pdf)

1. if `x` is a safely-derived pointer, then the mask is a no-op and the result of the expression is a safely-derived pointer; as `x` was safely-derived, all bits that are masked out must already be 0, so according to 3.7.4.3.3, `reinterpret_cast<void*>(x)` should be equal to `reinterpret_cast<void*>(((intptr_t) x & N))`.

2. if `x` is not a safely-derived pointer, but it becomes one after masking: then `x` must be the result of a series of bitwise operations that only modify the bits masked out later by `N`. Otherwise the whole series of bitwise operations, including the masking, would violate `3.7.4.3.3 - the result of an additive or bitwise operation, one of whose operands is an integer representation of a safely-derived pointer value P, if that result converted by reinterpret_cast<void*> would compare equal to a safely-derived pointer computable from reinterpret_cast<void*>(P)`

3. if `x` is not a safely-derived pointer and the mask does not turn it into a safely-derived pointer: in that case, the masking should again not change the safely-derived property, and both would be invalid under strict pointer safety.
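To make case 1 concrete, here is a minimal sketch, not taken from the patch. The names `maskPointer` and `maskIsNoOpForAlignedPointer` are hypothetical, it uses `uintptr_t` rather than `intptr_t` to keep the bit operations unsigned, and it assumes the usual implementation-defined mapping where the low bits of the integer representation reflect the pointer's alignment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the C++ pattern under discussion,
// (T*) ((intptr_t) x & N), written with uintptr_t.
template <typename T>
T *maskPointer(T *x, std::uintptr_t N) {
  return reinterpret_cast<T *>(reinterpret_cast<std::uintptr_t>(x) & N);
}

// Case 1: x is safely derived and N only clears bits that alignment
// already forces to zero, so the mask should not change the pointer.
inline bool maskIsNoOpForAlignedPointer() {
  alignas(8) static long long object = 42;
  long long *x = &object;

  // N preserves all 'relevant' bits: only the low 3 bits are cleared,
  // and those are assumed to be zero for an 8-byte aligned object.
  const std::uintptr_t N = ~static_cast<std::uintptr_t>(7);

  long long *masked = maskPointer(x, N);
  return masked == x && *masked == 42;
}
```

Under those assumptions, `maskIsNoOpForAlignedPointer()` returns true, which is the situation where emitting ptrmask instead of the ptrtoint/and/inttoptr sequence would not change which object the result points to.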

I think the key case is 2., where the mask operation is the last step in a series of bitwise operations that starts from an integer representation of a safely-derived pointer value `P` and, after masking, yields `P` again. E.g. packing/unpacking bits of a tagged pointer `(P | 1) & ~1`. After writing all that down, there seems to be one problem though: technically we have a series of bitwise operations, and the intermediate values are not integer representations of safely-derived pointers. One could argue that the bitwise operations together cancel each other out and are a no-op, resulting in the original pointer.
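The tagged-pointer example for case 2 could look roughly like the sketch below. `Node`, `packTag`, `unpack` and `roundTripPreservesPointer` are hypothetical names for illustration only, and the code assumes that bit 0 of the integer representation is free because `Node` is at least 2-byte aligned.

```cpp
#include <cassert>
#include <cstdint>

struct Node {
  int value;
};

// Pack: set the tag in bit 0, i.e. (P | 1). The result is an integer
// that is not itself the representation of a safely-derived pointer.
inline std::uintptr_t packTag(Node *P) {
  return reinterpret_cast<std::uintptr_t>(P) | 1;
}

// Unpack: mask the tag out again, i.e. (tagged & ~1), and convert back.
inline Node *unpack(std::uintptr_t tagged) {
  return reinterpret_cast<Node *>(tagged & ~static_cast<std::uintptr_t>(1));
}

// The two bitwise operations together act as a no-op on P, even though
// the intermediate value differs from the integer representation of P.
inline bool roundTripPreservesPointer() {
  static Node n{7};
  Node *P = &n;
  std::uintptr_t tagged = packTag(P);
  Node *Q = unpack(tagged);
  return (tagged & 1) == 1 && Q == P && Q->value == 7;
}
```

Here the final `& ~1` is the masking step that would be a candidate for ptrmask; whether the intermediate `tagged` value is acceptable under the quoted wording is exactly the open question raised above.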

Does this summary make sense?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64128/new/

https://reviews.llvm.org/D64128




