[PATCH] D59065: [BasicAA] Simplify inttoptr(and(ptrtoint(X), C)) to X, if C preserves all significant bits.

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 1 05:46:07 PDT 2019


fhahn added a comment.

In D59065#1449442 <https://reviews.llvm.org/D59065#1449442>, @sanjoy wrote:

> For instance:
>
>   // Let's say we know malloc(64) will always return a pointer that is 8 byte
>   // aligned.
>  
>   int8* ptr0 = malloc(64);
>   int8* ptr1 = malloc(64);
>  
>   int8* ptr0_end = ptr0 + 64;
>  
>   // I'm not sure if this comparison is well defined in C++, but it is well
>   // defined in LLVM IR:
>   if (ptr0_end != ptr1) return;
>  
>   intptr ptr0_end_i = (intptr)ptr0_end;
>  
>   intptr ptr0_end_masked = ptr0_end_i & -8;
>  
>   // I think the transform being added in this comment will fire below since it is
>   // doing inttoptr(and(ptrtoint(ptr0_end), -8)).
>  
>   int8* aliases_ptr0_and_ptr1 = (int8*)ptr0_end_masked;
>
>
> Right now `aliases_ptr0_and_ptr1` aliases both `ptr0` and `ptr1` (we can GEP backwards from it to access `ptr0` and forwards from it to access `ptr1`).  But if we replace it with `ptr0_end` then it can be used to access `ptr0` only.


Ah thanks, together with @aqjune's response, I think I now know what I was missing. If we have something like

  int8_t* obj1 = malloc(4);
  int8_t* obj2 = malloc(4);
  intptr_t p = (intptr_t)(obj1 + 4);

  if (p != (intptr_t) obj2) return;

  *(int8_t*)(intptr_t)(obj1 + 4) = 0;   // <- here we alias obj1 and obj2?

I thought the information obtained via control flow (that `p` aliases both `obj1` and `obj2`) was limited to the uses of `p`. Do I understand correctly that this is not the case, and that the information leaks to all equivalent expressions (for the snippet above, even without GVN or any common subexpression elimination)? If so, an intrinsic as suggested by @atrick would help circumvent the issue. If not, and the information that `p` aliases `obj1` and `obj2` is limited to uses of `p`, then I think the restrictions in place should be sufficient to rule out your example (assuming we use integer comparisons for the pointers).
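To make the question concrete, here is a minimal C sketch of the snippet above. It is only an illustration under stated assumptions: the names `same_address` and `store_if_adjacent` are hypothetical, and the caller is assumed to carve both "objects" out of one backing buffer so the store stays well defined in C (in the scenario discussed here they would be two separate `malloc` results that happen to be adjacent).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compares two pointers by integer value only,
 * ignoring which object each pointer was derived from. */
int same_address(const void *a, const void *b) {
    return (uintptr_t)a == (uintptr_t)b;
}

/* Mirrors the snippet: take the one-past-the-end address of a 4-byte obj1
 * as an integer, and store through it only if it equals obj2's address.
 * Assumption: the caller guarantees that obj1 + 4 is a valid, writable
 * location (for example, both pointers come from one larger buffer). */
int store_if_adjacent(int8_t *obj1, int8_t *obj2, int8_t value) {
    assert(obj1 != NULL && obj2 != NULL);

    intptr_t p = (intptr_t)(obj1 + 4);
    if (p != (intptr_t)obj2)
        return 0; /* not adjacent: nothing is written */

    /* The open question: does this expression, which does not use p,
     * inherit the "may alias obj2" fact established by the comparison? */
    *(int8_t *)(intptr_t)(obj1 + 4) = value;
    return 1;
}
```

With `int8_t buf[8]`, calling `store_if_adjacent(buf, buf + 4, 7)` takes the store path and writes `buf[4]`, while `store_if_adjacent(buf, buf + 5, 9)` returns without writing.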

>> In the example, the original pointer (`%addr = load %struct.zot*, %struct.zot** %loc, align 8`) is not dereferenced directly, and the use case I am looking at is tagged pointers, where the inttoptr(and(ptrtoint(), C)) roundtrip is required to get a valid pointer.  So the original pointer might not be dereferenceable directly, but logically (ignoring the bits irrelevant for the pointer value) it should still point to the same object.  Does that make sense to you?
> 
> That seems problematic for another reason:  IIUC you're saying `Alias(inttoptr(ptrtoint(X) & -8), A)` == `Alias(X, A)`.  But `X` is an illegal pointer so it does not alias anything (reads and writes on that pointer are illegal)?

Agreed, I think we would need to make this explicit in the LangRef. X is illegal if you consider all bits of the pointer. But the address space and alignment limit the relevant bits of the pointer, so I suppose we could specify that for logical pointers, only the bits in that limited range identify the pointed-to object.
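For readers less familiar with the tagged-pointer use case, the following C sketch shows the kind of roundtrip in question. It is a hedged illustration, not code from the patch: `TAG_MASK` and all function names are hypothetical, and it assumes 8-byte-aligned objects so the low 3 bits are free for a tag (matching the `-8` mask discussed in this thread).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: objects are 8-byte aligned, so the low 3 bits are unused. */
#define TAG_MASK ((uintptr_t)7)

/* Store a small tag in the alignment bits of the pointer. */
uintptr_t tag_pointer(void *p, unsigned tag) {
    uintptr_t bits = (uintptr_t)p;
    assert((bits & TAG_MASK) == 0); /* pointer must be 8-byte aligned */
    assert(tag <= TAG_MASK);        /* tag must fit in the free bits */
    return bits | tag;
}

/* The inttoptr(and(ptrtoint(X), -8)) pattern: mask off the tag to
 * recover a pointer that can actually be dereferenced. */
void *untag_pointer(uintptr_t tagged) {
    return (void *)(tagged & ~TAG_MASK);
}

unsigned tag_of(uintptr_t tagged) {
    return (unsigned)(tagged & TAG_MASK);
}

/* "Logical pointer" identity as suggested above: only the bits outside
 * the alignment range identify the pointed-to object. */
int same_logical_pointer(uintptr_t a, uintptr_t b) {
    return (a & ~TAG_MASK) == (b & ~TAG_MASK);
}
```

Here the tagged value itself is never dereferenced; only the result of `untag_pointer` is. Two tagged values that differ only in their tag compare equal under `same_logical_pointer`, which is the property the LangRef wording would have to capture.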


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59065/new/

https://reviews.llvm.org/D59065




