[LLVMdev] Canonicalization of ptrtoint/inttoptr and getelementptr

Tue Sep 9 21:27:37 PDT 2014

On 09/08/2014 04:22 PM, Dan Gohman wrote:
> An object can be allocated at virtual address 5 through extra-VM means 
> (e.g. mmap), and then one can (creatively) interpret the return value 
> of @f as being associated with whatever %A was associated with *and* 
> 5. The return value of @g can only be associated with exactly the same 
> set that %A was associated with. Consequently, it's not always safe to 
> replace @f with @g.
Dan, I'm trying to follow your logic here and am not arriving at the 
same conclusion.  Can you point out the flaw in my reasoning here?

define i8* @f(i8* %A) {
  ; %pti is not a pointer and is thus not based on anything.
  %pti = ptrtoint i8* %A to i64
  ; %add is not a pointer and is thus not based on anything; it is
  ; "associated with" the memory pointed to by %A.
  ; In particular, "5" is NOT "an integer constant ... returned from a
  ; function not defined within LLVM".  It is not returned by a function.
  ; As a result, the pointer value of 5 is not associated with any
  ; address range.
  %add = add i64 %pti, 5
  ; %itp is based on %pti only.
  %itp = inttoptr i64 %add to i8*
  ret i8* %itp
}

I'm guessing the key difference in our reasoning is about the constant 
5.  :)  I'm also guessing that you have an example in mind which 
motivates the need for 5 to be considered associated with the address 
range.  Could you expand on why?

>
> It looks a little silly to say this in the case of the integer 
> constant 5, and there are some semantic gray areas around extra-VM 
> allocation, but the same thing happens if the add were adding a 
> dynamic integer value, and then it's difficult to find a way to 
> separate that case from the constant 5 case.
>
> In any case, the general advice is that people should prefer to use 
> getelementptr to begin with. LLVM's own optimizers were converted to 
> use getelementptr instead of ptrtoint+add+inttoptr even when they have 
> to do raw byte arithmetic.
It would be nice to be able to canonicalize ptrtoint+add+inttoptr to 
GEPs.  Having reasonable-looking, legal IR that simply doesn't 
optimize is not the best introduction for new frontend authors.  :)
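
To make the frontend angle concrete, here is a minimal C sketch of the two source-level forms a frontend author might start from. The function names are hypothetical, and the lowering described in the comments is an assumption about what a typical C frontend emits, not something stated in this thread. On mainstream targets both forms compute the same address; the question in this thread is only whether the optimizer may treat them as equivalent for aliasing purposes.

```c
#include <stdint.h>

/* Integer round trip. Assumption: a C frontend typically lowers this to
   ptrtoint + add + inttoptr, i.e. the shape of @f. */
static char *advance_via_integer(char *p, uintptr_t n) {
    uintptr_t bits = (uintptr_t)p;   /* pointer -> integer */
    return (char *)(bits + n);       /* integer -> pointer */
}

/* Plain pointer arithmetic. Assumption: typically lowered to a
   getelementptr on i8, i.e. the shape of @g. */
static char *advance_via_pointer(char *p, uintptr_t n) {
    return p + n;
}
```

The second form keeps the result visibly based on p, which is the property the aliasing rules care about.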
>
>
> On Sat, Aug 30, 2014 at 6:01 PM, David Majnemer 
> <david.majnemer at gmail.com> wrote:
>
>     Consider the two functions below:
>
>     define i8* @f(i8* %A) {
>       %pti = ptrtoint i8* %A to i64
>       %add = add i64 %pti, 5
>       %itp = inttoptr i64 %add to i8*
>       ret i8* %itp
>     }
>
>     define i8* @g(i8* %A) {
>       %gep = getelementptr i8* %A, i64 5
>       ret i8* %gep
>     }
>
>     What, if anything, prevents us from canonicalizing @f to @g? I've
>     heard that this might be in violation of
>     http://llvm.org/docs/LangRef.html#pointeraliasing but I don't see how.