[LLVMdev] Canonicalization of ptrtoint/inttoptr and getelementptr
Philip Reames
listmail at philipreames.com
Tue Sep 9 21:27:37 PDT 2014
On 09/08/2014 04:22 PM, Dan Gohman wrote:
> An object can be allocated at virtual address 5 through extra-VM means
> (eg. mmap), and then one can (creatively) interpret the return value
> of @f as being associated with whatever %A was associated with *and*
> 5. The return value of @g can only be associated with exactly the same
> set that %A was associated with. Consequently, it's not always safe to
> replace @f with @g.
Dan, I'm trying to follow your logic here and am not arriving at the
same conclusion. Can you point out the flaw in my reasoning?
define i8* @f(i8* %A) {
  %pti = ptrtoint i8* %A to i64   <-- %pti is not a pointer and is thus not
                                      based on anything
  %add = add i64 %pti, 5          <-- %add is not a pointer and is thus not
                                      based on anything; it is "associated
                                      with" the memory pointed to by %A
      --- In particular, "5" is NOT "an integer constant ... returned from
      a function not defined within LLVM". It is not returned by a
      function. As a result, the pointer value of 5 is not associated with
      any address range.
  %itp = inttoptr i64 %add to i8* <-- %itp is based on %pti only
  ret i8* %itp
}
I'm guessing the key difference in our reasoning is about the constant
5. :) I'm also guessing that you have an example in mind which
motivates the need for 5 to be considered associated with the address
range. Could you expand on why?
>
> It looks a little silly to say this in the case of the integer
> constant 5, and there are some semantic gray areas around extra-VM
> allocation, but the same thing happens if the add were adding a
> dynamic integer value, and then it's difficult to find a way to
> separate that case from the constant 5 case.
>
> In any case, the general advice is that people should prefer to use
> getelementptr to begin with. LLVM's own optimizers were converted to
> use getelementptr instead of ptrtoint+add+inttoptr even when they have
> to do raw byte arithmetic.
It would be nice to be able to canonicalize ptrtoint+add+inttoptr to
GEPs. Having reasonable-looking, legal IR that simply doesn't optimize
is not the best introduction for new frontend authors. :)
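
For frontend authors, the two shapes can be sketched at the source
level. This is a minimal C sketch and is not taken from the thread: the
function names are hypothetical, and the mapping to
ptrtoint+add+inttoptr versus getelementptr is an assumption about how a
typical C frontend lowers these expressions. The integer round trip is
implementation-defined in C; on mainstream flat-address-space targets
both functions return the same address. What is under discussion is
whether the optimizer may treat the two as interchangeable for aliasing
purposes, which this sketch does not settle.

```c
#include <assert.h>
#include <stdint.h>

/* Integer round trip: assumed to lower to ptrtoint + add + inttoptr. */
static char *advance_via_integer(char *p) {
    uintptr_t bits = (uintptr_t)p;  /* pointer -> integer */
    bits += 5;                      /* raw byte arithmetic on the integer */
    return (char *)bits;            /* integer -> pointer */
}

/* Pointer arithmetic: assumed to lower to a getelementptr on i8. */
static char *advance_via_pointer(char *p) {
    return p + 5;
}

/* Hypothetical helper: checks that both forms yield the same address
   for a pointer into a local buffer (assumes a flat address space). */
static void check_same_address(void) {
    char buf[16] = {0};
    assert(advance_via_integer(buf) == advance_via_pointer(buf));
}
```

Calling check_same_address() only confirms that the address values
agree at run time; it says nothing about which objects the optimizer
considers each result to be based on.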
>
>
> On Sat, Aug 30, 2014 at 6:01 PM, David Majnemer
> <david.majnemer at gmail.com> wrote:
>
> Consider the two functions below:
>
> define i8* @f(i8* %A) {
>   %pti = ptrtoint i8* %A to i64
>   %add = add i64 %pti, 5
>   %itp = inttoptr i64 %add to i8*
>   ret i8* %itp
> }
>
> define i8* @g(i8* %A) {
>   %gep = getelementptr i8* %A, i64 5
>   ret i8* %gep
> }
>
> What, if anything, prevents us from canonicalizing @f to @g? I've
> heard that this might be in violation of
> http://llvm.org/docs/LangRef.html#pointeraliasing but I don't see how.