[LLVMdev] Canonicalization of ptrtoint/inttoptr and getelementptr

David Majnemer david.majnemer at gmail.com
Wed Sep 10 18:22:06 PDT 2014


LLVM doesn't appear to respect this.

Consider:
target datalayout = "e-i64:64-f80:128-n8:16:32:64-S128"

define i8* @h(i8* %x, i8* %y) {
  %pti = ptrtoint i8* %y to i64
  %sub = sub i64 0, %pti
  %gep = getelementptr i8* %x, i64 %sub
  ret i8* %gep
}

Run it with -instcombine and we get:

define i8* @h(i8* %x, i8* %y) {
  %pti = ptrtoint i8* %y to i64
  %1 = ptrtoint i8* %x to i64
  %2 = sub i64 %1, %pti
  %gep = inttoptr i64 %2 to i8*
  ret i8* %gep
}
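
The two forms do compute the same address bits; the difference is in what
the result is "based on". As I read the LangRef pointer aliasing rules, a
getelementptr result is based on its pointer operand (%x here), while an
inttoptr result is based on every pointer that contributes to the integer
(%x and %y here). Below is a minimal Python sketch of the arithmetic only,
modelling pointers as plain 64-bit integers. The function names are made up
for illustration and are not LLVM API, and the sketch says nothing about
provenance.

```python
MASK64 = (1 << 64) - 1  # model i64 wraparound arithmetic


def gep_form(x, y):
    """Model: getelementptr i8* %x, i64 (sub i64 0, (ptrtoint %y))."""
    pti = y & MASK64
    sub = (0 - pti) & MASK64
    return (x + sub) & MASK64


def inttoptr_form(x, y):
    """Model: inttoptr (sub i64 (ptrtoint %x), (ptrtoint %y))."""
    return ((x & MASK64) - (y & MASK64)) & MASK64


# Both forms yield the same address bits for any pair of inputs.
for x, y in [(0x2aaaaa0005, 5), (0x1000, 0x2000), (0, MASK64)]:
    assert gep_form(x, y) == inttoptr_form(x, y)
```

So the rewrite is fine as integer arithmetic; whether it is fine under the
aliasing rules is the open question.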


On Wed, Sep 10, 2014 at 6:16 PM, Philip Reames <listmail at philipreames.com>
wrote:

>
> On 09/10/2014 02:55 PM, Kevin Modzelewski wrote:
>
>
> On Tue, Sep 9, 2014 at 9:27 PM, Philip Reames <listmail at philipreames.com>
> wrote:
>
>>
>>  I'm guessing the key difference in our reasoning is about the constant
>> 5.  :)  I'm also guessing that you have an example in mind which motivates
>> the need for 5 to be considered associated with the address range.  Could
>> you expand on why?
>>
>>
>  Can't speak for Dan, but in Pyston we certainly make use of these types
> of constructs to embed JIT-time constants (say, an interned string, or a
> reference to the current function object) into the function being compiled.
>  Heuristically, we can all see the difference in intent between "ptr + 5"
> and "load (int*)0x2aaaaa0000", but it seems like it'd be difficult to come
> up with reasonable rules that would separate them.
>
>   All of the cases I've seen in JITed code can be dealt with
> differently.  By emitting a global variable and then using the "link time"
> address resolution to map it to the right address, you get the same effect
> while remaining entirely within the well defined part of the IR.  I don't
> see this case as being worth restricting an otherwise reasonable
> optimization.
>
> One problem with Dan's interpretation of the current rules is that this
> otherwise legal transform becomes problematic:
> %addr = inttoptr i64 0x2aaaaa0005 to i32*
> ===>
> %tmp = add i64 0x2aaaaa0000, 5
> %addr = inttoptr i64 %tmp to i32*
>
> We probably wouldn't do this at the IR level, but we definitely do perform
> this transform in the backends.  There's no reason it *shouldn't* be valid
> at the IR level either.
>
> Philip