[LLVMdev] Canonicalization of ptrtoint/inttoptr and getelementptr

Dan Gohman dan433584 at gmail.com
Thu Sep 11 14:20:36 PDT 2014


On Wed, Sep 10, 2014 at 3:16 PM, Philip Reames <listmail at philipreames.com>
wrote:

>
> On 09/10/2014 02:55 PM, Kevin Modzelewski wrote:
>
>
> On Tue, Sep 9, 2014 at 9:27 PM, Philip Reames <listmail at philipreames.com>
> wrote:
>
>>
>>  I'm guessing the key difference in our reasoning is about the constant
>> 5.  :)  I'm also guessing that you have an example in mind which motivates
>> the need for 5 to be considered associated with the address range.  Could
>> you expand on why?
>>
>>
>  Can't speak for Dan, but in Pyston we certainly make use of these types
> of constructs to embed JIT-time constants (say, an interned string, or a
> reference to the current function object) into the function being compiled.
>  Heuristically, we can all see the difference in intent between "ptr + 5"
> and "load (int*)0x2aaaaa0000", but it seems like it'd be difficult to come
> up with reasonable rules that would separate them.
>
>   All of the cases I've seen in JITed code can be dealt with
> differently.  By emitting a global variable and then using the "link time"
> address resolution to map it to the right address, you get the same effect
> while remaining entirely within the well defined part of the IR.  I don't
> see this case as being worth restricting an otherwise reasonable
> optimization.
>
> One problem with Dan's interpretation of the current rules is that this
> otherwise legal transform becomes problematic:
> %addr = inttoptr i64 0x2aaaaa0005 to i32*
> ===>
> %tmp = add i64 0x2aaaaa0000, 5
> %addr = inttoptr i64 %tmp to i32*
>
> We probably wouldn't do this at the IR level, but we definitely do perform
> this transform in the backends.  There's no reason it *shouldn't* be valid
> at the IR level either.
>

I don't quite follow your example here. However, there's a key difference
between what happens in the backends and what happens at the mid-level IR:
The mid-level IR does serious alias analysis. The backends do a much more
limited form of alias analysis, and they supplement it by calling back into
the middle-end for the hard stuff. There's no question that a properly
formed ptrtoint+add+inttoptr computes the exact same bits as a
corresponding getelementptr, on all platforms and in all circumstances. The
difference is in the extra aliasing rules that are activated when the
getelementptr instruction is used. The backends don't have a reason to
preserve those rules, so they don't bother.