[PATCH] D37419: Teach scalar evolution to handle inttoptr/ptrtoint

Hal Finkel via llvm-commits llvm-commits at lists.llvm.org
Sun Sep 3 18:42:19 PDT 2017


On 09/03/2017 08:29 PM, Daniel Berlin via llvm-commits wrote:
>
>
> On Sun, Sep 3, 2017 at 6:16 PM, Sanjoy Das 
> <sanjoy at playingwithpointers.com 
> <mailto:sanjoy at playingwithpointers.com>> wrote:
>
>     On Sun, Sep 3, 2017 at 4:53 PM, Daniel Berlin <dberlin at dberlin.org
>     <mailto:dberlin at dberlin.org>> wrote:
>     >> > I also don't understand "  // It's tempting to handle
>     inttoptr and
>     >> > ptrtoint
>     >> > as no-ops, however this can
>     >> >   // lead to pointer expressions which cannot safely be
>     expanded to
>     >> > GEPs,
>     >> >   // because ScalarEvolution doesn't respect the GEP aliasing
>     rules when
>     >> >   // simplifying integer expressions."
>     >> >
>     >> > 1. What are "the gep aliasing rules", exactly?
>     >> > https://llvm.org/docs/LangRef.html#pointeraliasing
>     <https://llvm.org/docs/LangRef.html#pointeraliasing> does not mention
>     >> > special
>     >> > rules for gep :)
>     >>
>     >> I think the rule it is talking about is that reads and writes
>     on a GEP
>     >> are allowed to only read from and write to the allocation that is
>     >> passed in as the first parameter to the GEP
>     >
>     > When we had the discussion about "inrange" and "inbounds", (and
>     elsewhere,
>     > like field accesses) it was raised repeatedly that it's just
>     fine to gep
>     > outside what would be the normal "range" of a pointer, because
>     you can't
>     > ever really tell what it is.
>     > IE a gep of a given pointer can access anything it wants.
>     >
>     > In fact, it seemed generally accepted that you can happily GEP
>     from a
>     > pointer to anywhere you want unless there's an inrange or
>     inbounds on it.
>
>     I think there are two issues here.
>
>     Non-inbounds GEPs to outside the base allocation is allowed by
>     themselves, and produce a bitwise value. 
>
>
> Sure
>
>     Inbounds GEPs to outside the
>     base allocation produces poison.
>
>     So this is allowed:
>
>       %p0 = malloc(16);
>       %p1 = malloc(16);
>       %gep = gep i8* %p0, i64 18
>       %cmp = icmp eq %gep, %p1
>       br i1 %cmp, ...
>
>     but it would not be allowed if the GEP was inbounds (we'd be branching
>     on poison).
>
>
> Sure
>
>     *However* (this is the "and issue reads/writes" bit in my previous
>     email) for reading / writing through the result of a GEP the GEP must
>     *also* follow the pointer aliasing rules.  Of course the GEP must not
>     have produced poison either, that's necessary but not sufficient for
>     safe memory operations.
>
>     > So i'm not sure what you mean by "the allocation", since none of
>     these
>     > allocations are  defined in terms of llvm memory semantics (IE
>     they aren't
>     > alloca's, they are malloc/frees).
>     > If that's not the case, we are not being consistent (IE we're losing
>     > optimizations we shouldn't, and we have correctness issues we
>     don't think we
>     > do).
>
>     [snip]
>
>     > This set of rules, without anything more, implies we have both
>     correcntess
>     > and optimization issues :)
>
>     The semantic in "pointer aliasing rules" is already implemented --
>     BasicAA returns NoAlias for %p1 and %gep even though @subtract could
>     have returned an offset that will guarantee that at runtime %p1 ==
>     %gep:
>
> This is definitely news to me.

Sanjoy's description of the current behavior is accurate. I see two ways 
of looking at this:

  1. This is a bug.
  2. To allow this kind of inter-object addressing you need to use 
inttoptr/ptrtoint.

I don't believe that it is possible for the current behavior to manifest 
as a problem for C/C++ (and this is likely the same for most 
higher-level languages).

  -Hal

>
>     ```
>     target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>     target triple = "x86_64-unknown-linux-gnu"
>
>     define void @f(i64 %arg, i64 %arg1) {
>     bb:
>       %p0 = tail call noalias i8* @malloc(i64 %arg) #3
>       %p1 = tail call noalias i8* @malloc(i64 %arg) #3
>       %offset = tail call i64 @subtract(i8* %p1, i8* %p0)
>       %gep = getelementptr i8, i8* %p0, i64 %offset  ;; not inbounds
>       ret void
>     }
>
>     declare noalias i8* @malloc(i64)
>     declare i64 @subtract(i8*, i8*)
>     ```
>
>     I believe this follows from the GetUnderlyingObject and
>     isIdentifiedObject related logic in BasicAAResult::aliasCheck.
>
>     And I'm not aware of any transforms in LLVM today that contradict the
>     pointer aliasing rules.
>
>
> I assume you meant "do the wrong thing here". Which i still believe is 
> definitely wrong, FWIW, but i'm too lazy to go searching through 
> bugzilla some more today, so i'm going to let it go.
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170903/be9efae9/attachment.html>


More information about the llvm-commits mailing list