[PATCH] D68342: [Analysis] Don't assume that overflow can't happen in EmitGEPOffset

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 2 11:20:30 PDT 2019


lebedev.ri added a comment.

In D68342#1691781 <https://reviews.llvm.org/D68342#1691781>, @miyuki wrote:

> >> I also thought that "in bounds address of an allocated object" has something to do with the type used in the GEP instruction, but that's not how Clang interprets it.
> >>  E.g. for the following code
> >> 
> >>   int read(int *buf) {
> >>     buf -= 2;
> >>     return *buf;
> >>   }
> >>    
> >> 
> >> It generates the following:
> >> 
> >>   define dso_local i32 @_Z4readPi(i32* nocapture readonly %buf) local_unnamed_addr #0 {
> >>   entry:
> >>     %add.ptr = getelementptr inbounds i32, i32* %buf, i64 -2
> >>     %0 = load i32, i32* %add.ptr, align 4, !tbaa !2
> >>     ret i32 %0
> >>   }
> > 
> > I do not understand this point, could you elaborate please?
>
> Nothing in the code implies that `buf` points to a single i32 value rather than to an element in an array of i32, but Clang nevertheless adds `inbounds`.


I don't see how that follows.
Quote from http://eel.is/c++draft/expr.add#4:

  4     When an expression J that has integral type is added to or subtracted
        from an expression P of pointer type, the result has the type of P.
  (4.1) If P evaluates to a null pointer value and J evaluates to 0,
        the result is a null pointer value.
  (4.2) Otherwise, if P points to an array element i of an array object x with n
        elements ([dcl.array]), the expressions P + J and J + P
        (where J has the value j) point to the (possibly-hypothetical) array
        element i+j of x if 0≤i+j≤n and the expression P - J points to the 
        (possibly-hypothetical) array element i−j of x if 0≤i−j≤n.
  (4.3) Otherwise, the behavior is undefined.

(See also C `6.5.6p8`.)

This maps quite precisely onto the LangRef wording:

> "If the inbounds keyword is present, the result value of the getelementptr is a poison value if the base pointer is not an in bounds address of an allocated object <...>"

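So clang is perfectly correct here: per (4.2)/(4.3), `buf -= 2` is only defined if the result still points into (or one past the end of) the same array object, so `inbounds` does not assume anything the language doesn't already require.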
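
To make the (4.2) condition concrete, here is a minimal sketch. `sub_is_defined`, `read_two_back` (the quoted `read`, renamed) and `read_from_third_element` are names made up for illustration; none of this is Clang or LLVM code, it only restates the bounds check from the standard quote above.

```cpp
#include <cstddef>

// Hypothetical helper restating [expr.add]/4.2: with P pointing at element i
// of an n-element array, P - j is defined only when 0 <= i - j <= n.
constexpr bool sub_is_defined(std::ptrdiff_t i, std::ptrdiff_t j,
                              std::ptrdiff_t n) {
    return 0 <= i - j && i - j <= n;
}

// Same body as the quoted `read` function.
int read_two_back(int *buf) {
    buf -= 2;
    return *buf;
}

// A caller for which `buf -= 2` is well defined: buf points at element 2
// of a 4-element array, so the result points at element 0.
int read_from_third_element() {
    int data[4] = {10, 20, 30, 40};
    return read_two_back(&data[2]);
}

// i = 2, j = 2, n = 4: the result is element 0, which is in bounds.
static_assert(sub_is_defined(2, 2, 4), "element 2 minus 2 stays in bounds");
// A lone int counts as an array of one element for pointer arithmetic
// (i = 0, n = 1), so subtracting 2 is undefined.
static_assert(!sub_is_defined(0, 2, 1), "single object minus 2 is undefined");
```

The only callers with defined behavior are those of the first kind, which is all that `inbounds` relies on.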

> If InstCombine tried to get an offset from the `getelementptr inbounds i32, i32* %buf, i64 -2` instruction  it would generate a `mul nuw i64 4, -2` instruction, which wraps.
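
The wrap in the quoted `mul nuw i64 4, -2` can be checked with plain unsigned 64-bit arithmetic. This is a sketch only: `mul_wraps_unsigned` and `wrapped_offset` are hypothetical helpers, not what EmitGEPOffset does internally.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: true if a * b does not fit in 64 unsigned bits,
// i.e. the case that the `nuw` flag promises cannot happen.
constexpr bool mul_wraps_unsigned(std::uint64_t a, std::uint64_t b) {
    return a != 0 && b > std::numeric_limits<std::uint64_t>::max() / a;
}

constexpr std::uint64_t kElemSize = 4;  // size of i32 in bytes
constexpr std::int64_t kIndex = -2;     // GEP index from the example

// The same bits viewed as unsigned: 0xFFFFFFFFFFFFFFFE.
constexpr std::uint64_t kIndexBits = static_cast<std::uint64_t>(kIndex);

// Unsigned multiplication is defined to wrap modulo 2^64, so this is the
// value the multiply actually produces.
constexpr std::uint64_t wrapped_offset() {
    return kElemSize * kIndexBits;
}

static_assert(mul_wraps_unsigned(kElemSize, kIndexBits),
              "4 * (uint64_t)-2 wraps, so nuw would be violated");
static_assert(!mul_wraps_unsigned(kElemSize, 2),
              "4 * 2 does not wrap");
```

The wrapped product has the bit pattern of -8, so the signed offset is still the expected one. It is only the unsigned no-wrap claim that fails.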




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68342/new/

https://reviews.llvm.org/D68342
