[llvm-dev] getelementptr inbounds with offset 0

Robin Kruppe via llvm-dev llvm-dev at lists.llvm.org
Mon Feb 25 04:50:02 PST 2019

On Mon, 25 Feb 2019 at 13:11, Bruce Hoult via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> LLVM has no idea whether the address computed by GEP is actually
> within a legal object. The "inbounds" keyword is just you, the
> programmer, promising LLVM that you know it's ok and that you don't
> care what happens if it is actually out of bounds.
> https://llvm.org/docs/GetElementPtr.html#what-happens-if-an-array-index-is-out-of-bounds

Hi Bruce,

It's not true in general that LLVM has no idea about (or doesn't care
about) object sizes. It can infer object sizes and related facts from
allocas, global variables, and calls to built-in functions such as
malloc(). In the case of Rust we even have an out-of-tree patch to teach
LLVM the same for Rust's (global) heap allocation functions. You can see
this information being computed in lib/Analysis/MemoryBuiltins.cpp.

More importantly, the question is *what* exactly is being promised to
LLVM, and more specifically, how the terms "out of bounds" and "object"
are defined in this context. In many specific cases it is easy enough to
answer intuitively whether a GEP should be considered "out of bounds",
but in the cases Ralf described, where offsets and "object sizes" are equal
to 0, it is not so clear-cut and depends on tricky matters such as whether
zero-sized allocations exist. We (Rust developers) very much care what
happens in those cases (the GEP should be a no-op), so it's important to
check whether that is compatible with the Rust compiler emitting inbounds
GEPs.
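
To make the zero-sized case concrete, here is a minimal Rust sketch. The
type `Zst` and the helpers `dangling_zst` and `offset_by_zero` are made up
for illustration and are not part of the compiler or of any patch. It
deliberately uses `wrapping_add`, which makes no in-bounds promise, so it
only shows the kind of pointer involved and does not settle what an
inbounds GEP on it means.

```rust
use std::mem::{align_of, size_of};
use std::ptr::NonNull;

// Hypothetical zero-sized type with alignment 4, chosen to mirror the
// address `4` from Ralf's example.
#[repr(align(4))]
#[derive(Debug, Clone, Copy, PartialEq)]
struct Zst;

// Hypothetical helper: a non-null, aligned pointer that is not backed by
// any allocation.
fn dangling_zst() -> NonNull<Zst> {
    NonNull::dangling()
}

// Hypothetical helper: offset by zero elements. `wrapping_add` makes no
// in-bounds promise, so it sidesteps the question raised in this thread.
fn offset_by_zero(p: *const Zst) -> *const Zst {
    p.wrapping_add(0)
}

fn main() {
    // The type occupies no bytes, so an "object" of it has size 0.
    assert_eq!(size_of::<Zst>(), 0);

    let p = dangling_zst().as_ptr() as *const Zst;
    assert_eq!(p as usize % align_of::<Zst>(), 0);

    // Offsetting by zero must leave the pointer unchanged.
    let q = offset_by_zero(p);
    assert_eq!(p, q);

    // A read of a zero-sized type touches no bytes of memory.
    let value = unsafe { q.read() };
    println!("{:?} at {:p}", value, q);
}
```

The open question is whether the same pattern stays a no-op when the
offset is expressed as `getelementptr inbounds` in the emitted IR.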

It is true that in practice in many cases LLVM won't be able to determine
conclusively whether an object exists or not and what its bounds are, but
that doesn't answer the question.


> On Sun, Feb 24, 2019 at 9:05 AM Ralf Jung via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> >
> > Hi all,
> >
> > What exactly are the rules for `getelementptr inbounds` with offset 0?
> >
> > In Rust, we are relying on the fact that if we use, for example, `inttoptr` to
> > turn `4` into a pointer, we can then do `getelementptr inbounds` with offset 0
> > on that without LLVM deducing that there actually is any dereferenceable memory
> > at location 4.  The argument is that we can think of there being a zero-sized
> > allocation. Is that a reasonable assumption?  Can something like this be
> > documented in the LangRef?
> >
> > Relatedly, how does the situation change if the pointer is not created "out of
> > thin air" from a fixed integer, but is actually a dangling pointer obtained
> > previously from `malloc` (or `alloca` or whatever)?  Is `getelementptr inbounds`
> > with offset 0 on such a pointer a NOP, or does it result in `poison`?  And if
> > that makes a difference, how does that square with the fact that, e.g., the
> > integer `0x4000` could well be inside such an allocation, but doing
> > `getelementptr inbounds` with offset 0 on that would fall under the first
> > question above?
> >
> > Kind regards,
> > Ralf
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev