[llvm-dev] GEP with a null pointer base
James Y Knight via llvm-dev
llvm-dev at lists.llvm.org
Thu Jul 6 13:19:19 PDT 2017
So, I actually think there's two separate parts of this.
1. Pointer math is required to stay within an object if the GEP has the
"inbounds" attribute. Nothing is inbounds of the null pointer, so a GEP
instruction with a null pointer, and non-zero offsets, is invalid on its
face, even without the load.
Now, you can tell clang to stop putting inbounds on pointer arithmetic GEPs
by using -fwrapv or -fno-strict-overflow. Doing that doesn't fix the issue,
because it's not actually the GEP that's being optimized in this particular
case.
2. Separately, LLVM's pointer aliasing rules prohibit (even without the GEP
marked inbounds) constructing a pointer to a random object with a GEP. (
http://llvm.org/docs/LangRef.html#pointeraliasing). This permission is used
in InstCombineLoadStoreAlloca.cpp canSimplifyNullLoadOrGEP to eliminate the
load based on a null pointer GEP, despite the non-zero offset to a
not-inbounds GEP.
To the first, asking people who do crazy stuff like this to compile with
-fno-strict-overflow is reasonable...they're invoking C-level undefined
behavior by doing arbitrary pointer-math, so they should use the flag that
tells the compiler to make arbitrary pointer-math not be UB.
And to the second, it seems to me that it would be fairly reasonable change
to make the latter optimization dependent on the GEP being marked inbounds.
(Even if the alias analysis still doesn't promise you anything here.)
On Thu, Jul 6, 2017 at 3:17 PM, Kaylor, Andrew <andrew.kaylor at intel.com>
wrote:
> > glibc does accept patches...or are you talking about two separate
> instances of this problem, both in glibc and something else?
>
>
>
> I originally saw this in a benchmark (which it may be possible to get
> changed) and only afterward found the glibc idiom.
>
>
>
> The most recent glibc code is a bit more complicated than I represented
> below. If you look up obstack.h you can see what’s there now. Basically,
> they’re computing a pointer alignment that may be based on some non-null
> base pointer or may be based on null, depending on the target
> architecture. I’m sure it’s possible to make it more sensible, but I’ve
> never interacted with the glibc community so I’m not sure how open they
> would be to changing this. I suspect I’d need a more compelling argument
> than clang-compatibility (or maybe even standards compatibility).
>
>
>
> *From:* James Y Knight [mailto:jyknight at google.com]
> *Sent:* Thursday, July 06, 2017 12:06 PM
> *To:* Kaylor, Andrew <andrew.kaylor at intel.com>
> *Cc:* llvm-dev <llvm-dev at lists.llvm.org>
> *Subject:* Re: [llvm-dev] GEP with a null pointer base
>
>
>
>
>
>
>
> On Thu, Jul 6, 2017 at 2:06 PM, Kaylor, Andrew via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> I've got a problem that I would like some input on. The problem basically
> boils down to a program that I am compiling, whose source I don't control,
> doing something like this:
>
>
>
> p = (char*)0 + n
>
>
>
> where 'n' is an intptr_t-sized value that the program knows is actually a
> valid address for a pointer.
>
>
>
> clang translates this as
>
>
>
> %p = getelementptr inbounds i8, i8* null, i64 %n
>
>
>
> So far, so good. The problem is that while LLVM seems to consider the
> above IR to be valid, we officially do not allow dereferencing a pointer
> constructed in this way (if I’m reading the rules correctly).
> Consequently, if this GEP ever gets close enough to a load using the
> pointer, InstCombine will eliminate the GEP and the load.
>
>
>
> I've been told that the '(char*)0 + n' construct is invalid according to
> the C standard. However, this pattern appears in the glibc malloc
> implementation, so I'd like to be able to handle it anyway.
>
>
>
> glibc does accept patches...or are you talking about two separate
> instances of this problem, both in glibc and something else?
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170706/62e8b22f/attachment.html>
More information about the llvm-dev
mailing list