[llvm-dev] Is it ok to allocate > half of address space?
Alexander Cherepanov via llvm-dev
llvm-dev at lists.llvm.org
Wed Nov 8 11:26:40 PST 2017
On 11/08/2017 08:24 PM, Nuno Lopes via llvm-dev wrote:
> I was looking into the semantics of GEP inbounds and some BasicAA rules
> and I'm wondering if it's valid in LLVM IR to allocate more than half of
> the address space with a global variable or an alloca.
> If that's a scenario we want to consider, then we have problems :)
>
> Consider this C code (32 bits):
> #include <string.h>
>
> char obj[0x80000008];
>
> char f() {
> char *p = obj + 0x79999999;
I guess you mean 0x7fffffff here.
> char *q = obj + 0x80000000;
> *q = 1;
> memcpy(p, "abcd", 4);
> return *q;
> }
>
>
> Clearly the stores alias, and the memcpy should overwrite the value
> written by "*q = 1".
>
> I dunno if this is legal in C or not, but the IR produced by clang looks
> like (32 bits):
>
> @obj = common global [2147483656 x i8] zeroinitializer, align 1
>
> define signext i8 @f() {
> store i8 1, i8* getelementptr inbounds (i8, i8* getelementptr
> inbounds ([2147483656 x i8], [2147483656 x i8]* @obj, i32 0, i32 0), i32
> -2147483648), align 1
> call void @llvm.memcpy.p0i8.p0i8.i32(i8* getelementptr inbounds
> ([2147483656 x i8], [2147483656 x i8]* @obj, i32 0, i32 2040109465), i8*
> getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i32 0, i32 0), i32 4,
> i32 1, i1 false)
> %1 = load i8, i8* getelementptr inbounds (i8, i8* getelementptr
> inbounds ([2147483656 x i8], [2147483656 x i8]* @obj, i32 0, i32 0), i32
> -2147483648), align 1
> ret i8 %1
> }
>
> With -O2, the store to q gets forwarded, and so we get "ret i8 1".
> So, BasicAA concluded that p and q don't alias. The culprit is an
> overflow in BasicAAResult::isGEPBaseAtNegativeOffset().
>
> So my question is do we care about this use case where a single
> allocation can take more than half of the address space?
Yeah, I'm curious about it too. One of the complications is that the
compiler doesn't control the whole situation: the size of the allocation
could be read by the program from outside, and the allocation could be
done by libc (glibc will happily allocate more than half of the
address space).
There is a good discussion of various related topics in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999 .
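
To make the offsets in the quoted example concrete, here is a minimal C
sketch of the arithmetic involved. It is not LLVM code and not what BasicAA
actually does; the helper names as_i32 and ranges_overlap are made up for
illustration. It only shows how the 32-bit offsets look when read as signed
GEP indices, and which byte ranges overlap when they are read as unsigned
positions inside the object.

```c
#include <stdint.h>

/* Hypothetical helper: reinterpret a 32-bit unsigned offset as a
   two's-complement signed value, the way an i32 GEP index is read.
   Written without relying on implementation-defined conversions. */
static int32_t as_i32(uint32_t off) {
    if (off >= 0x80000000u)
        return (int32_t)(off - 0x80000000u) + INT32_MIN;
    return (int32_t)off;
}

/* Hypothetical helper: do the byte ranges [a, a+alen) and [b, b+blen)
   overlap when offsets are treated as unsigned positions inside one
   object? 64-bit arithmetic avoids wrap-around in the end points. */
static int ranges_overlap(uint32_t a, uint32_t alen,
                          uint32_t b, uint32_t blen) {
    uint64_t a0 = a, a1 = a0 + alen;
    uint64_t b0 = b, b1 = b0 + blen;
    return a0 < b1 && b0 < a1;
}
```

as_i32(0x80000000u) gives -2147483648 and as_i32(0x79999999u) gives
2040109465, matching the indices in the quoted IR. With a 4-byte memcpy at p
and a 1-byte store at q, ranges_overlap(0x79999999u, 4, 0x80000000u, 1) is 0,
while ranges_overlap(0x7fffffffu, 4, 0x80000000u, 1) is 1, so the accesses
only alias with the 0x7fffffff offset suggested above.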
--
Alexander Cherepanov