[llvm-dev] Is it ok to allocate > half of address space?

Alexander Cherepanov via llvm-dev llvm-dev at lists.llvm.org
Wed Nov 8 11:26:40 PST 2017


On 11/08/2017 08:24 PM, Nuno Lopes via llvm-dev wrote:
> I was looking into the semantics of GEP inbounds and some BasicAA rules 
> and I'm wondering if it's valid in LLVM IR to allocate more than half of 
> the address space with a global variable or an alloca.
> If that's a scenario we want to consider, then we have problems :)
> 
> Consider this C code (32 bits):
> #include <string.h>
> 
> char obj[0x80000008];
> 
> char f() {
>    char *p = obj + 0x79999999;

I guess you mean 0x7fffffff here.

>    char *q = obj + 0x80000000;
>    *q = 1;
>    memcpy(p, "abcd", 4);
>    return *q;
> }
> 
> 
> Clearly the stores alias, and the memcpy should overwrite the value 
> written by "*q = 1".
> 
> I dunno if this is legal in C or not, but the IR produced by clang looks 
> like (32 bits):
> 
> @obj = common global [2147483656 x i8] zeroinitializer, align 1
> 
> define signext i8 @f() {
>    store i8 1, i8* getelementptr inbounds (i8, i8* getelementptr 
> inbounds ([2147483656 x i8], [2147483656 x i8]* @obj, i32 0, i32 0), i32 
> -2147483648), align 1
>    call void @llvm.memcpy.p0i8.p0i8.i32(i8* getelementptr inbounds 
> ([2147483656 x i8], [2147483656 x i8]* @obj, i32 0, i32 2040109465), i8* 
> getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i32 0, i32 0), i32 4, 
> i32 1, i1 false)
>    %1 = load i8, i8* getelementptr inbounds (i8, i8* getelementptr 
> inbounds ([2147483656 x i8], [2147483656 x i8]* @obj, i32 0, i32 0), i32 
> -2147483648), align 1
>    ret i8 %1
> }
> 
> With -O2, the store to q gets forwarded, and so we get "ret i8 1".
> So, BasicAA concluded that p and q don't alias. The culprit is an 
> overflow in BasicAAResult::isGEPBaseAtNegativeOffset().
> 
> So my question is do we care about this use case where a single 
> allocation can take more than half of the address space?
Yeah, I'm curious about it too. One of the complications is that the 
compiler doesn't control the whole situation: the size of the allocation 
could be read by the program from outside, and the allocation could be 
done by libc (and glibc will happily allocate more than half the 
address space).
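
A minimal sketch of that situation, assuming a size that only arrives at
run time. The helper names exceeds_ptrdiff_range and alloc_from_text are
hypothetical (they are not part of glibc or LLVM), and whether malloc
succeeds for such a size depends on the libc and the platform:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from glibc or LLVM): true when an object of
   `size` bytes is larger than the largest value ptrdiff_t can hold, so the
   difference between two pointers into it may be unrepresentable. */
static int exceeds_ptrdiff_range(size_t size) {
    return size > (size_t)PTRDIFF_MAX;
}

/* Hypothetical wrapper: parse a size that arrives at run time (for example
   from argv or a file), so the compiler never sees it, and hand it to the
   libc allocator. Sets *too_big when the request is above PTRDIFF_MAX.
   Whether malloc succeeds is up to the libc and the platform. */
static void *alloc_from_text(const char *text, int *too_big) {
    assert(text != NULL);
    assert(too_big != NULL);
    size_t size = (size_t)strtoull(text, NULL, 0);
    *too_big = exceeds_ptrdiff_range(size);
    return malloc(size);
}
```

On a typical 32-bit target, alloc_from_text("0x80000008", &too_big) asks
for the same size as obj above and sets too_big to 1. The caller still has
to check the result for NULL and free() it.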

There is a good discussion of various related topics in 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999 .
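
To make the offset problem in the quoted IR concrete, here is a toy model
of how the answer to "does q fall inside [p, p+len)?" changes once 32-bit
offsets are read as signed i32 values (0x80000000 becomes -2147483648, as
in the GEP above). This is only an illustration under that assumption, not
the logic of BasicAAResult::isGEPBaseAtNegativeOffset(), and all function
names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a 32-bit unsigned byte offset as the signed i32 index a GEP
   would carry, using explicit arithmetic rather than a cast so that the
   result does not depend on implementation-defined conversion. */
static int64_t as_signed_i32(uint32_t offset) {
    int64_t v = (int64_t)offset;
    return v > INT32_MAX ? v - ((int64_t)1 << 32) : v;
}

/* Hypothetical model of the question "do [p, p+len) and the byte at q
   overlap?", with p and q given as unsigned offsets from the same base. */
static int overlaps_unsigned(uint32_t p, uint32_t len, uint32_t q) {
    assert(len > 0);
    return (uint64_t)q >= (uint64_t)p && (uint64_t)q < (uint64_t)p + len;
}

/* The same question asked with the offsets treated as signed i32 values. */
static int overlaps_signed(uint32_t p, uint32_t len, uint32_t q) {
    int64_t sp = as_signed_i32(p);
    int64_t sq = as_signed_i32(q);
    return sq >= sp && sq < sp + (int64_t)len;
}
```

With p = 0x7fffffff, len = 4 and q = 0x80000000, overlaps_unsigned gives 1
while overlaps_signed gives 0. With p = 0x79999999 as written in the quoted
code, both give 0, which fits the guess above that 0x7fffffff was meant.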

-- 
Alexander Cherepanov

