[llvm-dev] Is it ok to allocate > half of address space?
Friedman, Eli via llvm-dev
llvm-dev at lists.llvm.org
Wed Nov 8 14:06:19 PST 2017
On 11/8/2017 9:24 AM, Nuno Lopes via llvm-dev wrote:
> Hi,
>
> I was looking into the semantics of GEP inbounds and some BasicAA
> rules and I'm wondering if it's valid in LLVM IR to allocate more than
> half of the address space with a global variable or an alloca.
> If that's a scenario we want to consider, then we have problems :)
>
> Consider this C code (32 bits):
> #include <string.h>
>
> char obj[0x80000008];
>
> char f() {
>   char *p = obj + 0x79999999;
>   char *q = obj + 0x80000000;
>   *q = 1;
>   memcpy(p, "abcd", 4);
>   return *q;
> }
>
>
> Clearly the stores alias, and the memcpy should override the value
> written by "*q = 1".
>
> I dunno if this is legal in C or not, but the IR produced by clang
> looks like (32 bits):
>
> @obj = common global [2147483656 x i8] zeroinitializer, align 1
>
> define signext i8 @f() {
> store i8 1, i8* getelementptr inbounds (i8, i8* getelementptr
> inbounds ([2147483656 x i8], [2147483656 x i8]* @obj, i32 0, i32 0),
> i32 -2147483648), align 1
> call void @llvm.memcpy.p0i8.p0i8.i32(i8* getelementptr inbounds
> ([2147483656 x i8], [2147483656 x i8]* @obj, i32 0, i32 2040109465),
> i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i32 0, i32 0),
> i32 4, i32 1, i1 false)
> %1 = load i8, i8* getelementptr inbounds (i8, i8* getelementptr
> inbounds ([2147483656 x i8], [2147483656 x i8]* @obj, i32 0, i32 0),
> i32 -2147483648), align 1
> ret i8 %1
> }
>
> With -O2, the store to q gets forwarded, and so we get "ret i8 1".
> So, BasicAA concluded that p and q don't alias. The culprit is an
> overflow in BasicAAResult::isGEPBaseAtNegativeOffset().
>
> So my question is: do we care about this use case, where a single
> allocation can take more than half of the address space?
>
According to LangRef, your IR currently has undefined behavior: the rules
for "inbounds" GEPs say that indexes are treated as signed values. And
solving that would involve changing the way we represent GEPs in IR, so
I think you can consider that out of scope.
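To make the signed-index point concrete, here is a minimal C sketch of how
a 32-bit offset looks once it is read as a signed GEP index. The helper
names as_signed_index and check_examples are hypothetical and used only for
illustration; the sketch assumes a two's-complement int32_t and says nothing
about how clang or BasicAA compute offsets internally.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the bits of a 32-bit unsigned offset
 * as the signed 32-bit value an index operand would carry.
 * memcpy is used so the bit pattern is preserved without relying on an
 * implementation-defined unsigned-to-signed conversion. */
int32_t as_signed_index(uint32_t offset) {
    int32_t idx;
    memcpy(&idx, &offset, sizeof idx);
    return idx;
}

/* Hypothetical self-check using the two offsets from the quoted example. */
void check_examples(void) {
    /* obj + 0x79999999 stays positive: matches "i32 2040109465" in the IR. */
    assert(as_signed_index(0x79999999u) == 2040109465);
    /* obj + 0x80000000 wraps negative: matches "i32 -2147483648" in the IR. */
    assert(as_signed_index(0x80000000u) == INT32_MIN);
}
```

Calling check_examples() passes on a two's-complement target: the offset
used for q is past the signed 32-bit range, so as an index it becomes a
negative value rather than a large positive one.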
Assuming we're not dealing with inbounds GEPs (e.g. you pass -fwrapv to
clang), I don't see any particular reason to disallow allocations of more
than half the address space.
-Eli
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project