[LLVMdev] Question about intrinsic function llvm.objectsize

Arnold Schwaighofer aschwaighofer at apple.com
Wed Feb 27 07:41:00 PST 2013


On Feb 27, 2013, at 4:05 AM, Nuno Lopes <nunoplopes at sapo.pt> wrote:

> Hi,
> 
> Regarding the definition of object for @llvm.objectsize, it is identical to gcc's __builtin_object_size().  So it's not wrong; it's just the way it was defined to be.
> 
> Regarding the BasicAA's usage of these functions, I'm unsure.  It seems to me that isObjectSmallerThan() also expects the same definition, but I didn't review the code carefully.
> When you do a load from a certain memory address, basicAA is interested to know if that load will overflow the buffer bounds, which means that no aliasing can occur (or it's an UB operation).
> Therefore I don't think basicAA cares about the size of the whole object, but just the remaining part of it (size-offset).  But again, I could be wrong here.

No, it cares about the size of the whole object in the code quoted below.
const Value *O1 = GetUnderlyingObject(V1, TD);

But that is fine. See answers below.

> 
> 
> Quoting Shuxin Yang <shuxin.llvm at gmail.com>:
> 
>> Hi,
>> 
>>  In the following instruction sequence, llvm.objectsize.i64(p) returns 6 (the entire *.ll is attached to the mail).
>> Is this correct? Shouldn't the "object" refer to the entire block of memory being allocated?
>> 

No, the "llvm.objectsize" intrinsic is defined to return the size of the object pointed to by its first argument, in your case "p+50".
To implement the llvm.objectsize intrinsic we call getObjectSize on the "p+50" pointer.

See http://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html for intended use.
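
As a rough illustration of that intended use (a sketch under assumptions, not code taken from GCC or LLVM): a fortified copy compares the requested length against the number of bytes remaining after the destination pointer. The names `fitsInDestination`, `kUnknownObjectSize` and `CHECKED_MEMCPY` are made up for this example; `__builtin_object_size` is the GCC/Clang builtin, and what it returns depends on what the compiler can prove at that call site.

```cpp
#include <cstddef>
#include <cstring>

// For "type 0" queries, __builtin_object_size yields (size_t)-1 when the
// size cannot be determined at compile time.
constexpr std::size_t kUnknownObjectSize = static_cast<std::size_t>(-1);

// Hypothetical helper: true if a write of n bytes fits into the `remaining`
// bytes after the destination pointer, or if the size is unknown (in which
// case no check is possible and the write is allowed).
constexpr bool fitsInDestination(std::size_t remaining, std::size_t n) {
  return remaining == kUnknownObjectSize || n <= remaining;
}

// Hypothetical macro. The builtin has to see the pointer expression at the
// call site, which is why this is a macro and not a function. It copies and
// yields true unless the write is provably out of bounds.
#define CHECKED_MEMCPY(dst, src, n)                            \
  (fitsInDestination(__builtin_object_size((dst), 0), (n))     \
       ? (std::memcpy((dst), (src), (n)), true)                \
       : false)
```

With "p = malloc(56)", a check on "p+50" would accept a write of up to 6 bytes, assuming the compiler can see the allocation; otherwise the size is reported as unknown and the write is not checked.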

>>  (char*) p = malloc(56)
>>  llvm.objectsize.i32(p+50);
>> 
>> Thanks
>> Shuxin
>> 
>> 
>> 
>>  This question is related to PR14988 (failure in bootstrap build with LTO).
>> Part of the reason is that
>> the compiler interprets the pretty vague term "object" in different ways -- the "object" suggested by
>> line 359 @ figure 1 is just a part of the "object" implied by Figure 2.
>> 

Well, it depends on which object you are talking about: an object pointed to by a pointer, or an "underlying object". Both are valid things to talk about in different contexts.

>>  I tried to fix PR14988 by replacing line 359 with "Size = Offset.getZExtValue()"; it does fix the problem.

That does not sound right. The size of an object pointed to by a pointer is the size of the underlying object minus any offsets.

>> However, I'm wondering in what situations the Offset != 0. So I put an assertion right before line 359,

The offset is nonzero if the "Ptr" passed does not point to the start of an underlying object.

>> saying "assert(Offset != 0)". It catches two cases in SingleSource test suite. I investigate one of them,
>> the assertion is triggered when the llvm::getObjectSize() is called by instcombine (instead of alias analyzer)

Yes, this observation makes sense. In this case there is an offset (you are passing "ptr+offset") and you want this behavior, because llvm.objectsize is defined that way. See below.

>> in an attempt to replace llvm.objectsize() with a constant. I think the way llvm::getObjectSize() interprets "object" is wrong.
>> 

No, it is behaving as expected depending on where you call it from. See below.
>> 
>> Figure 1
>> cat -n lib/Analysis/MemoryBuiltins.cpp
>>  344 bool llvm::getObjectSize(const Value *Ptr, uint64_t &Size, const DataLayout *TD,
>>  345                          const TargetLibraryInfo *TLI, bool RoundToAlign) {
>>  346   if (!TD)
>>  347     return false;
>>  348
>>  349   ObjectSizeOffsetVisitor Visitor(TD, TLI, Ptr->getContext(), RoundToAlign);
>>  350   SizeOffsetType Data = Visitor.compute(const_cast<Value*>(Ptr));
>>  351   if (!Visitor.bothKnown(Data))
>>  352     return false;
>>  353
>>  354   APInt ObjSize = Data.first, Offset = Data.second;
>>  355   // check for overflow
>>  356   if (Offset.slt(0) || ObjSize.ult(Offset))
>>  357     Size = 0;
>>  358   else
>>  359     Size = (ObjSize - Offset).getZExtValue(); ??? What the hack is "object"??????
>>  360   return true;
>>  361 }
>> 

The object is the object pointed to by "Ptr". Depending on where you call the "getObjectSize" function from, this can be one of two different things: an underlying object, or an object "based on" (http://llvm.org/docs/LangRef.html#pointer-aliasing-rules) an underlying object.

In the "llvm.objectsize" context we pass a pointer "based on p" to getObjectSize: "p+50". In the basicaa context, we want to know whether an access goes beyond the bounds of an underlying object (undefined behavior land), so we pass the underlying object (which in your example would be the "p" returned from malloc) to the getObjectSize function.

In the first case (passing "p+50" to getObjectSize), ObjSize should be 56 and the Offset will be 50, yielding 6. In the second case, ObjSize will be 56 and the Offset is zero, because basicaa passed the underlying object "p".
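
To make the two cases concrete, here is a small standalone model of the arithmetic in Figure 1. It is only a sketch: `SizeOffset` and `remainingBytes` are hypothetical stand-ins that use plain integers in place of the APInt pair the visitor returns.

```cpp
#include <cstdint>

// Hypothetical stand-in for the (size, offset) pair computed by
// ObjectSizeOffsetVisitor in Figure 1; this is not an LLVM type.
struct SizeOffset {
  std::uint64_t ObjSize; // bytes in the underlying allocation
  std::int64_t Offset;   // signed byte offset of the pointer into it
};

// Same arithmetic as lines 356-359 of Figure 1: the number of bytes from the
// pointer to the end of the allocation, or 0 if the pointer lies before the
// start or past the end of it.
constexpr std::uint64_t remainingBytes(SizeOffset D) {
  return (D.Offset < 0 || D.ObjSize < static_cast<std::uint64_t>(D.Offset))
             ? 0
             : D.ObjSize - static_cast<std::uint64_t>(D.Offset);
}
```

Under this model, remainingBytes({56, 50}) is 6 (the llvm.objectsize case) and remainingBytes({56, 0}) is 56 (the basicaa case, where the underlying object itself is passed).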

I have to agree, though, that the term "object" is somewhat confusing.

>> Figure 2
>> cat -n lib/Analysis/BasicAliasAnalysis.cpp
>> 1205   // If the size of one access is larger than the entire object on the other
>> 1206   // side, then we know such behavior is undefined and can assume no alias.
>> 1207   if (TD)
>> 1208     if ((V1Size != UnknownSize && isObjectSmallerThan(O2, V1Size, *TD, *TLI)) ||
>> 1209         (V2Size != UnknownSize && isObjectSmallerThan(O1, V2Size, *TD, *TLI)))
>> 1210       return NoAlias;
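
The check in Figure 2 can be modeled in the same simplified way. This is a sketch that assumes isObjectSmallerThan() boils down to "the size of the whole object is known and is less than the access size"; `WholeObject`, `objectSmallerThan`, `provablyNoAlias` and `kUnknownAccessSize` are hypothetical names, not LLVM API.

```cpp
#include <cstdint>

// Stand-in for UnknownSize in Figure 2: the access size is not known.
constexpr std::uint64_t kUnknownAccessSize = ~static_cast<std::uint64_t>(0);

// Size of an entire underlying object (offset zero), if it could be computed.
struct WholeObject {
  bool SizeKnown;
  std::uint64_t Size;
};

// Assumed simplification of isObjectSmallerThan(): the whole object is
// known to be smaller than the access.
constexpr bool objectSmallerThan(WholeObject O, std::uint64_t AccessSize) {
  return O.SizeKnown && O.Size < AccessSize;
}

// Mirrors the condition at lines 1208-1209: an access larger than the whole
// object on the other side would be undefined, so no alias can be assumed.
constexpr bool provablyNoAlias(std::uint64_t V1Size, WholeObject O1,
                               std::uint64_t V2Size, WholeObject O2) {
  return (V1Size != kUnknownAccessSize && objectSmallerThan(O2, V1Size)) ||
         (V2Size != kUnknownAccessSize && objectSmallerThan(O1, V2Size));
}
```

For example, a 32-byte access cannot overlap a 16-byte object without running past its end, so the model returns true for that pair, while two 8-byte accesses into 56-byte and 16-byte objects are not ruled out. The sizes fed in here are whole-object sizes, which is what this check expects.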

Best,
Arnold


