[llvm-dev] Problem with __builtin_object_size when it depends on a condition

Thu Mar 17 04:06:18 PDT 2016

I'm not trying to add runtime checks. I still want to lower 
llvm.objectsize to a constant, just in a different place.

Here are the .ll dumps, which may explain better what I'm doing. This 
example is with inlining enabled. With inlining disabled, the same 
transformations happen everywhere except in main, which then only 
contains a call to foo().

For foo(), "Combine redundant instructions" computes the minimum or 
maximum size, depending on the second argument, and stores it in the 
third argument. The value has to be saved in the third argument, rather 
than folded immediately, so that the inliner still has a chance to 
eliminate the condition. If we replaced the call with a constant in 
foo(), we would get the wrong value once foo() is inlined into main.
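
To make the min/max rule concrete, here is a minimal C sketch of how the 
value stored in the third argument could be chosen for a select between 
two allocas. The helper name conservative_size and its parameters are 
hypothetical and only model the idea; this is not the actual patch and 
not an LLVM API.

```c
#include <stddef.h>

/* Hypothetical helper (not LLVM code): choose the conservative size for a
 * pointer that may refer to either of two objects.
 * want_min models the second argument of llvm.objectsize, which is true
 * for __builtin_object_size types 2 and 3 and false for types 0 and 1. */
static size_t conservative_size(size_t size_a, size_t size_b, int want_min)
{
    if (want_min)
        return size_a < size_b ? size_a : size_b; /* smallest candidate */
    return size_a > size_b ? size_a : size_b;     /* largest candidate */
}
```

For the two arrays in foo(), conservative_size(30, 10, 1) gives 10, which 
matches the i32 10 that appears as the third argument in the dump below.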

*** IR Dump After Simplify the CFG ***
; Function Attrs: nounwind readnone uwtable
define i64 @foo(i32 %flag) #0 {
entry:
   %chararray = alloca [30 x i8], align 16
   %chararray2 = alloca [10 x i8], align 1
   %0 = getelementptr inbounds [30 x i8], [30 x i8]* %chararray, i64 0, i64 0
   call void @llvm.lifetime.start(i64 30, i8* %0) #5
   %1 = getelementptr inbounds [10 x i8], [10 x i8]* %chararray2, i64 0, i64 0
   call void @llvm.lifetime.start(i64 10, i8* %1) #5
   %tobool = icmp eq i32 %flag, 0
   %cptr.0 = select i1 %tobool, i8* %0, i8* %1
   %2 = call i64 @llvm.objectsize.i64.p0i8.i32(i8* %cptr.0, i1 true, i32 0)
   call void @llvm.lifetime.end(i64 10, i8* %1) #5
   call void @llvm.lifetime.end(i64 30, i8* %0) #5
   ret i64 %2
}
*** IR Dump After Combine redundant instructions ***
; Function Attrs: nounwind readnone uwtable
define i64 @foo(i32 %flag) #0 {
entry:
   %chararray = alloca [30 x i8], align 16
   %chararray2 = alloca [10 x i8], align 1
   %0 = getelementptr inbounds [30 x i8], [30 x i8]* %chararray, i64 0, i64 0
   call void @llvm.lifetime.start(i64 30, i8* %0) #5
   %1 = getelementptr inbounds [10 x i8], [10 x i8]* %chararray2, i64 0, i64 0
   call void @llvm.lifetime.start(i64 10, i8* %1) #5
   %tobool = icmp eq i32 %flag, 0
   %cptr.0 = select i1 %tobool, i8* %0, i8* %1
   %2 = call i64 @llvm.objectsize.i64.p0i8.i32(i8* %cptr.0, i1 true, i32 10)
   call void @llvm.lifetime.end(i64 10, i8* %1) #5
   call void @llvm.lifetime.end(i64 30, i8* %0) #5
   ret i64 %2
}

For main, "Combine redundant instructions" computes the correct value 
(30), because foo() has been inlined and the condition is now 
eliminated. If we had not kept the llvm.objectsize call in foo(), the 
value would be 10.

*** IR Dump After Simplify the CFG ***
; Function Attrs: nounwind uwtable
define i32 @main() #3 {
entry:
   %chararray.i = alloca [30 x i8], align 16
   %0 = getelementptr inbounds [30 x i8], [30 x i8]* %chararray.i, i64 0, i64 0
   call void @llvm.lifetime.start(i64 30, i8* %0) #5
   %1 = call i64 @llvm.objectsize.i64.p0i8.i32(i8* %0, i1 true, i32 10) #5
   call void @llvm.lifetime.end(i64 30, i8* %0) #5
   %call1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0), i64 %1)
   ret i32 0
}
*** IR Dump After Combine redundant instructions ***
; Function Attrs: nounwind uwtable
define i32 @main() #3 {
entry:
   %call1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0), i64 30)
   ret i32 0
}

CodeGen Prepare will finally replace llvm.objectsize with a constant 
in function foo.

*** IR Dump After Partially inline calls to library functions ***
; Function Attrs: nounwind readnone uwtable
define i64 @foo(i32 %flag) #0 {
entry:
   %chararray = alloca [30 x i8], align 16
   %chararray2 = alloca [10 x i8], align 1
   %0 = getelementptr inbounds [30 x i8], [30 x i8]* %chararray, i64 0, i64 0
   call void @llvm.lifetime.start(i64 30, i8* %0) #5
   %1 = getelementptr inbounds [10 x i8], [10 x i8]* %chararray2, i64 0, i64 0
   call void @llvm.lifetime.start(i64 10, i8* %1) #5
   %tobool = icmp eq i32 %flag, 0
   %cptr.0 = select i1 %tobool, i8* %0, i8* %1
   %2 = call i64 @llvm.objectsize.i64.p0i8.i32(i8* %cptr.0, i1 true, i32 10)
   call void @llvm.lifetime.end(i64 10, i8* %1) #5
   call void @llvm.lifetime.end(i64 30, i8* %0) #5
   ret i64 %2
}
*** IR Dump After CodeGen Prepare ***
; Function Attrs: nounwind readnone uwtable
define i64 @foo(i32 %flag) #0 {
entry:
   %chararray = alloca [30 x i8], align 16
   %chararray2 = alloca [10 x i8], align 1
   %0 = bitcast [30 x i8]* %chararray to i8*
   call void @llvm.lifetime.start(i64 30, i8* %0) #5
   %1 = bitcast [10 x i8]* %chararray2 to i8*
   call void @llvm.lifetime.start(i64 10, i8* %1) #5
   %tobool = icmp eq i32 %flag, 0
   %cptr.0 = select i1 %tobool, i8* %0, i8* %1
   call void @llvm.lifetime.end(i64 10, i8* %1) #5
   call void @llvm.lifetime.end(i64 30, i8* %0) #5
   ret i64 10
}
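
The two-stage behaviour in the dumps above can be modelled with a small C 
sketch: early passes fold the call only when the pointer resolves to a 
single object, and the final lowering falls back to the saved value. The 
struct objsize_call and both functions are hypothetical names used for 
illustration, not LLVM data structures or the actual implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one llvm.objectsize call (not an LLVM type). */
struct objsize_call {
    bool   exact_known; /* pointer resolves to exactly one object */
    size_t exact_size;  /* meaningful only when exact_known is true */
    size_t hint;        /* conservative value kept in the third argument */
};

/* Early passes: fold only when the exact size is known, otherwise keep
 * the call so that later inlining can still remove the condition. */
static bool try_fold_early(const struct objsize_call *c, size_t *out)
{
    if (!c->exact_known)
        return false;
    *out = c->exact_size;
    return true;
}

/* Final lowering: a constant must be produced, so use the saved hint
 * when the exact size is still unknown. */
static size_t lower_final(const struct objsize_call *c)
{
    return c->exact_known ? c->exact_size : c->hint;
}
```

With { false, 0, 10 } (foo() not inlined) the early fold is skipped and 
the final lowering gives 10; with { true, 30, 10 } (foo() inlined into 
main) the early fold already gives 30.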

On 16.03.2016. 19:23, George Burgess IV wrote:
> From your email, ISTM that you're proposing that we insert runtime 
> checks to determine the object size, which breaks our guarantee to 
> always lower to a constant. Is this correct? If not, can you please 
> provide an example of the new IR for the function you gave?
>
> Either way, is there a reason that we can't just use the second flag 
> to objectsize here? That flag was designed with cases like this in mind:
>
> int foo(int cond) {
>   char small[10], large[30];
>   void *what = cond ? small : large;
>   // if X = 0 or 1, hand back 30 (max possible size); for 2 or 3,
>   // hand back 10 (minimum possible size)
>   // if X = 0 or 1, this gets lowered to @llvm.objectsize(..., 0); if
>   // X = 2 or 3, it gets lowered to @llvm.objectsize(..., 1)
>   return __builtin_object_size(what, X);
> }
>
> FWIW, If you're looking for something more accurate than this, and 
> you're willing to have calculations at runtime in exchange for the 
> accuracy, I'd recommend looking into the machinery involved with 
> the ObjectSizeOffsetEvaluator (if you haven't already). I'm not 
> familiar with any of it, but you may find it interesting. :)
>
> On Wed, Mar 16, 2016 at 9:39 AM, Strahinja Petrovic 
> <strahinja.petrovic at rt-rk.com <mailto:strahinja.petrovic at rt-rk.com>> 
> wrote:
>
>     The optimizer doesn't know how to calculate the object size when it
>     finds a condition that cannot be eliminated. Here is an example:
>
>     -----------------------------------------------
>     #include <stdio.h>
>     #include <stdlib.h>
>     #define STATIC_BUF_SIZE 10
>     #define LARGER_BUF_SIZE 30
>
>     size_t foo(int flag) {
>       char *cptr;
>       char chararray[LARGER_BUF_SIZE];
>       char chararray2[STATIC_BUF_SIZE];
>       if(flag)
>         cptr = chararray2;
>       else
>         cptr = chararray;
>
>       return  __builtin_object_size(cptr, 2);
>     }
>
>     int main() {
>       size_t ret;
>       ret = foo(0);
>       printf("\n%d\n", ret);
>       return 0;
>     }
>     ----------------------------------------------
>     If you compile this example with clang (trunk version) with the
>     option -fno-inline, the result will be -1. Without -fno-inline the
>     result will be correct (30): when foo is inlined into main, the
>     condition is eliminated and the compiler knows how to calculate
>     the correct object size. The compiler should be able to calculate
>     the object size in both cases (with and without inlining foo).
>     When the condition can't be eliminated, the compiler should
>     calculate the object size depending on the second argument of
>     __builtin_object_size (taking the minimum or maximum value over
>     the branches of the condition). In this example, the result
>     should be 10 with -fno-inline.
>
>     If I replace llvm.objectsize with a constant in foo(), chosen
>     according to the second argument, the result will be correct with
>     -fno-inline (10), but incorrect without the flag. This is because
>     foo() is then inlined, the condition can be eliminated, and the
>     result should be 30.
>
>     I resolved this problem by adding a third argument to the
>     llvm.objectsize intrinsic. When I calculate the result based on
>     the condition, I put it in the third argument. If there is no
>     inlining, the condition will not be eliminated, and this will be
>     the final result. When there is inlining, the condition will be
>     eliminated after inlining and llvm.objectsize will be replaced
>     with a constant.
>     With this approach, I get the correct result in both cases (with
>     and without -fno-inline).
>
>     I would like to discuss this approach, because it changes the IR
>     (by adding a third argument to the
>     __builtin_object_size function). Do you have any comments?
>
>
