[PATCH] D14933: Add the allocsize attribute to LLVM

Tue Feb 16 20:03:16 PST 2016

Hi George,

Have you considered using operand bundles to express arbitrary
allocation size information?  With operand bundles, the representation
would be something like

  define i8* @checked_malloc(i32 %N) {
        %alloc_size = add i32 %N, 8
        %1 = call i8* @object_malloc(i32 %N) [ "alloc-size"(i32 %alloc_size) ]
        %2 = call i32 @llvm.objectsize.i32.p0i8(i8* %1)
        %3 = icmp lt %2, %N
        br i1 %3, label %Trap, label %Return
        ...
  }

and you'd fold (objectsize (call fn(...) [ "alloc-size"(N) ])) to (N).
Once you've decided that there are likely no more opportunities to
simplify llvm.objectsize calls, you can drop the "alloc-size" operand
bundles from all calls (this can possibly be folded into CGP or some
other pass, to avoid re-traversing the whole module).

The caveat, of course, is that in this scheme you won't be able to
fold (objectsize (call allocation_fn)) where the call to allocation_fn
was initially an indirect call that got devirtualized, since you need
to know the call sites that will call allocation functions when
emitting the calls.

-- Sanjoy

On Tue, Feb 16, 2016 at 7:11 PM, George Burgess IV via llvm-commits
<llvm-commits at lists.llvm.org> wrote:
> george.burgess.iv added a comment.
>
> Let me modify your code sample a bit, so it represents what the transformation would really look like if we wanted to support arbitrary expressions.
>
>>    define i8* @checked_malloc(i32 %N) {
>
>>       %1 = call i8* @object_malloc(i32 %N)
>
>>       %2 = call {i32, i32} @llvm.objectsize.and.object.offset(i8* %1)
>
>>       %3 = call {i32, i32} @__autogenerated_get_object_size_of_object_malloc(i32 %N)
>
>>       %c = icmp  eq %2, %3
>
>>       call void @llvm.assume(i1 %c)
>
>>       %4 = extractelement {i32, i32} %2, 0
>
>>       %5 = icmp lt %4, %N
>
>>       br i1 %3, label %Trap, label %Return
>
>>       ...
>
>>   }
>
>>
>
>
> So let's walk through what needs to happen here:
>
> - If we don't run a special piece of code (`ObjectSizeOffsetEvaluator`), then the call to `@__autogenerated_get_object_size_of_object_malloc` must go away without a trace.
> - If we do run `ObjectSizeOffsetEvaluator` on `%1`, `@__autogenerated_get_object_size_of_object_malloc` must stay, as it will be used
>
> Because of this, we either have to get rid of `@__autogenerated_get_object_size_of_object_malloc` before inlining, or we need to mark it `noinline` so it's not impossible to remove from functions. The former case means we lose our `allocsize` *really* early, and the latter means we probably get slower code (especially if the function amounts to `return {N - 8, 8};`)
>
> Also, we'd need a highly restricted user function constant folder to make this work, just like in http://reviews.llvm.org/D16390.
>
> Plus, `@llvm.assume` says on the tin that it makes optimizing things harder, so this makes optimizing any function that calls an allocation function harder. Ouch. Okay, let's try something else.
>
>   define i8* @checked_malloc(i32 %N) {
>     %1 = call i8* @object_malloc(i32 %N)
>     call void @llvm.assert_object_size_is(i8* %1, {i32, i32}(i32)* @__autogenerated_get_object_size_of_object_malloc, i32 %N)
>     ...
>   }
>
> This seems more reasonable. Here are the problems it poses, as I see it:
>
> - It still makes computing `@llvm.objectsize` more complex (instead of chasing a pointer through bitcasts/GEPs/... to its definition, you need to check at every GEP/bitcast/... for uses of the pointer that are calls to this intrinsic.
> - We still need a special user function constant folder.
> - We still need to rely on optimizations happening to `@__autogenerated_get_object_size_of_object_malloc` in order to have a chance of constant folding it.
> - We waste time optimizing `@__autogenerated_get_object_size_of_object_malloc`, a function that may never even hit an object file.
> - We would need to magically turn the intrinsic into an actual function call in some cases. If this happens late, the function won't get inlined. Like said above, if the function amounts to `return {N - 8, 8};`, that's not good.
> - `@__autogenerated_get_object_size_of_object_malloc` isn't allowed to have any side-effects (because the intrinsic would presumably be treated as though it has no side-effects, much unlike a `call` instruction).
> - We need to keep around anything passed to `@__autogenerated_get_object_size_of_object_malloc` until the intrinsic is deleted, so it may suffer from similar optimization problems as `@llvm.assume`.
> - We still need to emit this at every callsite to any allocation function.
>
> ...And this is the best alternate design I can think of. Does this match what you were thinking? If so, I'm not at all convinced that it's strictly better than what I'm presenting here, which prevents no optimizations, is very lightweight, doesn't require special handling everywhere, doesn't need a custom constant folder, etc.
>
> --------
>
> Honestly, and this is me speaking from a position of complete ignorance as to what the actual real-life use case of this would be, in what case are tiny wrapper functions unacceptable? The only things I have to pull from are here: http://lists.llvm.org/pipermail/cfe-dev/2012-June/022272.html -- can you tell me why the below alternatives don't work just as well? Or give me an example where it wouldn't be possible to write a similar wrapper? I'm unable to think of one.
>
>   // Original
>   char *my_strdup(char *str) __attribute__((alloc_size_ex(strlen(str)+1)));
>
>   // Modified
>   extern char *_my_strdup_impl(const char *str, int N) __attribute__((alloc_size(1)));
>   inline char *my_strdup(const char *str) {
>     size_t total_len = strlen(str)+1;
>     return _my_strdup_impl(str, total_len);
>   }
>
>   // ------------------
>
>   // Original
>   void *my_complex_alloc(int n, int size, int add) __attribute__((alloc_size_ex(n * size + add)));
>
>   // Modified -- _impl returns the start of the buffer
>   extern void *_my_complex_alloc_impl(int total_bytes, int n, int size, int add) __attribute__((alloc_size(0)));
>   inline void *my_complex_alloc(int n, int size, int add) {
>     return _my_complex_alloc_impl(n * size + add, n, size, add);
>   }
>
>   // ------------------
>   // Original
>   char *middle(int size) __attribute__((alloc_size_ex(size, size/2)));
>
>   // Modified - Again, _middle_impl returns the start of the buffer
>   extern char *_middle_impl(int size) __attribute__((alloc_size(0)));
>   inline char *middle(int size) { return _middle_impl(size) + size/2; }
>
>
> http://reviews.llvm.org/D14933
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits

-- 
Sanjoy Das
http://playingwithpointers.com