[PATCH] D106426: [FuncSpec] Support specialising recursive functions

Fri Jul 23 07:21:23 PDT 2021

SjoerdMeijer added a comment.

Thanks for elaborating on this!
See my comments inline.

In D106426#2899586 <https://reviews.llvm.org/D106426#2899586>, @ChuanqiXu wrote:

> Let me give a more formal description.
> (1) Let `Fs[i]` be the set of functions newly generated in the i-th iteration, with `Fs[0]` the initial set of functions. The number of functions generated in the i-th iteration is then `|Fs[i]|`.
> (2) The penalty for specializing a function is `Penalty(F) * NumSpecialized[i]`, where `NumSpecialized[i]` is `|Fs[1]| + |Fs[2]| + ... + |Fs[i]|`.
> (3) The bonus for specializing a function is `Bonus(F, ArgNo)`. For a typical recursive function `F` and its specialization `SF`, we usually have `Bonus(F, ArgNo) == Bonus(SF, ArgNo)` and `Penalty(F) == Penalty(SF)`. This means `SF` may be specialized again as long as `NumSpecialized[i]` is not yet large enough.
>
> So the total number of specialized functions is controlled by `NumSpecialized`, which increases linearly.

100% agreed so far. This is indeed how things work at the moment to recursively/linearly specialise recursive functions.
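
For reference, here is a toy C++ sketch of the model in (1)-(3). It is not the pass's code: `countSpecializations` and the numbers are made up for illustration, and it assumes a single recursive function that yields at most one new specialization per iteration with constant `Bonus` and `Penalty`.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only (hypothetical helper, not part of FunctionSpecialization):
// a candidate is specialized while its bonus exceeds
// Penalty * NumSpecialized, counting the candidate itself.
unsigned countSpecializations(uint64_t Bonus, uint64_t Penalty,
                              unsigned MaxIters) {
  assert(Penalty > 0 && Penalty <= UINT32_MAX && "keep the product in range");
  unsigned NumSpecialized = 0;
  for (unsigned I = 1; I <= MaxIters; ++I) {
    // The cost grows linearly with the number of specializations.
    uint64_t Cost = Penalty * (uint64_t(NumSpecialized) + 1);
    if (Bonus <= Cost)
      break; // No longer profitable, stop specializing.
    ++NumSpecialized;
  }
  return NumSpecialized;
}
```

With `Bonus = 1000` and `Penalty = 100` this stops after 9 specializations. With `Bonus = 100000` the linear penalty never catches up within 100 iterations, so all 100 iterations produce a specialization.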

> The problem is that `Bonus(F, ArgNo)`, as implemented, increases exponentially with loop depth, so we may get into trouble when we meet a recursive function containing deeply nested loops.
> What we want to do is add the iteration count `i` to the cost model, i.e. use `Penalty(F, i)`.

Ok, thanks, I am going to look into this!
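
To make the concern concrete, a rough sketch under loudly stated assumptions: the bonus is multiplied by a fixed weight per loop nesting level, and the iteration-aware penalty doubles every iteration. Both formulas, and the names `bonusAtLoopDepth` and `simulate`, are hypothetical. The thread does not say how `i` should enter `Penalty(F, i)`, so this shows only one possible shape.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: each loop nesting level multiplies the bonus by LoopWeight.
uint64_t bonusAtLoopDepth(uint64_t BaseBonus, uint64_t LoopWeight,
                          unsigned Depth) {
  uint64_t Bonus = BaseBonus;
  for (unsigned D = 0; D < Depth; ++D)
    Bonus *= LoopWeight;
  return Bonus;
}

// Toy model: one new specialization per iteration while it is profitable.
// If IterationAware is set, the penalty is also scaled by 2^(I-1)
// (a hypothetical choice for Penalty(F, i)).
unsigned simulate(uint64_t Bonus, uint64_t BasePenalty, unsigned MaxIters,
                  bool IterationAware) {
  // Small inputs keep every product below 2^64.
  assert(BasePenalty > 0 && BasePenalty <= UINT32_MAX);
  assert(Bonus <= UINT32_MAX);
  unsigned NumSpecialized = 0;
  uint64_t IterFactor = 1;
  for (unsigned I = 1; I <= MaxIters; ++I) {
    uint64_t Cost = BasePenalty * (uint64_t(NumSpecialized) + 1) * IterFactor;
    if (Bonus <= Cost)
      break;
    ++NumSpecialized;
    if (IterationAware)
      IterFactor *= 2; // The penalty grows with the iteration count.
  }
  return NumSpecialized;
}
```

With a base bonus of 10, a weight of 10 and loop depth 4, the bonus is 100000. Against a base penalty of 100 and 100 iterations, the linear model specializes 100 times, while this particular iteration-aware variant stops after 7. A steeper growth factor would be needed to get below the 4 asked for in the example.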

> For example,
>
>   ; opt -function-specialization -func-specialization-max-iters=100  -S %s
>   @Global = internal constant i32 1, align 4
>   
>   define internal void @recursiveFunc(i32* nocapture readonly %arg) {
>     %temp = alloca i32, align 4
>     %arg.load = load i32, i32* %arg, align 4
>     %arg.cmp = icmp slt i32 %arg.load, 10000
>     br i1 %arg.cmp, label %loop1, label %ret.block
>   
>   loop1:
>     br label %loop2
>   
>   loop2:
>     br label %loop3
>   
>   loop3:
>     br label %loop4
>   
>   loop4:
>     br label %block6
>   
>   block6:
>     call void @print_val(i32 %arg.load)
>     %arg.add = add nsw i32 %arg.load, 1
>     store i32 %arg.add, i32* %temp, align 4
>     call void @recursiveFunc(i32* nonnull %temp)
>     br label %loop4.end
>   
>   loop4.end:
>     %exit_cond1 = call i1 @exit_cond()
>     br i1 %exit_cond1, label %loop4, label %loop3.end
>   
>   loop3.end:
>     %exit_cond2 = call i1 @exit_cond()
>     br i1 %exit_cond2, label %loop3, label %loop2.end
>   
>   loop2.end:
>     %exit_cond3 = call i1 @exit_cond()
>     br i1 %exit_cond3, label %loop2, label %loop1.end
>   
>   loop1.end:
>     %exit_cond4 = call i1 @exit_cond()
>     br i1 %exit_cond4, label %loop1, label %ret.block
>   
>   ret.block:
>     ret void
>   }
>   
>   define i32 @main() {
>     call void @recursiveFunc(i32* nonnull @Global)
>     ret i32 0
>   }
>   
>   declare dso_local void @print_val(i32)
>   declare dso_local i1 @exit_cond()
>
> I guess I would be happy if `recursiveFunc` got specialized fewer than 4 times even when `func-specialization-max-iters` is set to 100.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106426/new/

https://reviews.llvm.org/D106426