[PATCH] D106426: [FuncSpec] Support specialising recursive functions

Fri Jul 23 03:17:18 PDT 2021

ChuanqiXu added a comment.

In D106426#2899320 <https://reviews.llvm.org/D106426#2899320>, @SjoerdMeijer wrote:

> First of all, this is an extra option that people need to buy into, so changes won't bother anyone.

Agreed. But being off by default is not a good reason to do things arbitrarily.

> About the suspected problem:
>
>> I think the problem may come from there is no restriction between iterations.
>
> I don't think this is the case. In next iterations, exactly the same cost-model and heuristics are applied. Thus, I expect that if a function was not specialised before, it won't be specialised in a next iteration. Except, for recursive functions that need to be triggered by a very specific optimisation first between iterations.

Yeah, recursive functions are the problem.

> Okay, if we want to do this here, then we need clarity on the exact problem,

Let me give a more formal description.
(1) Let `Fs[i]` be the set of functions newly generated in the i-th iteration, with `Fs[0]` the initial set of functions. The number of functions generated in the i-th iteration is then `|Fs[i]|`.
(2) The penalty for specializing a function is `Penalty(F) * NumSpecialized[i]`, where `NumSpecialized[i]` is `|Fs[1]| + |Fs[2]| + ... + |Fs[i]|`.
(3) The bonus for specializing a function is `Bonus(F, ArgNo)`. For a typical recursive function `F` and its specialized version `SF`, we usually have `Bonus(F, ArgNo) == Bonus(SF, ArgNo)` and `Penalty(F) == Penalty(SF)`. This means `SF` may be specialized again if `NumSpecialized[i]` is not yet large enough.

So the total number of specialized functions is currently controlled by `NumSpecialized`, which grows only linearly.
The problem is that `Bonus(F, ArgNo)`, as implemented, grows exponentially with loop depth. So we may get into trouble when we meet a recursive function with deeply nested loops.
What we want to do is add the iteration count `i` to the cost model, i.e. `Penalty(F, i)`.
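
To make the argument concrete, here is a rough Python sketch of the model above. It is not the FuncSpec implementation. It assumes one new specialization per iteration (`|Fs[i]| == 1`), that a function is specialized only when its bonus exceeds its penalty, and that the penalty multiplier is `NumSpecialized + 1` so the first specialization is not free. The names `loop_bonus`, `avg_iters`, and `count_specializations` are hypothetical, and multiplying the penalty by `i` is just one possible shape for `Penalty(F, i)`.

```python
def loop_bonus(base_bonus, avg_iters, loop_depth):
    """Bonus that grows exponentially with loop depth (assumed shape)."""
    return base_bonus * avg_iters ** loop_depth


def count_specializations(bonus, penalty, max_iters, iteration_aware=False):
    """Count how often a self-recursive function is specialized again.

    Assumes Bonus(F, ArgNo) == Bonus(SF, ArgNo) and Penalty(F) == Penalty(SF),
    so every iteration sees the same bonus and the same base penalty.
    """
    num_specialized = 0
    for i in range(1, max_iters + 1):
        # Linear penalty: base penalty scaled by the specializations so far.
        cost = penalty * (num_specialized + 1)
        if iteration_aware:
            # Hypothetical Penalty(F, i): also scale by the iteration count.
            cost *= i
        if bonus <= cost:
            break  # no longer profitable, so later iterations stop too
        num_specialized += 1
    return num_specialized


# A call nested in 4 loops, with an assumed 10 iterations per loop level.
bonus = loop_bonus(base_bonus=1, avg_iters=10, loop_depth=4)  # 10000
print(count_specializations(bonus, penalty=100, max_iters=100))  # 99
print(count_specializations(bonus, penalty=100, max_iters=100,
                            iteration_aware=True))  # 9
```

With these made-up numbers, the linear penalty alone lets the function be specialized in almost every iteration, while an iteration-aware penalty stops much earlier. The exact form of `Penalty(F, i)` would still need tuning to reach a small bound.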

For example,

  ; opt -function-specialization -func-specialization-max-iters=100  -S %s
  @Global = internal constant i32 1, align 4

  define internal void @recursiveFunc(i32* nocapture readonly %arg) {
    %temp = alloca i32, align 4
    %arg.load = load i32, i32* %arg, align 4
    %arg.cmp = icmp slt i32 %arg.load, 10000
    br i1 %arg.cmp, label %loop1, label %ret.block

  loop1:
    br label %loop2

  loop2:
    br label %loop3

  loop3:
    br label %loop4

  loop4:
    br label %block6

  block6:
    call void @print_val(i32 %arg.load)
    %arg.add = add nsw i32 %arg.load, 1
    store i32 %arg.add, i32* %temp, align 4
    call void @recursiveFunc(i32* nonnull %temp)
    br label %loop4.end

  loop4.end:
    %exit_cond1 = call i1 @exit_cond()
    br i1 %exit_cond1, label %loop4, label %loop3.end

  loop3.end:
    %exit_cond2 = call i1 @exit_cond()
    br i1 %exit_cond2, label %loop3, label %loop2.end

  loop2.end:
    %exit_cond3 = call i1 @exit_cond()
    br i1 %exit_cond3, label %loop2, label %loop1.end

  loop1.end:
    %exit_cond4 = call i1 @exit_cond()
    br i1 %exit_cond4, label %loop1, label %ret.block

  ret.block:
    ret void
  }

  define i32 @main() {
    call void @recursiveFunc(i32* nonnull @Global)
    ret i32 0
  }

  declare dso_local void @print_val(i32)
  declare dso_local i1 @exit_cond()

I guess I would be happy if `recursiveFunc` were specialized fewer than 4 times even when `func-specialization-max-iters` is set to 100.

> I guess it's up to me to show with performance numbers if there is a problem

I prefer to analyze problems from the cost-model side rather than running large workloads every time. There are too many parameters and the patterns are very complex, so it is easy to miss problems in the code. Although fixing bugs after the fact is common, I think it is better to avoid problems we have already noticed.

https://reviews.llvm.org/D106426