[PATCH] D93838: [SCCP] Add Function Specialization pass

Thu May 6 03:44:50 PDT 2021

SjoerdMeijer added a comment.

In D93838#2741479 <https://reviews.llvm.org/D93838#2741479>, @ChuanqiXu wrote:

> The benefit for 505.mcf_r comes from the specialization of `spec_qsort`. Here is the signature of `spec_qsort`:
>
>   void
>   spec_qsort(void *a, size_t n, size_t es, cmp_t *cmp)
>
> Here `cmp_t *` is a function pointer type, and `cmp` is used many times in `spec_qsort`. `spec_qsort` is called in two places in `505.mcf_r`:
>
>   spec_qsort(arcs_pointer_sorted[thread], new_arcs_array[thread], sizeof(arc_p),
>                   (int (*)(const void *, const void *))arc_compare);
>   spec_qsort(perm + 1, basket_sizes[thread], sizeof(BASKET*),
>               (int (*)(const void *, const void *))cost_compare);
>
> Both `arc_compare` and `cost_compare` are global functions. This is why function specialization benefits 505.mcf_r: it converts the indirect call into a direct call, and the direct call can then be inlined.
>
> This pattern looks common in our everyday code. However, I wonder what happens when there are multiple call sites for `spec_qsort` with multiple global functions. It looks like the code can't handle this situation yet, and it is the more common one in real projects. I am not asking for a change to the cost model; I think we can make that in the future. I am just sharing this.

Many thanks for sharing. With my infrastructure/workflow problems (mostly) sorted so that I can run and evaluate things, I have seen exactly the same, so I can confirm this.
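To make the pattern concrete, here is a minimal hand-written sketch of what specializing on a function-pointer argument amounts to. This is not the pass's actual output and not the mcf source: `int_compare`, `insertion_sort`, `insertion_sort_int_compare` and `demo` are hypothetical names for illustration, and a simple insertion sort stands in for `spec_qsort`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int cmp_t(const void *, const void *);

/* Hypothetical comparator, standing in for arc_compare / cost_compare. */
int int_compare(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Generic routine: every comparison is an indirect call through cmp. */
void insertion_sort(void *a, size_t n, size_t es, cmp_t *cmp) {
    char *base = a;
    char tmp[64];
    assert(es <= sizeof tmp); /* sketch only handles small elements */
    for (size_t i = 1; i < n; ++i) {
        memcpy(tmp, base + i * es, es);
        size_t j = i;
        while (j > 0 && cmp(base + (j - 1) * es, tmp) > 0) {
            memcpy(base + j * es, base + (j - 1) * es, es);
            --j;
        }
        memcpy(base + j * es, tmp, es);
    }
}

/* Clone specialized for cmp == int_compare: the call is now direct,
   so the inliner can fold the comparator into the loop. */
void insertion_sort_int_compare(void *a, size_t n, size_t es) {
    char *base = a;
    char tmp[64];
    assert(es <= sizeof tmp);
    for (size_t i = 1; i < n; ++i) {
        memcpy(tmp, base + i * es, es);
        size_t j = i;
        while (j > 0 && int_compare(base + (j - 1) * es, tmp) > 0) {
            memcpy(base + j * es, base + (j - 1) * es, es);
            --j;
        }
        memcpy(base + j * es, tmp, es);
    }
}

/* Returns 1 if the generic and specialized versions agree and sort correctly. */
int demo(void) {
    int v1[] = {5, 2, 4, 1, 3};
    int v2[] = {5, 2, 4, 1, 3};
    insertion_sort(v1, 5, sizeof(int), int_compare);
    insertion_sort_int_compare(v2, 5, sizeof(int));
    return memcmp(v1, v2, sizeof v1) == 0 && v1[0] == 1 && v1[4] == 5;
}
```

A call site that passes `int_compare` would be redirected to the specialized clone. Each distinct constant argument needs its own clone, which is presumably where the extra code size and compile time come from.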

My baseline is trunk, without this patch applied, in LTO mode. That uses the new pass manager, and because this new pass wasn't added to its LTO pipeline, I didn't see it triggering. But with that fixed and this patch applied, I noticed the 30% gain, with 30% extra compile time. This was on an older and noisier AArch64 system, but the trend was clear, and the increased compile times in particular were very obvious and consistent. I will also run this on a newer system, but I am still setting that up.

PS. About LTO, I didn't see this triggering in non-LTO mode on MCF. That's why I am only looking at LTO at the moment.

Now that I have solid LTO numbers and compile times, I am going to look into the compile-time increase.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93838/new/

https://reviews.llvm.org/D93838