[llvm-dev] [RFC] Machine Function Splitter - Split out cold blocks from machine functions using profile data

Snehasish Kumar via llvm-dev llvm-dev at lists.llvm.org
Fri Aug 7 10:21:39 PDT 2020


Hi Wenlei,

Thanks for your interest :)

On Fri, Aug 7, 2020 at 12:40 AM Wenlei He <wenlei at fb.com> wrote:

> Cool stuff – nice to see a late splitting pass in LLVM.
>
>
>
> > Full Propeller optimizations include function splitting and layout
> optimizations, however it requires an additional round of profiling using
> perf on top of the peak (FDO/CSFDO + ThinLTO) binary. In this work we
> experiment with applying function splitting using the instrumented profile
> in the build instead of adding an additional round of profiling.
>
>
>
> I'd expect propeller or BOLT to be more effective at doing this due to
> better post-inline profile. Of course the usability advantage of not
> needing a separate profile is very practical, but just wondering did you
> see profile quality getting in the way here?
>
Yes, currently the pass is quite sensitive to profile quality. For example,
the current default is to split only blocks with a profile count of zero.
This binary choice is more effective than a count-based threshold.
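
To make the two policies concrete, here is a minimal sketch. It is not the
pass's actual code: the function names, block names, counts, and the
threshold value are all made up for illustration.

```python
from typing import Dict, List


def split_zero_count(counts: Dict[str, int]) -> List[str]:
    """Binary policy: a block is cold only if its profile count is zero."""
    return [name for name, count in counts.items() if count == 0]


def split_below_threshold(counts: Dict[str, int], threshold: int) -> List[str]:
    """Count-based policy: a block is cold if its count is below a threshold."""
    return [name for name, count in counts.items() if count < threshold]


# Hypothetical per-block profile counts for a single function.
counts = {"entry": 1000, "loop": 50000, "rare_path": 3, "error_exit": 0}

cold_binary = split_zero_count(counts)
cold_threshold = split_below_threshold(counts, threshold=10)
```

With these made-up numbers, the binary policy splits out only error_exit,
while a threshold of 10 also splits out rare_path, a block that did execute
during profiling.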

>
>
> > uses existing instrumentation based FDO or CSFDO profile information.
>
>
>
> Similarly, with instrumentation FDO alone, the post-inline profile may not
> be accurate, so for this splitting, is it more effective when used with
> CSFDO? Was the evaluation result from FDO or CSFDO?
>
Yes, CSFDO profiles are more effective. The SPEC and clang bootstrap
numbers are FDO based; however, our internal benchmarks are built with
CSFDO, and the improvement with CSFDO profiles is larger than with FDO
profiles.

>
>
> Also wondering does this work with Sample FDO, and do you have numbers
> that you can share when used with Sample FDO?
>
We are still working on refining the pass for Sample FDO. The initial
version up for review degrades performance when used with sample profiles.
We have further refinements planned that bring it to performance neutral;
however, more investigation is needed to understand the differences between
sampled and instrumented profiles late in codegen. We are invested in
ensuring this works well for sampled profiles.

>
>
> Thanks,
>
> Wenlei
>

Regards,
Snehasish