[PATCH] D110847: [CSSPGO] Unblock optimizations with pseudo probe instrumentation part 3.

Tue Oct 12 09:09:12 PDT 2021

hoy added a comment.

In D110847#3058762 <https://reviews.llvm.org/D110847#3058762>, @wenlei wrote:

> In D110847#3055368 <https://reviews.llvm.org/D110847#3055368>, @hoy wrote:
>
>> In D110847#3035672 <https://reviews.llvm.org/D110847#3035672>, @wenlei wrote:
>>
>>> Thanks for the changes to reduce probe overhead. As we chatted off patch, some information and verification of this change's impact on profile quality would be helpful.
>>
>> The original patch contained several general changes, e.g., flipping the default values of those skipping APIs, which were observed to affect profile quality. After narrowing down the offending changes to those in SimplifyCFG, reverting them brought back the profile quality while still reducing the probe overhead by a decent amount.

> What are the specific optimizations that were critical for profile quality in SimplifyCFG? On the other hand, it looks to me that some of the other unblocked optimizations like InstCombine could also have some impact on profile quality.  
> The change looks good, but I think it'd also be nice to have some insight and reasoning as to why certain optimizations have to be blocked for profile quality, beyond experiment results which could vary from one workload to another.

Good question. In general, SimplifyCFG is more disruptive to the flow graph because it folds, merges, or removes blocks. Such flow graph changes affect the execution frequency of blocks, and samples collected on the new blocks are hard to correlate back to the original blocks. A particular case is

  /// If this basic block is simple enough, and if a predecessor branches to us
  /// and one of our successors, fold the block into the predecessor and use
  /// logical operations to pick the right destination.
  bool llvm::FoldBranchToCommonDest(BranchInst *BI, DomTreeUpdater *DTU,

Compared to SimplifyCFG, InstCombine works mostly at the instruction level. Moving an instruction around or folding instructions is likely to have a smaller impact on the shape of the flow graph and on block frequencies.
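
To make the block-frequency concern concrete, here is a minimal standalone sketch of the kind of fold described in the FoldBranchToCommonDest comment above. It is not LLVM code: the Counts struct and the runUnfolded/runFolded/checkEquivalent functions are hypothetical names used only for illustration, and the counters merely stand in for per-block sample counts. The sketch assumes the folded block's instructions end up executing in the predecessor, which is how I read the doc comment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-block execution counters (stand-ins for block samples).
struct Counts {
  uint64_t Pred = 0;  // predecessor block
  uint64_t BB = 0;    // the "simple enough" block that gets folded
  uint64_t Dest = 0;  // common destination of Pred and BB
};

// Before folding: Pred branches to Dest or to BB; BB branches to Dest or out.
bool runUnfolded(bool A, bool B, Counts &C) {
  ++C.Pred;
  if (A) {
    ++C.Dest;
    return true;
  }
  ++C.BB;  // BB runs only when A is false
  if (B) {
    ++C.Dest;
    return true;
  }
  return false;
}

// After folding: BB's condition is evaluated in Pred and combined with a
// logical OR, so BB's code now runs on every execution of Pred.
bool runFolded(bool A, bool B, Counts &C) {
  ++C.Pred;
  ++C.BB;  // formerly conditional work, now unconditional
  if (A | B) {
    ++C.Dest;
    return true;
  }
  return false;
}

// The fold preserves the result; only the per-block counts differ.
void checkEquivalent(bool A, bool B) {
  Counts U, F;
  assert(runUnfolded(A, B, U) == runFolded(A, B, F));
}
```

With three runs of (A=true, B=false) and one run of (A=false, B=true), the unfolded version counts BB once while the folded version counts it four times, even though Pred and Dest counts agree. Samples taken on the folded code therefore no longer reflect the original frequency of BB.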

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110847/new/

https://reviews.llvm.org/D110847