[PATCH] D54175: [PGO] context sensitive PGO

Rong Xu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 12 09:41:01 PST 2018


xur added a comment.

In https://reviews.llvm.org/D54175#1291940, @vsk wrote:

> Hi Rong, at a high-level, I like what this patch is doing. I'll try to leave in-depth comments by next week -- please ping the review otherwise.


Sounds good. Looking forward to your reviews.

>> These suboptimal profiles can greatly affect some downstream optimizations, in particular machine basic block placement.
> 
> If it's possible to get these numbers, it'd be interesting to know how the improvements from this patch compare to link-time or post-link block ordering tools (Bolt).

We mainly use a key Google benchmark for performance evaluation. Note that this benchmark has been highly tuned, so a performance gain of 0.5% is considered significant.

Our experiments show a 1.5% to 2% improvement from CSPGO. The improvement comes mainly from machine basic block placement. CSPGO also sets accurate function entry counts (for the cases where inlining is cross-module), which makes the hot-text section denser. This can be seen in the instruction heat map, but the performance improvement from it was small.

BOLT can give a similar performance boost (i.e. ~2%) from basic block reordering if it's applied to a regular PGO binary.
BOLT can improve the CSPGO binary by 0.5%, and that comes from function splitting, which is not currently enabled in LLVM.


https://reviews.llvm.org/D54175




