[llvm-dev] Next steps for optimization remarks?

Fri Jul 14 10:22:11 PDT 2017

On Fri, Jul 14, 2017 at 10:10 AM, Adam Nemet <anemet at apple.com> wrote:
>
>
> On Jul 14, 2017, at 8:21 AM, Davide Italiano via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On Mon, Jun 19, 2017 at 4:13 PM, Brian Gesiak via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
> Hello all,
>
> In https://www.youtube.com/watch?v=qq0q1hfzidg, Adam Nemet (cc'ed) describes
> optimization remarks and some future plans for the project. I had a few
> follow-up questions:
>
> 1. As an example of future work to be done, the talk mentions expanding the
> set of optimization passes that emit remarks. However, the Clang User Manual
> mentions that "optimization remarks do not really make sense outside of the
> major transformations (e.g.: inlining, vectorization, loop optimizations)."
> [1] I am wondering: which passes exist today that are most in need of
> supporting optimization remarks? Should all passes emit optimization
> remarks, or are there indeed passes for which optimization remarks "do not
> make sense"?
>
> 2. I tried running llvm/utils/opt-viewer/opt-viewer.py to produce an HTML
> dashboard for the optimization remark YAML generated from a large C++
> program. Unfortunately, the Python script does not finish, even after over
> an hour of processing. It appears performance has been brought up before by
> Bob Haarman (cc'ed), and some optimizations have been made since. [2] I
> wonder if I'm passing in bad input (6,000+ YAML files -- too many?), or if
> there's still some room to speed up the opt-viewer.py script? I tried the
> C++ implementation as well, but that never completed either. [3]
>
> Overall I'm excited to make greater use of optimization remarks, and to
> contribute in any way I can. Please let me know if you have any thoughts on
> my questions above!
>
>
> Hi,
> I've been asked at $WORK to take a look at `-opt-remarks`, so here
> are a couple of thoughts.
>
> 1) When LTO is on, the output isn't particularly easy to read. I guess
> this can be mitigated with some filtering approach; Simon and I
> discussed it offline.
>
>
> Can you please elaborate?
>

The issue is twofold:
1) With LTO, the number of remarks generated skyrockets because
whole-module visibility makes IPO more effective (e.g. you end up
inlining much more). As a side effect, more aggressive inlining/IPCP
exposes more intraprocedural optimizations, which in turn generate
more remarks.
2) As pointed out earlier, DI is not always reliable.
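
To make the filtering idea a bit more concrete, here is a minimal
sketch in Python (the language opt-viewer.py is written in). It is not
how opt-viewer.py works today. The function names scan_remarks and
tally and the sample records are made up for illustration. It assumes
the record layout produced by -fsave-optimization-record: a
"--- !Passed" / "--- !Missed" / "--- !Analysis" header followed by
top-level "Pass:", "Name:", "Function:" and (with PGO) "Hotness:"
keys. It reads only those top-level keys instead of building a full
YAML tree, then counts remarks per pass and type, optionally dropping
cold ones.

```python
import re
from collections import Counter

# Matches a remark document header such as "--- !Missed".
HEADER = re.compile(r"^--- !(\w+)\s*$")
# Matches top-level scalar keys only (no leading indentation), so the
# indented entries under "Args:" are ignored.
FIELD = re.compile(r"^(Pass|Name|Function|Hotness):\s*(.*?)\s*$")


def scan_remarks(lines):
    """Yield one dict per remark without building a full YAML tree."""
    record = None
    for line in lines:
        header = HEADER.match(line)
        if header:
            if record is not None:
                yield record
            record = {"Type": header.group(1)}
            continue
        if record is None:
            continue
        field = FIELD.match(line)
        if field:
            record[field.group(1)] = field.group(2).strip("'\"")
    if record is not None:
        yield record


def tally(lines, min_hotness=0):
    """Count remarks per (pass, type), skipping those below min_hotness.

    Remarks without a Hotness key are treated as hotness 0 (an
    assumption of this sketch).
    """
    counts = Counter()
    for remark in scan_remarks(lines):
        if int(remark.get("Hotness", 0)) < min_hotness:
            continue
        counts[(remark.get("Pass", "?"), remark["Type"])] += 1
    return counts


# Hypothetical sample input, hand-written for this example.
SAMPLE = """\
--- !Missed
Pass:            inline
Name:            NoDefinition
DebugLoc:        { File: a.c, Line: 3, Column: 10 }
Function:        foo
Hotness:         30
Args:
  - Callee:          bar
  - String:          ' will not be inlined into '
  - Caller:          foo
...
--- !Passed
Pass:            inline
Name:            Inlined
Function:        main
Args:
  - Callee:          foo
...
"""

if __name__ == "__main__":
    for (pass_name, kind), n in sorted(tally(SAMPLE.splitlines()).items()):
        print(f"{pass_name:10} {kind:8} {n}")
```

A per-pass summary like this could be a first view before drilling
into individual remarks, and the min_hotness cut is the same idea as a
hotness threshold applied after the fact.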

>
>
> 2) Yes, indeed `opt-viewer` takes forever to process large testcases.
> I think this is a reason to explore a better representation than
> YAML, which is indeed a little slow to parse. To be honest, I'm torn
> about this.
> YAML is definitely really convenient as we already use it elsewhere
> in tree, and it has an easy textual repr. OTOH, it doesn't seem to
> scale that nicely.
>
>
> Agreed. We now have a mitigation strategy with -pass-remarks-hotness-threshold, but this is something that we may have to solve in the long run.
>

At some point, I guess we might just slowly moving away from
>
>
> 3) Many optimizations are still missing from the output, in
> particular PGO remarks (including, e.g., branch probability info,
> which still uses the old API as far as I can tell
> [PGOInstrumentation.cpp])
>
>
> Yes, how about we file bugs for each pass that still uses the old API (I am looking at ICP today), split up the work, and then finally remove the old API?
>

That sounds like a plan.

> Also on exposing PGO info, I have a patch that adds a pass I call HotnessDecorator. The pass emits a remark for each basic block. Then opt-viewer is made aware of these, and the remarks are special-cased to show hotness for a line unless there is already a remark on the line. The idea is that, since we only show hotness as part of a remark, if a block does not contain a remark we don’t see its hotness. E.g.:
>
>

Yes, feel free to post for review once you have it ready.
>
>
> 4) `opt-remarks` relies heavily on the fidelity of the DebugLoc
> attached to instructions. Things get a little hairy at -O3 (or with
> -flto) because there are optimization bugs where transformations
> don't preserve debug info. This is not entirely orthogonal, but it is
> something that can be worked on in parallel (bonus point: this would
> also help SamplePGO and the debug info experience). With `-flto` the
> problem gets amplified, as expected.
>
> 5) I found a couple of issues when trying out the support, but I'm
> actively working on them.
> https://bugs.llvm.org/show_bug.cgi?id=33773
> https://bugs.llvm.org/show_bug.cgi?id=33776
>
> That said, I think optimization remarks support is coming along nicely.
>
>
> Yes, I’ve been really happy with the progress. Thanks for all the help from everybody!

At some point, I guess we might just consider the generated HTML
report as a fallback and have the opt-remarks more integrated in the
developer's workflow.
I personally use Visual Studio daily to compile clang and it would be
nice to have remarks there as a plugin. I can imagine something
similar happening for Xcode/CLion/Emacs etc.
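
As a rough illustration of what a first step towards editor
integration could look like, here is a small Python sketch that turns
remark records into "file:line:col: remark: ..." lines, the general
shape that many editors already know how to parse from compiler
output. This is only an assumption about what a plugin might consume,
not an existing tool. The names parse and to_diagnostics, the exact
output format, and the sample record are all made up for illustration.
It assumes the record layout written by -fsave-optimization-record,
with a flow-style top-level "DebugLoc: { File: ..., Line: ...,
Column: ... }" entry.

```python
import re

# Matches a remark document header such as "--- !Missed".
HEADER = re.compile(r"^--- !(\w+)\s*$")
# Top-level scalar keys only; indented "Args:" entries are ignored.
FIELD = re.compile(r"^(Pass|Name):\s*(.*?)\s*$")
# Top-level location, e.g. DebugLoc: { File: a.c, Line: 3, Column: 10 }
DEBUGLOC = re.compile(
    r"^DebugLoc:\s*\{\s*File:\s*(.+?),\s*Line:\s*(\d+),"
    r"\s*Column:\s*(\d+)\s*\}\s*$"
)


def parse(lines):
    """Collect one dict per remark, keeping only type, pass, name, location."""
    records = []
    for line in lines:
        header = HEADER.match(line)
        if header:
            records.append({"Type": header.group(1)})
            continue
        if not records:
            continue
        loc = DEBUGLOC.match(line)
        if loc:
            path = loc.group(1).strip("'\"")
            records[-1]["Loc"] = (path, int(loc.group(2)), int(loc.group(3)))
            continue
        field = FIELD.match(line)
        if field:
            records[-1][field.group(1)] = field.group(2).strip("'\"")
    return records


def to_diagnostics(lines):
    """Return one diagnostic-style string per remark that has a location."""
    out = []
    for r in parse(lines):
        if "Loc" not in r:
            continue  # no debug location: nothing for an editor to jump to
        path, line_no, col = r["Loc"]
        out.append(
            f"{path}:{line_no}:{col}: remark: {r.get('Name', '?')} "
            f"({r['Type'].lower()}) [{r.get('Pass', '?')}]"
        )
    return out


# Hypothetical sample input, hand-written for this example.
SAMPLE = """\
--- !Missed
Pass:            inline
Name:            NoDefinition
DebugLoc:        { File: a.c, Line: 3, Column: 10 }
Function:        foo
Args:
  - Callee:          bar
...
--- !Passed
Pass:            inline
Name:            Inlined
Function:        main
...
"""

if __name__ == "__main__":
    for diag in to_diagnostics(SAMPLE.splitlines()):
        print(diag)
```

Note that remarks without a DebugLoc are simply dropped here, which is
another place where the debug info fidelity issue from point 4 would
show up.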

Thanks,

--
Davide