[llvm] [SHT_LLVM_BB_ADDR_MAP] Add a new PGOAnalysisMap feature to emit dynamic instruction count (PR #119303)
Aiden Grossman via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 10 20:01:58 PST 2024
boomanaiden154 wrote:
> There isn't a strict limitation, but in one of my tests there is a 100MB+ size overhead, which is not negligible to users. Then people may come to argue that it's better to keep only what's necessary (we don't enable SHT_LLVM_BB_ADDR_MAP by default), like here only emitting the final instruction count number.
Ack. It's definitely a significant size overhead, but if there are no hard limits, I think it would be hard to justify adding something as specific as a function-level dynamic instruction count as a new member of the BBAddrMap, especially if the end goal is some level of cost modeling.
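For context, the function-level dynamic instruction count being discussed is conceptually just a weighted sum over basic blocks: each block's static instruction count times its profile-derived execution count. Here is a minimal sketch of that calculation; the field names are illustrative and do not correspond to the actual SHT_LLVM_BB_ADDR_MAP/PGOAnalysisMap encoding.

```python
from dataclasses import dataclass

@dataclass
class BasicBlockInfo:
    num_instructions: int  # static instruction count of the block
    execution_count: int   # dynamic execution count derived from PGO data

def dynamic_instruction_count(blocks: list[BasicBlockInfo]) -> int:
    """Sum over blocks of (instructions in block) * (times block executed)."""
    return sum(b.num_instructions * b.execution_count for b in blocks)

# Example: a 5-instruction entry block executed once, plus a
# 3-instruction loop body executed 1000 times.
blocks = [BasicBlockInfo(5, 1), BasicBlockInfo(3, 1000)]
print(dynamic_instruction_count(blocks))  # → 3005
```

Emitting only this single number per function is much smaller than the full per-block address map, which is the size trade-off under discussion.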
> That said, assuming the offline tool/the fancy cost model can get a more accurate number or be particularly convenient to use, I'm confident I can convince service users to accept it (I need to study more :) ).
It might be able to get a more accurate number; we haven't done any validation of `llvm-cm`. We ended up collecting instruction traces and performing modeling on those rather than looking at functions with PGO data. It's something that could be compared, though.
> Relatedly, could you clarify the reasoning behind using an offline tool instead of doing the calculation directly in the compiler? Given that all the necessary info should already be available in the compiler, is the calculation very time-consuming, or is this just for development convenience (as I can see, it needs a lot of iterations for model training)? I'm wondering if this would eventually be ported into the compilation pipeline (say, once the model is pretty accurate).
We used an offline tool because we don't really know yet what works in terms of cost modeling, so putting something directly in the compiler didn't make too much sense for experimentation. The info is definitely available in the compiler, but we found it easier for our workflows to have an offline tool do the modeling based on the information rather than trying to do it in the compiler. `llvm-cm` isn't used for model training currently.
We could look at using it in the compilation pipeline eventually, and it wouldn't be too difficult to wire up, but we're quite a ways off from that currently. On the Gematria front, the current push is getting more accurate models.
As a side note, we've also been experimenting with trace-based cost modeling for register allocation and have achieved pretty good results there. We're looking at starting to open source tooling for that in the coming weeks.
https://github.com/llvm/llvm-project/pull/119303