<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Jul 26, 2017, at 9:36 AM, Mehdi AMINI <<a href="mailto:joker.eph@gmail.com" class="">joker.eph@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div class="gmail_extra"><br class="Apple-interchange-newline"><br class=""><div class="gmail_quote">2017-07-26 9:31 GMT-07:00 Quentin Colombet<span class="Apple-converted-space"> </span><span dir="ltr" class=""><<a href="mailto:qcolombet@apple.com" target="_blank" class="">qcolombet@apple.com</a>></span>:<br class=""><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div style="word-wrap: break-word;" class=""><br class=""><div class=""><span class=""><blockquote type="cite" class=""><div class="">On Jul 25, 2017, at 10:36 PM, Mehdi AMINI <<a href="mailto:joker.eph@gmail.com" target="_blank" class="">joker.eph@gmail.com</a>> wrote:</div><br class="m_6269528974377576643Apple-interchange-newline"><div class=""><div dir="ltr" class=""><br class=""><div class="gmail_extra"><br class=""><div class="gmail_quote">2017-07-24 16:14 GMT-07:00 Quentin Colombet via llvm-dev<span class="Apple-converted-space"> </span><span dir="ltr" class=""><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>></span>:<br class=""><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; 
border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div style="word-wrap: break-word;" class=""><div dir="auto" style="word-wrap: break-word;" class="">Hi River,</div><div dir="auto" style="word-wrap: break-word;" class=""><br class=""><div class=""><span class=""><blockquote type="cite" class=""><div class="">On Jul 24, 2017, at 2:36 PM, River Riddle <<a href="mailto:riddleriver@gmail.com" target="_blank" class="">riddleriver@gmail.com</a>> wrote:</div><br class="m_6269528974377576643m_-3236575054360144803Apple-interchange-newline"><div class=""><div dir="ltr" class="">Hi Quentin,<div class=""> I appreciate the feedback. When I reference the cost of Target Hooks it's mainly for maintainability and cost on a target author. We want to keep the intrusion into target information minimized. The heuristics used for the outliner are the same used by any other IR level pass seeking target information, i.e TTI for the most part. I can see where you are coming from with "<span style="font-size: 12.8px;" class="">having heuristics solely focused on code size do not seem realistic", but I don't agree with that statement.</span></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">If you only want code size I agree it makes sense, but I believe, even in Oz, we probably don’t want to slow the code by a big factor for a couple bytes. That’s what I wanted to say and what I wanted to point out is that you need to have some kind of model for the performance to avoid those worst cases. Unless we don’t care :).</div></div></div></div></blockquote><div class=""><br class=""></div><div class="">That's why we have threshold though, don't we? 
</div></div></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">When I see a threshold, I think magic number, and I don’t like that.</div></div></div></blockquote><div class=""><br class=""></div><div class="">Fair, but a heuristic is the best we have when we don't want to optimize for a single metric or can't have a perfect model of the world.</div><div class=""> </div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div style="word-wrap: break-word;" class=""><div class=""><span class=""><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><div class="">Also the IR makes it easy to connect to PGO, which allows focusing the outlining on "cold" regions and preserving good performance.</div><div class="">River: did you consider this already? Having a good integration with PGO could make this part of the default optimization pipeline (i.e., having a mode where we outline only the code known to be "cold").</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""> </div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div style="word-wrap: break-word;" class=""><div dir="auto" style="word-wrap: break-word;" class=""><div class=""><span class=""><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><span style="font-size: 12.8px;" class="">I think there is a disconnect on heuristics. The only user-tunable parameters are the lower-bound parameters (to the cost model); the actual analysis (heuristic calculation) is based on TTI information. 
</span></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">I don’t see how you can get around adding more hooks to know how a specific function prototype is going to be lowered (e.g., i64 needs to be split into two registers, fourth and onward parameters need to be pushed on the stack and so on). Those change the code size benefit.</div></div></div></div></blockquote><div class=""><br class=""></div><div class="">How is the inliner doing? How are we handling Oz there?</div><div class="">If we are fine living with approximation for the inliner, why wouldn't the same work for an outliner?</div></div></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">Unlike inlining, outlining does not expose optimization opportunities.</div></div></div></blockquote><div class=""><br class=""></div><div class="">I would expect that getting the cold code out of the way would help with locality / caching of the hot-path. I remember Amaury even working on getting cold *basic block* in a different section without outlining them in a function.</div><div class=""><br class=""></div><div class="">But I guess what you mean is that as long as we're focusing solely on getting the smallest possible binary ever, you may be closer to "perfect modeling" very late in the pipeline.</div></div></div></div></div></blockquote><div><br class=""></div><div>No, I mean in terms of enabling other optimizations in the pipeline like vectorizer. 
Outliner does not expose any of that.</div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""><br class=""></div><div class=""><br class=""></div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div style="word-wrap: break-word;" class=""><div class=""><span class=""><blockquote type="cite" class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div style="word-wrap: break-word;" class=""><div dir="auto" style="word-wrap: break-word;" class=""><div class=""><span class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><span style="font-size: 12.8px;" class="">There are several comparison benchmarks given in the "More detailed performance data" of the original RFC. It includes comparisons to the Machine Outliner when possible(I can't build clang on Linux with Machine Outliner). 
I welcome any and all discussion on the placement of the outliner in LLVM.<br class=""></span></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">My fear with a new framework is that we are going to split the effort of pushing the outliner technology forward, and I’d like to avoid that if at all possible.</div></div></div></div></blockquote><div class=""><br class=""></div><div class="">It isn't clear to me that implementing it at the MachineLevel was the right trade-off in the first place.</div></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">Fair enough. It has the advantage of not relying on a heuristic for its cost model, though.</div><span class=""><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><div class="">I'm not sure a full comparative study was performed and discussed upstream at the time when the MachineIR outliner was implemented? 
If so it wouldn't be fair to ask this to River now.</div></div></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">I am not asking that :).</div></div></div></blockquote><div class=""><br class=""></div><div class="">OK great :)</div><div class=""><br class=""></div><div class=""><br class=""></div><div class="">-- </div><div class="">Mehdi</div><div class=""> </div></div><br class=""></div></div><div class="gmail_extra" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br class=""><div class="gmail_quote">2017-07-26 9:31 GMT-07:00 Quentin Colombet<span class="Apple-converted-space"> </span><span dir="ltr" class=""><<a href="mailto:qcolombet@apple.com" target="_blank" class="">qcolombet@apple.com</a>></span>:<br class=""><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div style="word-wrap: break-word;" class=""><br class=""><div class=""><span class=""><blockquote type="cite" class=""><div class="">On Jul 25, 2017, at 10:36 PM, Mehdi AMINI <<a href="mailto:joker.eph@gmail.com" target="_blank" class="">joker.eph@gmail.com</a>> wrote:</div><br class="m_6269528974377576643Apple-interchange-newline"><div class=""><div dir="ltr" class=""><br class=""><div class="gmail_extra"><br class=""><div class="gmail_quote">2017-07-24 16:14 GMT-07:00 Quentin Colombet via llvm-dev<span class="Apple-converted-space"> </span><span dir="ltr" class=""><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>></span>:<br class=""><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 
204, 204); padding-left: 1ex;"><div style="word-wrap: break-word;" class=""><div dir="auto" style="word-wrap: break-word;" class="">Hi River,</div><div dir="auto" style="word-wrap: break-word;" class=""><br class=""><div class=""><span class=""><blockquote type="cite" class=""><div class="">On Jul 24, 2017, at 2:36 PM, River Riddle <<a href="mailto:riddleriver@gmail.com" target="_blank" class="">riddleriver@gmail.com</a>> wrote:</div><br class="m_6269528974377576643m_-3236575054360144803Apple-interchange-newline"><div class=""><div dir="ltr" class="">Hi Quentin,<div class=""> I appreciate the feedback. When I reference the cost of Target Hooks it's mainly for maintainability and cost on a target author. We want to keep the intrusion into target information minimized. The heuristics used for the outliner are the same used by any other IR level pass seeking target information, i.e TTI for the most part. I can see where you are coming from with "<span style="font-size: 12.8px;" class="">having heuristics solely focused on code size do not seem realistic", but I don't agree with that statement.</span></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">If you only want code size I agree it makes sense, but I believe, even in Oz, we probably don’t want to slow the code by a big factor for a couple bytes. That’s what I wanted to say and what I wanted to point out is that you need to have some kind of model for the performance to avoid those worst cases. Unless we don’t care :).</div></div></div></div></blockquote><div class=""><br class=""></div><div class="">That's why we have threshold though, don't we? 
</div></div></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">When I see a threshold, I think magic number, and I don’t like that.</div><span class=""><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><div class="">Also the IR makes it easy to connect to PGO, which allows focusing the outlining on "cold" regions and preserving good performance.</div><div class="">River: did you consider this already? Having a good integration with PGO could make this part of the default optimization pipeline (i.e., having a mode where we outline only the code known to be "cold").</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""> </div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div style="word-wrap: break-word;" class=""><div dir="auto" style="word-wrap: break-word;" class=""><div class=""><span class=""><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><span style="font-size: 12.8px;" class="">I think there is a disconnect on heuristics. The only user-tunable parameters are the lower-bound parameters (to the cost model); the actual analysis (heuristic calculation) is based on TTI information. </span></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">I don’t see how you can get around adding more hooks to know how a specific function prototype is going to be lowered (e.g., i64 needs to be split into two registers, fourth and onward parameters need to be pushed on the stack and so on). Those change the code size benefit.</div></div></div></div></blockquote><div class=""><br class=""></div><div class="">How is the inliner doing? 
How are we handling Oz there?</div><div class="">If we are fine living with approximation for the inliner, why wouldn't the same work for an outliner?</div></div></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">Unlike inlining, outlining does not expose optimization opportunities.</div><span class=""><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""><br class=""></div><div class=""> </div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div style="word-wrap: break-word;" class=""><div dir="auto" style="word-wrap: break-word;" class=""><div class=""><span class=""><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><span style="font-size: 12.8px;" class="">There are several comparison benchmarks given in the "More detailed performance data" of the original RFC. It includes comparisons to the Machine Outliner when possible(I can't build clang on Linux with Machine Outliner). I welcome any and all discussion on the placement of the outliner in LLVM.<br class=""></span></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">My fear with a new framework is that we are going to split the effort for pushing the outliner technology forward and I’d like to avoid that if at all possible.</div></div></div></div></blockquote><div class=""><br class=""></div><div class="">It isn't clear to me that implementing it at the MachineLevel was the right trade-off in the first place.</div></div></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">Fair enough. 
It has the advantage of not relying on a heuristic for its cost model, though.</div><span class=""><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><div class="">I'm not sure a full comparative study was performed and discussed upstream at the time when the MachineIR outliner was implemented? If so it wouldn't be fair to ask this to River now.</div></div></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">I am not asking that :).</div><span class="HOEnZb"><font color="#888888" class=""><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""><br class=""></div><div class="">-- </div><div class="">Mehdi</div></div></div></div></div></blockquote></font></span></div></div></blockquote></div></div></div></blockquote></div><br class=""></body></html>