<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><div>Hi Hal,</div><div><br class=""></div><div>Just a disclaimer, I’m rearranging your comments below, I hope you don’t mind:</div><div><br class=""></div><div><blockquote type="cite" class=""><div class=""><blockquote type="cite" cite="mid:FC5BF5FC-A6C4-47E5-B803-3FB42C1CA71F@nondot.org" class=""><div class="">lots of loop optimizers exist that don’t use red/green trees.</div></blockquote><p class="">I've worked with many compilers, some from vendors known for having strong loop optimizers, and none of them would meet our goals for performance, robustness, and modeling accuracy (e.g., not regressing performance in many cases). Our goal here is to significantly improve on the state of the art.<br class=""></p></div></blockquote><div class=""><div class=""><p class="">Agreed, that is the right goal, no disagreement on that! I also agree that phase ordering is a very difficult problem to solve, and that red/green trees are one interesting approach to help with them.</p><div class=""><br class=""></div></div></div></div><div><blockquote type="cite" class=""><div class=""><blockquote type="cite" cite="mid:FC5BF5FC-A6C4-47E5-B803-3FB42C1CA71F@nondot.org" class=""><div class="">Furthermore, my experience is that specialized IRs never get the investment in (e.g.). testing, location propagation for debugging optimized code, textual round tripping, pass management, and the multitude of other things that are required for a world class compiler implementation.</div></blockquote><p class="">I agree, but I think that these are well-known requirements within this ecosystem. What this means also depends on context (it might be more helpful to think of this more like SCEV than like a custom IR).</p></div></blockquote></div><div>SCEV is a great example of the problem that I’m talking about here. SCEV cannot be easily tested, because the IR doesn’t round trip to a file-check-able textual representation, and cannot have passes applied to it. This is barely acceptable for SCEV, because there are workarounds for the simple cases, but would be extremely problematic for something as critical as the IR for a loop optimizer as a whole. SCEV also doesn’t propagate location information, and has several other problems.</div><div><br class=""></div><div>The IR for a loop framework is going to require other duplications and have other problems, e.g. you’ll want a pass manager etc.</div><div><br class=""></div><div><blockquote type="cite" class=""><div class="">On Feb 5, 2020, at 4:51 PM, Hal Finkel <<a href="mailto:hfinkel@anl.gov" class="">hfinkel@anl.gov</a>> wrote:</div><div class=""><div class=""><blockquote type="cite" cite="mid:FC5BF5FC-A6C4-47E5-B803-3FB42C1CA71F@nondot.org" class=""><div class="">If I understand your claims, you are claiming both
that red/green trees are essential for a loop optimizer, and
that this essential nature “justifies” the cost of reinventing
an entire compiler infrastructure is lower than the benefit of
using (e.g.) MLIR to do this. I haven’t seen evidence of either
point:</div></blockquote><p class="">Michael and I have discussed this offilne, and I think that more
quantitative information here would be helpful. One phrasing might
be: For a reasonable test suite of programs, for the loops we
might optimize, how many different loop-nest variants do we need
to represent for costing purposes (also note that, depending on
profiling information and/or heuristics, we may need to cost at
multiple trip-count values)? Given that, what is the relative cost
of creating deep copies of the loop nests and transforming them
vs. the cost of using a red/green tree? Does that cost seem
significant?</p><p class="">We don't want to invest significant effort into a infrastructure
that we know ahead of time won't scale sufficiently.<br class="">
</p></div></div></blockquote><div>Again, as I mentioned above, I completely agree with your goals. That said, I have a few concerns about this:</div><div><br class=""></div><div>1) My understanding is that this approach is more of a research project that needs to be proven, than an implemented design the entire LLVM community should back as “the plan". If you’d like to pursue this, then I think the right way to go is to do some significant implementation work to build it out, see how it works, get experience with it, then propose inclusion as “the” loop optimizer when it can be measured empirically. This is effectively what Polly did.</div><div><br class=""></div><div>2) On MLIR, in addition to being a nice technical solution to the IR data structures and infrastructure backing it, a large slice of the MLIR community has exactly the same goals as you propose. They are literally building loop transformation technology driving state of the art forward, and they also have phase ordering issues :-). These folks have lots of advanced thoughts on this, and a lot of expertise in vectorization, memory hierarchy optimizations, accelerators, HPC, and low-level code generation etc. Leveraging this work makes sense to me, and duplicating it seems wasteful.</div><div><br class=""></div><div>3) On the technical point design of red/green trees, my personal opinion is that this isn’t solving the important part of the problem. Red/green trees (as far <a href="https://llvm.org/devmtg/2018-10/slides/Kruse-LoopTransforms.pdf" class="">as I’m aware</a>) are a memory saving optimization. However, “solving" the phase ordering issues require exploring an exponential (e.g. in depth of transformations / pass pipeline) search space problem (which is expensive in cycles, not just memory), and it relies on having a very good cost model for the generated code (which is a universally hard problem). Furthermore, on the cost model point, the performance of the code depends on the later “non loop optimizations” as well, which are not going to get changed to use this new IR.</div><div><br class=""></div><div>4) While red/green trees are interesting, there are other approaches to tackle these problems. However, they are also researchy and need to be proven. I’m happy to explain my thoughts on this if you are interested.</div><div><br class=""></div><div><br class=""></div><div>Overall, I am super enthusiastic about new research directions, but I think they should be done along the lines of Polly, where the “new thing” gets built and can then be quantitatively evaluated based on its merits in practice, rather than being an up-front decision based on ideas. I am also a bit concerned that this effort is highly duplicative of work already happening in the MLIR community, but duplication isn’t necessarily bad when the path forward isn’t perfectly clear. In either case, I’d love to see more cross pollination between the efforts!</div><div><br class=""></div><div>-Chris</div><div><br class=""></div><div><br class=""></div></div><div><br class=""></div><div><br class=""></div><br class=""></body></html>