<div dir="ltr">Hi all,<div><br></div><div>I like all the ideas so far. Here are my thoughts:</div><div><br></div><div>I think that fundamentally users of LLVM should be able to opt in to more aggressive or intensive computation at compile time if they wish. Users' needs differ, and while a 33% increase in peak memory usage when LTO'ing clang is absolutely out of the question for some people, for those developing microcontrollers or HPC applications that may well be irrelevant. Either the volume of code expected is significantly smaller or they're happy to spend more at compile time to save expensive server time. That does not mean that we shouldn't strive for a solution that is acceptable to all users. On the other hand, making something opt-in makes it non-default, and that increases the testing surface.</div><div><br></div><div>Tangentially, I think that LLVM currently doesn't have the right tuning knobs to allow the user to select their desired tradeoff. We have one optimization flag -O{s,z,0,1,2,3} which encodes both the optimization *goal* (a point on the Pareto curve between size and speed) and the amount of effort to expend at compile time achieving that goal. Anyway, that's beside the point.</div><div><br></div><div>I like Justin's idea of removing IR from the backend to free up memory. I think it's a very long-term project though, one that requires significant (re)design; alias analysis access in the backend would be completely broken, because BasicAA, among others, depends on seeing the IR at query time. We'd need to work out a way of providing alias analysis with no IR present. I don't think that is feasible for the near future.</div><div><br></div><div>So my suggestion is that we go with Matthias' idea - do the small amount of refactoring needed to allow MachineModulePasses on an opt-in basis. 
The knobs to enable that opt-in might need some more bikeshedding.</div><div><br></div><div>Cheers,</div><div><br></div><div>James</div></div><br><div class="gmail_quote"><div dir="ltr">On Tue, 19 Jul 2016 at 08:21 Justin Bogner <<a href="mailto:mail@justinbogner.com">mail@justinbogner.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">James Molloy via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> writes:<br>

> In LLVM it is currently not possible to write a Module-level pass (a pass that<br>

> modifies or analyzes multiple MachineFunctions) after DAG formation. This<br>

> inhibits some optimizations[1] and is something I'd like to see changed.<br>

><br>

> The problem is that in the backend, we emit a function at a time, from DAG<br>

> formation to object emission. So no two MachineFunctions ever exist at any one<br>

> time. Changing this necessarily means increasing memory usage.<br>

><br>

> I've prototyped this change and have measured peak memory usage in the worst<br>

> case scenario - LTO'ing llc and clang. Without further ado:<br>

><br>

>   llvm-lto llc:   before: 1.44GB maximum resident set size<br>

>                   after:  1.68GB (+17%)<br>

><br>

>   llvm-lto clang: before: 2.48GB maximum resident set size<br>

>                   after:  3.42GB (+33%)<br>

><br>

> The increases are very large. This is worst-case (non-LTO builds would see the<br>

> peak usage of the backend masked by the peak of the midend) but still - pretty<br>

> big. Thoughts? Is this completely no-go? Is this something that we *just need*<br>

> to do? Is crippling the backend architecture to keep memory down justified? Is<br>

> this something we could enable under an option?<br>

<br>

Personally, I think this price is too high. I think that if we want to<br>

enable machine module passes (which we probably do) we need to turn<br>

MachineFunction into more of a first class object that isn't just a<br>

wrapper around IR.<br>

<br>

This can and should be designed to work something like Pete's solution,<br>

where we get rid of the IR and just have machine level stuff in memory.<br>

This way, we may still increase the memory usage here, but it should be<br>

far less dramatic.<br>

<br>

You'll note that doing this also has tangential benefits - it should be<br>

helpful for simplifying MIR and generally improving testability of the<br>

backends.<br>

</blockquote></div>