[PATCH] D30920: Do not pass -Os and -Oz to the Gold plugin

Hal Finkel via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Mar 24 12:43:33 PDT 2017


hfinkel added a comment.

In https://reviews.llvm.org/D30920#703305, @mehdi_amini wrote:

> The fundamental difference is that Os/Oz especially are treated as `optimization directives` that are independent of the pass pipeline: instructing that "loop unroll should not increase size" is independent of *where* loop unrolling is inserted in the pipeline.

> The issue with O1 vs O2 is that the *ordering* of the passes changes, not only the thresholds to apply.

Maybe we should stop here and ask: Is this really a *fundamental* difference? Or is this just a difference in how we handle Os/Oz today? Is there a fundamental reason why we might not want a different order for Oz vs. O2 if we want one for O2 vs. O3?
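
For reference, with the new pass manager the default pipelines are already built from a single entry point parameterized by the level; roughly (today's `PassBuilder` API, simplified, and assuming the usual analysis-manager setup elsewhere):

```
#include "llvm/Passes/PassBuilder.h"

using namespace llvm;

// One entry point serves O1/O2/O3; the level argument decides both
// which passes get scheduled and what thresholds they use.
ModulePassManager buildDefaultPipeline(PassBuilder &PB) {
  return PB.buildPerModuleDefaultPipeline(OptimizationLevel::O2);
}
```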

My view is that, no, there is no fundamental difference. Why? Because each optimization level has a meaning, and that meaning can almost always be applied per function (a sketch after the list below makes this concrete):

- `Oz` - Make the binary as small as possible
- `Os` - Make the binary small while making only minor performance tradeoffs
- `O0` - Compile fast, and maybe maximize the ability to debug
- `O1` - Make the code fast while making only minor debuggability tradeoffs
- `O2` - Make the code fast - perform only transformations that will speed up the code with near certainty
- `O3` - Make the code fast - perform transformations that will speed up the code with high probability
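
Concretely, the size-oriented half of this already has a per-function encoding: Clang lowers -Os and -Oz to the `optsize` and `minsize` function attributes in the IR, and any pass can consult them no matter where it sits in the pipeline. A minimal sketch (the bound values are illustrative, not LLVM's actual unrolling heuristics):

```
#include "llvm/IR/Attributes.h"
#include "llvm/IR/Function.h"

using namespace llvm;

// Illustrative: derive a size budget from per-function attributes
// instead of from which pipeline the pass was scheduled into.
static unsigned getUnrollBound(const Function &F) {
  if (F.hasFnAttribute(Attribute::MinSize))
    return 1; // -Oz: never grow the function
  if (F.hasFnAttribute(Attribute::OptimizeForSize))
    return 2; // -Os: allow only trivial growth
  return 8;   // hypothetical default for the speed-oriented levels
}
```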

Believing that we can implement this model primarily through pass scheduling has proved false in the past (except for -O0 without LTO) and won't be any more true in the future. We essentially have one optimization pipeline, and I see no reason to assume this will change. It seems significantly more effective to have the passes become optimization-level aware than to implement optimization through changes in the pipeline structure itself. Especially in the context of the new pass manager, where the cost of scheduling a pass that will exit early should be small, I see no reason to alter the pipeline at all. For one thing, many optimizations are capable of performing several (or many) different transformations, and these often don't all fall into the same categories. In that sense, CGSCC- and function-scope passes are not really different from function/loop-scope passes.
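
As a sketch of what "optimization-level aware" could look like under the new pass manager - assuming a hypothetical per-function "opt-level" string attribute, which does not exist today (`optnone`/`optsize`/`minsize` are the closest precedents):

```
#include "llvm/IR/Function.h"
#include "llvm/IR/PassManager.h"

using namespace llvm;

struct SpeedOnlyPass : PassInfoMixin<SpeedOnlyPass> {
  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) {
    // Hypothetical per-function tag; the check is cheap, so scheduling
    // the pass unconditionally and exiting early costs almost nothing.
    if (F.hasFnAttribute("opt-level") &&
        F.getFnAttribute("opt-level").getValueAsString() == "O0")
      return PreservedAnalyses::all();
    // ... the speed-oriented transformation itself would go here ...
    return PreservedAnalyses::none();
  }
};
```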

As such, I think that we should tag functions/modules with optimization levels so that users can control the optimization level effectively even in the case of LTO. We could even add pragmas allowing users to control this at finer granularity, and that would be a positive thing (I dislike that we currently force users to put functions in different files to get different optimization levels - my users find this annoying too). We need to give users a simple model to follow: optimization is associated with compiling (even if we do, in fact, delay it until link time).
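
Clang's existing `#pragma clang optimize off`/`on` (which marks the enclosed function definitions `optnone`) is a coarse precedent; a finer-grained pragma might look like the following - the `level(...)` form is purely hypothetical:

```
#pragma clang optimize off   // exists today: functions below get optnone
void log_everything(void) { /* keep this easy to debug */ }
#pragma clang optimize on

// Hypothetical finer-grained form (not implemented anywhere):
// #pragma clang optimize level("Oz")
// void packed_helper(void) { /* would be tagged minsize */ }
```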

> Also this wouldn't work with an SCC pass, as different functions in the SCC can have different levels; it gets quite tricky. It also becomes problematic for any module-level pass: `Dead Global Elimination` would need to leave out global variables that come from a module compiled with O1, same with `Merge Duplicate Global Constants` (I think the issue with `GlobalVariable` also affects Os/Oz, by the way).

Yes, we should do this. I don't understand why this is tricky. Actually, I think making these kinds of decisions explicit in the code of the transformations would be a positive development.
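
To make it concrete, here is what such an explicit decision could look like in a `Dead Global Elimination`-style pass - the "module-opt-level" flag is an assumed convention, not something modules carry today:

```
#include "llvm/IR/Constants.h"
#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/Metadata.h"
#include "llvm/IR/Module.h"

using namespace llvm;

// Assumed convention: each module records the level it was compiled at
// in a module flag, e.g. "module-opt-level" = 1 for -O1.
static bool mayEliminate(const GlobalVariable &GV) {
  const Module *M = GV.getParent();
  if (auto *Level = mdconst::extract_or_null<ConstantInt>(
          M->getModuleFlag("module-opt-level")))
    if (Level->getZExtValue() < 2)
      return false; // from an O0/O1 module: leave the global alone
  return true;      // otherwise, normal dead-global elimination applies
}
```

After LTO merges everything into one module, this information would have to live on the `GlobalVariable` itself rather than on the module, which is exactly the `GlobalVariable`-tagging problem the quote alludes to.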


https://reviews.llvm.org/D30920




