[PATCH] D45225: [WIP] Add IR function attributes to represent codegen optimization level

Wed Apr 4 13:57:11 PDT 2018

efriedma added inline comments.

================
Comment at: lib/IR/Module.cpp:91

+CodeGenOpt::Level Module::getCodeGenOptLevel() const {
+  if (FunctionList.empty())
----------------
mcrosier wrote:
> efriedma wrote:
> > mcrosier wrote:
> > > efriedma wrote:
> > > > This doesn't make any sense.  If it's a per-function attribute, we should be computing it per function.  The maximum level set on any function doesn't have any useful meaning.
> > > The goal here is to approximate the codegen optimization level during LTO based on function level attributes.  Per the RFC, after r324557 the codegen optimization level for LTO is always CodeGenOpt::Default and is not configurable AFAICT.  However, the codegen optimization level has implications on how the codegen pipelines works (e.g., fast-isel vs selection dag, greedy vs. fast register allocator), which is at least one reason to try an approximate the codegen opt level (see the use of this function in lib/LTO/LTO.cpp).
> > > 
> > > On why I pick the max.. my thought is that if we're compiling with LTO the user is probably willing to aggressively optimize, so the heuristic chooses the highest optimization level...
> > > 
> > > 
> > > 
> > Whether we use fast-isel vs. selectiondag or greedy vs. fast regalloc are choices we could, and should, make at a per-function level.  Actually, we already choose between fast-isel vs. selectiondagisel on a per-function level, depending on whether optnone is specified.
> > 
> > I think the right design here looks very different: "optaggressive" should just be a hint that we want a different codesize/efficiency tradeoff, like "optsize".  Then we should fix optimizations to use that hint instead of the global OptLevel.  (I only count four places in lib/CodeGen/ and lib/Target/AArch64/ which actually distinguish between -O2 and -O3, so this shouldn't be difficult to implement.)
> I completely agree that part of the fix is to have the optimizers use the function attributes as hints, rather than use the TargetPass optimization level.  However, this doesn't address the fact that the compilation pipeline is (currently) based on some global notion of the optimization level (e.g., several passes, including RA, are either excluded or behave differently because of CodeGenOpt::None and for AArch64 the SeparateConstOffsetFromGEPPass is only run with CodeGenOpt::Aggressive).  This is what this bit of logic is trying to accomplish (albeit in a rather displeasing way).  In the long term, that might not be how the codegen pass pipeline works, but TBH I'm not interested in pursuing that fix...
> 
> Perhaps just adding the function attributes and then modifying the function-level passes to use the function attributes rather than the TargetPass optimization level is a reasonable first step.  This does at least fix the particular case I care about (i.e., enabling tail duplication during block placement).
Of course you don't have to fix everything at once; I understand some bits are sort of complicated.  I just want to make sure we're moving in the right direction; we should be deprecating CodeGenOpt::Level, not building more infrastructure around it.

https://reviews.llvm.org/D45225