[LLVMdev] Making optimization passes do less
Chris Lattner
sabre at nondot.org
Tue May 27 23:25:16 PDT 2008
On May 26, 2008, at 4:23 AM, Matthijs Kooijman wrote:
> I'm currently struggling with a few optimization passes that change
> stuff I don't want to be changed.
Hehe ok.
> However, for the most part those passes (InstructionCombining
> and SimplifyCFG currently) do stuff that I do want, so disabling them
> altogether doesn't help me much.
Ok.
> The problem arises because the architecture I'm compiling for is quite
> non-standard. In particular, it has the ability to execute a lot of
> instructions in parallel, but at the same time can't execute
> everything you throw at it.
Ok, that is odd :)
> My problem with SimplifyCFG is the following: Whenever the if and
> else branch start with the same instruction, it gets hoisted up into
> the predecessor block. For my architecture, instructions in different
> blocks can't be run in parallel, so this optimization makes code
> either very inefficient or not compile at all.
There are two different issues here. Passes like instcombine and
simplifycfg [which is really "basic block combine" :) ] do two things:
1. They make changes that are clear wins, e.g. deleting unconditional
branches and noop instrs.
2. They change code into more canonical form.
Merging repeated instructions is an important canonicalization because
it can unlock other optimizations. The fact that your target doesn't
like code in this form is not a good reason for simplifycfg to stop
doing it. :)
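To make the hoisting case concrete, here is a toy sketch in Python. It is
only an illustration, not how simplifycfg is implemented: blocks are
modelled as lists of instruction strings, "identical" means textually
equal, and the function name hoist_common_prefix is made up for this
example.

```python
def hoist_common_prefix(pred, then_block, else_block):
    """Toy model of hoisting: blocks are lists of instruction strings and
    the last entry of each block is its terminator."""
    n = 0
    # Count identical leading instructions, never touching a terminator.
    while (n < len(then_block) - 1 and n < len(else_block) - 1
           and then_block[n] == else_block[n]):
        n += 1
    shared = then_block[:n]
    # Shared instructions go just before the predecessor's branch.
    new_pred = pred[:-1] + shared + pred[-1:]
    return new_pred, then_block[n:], else_block[n:]


pred = ["c = icmp eq a, 0", "br c, then, else"]
then_block = ["t = add a, b", "x = mul t, 2", "br join"]
else_block = ["t = add a, b", "y = mul t, 3", "br join"]

new_pred, new_then, new_else = hoist_common_prefix(pred, then_block, else_block)
```

After the call, the shared add sits in the predecessor and each branch
keeps only its own multiply, which is the canonical form later passes get
to see.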
> InstructionCombining has this habit of removing unneeded bits from
> constants. For example, if I do i & 63, where i is a loop counter that
> is always even, this gets replaced by i & 62. Which gives, of course,
> the same results when interpreted, but our backend cannot just use any
> constant as an & mask (in particular, it can only use a limited number
> of them).
Sure, this is another example of canonicalization. Are you using the
LLVM code generator? It has support for handling this specifically.
ARM and Alpha in particular have special instructions that only work
with very specific "and" masks. If you write a pattern/instruction that
matches (and myreg, 255) for example, this will match a dag node for
"(and myreg, 16)" if the code generator knows that the other bits are
already zero.
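The idea behind that matching can be sketched with plain integers. This
is a toy model under my own assumptions, not the code generator's actual
implementation: and_mask_matches is a hypothetical helper, and known_zero
stands for a bitmask of the bits the compiler has proven to be zero in
the operand.

```python
def and_mask_matches(pattern_mask, actual_mask, known_zero):
    """Return True if (and x, pattern_mask) computes the same value as
    (and x, actual_mask) for every x whose known_zero bits are zero."""
    # The actual mask must not keep any bit that the pattern would clear.
    if actual_mask & ~pattern_mask:
        return False
    # Bits the pattern keeps but the actual mask clears must be known zero.
    needed = pattern_mask & ~actual_mask
    return (needed & known_zero) == needed


# i is always even, so bit 0 is known zero: a pattern written for
# "and i, 63" can also cover the canonicalized "and i, 62".
matches_when_even = and_mask_matches(63, 62, known_zero=0b1)
matches_when_unknown = and_mask_matches(63, 62, known_zero=0)
```

Here matches_when_even is True and matches_when_unknown is False: without
the fact that bit 0 is zero, the two masks are not interchangeable.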
> I'd very much prefer to preserve the original value from the source
> here (I also assume that this optimization is in place to help further
> optimizations, because I can't really see any use of this change on
> regular architectures...).
This is folly. If the user wrote the code in the "optimized" form
that instcombine transforms it into, your code generator should still
produce the optimized instructions. You're trading one missed
optimization for another one.
> I've been thinking a bit on how to achieve this, and I see a few
> options:
> None of these options seem too attractive to me, what do others
> think? Is there some other option I'm missing here?
I really don't like any of these options. The best ways to go are:
1) teach your code generator how to do these optimizations, reversing
the cases that you care about.
2) if #1 isn't feasible, write a canonicalization prepass (like
codegen prepare) that transforms code from the "canonical optimizer
form" into a happy form for your target.
-Chris