[PATCH] Review for hoisting and sinking of equivalent memory instruction (Instruction Merge Pass)
Gerolf Hoflehner
ghoflehner at apple.com
Thu Jun 19 14:27:34 PDT 2014
For the original implementation (within GVN) I didn’t see compile-time issues. That should not have changed now that the code is in a separate pass, but I’ll collect data for the latest version.
The biggest gain on the LLVM test suite was 2-3% in mcf.
-Gerolf
On Jun 19, 2014, at 7:13 AM, Daniel Berlin <dberlin at dberlin.org> wrote:
> Speaking of which, maybe I missed it, but do you have any numbers for
> compile time or performance impact on real program compilation?
>
>
> On Wed, Jun 18, 2014 at 8:55 PM, Gerolf Hoflehner <ghoflehner at apple.com> wrote:
>> Thanks Daniel & Tobias.
>> I think at this point limiting the number of checks and loads makes sense to
>> play it safe for compile-time.
>>
>>
>> -Gerolf
>>
>>
>>
>>
>> On Jun 18, 2014, at 1:47 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
>>
>> On Wed, Jun 18, 2014 at 12:56 PM, Tobias Grosser <tobias at grosser.es> wrote:
>>
>> On 18/06/2014 21:47, Daniel Berlin wrote:
>>
>>
>> FWIW: There is no easy way to do this in O(n) for stores in LLVM, due to
>> the lack of something like memory-ssa (otherwise, you could sink to
>> the nearest common dominator of all immediate uses, as we do for GCC).
>> You can do it in O(n), or much closer to it, in LLVM for loads, like this:
>>
>> Assuming GVN and PRE have been run, all loads that can be determined to
>> be identical should look identical (if not, our GVN is seriously
>> busted :P) in their operands[1]
>>
>> pending = hash table of <block, load operands> -> list of load instructions
>>
>> for each load in the diamond:
>>   calculate sink location as nearest common dominator of:
>>     for each dependency according to memdep, the block of that dependency
>>     for each RHS operand, the defining block of that operand.
>>   pending[<block, load operands>].insert(load instruction)
>>
>> for each entry in pending:
>>   if (list.size() > 1)
>>     Perform merge and hoist to end of common dominator block
>>
>>
>> This would also be even easier if GVN produced a value number or value
>> handle for each thing, like GCC's (then it wouldn't matter if they
>> looked identical, only if they calculate the same value), but c'est la
>> vie.
>>
>> [1] The only case this wouldn't be true is if the load was defined by
>> operands in the diamond, in which case you couldn't hoist it out of
>> the diamond anyway without a real load PRE determining whether you
>> could move/recalculate the operands.
>>
>>
>>
>> Thanks Daniel!
>>
>> Gerolf, even if we don't get this algorithm to a linear run-time, does it
>> make sense to bound the number of checks so that we don't get a quadratic
>> increase in compile time for those corner cases, and instead just don't
>> optimize them?
>>
>>
>> +1
>> Little passes like this, created because other infrastructure
>> currently sucks and needs serious work, seem somewhat inevitable as
>> temporary things, but they do eat at compile time, so it's always good
>> to do what you can to limit impact, even if it means not catching
>> everything.
>>
>>
>>
>> Tobias
>>
>>
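
For readers following the thread, the load-merging pseudocode quoted above can be sketched as a small, self-contained Python model. This is only an illustration under stated assumptions, not the code from the patch and not LLVM API: the `Load` record, the `idom` map (each block's immediate dominator), the block names, and the `max_loads` cap (standing in for the "limit the number of checks and loads" idea discussed in the thread) are all hypothetical. It follows the quoted steps literally and does not check whether the chosen block is a legal placement for the load.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    """Hypothetical stand-in for a load instruction (not an LLVM type)."""
    name: str
    block: str             # block the load currently lives in
    operands: tuple        # operand names, e.g. ("%p",)
    operand_blocks: tuple  # defining block of each operand
    dep_blocks: tuple      # blocks of the dependencies memdep would report


def depth(idom, block):
    """Number of dominator-tree edges from the root down to `block`."""
    d = 0
    while idom[block] is not None:
        block = idom[block]
        d += 1
    return d


def nearest_common_dominator(idom, a, b):
    """Walk both blocks up the dominator tree until they meet."""
    da, db = depth(idom, a), depth(idom, b)
    while da > db:
        a, da = idom[a], da - 1
    while db > da:
        b, db = idom[b], db - 1
    while a != b:
        a, b = idom[a], idom[b]
    return a


def find_mergeable_loads(idom, loads, max_loads=None):
    """Group loads by (target block, operands); keep groups with > 1 load."""
    pending = defaultdict(list)
    for i, load in enumerate(loads):
        if max_loads is not None and i >= max_loads:
            break  # assumed compile-time cap: stop looking, skip the rest
        blocks = load.dep_blocks + load.operand_blocks
        if not blocks:
            continue
        target = blocks[0]
        for blk in blocks[1:]:
            target = nearest_common_dominator(idom, target, blk)
        pending[(target, load.operands)].append(load)
    return {key: group for key, group in pending.items() if len(group) > 1}


# A diamond: head branches to then/else, which rejoin at join.
idom = {"entry": None, "head": "entry", "then": "head",
        "else": "head", "join": "head"}
loads = [
    Load("a", "then", ("%p",), ("head",), ("head",)),
    Load("b", "else", ("%p",), ("head",), ("head",)),
    Load("c", "else", ("%q",), ("entry",), ("head",)),
]
merges = find_mergeable_loads(idom, loads)
for (block, operands), group in merges.items():
    print(block, operands, [ld.name for ld in group])
```

Each load is visited once and hashed into `pending`, so the grouping itself is linear in the number of loads, with the dominator-tree walk as the remaining per-load cost. In this toy diamond only loads `a` and `b` share a key, so they are the only merge candidates.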