[PATCH] [x86] generalize reassociation optimization in machine combiner to 2 instructions

Fri Jun 19 11:46:43 PDT 2015

Phab is slow/down, so sending email to list...

On Thu, Jun 18, 2015 at 7:57 PM, Gerolf Hoflehner <ghoflehner at apple.com>
wrote:

> The code looks pretty good, but I'd like to understand better why the new
> code investigates more patterns.
>
...

> Allowing more than one pattern was part of the original design. What
> confused/confuses me is that in your old code you checked if operands had
> to be commuted in Root and Prev. But now the code only checks Root and
> potentially investigates two code sequences instead of one. Isn't that more
> expensive? And given that the order of the operands in Prev is not checked
> now, should there be a change in reassociateOps() addressing that?
>
>
Yes, it's definitely more expensive to investigate two patterns rather than
the one we did previously.

Before this patch, we would only match a 3 instruction sequence:
A = ? add ?
B = A add X
C = B add Y

So we could use the pattern match itself to determine if we need to commute
the operands of the instruction that defines 'B' (aka 'Prev' in the code).

With this patch, we no longer assume anything about the instruction that
defines 'A'. So we have to test both operands of Prev to see if there's a
benefit to doing any reassociation. So for example, in the additional test
case which has a fdiv: the pattern matcher does not know that one operand
of Prev is being produced by a slow fdiv while the other is ready to go
(the other operand is an input parameter to the function).

We could add code to the pattern matcher to look at 'A' again, but I think
this would end up being exactly like the core logic in the MachineCombiner
pass: we need to determine which operand of Prev is on the critical path
and then decide if it's worth commuting the operands.

Here's the machine inst sequence that I'm referring to if that makes it any
clearer:

    %vreg3<def> = COPY %XMM3; FR32:%vreg3
    %vreg2<def> = COPY %XMM2; FR32:%vreg2
    %vreg1<def> = COPY %XMM1; FR32:%vreg1
    %vreg0<def> = COPY %XMM0; FR32:%vreg0
    %vreg4<def> = VDIVSSrr %vreg0, %vreg1; FR32:%vreg4,%vreg0,%vreg1
    %vreg5<def> = VADDSSrr %vreg2, %vreg4<kill>; FR32:%vreg5,%vreg2,%vreg4
    %vreg6<def> = VADDSSrr %vreg3, %vreg5<kill>; FR32:%vreg6,%vreg3,%vreg5

Maybe there's some shortcut MI query that I can use to prune the search
space? In this example, we have a copied operand (%vreg2) feeding into the
first add. Would it be safe to simply ignore that possibility (ie, assume
the copy does not add to the critical path)?

>
>
>
>
http://reviews.llvm.org/D10460
>
> EMAIL PREFERENCES
>   http://reviews.llvm.org/settings/panel/emailpreferences/
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150619/cfb0b7f8/attachment.html>