[llvm-bugs] [Bug 30657] New: DAG-Level reassociation makes expression trees deeper

Mon Oct 10 10:54:46 PDT 2016

https://llvm.org/bugs/show_bug.cgi?id=30657

            Bug ID: 30657
           Summary: DAG-Level reassociation makes expression trees deeper
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Common Code Generator Code
          Assignee: unassignedbugs at nondot.org
          Reporter: mkuper at google.com
                CC: llvm-bugs at lists.llvm.org
    Classification: Unclassified

As part of the DAG combiner's efforts to perform constant-folding, we have the
following two combines:

// (op (op x, c1), y) -> (op (op x, y), c1) iff x+c1 has one use
// (op x, (op y, c1)) -> (op (op x, y), c1) iff x+c1 [sic] has one use

These combines take constants and bring them closer to the expression tree's
root. This works well if we have several constants in the expression tree, but
makes the tree deeper for no good reason if there's only one constant - which
serves to reduce available ILP.

For example, consider:

define i32 @better(i32 %a, i32 %b) {
  %r1 = xor i32 %a, 14
  %r2 = xor i32 %b, 11
  %r3 = xor i32 %r1, %r2
  ret i32 %r3
}

define i32 @worse(i32 %a, i32 %b, i32 %c) {
  %r1 = xor i32 %a, 14
  %r2 = xor i32 %b, %c
  %r3 = xor i32 %r1, %r2
  ret i32 %r3
}

With the current reassociation step we get, with bin/llc -mtriple=x86_64
-mcpu=nehalem: 
(Note that without specifying a CPU we get the same bad re-association from the
machine combiner, but that's justifiable, since without scheduling information,
it can't make a reasonable decision):

better:
    xorl    %esi, %edi
    xorl    $5, %edi
    movl    %edi, %eax
    retq

worse:
    xorl    %edx, %esi
    xorl    %edi, %esi
    xorl    $14, %esi
    movl    %esi, %eax
    retq

Without reassociation, we get:

better:
    xorl    $14, %edi
    xorl    $11, %esi
    xorl    %edi, %esi
    movl    %esi, %eax
    retq

worse:
    xorl    $14, %edi
    xorl    %edx, %esi
    xorl    %edi, %esi
    movl    %esi, %eax
    retq

So, for the 'worse' case, we now have a 3-instruction critical path, instead of
a 2-instruction one.

I'm not really sure what we should do about this. The three options I see:

1) Stop doing re-association in DAG. We already perform reassociation in the
IR, and should come out of the IR with good, balanced trees. Lowering may
introduce additional constants, but at least the common case of GEP lowering
doesn't seem to create patterns that require this transformation.

2) Do better re-association in DAG - try to push all constants together, but,
say, towards, the rightmost leaf, instead of towards the root. I don't quite
see how to do this with local transformations, though.

3) Leave the DAG combine as is, and let the MachineCombiner undo this
per-target. I think that's not very appealing. While, in a sense, this is the
MachineCombiner's job - what the DAG combiner does seems decidedly
"non-canonical" - it's intentionally making the expression tree worse. 
More practically speaking, fixing this post-selection will not be very easy -
for example, on x86, we may need to undo LEA selection.

Any thoughts?

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20161010/59f7dde4/attachment-0001.html>