[llvm-bugs] [Bug 26063] New: Significant performance regression with r256890

Thu Jan 7 06:20:54 PST 2016

https://llvm.org/bugs/show_bug.cgi?id=26063

            Bug ID: 26063
           Summary: Significant performance regression with r256890
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: Common Code Generator Code
          Assignee: unassignedbugs at nondot.org
          Reporter: james.molloy at arm.com
                CC: dan433584 at gmail.com, llvm-bugs at lists.llvm.org
    Classification: Unclassified

Created attachment 15577
  --> https://llvm.org/bugs/attachment.cgi?id=15577&action=edit
Reproducer to show the actual output difference

We've noticed a 17% regression in an important third-party benchmark, and have
bisected it to:

Author: Dan Gohman <dan433584 at gmail.com>
Date:   Wed Jan 6 00:43:06 2016 +0000

    [SelectionDAGBuilder] Set NoUnsignedWrap for inbounds gep and load/store offsets.

    In an inbounds getelementptr, when an index produces a constant non-negative
    offset to add to the base, the add can be assumed to not have unsigned overflow.

    This relies on the assumption that addresses can't occupy more than half the
    address space, which isn't possible in C because it wouldn't be possible to
    represent the difference between the start of the object and one-past-the-end
    in a ptrdiff_t.

    Setting the NoUnsignedWrap flag is theoretically useful in general, and is
    specifically useful to the WebAssembly backend, since it permits stronger
    constant offset folding.

    Differential Revision: http://reviews.llvm.org/D15544

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256890 91177308-0d34-0410-b5e6-96231b3b80d8

A reproducer is attached as "reproduce.ll". When it is compiled with llc -O3,
the loop body (the only part shown here) goes from this before r256890:

.LBB0_1:                                @ %while.body
                                        @ =>This Inner Loop Header: Depth=1
        bl      f
        ldrb    r1, [r4, #1]!
        cmp     r0, #1
        cmpne   r1, #0
        bne     .LBB0_1

To this after r256890:

.LBB0_1:                                @ %while.body
                                        @ =>This Inner Loop Header: Depth=1
        bl      f
        mov     r1, r0
        add     r0, r4, #1
        cmp     r1, #1
        beq     .LBB0_3
@ BB#2:                                 @ %while.body
                                        @   in Loop: Header=BB0_1 Depth=1
        ldrb    r1, [r4, #1]
        mov     r4, r0
        cmp     r1, #0
        bne     .LBB0_1

What appears to be happening is that two different GEPs of the same base and
offset are created - one is inbounds and the other not:

  %incdec.ptr = getelementptr inbounds i8, i8* %a.addr.06, i32 1
  %scevgep = getelementptr i8, i8* %a.addr.06, i32 1

Previously, the SDAG nodes created for these would have been identical and they
would have been commoned, which in this case provides further scope for
optimization by ARM's load/store addressing modes. But now, two different SDAG
nodes are created and this cannot happen. Therefore the code is pessimized.
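
As a rough illustration of the effect described above, here is a toy model in
Python. It is not LLVM code: ToyDAG, get_node and lower_both_geps are
hypothetical names, and the only assumption is that nodes are uniqued on a key
made of opcode, operands and flags, so two adds that differ only in a nuw flag
are no longer the same node.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# Key under which a node is uniqued: opcode, operands, and flags.
NodeKey = Tuple[str, Tuple[str, ...], FrozenSet[str]]


class ToyDAG:
    """Hypothetical stand-in for a DAG that uniques nodes on creation."""

    def __init__(self) -> None:
        self.nodes: Dict[NodeKey, int] = {}

    def get_node(self, opcode: str, operands: Iterable[str],
                 flags: Iterable[str] = ()) -> int:
        # Return the id of an existing node with the same key, or create one.
        key = (opcode, tuple(operands), frozenset(flags))
        if key not in self.nodes:
            self.nodes[key] = len(self.nodes)
        return self.nodes[key]


def lower_both_geps(dag: ToyDAG, set_nuw_for_inbounds: bool) -> Tuple[int, int]:
    """Lower the inbounds GEP and the plain GEP to 'add base, 1' nodes."""
    inbounds_flags = {"nuw"} if set_nuw_for_inbounds else set()
    inbounds = dag.get_node("add", ("a.addr.06", "1"), inbounds_flags)
    plain = dag.get_node("add", ("a.addr.06", "1"))
    return inbounds, plain


# Without the flag, both GEPs map to one node and can share an addressing mode.
before = lower_both_geps(ToyDAG(), set_nuw_for_inbounds=False)
# With the flag set only on the inbounds GEP, two distinct nodes are created.
after = lower_both_geps(ToyDAG(), set_nuw_for_inbounds=True)
print(before[0] == before[1], after[0] == after[1])
```

In this model the first comparison holds and the second does not, which
mirrors the commoning that used to happen and no longer does.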

The file "minimal-reproducer.ll" contains this code pattern extracted, with a
bit of unoptimizable control flow to force the required basic block structure.

Unfortunately, running llc on the minimal reproducer produces the good code.
This is because ISel happens to select the same instruction for both GEPs (an
ADDri) and the two are immediately CSE'd. But the debug output shows two
different SDAG nodes, which print identically because node flags such as
NoUnsignedWrap aren't printed, kept all the way through legalization and DAG
combine.

I'm not sure what the actual fix is here - there are many places where this
could technically be fixed. But as-is this patch causes pretty nasty
regressions.
