[llvm-dev] Problem with LSR

Thu Sep 24 09:47:19 PDT 2015

On Thu, Sep 24, 2015 at 9:13 AM, Anton Nadolskiy via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Anyone?
>

To back you up with a repost from another thread: LSR is counter
productive for easy to understand cases on x86.  Take this simple
array copy for example where my assessment seems similar yours.

void copy(int* src, int* dst, unsigned int size) {
    for (unsigned int i = 0; i < size; ++i)
        dst[i] = src[i];
}

The hot spot for i686 contains 6 instructions with LSR:

8b 32          mov    (%edx),%esi
89 31          mov    %esi,(%ecx)
83 c1 04     add    $0x4,%ecx
83 c2 04     add    $0x4,%edx
48              dec    %eax
75 f3          jne    11 <copy+0x11>

but only 5 instructions *without* LSR.

8b 3c b2      mov    (%edx,%esi,4),%edi
89 3c b1      mov    %edi,(%ecx,%esi,4)
46               inc    %esi
39 c6          cmp    %eax,%esi
75 f5           jne    14 <copy+0x14>

My interpretation of this problem was that LSR starts analysis with an
SCEV that counts down toward zero even in an upward counting loop.  A
more natural fit would be an initial IV going from 0 to some limit,
but there does not seem to be a way to express this sort of SCEV.  The
initial condition puts an artificially high cost on solutions with
upward counting registers since none match the "free" down counting
SCEV.  Notice the weird 'dec %eax' showing up as an artifact.  The
best solution with base-index scale 4 addressing was considered by
LSR, but incorrectly deemed to be too expensive.