[Thumb1] Fix and re-enable the load/store optimization pass

Tue Aug 12 07:07:19 PDT 2014

Hi all,

This patch-set aims to fix PR19972
(http://llvm.org/bugs/show_bug.cgi?id=19972).

The Thumb-1 specific parts of the load/store optimizer in its previous
iteration had some correctness issues:

-          Base register liveness information wasn't correctly transferred

-          It didn't handle stores of the base register after writeback
correctly

-          It was possible to generate illegal instructions such as STMs
including the base register

-          It would try to undo the base register writeback using a SUBS in
unsafe conditions. SUBS sets the conditions flags, so cases like this (taken
from a failing LNT program) could occur, merged from two LDRs:

        cmp     r5, #1

        ldm     r6!, {r0, r4}

        subs    r6, #8

        blt     .LBB177_10

I've addressed these bugs in the patches as follows:

0001: Correctly transfer kill flags on the base register. Since the MemOps
list may be in a different order, we need to transfer the kill flags of the
last instruction as it occurred in the original instruction stream, not the
last index of MemOps. The test case "merge-ldr-call.ll" from patch 0003 is
affected by this, however this patch on its own should not affect codegen.

0002: Make the merging more conservative on Thumb1. For now, we only merge
LDRs/STRs when it's clearly safe to do so: if there is no writeback (LDM
into base register), or the base register is dead after the merged
instruction (so the writeback doesn't need to be handled). Tests for each of
the cases are in patch 0003.

Since this has the side-effect of not allowing merges of more than one
LDM/STM of the same base register in a row, I've also modified the memcpy
inliner for Thumb1 to use a library call if copying more than a single
LDM/STM can handle (16 bytes). I'm working on a patch to remove this
limitation again, but it's not quite ready for review yet. This patch on its
own is always a clear win both in code size and performance however, so I
think it's worth committing.

0003: Re-enable the pass, and add test cases for the previous two patches.
There are three new regression tests created from simple test programs that
were miscompiled previously, and I've added -verify-machineinstrs to each of
the tests as well.

I've tested the patches with SPEC2000, SPEC2006, LNT and EEMBC and
everything looks fine on v6M and v7, but any additional testing or
validation is of course welcome! Please let me know what you think.

Cheers

Moritz
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140812/38529553/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ARM-load-store-optimizer-Compute-BaseKill-correctly.patch
Type: application/octet-stream
Size: 2528 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140812/38529553/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-ARM-load-store-optimizer-Thumb1-only-merge-when-it-s.patch
Type: application/octet-stream
Size: 7774 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140812/38529553/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-Revert-Disable-the-load-store-optimization-pass-for-.patch
Type: application/octet-stream
Size: 9837 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140812/38529553/attachment-0002.obj>