[Patch][ARM] Fix and enable the Load/Store optimisation pass for Thumb1

Moritz Roth Moritz.Roth at arm.com
Wed May 14 03:33:27 PDT 2014


Hi all,

this is a set of patches to add support for Thumb1 targets in the
Load/Store optimisation pass, and re-enable that pass as well as inline
memcpy expansion.
Below is a short description of each patch:

0001 - This patch fixes a few comment typos and other style issues I
addressed while working on this. It's fairly small, and there is no
intended functionality change.

0002 - This patch re-enables the Load/Store optimisation pass for
Thumb1-only targets. Since the actual change to the algorithm isn't in
this patch yet, the pass simply returns and does nothing if invoked for
such a target. Essentially, the place where the pass is disabled for
Thumb1 is just moved down into the actual pass, so patch 0003 can easily
make it *actually* do something. Again, there is no intended
functionality change.

0003 - This is the main patch - it adds support in the Load/Store
optimisation pass to correctly generate Thumb1 LDMIA/STMIA instructions
and fully enables the pass.
The reason this was disabled before is that the current algorithm always
generates non-writeback Load/Store multiples first, and then tries to
merge any applicable base register updates into the LDM/STM. Thumb1 only
has LDM/STM with base register writeback, so this approach doesn't
really work there. In a nutshell, my patch directly generates the Thumb1
tLDMIA[_UPD] and tSTMIA_UPD instructions. It then scans over the current
block and tries to update any future instructions that read the base
register with the new offset added from the writeback. If this isn't
possible, the base register is reset right before the next instruction
that uses it. The later (base-writeback merge) stages of the pass aren't
applicable to Thumb1, so they're not executed.
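
To illustrate the fix-up step described above, here is a minimal Python sketch of the idea, not the code from the patch. The `Instr` model, the `fixup_base_uses` name and the restriction to word-sized LDR/STR with an immediate offset are all assumptions made for illustration. The 0..124 offset range is the Thumb1 word LDR/STR immediate range (imm5 scaled by 4).

```python
from dataclasses import dataclass, replace
from typing import List, Optional

WORD = 4
MAX_IMM = 124  # Thumb1 word LDR/STR immediate: 0..124, multiple of 4


@dataclass(frozen=True)
class Instr:
    """Hypothetical, heavily simplified instruction model."""
    op: str                     # e.g. "ldr", "str", "subs", "mov"
    reg: str                    # register loaded, stored or written
    base: Optional[str] = None  # base register read, if any
    imm: Optional[int] = None   # immediate byte offset, if any


def fixup_base_uses(tail: List[Instr], base: str, writeback: int) -> List[Instr]:
    """Rewrite the instructions following a writeback LDM/STM.

    `writeback` is the number of bytes the LDM/STM added to `base`
    (4 * number of registers transferred). Uses of `base` get their
    offset reduced by that amount; if that is not possible, `base`
    is reset just before the first such use and scanning stops.
    """
    out: List[Instr] = []
    for i, ins in enumerate(tail):
        # A store of the base register itself also reads its value.
        uses_base = ins.base == base or (ins.op == "str" and ins.reg == base)
        if uses_base:
            new_imm = None if ins.imm is None else ins.imm - writeback
            foldable = (
                ins.op in ("ldr", "str")
                and ins.base == base
                and ins.reg != base
                and new_imm is not None
                and 0 <= new_imm <= MAX_IMM
                and new_imm % WORD == 0
            )
            if foldable:
                out.append(replace(ins, imm=new_imm))
                continue
            # Cannot fold the offset: undo the writeback before this use.
            out.append(Instr("subs", base, base, writeback))
            out.extend(tail[i:])
            return out
        if ins.reg == base:
            # Base is overwritten without being read; nothing left to fix.
            out.extend(tail[i:])
            return out
        out.append(ins)
    return out
```

For example, after an LDM that advanced r0 by 8 bytes, a following `ldr r4, [r0, #8]` becomes `ldr r4, [r0, #0]`, while a following `ldr r5, [r0, #0]` would need a negative offset, so the sketch inserts `subs r0, #8` in front of it and leaves the rest of the block alone. What happens when the base register is live at the end of the block is deliberately left out here.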

This is a rather large patch and there are many details I've left out
here. I'll put a more detailed description of the changes on Phabricator
for review shortly.
There is no intended functionality change for non-Thumb1 targets. I've
added some tests to check that the pass is working - but note that there
is another set of test cases for this (and memcpy expansion) in patch
0004. There's also a fix for a failing test where two instructions were
being merged by the algorithm.

0004 - This patch re-enables inline memcpy expansion for Thumb1. It was
disabled for Thumb1 since the Load/Store optimisation pass was disabled.
There are also test cases to make sure that small memcpys are inlined,
and that the resulting chains of LDR/STR are merged correctly into
LDM/STM (see patch 0003). This patch should only be applied once 0003 is
committed.
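
As a rough illustration of the LDR-chain merging mentioned above, the following Python sketch checks whether a run of word loads off one base register could be expressed as a single Thumb1 LDMIA with writeback. The function names and the (register number, byte offset) representation are hypothetical, and the conditions are a simplification: the run must start at offset 0, advance by 4 bytes, use strictly ascending low registers (r0-r7), and not load the base register itself.

```python
from typing import List, Tuple

WORD = 4


def ldm_candidate(base: int, loads: List[Tuple[int, int]]) -> List[int]:
    """Return the registers of the leading run of `loads` that could form
    one LDMIA with writeback, or [] if fewer than two loads qualify.

    `loads` holds (register number, byte offset) pairs for word loads off
    register `base`, in program order.
    """
    regs: List[int] = []
    expected = 0  # LDMIA starts reading at the base address itself
    for reg, off in loads:
        if off != expected:
            break  # not the next consecutive word
        if reg > 7 or reg == base:
            break  # sketch handles low registers and the writeback form only
        if regs and reg <= regs[-1]:
            break  # register list must be strictly ascending
        regs.append(reg)
        expected += WORD
    return regs if len(regs) >= 2 else []


def format_ldmia(base: int, regs: List[int]) -> str:
    """Render the merged instruction in assembly-like syntax."""
    return "ldmia r%d!, {%s}" % (base, ", ".join("r%d" % r for r in regs))
```

With loads `ldr r1, [r0, #0]`, `ldr r2, [r0, #4]`, `ldr r3, [r0, #8]`, `ldr r5, [r0, #16]`, the first three form a candidate and the fourth is left alone because of the gap at offset 12. The same shape of check would apply to STR chains and STMIA.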

Finally, regarding code size / performance impact: this patch set mainly
affects benchmarks that do lots of memcpy. By itself, it seems to give a
~7% improvement in Dhrystone. Together with some trickery to make clang
align global strings at word boundaries (this allows a further memcpy to
be inlined), there's a ~25% overall speed-up.

Cheers
Moritz

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-Re-enable-inline-memcpy-expansion-for-Thumb1.patch
Type: text/x-patch
Size: 6033 bytes
Desc: 0004-Re-enable-inline-memcpy-expansion-for-Thumb1.patch
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140514/1e5887d4/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-Fix-the-Load-Store-optimization-pass-to-work-with-Th.patch
Type: text/x-patch
Size: 24264 bytes
Desc: 0003-Fix-the-Load-Store-optimization-pass-to-work-with-Th.patch
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140514/1e5887d4/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Enable-the-Load-Store-optimization-pass-for-Thumb1-b.patch
Type: text/x-patch
Size: 3113 bytes
Desc: 0002-Enable-the-Load-Store-optimization-pass-for-Thumb1-b.patch
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140514/1e5887d4/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Fix-a-few-comment-typos-and-style-issues.patch
Type: text/x-patch
Size: 4887 bytes
Desc: 0001-Fix-a-few-comment-typos-and-style-issues.patch
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140514/1e5887d4/attachment-0003.bin>

