[all-commits] [llvm/llvm-project] d97cf1: [ARM][LowOverheadLoops] Remove dead loop update in...
sjoerdmeijer via All-commits
all-commits at lists.llvm.org
Wed Dec 11 02:24:00 PST 2019
Branch: refs/heads/master
Home: https://github.com/llvm/llvm-project
Commit: d97cf1f88902026b6ebe7fb9d844a285c3b113c5
https://github.com/llvm/llvm-project/commit/d97cf1f88902026b6ebe7fb9d844a285c3b113c5
Author: Sjoerd Meijer <sjoerd.meijer at arm.com>
Date: 2019-12-11 (Wed, 11 Dec 2019)
Changed paths:
M llvm/include/llvm/CodeGen/MachineLoopUtils.h
M llvm/include/llvm/CodeGen/ReachingDefAnalysis.h
M llvm/lib/CodeGen/MachineLoopUtils.cpp
M llvm/lib/CodeGen/ReachingDefAnalysis.cpp
M llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
A llvm/test/CodeGen/Thumb2/LowOverheadLoops/dont-remove-loop-update.mir
A llvm/test/CodeGen/Thumb2/LowOverheadLoops/dont-remove-loop-update2.mir
A llvm/test/CodeGen/Thumb2/LowOverheadLoops/dont-remove-loop-update3.mir
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/fast-fp-loops.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vector-arith-codegen.ll
Log Message:
-----------
[ARM][LowOverheadLoops] Remove dead loop update instructions.
After a low-overhead loop was created, the loop update instruction was left
behind, hurting performance. This patch removes dead loop update
instructions, which in our case are mostly SUBS instructions.
To support this, some helper functions were added to MachineLoopUtils and
ReachingDefAnalysis to analyse live-ins of loop exit blocks and find uses
before a particular loop instruction, respectively.
This is a first version that removes a SUBS instruction when there are no other
uses inside or outside the loop block. There are some more interesting
cases in test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll which
show that there is room for improvement. For example, we can't handle this
case yet:
..
dlstp.32 lr, r2
.LBB0_1:
mov r3, r2
subs r2, #4
vldrh.u32 q2, [r1], #8
vmov q1, q0
vmla.u32 q0, q2, r0
letp lr, .LBB0_1
@ %bb.2:
vctp.32 r3
..
This is a lot trickier because r2 is used not only by the subs, but also by
the mov to r3, and r3 is in turn used outside the low-overhead loop by the
vctp instruction. That requires a somewhat different approach, which I will
follow up on.
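
The removal condition described above can be illustrated with a small,
self-contained Python model. This is only a sketch under stated assumptions,
not the code from this commit: the Instr class, the is_dead_loop_update
function, and the register def/use lists are all hypothetical simplifications,
whereas the real pass works on machine instructions and uses the helpers added
to MachineLoopUtils and ReachingDefAnalysis. The model treats the loop update
as dead only if no other instruction in the loop block reads the register it
defines, and that register is not live-in to a loop exit block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Toy instruction: an opcode plus the registers it writes and reads."""
    opcode: str
    defs: tuple = ()
    uses: tuple = ()


def is_dead_loop_update(loop_body, update, exit_live_ins):
    """Return True if `update` has no other users (hypothetical toy rule).

    loop_body:     list of Instr forming the single loop block
    update:        the loop update instruction (e.g. the subs) in loop_body
    exit_live_ins: set of registers live-in to the loop exit block(s)
    """
    updated_regs = set(update.defs)
    # Any other instruction in the loop block reading the register keeps the
    # update alive. This includes reads that appear before the update, since
    # they see the value produced in the previous iteration.
    for instr in loop_body:
        if instr is update:
            continue
        if updated_regs & set(instr.uses):
            return False
    # A register that is live-in to an exit block is used after the loop.
    return not (updated_regs & set(exit_live_ins))


# Simplified def/use lists for the loop in the log message (assumed, not
# taken from the actual instruction descriptions).
subs = Instr("subs", defs=("r2",), uses=("r2",))
rest = [
    Instr("vldrh.u32", defs=("q2", "r1"), uses=("r1",)),
    Instr("vmov", defs=("q1",), uses=("q0",)),
    Instr("vmla.u32", defs=("q0",), uses=("q0", "q2", "r0")),
    Instr("letp", defs=("lr",), uses=("lr",)),
]

# Loop without the mov: the subs is the only user of r2.
simple_loop = [subs] + rest

# Loop from the log message: "mov r3, r2" also reads r2.
tricky_loop = [Instr("mov", defs=("r3",), uses=("r2",)), subs] + rest

print(is_dead_loop_update(simple_loop, subs, exit_live_ins=set()))
print(is_dead_loop_update(tricky_loop, subs, exit_live_ins={"r3"}))
```

In this model the mov to r3 is what keeps the subs alive in the second loop.
Handling that case would mean reasoning about r3 and the vctp in the exit
block as well, which is the follow-up work mentioned above.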
Differential Revision: https://reviews.llvm.org/D71007