[all-commits] [llvm/llvm-project] c352e7: [ARM][MVE] Tail-predication: remove the BTC + 1 ov...
sjoerdmeijer via All-commits
all-commits at lists.llvm.org
Tue Aug 25 06:38:36 PDT 2020
Branch: refs/heads/master
Home: https://github.com/llvm/llvm-project
Commit: c352e7fbda2f48c285eca61d2509780f648443ee
https://github.com/llvm/llvm-project/commit/c352e7fbda2f48c285eca61d2509780f648443ee
Author: Sjoerd Meijer <sjoerd.meijer at arm.com>
Date: 2020-08-25 (Tue, 25 Aug 2020)
Changed paths:
M llvm/lib/Target/ARM/MVETailPredication.cpp
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/basic-tail-pred.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/clear-maskedinsts.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/cond-vector-reduce-mve-codegen.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/extending-loads.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/fast-fp-loops.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/nested.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/reductions.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-pred-const.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-pred-intrinsic-add-sat.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-pred-intrinsic-fabs.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-pred-intrinsic-round.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-pred-intrinsic-sub-sat.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-pred-widen.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-reduce.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/varying-outer-2d-reduction.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vector-arith-codegen.ll
M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vector-reduce-mve-tail.ll
M llvm/test/CodeGen/Thumb2/mve-fma-loops.ll
M llvm/test/CodeGen/Thumb2/mve-gather-scatter-tailpred.ll
M llvm/test/CodeGen/Thumb2/mve-vecreduce-loops.ll
Log Message:
-----------
[ARM][MVE] Tail-predication: remove the BTC + 1 overflow checks
This adapts tail-predication to the new semantics of get.active.lane.mask as
defined in D86147. This means that:
- we can remove the BTC + 1 overflow checks because the loop trip count is now
passed to the intrinsic,
- we can use that value directly to set up a counter for the number of
elements processed by the loop, so BTC + 1 (the backedge-taken count plus
one) no longer needs to be materialized.
Differential Revision: https://reviews.llvm.org/D86303
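
The two points in the log message can be illustrated with a small scalar model. This is a sketch only, not code from the patch: it assumes that under the D86147 semantics lane i of get.active.lane.mask(base, tripcount) is active when base + i is unsigned-less-than tripcount, and that the tail-predicated loop keeps an "elements remaining" counter that starts at the trip count and drops by the vector width each iteration. The names kLanes, activeLaneMask, vctpMask, masksAgree and btcPlusOneWraps are hypothetical and do not appear in MVETailPredication.cpp.

```cpp
#include <array>
#include <cstdint>

// Hypothetical vector width: 4 lanes, e.g. 4 x i32 in a 128-bit vector.
constexpr unsigned kLanes = 4;
using Mask = std::array<bool, kLanes>;

// Scalar model of get.active.lane.mask(base, tripCount) under the assumed
// new semantics: lane i is active iff base + i < tripCount (unsigned).
// The addition is done in 64 bits so the model itself cannot wrap.
Mask activeLaneMask(uint32_t base, uint32_t tripCount) {
  Mask mask{};
  for (unsigned i = 0; i < kLanes; ++i)
    mask[i] = (static_cast<uint64_t>(base) + i) < tripCount;
  return mask;
}

// Scalar model of a predicate built from an element counter:
// lane i is active iff i < remaining.
Mask vctpMask(uint32_t remaining) {
  Mask mask{};
  for (unsigned i = 0; i < kLanes; ++i)
    mask[i] = i < remaining;
  return mask;
}

// Walks the vector loop and checks that a counter initialised directly from
// the trip count yields the same predicate as the lane mask in every iteration.
bool masksAgree(uint32_t tripCount) {
  uint32_t remaining = tripCount; // no BTC + 1 needed
  for (uint64_t base = 0; base < tripCount; base += kLanes) {
    if (activeLaneMask(static_cast<uint32_t>(base), tripCount) !=
        vctpMask(remaining))
      return false;
    remaining = remaining > kLanes ? remaining - kLanes : 0;
  }
  return true;
}

// Why deriving the count from the backedge-taken count needed a check:
// in 32-bit arithmetic BTC + 1 wraps to 0 when BTC is the maximum value.
bool btcPlusOneWraps(uint32_t btc) {
  return static_cast<uint32_t>(btc + 1u) == 0;
}
```

For example, with a trip count of 10 the last iteration starts at base 8 with 2 elements remaining, and both models enable only the first two lanes. Passing the trip count itself means the counter never has to be computed as BTC + 1, which is the only step in this model that can wrap.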