[PATCH] D85737: [ARM][MVE] tail-predication: overflow checks for backedge taken count

Sjoerd Meijer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 11 08:19:16 PDT 2020


SjoerdMeijer created this revision.
SjoerdMeijer added reviewers: samparker, efriedma, dmgreen.
Herald added subscribers: danielkiss, javed.absar, hiraditya, kristof.beyls.
Herald added a project: LLVM.
SjoerdMeijer requested review of this revision.

This picks up the work on the overflow checks for get.active.lane.mask, which ensure that it is safe to insert the VCTP intrinsic that enables tail-predication. Consider a 2d auto-correlation kernel and its inner loop j:

  M = Size - i;
  for (j = 0; j < M; j++)
    Sum += Input[j] * Input[j+i];
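For context, a minimal self-contained sketch of such a kernel follows. The function names, the Output array and the outer loop bounds are assumptions made for illustration and are not taken from the patch or its tests; only the inner loop matches the snippet above. The helper shows the per-iteration backedge taken count of the inner loop, M - 1 = Size - 1 - i, with Size widened from 16 to 32 bits.

```c
#include <stdint.h>

/* Hypothetical full form of the 2d auto-correlation kernel. Only the
   inner loop is taken from the description; the rest is assumed. */
void autocorr(const int16_t *Input, int16_t Size, int32_t *Output)
{
  for (int32_t i = 0; i < Size; i++) {
    int32_t Sum = 0;
    int32_t M = Size - i;            /* inner trip count shrinks with i */
    for (int32_t j = 0; j < M; j++)
      Sum += Input[j] * Input[j + i];
    Output[i] = Sum;
  }
}

/* Backedge taken count of the inner loop for outer iteration i:
   one less than its trip count, i.e. (Size - 1) - i. It starts at
   Size - 1 and decreases by 1 on every outer iteration. */
uint32_t inner_btc(int16_t Size, int32_t i)
{
  return (uint32_t)((int32_t)Size - 1 - i);
}
```

The value returned by inner_btc varies with the outer induction variable, which is why the inner loop's BTC is not a loop-invariant expression from the point of view of the outer loop.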

For this inner loop, the SCEV backedge taken count (BTC) expression is:

  {(-1 + (sext i16 %Size to i32)),+,-1}<nw><%for.body>

and LoopUtils' cannotBeMaxInLoop couldn't calculate a bound on this, so "BTC cannot be max" could not be established. Overflow therefore had to be assumed in the loop tripcount expression that uses the BTC, and as a result tail-predication had to be forced (with an option) for this case.

This change solves that by using ScalarEvolution's helper getConstantMaxBackedgeTakenCount, which is able to determine the range of the BTC and can therefore show that it is safe, so we no longer need to force tail-predication, as reflected in the changed test cases.
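To illustrate why a constant upper bound on the BTC is enough, here is a rough sketch of the arithmetic in plain C. It is not the code in MVETailPredication.cpp; the helper names are hypothetical, and the bound assumes that Size is a sign-extended 16-bit value and that the outer induction variable is non-negative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the trip count is BTC + 1, which wraps in
   32 bits only when BTC can be UINT32_MAX. */
bool trip_count_cannot_overflow(uint32_t max_btc)
{
  return max_btc != UINT32_MAX;
}

/* Assumed bound for the kernel discussed here: BTC = Size - 1 - i with
   Size <= INT16_MAX and i >= 0, so BTC <= INT16_MAX - 1 whenever the
   inner loop executes at all. */
uint32_t max_inner_btc_for_i16_size(void)
{
  return (uint32_t)INT16_MAX - 1u;
}

/* The number of vector iterations is typically computed as
   (TC + VF - 1) / VF; check that the rounding addition cannot wrap. */
bool rounded_up_count_cannot_overflow(uint32_t max_btc, uint32_t vf)
{
  if (vf == 0 || max_btc == UINT32_MAX)
    return false;
  uint32_t max_tc = max_btc + 1u;
  return max_tc <= UINT32_MAX - (vf - 1u);
}
```

With a bound of this size both checks pass comfortably, which matches the intuition that a trip count derived from a 16-bit Size cannot get anywhere near the 32-bit maximum.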


https://reviews.llvm.org/D85737

Files:
  llvm/lib/Target/ARM/MVETailPredication.cpp
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-reduce.ll
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/varying-outer-2d-reduction.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D85737.284739.patch
Type: text/x-patch
Size: 21023 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200811/0be034bd/attachment.bin>

