[all-commits] [llvm/llvm-project] 6716e7: [ARM][MVE] tail-predication: overflow checks for b...

Wed Aug 12 01:36:44 PDT 2020

  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 6716e7868ec31668f8200e05d8550d3099b2c697
      https://github.com/llvm/llvm-project/commit/6716e7868ec31668f8200e05d8550d3099b2c697
  Author: Sjoerd Meijer <sjoerd.meijer at arm.com>
  Date:   2020-08-12 (Wed, 12 Aug 2020)

  Changed paths:
    M llvm/lib/Target/ARM/MVETailPredication.cpp
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-reduce.ll
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/varying-outer-2d-reduction.ll

  Log Message:
  -----------
  [ARM][MVE] tail-predication: overflow checks for backedge taken count.

This pick ups the work on the overflow checks for get.active.lane.mask,
which ensure that it is safe to insert the VCTP intrinisc that enables
tail-predication. For a 2d auto-correlation kernel and its inner loop j:

  M = Size - i;
  for (j = 0; j < M; j++)
    Sum += Input[j] * Input[j+i];

For this inner loop, the SCEV backedge taken count (BTC) expression is:

  (-1 + (sext i16 %Size to i32)),+,-1}<nw><%for.body>

and LoopUtil cannotBeMaxInLoop couldn't calculate a bound on this, thus "BTC
cannot be max" could not be determined. So overflow behaviour had to be assumed
in the loop tripcount expression that uses the BTC. As a result
tail-predication had to be forced (with an option) for this case.

This change solves that by using ScalarEvolution's helper
getConstantMaxBackedgeTakenCount which is able to determine the range of BTC,
thus can determine it is safe, so that we no longer need to force tail-predication
as reflected in the changed test cases.

Differential Revision: https://reviews.llvm.org/D85737