[PATCH] D79175: [ARM][MVE] Tail-Predication: use @llvm.get.active.lane.mask to get the BTC

Sjoerd Meijer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 8 09:22:39 PDT 2020


SjoerdMeijer updated this revision to Diff 269250.
SjoerdMeijer added a comment.

Short story:

`isKnownNonNegativeInLoop` is unfortunately not able to give an answer for this expression, and as a result most or all loops would be rejected. I have added FIXMEs and am using `isKnownNegativeInLoop` instead, as that is at least able to catch some cases (the test cases with constant values) and is probably better than nothing. I have tried several SCEV helpers, but none of them seem to support this expression. I think teaching SCEV about this expression is a separate issue. @efriedma, @samparker: please let me know what you think, and what you think the order of events should be.

Longer story:

> I wasn't trying to imply you shouldn't use isKnownNonNegativeInLoop, if that's appropriate.

Thanks for confirming. I indeed got confused, briefly went onto the wrong track, but rediscovered isKnownNonNegativeInLoop and experimented further with that.

While checking whether this expression is non-negative:

  ((ElementCount + (VectorWidth - 1)) / VectorWidth) - TripCount

and dumping `KnownNonNegative` information for the intermediate expressions, I see that SCEV is able to determine `KnownNonNegative` for all intermediate expressions, except the last one:

  BTC: (-1 + %N)
  BTC KnownNonNegative: 1
  elemcount: %N
  elemcount + vlen-1: (3 + %N)<nuw><nsw>
  KnownNonNegative: 1
  Ceil: ((3 + %N)<nuw><nsw> /u 4)
  Ceil KnownNonNegative: 1
  TripCount: (1 + ((-4 + (4 * ((3 + %N)<nuw><nsw> /u 4))<nuw>) /u 4))<nuw><nsw>
  TripCount KnownNonNegative: 1
  ECMinusTC: (-1 + (-1 * ((-4 + (4 * ((3 + %N)<nuw><nsw> /u 4))<nuw>) /u 4))<nsw> + ((3 + %N)<nuw><nsw> /u 4))
  KnownNonNegative: 0
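
As an illustration only (this is not code from the patch), the dumped expressions can be replayed with plain 32-bit unsigned arithmetic. The sketch assumes VectorWidth = 4 as in the dump, treats `/u` as unsigned division, and lets all arithmetic wrap modulo 2^32. The function names are made up for this example.

```cpp
#include <cstdint>

// Hypothetical helpers mirroring the dumped SCEV expressions for
// VectorWidth = 4. uint32_t arithmetic wraps modulo 2^32.

// Ceil: ((3 + %N) /u 4)
constexpr uint32_t ceilElems(uint32_t N) { return (N + 3u) / 4u; }

// TripCount: (1 + ((-4 + (4 * Ceil)) /u 4))
constexpr uint32_t tripCount(uint32_t N) {
  return 1u + ((4u * ceilElems(N) - 4u) / 4u);
}

// ECMinusTC: Ceil - TripCount, reinterpreted as a signed 32-bit value
// (two's complement conversion assumed).
constexpr int32_t ceilMinusTripCount(uint32_t N) {
  return static_cast<int32_t>(ceilElems(N) - tripCount(N));
}
```

With these definitions, `ceilMinusTripCount(N)` is 0 for small positive inputs such as N = 1, 5 or 8. For N = 0, however, `4 * Ceil - 4` wraps, `tripCount(0)` becomes 1073741824, and the difference is negative. Whether this corner case is what prevents SCEV from proving non-negativity has not been established here.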

When I request signed integer ranges for the rounded element count (Ceil) and the trip count (TC), I see this:

  Range Ceil: [0,1073741824)
  Range TC: [1,1073741825)
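
A small sketch of one way to read these bounds, assuming 32-bit values and VectorWidth = 4: an unsigned division by 4 can yield at most 2^30 - 1, which gives the exclusive upper bound for Ceil, and adding 1 gives the bounds for TC. The constant names are invented for this example, and SCEV's actual range computation may differ.

```cpp
#include <cstdint>

// Largest quotient of a 32-bit unsigned division by 4: 2^30 - 1.
constexpr uint32_t MaxQuotient = UINT32_MAX / 4u;

// Half-open range [CeilLower, CeilUpper) for Ceil = (3 + N) /u 4.
constexpr uint32_t CeilLower = 0u;
constexpr uint32_t CeilUpper = MaxQuotient + 1u;

// Half-open range [TCLower, TCUpper) for TC = 1 + (x /u 4).
constexpr uint32_t TCLower = 1u + 0u;
constexpr uint32_t TCUpper = 1u + MaxQuotient + 1u;
```

These constants match the ranges printed above.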

And that looks very sensible and promising. I wanted to add support for this here, but then discovered a case that worked slightly differently. It needs some more thought and investigation, and is probably best added as a helper somewhere in LoopUtils/SCEV. I have traced SCEV's decision making and roughly see where it rejects this, but I need to investigate that further.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79175/new/

https://reviews.llvm.org/D79175

Files:
  llvm/lib/Target/ARM/MVETailPredication.cpp
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/basic-tail-pred.ll
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/clear-maskedinsts.ll
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/cond-vector-reduce-mve-codegen.ll
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/extending-loads.ll
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/fast-fp-loops.ll
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/nested.ll
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-pred-const.ll
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-pred-widen.ll
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-reduce.ll
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/vector-arith-codegen.ll
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/vector-reduce-mve-tail.ll
  llvm/test/CodeGen/Thumb2/mve-fma-loops.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D79175.269250.patch
Type: text/x-patch
Size: 166715 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200608/cd4dbfdf/attachment-0001.bin>

