[all-commits] [llvm/llvm-project] 585742: [SCEV] When computing trip count, only zext if nec...

Joshua Cao via All-commits all-commits at lists.llvm.org
Mon Apr 10 19:58:17 PDT 2023


  Branch: refs/heads/scevzext
  Home:   https://github.com/llvm/llvm-project
  Commit: 585742cbfccd734b19c75dff9709b20367506668
      https://github.com/llvm/llvm-project/commit/585742cbfccd734b19c75dff9709b20367506668
  Author: Joshua Cao <cao.joshua at yahoo.com>
  Date:   2023-04-10 (Mon, 10 Apr 2023)

  Changed paths:
    M llvm/lib/Analysis/LoopCacheAnalysis.cpp
    M llvm/lib/Analysis/ScalarEvolution.cpp

  Log Message:
  -----------
  [SCEV] When computing trip count, only zext if necessary

This patch improves on https://reviews.llvm.org/D110587. To summarize
that patch: given a backedge-taken count BC, the trip count TC is
`BC + 1`. However, we don't know whether `BC + 1` might overflow, so
that patch changed the TC computation to `1 + zext(BC)`.

This patch only adds the zext if necessary by looking at the constant
range. If we can determine that BC cannot be the max value for its
bitwidth, then we know adding 1 will not overflow, and the zext is not
needed. We apply loop guards before computing TC to get more data.
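
As a rough illustration of the check described above, the following sketch
models an 8-bit BC with a plain `[lo, hi]` unsigned range. `Range8`,
`needsZext`, `tripCountNarrow`, and `tripCountWide` are hypothetical names
made up for this example; they are not the ScalarEvolution API, and SCEV's
real constant ranges are richer than a simple interval.

```cpp
#include <cstdint>

// Hypothetical stand-in for a constant range of an 8-bit value.
// Assumption: a plain inclusive unsigned interval [lo, hi].
struct Range8 {
  uint8_t lo;
  uint8_t hi;
};

// The zext is only needed if BC may equal the maximum value of its bitwidth.
bool needsZext(Range8 bc) { return bc.hi == UINT8_MAX; }

// TC computed in the narrow type: wraps around to 0 when bc == 255.
uint8_t tripCountNarrow(uint8_t bc) { return static_cast<uint8_t>(bc + 1); }

// TC computed as 1 + zext(BC) in a wider type: never overflows.
uint16_t tripCountWide(uint8_t bc) {
  return static_cast<uint16_t>(1 + static_cast<uint16_t>(bc));
}
```

When `needsZext` returns false, `tripCountNarrow` and `tripCountWide` agree
for every value in the range, which is why the extension can be dropped.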

The primary motivation is to support my work on more precise trip
multiples in https://reviews.llvm.org/D141823. For example:

```
void test(unsigned n) {
  __builtin_assume(n % 6 == 0);
  for (unsigned i = 0; i < n; ++i)
    foo();
}
```

Prior to this patch, we had `TC = 1 + zext(-1 + 6 * ((6 umax %n) /u
6))<nuw>`. SCEV range computation is able to determine that the BC
cannot be the max value, so the zext is not needed. The result is `TC
-> (6 * ((6 umax %n) /u 6))<nuw>`. From here, we would be able to
determine that the trip count is a multiple of 6.
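
To make the arithmetic concrete, this sketch evaluates the simplified
expression `6 * ((6 umax %n) /u 6)` with 32-bit unsigned arithmetic.
`tripCountExpr` is a hypothetical helper for illustration only; it mirrors
the printed SCEV expression, not any code in the patch.

```cpp
#include <algorithm>
#include <cstdint>

// Evaluates 6 * ((6 umax n) /u 6) using 32-bit unsigned arithmetic.
uint32_t tripCountExpr(uint32_t n) {
  return 6u * (std::max<uint32_t>(6u, n) / 6u);
}
```

The result is always a multiple of 6, and subtracting 1 from it never yields
the maximum 32-bit value, which matches the claim that the BC cannot be the
max value here.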

One change in LoopCacheAnalysis/LoopInterchange was required.
Before this patch, if a loop had BC = false, it would compute `TC -> 1 +
zext(false) -> 1`, which was fine. After this patch, it computes `TC -> 1
+ false = true`. CacheAnalysis would then sign extend the `true`, which
was not the intended behavior. I modified CacheAnalysis such that
it only zero extends trip counts.
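
The difference between the two extensions for a 1-bit value can be sketched
as follows. `sextI1` and `zextI1` are hypothetical helpers that model the
semantics on a `bool`; they are not functions from LoopCacheAnalysis.

```cpp
#include <cstdint>

// Sign extension of a 1-bit value: the only bit is the sign bit,
// so `true` becomes -1 (all bits set).
int64_t sextI1(bool bit) { return bit ? -1 : 0; }

// Zero extension of a 1-bit value: `true` stays 1.
int64_t zextI1(bool bit) { return bit ? 1 : 0; }
```

A trip count of `true` therefore reads as -1 after sign extension but as the
intended 1 after zero extension.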

This patch is not NFC, but also does not change any SCEV outputs. I
would like to get this patch out first to make work with trip multiples
easier.

Differential Revision: https://reviews.llvm.org/D147117



