[PATCH] D48853: [SCEV] Add zext(C + x + ...) -> D + zext(C-D + x + ...)<nuw> transform

Tue Jul 10 00:11:36 PDT 2018

rtereshin added inline comments.

================
Comment at: lib/Analysis/ScalarEvolution.cpp:1816
+        const DataLayout &DL = getDataLayout();
+        KnownBits Known = computeKnownBits(OpV, DL, 0, &AC, nullptr, &DT);
+        MinValue = Known.One.ugt(MinValue) ? Known.One : MinValue;
----------------
rtereshin wrote:
> mkazantsev wrote:
> > I don't understand why we need this. `computeKnownBits` is used to deduce ranges of SCEVUnknown. All other SCEV nodes are supposed to propagate range information (e.g. range of sum is a range from sum of min to sum of max, and so on). Thus, in theory, we should be able to identify the range of any SCEV correctly, unless we have some missing logic in range calculation.
> > 
> > What is `OpV` in the example you're trying to improve, and why SCEV was unable to deduce its range via `getUnsignedRange(getSCEV(OpV))`?
> > What is OpV in the example you're trying to improve
> 
> One of the examples is given in the comment nearby:
> ```
>       // (zext (add (shl X, C1), C2)), for instance, (zext (5 + (4 * X))).
>       // ConstantRange is unable to prove that 1 + (4 + 4 * X) doesn't wrap in
>       // such cases:
>       //
>       // | Expression |     ConstantRange      |       KnownBits       |
>       // |------------|------------------------|-----------------------|
>       // | 4 * X      | [L: 0, U: 253)         | XXXX XX00             |
>       // |            |   => Min: 0, Max: 252  |   => Min: 0, Max: 252 |
>       // |            |                        |                       |
>       // | 4 * X + 4  | [L: 4, U: 1) (wrapped) | YYYY YY00             |
>       // |            |   => Min: 0, Max: 255  |   => Min: 0, Max: 252 |
> ```
> see also lldb session running a similar example, also present in `test/Analysis/ScalarEvolution/no-wrap-add-exprs.ll` updated in this patch:
> ```
>    1814	      if (OpV) {
>    1815	        const DataLayout &DL = getDataLayout();
>    1816	        KnownBits Known = computeKnownBits(OpV, DL, 0, &AC, nullptr, &DT);
> -> 1817	        MinValue = Known.One.ugt(MinValue) ? Known.One : MinValue;
>    1818	      }
>    1819	      APInt C = SC->getAPInt();
>    1820	      APInt D = MinValue.ugt(C) ? C : MinValue;
> Target 0: (opt) stopped.
> (lldb) p OpV->dump()
>   %t1 = add i8 %t0, 5
> (lldb) p SA->dump()
> (5 + (4 * %x))
> (lldb) p MinValue
> (llvm::APInt) $1 = {
>   U = {
>     VAL = 0
>     pVal = 0x0000000000000000
>   }
>   BitWidth = 8
> }
> (lldb) p Known.One.dump()
> APInt(8b, 1u 1s)
> (lldb) p getUnsignedRange(SA)
> (llvm::ConstantRange) $2 = {
>   Lower = {
>     U = {
>       VAL = 5
>       pVal = 0x0000000000000005
>     }
>     BitWidth = 8
>   }
>   Upper = {
>     U = {
>       VAL = 2
>       pVal = 0x0000000000000002
>     }
>     BitWidth = 8
>   }
> }
> ```
> 
> > and why SCEV was unable to deduce its range via getUnsignedRange(getSCEV(OpV))?
> 
> As you mentioned, the range information is calculated using the
> > e.g. range of sum is a range from sum of min to sum of max, and so on
> principle. `ConstantRange` keeps track of min/max boundaries, but it completely loses any periodic information, like "the range contains only values divisible by 4". `KnownBits` behavior is exact opposite: its imprecise when it comes to boundaries, but it keeps track of the periodic information.
> 
> For instance, the only thing that is known about `4 * x` is that the 2 least significant bits of the value are 0s. From `ConstantRange` perspective it only means that the value doesn't exceed `2^32 - 4` if treated as unsigned `i32`. It's completely unaware of the fact that `4 * x` could not be 7, for instance. If we shift the range by adding 5 (`4 * x + 5`) min/max recomputation of the range leads to the wrapped around range `[5, 2)`, that gives us no useful information about the minimum and maximum values (minimum is `0`, maximum is `2^32 - 1`). While from known bits we know that `4 * x + 5` looks like `XXX...XX01`, therefore the minimum value is `1`, and the maximum value is `2^32 - 3`.
> 
> Ranges and KnownBits are complementary to each other, neither is more precise than the other in all cases. If we want a value range analysis with good precision, we need to maintain and update both simultaneously.
@mkazantsev I think I see what's the source of the confusion: apparently, the current implementation tries to utilize knownbits-like information in a limited form of "number of trailing zeros", which is computed for `Add` the following way:

```
  if (const SCEVAddExpr *A = dyn_cast<SCEVAddExpr>(S)) {
    // The result is the min of all operands results.
    uint32_t MinOpRes = GetMinTrailingZeros(A->getOperand(0));
    for (unsigned i = 1, e = A->getNumOperands(); MinOpRes && i != e; ++i)
      MinOpRes = std::min(MinOpRes, GetMinTrailingZeros(A->getOperand(i)));
    return MinOpRes;
  }
```
https://github.com/llvm-mirror/llvm/blob/650cfa6dc060acb5b4c9571d454ec2b990aad648/lib/Analysis/ScalarEvolution.cpp#L5375-L5381

So it does work for expressions like `4 + 4 * x` (well, sometimes, somehow that kind of information is there for unsigned ranges, but not for signed ranges), and that makes my comment inaccurate. I will change it to `5 + 4 * x` example.

For `5 + 4 * x` it doesn't work, of course, as the number of trailing zeroes is 0 (`5 + 4 * x` ~= `XXX...XX01`).

Repository:
  rL LLVM

https://reviews.llvm.org/D48853