[all-commits] [llvm/llvm-project] 36ea18: [NFC][CVP] Add tests for srem with potentially dif...

Tue Sep 22 11:38:06 PDT 2020

  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 36ea18b06430e0a1094f9b0994e4abb5cc2175c9
      https://github.com/llvm/llvm-project/commit/36ea18b06430e0a1094f9b0994e4abb5cc2175c9
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2020-09-22 (Tue, 22 Sep 2020)

  Changed paths:
    M llvm/test/Transforms/CorrelatedValuePropagation/srem.ll

  Log Message:
  -----------
  [NFC][CVP] Add tests for srem with potentially different signedness domains

  Commit: 4eeeb356fc41babf46797b062f74f978b818622b
      https://github.com/llvm/llvm-project/commit/4eeeb356fc41babf46797b062f74f978b818622b
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2020-09-22 (Tue, 22 Sep 2020)

  Changed paths:
    M llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp
    M llvm/test/Transforms/CorrelatedValuePropagation/srem.ll

  Log Message:
  -----------
  [CVP] Enhance SRem -> URem fold to work not just on non-negative operands

This is a continuation of 8d487668d09fb0e4e54f36207f07c1480ffabbfd,
the logic is pretty much identical for SRem:

Name: pos pos
Pre: C0 >= 0 && C1 >= 0
%r = srem i8 C0, C1
  =>
%r = urem i8 C0, C1

Name: pos neg
Pre: C0 >= 0 && C1 <= 0
%r = srem i8 C0, C1
  =>
%r = urem i8 C0, -C1

Name: neg pos
Pre: C0 <= 0 && C1 >= 0
%r = srem i8 C0, C1
  =>
%t0 = urem i8 -C0, C1
%r = sub i8 0, %t0

Name: neg neg
Pre: C0 <= 0 && C1 <= 0
%r = srem i8 C0, C1
  =>
%t0 = urem i8 -C0, -C1
%r = sub i8 0, %t0

https://rise4fun.com/Alive/Vd6

This new logic does not produce any new catches
on vanilla llvm test-suite + RawSpeed,
but it should be virtually compile-time free,
and it may be important to handle sdiv and srem consistently:
if we had an sdiv-srem pair and converted only one of them,
-divrempairs would no longer see them as a pair,
and thus would not "merge" them.
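
The four cases above can also be checked exhaustively outside of Alive. What follows is a minimal sketch in plain C++ with no LLVM dependency, assuming that `%` on `int` matches `srem` semantics (truncation toward zero). The names `sremViaURem` and `foldHoldsForAllI8` are hypothetical and do not come from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Computes `srem i8 A, B` using only an unsigned remainder on the operand
// magnitudes, negating the result when the dividend is negative.
uint8_t sremViaURem(int8_t A, int8_t B) {
  assert(B != 0 && "remainder by zero is undefined");
  // Negate in 8-bit modular arithmetic, so -(-128) stays 0x80 (128 unsigned).
  uint8_t UA = A < 0 ? static_cast<uint8_t>(0u - static_cast<uint8_t>(A))
                     : static_cast<uint8_t>(A);
  uint8_t UB = B < 0 ? static_cast<uint8_t>(0u - static_cast<uint8_t>(B))
                     : static_cast<uint8_t>(B);
  uint8_t R = static_cast<uint8_t>(UA % UB);
  // The sign of the remainder follows the sign of the dividend.
  return A < 0 ? static_cast<uint8_t>(0u - R) : R;
}

// Compares the fold against the reference remainder for every i8 pair on
// which the original srem is defined.
bool foldHoldsForAllI8() {
  for (int A = -128; A <= 127; ++A) {
    for (int B = -128; B <= 127; ++B) {
      if (B == 0)
        continue; // Division by zero: UB in the source.
      if (A == -128 && B == -1)
        continue; // Signed overflow: UB in the source.
      uint8_t Expected = static_cast<uint8_t>(A % B);
      if (sremViaURem(static_cast<int8_t>(A), static_cast<int8_t>(B)) !=
          Expected)
        return false;
    }
  }
  return true;
}
```

The sketch picks the negation per value, whereas the fold itself can only choose a case when the range information proves the sign of each operand.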

  Commit: b38d897e802664034c7e6e4654328256ed370a61
      https://github.com/llvm/llvm-project/commit/b38d897e802664034c7e6e4654328256ed370a61
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2020-09-22 (Tue, 22 Sep 2020)

  Changed paths:
    M llvm/include/llvm/IR/ConstantRange.h
    M llvm/lib/IR/ConstantRange.cpp
    M llvm/unittests/IR/ConstantRangeTest.cpp

  Log Message:
  -----------
  [ConstantRange] binaryXor(): special-case binary complement case - the result is precise

Use the fact that `~X` is equivalent to `-1 - X`, which gives us
a fully-precise answer, and we only need to special-handle the wrapped case.

This fires ~16k times for vanilla llvm test-suite + RawSpeed.
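
To illustrate why the `~X == -1 - X` identity yields a precise range, here is a small sketch over 8-bit values. It assumes inclusive, non-wrapped unsigned ranges only, which is simpler than what ConstantRange handles, and `Range`, `complementRange` and `complementIsPrecise` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Inclusive, non-wrapped unsigned 8-bit range [first, second].
using Range = std::pair<uint8_t, uint8_t>;

// For 8-bit X, ~X == -1 - X == 255 - X. Subtracting from a constant reverses
// the ordering, so the bounds swap places.
Range complementRange(Range R) {
  assert(R.first <= R.second);
  return {static_cast<uint8_t>(~R.second), static_cast<uint8_t>(~R.first)};
}

// Checks that, for every range, X is inside the range exactly when ~X is
// inside the computed complement range. Since ~ is a bijection, this means
// the result contains no extra values and misses none.
bool complementIsPrecise() {
  for (unsigned Lo = 0; Lo <= 255; ++Lo) {
    for (unsigned Hi = Lo; Hi <= 255; ++Hi) {
      Range C = complementRange(
          {static_cast<uint8_t>(Lo), static_cast<uint8_t>(Hi)});
      for (unsigned X = 0; X <= 255; ++X) {
        uint8_t NotX = static_cast<uint8_t>(~X);
        bool InSource = X >= Lo && X <= Hi;
        bool InResult = NotX >= C.first && NotX <= C.second;
        if (InSource != InResult)
          return false;
      }
    }
  }
  return true;
}
```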

  Commit: 2ed9c4c70bbb36fa12d48a73abc2d89c0af80060
      https://github.com/llvm/llvm-project/commit/2ed9c4c70bbb36fa12d48a73abc2d89c0af80060
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2020-09-22 (Tue, 22 Sep 2020)

  Changed paths:
    M llvm/include/llvm/IR/ConstantRange.h
    M llvm/lib/IR/ConstantRange.cpp
    M llvm/unittests/IR/ConstantRangeTest.cpp

  Log Message:
  -----------
  [ConstantRange] Introduce getActiveBits() method

Much like APInt::getActiveBits(), computes how many bits are needed
to be able to represent every value in this constant range,
treating the values as unsigned.
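
As a rough illustration of the idea, not the actual ConstantRange code: for an inclusive, non-wrapped unsigned range, the largest value decides how many bits are needed. The helpers `activeBits` and `rangeActiveBits` below are hypothetical, and empty or wrapped ranges are deliberately left out of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Number of bits needed to represent V as an unsigned value (0 for V == 0).
unsigned activeBits(uint32_t V) {
  unsigned Bits = 0;
  while (V != 0) {
    ++Bits;
    V >>= 1;
  }
  return Bits;
}

// Inclusive, non-wrapped unsigned range [Lo, Hi]: every value in the range
// fits in as many bits as the upper bound needs.
unsigned rangeActiveBits(uint32_t Lo, uint32_t Hi) {
  assert(Lo <= Hi && "wrapped ranges are not modelled in this sketch");
  return activeBits(Hi);
}
```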

  Commit: ba5afe5588ded61052c8727dbcb0407b5de4410c
      https://github.com/llvm/llvm-project/commit/ba5afe5588ded61052c8727dbcb0407b5de4410c
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2020-09-22 (Tue, 22 Sep 2020)

  Changed paths:
    M llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp

  Log Message:
  -----------
  [NFC][CVP] processUDivOrURem(): refactor to use ConstantRange::getActiveBits()

As an exhaustive test shows, this logic is fully identical to the old
implementation, with the exception of the case where both of the operands
had empty ranges:

```
TEST_F(ConstantRangeTest, CVP_UDiv) {
  unsigned Bits = 4;
  EnumerateConstantRanges(Bits, [&](const ConstantRange &CR0) {
    if(CR0.isEmptySet())
      return;
    EnumerateConstantRanges(Bits, [&](const ConstantRange &CR1) {
      if(CR1.isEmptySet())
        return;

      unsigned MaxActiveBits = 0;
      for (const ConstantRange &CR : {CR0, CR1})
        MaxActiveBits = std::max(MaxActiveBits, CR.getActiveBits());

      ConstantRange OperandRange(Bits, /*isFullSet=*/false);
      for (const ConstantRange &CR : {CR0, CR1})
        OperandRange = OperandRange.unionWith(CR);
      unsigned NewWidth = OperandRange.getUnsignedMax().getActiveBits();

      EXPECT_EQ(MaxActiveBits, NewWidth) << CR0 << " " << CR1;
    });
  });
}
```

  Commit: b85395f309890bac5f2d3296ce08dc46c24ef77f
      https://github.com/llvm/llvm-project/commit/b85395f309890bac5f2d3296ce08dc46c24ef77f
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2020-09-22 (Tue, 22 Sep 2020)

  Changed paths:
    M llvm/include/llvm/ADT/APInt.h

  Log Message:
  -----------
  [NFC][APInt] Refactor getMinSignedBits() in terms of getNumSignBits()

This is fully identical to the old implementation, just easier to read.
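
For readers unfamiliar with the two quantities, their relationship can be sketched on a plain 32-bit integer as follows. This is an illustration under the assumption that the minimum signed width equals the bit width minus the number of sign bits, plus one; `numSignBits` and `minSignedBits` here are free functions written for this example, not the APInt members.

```cpp
#include <cassert>
#include <cstdint>

// Counts how many of the top bits are copies of the sign bit, including the
// sign bit itself (so the result is between 1 and 32).
unsigned numSignBits(int32_t V) {
  uint32_t U = static_cast<uint32_t>(V);
  uint32_t Sign = U >> 31;
  unsigned N = 0;
  for (int Bit = 31; Bit >= 0 && ((U >> Bit) & 1u) == Sign; --Bit)
    ++N;
  return N;
}

// All redundant copies of the sign bit can be dropped except one.
unsigned minSignedBits(int32_t V) { return 32 - numSignBits(V) + 1; }
```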

  Commit: 7465da2077c2b8def7440094e15ac1199226bc25
      https://github.com/llvm/llvm-project/commit/7465da2077c2b8def7440094e15ac1199226bc25
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2020-09-22 (Tue, 22 Sep 2020)

  Changed paths:
    M llvm/include/llvm/IR/ConstantRange.h
    M llvm/lib/IR/ConstantRange.cpp
    M llvm/unittests/IR/ConstantRangeTest.cpp

  Log Message:
  -----------
  [ConstantRange] Introduce getMinSignedBits() method

Similar to ConstantRange::getActiveBits(), and to the similarly-named
methods in APInt, returns the bitwidth needed to represent
the given signed constant range.

  Commit: 4977eadee56f81377049fb8763350a66cfd2d078
      https://github.com/llvm/llvm-project/commit/4977eadee56f81377049fb8763350a66cfd2d078
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2020-09-22 (Tue, 22 Sep 2020)

  Changed paths:
    M llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp

  Log Message:
  -----------
  [NFC][CVP] Give a better name to STATISTIC() counting udiv i16 -> udiv i8 xforms

  Commit: cb10d5d714e9ae83cfd392dd127e13c51f4d299d
      https://github.com/llvm/llvm-project/commit/cb10d5d714e9ae83cfd392dd127e13c51f4d299d
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2020-09-22 (Tue, 22 Sep 2020)

  Changed paths:
    M llvm/test/Transforms/CorrelatedValuePropagation/sdiv.ll
    M llvm/test/Transforms/CorrelatedValuePropagation/srem.ll

  Log Message:
  -----------
  [NFC][CVP] Add tests for SDiv/SRem narrowing

  Commit: b289dc530632613edb3eb067895c1981cb77ccd0
      https://github.com/llvm/llvm-project/commit/b289dc530632613edb3eb067895c1981cb77ccd0
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2020-09-22 (Tue, 22 Sep 2020)

  Changed paths:
    M llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp
    M llvm/test/Transforms/CorrelatedValuePropagation/sdiv.ll
    M llvm/test/Transforms/CorrelatedValuePropagation/srem.ll

  Log Message:
  -----------
  [CVP] Narrow SDiv/SRem to the smallest power-of-2 that's sufficient to contain its operands

This is practically identical to what we already do for UDiv/URem:
  https://rise4fun.com/Alive/04K

Name: narrow udiv
Pre: C0 u<= 255 && C1 u<= 255
%r = udiv i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = udiv i8 %t0, %t1
%r = zext i8 %t2 to i16

Name: narrow exact udiv
Pre: C0 u<= 255 && C1 u<= 255
%r = udiv exact i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = udiv exact i8 %t0, %t1
%r = zext i8 %t2 to i16

Name: narrow urem
Pre: C0 u<= 255 && C1 u<= 255
%r = urem i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = urem i8 %t0, %t1
%r = zext i8 %t2 to i16

... only here we need to look for 'min signed bits', not 'active bits',
and there's a UB case to be aware of:
  https://rise4fun.com/Alive/KG86
  https://rise4fun.com/Alive/LwR

Name: narrow sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = sdiv i16 C0, C1
  =>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = sdiv i9 %t0, %t1
%r = sext i9 %t2 to i16

Name: narrow exact sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = sdiv exact i16 C0, C1
  =>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = sdiv exact i9 %t0, %t1
%r = sext i9 %t2 to i16

Name: narrow srem
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = srem i16 C0, C1
  =>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = srem i9 %t0, %t1
%r = sext i9 %t2 to i16

Name: narrow sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = sdiv i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = sdiv i8 %t0, %t1
%r = sext i8 %t2 to i16

Name: narrow exact sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = sdiv exact i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = sdiv exact i8 %t0, %t1
%r = sext i8 %t2 to i16

Name: narrow srem
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = srem i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = srem i8 %t0, %t1
%r = sext i8 %t2 to i16
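
The reason the i8 variants need the extra `!(C0 == -128 && C1 == -1)` precondition, while the i9 variants do not, can be seen by brute force. The sketch below is plain C++ rather than LLVM code, and `fitsSigned` and `countOverflowingQuotients` are hypothetical helpers. It counts the operand pairs in [-128, 127] whose quotient is not representable at a given width, which is exactly where the narrowed division would overflow.

```cpp
#include <cassert>

// True if V is representable as a signed integer of the given bit width.
bool fitsSigned(int V, unsigned Bits) {
  assert(Bits >= 2 && Bits <= 31);
  int Min = -(1 << (Bits - 1));
  int Max = (1 << (Bits - 1)) - 1;
  return V >= Min && V <= Max;
}

// Counts pairs (A, B) with both operands in [-128, 127] whose quotient A / B
// does not fit in Bits bits. Division by zero is skipped, since it is
// undefined at every width.
unsigned countOverflowingQuotients(unsigned Bits) {
  unsigned Count = 0;
  for (int A = -128; A <= 127; ++A) {
    for (int B = -128; B <= 127; ++B) {
      if (B == 0)
        continue;
      if (!fitsSigned(A / B, Bits))
        ++Count;
    }
  }
  return Count;
}
```

The remainder itself always fits, because its magnitude is smaller than that of the divisor, but `srem` shares the overflow UB of `sdiv`, so the same pair has to be excluded for both.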

The ConstantRangeTest.losslessSignedTruncationSignext test sanity-checks
the underlying assumption: that we can losslessly truncate a ConstantRange
to `getMinSignedBits()` bits and sign-extend it back, and the result will be
identical to the original CR.
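
The same round-trip property can be sketched on individual 16-bit values, without ConstantRange. The helpers `minSignedBits16`, `truncThenSext` and `roundTripsLosslessly` are hypothetical names for this illustration, and the range is assumed to be an inclusive, non-wrapped signed interval.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Smallest signed width (1..16) that can represent V.
unsigned minSignedBits16(int16_t V) {
  unsigned Bits = 1;
  while (V < -(1 << (Bits - 1)) || V > (1 << (Bits - 1)) - 1)
    ++Bits;
  return Bits;
}

// Keeps the low Bits bits of V, then sign-extends them back to 16 bits.
int16_t truncThenSext(int16_t V, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 16);
  uint32_t Mask = (1u << Bits) - 1;
  uint32_t Low = static_cast<uint16_t>(V) & Mask;
  uint32_t SignBit = 1u << (Bits - 1);
  // Flipping the sign bit and subtracting it again performs sign extension.
  int32_t Ext =
      static_cast<int32_t>(Low ^ SignBit) - static_cast<int32_t>(SignBit);
  return static_cast<int16_t>(Ext);
}

// Checks that every value in [Lo, Hi] survives truncation to the width
// required by the range bounds followed by sign extension.
bool roundTripsLosslessly(int16_t Lo, int16_t Hi) {
  assert(Lo <= Hi);
  unsigned Bits = std::max(minSignedBits16(Lo), minSignedBits16(Hi));
  for (int V = Lo; V <= Hi; ++V)
    if (truncThenSext(static_cast<int16_t>(V), Bits) != V)
      return false;
  return true;
}
```

Truncating to fewer bits than required is lossy: in this sketch, `truncThenSext(128, 8)` yields -128 rather than 128.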

On vanilla llvm test-suite + RawSpeed, this fires 1262 times,
while the same fold for UDiv/URem only fires 384 times. Sic!

Additionally, this causes +606.18% (+1079) extra cases of
aggressive-instcombine.NumDAGsReduced, and +473.14% (+1145)
of aggressive-instcombine.NumInstrsReduced folds.

Compare: https://github.com/llvm/llvm-project/compare/e16d10b7535a...b289dc530632