[all-commits] [llvm/llvm-project] 36ea18: [NFC][CVP] Add tests for srem with potentially dif...
Roman Lebedev via All-commits
all-commits at lists.llvm.org
Tue Sep 22 11:38:06 PDT 2020
Branch: refs/heads/master
Home: https://github.com/llvm/llvm-project
Commit: 36ea18b06430e0a1094f9b0994e4abb5cc2175c9
https://github.com/llvm/llvm-project/commit/36ea18b06430e0a1094f9b0994e4abb5cc2175c9
Author: Roman Lebedev <lebedev.ri at gmail.com>
Date: 2020-09-22 (Tue, 22 Sep 2020)
Changed paths:
M llvm/test/Transforms/CorrelatedValuePropagation/srem.ll
Log Message:
-----------
[NFC][CVP] Add tests for srem with potentially different signedness domains
Commit: 4eeeb356fc41babf46797b062f74f978b818622b
https://github.com/llvm/llvm-project/commit/4eeeb356fc41babf46797b062f74f978b818622b
Author: Roman Lebedev <lebedev.ri at gmail.com>
Date: 2020-09-22 (Tue, 22 Sep 2020)
Changed paths:
M llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp
M llvm/test/Transforms/CorrelatedValuePropagation/srem.ll
Log Message:
-----------
[CVP] Enhance SRem -> URem fold to work not just on non-negative operands
This is a continuation of 8d487668d09fb0e4e54f36207f07c1480ffabbfd;
the logic is pretty much identical for SRem:
Name: pos pos
Pre: C0 >= 0 && C1 >= 0
%r = srem i8 C0, C1
=>
%r = urem i8 C0, C1
Name: pos neg
Pre: C0 >= 0 && C1 <= 0
%r = srem i8 C0, C1
=>
%r = urem i8 C0, -C1
Name: neg pos
Pre: C0 <= 0 && C1 >= 0
%r = srem i8 C0, C1
=>
%t0 = urem i8 -C0, C1
%r = sub i8 0, %t0
Name: neg neg
Pre: C0 <= 0 && C1 <= 0
%r = srem i8 C0, C1
=>
%t0 = urem i8 -C0, -C1
%r = sub i8 0, %t0
https://rise4fun.com/Alive/Vd6
Now, this new logic does not result in any new catches
as of vanilla llvm test-suite + RawSpeed,
but it should be virtually compile-time free,
and it may be important to handle sdiv and srem consistently,
because if we had a pair of sdiv-srem, and only converted one of them,
-divrempairs would no longer see them as a pair,
and thus would not "merge" them.
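The four cases above can be checked exhaustively outside of Alive. The following is a minimal standalone sketch in plain C++, not the CVP code itself: the helper names (`sremReference`, `negate8`, `sremViaUrem`, `foldMatchesForAllI8`) are made up for illustration, and it models i8 with `int8_t`/`uint8_t`. The two inputs that are UB for srem (division by zero, and -128 srem -1) are skipped.

```cpp
#include <cassert>
#include <cstdint>

// Reference result: C++ `%` truncates toward zero, like LLVM's srem.
int8_t sremReference(int8_t a, int8_t b) {
  assert(b != 0 && !(a == -128 && b == -1));
  return static_cast<int8_t>(int(a) % int(b));
}

// 8-bit negation with wrap-around, mirroring `sub i8 0, x`.
uint8_t negate8(uint8_t x) { return static_cast<uint8_t>(0u - x); }

// srem rewritten as urem on the magnitudes; the sign of the dividend is
// re-applied to the result (the "neg pos" and "neg neg" cases).
int8_t sremViaUrem(int8_t a, int8_t b) {
  assert(b != 0);
  uint8_t ua = static_cast<uint8_t>(a);
  uint8_t ub = static_cast<uint8_t>(b);
  if (a < 0) ua = negate8(ua);
  if (b < 0) ub = negate8(ub);
  uint8_t r = static_cast<uint8_t>(ua % ub);
  if (a < 0) r = negate8(r);
  return static_cast<int8_t>(r);
}

// Compares both versions for every i8 pair that is not UB.
bool foldMatchesForAllI8() {
  for (int a = -128; a <= 127; ++a)
    for (int b = -128; b <= 127; ++b) {
      if (b == 0) continue;               // division by zero is UB
      if (a == -128 && b == -1) continue; // signed overflow is UB
      int8_t a8 = static_cast<int8_t>(a), b8 = static_cast<int8_t>(b);
      if (sremReference(a8, b8) != sremViaUrem(a8, b8)) return false;
    }
  return true;
}
```

Calling `foldMatchesForAllI8()` covers all four sign combinations at once, since each operand's sign selects which negations are applied.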
Commit: b38d897e802664034c7e6e4654328256ed370a61
https://github.com/llvm/llvm-project/commit/b38d897e802664034c7e6e4654328256ed370a61
Author: Roman Lebedev <lebedev.ri at gmail.com>
Date: 2020-09-22 (Tue, 22 Sep 2020)
Changed paths:
M llvm/include/llvm/IR/ConstantRange.h
M llvm/lib/IR/ConstantRange.cpp
M llvm/unittests/IR/ConstantRangeTest.cpp
Log Message:
-----------
[ConstantRange] binaryXor(): special-case binary complement case - the result is precise
Use the fact that `~X` is equivalent to `-1 - X`, which gives us
a fully precise answer, and we only need to special-handle the wrapped case.
This fires ~16k times for vanilla llvm test-suite + RawSpeed.
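To illustrate why the complement of a range is precise, here is a small sketch on 8-bit values. It is an assumption-laden simplification: `Interval` is a hypothetical inclusive, non-wrapped interval, not LLVM's ConstantRange (which is half-open and may wrap), so the wrapped case mentioned above is not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical inclusive interval [lo, hi] with lo <= hi (no wrapping).
struct Interval {
  uint8_t lo;
  uint8_t hi;
};

// On 8 bits, ~x == 255 - x, which reverses the order of the bounds.
Interval complementInterval(Interval r) {
  return {static_cast<uint8_t>(255 - r.hi), static_cast<uint8_t>(255 - r.lo)};
}

// Checks, for every interval, that the complemented values fill the
// computed interval exactly: nothing outside it, nothing missing inside.
bool complementIsExact() {
  for (int lo = 0; lo <= 255; ++lo)
    for (int hi = lo; hi <= 255; ++hi) {
      Interval out = complementInterval(
          {static_cast<uint8_t>(lo), static_cast<uint8_t>(hi)});
      bool seen[256] = {};
      for (int x = lo; x <= hi; ++x) {
        uint8_t nx = static_cast<uint8_t>(~x);
        if (nx != static_cast<uint8_t>(-1 - x)) return false; // ~x == -1 - x
        if (nx < out.lo || nx > out.hi) return false;
        seen[nx] = true;
      }
      for (int v = out.lo; v <= out.hi; ++v)
        if (!seen[v]) return false;
    }
  return true;
}
```

For example, complementing [10, 20] yields [235, 245].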
Commit: 2ed9c4c70bbb36fa12d48a73abc2d89c0af80060
https://github.com/llvm/llvm-project/commit/2ed9c4c70bbb36fa12d48a73abc2d89c0af80060
Author: Roman Lebedev <lebedev.ri at gmail.com>
Date: 2020-09-22 (Tue, 22 Sep 2020)
Changed paths:
M llvm/include/llvm/IR/ConstantRange.h
M llvm/lib/IR/ConstantRange.cpp
M llvm/unittests/IR/ConstantRangeTest.cpp
Log Message:
-----------
[ConstantRange] Introduce getActiveBits() method
Much like APInt::getActiveBits(), computes how many bits are needed
to be able to represent every value in this constant range,
treating the values as unsigned.
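As a rough illustration of the idea, the sketch below computes the same quantity for a hypothetical inclusive, non-wrapped unsigned interval. The function names are invented here, and the real method works on ConstantRange; the point is only that the unsigned maximum alone decides the answer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Bits needed to represent v as an unsigned number; 0 needs 0 bits.
unsigned activeBits(uint32_t v) {
  unsigned n = 0;
  for (; v != 0; v >>= 1) ++n;
  return n;
}

// Inclusive interval [lo, hi], lo <= hi. Active bits never decrease as
// the value grows, so only the unsigned maximum matters.
unsigned rangeActiveBits(uint32_t lo, uint32_t hi) {
  assert(lo <= hi);
  return activeBits(hi);
}

// Brute force: the widest value found anywhere in the interval.
unsigned rangeActiveBitsBruteForce(uint32_t lo, uint32_t hi) {
  unsigned best = 0;
  for (uint32_t v = lo; v <= hi; ++v) best = std::max(best, activeBits(v));
  return best;
}

// Compares both versions on all small intervals.
bool agreesWithBruteForce() {
  for (uint32_t lo = 0; lo < 64; ++lo)
    for (uint32_t hi = lo; hi < 64; ++hi)
      if (rangeActiveBits(lo, hi) != rangeActiveBitsBruteForce(lo, hi))
        return false;
  return true;
}
```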
Commit: ba5afe5588ded61052c8727dbcb0407b5de4410c
https://github.com/llvm/llvm-project/commit/ba5afe5588ded61052c8727dbcb0407b5de4410c
Author: Roman Lebedev <lebedev.ri at gmail.com>
Date: 2020-09-22 (Tue, 22 Sep 2020)
Changed paths:
M llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp
Log Message:
-----------
[NFC][CVP] processUDivOrURem(): refactor to use ConstantRange::getActiveBits()
As an exhaustive test shows, this logic is fully identical to the old
implementation, with the exception of the case where both of the operands
had empty ranges:
```
TEST_F(ConstantRangeTest, CVP_UDiv) {
  unsigned Bits = 4;
  EnumerateConstantRanges(Bits, [&](const ConstantRange &CR0) {
    if (CR0.isEmptySet())
      return;
    EnumerateConstantRanges(Bits, [&](const ConstantRange &CR1) {
      // Skip empty ranges for the second operand as well.
      if (CR1.isEmptySet())
        return;

      // New logic: maximum of the per-operand active bits.
      unsigned MaxActiveBits = 0;
      for (const ConstantRange &CR : {CR0, CR1})
        MaxActiveBits = std::max(MaxActiveBits, CR.getActiveBits());

      // Old logic: active bits of the unsigned max of the union.
      ConstantRange OperandRange(Bits, /*isFullSet=*/false);
      for (const ConstantRange &CR : {CR0, CR1})
        OperandRange = OperandRange.unionWith(CR);
      unsigned NewWidth = OperandRange.getUnsignedMax().getActiveBits();

      EXPECT_EQ(MaxActiveBits, NewWidth) << CR0 << " " << CR1;
    });
  });
}
```
Commit: b85395f309890bac5f2d3296ce08dc46c24ef77f
https://github.com/llvm/llvm-project/commit/b85395f309890bac5f2d3296ce08dc46c24ef77f
Author: Roman Lebedev <lebedev.ri at gmail.com>
Date: 2020-09-22 (Tue, 22 Sep 2020)
Changed paths:
M llvm/include/llvm/ADT/APInt.h
Log Message:
-----------
[NFC][APInt] Refactor getMinSignedBits() in terms of getNumSignBits()
This is fully identical to the old implementation, just easier to read.
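The relation being used can be sketched on 8-bit values as follows. This is an illustration under assumptions, not the APInt code: the helpers are hypothetical, "sign bits" is taken to mean the number of leading bits equal to the sign bit (including the sign bit itself), and the direct version simply searches for the smallest width whose signed range contains the value.

```cpp
#include <cassert>
#include <cstdint>

// Number of leading bits equal to the sign bit, counting the sign bit.
unsigned numSignBits8(int8_t v) {
  uint8_t u = static_cast<uint8_t>(v);
  unsigned sign = (u >> 7) & 1u;
  unsigned n = 0;
  for (int bit = 7; bit >= 0 && ((u >> bit) & 1u) == sign; --bit) ++n;
  return n;
}

// Min signed bits expressed through the sign-bit count: all redundant
// copies of the sign bit can be dropped except one.
unsigned minSignedBitsViaSignBits(int8_t v) { return 8 - numSignBits8(v) + 1; }

// Direct definition: smallest n with -2^(n-1) <= v <= 2^(n-1) - 1.
unsigned minSignedBitsDirect(int8_t v) {
  for (unsigned n = 1; n < 8; ++n) {
    int lo = -(1 << (n - 1)), hi = (1 << (n - 1)) - 1;
    if (v >= lo && v <= hi) return n;
  }
  return 8;
}

// Compares both formulations for every 8-bit value.
bool formulationsAgree() {
  for (int v = -128; v <= 127; ++v) {
    int8_t v8 = static_cast<int8_t>(v);
    if (minSignedBitsViaSignBits(v8) != minSignedBitsDirect(v8)) return false;
  }
  return true;
}
```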
Commit: 7465da2077c2b8def7440094e15ac1199226bc25
https://github.com/llvm/llvm-project/commit/7465da2077c2b8def7440094e15ac1199226bc25
Author: Roman Lebedev <lebedev.ri at gmail.com>
Date: 2020-09-22 (Tue, 22 Sep 2020)
Changed paths:
M llvm/include/llvm/IR/ConstantRange.h
M llvm/lib/IR/ConstantRange.cpp
M llvm/unittests/IR/ConstantRangeTest.cpp
Log Message:
-----------
[ConstantRange] Introduce getMinSignedBits() method
Similar to ConstantRange::getActiveBits(), and to the similarly-named
methods in APInt, returns the bitwidth needed to represent
the given signed constant range.
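A minimal sketch of the signed counterpart, again on a hypothetical inclusive, non-wrapped interval rather than on ConstantRange. It assumes the width is decided by the two signed endpoints, and then checks that truncating to that width and sign-extending back loses nothing, while one bit fewer would corrupt at least one endpoint. All names are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Bits needed for v in two's complement, including the sign bit.
unsigned minSignedBits(int32_t v) {
  // For negative v, ~v is non-negative and has the same magnitude bits.
  uint32_t m = v < 0 ? ~static_cast<uint32_t>(v) : static_cast<uint32_t>(v);
  unsigned n = 1; // the sign bit
  for (; m != 0; m >>= 1) ++n;
  return n;
}

// Inclusive signed interval [lo, hi], lo <= hi: the endpoints decide.
unsigned rangeMinSignedBits(int32_t lo, int32_t hi) {
  assert(lo <= hi);
  return std::max(minSignedBits(lo), minSignedBits(hi));
}

// Keep the low `bits` bits of v, then sign-extend back to 32 bits.
int32_t truncThenSext(int32_t v, unsigned bits) {
  assert(bits >= 1 && bits < 32);
  uint32_t low = static_cast<uint32_t>(v) & ((1u << bits) - 1u);
  uint32_t signBit = 1u << (bits - 1);
  return static_cast<int32_t>((low ^ signBit) - signBit);
}

// Every value of every small interval must survive the round trip, and
// one bit fewer must break at least one endpoint.
bool roundTripIsLossless() {
  for (int lo = -130; lo <= 130; ++lo)
    for (int hi = lo; hi <= 130; ++hi) {
      unsigned bits = rangeMinSignedBits(lo, hi);
      for (int v = lo; v <= hi; ++v)
        if (truncThenSext(v, bits) != v) return false;
      if (bits > 1 && truncThenSext(lo, bits - 1) == lo &&
          truncThenSext(hi, bits - 1) == hi)
        return false;
    }
  return true;
}
```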
Commit: 4977eadee56f81377049fb8763350a66cfd2d078
https://github.com/llvm/llvm-project/commit/4977eadee56f81377049fb8763350a66cfd2d078
Author: Roman Lebedev <lebedev.ri at gmail.com>
Date: 2020-09-22 (Tue, 22 Sep 2020)
Changed paths:
M llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp
Log Message:
-----------
[NFC][CVP] Give a better name to the STATISTIC() counting udiv i16 -> udiv i8 xforms
Commit: cb10d5d714e9ae83cfd392dd127e13c51f4d299d
https://github.com/llvm/llvm-project/commit/cb10d5d714e9ae83cfd392dd127e13c51f4d299d
Author: Roman Lebedev <lebedev.ri at gmail.com>
Date: 2020-09-22 (Tue, 22 Sep 2020)
Changed paths:
M llvm/test/Transforms/CorrelatedValuePropagation/sdiv.ll
M llvm/test/Transforms/CorrelatedValuePropagation/srem.ll
Log Message:
-----------
[NFC][CVP] Add tests for SDiv/SRem narrowing
Commit: b289dc530632613edb3eb067895c1981cb77ccd0
https://github.com/llvm/llvm-project/commit/b289dc530632613edb3eb067895c1981cb77ccd0
Author: Roman Lebedev <lebedev.ri at gmail.com>
Date: 2020-09-22 (Tue, 22 Sep 2020)
Changed paths:
M llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp
M llvm/test/Transforms/CorrelatedValuePropagation/sdiv.ll
M llvm/test/Transforms/CorrelatedValuePropagation/srem.ll
Log Message:
-----------
[CVP] Narrow SDiv/SRem to the smallest power-of-2 that's sufficient to contain its operands
This is practically identical to what we already do for UDiv/URem:
https://rise4fun.com/Alive/04K
Name: narrow udiv
Pre: C0 u<= 255 && C1 u<= 255
%r = udiv i16 C0, C1
=>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = udiv i8 %t0, %t1
%r = zext i8 %t2 to i16
Name: narrow exact udiv
Pre: C0 u<= 255 && C1 u<= 255
%r = udiv exact i16 C0, C1
=>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = udiv exact i8 %t0, %t1
%r = zext i8 %t2 to i16
Name: narrow urem
Pre: C0 u<= 255 && C1 u<= 255
%r = urem i16 C0, C1
=>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = urem i8 %t0, %t1
%r = zext i8 %t2 to i16
... only here we need to look for 'min signed bits', not 'active bits',
and there is UB to be aware of:
https://rise4fun.com/Alive/KG86
https://rise4fun.com/Alive/LwR
Name: narrow sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = sdiv i16 C0, C1
=>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = sdiv i9 %t0, %t1
%r = sext i9 %t2 to i16
Name: narrow exact sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = sdiv exact i16 C0, C1
=>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = sdiv exact i9 %t0, %t1
%r = sext i9 %t2 to i16
Name: narrow srem
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = srem i16 C0, C1
=>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = srem i9 %t0, %t1
%r = sext i9 %t2 to i16
Name: narrow sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = sdiv i16 C0, C1
=>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = sdiv i8 %t0, %t1
%r = sext i8 %t2 to i16
Name: narrow exact sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = sdiv exact i16 C0, C1
=>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = sdiv exact i8 %t0, %t1
%r = sext i8 %t2 to i16
Name: narrow srem
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = srem i16 C0, C1
=>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = srem i8 %t0, %t1
%r = sext i8 %t2 to i16
The ConstantRangeTest.losslessSignedTruncationSignext test sanity-checks
this logic: we can losslessly truncate a ConstantRange to
`getMinSignedBits()` bits and sign-extend it back, and the result will be
identical to the original CR.
On vanilla llvm test-suite + RawSpeed, this fires 1262 times,
while the same fold for UDiv/URem only fires 384 times. Sic!
Additionally, this causes +606.18% (+1079) extra cases of
aggressive-instcombine.NumDAGsReduced, and +473.14% (+1145)
of aggressive-instcombine.NumInstrsReduced folds.
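The difference between the i9 and i8 variants above comes down to a single operand pair. The sketch below is an illustration in plain C++ with made-up helper names, not the CVP implementation: it computes the quotient and remainder in `int` for all operands in [-128, 127] and counts how many results do not fit in a given signed width.

```cpp
#include <cassert>

// True if v is representable as a signed integer of the given width.
bool fitsSigned(int v, unsigned bits) {
  int lo = -(1 << (bits - 1)), hi = (1 << (bits - 1)) - 1;
  return v >= lo && v <= hi;
}

// Counts operand pairs in [-128, 127] (divisor != 0) whose quotient or
// remainder cannot be represented in `bits` signed bits.
int countPairsNotFitting(unsigned bits) {
  int count = 0;
  for (int a = -128; a <= 127; ++a)
    for (int b = -128; b <= 127; ++b) {
      if (b == 0) continue; // division by zero is UB at any width
      int q = a / b;        // truncating division, like sdiv
      int r = a % b;        // truncating remainder, like srem
      // Only -128 / -1 == 128 is out of range for 8 bits. In LLVM that
      // quotient overflow makes both `sdiv i8` and `srem i8` UB.
      if (!fitsSigned(q, bits) || !fitsSigned(r, bits)) ++count;
    }
  return count;
}
```

With 9 bits no pair overflows, while with 8 bits exactly one does, which matches the extra `!(C0 == -128 && C1 == -1)` precondition on the i8 variants.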
Compare: https://github.com/llvm/llvm-project/compare/e16d10b7535a...b289dc530632