[PATCH] D136015: [InstCombine] Fold series of instructions into mull
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 19 07:40:48 PDT 2022
spatel added inline comments.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineInternal.h:550
+ /// Tries to simplify a few sequence operations into MULL
+ Value *SimplifyMull(BinaryOperator &I);
+
----------------
There's no need to make a class function for this transform. Just create a static function above InstCombinerImpl::visitAdd().
Use the raw BinaryOperator::CreateMul() to return an Instruction, so we don't need to pass the Builder or use replaceInstUsesWith().
================
Comment at: llvm/lib/Transforms/InstCombine/InstructionCombining.cpp:856
+Value *InstCombinerImpl::SimplifyMull(BinaryOperator &I) {
+ if (!I.getType()->isIntegerTy())
+ return nullptr;
----------------
The type check is insufficient in at least 2 ways and over-restrictive in other.
So we need at least 3 more tests like this:
```
define i9 @mul9_low(i9 %in0, i9 %in1) {
%In0Lo = and i9 %in0, 15
%In0Hi = lshr i9 %in0, 4
%In1Lo = and i9 %in1, 15
%In1Hi = lshr i9 %in1, 4
%m10 = mul i9 %In1Hi, %In0Lo
%m01 = mul i9 %In1Lo, %In0Hi
%m00 = mul i9 %In1Lo, %In0Lo
%addc = add i9 %m10, %m01
%shl = shl i9 %addc, 4
%addc9 = add i9 %shl, %m00
ret i9 %addc9
}
define <2 x i8> @mul_v2i8_low(<2 x i8> %in0, <2 x i8> %in1) {
%In0Lo = and <2 x i8> %in0, <i8 15, i8 15>
%In0Hi = lshr <2 x i8> %in0, <i8 4, i8 4>
%In1Lo = and <2 x i8> %in1, <i8 15, i8 15>
%In1Hi = lshr <2 x i8> %in1, <i8 4, i8 4>
%m10 = mul <2 x i8> %In1Hi, %In0Lo
%m01 = mul <2 x i8> %In1Lo, %In0Hi
%m00 = mul <2 x i8> %In1Lo, %In0Lo
%addc = add <2 x i8> %m10, %m01
%shl = shl <2 x i8> %addc, <i8 4, i8 4>
%addc9 = add <2 x i8> %shl, %m00
ret <2 x i8> %addc9
}
define i128 @mul128_low(i128 %in0, i128 %in1) {
%In0Lo = and i128 %in0, 18446744073709551615
%In0Hi = lshr i128 %in0, 64
%In1Lo = and i128 %in1, 18446744073709551615
%In1Hi = lshr i128 %in1, 64
%m10 = mul i128 %In1Hi, %In0Lo
%m01 = mul i128 %In1Lo, %In0Hi
%m00 = mul i128 %In1Lo, %In0Lo
%addc = add i128 %m10, %m01
%shl = shl i128 %addc, 64
%addc9 = add i128 %shl, %m00
ret i128 %addc9
}
```
================
Comment at: llvm/lib/Transforms/InstCombine/InstructionCombining.cpp:866
+ // ResLo = m00 + (addc >> 32);
+ bool IsMulLow =
+ match(&I, m_c_Add(m_Value(M00),
----------------
The structure of these matches is confusing. I'd prefer to organize it more like this:
```
// R = (CrossSum << HalfBits) + (XLo * YLo)
Value *XLo, *YLo;
Value *CrossSum;
if (!match(&I, m_c_Add(m_Shl(m_Value(CrossSum), m_SpecificInt(HalfBits)),
m_Mul(m_Value(XLo), m_Value(YLo)))))
return nullptr;
// XLo = X & HalfMask
// YLo = Y & HalfMask
Value *X, *Y;
if (!match(XLo, m_And(m_Value(X), m_SpecificInt(HalfMask))) ||
!match(YLo, m_And(m_Value(Y), m_SpecificInt(HalfMask))))
return nullptr;
// CrossSum = (X' * (Y >> Halfbits)) + (Y' * (X >> HalfBits))
...
```
IIUC, X' can be either X or XLo in the pattern (and the same for Y'). You can probably use `m_CombineOr(m_Specific(), m_Specific())` to match that with minimal code.
================
Comment at: llvm/test/Transforms/InstCombine/mul.ll:1578
+
+define i64 @mul64_low(i64 noundef %in0, i64 noundef %in1) {
+; CHECK-LABEL: @mul64_low(
----------------
The tests are incomplete for commutative patterns. As I said earlier, I think we need at least 16 tests to verify that the matching is working as expected.
Once we have the right tests in place, please pre-commit the baseline tests (CHECK lines without the code change), so we will only show diffs in this patch.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D136015/new/
https://reviews.llvm.org/D136015
More information about the llvm-commits
mailing list