[PATCH] D136015: [InstCombine] Fold series of instructions into mull

Allen zhong via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 20 07:36:29 PDT 2022


Allen added inline comments.


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp:1277
+  // Skip the odd bitwidth types and large bitwidth types
+  if ((BitWidth & 0x1) || (BitWidth > 128))
+    return nullptr;
----------------
spatel wrote:
> Similarly, why exclude wide widths? We're already using APInt::getMaxValue(), so just use that APInt in the m_SpecificInt() calls?
exclude the wide/vectors widths as hey are unusual get the IR from C/C++ code, and can be expand when needed later? or a seperate patch, now we already need too many cases to cover the pattern?


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineInternal.h:550
+  /// Tries to simplify a few sequence operations into MULL
+  Value *SimplifyMull(BinaryOperator &I);
+
----------------
spatel wrote:
> There's no need to make a class function for this transform. Just create a static function above InstCombinerImpl::visitAdd(). 
> 
> Use the raw BinaryOperator::CreateMul() to return an Instruction, so we don't need to pass the Builder or use replaceInstUsesWith().
Done, thanks for detail suggestions


================
Comment at: llvm/lib/Transforms/InstCombine/InstructionCombining.cpp:856
+Value *InstCombinerImpl::SimplifyMull(BinaryOperator &I) {
+  if (!I.getType()->isIntegerTy())
+    return nullptr;
----------------
spatel wrote:
> The type check is insufficient in at least 2 ways and over-restrictive in other. 
> 
> So we need at least 3 more tests like this:
> 
> 
> ```
> define i9 @mul9_low(i9 %in0, i9 %in1) {
>   %In0Lo = and i9 %in0, 15
>   %In0Hi = lshr i9 %in0, 4
>   %In1Lo = and i9 %in1, 15
>   %In1Hi = lshr i9 %in1, 4
>   %m10 = mul i9 %In1Hi, %In0Lo
>   %m01 = mul i9 %In1Lo, %In0Hi
>   %m00 = mul i9 %In1Lo, %In0Lo
>   %addc = add i9 %m10, %m01
>   %shl = shl i9 %addc, 4
>   %addc9 = add i9 %shl, %m00
>   ret i9 %addc9
> }
> 
> define <2 x i8> @mul_v2i8_low(<2 x i8> %in0, <2 x i8> %in1) {
>   %In0Lo = and <2 x i8> %in0, <i8 15, i8 15>
>   %In0Hi = lshr <2 x i8> %in0, <i8 4, i8 4>
>   %In1Lo = and <2 x i8> %in1, <i8 15, i8 15>
>   %In1Hi = lshr <2 x i8> %in1, <i8 4, i8 4>
>   %m10 = mul <2 x i8> %In1Hi, %In0Lo
>   %m01 = mul <2 x i8> %In1Lo, %In0Hi
>   %m00 = mul <2 x i8> %In1Lo, %In0Lo
>   %addc = add <2 x i8> %m10, %m01
>   %shl = shl <2 x i8> %addc, <i8 4, i8 4>
>   %addc9 = add <2 x i8> %shl, %m00
>   ret <2 x i8> %addc9
> }
> 
> define i128 @mul128_low(i128 %in0, i128 %in1) {
>   %In0Lo = and i128 %in0, 18446744073709551615
>   %In0Hi = lshr i128 %in0, 64
>   %In1Lo = and i128 %in1, 18446744073709551615
>   %In1Hi = lshr i128 %in1, 64
>   %m10 = mul i128 %In1Hi, %In0Lo
>   %m01 = mul i128 %In1Lo, %In0Hi
>   %m00 = mul i128 %In1Lo, %In0Lo
>   %addc = add i128 %m10, %m01
>   %shl = shl i128 %addc, 64
>   %addc9 = add i128 %shl, %m00
>   ret i128 %addc9
> }
> ```
> 
Thanks for detail examples.


================
Comment at: llvm/lib/Transforms/InstCombine/InstructionCombining.cpp:866
+  // ResLo = m00 + (addc >> 32);
+  bool IsMulLow =
+      match(&I, m_c_Add(m_Value(M00),
----------------
spatel wrote:
> The structure of these matches is confusing. I'd prefer to organize it more like this:
> 
> 
> ```
>   // R = (CrossSum << HalfBits) + (XLo * YLo)
>   Value *XLo, *YLo;
>   Value *CrossSum;
>   if (!match(&I, m_c_Add(m_Shl(m_Value(CrossSum), m_SpecificInt(HalfBits)),
>                          m_Mul(m_Value(XLo), m_Value(YLo)))))
>     return nullptr;
> 
>   // XLo = X & HalfMask
>   // YLo = Y & HalfMask
>   Value *X, *Y;
>   if (!match(XLo, m_And(m_Value(X), m_SpecificInt(HalfMask))) ||
>       !match(YLo, m_And(m_Value(Y), m_SpecificInt(HalfMask))))
>     return nullptr;
> 
>   // CrossSum = (X' * (Y >> Halfbits)) + (Y' * (X >> HalfBits))
> ...
> 
> ```
> IIUC, X' can be either X or XLo in the pattern (and the same for Y'). You can probably use `m_CombineOr(m_Specific(), m_Specific())` to match that with minimal code.
Apply your comment, thanks


================
Comment at: llvm/test/Transforms/InstCombine/mul.ll:1578
+
+define i64 @mul64_low(i64 noundef %in0, i64 noundef %in1) {
+; CHECK-LABEL: @mul64_low(
----------------
spatel wrote:
> The tests are incomplete for commutative patterns. As I said earlier, I think we need at least 16 tests to verify that the matching is working as expected. 
> 
> Once we have the right tests in place, please pre-commit the baseline tests (CHECK lines without the code change), so we will only show diffs in this patch.
Addressed in D136340


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136015/new/

https://reviews.llvm.org/D136015



More information about the llvm-commits mailing list