[PATCH] D50941: [CodeGen] unwrap truncated subtraction operand for rol generation in matchRotateSub

Sat Aug 18 18:05:20 PDT 2018

jianye.chen created this revision.
jianye.chen added a reviewer: bogner.
Herald added a subscriber: llvm-commits.

https://bugs.llvm.org/show_bug.cgi?id=15880

This patch addresses the issue where if the subtraction operand is ISD::TRUNCATE, it might not get matched for rotation instruction generation. 
This issue reproduces if an truncation is generated before subtraction for its operand, while Pos is not truncated twice. 
i8/i32/i64 does not generate truncates before subtraction (t10), which avoids this issue.

For actual demonstration of the issue, see, https://godbolt.org/g/vsM24w (commented by Simon Pilgrim)
Another issue in 15880 regarding i64 seems already fixed. "if you perform two back-to-back rotates to an expression, the 8 and 32 bit versions use back-to-back rol instructions, the 16 bit version uses two back-to-back shldw instructions, and the 64-bit version performs a shldq followed by a rolq"

define i16 @rol16_1(i16, i16) {
entry:

  %2 = shl i16 %0, %1
  %3 = sub i16 16, %1
  %4 = lshr i16 %0, %3
  %5 = or i16 %2, %4
  ret i16 %5

}

Initial selection DAG: %bb.0 'rol16_1:entry'
SelectionDAG has 18 nodes:

  t0: ch = EntryToken
    t2: i32,ch = CopyFromReg t0, Register:i32 %0
  t3: i16 = truncate t2
  t5: i32,ch = CopyFromReg t0, Register:i32 %1
        t7: i8 = truncate t5
      t8: i16 = shl t3, t7
            t6: i16 = truncate t5
          t10: i16 = sub Constant:i16<16>, t6
        t11: i8 = truncate t10
      t12: i16 = srl t3, t11
    t13: i16 = or t8, t12
  t16: ch,glue = CopyToReg t0, Register:i16 $ax, t13
  t17: ch = X86ISD::RET_FLAG t16, TargetConstant:i32<0>, Register:i16 $ax, t16:1

t7 and t11 gets unwrapped before matchRotateSub. t7->t5(Pos; llvm::ISD::NodeType::CopyFromReg); t11->t10.
When matching for rol, current approach verifies if t10 is an ADD or SUB, and if it is a SUB, then its operand(t6) must match Pos(t5), or Pos must be an ADD t6, xxx (not relevant here).

This patch adds an additional unwrap for t6 if t6 != t5 and t10 is a SUB. (which is the case here, llvm::ISD::TRUNCATE != llvm::ISD::NodeType::CopyFromReg)


Repository:
  rL LLVM

https://reviews.llvm.org/D50941

Files:
  lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  test/CodeGen/X86/rot16.ll


Index: test/CodeGen/X86/rot16.ll
===================================================================

--- test/CodeGen/X86/rot16.ll
+++ test/CodeGen/X86/rot16.ll
@@ -13,7 +13,7 @@
 ; X64-LABEL: foo:
 ; X64:       # %bb.0:
 ; X64-NEXT:    movl %edx, %ecx
-; X64-NEXT:    shldw %cl, %di, %di
+; X64-NEXT:    rolw %cl, %di
 ; X64-NEXT:    movl %edi, %eax
 ; X64-NEXT:    retq
 	%t0 = shl i16 %x, %z
@@ -56,7 +56,7 @@
 ; X64-LABEL: un:
 ; X64:       # %bb.0:
 ; X64-NEXT:    movl %edx, %ecx
-; X64-NEXT:    shrdw %cl, %di, %di
+; X64-NEXT:    rorw %cl, %di
 ; X64-NEXT:    movl %edi, %eax
 ; X64-NEXT:    retq
 	%t0 = lshr i16 %x, %z
Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp
===================================================================
--- lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -5378,6 +5378,17 @@
     return false;
   SDValue NegOp1 = Neg.getOperand(1);
 
+  // If the src operand for subtraction is trunc/sign/zext/any-extended, strip it
+  // for truncation, this is,
+  // trunc((trunc(i32 -> i16) - i16) -> i16) == trunc((i32 - i16) -> i16)
+  bool NegOp1Peel = false;
+  if (NegOp1.getOpcode() == ISD::SIGN_EXTEND ||
+      NegOp1.getOpcode() == ISD::ZERO_EXTEND ||
+      NegOp1.getOpcode() == ISD::ANY_EXTEND ||
+      NegOp1.getOpcode() == ISD::TRUNCATE) {
+    NegOp1Peel = true;
+  }
+
   // On the RHS of [A], if Pos is Pos' & (EltSize - 1), just replace Pos with
   // Pos'.  The truncation is redundant for the purpose of the equality.
   if (MaskLoBits && Pos.getOpcode() == ISD::AND) {
@@ -5401,7 +5412,7 @@
   //
   // (because "x & Mask" is a truncation and distributes through subtraction).
   APInt Width;
-  if (Pos == NegOp1)
+  if (Pos == NegOp1 || (NegOp1Peel && Pos == NegOp1.getOperand(0)))
     Width = NegC->getAPIntValue();
 
   // Check for cases where Pos has the form (add NegOp1, PosC) for some PosC.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D50941.161383.patch
Type: text/x-patch
Size: 1895 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180819/bf138c1c/attachment.bin>