[PATCH] D123652: Add use condition for combine SetCCMOVMSK

Xiang Zhang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 13 20:51:13 PDT 2022


xiangzhangllvm added a comment.

In D123652#3447672 <https://reviews.llvm.org/D123652#3447672>, @RKSimon wrote:

> Please can you add test coverage - preferably for both cases?

I want to, if it is easy to reproduced by a small case, I had add the lit test  : )
This combine directly works on DAG, hard to control/tune the "input" DAG for this combine.
I spend hours to write a workable small reproducer, but failed.
I tried to fetch the problem code from the perf-drop project :

   1 ; llc -mtriple=x86_64-unknown -mattr=+avx2
   2
   3 define i1 @zx(<16 x i8> %x, <16 x i8>%y) {
   4   %a = icmp ne <16 x i8> %x, %y
   5   %b = bitcast <16 x i1> %a to i16
   6   %c = icmp eq i16 %b, 0
   7   br i1 %c, label %zx1, label %zx2
   8
   9 zx1:
  10   ret i1 1
  11
  12 zx2:
  13  ; call void @fun(i16 %b)
  14   ret i1 0
  15 }
  16
  17 declare void @fun(i16 %b)

It can reproduce the "combine allof(cmpeq(x,y)) -> ptest(sub(x,y))", out put:

   8 # %bb.0:
   9         vpsubb  %ymm1, %ymm0, %ymm0
  10         vptest  %ymm0, %ymm0
  11         je      .LBB0_1
  12 # %bb.2:                                # %zx2
  13         xorl    %eax, %eax
  14         vzeroupper
  15         retq
  16 .LBB0_1:                                # %zx1
  17         movb    $1, %al
  18         vzeroupper
  19         retq

When I uncomment the test's line 13 " ; call void @fun(i16 %b)"  to " call void @fun(i16 %b)", let the cmp result has more uses.
The previous optimization will change the DAG
from

  SelectionDAG has 14 nodes:
    t0: ch = EntryToken
              t2: v32i8,ch = CopyFromReg t0, Register:v32i8 %0
              t4: v32i8,ch = CopyFromReg t0, Register:v32i8 %1
            t26: v32i8 = X86ISD::PCMPEQ t2, t4
          t31: i32 = X86ISD::MOVMSK t26
        t34: i32,i32 = X86ISD::SUB t31, Constant:i32<-1>
      t36: ch = X86ISD::BRCOND t0, BasicBlock:ch<zx2 0xb5ed448>, TargetConstant:i8<5>, t34:1
    t16: ch = br t36, BasicBlock:ch<zx1 0xb5ed360>

to  (change the SUB to XOR )

  SelectionDAG has 16 nodes:
    t0: ch = EntryToken
          t2: v32i8,ch = CopyFromReg t0, Register:v32i8 %1
          t4: v32i8,ch = CopyFromReg t0, Register:v32i8 %2
        t28: v32i8 = X86ISD::PCMPEQ t2, t4
      t33: i32 = X86ISD::MOVMSK t28
    t35: i32,i32 = X86ISD::XOR t33, Constant:i32<-1>
        t9: ch = CopyToReg t0, Register:i32 %0, t35
      t37: ch = X86ISD::BRCOND t9, BasicBlock:ch<zx2 0xb5ed288>, TargetConstant:i8<5>, t35:1
    t18: ch = br t37, BasicBlock:ch<zx1 0xb5ed1a0>

So, it fail to meet the logic to do such combine. (fail to reproduce).

But from the DAG level, take combine "MOVMSK(PCMPEQ(X,Y)) == -1 -> PTESTZ(SUB(X,Y),SUB(X,Y))" for example:
This is a obvious bug to combine the "mid-node" without considering if the "mid-node" has other uses.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123652/new/

https://reviews.llvm.org/D123652



More information about the llvm-commits mailing list